Fio:PathFinding

From Fio

(Difference between revisions)
(Path Finding)
Line 8: Line 8:
===A*===
===A*===
-
===When A* Fails===
+
When adapting A* algorithm to your game, you may find that sometimes it consumes too much CPU ticks – everything just stuck on the screen for a while (especially when path finding fails). Though A* itself has been there and well studied for quite a long time, the implementation flexibility has given us many choices which might lead to inefficiency. This article describes several bottlenecks emerged when developing path finding for a tile based RPG.
-
it should fail early.
+
-
the 4 steps approach:
+
I didn’t start the algorithm from scratch – I just begin with Steve Rabin’s code in ‘Game Programming Gems I’ (3.5 A* Speed Optimizations). For using a priority queue (binary heaps) to implement open table is widely discussed, it’ll not be covered here.
-
====poking====
+
-
====closuring====
+
And let’s check the first one…
-
====blocking====
+
===Check Cell Blocking===
 +
After fetching a node from open table, A* will iterate thru all neighbor cells of the node, trying to add walkable ones to open table. For the neighbor who is already in open or close table, if the new path is better (a lower G value than the old one), A* will just refresh the state of the node (G value, in open or close table, and parent). How can we tell whether a cell is blocked or walkable?
-
====embedding====
+
In tile-based game, path finding is usually performed on the fringe layer (walls, trees, sceneries, and npcs etc.) where a cell might contain multiple game objects. For example, an npc can step into a cell and drop several items there. Each holdee may affect the cell’s blocking state:
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
for each object in current cell
 +
  if object can’t walk thru then
 +
  return cell is blocked
 +
  end
 +
end
 +
return cell is walkable</pre></span>
 +
This procedure has two drawbacks that may slow down the whole A* algorithm:
 +
i) If a cell contains multi objects, the number of loops executed may depend on the order of those objects packed in the cell.
 +
ii) Object’s walkable state checking might be slow. It might be set by the map designer and can be only accessed thru script.
 +
 
 +
During game playing, we’ll notice that most cells’ blocking state is stable most of the time. We can eliminate the cost of querying by moving the blocking state to cell - if someone leaves, steps in, drops something into, take some item from, explode down a wall, lock or unlock a door in a cell, he is responsible for updating the cell’s blocking state (is_blocked).
 +
 
 +
And now to check if a cell is blocked is simple:
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
return cell.is_blocked
 +
</pre></span>
 +
 
 +
===Bank: Find Node===
 +
Another step of A* is to determine if a neighbor cell is already in open or close table (find a node that match the neighbor cell’s coords).If neither of the two tables contains the neighbor, a new node will be created based on cell’s coords and cost then inserted into open table. As the path finding going on, open and close table'll grow bigger. It’s not practical to iterate thru the whole table bruteforcely to find a node. We need some indexing mechanism to speed up. Also we want to reuse those hundreds of nodes malloced during the last path finding.
 +
 
 +
A node bank is introduced into A* to attack the two issues above. It acts like a node pool to lighten the memory allocation burden of creating node. A node in the bank can be indexed by coords quickly enough. We’ll focus on how to optimizing node finding in this section.
 +
 
 +
A bank is usually implemented in hash map where the coords is the key and the node is the value (though hash_map is not a standard part of c++ STL, it’s available as an extension in major STL implementations – SGI, Microsoft, etc.). If you don’t want to write the compare functor for the key (coords (x, y)) of hash_map, you can pack them into a double use a trick (assume sizeof(double) == 2*sizeof(int)):
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
union TO_DOUBLE{
 +
  double dbl;
 +
  struct two_int{
 +
  int x, y;
 +
  } ints;
 +
};
 +
</pre></span>
 +
and the bank will be defined as hash_map < double, ANode* >. To find a node:
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
find x y
 +
    TO_DOUBLE.two_int.x = x
 +
    TO_DOUBLE.two_int.y = y
 +
    return find TO_DOUBLE.dbl in bank
 +
</pre></span>
 +
hash map has a promise of constant time indexing, but that ‘constant time’ consists of several parts:
 +
i) key hashing
 +
ii) bucket list iterating
 +
iii) accurate matching in bucket
 +
While finding long paths, a profile may show that finding node in hash map is the bottleneck.
 +
 
 +
We can use direct indexing – a 2D array instead of a hash map to speed up:
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
vector < vector < ANode* > > bank;
 +
</pre></span>
 +
And to find the node which is indexed by (x, y):
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
bank[y][x]
 +
</pre></span>
 +
For a map of, say, 300*300 cells, the bank itself will cost about 300*300*sizeof(ANode*) which is about 350KB. For A* is a singleton, this memory impact is acceptable.
 +
 
 +
===Bank: Create Node and Reset===
 +
We do not fill up the bank with newed node at first. In A*’s constructor we just resize the bank to the size of map and initialize it with 0. During path finding, A* will ask the bank to create new node. The bank will use the x, y coords passed in to check if the according slot  is occupied by a node created in previous A*s. It’ll have to create new node if the slot is empty or just return that node in the slot – this process will warm up the bank with the newly created nodes.
 +
 
 +
Before perform a new path finding, we have to reset the nodes created by the bank during the last A* to make them available for this round. A hash map is still an appealing candidate to record the nodes in the bank to avoid brute-forcely iterating thru the 2D array of bank to reset. But we don’t need the quick-indexing ability (or cost) of hash map. Here a list will be better. Following code will create a node with coords <font color=Teal>(x, y)</font>, G value <font color=Teal>g</font>, and open/close table position <font color=Teal>place</font>:
 +
<span lang="EN-US" style='FONT-FAMILY:"Courier New"; color:Teal'><pre>
 +
ANode* Bank::createNode(int x, int y, int g, int place){
 +
ANode*& slot = bank_[y][x];
 +
if(slot){
 +
  //already in the bank,activate to make node available
 +
  slot->activate(x, y, g, place);
 +
}else{
 +
  //not in, create a new node
 +
  ANode* node = new ANode(x, y, g, place);
 +
  //set into bank
 +
  slot = node;
 +
  //record all nodes in bank
 +
  ptr_list_.push_back(&slot);
 +
}
 +
//record node used in this round of path finding
 +
ptr_list_busy_.push_back(&slot);
 +
return slot;
 +
}
 +
</pre></span>
 +
We use two lists above: ptr_list_ holds all nodes in the bank and ptr_list_busy_ just holds nodes used in this round of path finding.
 +
 
 +
To reset the bank, we only need iterate thru ptr_list_busy_ (instead of the whole bank) to deactivate all busy nodes. Then the ptr_list_busy_ itself is cleared to be used in next round. As the path finding going on, the bank is warmed up by the newed nodes - the allocating overhead gets lower and the memory cost becomes higher. A threshold can be set here to limit the total number of nodes in the bank to gain a balance between speed and space.
 +
 
 +
Another improvement is to define ptr_list_busy_ as a vector instead of a list. When packing pointers at the tail, a vector will double its size if the capacity reaches, while a list will have to allocate a new list node for each pointer. Using a vector will reduce some malloc overhead. But the improvement might be varying for different length of path, make sure you’d profiled.
 +
 
 +
===Iterating Neighbors===
 +
Iterating neighbors of current node might be another performance issue if not implemented properly. For the number of neighbors is different for some nodes (e.g. nodes on the edge of map), you might chose to new a vector or list of neighbors to iterating on. After profiling, you may find that the new operation is costly.
 +
 
 +
The solution is simple - use a fixed-size container that can hold max number of neighbors (usually 4~8, depending on cell’s connectivity) to avoid allocation. For iterating neighbors is heavily used in A*, the improvement can be significant.
 +
 
 +
Of course if you’d hard coded the neighbor-iterating code in main loop of A*, you’ll not be bothered by this issue (by losing some flexibility).
 +
 
 +
===Profile===
 +
While doing optimizing, it’s fatal to locate the hot spots by profiling instead of guessing. And after each modification we need to re-profile the code to make sure we do improve the performance.
 +
 
 +
You can use the clock() (in time.h) function to build a simple profiler – just divide the algorithm into several segments, count how many ticks each one cost, and how much they contribute to the overall ticks cost. But as you fine tune the bottleneck code and the performance improves, you may reach the point where clock() is not fine-grained enough to reveal the hot spot – clock() report that each part of the code just cost 0 ticks.
 +
 
 +
Commercial tools like DevPartners’ run-time performance analysis can do a much better work (by instrumenting stub into the code) like thorough percentages of ticks spent in call graph, profile result comparing etc.
 +
 
 +
===Next?===
 +
After these micro level techniques applied on A*, we can try some macro level path finding optimizing. Several of them are:
 +
 
 +
i) limit search spaces
 +
  limiting player or npcs’ moving range from whole map to several screens.
 +
ii) cache
 +
  recording searched path to avoid duplicate searching.
 +
iii) pre calculate
 +
iv) two-tiered or multi-tiered search
 +
    searching on a lower resolution version of a map then refine the result.
 +
 
 +
Of course these high level strategies will benefit from the A* optimizing we’d done. Check reference for more information.
 +
 
 +
And that’s it. Hope you might find the strategies discussed here useful for your projects. Questions, comments, etc, please [mailto:abrash_han@hotmail.com email me].
 +
 
 +
===Reference===
 +
[1] [http://www.gameprogramminggems.com/ Game Programming Gems 1], Section 3 Artificial Intelligence contains a few great articles on A* ranged from basic to advanced topics like aesthetic optimizations and speed optimizations.
 +
 
 +
[2] GameDev.net: [http://www.gamedev.net/reference/list.asp?categoryid=18#94" Path finding and Searching], In Justin Heyes-Jones and Patrick Lester’s article, they also give some tips on optimization.
 +
 
 +
[3] GameAI.com: [http://www.gameai.com/pathfinding.html Pathfinding],
 +
 
 +
[4] [http://www.policyalmanac.org/games/twoTiered.htm Two-Tiered A* Path finding],
 +
 
 +
[5] Amit’s Game Programming Page: [http://www-cs-students.stanford.edu/~amitp/gameprog.html A*].

Revision as of 14:09, 13 March 2007

"I always knew what the right path was.

Without exception, I knew, but I never took it." - Scent Of A Woman, Lieutenant Colonel Frank Slade

Contents

Path Finding

A*

When adapting A* algorithm to your game, you may find that sometimes it consumes too much CPU ticks – everything just stuck on the screen for a while (especially when path finding fails). Though A* itself has been there and well studied for quite a long time, the implementation flexibility has given us many choices which might lead to inefficiency. This article describes several bottlenecks emerged when developing path finding for a tile based RPG.

I didn’t start the algorithm from scratch – I just begin with Steve Rabin’s code in ‘Game Programming Gems I’ (3.5 A* Speed Optimizations). For using a priority queue (binary heaps) to implement open table is widely discussed, it’ll not be covered here.

And let’s check the first one…

Check Cell Blocking

After fetching a node from open table, A* will iterate thru all neighbor cells of the node, trying to add walkable ones to open table. For the neighbor who is already in open or close table, if the new path is better (a lower G value than the old one), A* will just refresh the state of the node (G value, in open or close table, and parent). How can we tell whether a cell is blocked or walkable?

In tile-based game, path finding is usually performed on the fringe layer (walls, trees, sceneries, and npcs etc.) where a cell might contain multiple game objects. For example, an npc can step into a cell and drop several items there. Each holdee may affect the cell’s blocking state:

 for each object in current cell
  if object can’t walk thru then
   return cell is blocked
  end
 end
 return cell is walkable

This procedure has two drawbacks that may slow down the whole A* algorithm: i) If a cell contains multi objects, the number of loops executed may depend on the order of those objects packed in the cell. ii) Object’s walkable state checking might be slow. It might be set by the map designer and can be only accessed thru script.

During game playing, we’ll notice that most cells’ blocking state is stable most of the time. We can eliminate the cost of querying by moving the blocking state to cell - if someone leaves, steps in, drops something into, take some item from, explode down a wall, lock or unlock a door in a cell, he is responsible for updating the cell’s blocking state (is_blocked).

And now to check if a cell is blocked is simple:

 return cell.is_blocked

Bank: Find Node

Another step of A* is to determine if a neighbor cell is already in open or close table (find a node that match the neighbor cell’s coords).If neither of the two tables contains the neighbor, a new node will be created based on cell’s coords and cost then inserted into open table. As the path finding going on, open and close table'll grow bigger. It’s not practical to iterate thru the whole table bruteforcely to find a node. We need some indexing mechanism to speed up. Also we want to reuse those hundreds of nodes malloced during the last path finding.

A node bank is introduced into A* to attack the two issues above. It acts like a node pool to lighten the memory allocation burden of creating node. A node in the bank can be indexed by coords quickly enough. We’ll focus on how to optimizing node finding in this section.

A bank is usually implemented in hash map where the coords is the key and the node is the value (though hash_map is not a standard part of c++ STL, it’s available as an extension in major STL implementations – SGI, Microsoft, etc.). If you don’t want to write the compare functor for the key (coords (x, y)) of hash_map, you can pack them into a double use a trick (assume sizeof(double) == 2*sizeof(int)):

 union TO_DOUBLE{
  double dbl;
  struct two_int{
   int x, y;
  } ints;
 };

and the bank will be defined as hash_map < double, ANode* >. To find a node:

 find x y
     TO_DOUBLE.two_int.x = x
     TO_DOUBLE.two_int.y = y
     return find TO_DOUBLE.dbl in bank

hash map has a promise of constant time indexing, but that ‘constant time’ consists of several parts: i) key hashing ii) bucket list iterating iii) accurate matching in bucket While finding long paths, a profile may show that finding node in hash map is the bottleneck.

We can use direct indexing – a 2D array instead of a hash map to speed up:

 vector < vector < ANode* > > bank;

And to find the node which is indexed by (x, y):

 bank[y][x]

For a map of, say, 300*300 cells, the bank itself will cost about 300*300*sizeof(ANode*) which is about 350KB. For A* is a singleton, this memory impact is acceptable.

Bank: Create Node and Reset

We do not fill up the bank with newed node at first. In A*’s constructor we just resize the bank to the size of map and initialize it with 0. During path finding, A* will ask the bank to create new node. The bank will use the x, y coords passed in to check if the according slot is occupied by a node created in previous A*s. It’ll have to create new node if the slot is empty or just return that node in the slot – this process will warm up the bank with the newly created nodes.

Before perform a new path finding, we have to reset the nodes created by the bank during the last A* to make them available for this round. A hash map is still an appealing candidate to record the nodes in the bank to avoid brute-forcely iterating thru the 2D array of bank to reset. But we don’t need the quick-indexing ability (or cost) of hash map. Here a list will be better. Following code will create a node with coords (x, y), G value g, and open/close table position place:

ANode* Bank::createNode(int x, int y, int g, int place){
ANode*& slot = bank_[y][x];
 if(slot){
  //already in the bank,activate to make node available
  slot->activate(x, y, g, place);
 }else{
  //not in, create a new node
  ANode* node = new ANode(x, y, g, place);
  //set into bank
  slot = node;
  //record all nodes in bank
  ptr_list_.push_back(&slot);
 }
 //record node used in this round of path finding
 ptr_list_busy_.push_back(&slot);
 return slot;
}

We use two lists above: ptr_list_ holds all nodes in the bank and ptr_list_busy_ just holds nodes used in this round of path finding.

To reset the bank, we only need iterate thru ptr_list_busy_ (instead of the whole bank) to deactivate all busy nodes. Then the ptr_list_busy_ itself is cleared to be used in next round. As the path finding going on, the bank is warmed up by the newed nodes - the allocating overhead gets lower and the memory cost becomes higher. A threshold can be set here to limit the total number of nodes in the bank to gain a balance between speed and space.

Another improvement is to define ptr_list_busy_ as a vector instead of a list. When packing pointers at the tail, a vector will double its size if the capacity reaches, while a list will have to allocate a new list node for each pointer. Using a vector will reduce some malloc overhead. But the improvement might be varying for different length of path, make sure you’d profiled.

Iterating Neighbors

Iterating neighbors of current node might be another performance issue if not implemented properly. For the number of neighbors is different for some nodes (e.g. nodes on the edge of map), you might chose to new a vector or list of neighbors to iterating on. After profiling, you may find that the new operation is costly.

The solution is simple - use a fixed-size container that can hold max number of neighbors (usually 4~8, depending on cell’s connectivity) to avoid allocation. For iterating neighbors is heavily used in A*, the improvement can be significant.

Of course if you’d hard coded the neighbor-iterating code in main loop of A*, you’ll not be bothered by this issue (by losing some flexibility).

Profile

While doing optimizing, it’s fatal to locate the hot spots by profiling instead of guessing. And after each modification we need to re-profile the code to make sure we do improve the performance.

You can use the clock() (in time.h) function to build a simple profiler – just divide the algorithm into several segments, count how many ticks each one cost, and how much they contribute to the overall ticks cost. But as you fine tune the bottleneck code and the performance improves, you may reach the point where clock() is not fine-grained enough to reveal the hot spot – clock() report that each part of the code just cost 0 ticks.

Commercial tools like DevPartners’ run-time performance analysis can do a much better work (by instrumenting stub into the code) like thorough percentages of ticks spent in call graph, profile result comparing etc.

Next?

After these micro level techniques applied on A*, we can try some macro level path finding optimizing. Several of them are:

i) limit search spaces

  limiting player or npcs’ moving range from whole map to several screens.

ii) cache

  recording searched path to avoid duplicate searching.

iii) pre calculate iv) two-tiered or multi-tiered search

   searching on a lower resolution version of a map then refine the result.

Of course these high level strategies will benefit from the A* optimizing we’d done. Check reference for more information.

And that’s it. Hope you might find the strategies discussed here useful for your projects. Questions, comments, etc, please email me.

Reference

[1] Game Programming Gems 1, Section 3 Artificial Intelligence contains a few great articles on A* ranged from basic to advanced topics like aesthetic optimizations and speed optimizations.

[2] GameDev.net: " Path finding and Searching, In Justin Heyes-Jones and Patrick Lester’s article, they also give some tips on optimization.

[3] GameAI.com: Pathfinding,

[4] Two-Tiered A* Path finding,

[5] Amit’s Game Programming Page: A*.

Personal tools