PHA*: Performing A* in Unknown Physical Environments. Ariel Felner, Bar-Ilan University. felner@cs.biu.ac.il Joint work with Roni Stern and Sarit Kraus. Appeared in Proc AAMAS02, Bologna, Italy. Journal version submitted to JAIR. Available at http://www.cs.biu.ac.il/~felner
Motivation: • Episode: Suppose a large army division must move from one location to another in unknown enemy territories. • Solution: Send scouts to explore the area and return with the best or optimal path. • The problem: find the optimal path between two locations in a real unknown physical environment.
Graphs in search problems • Known graphs: the entire graph is explicitly stored in memory. Ex: a city map. • Unknown or partially known graphs, of two kinds: • Very large graphs: only described implicitly. Ex: tile puzzle or Rubik's cube. • Small, partially known graphs. Our problem: an unknown graph in a real physical environment.
Optimal path search algorithms • For known graphs: algorithms such as Dijkstra's shortest path, Bellman-Ford or Floyd-Warshall. Complexity is polynomial in the number of nodes n, e.g. O(n^2) for a simple implementation of Dijkstra's algorithm. • For unknown, very large graphs: the A* algorithm, which is a best-first search algorithm.
Best-first search schema • Node expansion: takes a node and generates its neighbors. • Best-first search (BFS): sorts all generated nodes in an OPEN-LIST and chooses the node with the best cost value for expansion. • BFS depends on its cost (heuristic) function. Different functions cause BFS to expand different nodes.
A* • A* is a best-first search algorithm that uses f(n)=g(n)+h(n) as its cost function. Nodes are sorted in an open list according to their f-value. • g(n) is the cost of the shortest known path between the initial node and the current node n. • h(n) is an admissible heuristic estimate of the cost from n to the goal node. • A* is admissible, complete and optimally efficient. [Pearl 84] • Result: any other optimal search algorithm that uses the same heuristic will expand at least all the nodes expanded by A*.
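The A* loop described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the graph representation (a dict of dicts), the coordinate table and the names `a_star`, `example_graph` and `example_coords` are assumptions made for the example, and h is taken to be the straight-line distance to the goal.

```python
import heapq
import math

def a_star(graph, coords, start, goal):
    """Return (cost, path) of a shortest path, or (math.inf, []) if none exists.

    graph:  dict mapping node -> {neighbor: edge cost}   (assumed representation)
    coords: dict mapping node -> (x, y), used for the straight-line heuristic
    """
    def h(n):
        # Straight-line distance; admissible when edge costs are at least
        # the Euclidean distance between their endpoints.
        return math.dist(coords[n], coords[goal])

    g = {start: 0.0}                 # cost of the shortest known path to each node
    parent = {start: None}
    open_list = [(h(start), start)]  # heap ordered by f = g + h
    closed = set()
    while open_list:
        _, n = heapq.heappop(open_list)
        if n in closed:
            continue                 # stale heap entry, already expanded
        if n == goal:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return g[goal], path[::-1]
        closed.add(n)
        for m, cost in graph[n].items():   # node expansion
            new_g = g[n] + cost
            if new_g < g.get(m, math.inf):
                g[m] = new_g
                parent[m] = n
                heapq.heappush(open_list, (new_g + h(m), m))
    return math.inf, []

# Hypothetical four-node example: a unit square with one slow edge (B-G).
example_coords = {"R": (0, 0), "A": (1, 0), "B": (0, 1), "G": (1, 1)}
example_graph = {
    "R": {"A": 1.0, "B": 1.0},
    "A": {"R": 1.0, "G": 1.0},
    "B": {"R": 1.0, "G": 1.5},
    "G": {"A": 1.0, "B": 1.5},
}
print(a_star(example_graph, example_coords, "R", "G"))
```

In this example the optimal route goes through A; node B is still expanded because its f-value ties with the optimal cost.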
A* in physical environments • In virtual large graphs: the complexity is measured by the number of generated nodes and expanding a node is considered to be done in constant time in the memory. • In a real physical environment: expanding a node requires a mobile agent to travel to that node.
PHA*: Physical A*. • PHA*: an algorithm that expands all the nodes expanded by A*, but in an environment with physical characteristics. • PHA* finds the shortest path in such an environment. • Complexity is measured by the traveling effort of the agent. • We can omit the time and memory complexity of the computer because the graph is small.
Related work • Related approaches: Roadmap-A*, D*, RTA*, mapping and node exploration, automated route planning and navigation. • Limitations for this problem: some explore the whole graph, some require a priori knowledge of the terrain, and some do not find the optimal path.
Real practical applications. • An army division that sends a scout in order to find the optimal path for future use. • A dynamic (and thus unknown) computer network in which we want to transfer large files between two nodes. A small packet (operating as a scout) might explore the graph and return with the optimal solution.
PHA* description • At the beginning of the search: • The agent is at the initial node. • The only knowledge available is the coordinates of the initial node and the goal node. • The algorithm must expand all the nodes that A* expands. • We assume that when physically reaching a node we can learn about its neighbors. • A node is expandable if it is known a priori or was explored by the agent.
PHA*: Initial solution • Upper level: At each cycle choose to expand the node with the smallest f-value, exactly as A* does. g is known. h is the straight line heuristic. • Lower level: If the node selected by the upper level was not explored yet, the agent has to navigate to that node in order to learn about its neighbors.
Initial solution (cont.) • Insert the initial node and goal node into the open list. • Has the goal node been expanded? Yes: terminate. No: continue. • Upper level: t = the node with the smallest f-value in the open list. • Has the agent visited node t? No: lower level, navigate to node t. Yes: continue. • Expand node t and return to the goal test.
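The two-level loop in this flowchart can be simulated as below. This is a simplified sketch under stated assumptions, not the authors' implementation: the hidden environment is a dict `world` that the agent may only read for the node it is standing on, the lower level walks the shortest path through edges learned so far, nodes passed en route are not explored, and the search stops when the goal is selected for expansion. The names `pha_star`, `shortest_known_path`, `world` and `coords` are hypothetical.

```python
import heapq
import math

def shortest_known_path(known, src, dst):
    """Dijkstra over the edges learned so far; returns the travel distance."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, n = heapq.heappop(heap)
        if n == dst:
            return d
        if d > dist[n]:
            continue                          # stale heap entry
        for m, c in known.get(n, {}).items():
            if d + c < dist.get(m, math.inf):
                dist[m] = d + c
                heapq.heappush(heap, (d + c, m))
    return math.inf

def pha_star(world, coords, start, goal):
    """Return (optimal path cost, total distance travelled by the agent)."""
    def h(n):
        return math.dist(coords[n], coords[goal])   # straight-line heuristic

    known = {}                    # edges learned so far: node -> {neighbor: cost}
    g = {start: 0.0}
    open_list = [(h(start), start)]
    expanded = set()
    agent, travelled = start, 0.0
    while open_list:
        _, t = heapq.heappop(open_list)       # upper level: smallest f-value
        if t in expanded:
            continue
        if t == goal:
            return g[goal], travelled
        if agent != t:                        # lower level: physically go to t
            travelled += shortest_known_path(known, agent, t)
            agent = t
        for m, c in world[t].items():         # standing at t, learn its edges
            known.setdefault(t, {})[m] = c
            known.setdefault(m, {})[t] = c
        expanded.add(t)
        for m, c in world[t].items():         # ordinary A* expansion of t
            if g[t] + c < g.get(m, math.inf):
                g[m] = g[t] + c
                heapq.heappush(open_list, (g[m] + h(m), m))
    return math.inf, travelled

# Hypothetical four-node environment: a unit square with one slow edge (B-G).
coords = {"R": (0, 0), "A": (1, 0), "B": (0, 1), "G": (1, 1)}
world = {
    "R": {"A": 1.0, "B": 1.0},
    "A": {"R": 1.0, "G": 1.0},
    "B": {"R": 1.0, "G": 1.5},
    "G": {"A": 1.0, "B": 1.5},
}
print(pha_star(world, coords, "R", "G"))
```

In this run the optimal path costs 2, but the agent travels 3 units: it walks R to A, then back through R to B, because B must also be expanded before the goal is selected.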
Lower level navigation algorithms • The purpose of the lower level algorithm is to navigate the agent from its current location to the best node that was selected by the upper level. • Tree path: go through the search tree. • Shortest known path: go through the shortest known path. • Air path: go directly to the target node.
Example (figure): tree-path navigation on a graph with root R and nodes 1 to 6.
Example (figure): shortest-known-path navigation on the same graph.
Example (figure): air-path navigation on the same graph.
DFS-based navigation algorithms • During the navigation, learn and explore new nodes on the fly. This saves future work. • DFS-based searches: the order of node selection is determined according to a heuristic function. • P-DFS: go to the neighbor that is closest to the target. • D-DFS: go to the neighbor that is closest in direction to the target. • A*-DFS: go to the neighbor n that minimizes d(curr,n)+h(n,target).
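The three neighbor-selection rules can be compared in a short sketch. It assumes planar coordinates, takes both d and h to be straight-line distances, and reads "closest in direction" as the smallest angle between the edge to the neighbor and the line to the target. The function names and the coordinates are hypothetical.

```python
import math

def angle_between(u, v):
    """Angle in radians between two non-zero 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def p_dfs(curr, neighbors, target, coords):
    """P-DFS: pick the neighbor closest to the target."""
    return min(neighbors, key=lambda n: math.dist(coords[n], coords[target]))

def d_dfs(curr, neighbors, target, coords):
    """D-DFS: pick the neighbor whose direction deviates least from the target's."""
    cx, cy = coords[curr]
    to_target = (coords[target][0] - cx, coords[target][1] - cy)

    def deviation(n):
        to_n = (coords[n][0] - cx, coords[n][1] - cy)
        return angle_between(to_n, to_target)

    return min(neighbors, key=deviation)

def a_star_dfs(curr, neighbors, target, coords):
    """A*-DFS: pick the neighbor minimizing d(curr, n) + h(n, target)."""
    return min(
        neighbors,
        key=lambda n: math.dist(coords[curr], coords[n])
        + math.dist(coords[n], coords[target]),
    )

# Hypothetical layout in which each rule prefers a different neighbor.
coords = {"curr": (0, 0), "target": (10, 0), "P": (9, 4), "D": (20, 1), "A": (5, 1)}
neighbors = ["P", "D", "A"]
print(p_dfs("curr", neighbors, "target", coords),
      d_dfs("curr", neighbors, "target", coords),
      a_star_dfs("curr", neighbors, "target", coords))
```

Here D lies almost exactly in the target's direction but overshoots it, P ends nearest the target after a detour, and A is the best trade-off between step length and remaining distance.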
Example (figure): graph with root R and nodes 1 to 4, with labels A, D and P.
Problem (figure): graph with nodes R, C, G and 1 to 5.
Improved A*DFS (I-A*DFS) • The agent should prefer to navigate through nodes that are likely to be expanded in the near future, i.e. nodes with a small f-value. • curr = the agent's current location. • t = the node that the agent is navigating to. • According to A*DFS the agent will go to the neighboring node n that minimizes: c(n)=d(curr,n)+h(n,t)
I-A*DFS (cont.) • According to I-A*DFS the agent will go to the neighboring node n that minimizes a modified A*DFS cost (formula not preserved in this transcript). • If n is not in the open list, the cost is exactly the A*DFS cost. • If n is in the open list, its I-A*DFS cost decreases as f(n) gets closer to f(t), which is the smallest f-value in the open list.
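One cost function with the two properties listed on this slide is sketched below. The multiplicative discount and the constants `c1` and `c2` are assumptions chosen for illustration. The slide's own formula is not available here, so this should not be read as the authors' definition.

```python
def ia_dfs_cost(base_cost, f_n, f_t, in_open, c1=0.25, c2=2.0):
    """Hypothetical I-A*DFS cost for a neighbor n.

    base_cost: the A*DFS cost d(curr, n) + h(n, t)
    f_n:       f-value of n (only used when n is in the open list)
    f_t:       smallest f-value in the open list, so 0 < f_t <= f_n
    in_open:   whether n is currently in the open list
    c1, c2:    assumed tuning constants, with 0 <= c1 < 1 and c2 > 0
    """
    if not in_open:
        return base_cost                 # identical to A*DFS
    # f_t / f_n lies in (0, 1]; the discount grows as f_n approaches f_t.
    return base_cost * (1.0 - c1 * (f_t / f_n) ** c2)

print(ia_dfs_cost(10.0, 5.0, 5.0, in_open=False))   # unchanged A*DFS cost
print(ia_dfs_cost(10.0, 10.0, 5.0, in_open=True))   # small discount
print(ia_dfs_cost(10.0, 5.0, 5.0, in_open=True))    # largest discount
```

With a form like this, a neighbor that is about to be expanded anyway looks cheaper to walk through, which is the preference the slides describe.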
Experiments • Delaunay graphs are graphs created by Delaunay triangulation of random points (uniform distribution) on a unit square.
Experiments (cont.) • The experiments were done on Delaunay graphs, deleting and adding random edges.
Improving the upper level • So far the nodes were expanded (by the upper level) in a best-first order according to their f-value. • However, it might be better to first expand nodes that are closer to the agent. (Figure: graph with nodes R, 1 to 7 and G.)
Win-A* • We want to expand a node that is in a good position in the open list but also close to the agent. • We define a window of size S at the front of the open list and choose to expand the node n in that window that minimizes: c(n)=f(n)*d(curr,n)
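The window rule can be sketched as follows. It assumes the open list holds (f, node) pairs and that the distance from the agent to a node is the straight-line distance. The names `win_a_star_select`, `open_list` and `coords` are hypothetical.

```python
import heapq
import math

def win_a_star_select(open_list, curr, coords, window_size):
    """Pick the node to expand next from the front of the open list.

    open_list:   iterable of (f_value, node) pairs
    window_size: S, the number of best-f entries to consider
    """
    window = heapq.nsmallest(window_size, open_list)   # the S smallest f-values
    # c(n) = f(n) * d(curr, n): good f-value and short trip for the agent
    _, best = min(
        window,
        key=lambda entry: entry[0] * math.dist(coords[curr], coords[entry[1]]),
    )
    return best

# Hypothetical open list: "a" has the best f-value but is far from the agent.
coords = {"curr": (0, 0), "a": (10, 0), "b": (1, 0), "c": (2, 0), "d": (0.1, 0)}
open_list = [(10.0, "a"), (11.0, "b"), (12.0, "c"), (30.0, "d")]
print(win_a_star_select(open_list, "curr", coords, window_size=3))
```

With a window of size 1 the rule reduces to plain best-first order. Larger windows trade strict f-ordering for shorter trips.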
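Win-A* (flowchart) • Is the goal node in the closed list? Yes: terminate. No: continue. • Is the best node in the open list marked as EXPANDED? Yes: move the best node in the open list to the closed list and return to the goal test. No: continue. • t = the node in the window that minimizes c(n). • Lower level: send an agent to expand t. • Expand t, mark it as EXPANDED and return to the goal test.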
MAPHA*: multi-agent PHA* • Two different efficiency considerations: • Time: the time from the beginning of the search until the best path to the goal node is found. Corresponding algorithm: the time-efficient algorithm. • Fuel: the cost of mobilizing the agents. Corresponding algorithm: the fuel-efficient algorithm. • We assume full knowledge sharing. • We only use I-A*DFS and Win-A*.
Fuel efficient algorithm • Fuel efficient algorithm:find the optimal path to the goal node, while spending as little fuel as possible. • We will only move one agent at a time. There is no benefit from moving more than one agent at a time. • We use the same algorithm but we only have to decide which agent to move to the target node.
Which agent to move? • At the beginning all agents are at the initial state. • At each stage the upper level defines a window of nodes at the front of the open list. • For each agent a and each node n from the window, define an allocation function: c(a,n)=f(n)*d(a,n). • We choose the agent a and the node n that minimize c(a,n). • Ties (such as at the beginning) are broken randomly.
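The allocation rule can be sketched as below, assuming that d(a,n) is the straight-line distance from agent a's position to node n and that the window is a list of (f, node) pairs. The function and variable names are hypothetical.

```python
import math
import random

def choose_agent_and_node(agent_pos, window, coords, rng=random):
    """Return the (agent, node) pair minimizing c(a, n) = f(n) * d(a, n).

    agent_pos: dict mapping agent -> (x, y)
    window:    list of (f_value, node) pairs from the front of the open list
    rng:       source of randomness used only to break ties
    """
    costs = {
        (a, n): f * math.dist(pos, coords[n])
        for a, pos in agent_pos.items()
        for f, n in window
    }
    best = min(costs.values())
    ties = [pair for pair, c in costs.items() if c == best]
    return rng.choice(ties)          # random tie breaking, as on the slide

# Hypothetical situation with two agents and a window of two nodes.
coords = {"x": (4, 0), "y": (1, 0)}
agent_pos = {"a1": (0, 0), "a2": (5, 0)}
window = [(10.0, "x"), (12.0, "y")]
print(choose_agent_and_node(agent_pos, window, coords))
```

Agent a2 is sent to x here: x has the better f-value and a2 is one unit away from it, so the product is smaller than for any other pairing.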
Time efficient algorithm • Time efficient algorithm: here we want to find the solution as fast as possible. • All agents are always moving in order to save time. • If we have p available agents and q nodes in the window we want to distribute these p agents to these q nodes efficiently.
Time-efficient algorithm • We want the distribution function to be biased in favor of nodes with a good f-value. • We want to favor the following nodes: • Nodes at the front of the open list. • Nodes that are close to agents. • Nodes that are unpopular, i.e. nodes to which no agent has been assigned.
Distributing agents to nodes • We iterate on the agents with a formula (not preserved in this transcript) that combines three terms: an approximation of the time it would take the agent to get to that node, how many agents are already assigned to go to that node, and the importance of exploring that node. • Ties are broken according to the f-values of the nodes. • In the example (figure labels 2, 4, 8), the numbers of agents assigned were 57, 29 and 19.
Simulations, fuel-efficient algorithm (figure: one curve each for graphs of 500, 1000, 2000 and 4000 nodes).
Results, time-efficient algorithm (figure: one curve each for graphs of 500, 1000, 2000, 4000 and 8000 nodes).
Conclusions • The most complex single-agent search was more than 10 times faster than the trivial one. • For the fuel-efficient algorithm there exists an optimal number of agents. • For the time-efficient algorithm with many agents, the time cost converges to the optimal solution.
Future research • Build a “real system” with robots. • Assume that a node is only known when the agent actually reaches it. • Combine costs: time and fuel together. • Other communication paradigms. • Solve other graph problems, like MST and TSP, for unknown physical graphs.
Ant-robotic A* (ideas for the future) • Ants communicate with pheromones that are stored at the nodes of the graph. • New idea: a pheromone that holds the whole graph. Memory requirements: 1 Mega per node. • Communication paradigm: each agent and each node will contain a database of the complete graph. When an agent visits a node, the two merge their data.
Ant-robotic A* • Idea: two types of agents. • Searching agent: activates the search algorithm. • Communication agent: walks around the environment (hopefully on the search frontier) and spreads the data around. • Agents can switch tasks during the search. • PHA* is only a test case for such a paradigm.