Dave Reed
• applications of informed search
  • optimization problems: Algorithm A, admissibility, A*
  • zero-sum game playing: minimax principle, alpha-beta pruning
Optimization problems
• consider a related search problem: instead of finding the shortest path (i.e., fewest moves) to a solution, suppose we want to minimize some cost
  EXAMPLE: airline travel problem
  • could associate costs with each flight, try to find the cheapest route
  • could associate distances with each flight, try to find the shortest route
• we could use a strategy similar to breadth first search
  • repeatedly extend the minimal cost path
  • search is guided by the cost of the path so far
  • but such a strategy ignores heuristic information
• would like to utilize a best first approach, but it is not directly applicable
  • search is guided by the remaining cost of the path
• IDEAL: combine the intelligence of both strategies
  • cost-so-far component of breadth first search (to optimize actual cost)
  • cost-remaining component of best first search (to make use of heuristics)
Algorithm A
• associate 2 costs with a path:
  g        actual cost of the path so far
  h        heuristic estimate of the remaining cost to the goal*
  f = g + h    combined heuristic cost estimate
  *note: the heuristic value is inverted relative to best first (here, lower is better)
• Algorithm A: best first search using f as the heuristic
Travel problem revisited
• g: cost is actual distances per flight
• h: cost estimate is crow-flies distance

%%% h(Loc, Goal, Value) : Value is crow-flies
%%% distance from Loc to Goal
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
h(loc(omaha), loc(los_angeles), 1700).
h(loc(chicago), loc(los_angeles), 2200).
h(loc(denver), loc(los_angeles), 1400).
h(loc(los_angeles), loc(los_angeles), 0).
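• as a quick worked check against the flight data on the Travel example slide below (my own arithmetic, not part of the original deck), extending the start path from Omaha gives

    Omaha→Denver:   f = g + h = 600 + 1400 = 2000
    Omaha→Chicago:  f = g + h = 500 + 2200 = 2700

  so the Denver leg looks more promising and is expanded first, which matches the first answer Algorithm A returns in the Travel example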
Algorithm A implementation

%%% Algorithm A (for trees)
%%% atree(Current, Goal, Path): Path is a list of states
%%% that lead from Current to Goal with no duplicate states.
%%%
%%% atree_help(ListOfPaths, Goal, Path): Path is a list of
%%% states (with associated F and G values) that lead from one
%%% of the paths in ListOfPaths (a list of lists) to Goal with
%%% no duplicate states.
%%%
%%% extend(G:Path, Goal, ListOfPaths): ListOfPaths is the list
%%% of all possible paths (with associated F and G values) obtainable
%%% by extending Path (at the head) with no duplicate states.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

atree(State, Goal, G:Path) :-
    atree_help([0:0:[State]], Goal, G:RevPath),
    reverse(RevPath, Path).

atree_help([_:G:[Goal|Path]|_], Goal, G:[Goal|Path]).
atree_help([_:G:Path|RestPaths], Goal, SolnPath) :-
    extend(G:Path, Goal, NewPaths),
    append(RestPaths, NewPaths, TotalPaths),
    sort(TotalPaths, SortedPaths),
    atree_help(SortedPaths, Goal, SolnPath).

extend(G:[State|Path], Goal, NewPaths) :-
    bagof(NewF:NewG:[NextState,State|Path],
          Cost^H^(move(State, NextState, Cost),
                  not(member(NextState, [State|Path])),
                  h(NextState, Goal, H),
                  NewG is G+Cost,
                  NewF is NewG+H),
          NewPaths), !.
extend(_, _, []).

differences from best first search:
• associate two values with each path (F is the total estimated cost, G is the actual cost so far), so each entry has the form F:G:Path
• since extend needs to know the cost of the current path, G must be passed along
• new feature of bagof: if a variable appears only in the 2nd argument (here Cost and H), it must be marked with ^ so that bagof does not backtrack over its different bindings
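• to see what the ^ annotation does, here is a tiny standalone illustration (mine, not from the deck); with Z^ the call collects everything into one bag, while without it bagof backtracks over each binding of the free variable Z:

    ?- bagof(X-Y, Z^member(X-Y-Z, [a-1-p, a-2-q, b-3-r]), L).
    L = [a-1, a-2, b-3]

    ?- bagof(X-Y, member(X-Y-Z, [a-1-p, a-2-q, b-3-r]), L).
    Z = p, L = [a-1] ;
    Z = q, L = [a-2] ;
    Z = r, L = [b-3]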
Travel example

%%% travelcost.pro    Dave Reed    3/15/02
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
move(loc(omaha), loc(chicago), 500).
move(loc(omaha), loc(denver), 600).
move(loc(chicago), loc(denver), 1000).
move(loc(chicago), loc(los_angeles), 2200).
move(loc(chicago), loc(omaha), 500).
move(loc(denver), loc(los_angeles), 1400).
move(loc(denver), loc(omaha), 600).
move(loc(los_angeles), loc(chicago), 2200).
move(loc(los_angeles), loc(denver), 1400).

%%% h(Loc, Goal, Value) : Value is crow-flies
%%% distance from Loc to Goal
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
h(loc(omaha), loc(los_angeles), 1700).
h(loc(chicago), loc(los_angeles), 2200).
h(loc(denver), loc(los_angeles), 1400).
h(loc(los_angeles), loc(los_angeles), 0).

?- atree(loc(omaha), loc(los_angeles), Path).
Path = 2000:[loc(omaha), loc(denver), loc(los_angeles)] ;
Path = 2700:[loc(omaha), loc(chicago), loc(los_angeles)] ;
Path = 2900:[loc(omaha), loc(chicago), loc(denver), loc(los_angeles)] ;
No

• note: Algorithm A finds the path with least cost (here, distance), not necessarily the path with fewest steps
• suppose the flight from Chicago to L.A. were 2500 miles (instead of 2200): then Omaha→Chicago→Denver→L.A. (500 + 1000 + 1400 = 2900) would be shorter than Omaha→Chicago→L.A. (500 + 2500 = 3000)
8-puzzle example

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% h(Board, Goal, Value) : Value is the number of tiles out of place

h(tiles(Row1,Row2,Row3), tiles(Goal1,Goal2,Goal3), Value) :-
    diff_count(Row1, Goal1, V1),
    diff_count(Row2, Goal2, V2),
    diff_count(Row3, Goal3, V3),
    Value is V1 + V2 + V3.

diff_count([], [], 0).
diff_count([H1|T1], [H2|T2], Count) :-
    diff_count(T1, T2, TCount),
    ( H1 = H2, Count is TCount
    ; H1 \= H2, Count is TCount+1 ).

• g: actual cost of each move is 1
• h: remaining cost estimate is # of tiles out of place

?- atree(tiles([1,2,3],[8,6,space],[7,5,4]),
         tiles([1,2,3],[8,space,4],[7,6,5]), Path).
Path = 3:[tiles([1, 2, 3], [8, 6, space], [7, 5, 4]),
          tiles([1, 2, 3], [8, 6, 4], [7, 5, space]),
          tiles([1, 2, 3], [8, 6, 4], [7, space, 5]),
          tiles([1, 2, 3], [8, space, 4], [7, 6, 5])]
Yes

?- atree(tiles([2,3,4],[1,8,space],[7,6,5]),
         tiles([1,2,3],[8,space,4],[7,6,5]), Path).
Path = 5:[tiles([2, 3, 4], [1, 8, space], [7, 6, 5]),
          tiles([2, 3, space], [1, 8, 4], [7, 6, 5]),
          tiles([2, space, 3], [1, 8, 4], [7, 6, 5]),
          tiles([space, 2, 3], [1, 8, 4], [7, 6, 5]),
          tiles([1, 2|...], [space, 8|...], [7, 6|...]),
          tiles([1|...], [8|...], [7|...])]
Yes

• here, Algorithm A finds the same paths as best first search
  • not surprising, since g is trivial
  • still, not guaranteed to be the case
Algorithm A vs. hill-climbing
• if the cost estimate function h is perfect, then f is a perfect heuristic
  • Algorithm A is then deterministic
• if we know the actual costs for each state, Alg. A reduces to hill-climbing
Admissibility
• in general, actual costs are unknown at start – must rely on heuristics
• if the heuristic is imperfect, Alg. A is NOT guaranteed to find an optimal solution
• if a control strategy is guaranteed to find an optimal solution (when a solution exists), we say it is admissible
• if the cost estimate h never overestimates the actual cost, then Alg. A is admissible
  • (when admissible, Alg. A is commonly referred to as Alg. A*)
Admissible examples
• is our heuristic for the travel problem admissible?
  • h(State, Goal) = crow-flies distance from State to Goal
• is our heuristic for the 8-puzzle admissible?
  • h(State, Goal) = number of tiles out of place, including the space
• is our heuristic for Missionaries & Cannibals admissible?
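• one way to probe the 8-puzzle question (a query I added against the h/3 definition on the 8-puzzle slide; the observation is mine, not the deck's): for the first start board used earlier, h counts the space and returns 4, yet the solution found there needed only 3 moves, so counting the space can make h overestimate the actual cost

    ?- h(tiles([1,2,3],[8,6,space],[7,5,4]), tiles([1,2,3],[8,space,4],[7,6,5]), V).
    V = 4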
Cost of the search
• the closer h is to the actual cost function, the fewer states considered
• however, the cost of computing h tends to go up as it improves
  • the best algorithm is the one that minimizes the total cost of the search (states considered + heuristic computation)
• also, admissibility is not always needed or desired
  • Graceful Decay of Admissibility: if h rarely overestimates the actual cost by more than D, then Alg. A will rarely find a solution whose cost exceeds optimal by more than D
Algorithm A (for graphs)
• representing the search space as a tree is conceptually simpler
  • but must store entire paths, which include duplicates of states
• more efficient to store the search space as a graph
  • store states without duplicates; need only remember the parent of each state (so that the solution path can be reconstructed at the end)
• IDEA: keep 2 lists of states (along with f & g values and a parent pointer)
  OPEN: states reached by the search, but not yet expanded
  CLOSED: states reached and already expanded
• while more efficient, the graph implementation is trickier
  • when a state moves to the CLOSED list, it may not be finished
  • may have to revise values of states if new (better) paths are found (see the sketch below)
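• a minimal sketch of how the graph version might be coded (my illustration, not the course's code), assuming the same move/3 and h/3 predicates used by atree, plus SWI-Prolog's selectchk/3 and memberchk/2; each reached state is stored exactly once as node(State, Parent, G, F), and only the parent pointer is kept, so the solution path is rebuilt from CLOSED at the end:

    agraph(Start, Goal, Cost:Path) :-
        h(Start, Goal, H),
        agraph_help([node(Start, none, 0, H)], [], Goal, Cost, Closed),
        build_path(Goal, Closed, [], Path).

    %% pick the OPEN node with the smallest F value, expand it, repeat
    agraph_help(Open, Closed, Goal, Cost, FinalClosed) :-
        best_node(Open, node(S, P, G, F)),
        (  S == Goal
        -> Cost = G, FinalClosed = [node(S, P, G, F)|Closed]
        ;  selectchk(node(S, P, G, F), Open, RestOpen),
           findall(node(S2, S, G2, F2),
                   ( move(S, S2, C), G2 is G + C, h(S2, Goal, H2), F2 is G2 + H2 ),
                   Succs),
           add_succs(Succs, RestOpen, [node(S, P, G, F)|Closed], NewOpen, NewClosed),
           agraph_help(NewOpen, NewClosed, Goal, Cost, FinalClosed)
        ).

    best_node([N], N).
    best_node([node(S, P, G, F), N|Ns], Best) :-
        best_node([N|Ns], node(S2, P2, G2, F2)),
        ( F =< F2 -> Best = node(S, P, G, F) ; Best = node(S2, P2, G2, F2) ).

    %% merge successors into OPEN/CLOSED, keeping only the cheapest copy of each
    %% state; a CLOSED state is "reopened" if a cheaper path to it is found
    add_succs([], Open, Closed, Open, Closed).
    add_succs([node(S, P, G, F)|Rest], Open, Closed, NewOpen, NewClosed) :-
        (  ( member(node(S, _, OldG, _), Open)
           ; member(node(S, _, OldG, _), Closed) ),
           OldG =< G
        -> add_succs(Rest, Open, Closed, NewOpen, NewClosed)      % no improvement
        ;  drop_state(S, Open, Open1),                            % new or cheaper:
           drop_state(S, Closed, Closed1),                        % (re)open it
           add_succs(Rest, [node(S, P, G, F)|Open1], Closed1, NewOpen, NewClosed)
        ).

    drop_state(_, [], []).
    drop_state(S, [node(S, _, _, _)|Rest], Rest) :- !.
    drop_state(S, [N|Rest], [N|Rest1]) :- drop_state(S, Rest, Rest1).

    %% rebuild the solution path by following parent pointers back through CLOSED
    build_path(none, _, Path, Path) :- !.
    build_path(S, Closed, Acc, Path) :-
        memberchk(node(S, Parent, _, _), Closed),
        build_path(Parent, Closed, [S|Acc], Path).

  on the travel data, agraph(loc(omaha), loc(los_angeles), Answer) returns the single optimal answer 2000:[loc(omaha), loc(denver), loc(los_angeles)]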
Flashlight example
• consider the flashlight puzzle discussed in class:

  Four people are on one side of a bridge. They wish to cross to the other side, but the bridge can only take the weight of two people at a time. It is dark and they only have one flashlight, so they must share it in order to cross the bridge. Assuming each person moves at a different speed (able to cross in 1, 2, 5 and 10 minutes, respectively), find a series of crossings that gets all four across in the minimal amount of time.

• state representation?
• cost of a move?
• heuristic?
Flashlight implementation
• state representation must identify the locations of each person and the flashlight
  bridge(SetOfPeopleOnLeft, SetOfPeopleOnRight, FlashlightLocation)
  • note: can use a list to represent a set, but must be careful of permutations
    • e.g., [1,2,5,10] and [1,5,2,10] denote the same set, so must make sure there is only one list representation per set
    • solution: maintain the lists in sorted order, so only one permutation is possible
• only 3 kinds of moves:
  • if the flashlight is on the left and only 1 person is on the left, then move that person to the right (cost is the time it takes that person)
  • if the flashlight is on the left and at least 2 people are on the left, then select 2 people from the left and move them to the right (cost is the max time of the two)
  • if the flashlight is on the right, then select a person from the right and move them to the left (cost is the time for that person)
• heuristic:
  h(State, Goal) = number of people in the wrong place
Flashlight code

%%% flashlight.pro    Dave Reed    3/15/02
%%%
%%% This file contains the state space definition
%%% for the flashlight puzzle.
%%%
%%% bridge(Left, Right, Loc): Left is a (sorted) list of people
%%% on the left shore, Right is a (sorted) list of people on
%%% the right shore, and Loc is the location of the flashlight.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

move(bridge([P], Right, left), bridge([], NewRight, right), P) :-
    merge([P], Right, NewRight).
move(bridge(Left, Right, left), bridge(NewLeft, NewRight, right), Cost) :-
    length(Left, L), L >= 2,
    select(P1, Left, Remain),
    select(P2, Remain, NewLeft),
    merge([P1], Right, TempRight),
    merge([P2], TempRight, NewRight),
    Cost is max(P1, P2).
move(bridge(Left, Right, right), bridge(NewLeft, NewRight, left), P) :-
    select(P, Right, NewRight),
    merge([P], Left, NewLeft).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% h(State, Goal, Value) : Value is # of people out of place

h(bridge(Left, Right, _), bridge(GoalLeft, GoalRight, _), Value) :-
    diff_count(Left, GoalLeft, V1),
    diff_count(Right, GoalRight, V2),
    Value is V1 + V2.

diff_count([], _, 0).
diff_count([H|T], L, Count) :-
    diff_count(T, L, TCount),
    ( member(H, L), !, Count is TCount
    ; Count is TCount+1 ).

• merge(L1,L2,L3): L3 is the result of merging sorted lists L1 & L2
• select(X,L,R): R is the remains after removing X from list L
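• for instance, this query (mine, run against the h/3 clauses just above) scores the state after the 1- and 2-minute people have crossed: two people are still out of place

    ?- h(bridge([5,10],[1,2],right), bridge([],[1,2,5,10],right), V).
    V = 2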
Flashlight answers
• Algorithm A finds optimal solutions to the puzzle
  • note: more than one solution is optimal
  • can use ';' to enumerate solutions, from best to worst

?- atree(bridge([1,2,5,10],[],left), bridge([],[1,2,5,10],right), Path).
Path = 17:[bridge([1, 2, 5, 10], [], left), bridge([5, 10], [1, 2], right),
           bridge([1, 5, 10], [2], left), bridge([1], [2, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)] ;
Path = 17:[bridge([1, 2, 5, 10], [], left), bridge([5, 10], [1, 2], right),
           bridge([2, 5, 10], [1], left), bridge([2], [1, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)] ;
Path = 19:[bridge([1, 2, 5, 10], [], left), bridge([2, 5], [1, 10], right),
           bridge([1, 2, 5], [10], left), bridge([2], [1, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)]
Yes
Search in game playing
• consider games involving:
  • 2 players
  • perfect information
  • zero-sum (player's gain is opponent's loss)
  • examples: tic-tac-toe, checkers, chess, othello, …
  • non-examples: poker, backgammon, prisoner's dilemma, …
• von Neumann (the father of game theory) showed that for such games, there is always a "rational" strategy
  • that is, can always determine a best move, assuming the opponent is equally rational
Game trees
• idea: model the game as a search tree
• associate a value with each game state (possible since zero-sum)
  • player 1 wants to maximize the state value (call him/her MAX)
  • player 2 wants to minimize the state value (call him/her MIN)
• players alternate turns, so differentiate MAX and MIN levels in the tree
• the leaves of the tree will be end-of-game states
Minimax search
• minimax search:
  • at a MAX level, take the maximum of all possible moves
  • at a MIN level, take the minimum of all possible moves
• can visualize the search bottom-up (start at leaves, work up to root)
• likewise, can search top-down using recursion
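• a compact sketch of this principle in Prolog (my illustration, not course code); it assumes two hypothetical predicates, game_move(State, Next) for the legal moves and game_value(State, Value) for the value of an end-of-game state, and it assumes every non-terminal state has at least one move:

    %% top-down recursive minimax
    minimax(State, _, Value) :-
        game_value(State, Value), !.               % leaf: end-of-game state
    minimax(State, max, Value) :-                  % MAX level: maximum over moves
        findall(V, (game_move(State, S), minimax(S, min, V)), Vs),
        max_list(Vs, Value).
    minimax(State, min, Value) :-                  % MIN level: minimum over moves
        findall(V, (game_move(State, S), minimax(S, max, V)), Vs),
        min_list(Vs, Value).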
Minimax in practice
• while the Minimax Principle holds for all 2-player, perfect information, zero-sum games, an exhaustive search to find the best move may be infeasible
  EXAMPLE: an average chess game has ~100 moves with ~35 options per move
  • ~35^100 states in the search tree!
• practical alternative: limit the search depth and use heuristics
  • expand the search tree a limited number of levels (limited look-ahead)
  • evaluate the "pseudo-leaves" using a heuristic
    high value  good for MAX
    low value   good for MIN
  • back up the heuristic estimates to determine the best-looking move
    at a MAX level, take the maximum
    at a MIN level, take the minimum
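• the limited look-ahead idea can be grafted onto the sketch above (again my own sketch, with the same hypothetical game_move/2 and game_value/2, plus a heuristic/2 estimate such as the tic-tac-toe measure on the next slide): a depth counter is threaded through, and when it reaches 0 the pseudo-leaf is scored by the heuristic instead of being expanded

    minimax_dl(State, _, _, Value) :-
        game_value(State, Value), !.               % true end-of-game state
    minimax_dl(State, _, 0, Value) :-
        !, heuristic(State, Value).                % look-ahead exhausted: estimate
    minimax_dl(State, Player, Depth, Value) :-
        Depth > 0,
        D1 is Depth - 1,
        other(Player, Opponent),
        findall(V, (game_move(State, S), minimax_dl(S, Opponent, D1, V)), Vs),
        ( Player == max -> max_list(Vs, Value) ; min_list(Vs, Value) ).

    other(max, min).
    other(min, max).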
Tic-tac-toe example
• heuristic(State) =
    1000   if State is a win for MAX (X)
    -1000  if State is a win for MIN (O)
    (# of rows/cols/diags open for MAX) – (# of rows/cols/diags open for MIN)   otherwise
• suppose a look-ahead of 2 moves
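• a sketch of this evaluation function (my own encoding; the board representation, a 9-element list with cells x, o, or e for empty, is an assumption, not from the deck):

    line([1,2,3]).  line([4,5,6]).  line([7,8,9]).     % rows
    line([1,4,7]).  line([2,5,8]).  line([3,6,9]).     % columns
    line([1,5,9]).  line([3,5,7]).                     % diagonals

    win(Player, Board) :-
        line(L),
        forall(member(I, L), nth1(I, Board, Player)).

    %% a line is "open" for a player if the opponent occupies none of its squares
    open_lines(Player, Board, Count) :-
        other_player(Player, Opponent),
        findall(L, ( line(L), \+ (member(I, L), nth1(I, Board, Opponent)) ), Ls),
        length(Ls, Count).

    other_player(x, o).
    other_player(o, x).

    heuristic(Board, 1000)  :- win(x, Board), !.       % win for MAX
    heuristic(Board, -1000) :- win(o, Board), !.       % win for MIN
    heuristic(Board, Value) :-
        open_lines(x, Board, OpenX),
        open_lines(o, Board, OpenO),
        Value is OpenX - OpenO.

  for example, with only an X in the center square, 8 lines remain open for X and 4 for O, so heuristic([e,e,e,e,x,e,e,e,e], V) gives V = 4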
α-β bounds
• sometimes, it isn't necessary to search the entire tree
• α-β technique: associate bounds with states in the search
  • associate a lower bound α with MAX nodes: its value can only increase
  • associate an upper bound β with MIN nodes: its value can only decrease
α-β pruning
• discontinue search below a MIN node if its β value <= the α value of a MAX ancestor
• discontinue search below a MAX node if its α value >= the β value of a MIN ancestor
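• a sketch of minimax with α-β pruning (mine, not the course's code), using the same hypothetical game_value/2 and game_move/2 predicates as the minimax sketch earlier, and again assuming non-terminal states have at least one move; call it as, e.g., alphabeta(Start, max, -1000, 1000, Value):

    alphabeta(State, _, _, _, Value) :-
        game_value(State, Value), !.               % end-of-game state
    alphabeta(State, max, Alpha, Beta, Value) :-
        findall(S, game_move(State, S), Succs),
        eval_max(Succs, Alpha, Beta, Value).
    alphabeta(State, min, Alpha, Beta, Value) :-
        findall(S, game_move(State, S), Succs),
        eval_min(Succs, Alpha, Beta, Value).

    %% at a MAX node, Alpha can only increase; stop as soon as Alpha >= Beta
    eval_max([], Alpha, _, Alpha).
    eval_max([S|Rest], Alpha, Beta, Value) :-
        alphabeta(S, min, Alpha, Beta, V),
        NewAlpha is max(Alpha, V),
        (  NewAlpha >= Beta
        -> Value = NewAlpha                        % prune the remaining siblings
        ;  eval_max(Rest, NewAlpha, Beta, Value) ).

    %% at a MIN node, Beta can only decrease; stop as soon as Beta =< Alpha
    eval_min([], _, Beta, Beta).
    eval_min([S|Rest], Alpha, Beta, Value) :-
        alphabeta(S, max, Alpha, Beta, V),
        NewBeta is min(Beta, V),
        (  NewBeta =< Alpha
        -> Value = NewBeta                         % prune the remaining siblings
        ;  eval_min(Rest, Alpha, NewBeta, Value) ).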
Tic-tac-toe example
• α-β vs. minimax:
  • worst case: α-β examines as many states as minimax
  • best case: assuming branching factor B and depth D, α-β examines ~2B^(D/2) states (i.e., as many as minimax on a tree with half the depth)