Problem Solving • Game Playing • Planning
Game Playing • Attractive AI problem, because it is abstract • One of the oldest domains of AI • In most cases, the world state is fully accessible • Computer representation of the situation can be clear and exact
Game Playing • Challenging: uncertainty is introduced by the opponent, and the problem is complex (full search is impossible) • Hard: in chess the branching factor is about 35 and games run to about 50 moves by each player, giving roughly 35^100 nodes to search, even though there are only about 10^40 distinct legal board positions.
Games vs. Search Problems • “Unpredictable” opponent: the solution is a contingency plan • Time limits: unlikely to find the goal, so we must approximate. Plan of attack: • Algorithm for perfect play (Von Neumann, 1944) • Finite horizon, approximate evaluation (Zuse, 1945; Shannon, 1950; Samuel, 1952–57) • Pruning to reduce costs (McCarthy, 1956)
Two-Person Perfect Information Game Two agents, the players, move in turn until one of them wins or the result is a draw. Each player has a complete and perfect model of the environment.
Two-Person Perfect Information Game • initial state: initial position and who goes first • operators: legal moves • terminal test: is the game over? • utility function: outcome (win: +1, lose: -1, draw: 0, etc.) • two players (MIN and MAX) take turns, each trying to maximize their chance of winning (each turn generates one ply) • one player’s victory is the other’s defeat • we need a strategy that wins no matter what the opponent does (a code sketch of this formalization follows below)
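This formalization maps naturally onto an abstract interface. A minimal sketch with hypothetical class and method names (none of them come from the slides):

```python
# A minimal sketch of the game formalization above; the class and method
# names are assumptions made for this example, not from the slides.
class Game:
    def initial_state(self):
        """Initial position, including which player (MAX or MIN) moves first."""
        raise NotImplementedError

    def moves(self, state):
        """Operators: the legal moves available in state."""
        raise NotImplementedError

    def result(self, state, move):
        """The position reached by applying move in state."""
        raise NotImplementedError

    def is_terminal(self, state):
        """Terminal test: is the game over?"""
        raise NotImplementedError

    def utility(self, state):
        """Outcome from MAX's point of view: win +1, lose -1, draw 0."""
        raise NotImplementedError
```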
MINIMAX: Strategy for Two-Person Perfect Info Games The minimax procedure is a depth-first, depth-limited search procedure. At the current state, the possible next moves are generated by a plausible-move generator. A static evaluation function is then applied to choose the best move. We assume that the static evaluation function returns large values for positions favorable to the player and small values for positions favorable to the opponent.
MINIMAX PROCEDURE Assume that 10 means a win for the player, -10 a win for the opponent, and 0 a tie. If the two players are at the same level of knowledge, then moves are taken as maximizing when it is the player’s turn and minimizing when it is the opponent’s turn. We should choose move B to maximize our advantage.
MINIMAX PROCEDURE But we usually carry the search further than one step. If we take move B, the opponent will make move F, which is more to his advantage.
MINIMAX PROCEDURE We propagate the static evaluation function values upwards and choose to make move C instead. [figure: MAX/MIN tree with backed-up values; the MIN node below B backs up -4]
MINIMAX PROCEDURE • Complete search of most game graphs is computationally infeasible. • After the search terminates, an estimate of the best first move is made. • This can be done by applying a static evaluation function to the leaf nodes of the search tree; the evaluation function measures the worth of a leaf node. • Positions favorable to MAX evaluate to a large positive value, while positions favorable to MIN evaluate to a large negative value. • Values near zero correspond to game positions not particularly favorable to either MAX or MIN.
MINIMAX PROCEDURE The backed-up values are based on “looking ahead” in the game tree and therefore depend on features occurring near the end of the game.
MINIMAX PROCEDURE [figure: three-level game tree; the leaf values 2 3 5 9 0 7 4 2 1 5 6 are backed up through alternating MAX and MIN levels, giving the root MAX node the value 3] The sketch below performs the same back-up computation in code.
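A minimal minimax sketch, assuming the game tree is given explicitly as nested lists whose leaves are static-evaluation values (the encoding and the sample values are illustrative choices, not from the slides):

```python
# Back up static values through alternating MAX and MIN levels.
# Tree encoding: nested lists; leaves are static-evaluation values.
def minimax(node, maximizing):
    if not isinstance(node, list):                 # leaf: its static value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Hypothetical two-ply tree, MAX to move at the root: each MIN node backs
# up the minimum of its leaves, and the root takes the maximum of those.
tree = [[3, 9], [0, 2], [2, 6]]
print(minimax(tree, maximizing=True))              # mins: 3, 0, 2 -> max: 3
```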
Tic-Tac-Toe Assume MAX marks Xs, MIN marks Os • MAX plays first • With a depth bound of 2, we generate all nodes down to level 2, then apply the static evaluation function to the positions at these nodes. If p is not a winning position for either player: • e(p) = (number of complete rows, columns, or diagonals still open for MAX) - (number still open for MIN) • e(p) = +∞ if p is a win for MAX • e(p) = -∞ if p is a win for MIN
Tic-Tac-Toe [board diagram] Thus, if p is the position shown, we have e(p) = 6 - 4 = 2. We make use of symmetries in generating successor positions; the game states shown (rotations and reflections of one board) are all considered identical. A code sketch of e(p) follows below.
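A sketch of this evaluation function under an assumed flat 9-cell board encoding (the LINES table, cell markers, and example board are illustrative choices, not the slide's exact position):

```python
# e(p) for non-winning positions: lines still open for MAX (X)
# minus lines still open for MIN (O).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def e(board):
    """board: list of 9 cells, each 'X', 'O', or ' '."""
    def open_lines(opponent):
        # A line is still open for a player if the opponent has no mark on it.
        return sum(1 for line in LINES
                   if all(board[i] != opponent for i in line))
    return open_lines('O') - open_lines('X')

# X in the centre, O in a corner: 5 lines open for X, 4 open for O.
board = ['O', ' ', ' ',
         ' ', 'X', ' ',
         ' ', ' ', ' ']
print(e(board))   # -> 1
```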
The Alpha-Beta PROCEDURE: Cuts When the current maximum value is already greater than the bound on a MIN successor's value, don't look further in that MIN subtree: [figure: root MAX ≥ 4; left MIN child backs up 4; right MIN child is ≤ 2 after its first leaf, so its remaining leaf 6 is discarded] The right subtree can be worth at most 2, so MAX will always choose the left path regardless of what appears next.
The Alpha-Beta PROCEDURE: Cuts When the current minimum value is already less than the bound on a MAX successor's value, don't look further in that MAX subtree: [figure: root MIN ≤ 3; left branch backs up 3; right MAX child is ≥ 5 after its first leaf, so its remaining leaf 1 is discarded] The right subtree can be worth at least 5, so MIN will always choose the left path regardless of what appears next.
The Alpha-Beta PROCEDURE [figure: three-level tree; root MAX node C = 3 with MIN child A = 3, subtrees D, E, B, and leaf values 2 3 5 0 2 1] 1) A has β = 3 (A will be no larger than 3); D is pruned, since 0 < 3; E is pruned, since 2 < 3; B is pruned. 2) C has α = 3 (C will be no smaller than 3). A code sketch of the procedure follows below.
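A compact sketch of the alpha-beta procedure over the same nested-list tree encoding used in the minimax sketch above (encoding and values are assumptions for illustration):

```python
# Alpha-beta over a nested-list tree: alpha is the best value MAX can
# already guarantee, beta the best value MIN can already guarantee.
def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):             # leaf: static evaluation
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:                  # beta cutoff: MIN above bars this
                break
        return value
    else:
        value = float('inf')
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:                  # alpha cutoff: MAX above bars this
                break
        return value

# Same answer as plain minimax, but subtrees that cannot change the result
# (the "discard" branches in the figures above) are never visited.
tree = [[3, 9], [0, 2], [2, 6]]
print(alphabeta(tree, float('-inf'), float('inf'), True))   # -> 3
```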
Search Efficiency of the α-β Procedure • Pruning does not affect the final result • In order to perform α-β cutoffs, at least part of the search tree must be generated to the maximum depth, because the α and β values must be based on the static values of the tip nodes • Good move ordering improves the effectiveness of pruning
Ordering is important for good Pruning For MIN nodes, sorting the successors' utilities in increasing order is better (shown above, left); for MAX nodes, sorting in decreasing order is better. The counting example below illustrates the effect.
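A small counting experiment with hypothetical leaf values: the same tree searched with well-ordered versus badly-ordered MIN successors evaluates a different number of tip nodes (the trees and the counting instrumentation are assumptions for this sketch):

```python
# Counting version of alpha-beta over the nested-list encoding used above;
# counter[0] tallies how many leaves (tip nodes) are actually evaluated.
def alphabeta_count(node, alpha, beta, maximizing, counter):
    if not isinstance(node, list):
        counter[0] += 1                      # one more tip node evaluated
        return node
    value = float('-inf') if maximizing else float('inf')
    for child in node:
        v = alphabeta_count(child, alpha, beta, not maximizing, counter)
        if maximizing:
            value = max(value, v)
            alpha = max(alpha, value)
        else:
            value = min(value, v)
            beta = min(beta, value)
        if alpha >= beta:                    # cutoff: stop scanning siblings
            break
    return value

good = [[3, 9], [0, 2], [2, 6]]   # MIN successors sorted low-to-high
bad  = [[9, 3], [2, 0], [6, 2]]   # same leaves, reversed within each MIN node
for label, tree in (("good", good), ("bad", bad)):
    counter = [0]
    alphabeta_count(tree, float('-inf'), float('inf'), True, counter)
    print(label, "ordering:", counter[0], "leaves evaluated")   # 4 vs. 6
```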
α-β Pruning Properties • A game of m moves can be represented by a tree of depth m; with branching factor b, the tree has b^m tip nodes • With perfect ordering (lowest successor values first at MIN nodes and highest successor values first at MAX nodes), the number of cutoffs is maximized and the number of tip nodes generated is minimized • time complexity = b^(m/2) • b^(m/2) = (b^(1/2))^m, so the effective branching factor of b = 35 in chess reduces to about 6 (worked out below)
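A quick check of the effective-branching-factor arithmetic:

```latex
% With perfect ordering, alpha-beta examines about b^{m/2} tip nodes,
% i.e. an effective branching factor of sqrt(b).
\[
  b^{m/2} = \bigl(b^{1/2}\bigr)^{m}, \qquad
  \sqrt{35} \approx 5.92 \approx 6 .
\]
```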
α-β Pruning Properties • That is, the number of tip nodes generated by optimal α-β search down to depth d is about the same as the number of tip nodes that would have been generated down to depth d/2 without α-β
Planning – Means-Ends Analysis • In some problems, such as guiding a robot around a house, the description of a single state is very complex • A given action on the part of the robot changes only a very small part of the total state • Instead of writing rules that describe transformations between whole states, we would like to write rules that describe only the affected parts of the state description; the rest can be assumed to stay constant • These methods focus on ways of decomposing the original problem into appropriate subparts, and on ways of recording and handling interactions among the subparts as they are detected during problem solving
Planning – Means-Ends Analysis • Means-ends analysis centers on detecting the difference between the current state and the goal state. Once such a difference is isolated, an operator that can reduce it must be found (the means); in other words, the method relies on a set of rules that transform one problem state into another (the ends). • Example: suppose a robot is given the task of moving a desk with two things on it from one room to another.
Means-Ends Analysis [figure: Start and Goal room layouts linked by a Push arrow] • The main difference between the start state and the goal state is the location of the desk • To reduce this difference, the PUSH and CARRY operators are available • CARRY reaches a dead end: the object (the desk) is not small enough to be carried • PUSH moves the desk, but leaves a new difference: the two objects are still in the original room
Means-Ends Analysis [figure: the complete plan from Start to Goal — Push the desk, Walk back, Pick up, Carry, and Put down each object, then Place them on the desk] This new difference can be reduced by using WALK to get the robot back to the objects, followed by PICKUP and CARRY. A toy sketch of the whole loop follows below.
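A toy sketch of the means-ends loop for this example. The state facts and operator definitions are simplified assumptions made for this sketch (PICKUP and PUTDOWN are folded into a single CARRY operator for brevity); the original example is described only informally:

```python
# Each operator names the facts it requires, adds, and deletes.
OPERATORS = [
    {"name": "PUSH desk to room2",
     "pre": {"robot-in-room1", "desk-in-room1"},
     "add": {"desk-in-room2", "robot-in-room2"},
     "del": {"desk-in-room1", "robot-in-room1"}},
    {"name": "WALK to room1",
     "pre": {"robot-in-room2"},
     "add": {"robot-in-room1"},
     "del": {"robot-in-room2"}},
    {"name": "CARRY objects to room2",
     "pre": {"robot-in-room1", "objects-in-room1"},
     "add": {"objects-in-room2", "robot-in-room2"},
     "del": {"objects-in-room1", "robot-in-room1"}},
]

def achieve(state, goal, plan, depth=6):
    """Means-ends analysis: isolate a difference between state and goal,
    pick an operator that reduces it, and subgoal on the operator's
    preconditions (the step that sends the robot WALKing back)."""
    if goal <= state:                               # no difference left
        return state
    if depth == 0:                                  # guard against looping
        return None
    difference = goal - state
    for op in OPERATORS:
        if op["add"] & difference:                  # op reduces the difference
            mid = achieve(state, op["pre"], plan, depth - 1)
            if mid is None:
                continue                            # preconditions unattainable
            mid = (mid - op["del"]) | op["add"]     # apply the operator
            plan.append(op["name"])
            return achieve(mid, goal, plan, depth - 1)
    return None

plan = []
start = {"robot-in-room1", "desk-in-room1", "objects-in-room1"}
achieve(start, {"desk-in-room2", "objects-in-room2"}, plan)
print(plan)  # ['PUSH desk to room2', 'WALK to room1', 'CARRY objects to room2']
```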
References • Nilsson, N. J., Artificial Intelligence: A New Synthesis, Morgan Kaufmann, 1998 • Luger, G. F. and Stubblefield, W. A., Artificial Intelligence: Structures and Strategies for Complex Problem Solving • Rich, E., Artificial Intelligence, McGraw-Hill