ADVERSARIAL GAME SEARCH: Min-Max Search

Understand the Min-Max algorithm for adversarial games like Tic-Tac-Toe, optimize for best moves, and learn about enhancements like Alpha-Beta pruning and ProbCut. Explore iterative deepening, transposition tables, and other strategies to improve decision-making in games.

burnsjohn

Presentation Transcript


  1. ADVERSARIAL GAME SEARCH: Min-Max Search
     [Figure: a tic-tac-toe board with X and O marks; the computer plays X, and the evaluated best move has value 1 (win)]
     • Between 2 adversaries
     • Input: a board position
     • Output: the best move (actually, the score for moving there)
     • Objective function:
       • a board position is evaluated to a number (may be +1, -1, or 0)
       • the higher the number, the better the move: maximization
     • Task: compute the optimum value of the objective function over all possible legal moves
       • i.e., optimize for the highest board value

  2. Adversarial Games: Min-Max algorithm
     [Figure: tic-tac-toe search tree. Computer move: 9 options; adversary move: 8 options; leaves return +1 or -1]
     • The search tree goes all the way down to the leaves (terminal boards) to evaluate a board
     • This may not be practically feasible within a given computation time
       • 9! = 362,880 move sequences in the worst case for tic-tac-toe
     • Most algorithms search to a pre-assigned depth, or "ply", and evaluate the board there
     • Moves alternate between the adversary (human) and the algorithm (computer):
       • minimize the objective function for the adversary's move
       • maximize the objective function for the computer's own move
       • the same algorithm alternately calls the maximizer and the minimizer
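A minimal sketch of the ply-limited search described on this slide, assuming a generic game given by two hypothetical callbacks: `children(node)` lists successor positions and `evaluate(node)` returns a static board estimate. The toy tree and its scores are invented for illustration only.

```python
def minimax(node, depth, maximizing, children, evaluate):
    """Depth-limited min-max: search `depth` plies, then evaluate the board."""
    kids = children(node)
    if depth == 0 or not kids:          # ply limit reached or terminal board
        return evaluate(node)
    values = [minimax(k, depth - 1, not maximizing, children, evaluate)
              for k in kids]
    # Maximizer and minimizer alternate from one ply to the next.
    return max(values) if maximizing else min(values)


# Hypothetical toy game tree with static estimates for every position.
TREE = {"root": ["L", "R"], "L": ["L1", "L2"], "R": ["R1", "R2"]}
ESTIMATE = {"root": 0, "L": 3, "R": 5, "L1": 4, "L2": 9, "R1": -1, "R2": 8}

def children(node):
    return TREE.get(node, [])

def evaluate(node):
    return ESTIMATE[node]

one_ply = minimax("root", 1, True, children, evaluate)   # max(3, 5)
two_ply = minimax("root", 2, True, children, evaluate)   # max(min(4, 9), min(-1, 8))
```

In this toy tree the one-ply search prefers R, while the two-ply search prefers L, which shows how the chosen ply can change the result.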

  3. Maximizing part of the Min-Max algorithm
     Input: a board position; Output: best next move, with max evaluation value

     Function findCompMove( )
     {
         bestMove = 1;                        // placeholder until a better move is found
         if( fullBoard( ) )
             value = DRAW;
         else if( ( quickWinInfo = immediateCompWin( ) ) != null )
             return quickWinInfo;             // the next move ends the game: recursion terminates
         else
         {
             value = COMP_LOSS;               // initialize with lowest value: max-problem
             for( i = 1; i ≤ 9; i++ )         // try each square in tic-tac-toe
                 if( isEmpty( i ) )
                 {
                     place( i, COMP );        // temporary placement on the board (a global variable)
                     responseValue = findHumanMove( ).value;
                     unplace( i );            // restore board: the algorithm does not actually move
                     if( responseValue > value )   // update best move
                     {
                         value = responseValue;
                         bestMove = i;
                     }
                 }
         }
         return new MoveInfo( bestMove, value );
     }

  4. Minimizing part of the Min-Max algorithm
     Input: a board position; Output: best opponent move, with min evaluation value

     Function findHumanMove( )
     {
         bestMove = 1;                        // placeholder until a better move is found
         if( fullBoard( ) )
             value = DRAW;
         else if( ( quickWinInfo = immediateHumanWin( ) ) != null )
             return quickWinInfo;             // the next move ends the game: recursion terminates
         else
         {
             value = COMP_WIN;                // initialize with highest value: min-problem
             for( i = 1; i ≤ 9; i++ )         // try each square in tic-tac-toe
                 if( isEmpty( i ) )
                 {
                     place( i, HUMAN );       // temporary placement on the board (a global variable)
                     responseValue = findCompMove( ).value;
                     unplace( i );            // restore board: the algorithm does not actually move
                     if( responseValue < value )   // update best move
                     {
                         value = responseValue;
                         bestMove = i;
                     }
                 }
         }
         return new MoveInfo( bestMove, value );
     }

     Driver call?
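One possible answer to the "Driver call?" question, as a runnable sketch rather than the textbook's code. It folds the maximizer and minimizer into a single hypothetical function `find_move`, uses 0-based squares, and checks for a finished game at the start of each call instead of using `immediateCompWin` / `immediateHumanWin`; all of these are assumptions made for brevity.

```python
COMP, HUMAN, EMPTY = "X", "O", " "
COMP_WIN, DRAW, COMP_LOSS = 1, 0, -1

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals

def winner(board):
    """Return COMP or HUMAN if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None

def find_move(board, side):
    """Return (best_square, value) for `side` to move; value is from COMP's view."""
    won = winner(board)
    if won == COMP:
        return None, COMP_WIN
    if won == HUMAN:
        return None, COMP_LOSS
    if EMPTY not in board:
        return None, DRAW
    maximizing = side == COMP
    # Start just outside the value range so the first legal move is always kept.
    best_move = None
    best_value = COMP_LOSS - 1 if maximizing else COMP_WIN + 1
    for i in range(9):
        if board[i] == EMPTY:
            board[i] = side                                  # temporary placement
            _, value = find_move(board, HUMAN if maximizing else COMP)
            board[i] = EMPTY                                 # restore the board
            if (value > best_value) if maximizing else (value < best_value):
                best_move, best_value = i, value
    return best_move, best_value

# Driver call: X X _ / O O _ / _ _ _ with the computer (X) to move.
board = list("XX OO    ")
move, value = find_move(board, COMP)
```

Here the driver simply calls the search once on the current board with the computer to move; the returned square is the move to play and the value is the predicted outcome.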

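  5. Alpha-beta pruning
     [Figure: alpha-beta pruning example on a Max/Min tree]
     • The Max node already holds α = 44 from its first child, so it asks the next child: "get me > 44"
     • The Min child must return > 44 to matter, but it will not return > 40, so do not call its other children
     • 60 > 40, so that call is pruned
     • All remaining branches under that Min node are pruned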

  6. Min-max algorithm with Alpha-beta pruning
     • Alpha pruning [figure]
       • Must be true: Max-node's value ≥ Min-node's value
       • Otherwise, it is useless to expand the tree: prune the branch
     • Beta pruning [figure]
     From Weiss' text
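A minimal sketch of min-max with alpha-beta pruning on an explicit tree, assuming a hypothetical representation in which a leaf is a number and an internal node is a list of children. The example tree reuses the values 44, 40, and 60 from the figure, but its exact shape is an assumption.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the min-max value of `node`, skipping branches that cannot matter."""
    if not isinstance(node, list):            # leaf: static value
        if visited is not None:
            visited.append(node)              # record which leaves were evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:                 # the Min ancestor will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:                     # the Max ancestor already has better
            break
    return value

# Max root with two Min children; the first child fixes alpha = 44.
tree = [[44, 60], [40, 60, 50]]
seen = []
best = alphabeta(tree, True, visited=seen)
```

In this run the second Min node stops after seeing 40, because it can no longer return more than the 44 the Max node already has, so the leaves 60 and 50 are never evaluated.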

  7. Min-max algorithm
     Variable ply on Min-Max: when to stop a search tree path?
     • Quiescent: stable boards; returned values are close to each other, so no higher ply is needed
     • Horizon effect: maybe just below this ply there is a major event!
       • Strategies for avoiding the horizon effect exist, e.g. remember from past searches or games
     • Delay tactic: stay conservative, don't take risks, push the "horizon" to see what develops; but this may actually just delay an eventual loss
     • Iterative deepening: use the (n-1)-th ply's best move values toward alpha-beta in the n-th ply
     • Transposition table: multiple move sequences reach the same board; use a hash table to avoid searching such boards again
     From Weiss' text
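A minimal sketch of the transposition-table idea only, using tic-tac-toe as the example game. The board encoding (a 9-character string), the function name `value`, and the `stats` counters are assumptions made for illustration; a Python dict plays the role of the hash table.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def value(board, side, table, stats):
    """Min-max value from X's point of view, caching results per board."""
    key = (board, side)
    if key in table:                      # same board reached by another move order
        stats["hits"] += 1
        return table[key]
    stats["searched"] += 1
    won = winner(board)
    if won is not None:
        result = 1 if won == "X" else -1
    elif " " not in board:
        result = 0
    else:
        other = "O" if side == "X" else "X"
        results = [value(board[:i] + side + board[i + 1:], other, table, stats)
                   for i in range(9) if board[i] == " "]
        result = max(results) if side == "X" else min(results)
    table[key] = result                   # remember the board for later visits
    return result

stats = {"searched": 0, "hits": 0}
game_value = value(" " * 9, "X", {}, stats)
```

Each distinct board is expanded once; every later arrival at the same board through a different move order is answered from the table.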

  8. Min-max may lead to a "wrong" path: it is a very conservative search that presumes the opponent is exactly as rational
     • Text Figure 5.14: Max( min(99, 1000, 1000, 1000), min(100, 101, 102, 100) ) = Max(99, 100) = 100
     • 100 is not the best value on the board; the 1000s on the other branch are
     • But that is the point: the opponent will not let "you" go there!
     • However, the risk may be worth taking => utility-driven search, not just board evaluation
     • ProbCut: probabilistic alpha-beta pruning
       - weights the heuristic function from past experience: machine (machine learning) or human (knowledge-based)
       - alpha and beta are defined on a probability distribution
       - prunes branches that are probably useless rather than provably useless
     From Weiss' text
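The arithmetic on this slide can be checked with a short sketch. The second function is only an illustration of a "utility-driven" alternative: it assumes, hypothetically, an opponent who picks each reply with equal probability, which is not something the slide specifies.

```python
def minimax_choice(branches):
    """Pick the branch whose worst-case (opponent-minimized) value is highest."""
    values = [min(b) for b in branches]
    best = max(range(len(branches)), key=values.__getitem__)
    return best, values[best]

def expected_choice(branches):
    """Pick the branch with the highest average value (uniform-random opponent)."""
    values = [sum(b) / len(b) for b in branches]
    best = max(range(len(branches)), key=values.__getitem__)
    return best, values[best]

# Leaf values from the slide: two Min nodes under one Max node.
branches = [[99, 1000, 1000, 1000], [100, 101, 102, 100]]

safe = minimax_choice(branches)       # second branch, guaranteed 100
risky = expected_choice(branches)     # first branch, average 774.75
```

Min-max picks the second branch because a rational opponent would answer the first branch with 99; only under a weaker model of the opponent does the first branch look attractive.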

  9. Forward Pruning: beam search
     • The k best nodes are explored simultaneously/iteratively
     • Not pure depth-first, rather k-best-first
     • But the optimum may lie beyond the k best at a particular ply
     From Weiss' text
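A minimal sketch of beam search as forward pruning, assuming hypothetical callbacks `children(node)` and `score(node)` and an invented toy tree. It also shows the caveat on the slide: with a narrow beam the best node can be pruned away.

```python
def beam_search(root, children, score, k, depth):
    """Expand level by level, keeping only the k best-scoring nodes per ply."""
    frontier = [root]
    for _ in range(depth):
        candidates = [c for node in frontier for c in children(node)]
        if not candidates:
            break
        # Forward pruning: everything outside the k best is discarded for good.
        frontier = sorted(candidates, key=score, reverse=True)[:k]
    return max(frontier, key=score)

# Hypothetical tree: "c" looks poor at ply 1 but hides the best node "c1".
TREE = {"root": ["a", "b", "c"], "a": ["a1", "a2"], "b": ["b1"], "c": ["c1"]}
SCORE = {"root": 0, "a": 5, "b": 4, "c": 1, "a1": 6, "a2": 3, "b1": 7, "c1": 100}

def children(node):
    return TREE.get(node, [])

def score(node):
    return SCORE[node]

narrow = beam_search("root", children, score, k=2, depth=2)
wide = beam_search("root", children, score, k=3, depth=2)
```

With k = 2 the branch through "c" is dropped at the first ply, so the search settles for "b1"; with k = 3 it reaches "c1".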

  10. Lookup Table
      • Database / knowledge base of past games
      • Especially for the start-game or end-game in chess
      • Knowledge base: compressed information as strategy, if-then-else
      From Weiss' text

  11. Dice Games: Backgammon
      • Levels alternate: Max, probability, Min, probability, ...
      • The Max-Min part has "Or" branches; the dice-throwing part has "And" branches
      • Must consider all possibilities, but probabilistic weights may be added from past knowledge
      From Weiss' text
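The alternation of Max, chance, and Min levels is commonly called expectiminimax; the slide does not use that name. A minimal sketch follows, assuming a hypothetical tuple encoding of the tree and made-up probabilities and leaf values.

```python
def expectiminimax(node):
    """Value of a tree with "max", "min", and "chance" (dice) nodes."""
    if isinstance(node, (int, float)):        # leaf: board evaluation
        return node
    kind, children = node
    if kind == "max":                         # "Or" branch: we choose one child
        return max(expectiminimax(c) for c in children)
    if kind == "min":                         # "Or" branch: opponent chooses one
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                      # "And" branch: weigh every outcome
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Max chooses a move, a fair coin stands in for the dice, then Min replies.
tree = ("max", [
    ("chance", [(0.5, ("min", [4, 8])), (0.5, ("min", [2, 6]))]),
    ("chance", [(0.5, ("min", [10, 0])), (0.5, ("min", [3, 5]))]),
])

best = expectiminimax(tree)   # max(0.5*4 + 0.5*2, 0.5*0 + 0.5*3)
```

Every dice outcome contributes to a chance node's value, weighted by its probability, while Max and Min nodes still pick a single child.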

  12. Partially Observable Games: Card games
      • Each node is really a subset of nodes: a belief about what the current nodes "may be"
      • Search should filter possibilities: a player plays a probing hand
      • Utility function, as in "game theory" in economics: a heuristic function that typically embeds probability measures as well
      From Weiss' text
