Explore Adversarial Search concepts in AI, including Minimax algorithm, Alpha-Beta pruning, NegaMax formulation, and heuristic search strategies. Understand how computers aim to beat humans in strategic games like Chess, Checkers, and Go.
Intelligent Search Techniques Mark Winands & Cameron Browne
Contents • Adversarial Search • Single Agent Search • Monte Carlo Tree Search • Applications
Computer Game-playing • Can computers beat humans in board games like Chess, Checkers, Go? • This is one of the first tasks of AI (Shannon 1950)
Adversarial Search • Two (or more) opponents, each trying to maximize their expectations • Player 1 is called MAX • Obtain the maximum result • Minimize that of the opponent • Player 2 is called MIN • Obtain the minimum result • Maximize that of the opponent
Definitions - Nodes • Root node • State (position) which is to be searched • Terminal node • A node which has a fixed application-dependent value (e.g., win, loss, draw) • Leaf node (non-terminal) • A node which has been assigned a heuristic value • A heuristic is an “educated guess” that approximates the terminal value • Internal nodes • Nodes whose value is a function of the successors
Definitions - Tree • Search depth d • Number of state transitions (moves) from the root of the search to the current state (measured in ply) • Branching factor b • Average number of successor nodes (moves) • Tree vs. directed acyclic graph (DAG) • Most trees are really DAGs • A node can have 1 parent (tree) or possibly more than 1 (DAG) • Transposition
Tree (Traversal) • Depth-first search • Left to right • Other ways of traversal possible, but for the remainder we use this one!
MiniMax Search (Von Neumann, 1928) • [Figure: MAX root with value 3; MIN nodes with values 2 and 3; leaves 7, 2, 4, 3]
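Principal Variation • Path from root to leaf node of optimal play by each side • Optimal path • Main line • [Figure: the MiniMax example tree (root 3; MIN nodes 2 and 3; leaves 7, 2, 4, 3) with the principal variation highlighted]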
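A minimal Python sketch of MiniMax that also returns the principal variation may help here. It assumes the tree is encoded as nested lists whose leaves are plain numbers, and that the example tree has leaves 7, 2 under the first MIN node and 4, 3 under the second; the function name `minimax` and the encoding are illustrative choices, not part of the slides.

```python
def minimax(node, maximizing):
    """Return (value, path) for a tree given as nested lists of leaf values.

    `path` lists the child index chosen at each level, i.e. the
    principal variation from this node down to a leaf.
    """
    if not isinstance(node, list):          # leaf: a plain number
        return node, []
    best_value, best_path = None, []
    for index, child in enumerate(node):
        value, path = minimax(child, not maximizing)
        if best_value is None:
            improved = True
        elif maximizing:
            improved = value > best_value
        else:
            improved = value < best_value
        if improved:
            best_value, best_path = value, [index] + path
    return best_value, best_path


# Assumed encoding of the example tree: MAX root, two MIN nodes, four leaves.
tree = [[7, 2], [4, 3]]
value, principal_variation = minimax(tree, True)
print(value, principal_variation)
```

Under these assumptions the root value is 3 and the principal variation is the second move at the root followed by the second move at the MIN node.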
MiniMax Analysis • Complete? Yes (if tree is finite) • Optimal? Yes (against an optimal opponent) • Time complexity? O(b^d) • Space complexity? O(b·d) (depth-first exploration) • For chess, b ≈ 35, d ≈ 100 for "reasonable" games, so an exact solution is completely infeasible • Can we do better?
Observation • Some nodes in the search can be proven to be irrelevant to the outcome of the search
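α-β Algorithm • [Figure: game tree with root value 3; one MIN node is bounded at ≤ 2 and its remaining successors are cut off (β-pruning)]
The Strength of α-β • [Figure: the same cut-off with a large subtree below the pruned move; more than a thousand prunings]
The Importance of α-β Algorithm • [Figure: game tree with root value 3 and a β-pruning at the MIN node bounded at ≤ 2]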
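Example: Alpha-Beta Algorithm • [Figure: worked α-β search with bounds such as ≥ -4, ≤ 5 and ≥ 6 annotated on the nodes; the principal variation is marked]
Example: Alpha-Beta Algorithm • [Figure: game tree with 16 leaves, left to right: 7, 6, 3, 2, 8, 9, 4, 5, 1, 10, 2, 11, 12, 13, 14, 15]
Example: Alpha-Beta Algorithm • [Figure: the same tree after α-β search; root value 6, with bounds such as ≥ 6, ≥ 8, ≤ 3, ≤ 2 and ≤ 1 on the nodes; shallow pruning and deep pruning are marked]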
Alpha-Beta Algorithm • Why is it called alpha-beta? • Maintain two bounds: • Alpha (α): a lower bound on the best value that the player can achieve • Beta (β): an upper bound on what the opponent can achieve • Search, maintaining α and β • Whenever α≥ β, further search at this node is irrelevant
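The bounds described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: it assumes the 16 leaf values from the alpha-beta example form a complete binary tree with a MAX root, encoded as nested lists, and the `visited` list is a hypothetical helper added only to show which leaves are actually evaluated.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta search over nested lists; records evaluated leaves in `visited`."""
    if not isinstance(node, list):          # leaf: a plain number
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)       # raise the lower bound
            if alpha >= beta:               # remaining successors are irrelevant
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)             # lower the upper bound
        if alpha >= beta:
            break
    return value


# Assumed encoding: MAX root, then MIN, MAX, MIN levels above the leaves.
tree = [
    [[[7, 6], [3, 2]], [[8, 9], [4, 5]]],
    [[[1, 10], [2, 11]], [[12, 13], [14, 15]]],
]
visited = []
root_value = alphabeta(tree, -math.inf, math.inf, True, visited)
print(root_value, visited)
```

With this encoding the search returns 6 after evaluating only 7 of the 16 leaves (7, 6, 3, 8, 9, 1, 2); the other nine are cut off.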
NegaMax Formulation • MiniMax formulation is awkward because the search alternates between MINs and MAXs • The NegaMax formulation allows only a MAX to be used (Knuth & Moore, 1975) • Always maximize, but… • Negate the values first
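A short Python sketch of the NegaMax idea, under the assumption that the tree is a nested list and that every leaf value is given from the point of view of the player to move at that leaf (the function name and the small tree are illustrative):

```python
def negamax(node):
    """Value of `node` from the point of view of the player to move there."""
    if not isinstance(node, list):          # leaf: value for the side to move
        return node
    # A successor's value is seen from the opponent's side, so negate it
    # first and then always maximize.
    return max(-negamax(child) for child in node)


# Two-ply tree: the root player is also the player to move at the leaves,
# so the leaf values can be used unchanged.
tree = [[7, 2], [4, 3]]
root_value = negamax(tree)
print(root_value)
```

For this two-ply tree the result, 3, equals the MiniMax value, but the code needs only one case instead of separate MAX and MIN branches.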
NegaMax • [Figure: the same game tree evaluated with MiniMax and with NegaMax; at every level the successor values are negated and then maximized, so the MiniMax values at MIN nodes are discarded and replaced by their negations]
Analysis • What is the best case for Alpha-Beta? • Consider two cases in this MiniMax Search:
Successor Ordering • Better known as move ordering • Alpha-beta’s performance depends on getting cut-offs as soon as possible! • At a node where a cut-off is possible, the search should ideally try (one of) the best move(s) first and cut off immediately
Alpha-Beta Node Types • Define two node types • ALL – all successors (moves) of a node must be considered • CUT – a cut-off can occur; one or more successors (moves) of a node must be considered
Minimal α-β Tree In reality you don’t know this!
Alpha-Beta Analysis • Assume a fixed branching factor and a fixed depth • Best case: • Approximately b^(d/2) • Impact? • b = 10, d = 9 • Minimax: 10^9 = 1,000,000,000 • Alpha-beta: 10^5 + 10^4 = 110,000
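The arithmetic can be checked with a few lines of Python. The sketch below uses the exact minimal-tree leaf count from Knuth & Moore (1975), b^⌈d/2⌉ + b^⌊d/2⌋ − 1, which includes a −1 that the rounded figure of 110,000 drops; the function names are illustrative.

```python
def minimax_leaves(b, d):
    """Leaves visited by plain MiniMax on a uniform tree."""
    return b ** d


def alphabeta_best_case_leaves(b, d):
    """Leaves in the minimal alpha-beta tree (perfect move ordering)."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1


b, d = 10, 9
print(minimax_leaves(b, d))               # 1000000000
print(alphabeta_best_case_leaves(b, d))   # 109999
```

In other words, with perfect ordering alpha-beta can search roughly twice as deep as MiniMax for the same number of leaf evaluations.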
Alpha-Beta Analysis • But… best-case analysis depends on choosing the best move first at CUT nodes (not always possible) • The worst case? No cut offs, and alpha-beta degrades to MiniMax
Heuristic Search • Truncate the game tree (limited search depth) • Use a (static heuristic) evaluation function at the leaves to replace pay-offs • Minimax (with alpha-beta) on the reduced game tree • Playing is solving a sequence of these game trees • This approach works very well in Chess, Checkers, Backgammon • [Figure: truncated game tree with heuristic leaf values such as 0.25, 0.33, 0.5, 0 and -1]
Quiescence Search • A quiescent position is unlikely to show wild swings of value in the near future • Apply the evaluation function only to quiescent positions • Expand until a quiescent position is found • Instead of calling the evaluation function at the leaves, call a special function that searches only special moves (e.g., captures), down to unbounded depth • This is a form of selective search
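One common way to realize this is sketched below in NegaMax form. Everything here is an assumption for illustration: the `Position` class is a hypothetical stand-in for a real game state, and the "stand-pat" rule (the side to move may decline all captures and accept the static evaluation) is a widely used convention that the slides do not spell out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Position:
    """Hypothetical toy position: a static score and its capture successors."""
    static_eval: int                                 # for the side to move
    captures: list = field(default_factory=list)     # positions after captures


def quiescence(pos, alpha, beta):
    """Search capture moves only, until no captures remain (NegaMax form)."""
    stand_pat = pos.static_eval          # assumed: the mover may decline to capture
    if stand_pat >= beta:
        return stand_pat
    alpha = max(alpha, stand_pat)
    best = stand_pat
    for successor in pos.captures:
        score = -quiescence(successor, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best


# The static evaluation says 50, but a capture leaves the opponent at -1000.
quiet_after_capture = Position(static_eval=-1000)
root = Position(static_eval=50, captures=[quiet_after_capture])
result = quiescence(root, -math.inf, math.inf)
print(root.static_eval, result)
```

In this toy case the static evaluation of 50 misses the winning capture, while the quiescence value is 1000.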
Quiescence Search • [Figure: example at depth D = 0 where the static evaluation gives Eval = 50 but quiescence search gives QS = 1000]
Isn’t this good enough? • No! • Thompson (1982): search depth is strongly correlated with performance in chess • Searching one move (one ply) deeper made a (huge) difference in performance • Holds for other games too!
Performance! Performance! • Improve Alpha-Beta to guarantee near best-case results • Move ordering • Windowing • Iterative deepening • Transposition Tables • Improve the heuristic evaluation • Use parallelism to increase the search depth
Why Alpha-Beta search first? • Many search enhancements developed for alpha-beta translate to single-agent search • Most originated with alpha-beta, and were adopted by other classes of search algorithms
Max^n Algorithm • Generalization of minimax to n players • Luckhardt and Irani, 1986 • Assumptions: • The players alternate moves • Each player tries to maximize his/her return • Each player is indifferent to the returns of the others
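Max^n Algorithm • [Figure: three-player tree with player 1 at the root, then player 2, then player 3, then leaves] • Leaves: (5,3,2), (7,1,8), (8,5,4), (4,5,4), (1,8,3), (6,6,3), (3,6,3), (9,9,5) • Player 3 nodes: (7,1,8), (4,5,4), (1,8,3), (9,9,5) • Player 2 nodes: (4,5,4), (9,9,5) • Root: (9,9,5)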
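The back-up rule in this example can be sketched in Python. The encoding is an assumption: internal nodes are lists, leaves are utility tuples with one entry per player, players are numbered from 0, and ties are broken in favour of the first successor (the figure breaks one tie differently, which does not change the root value here).

```python
def maxn(node, player, num_players):
    """Return the utility tuple backed up to `node`; `player` is to move."""
    if isinstance(node, tuple):             # leaf: one utility per player
        return node
    next_player = (player + 1) % num_players
    results = [maxn(child, next_player, num_players) for child in node]
    # The player to move picks the successor that maximizes their own entry.
    return max(results, key=lambda utilities: utilities[player])


# Leaf tuples taken from the example; player 1 (index 0) moves at the root.
tree = [
    [[(5, 3, 2), (7, 1, 8)], [(8, 5, 4), (4, 5, 4)]],
    [[(1, 8, 3), (6, 6, 3)], [(3, 6, 3), (9, 9, 5)]],
]
root = maxn(tree, 0, 3)
print(root)
```

The root receives (9, 9, 5), matching the value shown at the root of the example tree.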
Paranoid Algorithm • All other players are treated as one big opponent (Sturtevant and Korf 2000) • My own player is the MAX player • All the others are MIN players • The paranoid algorithm evaluates the tree as follows: • When it is my turn to play, take the maximum of my utility • When it is another player's turn, take the minimum of my utility
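Paranoid Algorithm • [Figure: tree with leaf values 5, 7, 8, 4, 1, 6, 3, 9 (utilities of the MAX player); root value 4, with the bound ≥ 4 at the root and cut-offs at nodes bounded by ≤ 1]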
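A minimal Python sketch of the paranoid back-up rule, assuming three players who move in a fixed order, "my" player moving at the root, and leaves that hold only my utility (the leaf values are those of the example; the encoding and function name are illustrative). Pruning is left out to keep the sketch short, although the reduction to a two-sided MAX/MIN tree is what makes alpha-beta applicable.

```python
def paranoid(node, depth, num_players):
    """Back up my utility: maximize on my turn, minimize on everyone else's."""
    if not isinstance(node, list):          # leaf: my utility only
        return node
    values = [paranoid(child, depth + 1, num_players) for child in node]
    my_turn = depth % num_players == 0      # assumed: I move at depths 0, n, 2n, ...
    return max(values) if my_turn else min(values)


# Leaves 5, 7, 8, 4, 1, 6, 3, 9 under two levels of opponents.
tree = [[[5, 7], [8, 4]], [[1, 6], [3, 9]]]
root_value = paranoid(tree, 0, 3)
print(root_value)
```

Under these assumptions the root value is 4: the opponents jointly hold me to 4 in the left subtree and to 1 in the right one.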
Expectimax Search Trees • Chance nodes when the outcome is uncertain • Search-based approaches must take into account all possibilities at a chance node • Increases the branching factor making deep search unlikely
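A small Python sketch of how a chance node is backed up, assuming a hypothetical encoding in which each node is a tagged tuple: `("leaf", value)`, `("max", children)`, or `("chance", [(probability, child), ...])`. The tree and its numbers are made up for illustration.

```python
def expectimax(node):
    """Back up values through MAX nodes and chance nodes."""
    kind, payload = node
    if kind == "leaf":
        return payload
    if kind == "max":
        return max(expectimax(child) for child in payload)
    if kind == "chance":
        # Every outcome must be searched; weight each by its probability.
        return sum(prob * expectimax(child) for prob, child in payload)
    raise ValueError(f"unknown node kind: {kind}")


# MAX chooses between two gambles (illustrative numbers).
tree = ("max", [
    ("chance", [(0.5, ("leaf", 2)), (0.5, ("leaf", 4))]),
    ("chance", [(0.9, ("leaf", 1)), (0.1, ("leaf", 10))]),
])
root_value = expectimax(tree)
print(root_value)
```

The first gamble has expected value 3 and the second about 1.9, so MAX prefers the first even though the second contains the largest single pay-off.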
What to do in the endgame? • Alpha-beta with enhancements (move ordering, transposition tables) • Knowledge (domain dependent) • Endgame databases • Special search algorithms (endgame solvers) • Proof-number search • Lambda-search
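Which node has to be expanded? • [Figure: tree with MAX root a, MIN nodes b and c, and MAX leaves d, e, f, g, h; some leaves are marked Win, the others are unknown (?)]
Which node has to be expanded? • [Figure: tree with MAX root a, MIN nodes b and c, MAX nodes d to h (several unknown), and MIN leaves i to m, all unknown (?)]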
PN search • Allis et al. (1994) • Best-first search method • Criterion: develop the leaf node that is most promising to prove the goal • Goal: (dis)prove the root node • E.g., to be a win for the player to move
Proof number and Disproof Number • Proof number (pn): the minimum number of leaf nodes which have to be proved in order to prove the node • Disproof number (dpn): the minimum number of leaf nodes which have to be disproved in order to disprove the node • Proof number and disproof number for each node • Assume one expansion for each unexpanded node
PN search (2) Three types of leaf nodes: • Proved (goal is true): pn = 0, dpn = ∞ • Disproved (goal is false): pn = ∞, dpn = 0 • Unknown: pn = 1, dpn = 1
PN Search: AND/OR • AND/OR Tree • Internal nodes are OR nodes or AND nodes • In a MiniMax tree, OR corresponds to MAX and AND to MIN • To prove an OR node it suffices to prove one child. To disprove an OR node all the children have to be disproved. • To prove an AND node all the children have to be proved. To disprove an AND node it suffices to disprove one child.
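PN Search: Back propagation • Two types of internal nodes (tree): • OR node: pn = minimum of the children's pn, dpn = sum of the children's dpn • AND node: pn = sum of the children's pn, dpn = minimum of the children's dpn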
PN Search: Node selection • Best-first search • Most-promising = most-proving • Path to most-promising node: • At OR node choose child with min pn • At AND node choose child with min dpn
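The proof-number rules and the selection of the most-proving node can be sketched in Python. This is an illustrative sketch on a fixed, hand-built tree, not a full PN search with expansion; the `Node` class, its field names and the small example tree are assumptions. Leaf initialization follows the usual convention: proved (0, ∞), disproved (∞, 0), unknown (1, 1).

```python
import math


class Node:
    """Hypothetical AND/OR tree node for illustrating proof numbers."""

    def __init__(self, name, kind, children=None, status="unknown"):
        self.name = name
        self.kind = kind                  # "OR" or "AND"
        self.children = children or []
        self.status = status              # leaves: "win", "loss" or "unknown"
        self.pn = self.dpn = 1


def update(node):
    """Compute pn and dpn bottom-up for the whole subtree."""
    if not node.children:
        if node.status == "win":
            node.pn, node.dpn = 0, math.inf
        elif node.status == "loss":
            node.pn, node.dpn = math.inf, 0
        else:
            node.pn, node.dpn = 1, 1
        return
    for child in node.children:
        update(child)
    pns = [c.pn for c in node.children]
    dpns = [c.dpn for c in node.children]
    if node.kind == "OR":
        node.pn, node.dpn = min(pns), sum(dpns)
    else:
        node.pn, node.dpn = sum(pns), min(dpns)


def most_proving(node):
    """Follow min pn at OR nodes and min dpn at AND nodes down to a leaf."""
    while node.children:
        if node.kind == "OR":
            node = min(node.children, key=lambda c: c.pn)
        else:
            node = min(node.children, key=lambda c: c.dpn)
    return node


b = Node("b", "AND", [Node("d", "OR"), Node("e", "OR")])
c = Node("c", "AND", [Node("f", "OR", status="win"), Node("g", "OR")])
root = Node("a", "OR", [b, c])
update(root)
print(root.pn, root.dpn, most_proving(root).name)
```

Here the root gets pn = 1 and dpn = 2, and the most-proving node is g: proving g alone would prove c and therefore the root.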
PN Example • [Figure: tree with nodes a to l, each annotated with its proof and disproof number; leaves are marked loss, draw, win, or unknown (?)]
PN Example • [Figure: the same tree after expansion, with new unknown leaves m and n and updated proof and disproof numbers]
PN Search • [Figure: example trees labelled Strength and Weakness, with leaves marked w (win) or ?; solution 11 ply deep]