This outline provides an overview of adversarial search in game playing, including game trees, minimax algorithm, alpha-beta pruning, and different types of games. It covers concepts like deterministic and chance games, perfect and imperfect information, and utility values.
Adversarial Search CMPT 420 / CMPG 720
Outline • Game playing • Game trees • Minimax • Alpha-beta pruning
Games vs. search problems • competitive environments: agents’ goals are in conflict • adversarial search problems (games)
Types of Games • deterministic vs. chance • perfect information vs. imperfect information
Games • deterministic, fully observable, turn-taking, two-player, zero-sum games • Utility values at the end of the game are equal and opposite • Example: tic-tac-toe
Game Search Formulation • Two players, MAX and MIN, take turns (with MAX playing first) • S0: initial state • Player(s): which player has the move in state s • Actions(s): set of legal moves in state s • Result(s,a): transition model (the state that results from making move a in state s) • Terminal-test(s): true if the game is over in state s, false otherwise (states where it is true are terminal states) • Utility(s,p): utility function that defines the final value of a game that ends in terminal state s for player p • zero-sum games: the total payoff to all players is the same for every outcome of the game
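The formulation above can be made concrete with a very small game. The sketch below is only an illustration, not part of the course material: it assumes a hypothetical take-away game (a pile of stones, each move removes 1 or 2, whoever takes the last stone wins) and maps each element of the formulation to a Python function. The game, the state encoding, and the payoff values +1 and -1 are all assumptions.

```python
from typing import List, Tuple

# Hypothetical state encoding: (stones left, player to move).
State = Tuple[int, str]

S0: State = (4, "MAX")  # initial state; MAX plays first


def player(s: State) -> str:
    """Which player has the move in state s."""
    return s[1]


def actions(s: State) -> List[int]:
    """Legal moves: remove 1 or 2 stones, never more than remain."""
    return [n for n in (1, 2) if n <= s[0]]


def result(s: State, a: int) -> State:
    """Transition model: remove a stones and pass the turn."""
    stones, p = s
    return (stones - a, "MIN" if p == "MAX" else "MAX")


def terminal_test(s: State) -> bool:
    """The game is over when no stones remain."""
    return s[0] == 0


def utility(s: State, p: str) -> int:
    """Payoff for player p in terminal state s (+1 win, -1 loss)."""
    # The player to move in a terminal state did not take the last stone.
    winner = "MIN" if s[1] == "MAX" else "MAX"
    return 1 if p == winner else -1
```

In this sketch the payoffs of the two players in any terminal state add up to 0, which is what makes the game zero-sum.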
Optimal strategies • MAX uses the search tree to determine its next move. • Assumption: both players play optimally! • Given a game tree, the optimal strategy can be determined by using the minimax value of each node.
Minimax • The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally. • Minimax(s) = • Utility(s) if Terminal-test(s) • max over a in Actions(s) of Minimax(Result(s,a)) if Player(s) = MAX • min over a in Actions(s) of Minimax(Result(s,a)) if Player(s) = MIN
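The recursive definition translates almost line for line into code. The following is a minimal sketch under stated assumptions: the game tree is given explicitly as nested Python lists (a hypothetical encoding), leaves are integer utilities for MAX, and the players alternate by level. The tree shape `[[2, 7], [1, 8]]` is an assumption built from the leaf values 2, 7, 1, 8 used in the slides.

```python
def minimax(node, max_to_move=True):
    """Return the minimax value of a node in an explicit game tree.

    A leaf is an int (utility for MAX); an internal node is a list of children.
    """
    if isinstance(node, int):  # Terminal-test(s): return Utility(s)
        return node
    values = [minimax(child, not max_to_move) for child in node]
    # MAX takes the largest child value, MIN the smallest.
    return max(values) if max_to_move else min(values)


# Assumed shape: MAX at the root, two MIN nodes, four leaves.
tree = [[2, 7], [1, 8]]
print(minimax(tree))  # 2: MIN would answer with min(2, 7) = 2 or min(1, 8) = 1
```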
Optimal Play • [Figure: game tree with alternating MAX and MIN levels and leaf values 2, 7, 1, 8; the optimal line of play is highlighted.]
Two-Ply Game Tree • The minimax decision • Minimax maximizes the worst-case outcome for MAX.
What if MIN does not play optimally? • Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. • But if MIN does not play optimally, MAX can do even better.
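A small numeric check makes this point concrete. The sketch below assumes a hypothetical two-ply tree in which each of MAX's moves leads to a MIN node with three leaves; the move names and the leaf values are illustrative only.

```python
# Hypothetical two-ply tree: MAX's moves map to the leaves MIN can choose from.
# Leaf values are utilities for MAX.
tree = {"a1": [3, 12, 8], "a2": [2, 4, 6], "a3": [14, 5, 2]}

# The minimax move maximizes the worst case (the minimum over MIN's replies).
minimax_move = max(tree, key=lambda a: min(tree[a]))
guaranteed = min(tree[minimax_move])

# Every reply MIN could make after that move, optimal or not.
possible = sorted(tree[minimax_move])

print(minimax_move, guaranteed, possible)
```

Under these assumed values, MAX is guaranteed `guaranteed` against an optimal MIN, and any non-optimal reply in `possible` gives MAX a strictly larger payoff.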
Minimax Algorithm

function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s))
  return v
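One possible Python rendering of this pseudocode is sketched below. It is not the course's reference implementation. It assumes a hypothetical explicit game tree stored as nested dictionaries (action name to child), with numeric leaves as utilities for MAX; the action names and leaf values are made up for illustration. Instead of computing v and then searching for the matching action, the sketch picks the action with the largest MIN-VALUE directly, which selects the same move.

```python
import math

# Hypothetical two-ply tree: dicts are internal nodes, numbers are utilities.
TREE = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4, "c3": 6},
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}


def terminal_test(state):
    return not isinstance(state, dict)


def utility(state):
    return state


def successors(state):
    """Return (action, resulting state) pairs."""
    return list(state.items())


def max_value(state):
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for _action, s in successors(state):
        v = max(v, min_value(s))
    return v


def min_value(state):
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for _action, s in successors(state):
        v = min(v, max_value(s))
    return v


def minimax_decision(state):
    """Return the action whose successor has the highest MIN-VALUE."""
    action, _s = max(successors(state), key=lambda pair: min_value(pair[1]))
    return action


print(minimax_decision(TREE), max_value(TREE))
```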
Properties of minimax • Complete? • Yes (if the tree is finite) • Optimal? • Yes (against an optimal opponent) • Time complexity? • O(b^m) • Space complexity? • O(bm) (depth-first exploration) • For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is infeasible
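To get a feel for these bounds, the chess figures can be plugged in directly. The helper below is a hypothetical illustration that simply evaluates the two growth terms, b^m for time and b·m for space; it ignores constant factors and is not a precise node count.

```python
def minimax_costs(b: int, m: int):
    """Return (b**m, b*m): the growth terms behind O(b^m) time and O(bm) space."""
    return b ** m, b * m


# Chess-like figures from the slide: b is about 35, m is about 100.
time_term, space_term = minimax_costs(35, 100)
print(len(str(time_term)))  # number of decimal digits in 35**100
print(space_term)
```

With these inputs the time term is a 155-digit number while the space term is only 3500, which is why time, not memory, is the obstacle to solving such games exactly.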
Alpha-Beta Pruning • Problem with minimax search: the number of states examined is exponential in the depth of the tree • Can we cut the exponent in half? • It is possible to compute the minimax decision without looking at every node. • Pruning: eliminate parts of the tree from consideration
Alpha-beta pruning • We can improve on the performance of the minimax algorithm through alpha-beta pruning. • [Figure: three-level tree (MAX, MIN, MAX) with leaf values 2, 7, 1 and one unevaluated node marked "?".] • We don't need to compute the value at the node marked "?". • No matter what it is, it can't affect the value of the root node.
Alpha-Beta Example • Do DFS until the first leaf, tracking the range of possible values at each node. Each step lists the range at the root (MAX) first, then the ranges at the MIN nodes expanded so far, from left to right.
• Step 1: root [-∞,+∞]; MIN nodes [-∞,+∞]
• Step 2: root [-∞,+∞]; MIN nodes [-∞,3]
• Step 3: root [-∞,+∞]; MIN nodes [3,3]
• Step 4: root [3,+∞]; MIN nodes [3,3]
• Step 5: root [3,+∞]; MIN nodes [3,3], [-∞,+∞]
• Step 6: root [3,+∞]; MIN nodes [3,3], [-∞,2]
• Step 7: root [3,+∞]; MIN nodes [3,3], [-∞,2] (this node is worse for MAX)
• Step 8: root [3,14]; MIN nodes [3,3], [-∞,2], [-∞,+∞]
• Step 9: root [3,14]; MIN nodes [3,3], [-∞,2], [-∞,14]
• Step 10: root [3,5]; MIN nodes [3,3], [-∞,2], [-∞,5]
• Step 11: root [3,3]; MIN nodes [3,3], [-∞,2], [2,2]
α-β pruning example • Minimax(root) • = max(min(3,12,8),min(2,x,y),min(14,5,2)) • = max(3,min(2,x,y),2) • = 3
α-β pruning • We made the same minimax decision without ever evaluating two of the leaf nodes! • The value of the root, and hence the minimax decision, is independent of the values of the pruned leaves x and y. • It is possible to prune entire subtrees.
Why is it called α-β? • α = value of the best choice found so far at any choice point along the path for MAX • If v is worse than α, MAX will avoid it → prune that branch • Define β similarly for MIN
Alpha-Beta Algorithm

function ALPHA-BETA-SEARCH(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state, −∞, +∞)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v
Alpha-Beta Algorithm

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
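The MAX-VALUE and MIN-VALUE routines can be folded into one recursive Python function. The sketch below is illustrative only: it assumes a hypothetical nested-list tree with numeric leaves and alternating players, and it records every leaf it actually evaluates so the effect of pruning is visible. The leaf values 4 and 6 stand in for the two leaves written as x and y in the worked example; they are assumptions, and any values would give the same result.

```python
import math


def alpha_beta(node, alpha=-math.inf, beta=math.inf, max_to_move=True, visited=None):
    """Return the minimax value of node, skipping branches that cannot matter.

    visited, if given, collects the leaves that were actually evaluated.
    """
    if visited is None:
        visited = []
    if not isinstance(node, list):  # terminal state: evaluate the leaf
        visited.append(node)
        return node
    if max_to_move:
        v = -math.inf
        for child in node:
            v = max(v, alpha_beta(child, alpha, beta, False, visited))
            if v >= beta:  # MIN already has a better option elsewhere
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alpha_beta(child, alpha, beta, True, visited))
        if v <= alpha:  # MAX already has a better option elsewhere
            return v
        beta = min(beta, v)
    return v


# Two-ply tree; 4 and 6 are placeholder values for the pruned leaves x and y.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
evaluated = []
value = alpha_beta(tree, visited=evaluated)
print(value, evaluated)  # 3 [3, 12, 8, 2, 14, 5, 2]
```

Seven of the nine leaves are evaluated: after seeing the leaf 2 in the second MIN node, the search returns without looking at the two placeholder leaves.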
Comments: Alpha-Beta Pruning • Pruning does not affect the final result. • Entire subtrees can be pruned. • Good move ordering improves the effectiveness of pruning. • With "perfect ordering," time complexity is O(b^(m/2)) • With perfect ordering, alpha-beta pruning can look twice as far ahead as minimax in the same amount of time
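The effect of move ordering can be seen by counting evaluated leaves for different orderings of the same tree. The sketch below is a simplified illustration restricted to two-ply trees (a MAX root over MIN nodes), where only α cutoffs can occur; the function name and the leaf values are hypothetical.

```python
import math


def count_leaves(tree):
    """Alpha-beta on a two-ply tree; return (root value, leaves evaluated)."""
    alpha, evaluated = -math.inf, 0
    for min_node in tree:
        v = math.inf
        for leaf in min_node:
            evaluated += 1
            v = min(v, leaf)
            if v <= alpha:  # MAX already has a better option: prune the rest
                break
        alpha = max(alpha, v)
    return alpha, evaluated


# The same leaves in three different orders within the MIN nodes.
good_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]
mixed_order = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
bad_order = [[8, 12, 3], [6, 4, 2], [14, 5, 2]]

for name, t in [("good", good_order), ("mixed", mixed_order), ("bad", bad_order)]:
    print(name, count_leaves(t))
```

In every case the root value is the same, which matches the first comment above; only the number of leaves evaluated changes (5, 7, and 9 of the 9 leaves for these three orderings).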