Adversarial Search

Adversarial Search CPS 4801

Outline • Optimal decisions in games • α-β pruning

Games vs. search problems • Competitive environments: agents' goals are in conflict • Adversarial search problems (games) • Deterministic, turn-taking, two-player, zero-sum games • Tic-tac-toe • Time limits • How to make the best possible use of time • Choose a good move when time is limited

Define a game • S0: initial state • Player(s): which player has the move in a state • Action(s): set of legal moves in a state • Result(s,a): transition model • Terminal-test(s): true/false (terminal states) • Utility(s,p): final value of a game that ends in terminal state s for a player p
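The six components above can be sketched for tic-tac-toe. This is only an illustrative sketch: the board encoding (a tuple of 9 cells), the helper `winner`, the constant `LINES`, and the +1/0/-1 utilities are assumptions made for the example, not part of the definition.

```python
from typing import List, Optional, Tuple

# Assumed encoding: a state is a tuple of 9 cells, each "X", "O", or " ".
State = Tuple[str, ...]

# The eight rows, columns, and diagonals that win the game.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

S0: State = (" ",) * 9  # initial state: empty board


def player(s: State) -> str:
    """Player(s): X moves first, so X is to move when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """Action(s): indices of the empty cells."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: int) -> State:
    """Result(s, a): the state after the player to move marks cell a."""
    return s[:a] + (player(s),) + s[a + 1:]


def winner(s: State) -> Optional[str]:
    """Hypothetical helper: the mark that completed a line, if any."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """Terminal-test(s): someone has won or the board is full."""
    return winner(s) is not None or " " not in s


def utility(s: State, p: str) -> int:
    """Utility(s, p): +1 for a win, -1 for a loss, 0 for a draw (assumed scale)."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

Because the game is zero-sum, `utility(s, "X")` and `utility(s, "O")` always add up to 0 in this sketch.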

Game tree (1-player)

Game tree (2-player, deterministic, turns)

Minimax • Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play • E.g., a 2-ply game (one move for Max, one reply for Min)

Minimax • The minimax value of a node is the utility (for Max) of being in the corresponding state, assuming that both players play optimally. • Minimax(s) = Utility(s) if Terminal-test(s); max over a in Action(s) of Minimax(Result(s,a)) if Player(s) = Max; min over a in Action(s) of Minimax(Result(s,a)) if Player(s) = Min • Minimax decision: the values are backed up from the leaves to the root, and Max chooses the move with the highest backed-up value

Properties of minimax • Complete? Yes (if tree is finite) • Optimal? Yes (against an optimal opponent) • Time complexity? O(b^m), where b is the number of legal moves per position and m is the maximum depth of the tree • Space complexity? O(bm) (depth-first exploration) • For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible
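To get a feel for why the exact solution is infeasible for chess, the growth of b^m (time) against b·m (space) can be estimated directly. The snippet below is only a back-of-the-envelope sketch using the approximate figures from the slide; the variable names are illustrative.

```python
import math

b = 35    # approximate branching factor for chess (from the slide)
m = 100   # approximate game length in plies (from the slide)

# Time: about b^m nodes. Work with log10 to avoid a huge number.
time_exponent = m * math.log10(b)   # log10(b^m)

# Space: about b*m nodes kept in memory by depth-first exploration.
space_nodes = b * m

print(f"time  ~ 10^{time_exponent:.0f} nodes")
print(f"space ~ {space_nodes} nodes")
```

The time estimate comes out at roughly 10^154 nodes, while the space requirement stays in the thousands, so time, not memory, is the obstacle.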

α-β Pruning • Problem with minimax search: the number of states examined is exponential in the depth of the tree • We cannot eliminate the exponent, but can we cut it in half? • It is possible to compute the correct minimax decision without looking at every node. • Pruning: eliminate parts of the tree from consideration

α-β pruning example

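α-β pruning example • Minimax(root) = max(min(3,12,8), min(2,x,y), min(14,5,2)) = max(3, min(2,x,y), 2) = max(3, z, 2) where z = min(2,x,y) ≤ 2 = 3 • The value of the root, and hence the minimax decision, is independent of the values of the two unevaluated leaves x and y. • We made the same minimax decision without ever evaluating those two leaf nodes. • It is possible to prune entire subtrees, not just leaves.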
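The independence claim in this example can be checked numerically. The sketch below simply re-evaluates the root expression for many values of the two unevaluated leaves; the function name `root_value` and the tested range are arbitrary choices for illustration.

```python
def root_value(x: int, y: int) -> int:
    """Minimax value of the example root for given values of leaves x and y."""
    return max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))


# Try every pair (x, y) in a sample range; min(2, x, y) can never exceed 2,
# so the first branch (value 3) always wins at the root.
independent = all(root_value(x, y) == 3
                  for x in range(-50, 51)
                  for y in range(-50, 51))
```

A finite sample is not a proof, but it matches the algebra: since min(2, x, y) ≤ 2 < 3, the middle branch can never become Max's best choice.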

Why is it called α-β? • α = the value of the best (highest-value) choice found so far at any choice point along the path for Max • If v is worse than α, Max will avoid it, so that branch is pruned • β is defined similarly for Min
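The roles of α and β can be sketched in code. This is a minimal sketch, not a definitive implementation: it assumes a tree encoded as nested lists with integer leaves, Max and Min alternating by level, and an optional `visited` list (a hypothetical addition for this example) that records which leaves were actually evaluated.

```python
import math
from typing import List, Optional, Union

# Assumed encoding: an int is a leaf utility (for Max), a list is an internal node.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree,
              alpha: float = -math.inf,
              beta: float = math.inf,
              is_max: bool = True,
              visited: Optional[List[int]] = None) -> float:
    """Minimax value of node, skipping branches that cannot affect the result."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)      # record every leaf we evaluate
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:             # Min already has a better option elsewhere
                return v              # prune the remaining children
            alpha = max(alpha, v)     # best value found so far for Max
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                # Max already has a better option elsewhere
            return v                  # prune the remaining children
        beta = min(beta, v)           # best value found so far for Min
    return v


# Arbitrary illustrative leaf values for a 2-ply tree.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen: List[int] = []
value = alphabeta(tree, visited=seen)
```

On this tree the search returns 3 after evaluating seven of the nine leaves: once the second Min node sees the leaf 2, its value cannot exceed α = 3, so its remaining two leaves are skipped.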

Deterministic games in practice • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second and used very sophisticated evaluation, along with undisclosed methods for extending some lines of search up to 40 ply. • Othello: Logistello defeated the human world champion. It is generally acknowledged that humans are no match for computers at Othello.
