Games CPSC 386 Artificial Intelligence Ellen Walker Hiram College
Why Games? • Small number of rules • Well-defined knowledge set • Easy to evaluate performance • Large search spaces (too large for exhaustive search) • Fame & Fortune, e.g. Chess
Example Games & Best Computer Players (sec. 6.6 w/updates) • Chess - Deep Blue (beat Kasparov); Deep Junior (tied Kasparov); Hydra (scheduled to play British champion for 80,000 pounds) • Checkers - Chinook (world champion) • Go - Goemate, Go4++ (rated “weak amateur”) • Othello - Iago (world championship level), Logistello (defeated world champion, now retired) • Backgammon - TD-Gammon (neural network that learns to play using “reinforcement learning”)
Properties of Games • Two-Player • Zero-sum • If it’s good for one player, it’s bad for the opponent and vice versa • Perfect information • All relevant information is apparent to both players (no hidden cards)
Game as Search Problem • State space search • Each potential board or game position is a state • Each possible move is an operation • Space can be BIG: • large branching factor (chess avg. 35) • deep search for game (chess avg. 50 ply) • Components of any search technique • Move generator (successor function) • Terminal test (end of game?) • Utility function (win, lose or draw?)
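The components listed above (move generator, terminal test, utility function) can be pictured as a small interface. The sketch below is illustrative only; the class and method names are assumptions, not from the slides, and a concrete game would fill them in.

```python
# Illustrative interface for game-as-search: move generator, terminal test,
# and utility function. Names are hypothetical; a real game subclasses this.

class Game:
    def successors(self, state):
        """Move generator: yield (move, next_state) pairs.
        The state is assumed to record whose turn it is."""
        raise NotImplementedError

    def is_terminal(self, state):
        """Terminal test: has the game ended?"""
        raise NotImplementedError

    def utility(self, state):
        """Utility of a terminal state: +1 win for Max, -1 win for Min, 0 draw."""
        raise NotImplementedError
```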
Game Tree • Root is initial state • Next level is all of first player’s moves • Next level is all of second player’s moves • Example: Tic Tac Toe • Root: 9 blank squares • Level 1: 3 different boards (corner, center and edge X) • Level 2 below center: 2 different boards (corner, edge) • Etc. • Utility function: win for X is 1, win for O is -1, draw is 0 • X is Maximizer, O is Minimizer
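As a concrete illustration of the branching described above, here is a small move-generator sketch for tic-tac-toe; the board encoding (a 9-tuple of cells) is an assumption made for this example.

```python
# One level of the tic-tac-toe game tree: a board is a tuple of 9 cells,
# each 'X', 'O', or None for blank.

def tic_tac_toe_successors(board, player):
    """Yield (move_index, new_board) for every blank square."""
    for i, cell in enumerate(board):
        if cell is None:
            yield i, board[:i] + (player,) + board[i + 1:]

empty = (None,) * 9
print(len(list(tic_tac_toe_successors(empty, 'X'))))  # 9 moves at the root
# (only 3 are distinct once board symmetry is taken into account)
```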
Minimax Strategy • Max’s goal: get to 1 • Min’s goal: get to -1 • Max’s strategy • Choose moves that will lead to a win, even though min is trying to block • Minimax value of a node (backed up value): • If N is terminal, use the utility value • If N is a Max move, take max of successors • If N is a Min move, take min of successors
Minimax Values: 2-Ply Example • [Tree diagram: a two-ply game tree with leaf utility values; each Min node backs up the minimum of its leaves, and the Max root takes the maximum of those backed-up values]
Minimax Algorithm • Depth-first search to bottom of tree • As search “unwinds”, compute backed-up values • Backed-up value of root determines which move to make • Assumes: • Both players are playing this strategy (optimally) • Tree is small enough to search completely
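A minimal sketch of this depth-first backed-up-value computation, assuming the hypothetical Game interface sketched earlier:

```python
# Minimax: search to the bottom of the tree, then back up values.
# Assumes the Game interface (successors / is_terminal / utility) above.

def minimax(game, state, maximizing):
    """Return the backed-up (minimax) value of a state."""
    if game.is_terminal(state):
        return game.utility(state)              # utility at the leaves
    values = [minimax(game, child, not maximizing)
              for _, child in game.successors(state)]
    return max(values) if maximizing else min(values)
```

At the root, Max would pick the successor whose backed-up value equals the maximum.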
Alpha-Beta Pruning • We don’t really have to look at all subtrees! • Recognize when a position can never be chosen in minimax no matter what its children are • Max (3, Min(2,x,y) …) is always ≥ 3 • Min (2, Max(3,x,y) …) is always ≤ 2 • We know this without knowing x and y!
Alpha-Beta Pruning • Alpha = the value of the best choice we’ve found so far for MAX (highest) • Beta = the value of the best choice we’ve found so far for MIN (lowest) • When maximizing, cut off values lower than Alpha • When minimizing, cut off values greater than Beta
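A sketch of minimax with alpha-beta cutoffs, again against the assumed Game interface; alpha and beta are carried down the tree exactly as described above.

```python
# Alpha-beta pruning: alpha = best value found so far for Max,
# beta = best value found so far for Min. Assumes the Game interface above.

def alphabeta(game, state, maximizing, alpha=float('-inf'), beta=float('inf')):
    if game.is_terminal(state):
        return game.utility(state)
    if maximizing:
        value = float('-inf')
        for _, child in game.successors(state):
            value = max(value, alphabeta(game, child, False, alpha, beta))
            alpha = max(alpha, value)
            if value >= beta:        # Min already has something better: prune
                break
        return value
    else:
        value = float('inf')
        for _, child in game.successors(state):
            value = min(value, alphabeta(game, child, True, alpha, beta))
            beta = min(beta, value)
            if value <= alpha:       # Max already has something better: prune
                break
        return value
```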
Alpha-Beta Example • [Tree diagram: a two-ply alpha-beta search; once the first Min node backs up 3, a sibling Min node whose first leaf is 1 is bounded at <=1, so its remaining leaves (marked x) are cut off without being examined]
Notes on Alpha-Beta Pruning • Effectiveness depends on order of successors (middle vs. last node of 2-ply example) • If we can evaluate the best successor first, search is O(b^(d/2)) instead of O(b^d) • This means that in the same amount of time, alpha-beta search can search roughly twice as deep!
Optimizing Minimax Search • Use alpha-beta cutoffs • Evaluate most promising moves first • Remember prior positions, reuse their backed-up values • Transposition table (like closed list in A*) • Avoid generating equivalent states (e.g. 4 different first corner moves in tic tac toe) • But, we still can’t search a game like chess to the end!
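One way to picture the transposition-table idea: memoize the backed-up value of each position already searched. The sketch below uses the same assumed interface and additionally assumes states are hashable (e.g. tuples).

```python
# Transposition table sketch: cache backed-up values of positions
# already searched, so equivalent states are not re-expanded.

def minimax_tt(game, state, maximizing, table=None):
    table = {} if table is None else table
    key = (state, maximizing)
    if key in table:                       # position seen before: reuse value
        return table[key]
    if game.is_terminal(state):
        value = game.utility(state)
    else:
        values = [minimax_tt(game, child, not maximizing, table)
                  for _, child in game.successors(state)]
        value = max(values) if maximizing else min(values)
    table[key] = value
    return value
```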
When you can’t search to the end • Replace terminal test (end of game) by cutoff test (don’t search deeper) • Replace utility function (win/lose/draw) by heuristic evaluation function that estimates results on the best path below this board • Like A* search, good evaluation functions mean good results (and vice versa) • Replace move generator by plausible move generator (don’t consider “dumb” moves)
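A sketch of those substitutions: a depth counter acts as the cutoff test, and a heuristic evaluation function (an assumption here, passed in by the caller) stands in for the exact utility.

```python
# Depth-limited minimax: a cutoff test replaces the terminal test,
# and eval_fn estimates the value of positions at the cutoff depth.

def minimax_cutoff(game, state, maximizing, depth, eval_fn):
    if game.is_terminal(state):
        return game.utility(state)        # exact value if the game is over
    if depth == 0:                        # cutoff test: stop searching deeper
        return eval_fn(state)             # heuristic estimate, not exact
    values = [minimax_cutoff(game, child, not maximizing, depth - 1, eval_fn)
              for _, child in game.successors(state)]
    return max(values) if maximizing else min(values)
```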
Good evaluation functions… • Order terminal states in the same order as the utility function • Don’t take too long (we want to search as deep as possible in limited time) • Should be as accurate as possible (estimate chances of winning from that position…) • Human knowledge (e.g. material value) • Known solutions (e.g. endgame databases) • Pre-searched examples (extract features of a position, then use the average endgame value over all stored games with those features)
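As an example of the human-knowledge approach, here is a sketch of the classic material-value heuristic for chess; the piece weights are the usual textbook values, and the piece-list encoding is made up for this example.

```python
# Material-value evaluation: sum standard piece values for each side.
# Positive favors White (Max), negative favors Black (Min).

PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material_eval(white_pieces, black_pieces):
    white = sum(PIECE_VALUES.get(p, 0) for p in white_pieces)
    black = sum(PIECE_VALUES.get(p, 0) for p in black_pieces)
    return white - black

print(material_eval("QRRBNPPPP", "QRBNNPPP"))   # 3: White is up a minor piece
```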
How Deep to Search? • Until time runs out (the original application of Iterative Deepening!) • Until values don’t seem to change (quiescence) • Deep enough to avoid the horizon effect (a delaying tactic pushes an inevitable bad outcome just beyond the depth of the search) • Singular extensions - search the best (apparent) paths deeper than others • Tends to limit the horizon effect, since these are the moves that would exhibit it
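A sketch of “search until time runs out” via iterative deepening, reusing the depth-limited minimax_cutoff sketch above; the time handling is simplified (real engines also check the clock inside the search rather than only between depths).

```python
import time

# Iterative deepening for game search: repeat depth-limited search with
# increasing depth limits until the time budget is spent.

def iterative_deepening(game, state, eval_fn, time_limit=1.0):
    deadline = time.time() + time_limit
    best_value, completed_depth = None, 0
    depth = 1
    while time.time() < deadline:
        best_value = minimax_cutoff(game, state, True, depth, eval_fn)
        completed_depth = depth            # keep the deepest finished result
        depth += 1
    return best_value, completed_depth
```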