Artificial Intelligence. Games 1: Game Tree Search. Ian Gent, ipg@cs.st-and.ac.uk
Artificial Intelligence. Game Tree Search • Part I: Game Trees • Part II: MiniMax • Part III: A bit of Alpha-Beta
Perfect Information Games • Unlike Bridge, we consider 2 player perfect information games • Perfect Information: both players know everything there is to know about the game position • no hidden information (e.g. opponents' hands in bridge) • no random events (e.g. draws in poker) • the two players need not have the same set of moves available • examples are Chess, Go, Checkers, O's and X's • Ginsberg made Bridge a 2 player perfect information game • by assuming specific random locations of the cards • the two players were North-South and East-West
Game Trees • A game tree is like a search tree • nodes are search states, with full details about a position • e.g. chessboard + castling/en passant information • edges between nodes correspond to moves • leaf nodes correspond to determined positions • e.g. Win/Lose/Draw • number of points for or against player • at each node it is one or other player’s turn to move
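To make this concrete, here is a minimal sketch of a game-tree node in Python (the names and structure are illustrative, not the lecture's code):

```python
from dataclasses import dataclass, field
from typing import Optional

# A node holds full details of a position, records whose turn it is,
# and has one edge (move) per child node; leaves carry a final score.
@dataclass
class Node:
    position: object                  # full position details, e.g. board + castling/en passant
    to_move: str                      # "Max" or "Min": whose turn it is at this node
    children: dict = field(default_factory=dict)   # move -> child Node
    score: Optional[float] = None     # Win/Lose/Draw or points, set at leaf nodes

    def is_leaf(self) -> bool:
        # leaf nodes correspond to determined positions
        return not self.children
```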
Game Trees vs Search Trees • Strong similarities with 8s puzzle search trees • there may be loops/infinite branches • typically no equivalent of a variable ordering heuristic • the "variable" is always what move to make next • One major difference with the 8s puzzle: you have an opponent! • Call the two players Max and Min • Max wants a leaf node with the max possible score • e.g. Win = +∞ • Min wants a leaf node with the min score • e.g. Lose = -∞
The problem with Game trees • Game trees are huge • O's and X's not bad, just 9! = 362,880 • Checkers/Draughts about 10^40 • Chess about 10^120 • Go utterly ludicrous, e.g. 361! ≈ 10^768 • Recall from the Search 1 lecture: • it is not good enough to find a route to a win • we have to find a winning strategy • Unlike 8s/SAT/TSP, we can't just look for one leaf node • typically we need lots of different winning leaf nodes • Much more of the tree needs to be explored
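These figures are easy to check (a quick sketch, not part of the lecture):

```python
import math

print(math.factorial(9))                 # 362880 move sequences in O's and X's
print(math.lgamma(362) / math.log(10))   # log10(361!) is about 768, so 361! ~ 10^768
```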
Coping with impossibility • It is usually impossible to solve games completely • Connect 4 has been solved • Checkers has not been • we'll see a brave attempt later • This means we cannot search the entire game tree • we have to cut off search at a certain depth • like depth-bounded depth-first search, we lose completeness • Instead we have to estimate the value of internal nodes • We do so using a static evaluation function
Static evaluation • A static evaluation function should estimate the true value of a node • true value = the value the node would get if we performed exhaustive search • it need not be just +∞/0/-∞ even if those are the only final scores • it can indicate the strength of a position • e.g. nodes might evaluate to +1, 0, -10 • Children learn a simple evaluation function for chess • P = 1, N = B = 3, R = 5, Q = 9, K = 1000 • Static evaluation is the difference in the sums of the piece scores • chess programs have much more complicated functions
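A sketch of this material-count evaluation; the board representation (a list of piece letters, upper case for Max, lower case for Min) is an assumption for illustration:

```python
PIECE_VALUE = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9, 'K': 1000}

def material_eval(pieces):
    """Difference in the sums of piece scores: Max's material minus Min's."""
    score = 0
    for p in pieces:
        value = PIECE_VALUE.get(p.upper(), 0)   # assumed board encoding
        score += value if p.isupper() else -value
    return score

# e.g. Max is a knight up:
print(material_eval(['K', 'Q', 'R', 'N', 'k', 'q', 'r']))   # +3
```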
O's and X's • A simple evaluation function for O's and X's is: • count the lines still open for maX, • subtract the number of lines still open for min • the evaluation at the start of the game is 0 • after X moves in the centre, the score is +4 • Evaluation functions are only heuristics • e.g. a position might score -2 even though maX can win at the next move:
O - X
- O X
- - -
• Use a combination of the evaluation function and search
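A sketch of the open-lines evaluation (the 9-element list representation, with 'X' for maX, 'O' for min and '-' for empty, is an assumption for illustration):

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),    # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),    # columns
         (0, 4, 8), (2, 4, 6)]               # diagonals

def open_lines_eval(board):
    """Lines still open for X minus lines still open for O."""
    open_x = sum(1 for line in LINES if all(board[i] != 'O' for i in line))
    open_o = sum(1 for line in LINES if all(board[i] != 'X' for i in line))
    return open_x - open_o

print(open_lines_eval(['-'] * 9))              # 0 at the start of the game
print(open_lines_eval(['-'] * 4 + ['X'] + ['-'] * 4))   # +4 after X takes the centre
print(open_lines_eval(list("O-X-OX---")))      # -2, yet X wins next move (bottom right)
```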
MiniMax • Assume that both players play perfectly • Therefore we cannot optimistically assume a player will miss the winning response to our moves • E.g. consider Min's strategy • Min wants the lowest possible score, ideally -∞ • but must account for Max aiming for +∞ • Min's best strategy is: • choose the move that minimises the score that will result when Max chooses the maximising move • hence the name MiniMax • Max does the opposite
Minimax procedure • Statically evaluate positions at depth d • From then on work upwards • The score of a Max node is the max of its child nodes' scores • The score of a Min node is the min of its child nodes' scores • Doing this from the bottom up eventually gives the scores of the possible moves from the root node • hence the best move to make • We can still do this depth first, so it is space efficient
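A minimal depth-first minimax sketch (illustrative, not the lecture's code). The tree is given explicitly: a number is a leaf that has already been statically evaluated (in a real program the recursion would stop at depth d and call the static evaluation function there), and a list is an internal node:

```python
def minimax(tree, maximising):
    if isinstance(tree, (int, float)):
        return tree                     # leaf: static evaluation already done
    child_scores = [minimax(child, not maximising) for child in tree]
    # Max nodes take the max of child scores; Min nodes take the min.
    return max(child_scores) if maximising else min(child_scores)

# Max to move at the root; Min replies one level down:
print(minimax([[3, 12, 8], [2, 4, 6], [14, 5, 2]], True))   # 3
```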
What's wrong with MiniMax • Minimax is horrendously inefficient • If we go to depth d with branching rate b, • we must explore b^d nodes • but many nodes are wasted • We needlessly calculate the exact score at every node • but at many nodes we don't need to know the exact score • e.g. the nodes outlined in the lecture's tree diagram are irrelevant
Alpha-Beta search • Alpha-Beta = α-β • Uses the same insight as branch and bound • When we cannot do better than the best so far • we can cut off search in this part of the tree • It is more complicated than branch and bound because the two players' score functions are opposed • To implement this we will manipulate alpha and beta values, and store them on internal nodes in the search tree
Alpha and Beta values • At a Max node we will store an alpha value • the alpha value is a lower bound on the exact minimax score • the true value might be ≥ α • if we know Min can choose moves with score < α • then Min will never choose to let Max go to a node where the score will be α or more • At a Min node, we will store a beta value • the beta value is an upper bound on the exact minimax score • the true value might be ≤ β • Alpha-Beta search uses these values to cut off search
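A sketch of alpha-beta along these lines (illustrative, not the lecture's code), using the same explicit-tree representation as the minimax sketch above:

```python
def alphabeta(tree, alpha, beta, maximising):
    if isinstance(tree, (int, float)):
        return tree                      # leaf: static evaluation already done
    for child in tree:
        score = alphabeta(child, alpha, beta, not maximising)
        if maximising:
            alpha = max(alpha, score)    # alpha: lower bound at a Max node
        else:
            beta = min(beta, score)      # beta: upper bound at a Min node
        if beta <= alpha:
            break                        # cut off: remaining children are irrelevant
    return alpha if maximising else beta

# Same tree as before, same answer, but fewer leaves examined:
print(alphabeta([[3, 12, 8], [2, 4, 6], [14, 5, 2]],
                float('-inf'), float('inf'), True))   # 3
```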
Alpha Beta in Action • Why can we cut off search? • Suppose a Min node has beta = 1 while an ancestor Max node has alpha = 2, so beta = 1 < alpha = 2 • At the ancestor node, Max had a choice that guarantees a score of at least 2 (maybe more) • Max is not going to move right and let Min guarantee a score of 1 (maybe less)
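Running the alphabeta sketch above on a made-up tree that matches these numbers shows the cutoff:

```python
# Left Min node: Max can guarantee 2, so the root's alpha becomes 2.
# Right Min node: its first leaf gives beta = 1 <= alpha = 2, so the
# remaining leaves 7 and 9 are never examined.
print(alphabeta([[2, 5], [1, 7, 9]], float('-inf'), float('inf'), True))   # 2
```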
Summary and Next Lecture • Game trees are similar to search trees • but have opposing players • Minimax characterises the value of nodes in the tree • but is horribly inefficient • Use static evaluation when tree too big • Alpha-beta can cut off nodes that need not be searched • Next Time: More details on Alpha-Beta