Antonio Sanchez, Texas Christian University. Spring 2011, Artificial Intelligence, COSC 40503
Heuristic Function
In a board game, before we make a move, when we have various options, how do we know which is the best move?
• Information: I have an algorithm that guides me to win the game (truth)
• Adaptation: through learning by trial and error (algedonics or backpropagation)
• Knowledge: by defining a heuristic function F that gives a sense of evaluation and comparison between moves, allowing us to use a contingent rule
• IF in state i THEN move to state k // rule selection
Heuristic Function
In a board game, to arrive at that knowledge:
• I have to perceive some patterns in the boards, that is, aspects that give a comparative value among them
• For example, we believe that the green option is the best; although this is not an absolute truth, it is nevertheless knowledge
• IF Black Board THEN Green Board
How can we define the knowledge behind this opinion? After some consideration we may come up with something like this:
BoardEvaluation = MyOpenOptions - TheirOpenOptions
Using this function we obtain, for the possible boards:
• BoardEvaluation(Yellow Board) = 8 - 5 = 3
• BoardEvaluation(Green Board) = 8 - 4 = 4
• BoardEvaluation(Red Board) = 8 - 6 = 2
• BoardEvaluation(Black Board) = 8 - 8 = 0
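The slide's BoardEvaluation rule can be sketched directly in code. This is a minimal illustration, not part of the original slides; the feature names MyOpenOptions and TheirOpenOptions are taken from the formula above, and how those counts are obtained from a real board is left out.

```java
// Sketch of the slide's evaluation rule:
// BoardEvaluation = MyOpenOptions - TheirOpenOptions
public class BoardEval {
    public static int evaluate(int myOpenOptions, int theirOpenOptions) {
        return myOpenOptions - theirOpenOptions;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(8, 5)); // yellow board -> 3
        System.out.println(evaluate(8, 4)); // green board  -> 4 (best)
        System.out.println(evaluate(8, 6)); // red board    -> 2
        System.out.println(evaluate(8, 8)); // black board  -> 0
    }
}
```

The green board scores highest, matching the opinion on the slide.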
Heuristic Function: E = ∑ a·X^p
Let us test the function further on the game with another board. Here we have a tie between the blue and the green boards. Furthermore, we feel that the red board is a better one and the function is not showing it. Our knowledge is incomplete, therefore we must add extra terms.
How about: BoardEvaluation = MyOpenOptions - TheirOpenOptions + 2*MyWinningOptions
Much better. In general, then, we can stipulate that an evaluation function is a polynomial with as many elements as we think represent the value of the board. Furthermore, each element can be multiplied by a factor and raised to a power to weight the element in the function: E = ∑ a·X^p
Determining the right function is a matter of expertise, i.e. knowledge and commitment.
Authors in the 60s: Nils Nilsson, A. Samuel, Herbert Simon and Allen Newell
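The general polynomial E = ∑ a·X^p can be sketched as a small helper that sums weighted, powered feature terms. This is an illustrative sketch, not the course's required implementation; the particular (a, X, p) triples below just reproduce the extended formula from this slide.

```java
// Sketch of the polynomial evaluation function E = sum(a * X^p):
// each term is a coefficient a, a board feature value X, and a power p.
public class PolyEval {
    // terms[i] = {a, X, p}
    public static double evaluate(double[][] terms) {
        double e = 0.0;
        for (double[] t : terms) {
            e += t[0] * Math.pow(t[1], t[2]);
        }
        return e;
    }

    public static void main(String[] args) {
        // MyOpenOptions - TheirOpenOptions + 2*MyWinningOptions,
        // with illustrative feature values 8, 4 and 1:
        double e = evaluate(new double[][] {
            { 1, 8, 1},  // +1 * MyOpenOptions^1
            {-1, 4, 1},  // -1 * TheirOpenOptions^1
            { 2, 1, 1}   // +2 * MyWinningOptions^1
        });
        System.out.println(e); // 8 - 4 + 2 = 6.0
    }
}
```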
One or Many Heuristic Functions
A final note on the heuristic function: as we play further down the game, the function may have to be changed according to where in the game we are.
• At the beginning, the strategy of the game may be to achieve a given position.
• In the middle of the game, maybe the strategy lies in capturing pieces.
• Towards the end of the game, it may be that making just one move is important.
Should this be the case, then we might have different X terms with different factors a, p in the formula to accommodate the strategy at hand, thus having different formulas according to the depth of the game:
E(t) = ∑ a·X^p    E(t+δt) = ∑ a·X^p    E(t+2δt) = ∑ a·X^p
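One simple way to realize E(t), E(t+δt), E(t+2δt) is to keep the same features but switch coefficient sets by game stage. This is only a sketch of the idea: the move-number thresholds, coefficient values, and feature meanings below are all illustrative assumptions, not from the slides.

```java
// Sketch: same features X[], different coefficient sets a[] depending on
// how deep into the game we are (move number is a stand-in for depth;
// all thresholds and weights are illustrative).
public class PhasedEval {
    static final double[] OPENING = {1.0, -1.0, 0.5}; // favor position
    static final double[] MIDGAME = {0.5, -0.5, 2.0}; // favor captures
    static final double[] ENDGAME = {0.1, -0.1, 5.0}; // favor the winning move

    // features = {myOpenOptions, theirOpenOptions, myWinningOptions}
    public static double evaluate(double[] features, int moveNumber) {
        double[] a = moveNumber < 10 ? OPENING
                   : moveNumber < 30 ? MIDGAME : ENDGAME;
        double e = 0.0;
        for (int i = 0; i < features.length; i++) {
            e += a[i] * features[i];
        }
        return e;
    }
}
```

The same board thus receives different values at different stages, reflecting the changing strategy.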
Jumping Ahead
However, after I play my move, my opponents will play theirs, and if they are rational players, they will use a criterion to select their move in order to get a benefit, reducing the value of the board just chosen by me. Therefore I should evaluate not my next move, but a combination of my move and theirs, to determine my best choice. In their case I should consider not a maximization of the formula E = ∑ a·X^p but rather its minimization, since by reducing the value of the function from their perspective they choose a better board position. Let's take a look.
By the way: Bremermann's Limit
Using Einstein's E = mc² equation and the Heisenberg uncertainty principle, H. Bremermann derived a limit on the storage capacity of matter. It can also be used to determine the speed of a communication channel.
The value proposed is 2 × 10^47 bits per second per gram. The number is important because it gives us a sense of how much humankind can expect to get from nanotechnologies and beyond.
The iPod, which stores 15,000 songs in just 5.5 ounces, represents a value of only 3 × 10^9, and the smallest memory might achieve up to 2 × 10^10 bits per gram; yet the limit is there and one day will be reached.
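The iPod figure can be sanity-checked with back-of-the-envelope arithmetic. The ~4 MB-per-song size below is an assumption introduced for this check (it is not stated on the slide); with it, 15,000 songs in 5.5 ounces indeed lands on the order of 3 × 10^9 bits per gram.

```java
// Back-of-the-envelope check of the iPod density figure.
// Assumption: roughly 4 MB per song (not from the slide).
public class BremermannCheck {
    public static double bitsPerGram(long songs, double mbPerSong, double ounces) {
        double bits  = songs * mbPerSong * 8e6; // megabytes -> bits
        double grams = ounces * 28.35;          // ounces -> grams
        return bits / grams;
    }

    public static void main(String[] args) {
        // 15,000 songs in 5.5 ounces -> about 3 * 10^9 bits per gram,
        // some 38 orders of magnitude below Bremermann's 2 * 10^47.
        System.out.printf("%.2e%n", bitsPerGram(15000, 4.0, 5.5));
    }
}
```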
Project 2 and Exam 1
• Note the requirements for project 2
• Your E = ∑ a·X^p must beat the STM
• The E = ∑ a·X^p must train the STM, letting it play first, after playing no more than 500 games. Most likely a set of formulas with various degrees will be necessary
• For all three games: the evaluation should be done using recursive alpha-beta (2 ply) and should do well when playing against a human
• Note the date of the next project: Monday, March 7
MiniMax
The algorithm to evaluate the next move works bottom-up as follows:
• Branch down two levels (known as one ply)
• Evaluate all the boards
• Minimize by selecting the board with the minimum value
• Move up one level
• Maximize by selecting the board with the best value
• This is the best move, with a value given by the function
It is called MINImize-MAXimize (MINIMAX). In the example tree, the selected move has value 5.
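The bottom-up steps above can be sketched compactly: group the evaluated leaf boards by our candidate move, minimize within each group (the opponent's reply), then maximize across groups. The leaf values below are illustrative, chosen so that the selected move's value is 5 as in the slide's example.

```java
// Minimal sketch of one-ply minimax over already-evaluated leaf boards.
public class MiniMaxStep {
    // children[k] = evaluations of the boards reachable after our move k
    public static int bestMoveValue(int[][] children) {
        int best = Integer.MIN_VALUE;
        for (int[] group : children) {
            int min = Integer.MAX_VALUE;
            for (int v : group) min = Math.min(min, v); // opponent minimizes
            best = Math.max(best, min);                 // we maximize
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] leaves = { {5, 8}, {2, 3, 6}, {3, 4, 7}, {3, 2, 5} };
        // group minima are 5, 2, 3, 2, so the maximum is 5
        System.out.println(bestMoveValue(leaves));
    }
}
```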
Minimax
[Figure: a small minimax tree; the leaf evaluations (Eval) are minimized at the Min level, giving 2, 1 and 3, and the maximum of those minima, 3, is selected at the Max level.]
MiniMax
Now, in reality, one should try to branch the tree further down than one ply; the closer we get to the end of the game, the closer to the truth our knowledge will be. As a matter of fact, many computer games that you play generally have three levels: novice, medium, expert. The expertise most likely reflects how far they branch down; most of the time we deal with 1, 2 and 3 plies. The further you go down, the longer it will take to decide. Nowadays faster machines are able to go as far down as twelve plies. Imagine just the number of boards to evaluate.
eval => minimize - maximize - minimize - maximize - minimize - maximize => select
Pseudo Code MiniMax

public class Minimax // plain minimax, without pruning
int[] BOARD = new int[3];
// BOARD[1] is the board number
// BOARD[2] is the evaluation of the board
// BOARD[3] is the number of possible boards that open from this one
int[] POSSIBLE = new int[3];
int[] MinBOARD = new int[3];
int[] MaxBOARD = new int[3];
int play, plyLimit, depthGrow; // define desired board, play and depth

// Recursive MINIMAX
public void init() {
  plyLimit = depth;
  game = "playing";
  while (game == "playing") {
    BOARD = Maximize(BOARD, 1);
    display(BOARD);
    game = EvaluateEnd(BOARD);
    if (game == "playing") {
      BOARD = otherPlay(BOARD);
      game = EvaluateEnd(BOARD);
    }
  }
}

// Maximize method
int[] Maximize(BOARD, depthGrow) {
  float max = -infinite;
  for (j = 1; j <= BOARD[3]; j = j + 1) {
    MaxBOARD = NextBoard(j, BOARD);
    POSSIBLE = Minimize(MaxBOARD, depthGrow);
    if (POSSIBLE[2] > max) { max = POSSIBLE[2]; BOARD = POSSIBLE; }
  }
  return BOARD;
}

// Minimize method
int[] Minimize(BOARD, depthGrow) {
  depthGrow = depthGrow + 1;
  float min = +infinite;
  for (j = 1; j <= BOARD[3]; j = j + 1) {
    MinBOARD = NextBoard(j, BOARD);
    if (depthGrow == plyLimit)
      POSSIBLE[2] = Evaluate(MinBOARD[1]);
    else
      POSSIBLE = Maximize(MinBOARD, depthGrow);
    if (POSSIBLE[2] < min) { min = POSSIBLE[2]; BOARD = POSSIBLE; }
  }
  return BOARD;
}

// Evaluate method
float Evaluate(int[3] BOARD) { // determines the value of the function for the board }
// EvaluateEnd method
String EvaluateEnd(int[3] BOARD) { // determines if the game has ended }
// display method
void display(int[3] BOARD) { // displays the board on the interface }
// NextBoard method
int NextBoard(int j, int[3] BOARD) { // determines the next sequential board }
Big Trees
So what do you do when you have a tree with many branches you do not need? You prune it, don't you? Back in the sixties, A. Samuel did it; he called his algorithm αβ pruning. The idea behind it is that the values in the upper two nodes already selected might preclude some nodes from being considered.
αβ Pruning
[Figure: a two-level αβ example over the leaf evaluations 2, 7, 1, 8, 3. The first Min node's value 3 goes up as α; the later Min nodes stop expanding as soon as their running bound satisfies α >= 3, so their remaining leaves do not go up.]
αβ Recursive Algorithm
But the pruning algorithm is recursive and goes further down the tree: α and β work two levels at a time, with stacked values α′, β′ and α″, β″. The α, β values at each level help in reducing the nodes below, and even the nodes well below.
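The recursive αβ idea can be sketched over an explicit toy tree. The Node structure and leaf values below are invented for this example; the point is only that once a Min node's bound drops to α or below, its remaining siblings are never expanded.

```java
// Minimal recursive alpha-beta sketch over an explicit game tree
// (Node is a toy structure invented for this illustration).
public class AlphaBeta {
    static class Node {
        int value;
        Node[] children;
        Node(int v) { value = v; children = new Node[0]; }
        Node(Node... c) { children = c; }
    }

    public static int search(Node n, int alpha, int beta, boolean max) {
        if (n.children.length == 0) return n.value; // leaf: static evaluation
        for (Node c : n.children) {
            int v = search(c, alpha, beta, !max);
            if (max) alpha = Math.max(alpha, v);
            else     beta  = Math.min(beta, v);
            if (beta <= alpha) break;               // prune remaining siblings
        }
        return max ? alpha : beta;
    }

    public static void main(String[] args) {
        // Max root over two Min nodes; the leaf 8 is never visited because
        // its sibling 1 already makes the second Min node worse than 3.
        Node root = new Node(
            new Node(new Node(3), new Node(7)),
            new Node(new Node(1), new Node(8)));
        System.out.println(search(root, Integer.MIN_VALUE, Integer.MAX_VALUE, true));
    }
}
```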
An Example of Alpha-Beta Pruning
[Figure: a max/min tree with leaf values 0, 5, -3, 3, 3, -3, 0, 2, -2, 3; intermediate min values include 0 and -3, and the root max value is 0.]
Source: www.cs.umbc.edu/~ypeng/F02671/lecture-notes/Ch05.pptt
Final Tree
[Figure: the final max/min tree for the same example, with leaf values 0, 5, -3, 3, 3, -3, 0, 2, -2, 3.]
Source: www.cs.umbc.edu/~ypeng/F02671/lecture-notes/Ch05.pptt
Another αβ Example
Source: www.cs.pitt.edu/~milos/courses/cs2710/lectures/Class8.pdf
Pseudo Code αβ

public class Minimax with alpha-beta pruning
int[] BOARD = new int[3];
// BOARD[1] is the board number
// BOARD[2] is the evaluation of the board
// BOARD[3] is the number of possible boards that open from this one
int[] POSSIBLE = new int[3];
int[] MinBOARD = new int[3];
int[] MaxBOARD = new int[3];
int play, plyLimit, depthGrow; // define desired board, play and depth

// Recursive MINIMAX
public void init() {
  plyLimit = depth;
  game = "playing";
  while (game == "playing") {
    BOARD = Maximize(BOARD, alpha, beta, 1);
    display(BOARD);
    game = EvaluateEnd(BOARD);
    if (game == "playing") {
      BOARD = otherPlay(BOARD);
      game = EvaluateEnd(BOARD);
    }
  }
}

// Maximize method
int[] Maximize(BOARD, alpha, beta, depthGrow) {
  alpha = -infinite;
  for (j = 1; j <= BOARD[3]; j = j + 1) {
    MaxBOARD = NextBoard(j, BOARD);
    POSSIBLE = Minimize(MaxBOARD, alpha, beta, depthGrow);
    if (POSSIBLE[2] > alpha) { alpha = POSSIBLE[2]; BOARD = POSSIBLE; }
  }
  return BOARD;
}

// Minimize method
int[] Minimize(BOARD, alpha, beta, depthGrow) {
  depthGrow = depthGrow + 1;
  pruning = "in process";
  beta = +infinite;
  for (j = 1; j <= BOARD[3] AND pruning == "in process"; j = j + 1) {
    MinBOARD = NextBoard(j, BOARD);
    if (depthGrow == plyLimit)
      POSSIBLE[2] = Evaluate(MinBOARD[1]);
    else
      POSSIBLE = Maximize(MinBOARD, alpha, beta, depthGrow);
    if (POSSIBLE[2] < beta) {
      beta = POSSIBLE[2];
      BOARD = POSSIBLE;
      if (beta <= alpha) pruning = "done"; // cutoff: stop expanding siblings
    }
  }
  return BOARD;
}

// Evaluate method
float Evaluate(int[3] BOARD) { // determines the value of the function for the board }
// EvaluateEnd method
String EvaluateEnd(int[3] BOARD) { // determines if the game has ended }
// display method
void display(int[3] BOARD) { // displays the board on the interface }
// NextBoard method
int NextBoard(int j, int[3] BOARD) { // determines the next sequential board }
Bigger Trees?
If you still have a bigger tree, you can always do a crash cut. The idea behind this pruning is to cut off a complete subtree, before branching, if you do not like a given type of move. For example, in chess, let's say you do not care to exchange your queens; then that whole subtree is chopped off.
αβ Crash Cut Algorithm
[Figure: the stacked αβ tree (α′, β′, α″, β″) from before, with one complete subtree simply disregarded before branching.]
Pseudo Code Crash Cut

public class Minimax with alpha-beta and crash-cut pruning
int[] BOARD = new int[3];
// BOARD[1] is the board number
// BOARD[2] is the evaluation of the board
// BOARD[3] is the number of possible boards that open from this one
int[] POSSIBLE = new int[3];
int[] MinBOARD = new int[3];
int[] MaxBOARD = new int[3];
int play, plyLimit, depthGrow; // define desired board, play and depth

// Recursive MINIMAX
public void init() {
  plyLimit = depth;
  game = "playing";
  while (game == "playing") {
    BOARD = Maximize(BOARD, alpha, beta, 1);
    display(BOARD);
    game = EvaluateEnd(BOARD);
    if (game == "playing") {
      BOARD = otherPlay(BOARD);
      game = EvaluateEnd(BOARD);
    }
  }
}

// Maximize with crash-cut pruning
int[] Maximize(BOARD, alpha, beta, depthGrow) {
  alpha = -infinite;
  for (j = 1; j <= BOARD[3]; j = j + 1) {
    MaxBOARD = NextBoard(j, BOARD);
    // function dontCut returns true if there is no crash cut for this board
    if (dontCut(MaxBOARD[1])) {
      POSSIBLE = Minimize(MaxBOARD, alpha, beta, depthGrow);
      if (POSSIBLE[2] > alpha) { alpha = POSSIBLE[2]; BOARD = POSSIBLE; }
    }
  }
  return BOARD;
}

// dontCut method
boolean dontCut(int board) {
  keep = true;
  for (j = 1; j <= cutBoards AND keep == true; j = j + 1) {
    if (board == cutList[j]) keep = false;
  }
  return keep;
}

// Minimize method
int[] Minimize(BOARD, alpha, beta, depthGrow) {
  depthGrow = depthGrow + 1;
  pruning = "in process";
  beta = +infinite;
  for (j = 1; j <= BOARD[3] AND pruning == "in process"; j = j + 1) {
    MinBOARD = NextBoard(j, BOARD);
    if (depthGrow == plyLimit)
      POSSIBLE[2] = Evaluate(MinBOARD[1]);
    else
      POSSIBLE = Maximize(MinBOARD, alpha, beta, depthGrow);
    if (POSSIBLE[2] < beta) {
      beta = POSSIBLE[2];
      BOARD = POSSIBLE;
      if (beta <= alpha) pruning = "done";
    }
  }
  return BOARD;
}

// Evaluate method
float Evaluate(int[3] BOARD) { // determines the value of the function for the board }
// EvaluateEnd method
String EvaluateEnd(int[3] BOARD) { // determines if the game has ended }
// display method
void display(int[3] BOARD) { // displays the board on the interface }
// NextBoard method
int NextBoard(int j, int[3] BOARD) { // determines the next sequential board }
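The essence of the crash cut is just a membership test before branching. The sketch below is an illustration, not the course pseudocode: child boards are reduced to precomputed values, and the cut list of board indices stands in for "move types we refuse to explore" (such as a queen exchange).

```java
import java.util.Set;

// Sketch of crash-cut pruning: skip an entire subtree whenever the move
// leading into it appears in a cut list (indices are illustrative).
public class CrashCut {
    public static int maximize(int[] childValues, Set<Integer> cutList) {
        int best = Integer.MIN_VALUE;
        for (int j = 0; j < childValues.length; j++) {
            if (cutList.contains(j)) continue;     // chop off this whole subtree
            best = Math.max(best, childValues[j]); // stand-in for Minimize(...)
        }
        return best;
    }

    public static void main(String[] args) {
        // Move 2 (say, a queen exchange) is crash-cut even though it scores highest.
        System.out.println(maximize(new int[]{3, 5, 9, 4}, Set.of(2))); // 5
    }
}
```

Unlike αβ pruning, a crash cut can discard the objectively best line, so it trades accuracy for speed.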
Behavior with Pruning
[Figure: search time versus ply for plain minimax, αβ, and crash cut; the pruned searches grow much more slowly with ply.]
Game Trees with Chance Nodes
• Chance nodes (shown as circles) represent the dice rolls.
• Each chance node has 21 distinct children, with a probability associated with each.
• We can use minimax to compute the values for the MAX and MIN nodes.
• Use expected values for chance nodes.
• For a chance node C over MAX nodes we compute: expectimax(C) = Sum_i( P(d_i) * maxvalue(i) )
• For a chance node over MIN nodes we compute: expectimin(C) = Sum_i( P(d_i) * minvalue(i) )
[Figure: game tree with chance nodes between the levels, labeled Min Rolls and Max Rolls.]
Source: www.cs.umbc.edu/~ypeng/F02671/lecture-notes/Ch05.pptt
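The expectimax computation for a chance node over MAX nodes can be sketched directly from the formula. The two-outcome probabilities and leaf values below are illustrative (a real dice chance node would have 21 weighted children, as the slide notes).

```java
// Sketch of a chance node over MAX nodes:
// expectimax(C) = sum_i P(d_i) * maxvalue(i)
public class Expectimax {
    public static double chanceOverMax(double[] probs, double[][] maxChildren) {
        double e = 0.0;
        for (int i = 0; i < probs.length; i++) {
            double best = Double.NEGATIVE_INFINITY;
            for (double v : maxChildren[i]) {
                best = Math.max(best, v);  // maxvalue(i)
            }
            e += probs[i] * best;          // weight by P(d_i)
        }
        return e;
    }

    public static void main(String[] args) {
        // Two outcomes with probabilities 0.9 and 0.1:
        // 0.9 * max(2, 3) + 0.1 * max(1, 4) = 2.7 + 0.4 = 3.1
        System.out.println(chanceOverMax(
            new double[]{0.9, 0.1},
            new double[][]{{2, 3}, {1, 4}}));
    }
}
```

expectimin is the same computation with a minimum in place of the maximum.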
Meaning of the Evaluation Function
[Figure: two trees with the same ordering of leaf values but different magnitudes; in one, A1 is the best move, in the other, A2 is the best move, for 2 outcomes with probabilities {.9, .1}.]
• Dealing with probabilities and expected values means we have to be careful about the "meaning" of the values returned by the static evaluator.
• Note that a "relative-order preserving" change of the values would not change the decision of minimax, but could change the decision with chance nodes.
• Positive linear transformations are OK.
Source: www.cs.umbc.edu/~ypeng/F02671/lecture-notes/Ch05.pptt
State of the Art: Chess Point Scale
Source: www.cs.umbc.edu/~ypeng/F02671/lecture-notes/Ch05.pptt