The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Program • By James Mannion • Computer Systems Lab 08-09, Period 3
Abstract • Searching through large sets of data • Complex, vast domains • Heuristic searches • Chess • Evaluation Function • Machine Learning
Introduction • Games • Minimax search • Alpha-beta pruning • Only look 2-3 moves into the future • Estimate strength of position • Evaluation function • Can improve heuristic by learning
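The depth-limited search described on this slide can be sketched as follows. This is a minimal illustration, not the project's actual code: the `children` and `evaluate` callbacks and the toy tree are hypothetical stand-ins for a real move generator and a real chess evaluation function.

```python
import math

def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    children(state) -> list of successor states (hypothetical callback)
    evaluate(state) -> heuristic score from the maximizing player's view
    """
    successors = children(state)
    # Stop at the depth limit or at a position with no moves and
    # fall back on the heuristic estimate of the position.
    if depth == 0 or not successors:
        return evaluate(state)
    if maximizing:
        best = -math.inf
        for child in successors:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player would never allow this line
        return best
    best = math.inf
    for child in successors:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return best

# Toy game tree (assumption): nested lists are positions, numbers are leaves.
TOY_TREE = [[3, 5], [2, 9]]

def toy_children(node):
    return node if isinstance(node, list) else []

def toy_evaluate(node):
    # Leaves carry their own score; unexplored inner nodes score 0.
    return node if not isinstance(node, list) else 0

best_value = alphabeta(TOY_TREE, 2, -math.inf, math.inf, True,
                       toy_children, toy_evaluate)
```

On the toy tree the second branch is cut off after its first leaf, because the maximizing player already has a guaranteed 3 from the first branch.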
Introduction • Seems simple, but can become quite complex. • Chess masters spend careers learning how to “evaluate” moves • Purpose: can a computer learn a good evaluation function?
Background • Claude Shannon, 1950 • Brute force would take too long • Discusses evaluation function • 2-ply algorithm, but looks further into the future for moves that could lead to checkmate • Possibility of learning in distant future
Development • Python • Stage 1: Text based chess game • Two humans input their moves • Illegal moves not allowed
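One way the Stage 1 move input could reject illegal moves is sketched below. The coordinate notation (`e2e4`), the function names, and the idea of passing in a precomputed set of legal moves are all assumptions for illustration; the slides do not say how moves were entered or validated.

```python
import re

# Assumed input format: source square followed by target square, e.g. "e2e4".
MOVE_PATTERN = re.compile(r"[a-h][1-8][a-h][1-8]")
FILES = "abcdefgh"

def parse_move(text):
    """Return ((from_file, from_rank), (to_file, to_rank)) as 0-based
    indices, or None if the text is not in the assumed format."""
    text = text.strip().lower()
    if not MOVE_PATTERN.fullmatch(text):
        return None
    return ((FILES.index(text[0]), int(text[1]) - 1),
            (FILES.index(text[2]), int(text[3]) - 1))

def accept_move(text, legal_moves):
    """Return the parsed move only if it is in legal_moves, else None.

    legal_moves is a hypothetical set produced by a move generator
    that is not shown here.
    """
    move = parse_move(text)
    if move is None or move not in legal_moves:
        return None
    return move
```

A game loop would keep prompting the current player until `accept_move` returns something other than `None`.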
Development • Stage 2: Introduce a computer player • 2-3 ply • Evaluation function starts out as a simple piece differential in which each piece is weighted equally
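A starting evaluation function of this kind might look like the sketch below. The board representation (eight strings, uppercase for the computer's pieces, lowercase for the opponent's, `.` for empty squares) is an assumption, as is leaving kings out of the count.

```python
# Pawn, knight, bishop, rook, queen. Kings are excluded by assumption,
# since both sides always have exactly one.
PIECE_TYPES = "pnbrq"

def piece_differential(board, weights=None):
    """Sum of weights of own pieces minus sum of weights of enemy pieces."""
    if weights is None:
        # Initial state: every piece type counts the same.
        weights = {piece: 1.0 for piece in PIECE_TYPES}
    score = 0.0
    for row in board:
        for square in row:
            kind = square.lower()
            if kind in weights:
                score += weights[kind] if square.isupper() else -weights[kind]
    return score

START = [
    "rnbqkbnr",
    "pppppppp",
    "........",
    "........",
    "........",
    "........",
    "PPPPPPPP",
    "RNBQKBNR",
]
```

With equal weights, losing a pawn and losing a queen change the score by the same amount, which is the weakness the learning stage is meant to correct.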
Development • Stage 3: Learning • Temporal Difference Learning • Weight adjustment: • $w \leftarrow w + a\,(P_t - P_{t-1})\,\frac{\partial P_{t-1}}{\partial w}$ • $a = 200/(199 + n)$ • $P = 1/(1 + e^{-h})$ • $h = w_1(j_1 - k_1) + \dots + w_5(j_5 - k_5)$
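The weight adjustment on this slide can be sketched in Python as below. The slides do not define the symbols, so this sketch assumes that j and k are the two sides' counts of each of five piece types, that n is a running counter of updates or games, and that the partial derivative is taken through the sigmoid, which gives P(1 - P)(j - k) for each weight. Function names are illustrative.

```python
import math

def advantage(weights, own_counts, opp_counts):
    """h = w1(j1 - k1) + ... + w5(j5 - k5)."""
    return sum(w * (j - k) for w, j, k in zip(weights, own_counts, opp_counts))

def win_probability(h):
    """P = 1 / (1 + e^-h)."""
    return 1.0 / (1.0 + math.exp(-h))

def learning_rate(n):
    """a = 200 / (199 + n); starts at 1 for n = 1 and decays slowly."""
    return 200.0 / (199.0 + n)

def td_update(weights, prev_position, curr_position, n):
    """One step of w <- w + a * (P_t - P_{t-1}) * dP_{t-1}/dw.

    Each position is an (own_counts, opp_counts) pair (assumed format).
    """
    prev_own, prev_opp = prev_position
    p_prev = win_probability(advantage(weights, prev_own, prev_opp))
    p_curr = win_probability(advantage(weights, *curr_position))
    a = learning_rate(n)
    # Derivative of the sigmoid with respect to w_i: P(1 - P)(j_i - k_i),
    # evaluated at the previous position.
    return [w + a * (p_curr - p_prev) * p_prev * (1.0 - p_prev) * (j - k)
            for w, j, k in zip(weights, prev_own, prev_opp)]

# Counts ordered as pawn, knight, bishop, rook, queen (assumed ordering).
weights = [1.0] * 5
before = ((8, 2, 2, 2, 1), (7, 2, 2, 2, 1))  # up one pawn
after = ((8, 2, 2, 2, 1), (7, 1, 2, 2, 1))   # then also up one knight
new_weights = td_update(weights, before, after, n=1)
```

In this example only the pawn weight moves, because the pawn count is the only feature that differed between the two sides in the earlier position; the predicted win probability rose afterwards, so that weight is nudged upward.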
Testing • Learning vs No Learning • Two equal, piece-differential players pitted against each other. • One will have the ability to learn • Multiple Games • Weight values and win-loss differential tracked over the length of the test
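A harness for this kind of test could be organized roughly as follows. The `play_game` and `update_weights` callbacks are hypothetical placeholders for the chess engine and the learning step, and the result encoding (+1 win, 0 draw, -1 loss for the learning player) is an assumption.

```python
def run_match(play_game, update_weights, weights, num_games):
    """Play num_games and record weights and win-loss differential.

    play_game(weights) -> +1, 0, or -1 from the learner's point of view
    update_weights(weights, game_number) -> new weights after that game
    Both callbacks are hypothetical and supplied by the caller.
    """
    history = []
    differential = 0
    for game_number in range(1, num_games + 1):
        differential += play_game(weights)
        weights = update_weights(weights, game_number)
        # Copy the weights so later updates cannot alter earlier records.
        history.append((game_number, list(weights), differential))
    return history
```

Plotting the recorded weights and differential against the game number would show whether the weights settle and whether the learner improves.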
Results • Weights changed • This affected performance • Equilibrium values reached • Program actually got worse at chess • Probably due to code error
References • Shannon, Claude. “Programming a Computer for Playing Chess.” 1950. • Beal, D.F., and Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999. • Moriarty, David E., and Miikkulainen, Risto. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.” • Huang, Shiu-li, and Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007. • Russell, Stuart, and Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition. 2003. • Asgharbeygi, Nima, Stracuzzi, David, and Langley, Pat. “Relational Temporal Difference Learning.”