The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Program • By James Mannion, Computer Systems Lab 08-09, Period 3
This project explores the implementation of AI and temporal difference learning in a chess program. It covers heuristic search, evaluation functions, minimax search, alpha-beta pruning, and the learning process. The study examines the complexity of chess evaluation and the potential for improvement through learning algorithms. Development proceeds in stages, from a text-based game to a computer player to the integration of temporal difference learning. Testing pits a learning player against a non-learning player in thousands of simulated games and tracks the win-loss differential. The references include works on temporal difference learning, neural networks, and AI in gaming.
Abstract • Searching through large sets of data • Complex, vast domains • Heuristic searches • Chess • Evaluation Function • Machine Learning
Introduction • Simple domains, simple heuristics • The domain of chess • Deep Blue – brute force • Looked at roughly 30^6 moves before making its first move • Supercomputer • Too many calculations • Not efficient
Introduction (cont’d) • Minimax search • Alpha-beta pruning • Only look 2-3 moves into the future • Estimate strength of position • Evaluation function • Can improve heuristic by learning
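For concreteness, here is a minimal sketch of a depth-limited minimax search with alpha-beta pruning; the `state` interface (legal_moves, apply, evaluate) is a hypothetical stand-in for the program's actual board representation.

```python
# Minimal sketch of depth-limited minimax with alpha-beta pruning.
# The `state` interface (legal_moves, apply, evaluate) is hypothetical.

def alphabeta(state, depth, alpha, beta, maximizing):
    """Return the minimax value of `state`, searching `depth` plies ahead."""
    if depth == 0 or not state.legal_moves():
        return state.evaluate()              # heuristic estimate at the horizon
    if maximizing:
        best = float("-inf")
        for move in state.legal_moves():
            best = max(best, alphabeta(state.apply(move), depth - 1,
                                       alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:                # prune: opponent avoids this branch
                break
        return best
    best = float("inf")
    for move in state.legal_moves():
        best = min(best, alphabeta(state.apply(move), depth - 1,
                                   alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:                    # prune symmetrically
            break
    return best
```

A 2-3 ply player of the kind described above would pick the root move whose child scores highest under this search, with alpha and beta initialized to -infinity and +infinity.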
Introduction (cont’d) • Seems simple, but can become quite complex. • Chess masters spend careers learning how to “evaluate” moves • Purpose: can a computer learn a good evaluation function?
Background • Claude Shannon, 1950 • Brute force would take too long • Discusses evaluation function • 2-ply algorithm, but looks further into the future for moves that could lead to checkmate • Possibility of learning in distant future
Development • Python • Stage 1: Text based chess game • Two humans input their moves • Illegal moves not allowed
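A sketch of what such a Stage 1 loop might look like; `parse_move` and the `board` methods (game_over, is_legal, apply) are hypothetical placeholders for the program's actual representation.

```python
# Hypothetical sketch of the Stage 1 text game: two humans alternate
# entering moves, and illegal moves are rejected and re-prompted.

def play_text_game(board):
    side = "white"
    while not board.game_over():
        move = parse_move(input(side + " to move: "))
        if move is None or not board.is_legal(move, side):
            print("Illegal move, try again.")  # same player moves again
            continue
        board.apply(move)
        side = "black" if side == "white" else "white"
```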
Development (cont’d) • Stage 2: Introduce a computer player • 2-3 ply search • The evaluation function will start out with choices based on a simple piece differential in which every piece is weighted equally
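A sketch of that starting evaluation, assuming (hypothetically) a board represented as a dict mapping squares to piece letters, uppercase for White and lowercase for Black:

```python
# Sketch of the Stage 2 starting evaluation: a piece differential in
# which every piece is weighted equally. The board encoding (a dict of
# square -> piece letter, uppercase White / lowercase Black) is an
# assumption made for illustration.

def piece_differential(board):
    """Return (number of White pieces) - (number of Black pieces)."""
    white = sum(1 for piece in board.values() if piece.isupper())
    black = sum(1 for piece in board.values() if piece.islower())
    return white - black
```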
Development (cont’d) • Stage 3: Learning • Temporal Difference Learning • Weight adjustment: w_i ← w_i + a·((n_i^c − n_i^p)/n_i^c) • Heuristic function: h = c_1·p_1 + c_2·p_2 + c_3·p_3 + c_4·p_4 + c_5·p_5 • Piece values: p_i = Σ w_i − Σ b_i, summed over the pieces of type i
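Read literally, the update nudges each weight by the relative change in its feature from the previous position to the current one. The sketch below follows that reading; treating n_i^c and n_i^p as the current and previous values of feature i, and `a` as a small learning rate, are assumptions.

```python
# Sketch of the Stage 3 formulas. Interpreting n_i^c and n_i^p as the
# current and previous values of feature i is an assumption based on
# the slide's subscripts; `a` is the learning rate.

def td_update(weights, n_current, n_previous, a=0.1):
    """Apply w_i <- w_i + a * (n_i^c - n_i^p) / n_i^c to each weight."""
    for i, (n_c, n_p) in enumerate(zip(n_current, n_previous)):
        if n_c:                              # guard the division by n_i^c
            weights[i] += a * (n_c - n_p) / n_c
    return weights

def heuristic(c, p):
    """h = c_1*p_1 + ... + c_5*p_5 over the five piece differentials."""
    return sum(ci * pi for ci, pi in zip(c, p))
```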
Testing • Learning vs No Learning • Two equal, piece-differential players pitted against each other. • One will have the ability to learn • Thousands of games • Win-loss differential tracked over the length of the test • By the end, the learner should be winning significantly more games.
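A sketch of such a test harness; `play_game` and the player objects are hypothetical, and `play_game` is assumed to return the winning player (or None on a draw).

```python
# Hypothetical test harness: the learning player faces a fixed
# piece-differential player over thousands of games while the running
# win-loss differential is recorded.

def run_experiment(learner, baseline, n_games=5000):
    differential, history = 0, []
    for _ in range(n_games):
        winner = play_game(learner, baseline)
        if winner is learner:
            differential += 1
        elif winner is baseline:
            differential -= 1                # draws leave the tally unchanged
        history.append(differential)         # trend over the length of the test
    return history
```

If learning helps, the recorded differential should trend upward over the course of the run.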
References • Shannon, Claude. “Programming a Computer for Playing Chess.” 1950. • Beal, D. F., and M. C. Smith. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999. • Moriarty, David E., and Risto Miikkulainen. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.” • Huang, Shiu-li, and Fu-ren Lin. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007. • Russell, Stuart, and Peter Norvig. Artificial Intelligence: A Modern Approach. 2nd ed. 2003. • Asgharbeygi, Nima, David Stracuzzi, and Pat Langley. “Relational Temporal Difference Learning.”