
By James Mannion Computer Systems Lab 08-09 Period 3

This project explores implementing AI & Temporal Difference Learning in a chess program. It covers heuristic searches, evaluation functions, Minimax search, Alpha-beta pruning, and learning processes. The study delves into the complexity of chess evaluation and the potential for improvement through learning algorithms. Development stages progress from a text-based game to introducing a computer player and integrating learning techniques like Temporal Difference Learning. Testing involves comparing learning vs. non-learning players in game simulations to assess their win-loss differentials. The references include works on temporal difference learning, neural networks, and AI in gaming.


Presentation Transcript


  1. The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Programme • By James Mannion • Computer Systems Lab 08-09 Period 3

  2. Abstract • Searching through large sets of data • Complex, vast domains • Heuristic searches • Chess • Evaluation Function • Machine Learning

  3. Introduction • Simple domains, simple heuristics • The domain of chess • Deep Blue – brute force • Looking at 30^6 moves before making the first • Supercomputer • Too many calculations • Not efficient

  4. Introduction (cont’d) • Minimax search • Alpha-beta pruning • Only look 2-3 moves into the future • Estimate strength of position • Evaluation function • Can improve heuristic by learning
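
The search described on this slide can be sketched as a depth-limited minimax with alpha-beta pruning. This is a minimal illustration, not the project's actual code: the function name `alphabeta`, the `children` and `evaluate` callbacks, and the toy tree are all assumptions introduced here so the example runs without a chess engine.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    `children(node)` returns the successor positions and `evaluate(node)`
    returns a static estimate of the position. Both are hypothetical hooks.
    """
    kids = children(node)
    if depth == 0 or not kids:
        # Search horizon or terminal position: fall back to the estimate.
        return evaluate(node)
    if maximizing:
        best = -math.inf
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return best
    best = math.inf
    for child in kids:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return best


# Toy 2-ply tree (made up): inner lists are positions, ints are leaf scores.
toy_tree = [[3, 5], [2, 9], [0, 7]]


def toy_children(node):
    return node if isinstance(node, list) else []


def toy_evaluate(node):
    return node  # only reached on integer leaves in this toy tree


result = alphabeta(toy_tree, 2, -math.inf, math.inf, True,
                   toy_children, toy_evaluate)
print(result)
```

In a chess setting, `children` would generate the legal moves from a position and `evaluate` would be the evaluation function that the learning stage tries to improve.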

  5. Introduction (cont’d) • Seems simple, but can become quite complex. • Chess masters spend careers learning how to “evaluate” moves • Purpose: can a computer learn a good evaluation function?

  6. Background • Claude Shannon, 1950 • Brute force would take too long • Discusses evaluation function • 2-ply algorithm, but looks further into the future for moves that could lead to checkmate • Possibility of learning in distant future

  7. Development • Python • Stage 1: Text based chess game • Two humans input their moves • Illegal moves not allowed

  8. Development (cont’d)

  9. Development (cont’d)

  10. Development (cont’d)

  11. Development (cont’d) • Stage 2: Introduce a computer player • 2-3 ply • Evaluation function will start out such that choices are based on a simple piece-differential where each piece is weighted equally
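
A simple piece-differential evaluation of the kind described here might look like the sketch below. The board representation is an assumption made for illustration: a string of piece letters in the style of FEN piece placement, with uppercase for White and lowercase for Black. The name `piece_differential` and the choice to leave the king out of the count are also assumptions, not details from the slides.

```python
def piece_differential(board, weights=None):
    """Return White's material minus Black's material.

    `board` is any iterable of characters; uppercase letters are White
    pieces, lowercase letters are Black pieces (assumed representation).
    By default every piece type has the same weight of 1.0.
    """
    if weights is None:
        # Pawn, knight, bishop, rook, queen, all weighted equally.
        weights = {kind: 1.0 for kind in "pnbrq"}
    score = 0.0
    for square in board:
        kind = square.lower()
        if kind in weights:
            score += weights[kind] if square.isupper() else -weights[kind]
    return score


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
no_black_queen = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(piece_differential(start))
print(piece_differential(no_black_queen))
```

Passing a different `weights` dictionary is how a learned set of piece values could later replace the equal starting weights.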

  12. Development (cont’d) • Stage 3: Learning • Temporal Difference Learning • Weight adjustment: • w_i ← w_i + a((n_ic − n_ip)/n_ic) • Heuristic function: • h = c_1(p_1) + c_2(p_2) + c_3(p_3) + c_4(p_4) + c_5(p_5) • Piece values: • p_i = Sum(w_i) − Sum(b_i) over i
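
The linear heuristic and the weight-adjustment rule on this slide can be written out directly. This is a literal transcription under stated assumptions, not the author's implementation: the slide does not define `n_ic` and `n_ip`, so they are passed through as given; `a` is read here as a step-size constant; the zero-denominator guard is an addition; and `p_i` is taken to be the count of White pieces of type `i` minus the count of Black pieces of type `i`.

```python
def heuristic(weights, white_counts, black_counts):
    """h = c_1*p_1 + ... + c_5*p_5, with p_i = white count - black count."""
    return sum(c * (w - b)
               for c, w, b in zip(weights, white_counts, black_counts))


def update_weight(w_i, n_ic, n_ip, a=0.1):
    """Apply w_i <- w_i + a * (n_ic - n_ip) / n_ic.

    n_ic and n_ip are the quantities named on the slide; their exact
    meaning is not defined there. The guard below is an added assumption.
    """
    if n_ic == 0:
        return w_i  # avoid dividing by zero; leave the weight unchanged
    return w_i + a * (n_ic - n_ip) / n_ic


# Made-up example: five piece types, all weights equal, Black is down a queen.
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
white = [8, 2, 2, 2, 1]
black = [8, 2, 2, 2, 0]
print(heuristic(weights, white, black))
print(update_weight(1.0, n_ic=4, n_ip=2, a=0.5))
```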

  13. Testing • Learning vs No Learning • Two equal, piece-differential players pitted against each other. • One will have the ability to learn • Thousands of games • Win-loss differential tracked over the length of the test • By the end, the learner should be winning significantly more games.
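
Tracking the win-loss differential over a long run of games could be done with a running sum, as sketched below. The result encoding (+1 when the learner wins, -1 when it loses, 0 for a draw) and the function name are assumptions, and the sample results are made up purely to show the shape of the output.

```python
from itertools import accumulate


def win_loss_differential(results):
    """Return the running win-loss differential after each game.

    `results` holds one entry per game from the learner's point of view:
    +1 for a win, -1 for a loss, 0 for a draw (assumed encoding).
    """
    return list(accumulate(results))


# Made-up results for five games, not data from the project.
sample_results = [1, -1, 0, 1, 1]
print(win_loss_differential(sample_results))
```

If the learner improves, the curve produced by this running sum should trend upward over the length of the test.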

  14. Data

  15. Data (cont'd)

  16. References • Shannon, Claude. “Programming a Computer for Playing Chess.” 1950. • Beal, D.F., Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999. • Moriarty, David E., Miikkulainen, Risto. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.” • Huang, Shiu-li, Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007. • Russell, Stuart, Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition. 2003. • Asgharbeygi, Nima, Stracuzzi, David, Langley, Pat. “Relational Temporal Difference Learning.”
