The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Progra

By James Mannion Computer Systems Lab 08-09 Period 3 The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Programme

Abstract • Searching through large sets of data • Complex, vast domains • Heuristic searches • Chess • Evaluation Function • Machine Learning

Introduction • Simple domains, simple heuristics • The domain of chess • Deep Blue – brute force • Looking at 30100 moves before making the first • Supercomputer • Too many calculations • Not efficient

Introduction (cont’d) • Minimax search • Alpha-beta pruning • Only look 2-3 moves into the future • Estimate strength of position • Evaluation function • Can improve heuristic by learning

Introduction (cont’d) • Seems simple, but can become quite complex. • Chess masters spend careers learning how to “evaluate” moves • Purpose: can a computer learn a good evaluation function?

Background • Claude Shannon, 1950 • Brute force would take too long • Discusses evaluation function • 2-ply algorithm, but looks further into the future for moves that could lead to checkmate • Possibility of learning in distant future

Background (cont’d) • D.F. Beal, M.C. Smith, 1999 • Temporal Difference learning • Program spends 20,000 games learning the evaluation function • Beats program that did not learn a function • Shows that programs can make their evaluation functions better

Background (cont’d) • David E. Moriarty, Riso Miikkulainen • Evolutionary Neural Networks • Othello • Complex Strategies developed

Background (cont’d) • Shiu-li Huang, Fu-ren Lin, 2007 • Temporal Difference Learning • Multi-Agent Bargaining • Bargaining, while not necessarily adversarial, is similar to chess and other games.

Development • Python • Stage 1: Text based chess game • Two humans input their moves • Illegal moves not allowed

Development (cont’d)

Development (cont’d) • Stage 2: Introduce a computer player • 2-3 ply • Evaluation function will start out such that choices are random

Development (cont’d) • Stage 3: Learning • Temporal Difference Learning • Adjusts the weights of the evaluation function slightly based on gameplay • Evaluation function updated each time a game is played

Testing • Learning vs No Learning • Two equal, random computer players pitted against each other. • One will have the ability to learn • Thousands of games • Win-loss differential tracked over the length of the test • By the end, the learner should be winning significantly more games.

References • Shannon, Claude. “Programming a Computer for Playing Chess.” 1950 • Beal, D.F., Smith, M.C. “Temporal Difference Learning for Heuristic Search and Game Playing.” 1999 • Moriarty, David E., Miikkulainen, Risto. “Discovering Complex Othello Strategies Through Evolutionary Neural Networks.” • Huang, Shiu-li, Lin, Fu-ren. “Using Temporal-Difference Learning for Multi-Agent Bargaining.” 2007 • Russell, Stuart, Norvig, Peter. Artificial Intelligence: A Modern Approach. Second Edition. 2003.

The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Progra

The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Progra

Presentation Transcript

Algorithms of Artificial Intelligence

The Implementation of Artificial Intelligence and Temporal Difference Learning Algorithms in a Computerized Chess Progra

Artificial Intelligence in Chess

Temporal Difference Learning

Artificial Intelligence Machine learning

Evolutionary Algorithms and Artificial Intelligence

Distributed Control Algorithms for Artificial Intelligence

Chapter 6: Temporal Difference Learning

Lecture 18: Temporal-Difference Learning

Temporal-Difference Learning Week #6

Machine learning Subset of Artificial Intelligence

Learning in Artificial Intelligence - Great Learning

The Implementation of Artificial Intelligence and Machine Learning in a Computerized Chess Program

Randomized Strategies and Temporal Difference Learning in Poker

Chapter 6: Temporal Difference Learning

Applications of Artificial Intelligence and Machine Learning in Othello

Artificial Intelligence and Machine Learning

Artificial Intelligence and Deep Learning in Use

Artificial Intelligence and Machine Learning Services

Impact of Artificial Intelligence in E-Learning System

Machine Learning and Artificial Intelligence Solutions