Evolution and Coevolution of Artificial Neural Networks playing Go. Thesis by Peter Maier, Salzburg, April 2004. Additional paper used: Computer Go, by Martin Müller. Presented by Dima Stopel, bravo102@gmail.com.
Overview • Go: History and Rules • Role of the computer in Go • Brief introduction to ANN • Experimental Setup • Training of Go playing ANNs • Evolution of Go playing ANNs
History of Go • Go is an ancient Chinese board game that is believed to be 2,000 to 4,000 years old. • Go is played around the world, and has several names. The Chinese call it Wei-chi. In Korea it’s Baduk. The Japanese word is Igo, or just Go.
Go Basics: Stones and Board • Boards: standard 19x19; beginners 9x9 and 13x13 • Stones: 180 white, 181 black
Go Basics: Gameplay and Winning Condition • Play starts on an empty board. • Players put their stones at the intersections of the lines on the board. • Players can pass at any time. • Consecutive passes end the game. • The goal of the game is to control a larger area than the opponent and take more prisoners.
Go Basics: Three Rules of Go • Rule 1: Stones of one color that have been completely surrounded by the opponent, so that they have no liberties (empty adjacent intersections) left, are removed from the board as prisoners.
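Rule 1 can be sketched in code as a flood fill that collects a group of connected stones and its liberties. This is a minimal illustration, not the referee used in the thesis; the board encoding (`.`, `X`, `O`) and the function names are assumptions made for the example.

```python
from typing import List, Set, Tuple

# Assumed encoding for this sketch only: "." empty, "X" black, "O" white.
EMPTY, BLACK, WHITE = ".", "X", "O"
Board = List[List[str]]
Point = Tuple[int, int]


def group_and_liberties(board: Board, start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start`; return (stones, liberties)."""
    size = len(board)
    color = board[start[0]][start[1]]
    stones: Set[Point] = {start}
    liberties: Set[Point] = set()
    stack = [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue  # off the board
            if board[nr][nc] == EMPTY:
                liberties.add((nr, nc))
            elif board[nr][nc] == color and (nr, nc) not in stones:
                stones.add((nr, nc))
                stack.append((nr, nc))
    return stones, liberties


def remove_captured(board: Board, color: str) -> int:
    """Remove every group of `color` without liberties; return the prisoner count."""
    size = len(board)
    seen: Set[Point] = set()
    captured: Set[Point] = set()
    for r in range(size):
        for c in range(size):
            if board[r][c] == color and (r, c) not in seen:
                stones, libs = group_and_liberties(board, (r, c))
                seen |= stones
                if not libs:
                    captured |= stones
    for r, c in captured:
        board[r][c] = EMPTY
    return len(captured)
```

Calling `remove_captured(board, WHITE)` after a black move takes the surrounded white stones off the board and returns how many prisoners were taken.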
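Go Basics: Three Rules of Go • Rule 2: No suicide moves are allowed (a stone may not be placed where its group would have no liberties, unless the move captures opposing stones).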
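Go Basics: Three Rules of Go • Rule 3 – The “Ko” rule: No infinity. A move may not recreate the board position that existed just before the opponent's last move, so the same capture cannot be repeated back and forth forever.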
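A simple way to check the Ko rule is to keep a history of board positions and compare the position a move would produce with the one that preceded the opponent's last move. The sketch below assumes positions are stored as tuples of row strings; `violates_ko` is a hypothetical helper name, not something taken from JaGo or the thesis.

```python
from typing import List, Tuple

Position = Tuple[str, ...]  # assumed: one string per board row


def violates_ko(history: List[Position], candidate: Position) -> bool:
    """Return True if `candidate` repeats the position before the opponent's last move.

    `history[-1]` is the position the player to move is looking at now;
    `history[-2]` is the position just before the opponent's last move.
    """
    return len(history) >= 2 and candidate == history[-2]
```

For example, if `history` is `[a, b]` and a move would turn `b` back into `a`, the move is rejected.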
State of computer in Go: The challenge • It is easy to write a Go program that can play a complete game. However, it is hard to write a program that plays well.
State of computer in Go: Overview picture • 1980: The Ing Foundation issues a million dollar prize for a professional level Go program.
State of computer in Go: The challenge • Computer Go poses many formidable conceptual, technical and software engineering challenges. • Most programs have required 5–15 person-years of effort. • They contain 50–100 modules dealing with different aspects of the game. • Despite all these efforts, the current level of programs is still modest.
State of computer in Go: The challenge • The search space for 19 × 19 Go is very large compared to other popular board games. • The number of distinct board positions is 3^361 ≈ 10^172 (each of the 361 intersections is empty, black or white), and about 1.2% of these are legal. • In chess about 20 potential moves are available from a board position, but on a standard 19x19 Go board there are about 200–300 potential moves.
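The size of the search space can be checked with a few lines of arithmetic. The sketch counts board colorings (three states per intersection) and applies the 1.2% legal fraction quoted on the slide; the variable names are only illustrative.

```python
# Every one of the 19 * 19 = 361 intersections is empty, black or white.
positions = 3 ** (19 * 19)

# Number of decimal digits of the position count.
digits = len(str(positions))

# Rough count of legal positions, using the 1.2% figure from the slide.
legal_estimate = positions * 12 // 1000

print(digits)
print(len(str(legal_estimate)))
```

Python integers have arbitrary precision, so the count is exact rather than a floating-point approximation.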
Brief intro to ANN: History and motivation • 1943: Warren McCulloch, Walter Pitts, A Logical Calculus of the Ideas Immanent in Nervous Activity • 1958: Frank Rosenblatt, Perceptron • 1969: M. Minsky, S. Papert, Perceptrons • 1974: Paul Werbos, Back-propagation • 1986: D. E. Rumelhart, Learning internal representations by error propagation • 1991: K. Hornik, Approximation capabilities of multilayer feedforward networks
Brief intro to ANN: General facts • Data-driven model • One of the most flexible mathematical models for data mining • Very powerful for solving very complex nonlinear problems
Brief intro to ANN: Weights and Biases – Artificial Neuron • Nonlinear power!
Brief intro to ANN: Example – XOR problem • Input points: (0,0), (0,1), (1,0), (1,1)
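The XOR problem cannot be solved by a single neuron, but a network with one hidden layer can do it. The sketch below uses hand-picked weights and a step activation to make this concrete; the specific weights and the function names are assumptions chosen for the example, not values from the slides.

```python
from typing import Callable, Sequence


def step(z: float) -> int:
    """Threshold activation: fires when the weighted sum is positive."""
    return 1 if z > 0 else 0


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float,
           activation: Callable[[float], int] = step) -> int:
    """Artificial neuron: activation(sum_i w_i * x_i + bias)."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(x1: int, x2: int) -> int:
    """Two hidden neurons (OR and AND) feeding one output neuron."""
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)    # fires if at least one input is 1
    h_and = neuron([x1, x2], [1.0, 1.0], -1.5)   # fires only if both inputs are 1
    return neuron([h_or, h_and], [1.0, -1.0], -0.5)  # OR and not AND


for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(point, xor_network(*point))
```

The hidden layer remaps the four input points so that the output neuron can separate them with a single threshold.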
Brief intro to ANN: Activation Functions • Logistic Sigmoid • Hyperbolic Tangent • Sign Function
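The three activation functions named on this slide can be written down directly. This is a small sketch using only the standard library; the numerically stable form of the logistic function is a common implementation choice, not something stated in the slides.

```python
import math


def logistic(z: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-z)), with outputs between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for very negative z
    return e / (1.0 + e)


def hyperbolic_tangent(z: float) -> float:
    """Hyperbolic tangent, with outputs between -1 and 1."""
    return math.tanh(z)


def sign(z: float) -> int:
    """Sign function: -1, 0 or 1."""
    return (z > 0) - (z < 0)
```

The sigmoid and the hyperbolic tangent are smooth, which matters for gradient-based training; the sign function is not differentiable at zero.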
Brief intro to ANN: Multilayer perceptron (figure labels: Inputs, Weights, Neurons, Hidden Layer, Output Layer)
Brief intro to ANN: Multilayer perceptron • Most ANN architectures are: feed forward; layered (two layers) • In the past up to three layers were used. • There are many other types: recurrent networks; feed forward but not layered
Brief intro to ANN: Learning Procedure • The process of determining the values for W on the basis of the data is called learning or training. • We want to make the network output as close as possible to the target output. We can achieve this by minimizing the error function by changing W (the weights). • Error function example: sum of squares
Brief intro to ANN: Example – Gradient Descent • Error function derivative calculation • Next weight calculation
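The two steps named on the slide (derivative of the error function, then the next weight) can be illustrated on the smallest possible model. The sketch below trains one linear neuron y = w·x + b with a sum-of-squares error E = ½ Σ (y − t)²; the model, the ½ factor, the learning rate and the function name are assumptions made for the example.

```python
from typing import Sequence, Tuple


def train_linear_neuron(xs: Sequence[float], ts: Sequence[float],
                        learning_rate: float = 0.05,
                        epochs: int = 1000) -> Tuple[float, float]:
    """Fit y = w * x + b by gradient descent on E = 0.5 * sum((y - t)^2)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        # Error function derivative calculation:
        # dE/dw = sum((y - t) * x), dE/db = sum(y - t)
        errors = [(w * x + b) - t for x, t in zip(xs, ts)]
        grad_w = sum(e * x for e, x in zip(errors, xs))
        grad_b = sum(errors)
        # Next weight calculation: step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train_linear_neuron([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3))
```

The targets were generated from t = 2x + 1, so the learned weight and bias should approach 2 and 1. A learning rate that is too large would make the updates diverge instead.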
Evolution approach: Together with ANN • Parts that can be evolved: connection weights; network topology; number of hidden neurons; activation functions
Experimental Setup: The Referee and Game Ending • Referee: JaGo, a Go playing Java program written by Fuming Wang, slightly improved • Game ending conditions: each player passes; one player has placed all of his stones; there are no free intersections left; Fools Draw
Experimental Setup: Computer Go players • Random Player: only knows the rules • Naïve Player: knows the rules and some basic strategies • JaGo Player: the best computer player that was used; estimated rank ~16 kyu
Experimental Setup: ANN Player • Creativity factor • Strength (wins / games): 2000 games against each player, 6000 games in total
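The strength measure is simply the fraction of games won over all evaluation games. The sketch below shows that calculation; the `play_game` callback, which stands in for a full game between two players, is a hypothetical placeholder and not part of JaGo.

```python
from typing import Callable, Sequence, TypeVar

P = TypeVar("P")


def strength(player: P, opponents: Sequence[P],
             play_game: Callable[[P, P], bool],
             games_per_opponent: int = 2000) -> float:
    """Return wins / games over `games_per_opponent` games against each opponent.

    `play_game(player, opponent)` is assumed to return True when `player` wins.
    """
    wins = 0
    games = 0
    for opponent in opponents:
        for _ in range(games_per_opponent):
            if play_game(player, opponent):
                wins += 1
            games += 1
    return wins / games if games else 0.0
```

With three opponents and the default of 2000 games each, the function plays 6000 games, matching the totals on the slide.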
Experimental Setup: Go Board Representation • Standard input representation: two inputs for each intersection • Naïve input representation: one input for each intersection • Limited view input representation: windows of size w
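The difference between the standard and the naïve input representation can be sketched as two encoders. The slide only gives the number of inputs per intersection, so the concrete values used here (an own-stone flag and an opponent-stone flag, or +1 / −1 / 0) are assumptions for illustration, as are the function names and the board encoding.

```python
from typing import List, Sequence


def encode_standard(board: Sequence[str], me: str) -> List[float]:
    """Two inputs per intersection (assumed: [own stone?, opponent stone?])."""
    inputs: List[float] = []
    for row in board:
        for cell in row:
            inputs.append(1.0 if cell == me else 0.0)
            inputs.append(1.0 if cell not in (".", me) else 0.0)
    return inputs


def encode_naive(board: Sequence[str], me: str) -> List[float]:
    """One input per intersection (assumed: +1 own, -1 opponent, 0 empty)."""
    inputs: List[float] = []
    for row in board:
        for cell in row:
            if cell == ".":
                inputs.append(0.0)
            else:
                inputs.append(1.0 if cell == me else -1.0)
    return inputs
```

For a 9x9 board the standard encoder yields 162 inputs and the naïve encoder 81, which directly affects the number of weights the network has to learn.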
Experimental Setup: Output Representation • Standard output representation: one output for each intersection • Row-column output representation: one output for each row and each column
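One plausible way to turn network outputs into a move is sketched below for both output representations. How the thesis actually selects the move is not stated on the slide, so picking the largest output (and, for the standard form, restricting to legal points) is an assumption, as are the function names.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def decode_standard(outputs: Sequence[float], size: int,
                    legal: Sequence[Point]) -> Point:
    """One output per intersection: choose the legal point with the largest output."""
    return max(legal, key=lambda p: outputs[p[0] * size + p[1]])


def decode_row_column(outputs: Sequence[float], size: int) -> Point:
    """First `size` outputs score rows, the next `size` score columns."""
    rows: List[float] = list(outputs[:size])
    cols: List[float] = list(outputs[size:2 * size])
    best_row = max(range(size), key=lambda r: rows[r])
    best_col = max(range(size), key=lambda c: cols[c])
    return best_row, best_col
```

The row-column form needs only 2·n outputs instead of n², at the price of not being able to score each intersection independently.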
Experimental Setup: Techniques used • Simple Training • Simple Evolution: using Random, Naïve and JaGo players • Simple Coevolution: competing against each other • Cultural Coevolution: competing against a culture • Hall of Fame (HoF) Coevolution: competing against the HoF
Experimental Setup: ANN Encoding • Feed Forward ANN: three chromosomes were used (Binary: Connections Encoding; Binary: Hidden Neurons Encoding; Real: Weights and Biases Encoding) • Recurrent ANN: two chromosomes were used, no hidden layer (Binary: Connections Encoding; Real: Weights and Biases Encoding) • Generally, two-point crossover was used • The strength function was always used as fitness
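Two-point crossover swaps the segment between two cut points of the parent chromosomes. The sketch below works on plain Python lists and applies equally to binary and real-valued chromosomes; the function name and the explicit random generator argument are choices made for this example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

G = TypeVar("G")


def two_point_crossover(parent_a: Sequence[G], parent_b: Sequence[G],
                        rng: random.Random) -> Tuple[List[G], List[G]]:
    """Exchange the genes between two distinct, randomly chosen cut points."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 1:
        raise ValueError("parents must be non-empty and of equal length")
    # Two different cut positions in 0..len, sorted so that i < j.
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    child_a = list(parent_a[:i]) + list(parent_b[i:j]) + list(parent_a[j:])
    child_b = list(parent_b[:i]) + list(parent_a[i:j]) + list(parent_b[j:])
    return child_a, child_b


rng = random.Random(0)
print(two_point_crossover([0] * 8, [1] * 8, rng))
```

Because the two cut points are distinct, each child always receives at least one gene from the other parent.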
Training of Go playing ANNs • Feed forward, fully connected ANNs were used • Each training experiment was repeated 20 times • The training set consisted of Go games played by JaGo against itself • The sigmoid function was used • Number of connections: • For strength evaluation each network played 2000 games against each of the three players • Learning algorithm used: resilient back-propagation
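Resilient back-propagation (Rprop) adapts a separate step size for every weight and uses only the sign of the gradient. The sketch below implements the variant without weight backtracking on a generic gradient function; the parameter defaults (1.2, 0.5 and the step bounds) are commonly used values and are assumptions here, since the slides do not list the settings used in the thesis.

```python
from typing import Callable, List, Sequence


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def rprop_minimize(grad: Callable[[Sequence[float]], Sequence[float]],
                   weights: Sequence[float], iterations: int = 200,
                   step_init: float = 0.1, eta_plus: float = 1.2,
                   eta_minus: float = 0.5, step_min: float = 1e-6,
                   step_max: float = 50.0) -> List[float]:
    """Minimize a function given its gradient, using per-weight adaptive steps."""
    w = list(weights)
    steps = [step_init] * len(w)
    previous = [0.0] * len(w)
    for _ in range(iterations):
        g = grad(w)
        for i in range(len(w)):
            if g[i] * previous[i] > 0:
                # Same direction as last time: accelerate.
                steps[i] = min(steps[i] * eta_plus, step_max)
            elif g[i] * previous[i] < 0:
                # Sign flipped, a minimum was jumped over: slow down.
                steps[i] = max(steps[i] * eta_minus, step_min)
            # Only the sign of the gradient is used, not its magnitude.
            w[i] -= sign(g[i]) * steps[i]
            previous[i] = g[i]
    return w


# Toy error surface E(w) = (w0 - 3)^2 + (w1 + 1)^2 with its analytic gradient.
print(rprop_minimize(lambda w: [2 * (w[0] - 3), 2 * (w[1] + 1)], [0.0, 0.0]))
```

In a real training run the gradient would come from back-propagation over the whole training set; the toy quadratic only demonstrates the update rule.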
Evolution of Go playing ANNs: Initialization • Binary chromosomes: each bit is set randomly with probability p. • Real chromosomes: set to small random values between -0.1 and 0.1. • A maximum of 20 hidden neurons was used. • 20 experiments for each run • End conditions: fitness (strength) = 1 or 3000 generations reached.
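The initialization and the end conditions can be sketched in a few lines. The function names and the use of a uniform distribution for the real-valued genes are assumptions; the slide only states the value range, the probability p and the two stopping criteria.

```python
import random
from typing import List


def init_binary(length: int, p: float, rng: random.Random) -> List[int]:
    """Binary chromosome: every bit is 1 with probability p, otherwise 0."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def init_real(length: int, rng: random.Random,
              low: float = -0.1, high: float = 0.1) -> List[float]:
    """Real chromosome: small random values (assumed uniform) in [low, high]."""
    return [rng.uniform(low, high) for _ in range(length)]


def should_stop(best_fitness: float, generation: int,
                max_generations: int = 3000) -> bool:
    """Stop when the strength reaches 1 or the generation limit is hit."""
    return best_fitness >= 1.0 or generation >= max_generations
```

Small initial weights keep the sigmoid units away from saturation at the start of a run.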
Evolution of Go playing ANNs: Results for Random • Interesting observation: the output values of the ANNs were not influenced by the inputs. The networks just tried to occupy important places (mainly in the middle of the board).
Evolution of Go playing ANNs: Results for Naïve • Important observation: an ANN is able to learn basic Go by means of evolution. • After more than 200 generations, all ANNs could win at a rate of 50%.
Evolution of Go playing ANNs: Results for JaGo • Important observation: 1000 generations were needed for a mean fitness of 0.4. • The standard deviation is twice as high as for Naïve.
Evolution of Go playing ANNs: Results for JaGo – Recurrent ANNs • Important observation: within 1000 generations an individual evolved with a fitness of 1!
Evolution of Go playing ANNs: Cultural Coevolution (figure labels: Population, Culture)
Evolution of Go playing ANNs: Results for Cultural Coevolution • “The analysis of these culture ANNs showed that the ANNs—when playing the black stones—take advantage of a weakness of JaGo.”
Evolution of Go playing ANNs: Results for HoF • Poor compared to Cultural Coevolution
The End • Any questions?
The End Thank you ;)