Applications of GP in Computer Gaming Chris Stinson COSC 4V82, Brock University
Contents • Introduction • Robocode • Experiment Details • Results • Conclusions
Introduction • Focus in computer games shifting from sound and graphics to sophisticated, human-level Artificial Intelligence [2] • Machine Learning Techniques successfully implemented commercially, but GP only in academic applications
Varieties of Game AI • [Ponsen, 2004] • Offline Learning: AI improved during development (prior to release) • EAs used sporadically for commercial games • Online Learning: AI adapts during gameplay (after game released) • Dynamic scripting often used • No known implementation of EAs: too slow, inconsistent results • Indirect Adaptation: pre-determined actions in response to game statistics • Ex. Dynamic difficulty settings in Max Payne 2 • Direct Adaptation: behaviour adapts to current performance • More difficult to implement than Indirect, but offers a more dynamic experience • Ex. Creatures trained in Black & White with a reinforcement algorithm
Genetic Programming? • Machine learning has been used, sure, but what about GP? • Human-competitive results from GP-generated programs • Both simulated human players and opponent AI • Ex. controllers for Tetris, Pac-Man, Super Mario, and various racing games, as well as bots for Quake 2 and Unreal Tournament [3]
Robocode • Simulation-based game in which robotic tanks fight to destruction • No direct human interaction; program vs. program • Regular contests between submitted tank AI
The Experiment • Use of GP to generate Robocode AI • Designed for 1v1 battles (as opposed to free-for-alls) • Code size limit: 4 semicolons • Objective: human-competitive machine intelligence • Criterion: the result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs) • Previous experiments: • Similar experiment by Eisenstein, but not generalized • Some ANN-based attempts, but with limited success
GP Language • Implemented using ECJ • Terminals: individual / enemy info, constants, and "random" • Functions: standard set of arithmetic and logical functions • Boolean: IfGreater and IfPositive • Fire command implemented as a single-argument function
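A minimal sketch of how a tree built from such a language might be evaluated. The slides name only IfGreater, IfPositive and Fire, so everything else here is an assumption made for illustration: the argument order and semantics of the conditionals, the rule that Fire only shoots when its argument is positive, the arithmetic function names, and the `enemy_distance` terminal. The original work used ECJ (Java); Python is used here purely for brevity.

```python
import operator

# Hypothetical arithmetic primitives; the slides only say "standard set".
ARITHMETIC = {
    "Add": operator.add,
    "Sub": operator.sub,
    "Mul": operator.mul,
}


def evaluate(node, env, actions):
    """Evaluate a GP tree given as nested tuples.

    node    -- a number (constant), a str (terminal looked up in env),
               or a tuple (function_name, child, child, ...)
    env     -- dict of terminal values, e.g. sensor readings
    actions -- list that collects side effects such as fire commands
    """
    if isinstance(node, (int, float)):
        return float(node)
    if isinstance(node, str):
        return float(env[node])

    name, *children = node

    # Conditionals are lazy: only the chosen branch is evaluated, so a
    # Fire node in the other branch has no side effect.
    if name == "IfGreater":  # assumed form: IfGreater(x, y, then, else)
        x = evaluate(children[0], env, actions)
        y = evaluate(children[1], env, actions)
        branch = children[2] if x > y else children[3]
        return evaluate(branch, env, actions)
    if name == "IfPositive":  # assumed form: IfPositive(x, then, else)
        x = evaluate(children[0], env, actions)
        branch = children[1] if x > 0 else children[2]
        return evaluate(branch, env, actions)

    if name == "Fire":  # assumed: fires only for positive power
        power = evaluate(children[0], env, actions)
        if power > 0:
            actions.append(("fire", power))
            return 1.0
        return 0.0

    args = [evaluate(child, env, actions) for child in children]
    return ARITHMETIC[name](*args)


# Example tree: fire weakly at long range, harder when the enemy is close.
example_tree = (
    "IfGreater", "enemy_distance", 200.0,
    ("Fire", 1.0),
    ("Fire", ("Sub", 3.0, ("Mul", "enemy_distance", 0.01))),
)
```

Treating Fire as a function that returns a number lets it sit anywhere inside an arithmetic expression, which keeps the language untyped and every subtree interchangeable under crossover.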
Fitness Evaluation • Three unique, top-ranked adversaries used • Static selections throughout experiment • Three rounds against each, per generation (since the game is non-deterministic) • Tournament scoring used for fitness: $F = \frac{S_P}{S_P + S_A}$ • Where $S_P$ is the player's score, and $S_A$ is the adversary's score • Modified fitness used for early rounds: $F = \frac{\epsilon + S_P}{\epsilon + S_P + S_A}$ • Where $\epsilon$ is a small, fixed constant • Improves variance, where many early player scores are zero • Fitness averaged over all rounds and all opponents
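The scoring scheme above can be sketched as follows. This assumes the relative score takes the form F = S_P / (S_P + S_A), and that the modified version adds a small constant epsilon to both numerator and denominator. The epsilon value, the function names, and the layout of the `results` dictionary are all hypothetical.

```python
EPSILON = 1e-3  # hypothetical value; the slides only say "small, fixed constant"


def relative_score(player, adversary, epsilon=0.0):
    """Player's share of the total score, optionally smoothed by epsilon."""
    total = epsilon + player + adversary
    if total == 0:
        # Both scores zero and no smoothing: define the share as 0.
        return 0.0
    return (epsilon + player) / total


def fitness(results, epsilon=0.0):
    """Average the relative score over rounds, then over adversaries.

    results -- dict mapping adversary name to a list of
               (player_score, adversary_score) tuples, one per round
    """
    per_adversary = []
    for rounds in results.values():
        scores = [relative_score(p, a, epsilon) for p, a in rounds]
        per_adversary.append(sum(scores) / len(scores))
    return sum(per_adversary) / len(per_adversary)


# Three adversaries, three rounds each, as in the experiment setup.
# The scores themselves are made-up illustration data.
example_results = {
    "adversary_a": [(30, 10), (10, 30), (20, 20)],
    "adversary_b": [(0, 50), (0, 40), (5, 45)],
    "adversary_c": [(60, 20), (40, 40), (20, 60)],
}
example_fitness = fitness(example_results, epsilon=EPSILON)
```

With this form a value of 0.5 means the player scored as much as its adversary, so anything above 0.5 against a top-ranked opponent is a win on points.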
GP Parameters • Population: 256 • Restricted by computational limitations • Generations: no set limit • Interesting methodology: manually stopped when fitness flatlined • Initial population: ramped half-and-half (depth 4-6) • Crossover: 0.95 • Mutation: 0.05 (Grow) • Selection: Tournament (k=5) • Elitism: 2 best individuals kept • Testing: 100 rounds against 12 different adversaries
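A minimal sketch of one generation step under the parameters above: 2 elites, tournament selection with k=5, crossover with probability 0.95 and mutation otherwise. ECJ handles this internally. The `crossover` and `mutate` callables are hypothetical placeholders, each breeding operation is simplified to yield a single child, and tournament contenders are drawn without replacement, which is an assumption.

```python
import random

POP_SIZE = 256
TOURNAMENT_K = 5
ELITES = 2
P_CROSSOVER = 0.95  # the remaining 0.05 goes to mutation


def tournament_select(population, fitnesses, rng, k=TOURNAMENT_K):
    """Pick k random individuals and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


def next_generation(population, fitnesses, crossover, mutate, rng):
    """Build the next population: copy the elites, then breed the rest."""
    ranked = sorted(range(len(population)),
                    key=lambda i: fitnesses[i], reverse=True)
    new_population = [population[i] for i in ranked[:ELITES]]

    while len(new_population) < len(population):
        if rng.random() < P_CROSSOVER:
            parent_a = tournament_select(population, fitnesses, rng)
            parent_b = tournament_select(population, fitnesses, rng)
            child = crossover(parent_a, parent_b, rng)
        else:
            child = mutate(tournament_select(population, fitnesses, rng), rng)
        new_population.append(child)
    return new_population


# Toy usage: individuals are integers and fitness is the value itself.
rng = random.Random(0)
toy_population = list(range(POP_SIZE))
toy_fitnesses = [float(x) for x in toy_population]
toy_next = next_generation(
    toy_population,
    toy_fitnesses,
    crossover=lambda a, b, r: max(a, b),  # placeholder operator
    mutate=lambda a, r: a,                # placeholder operator
    rng=rng,
)
```

With k=5 out of 256, selection pressure is moderate: the fittest individual wins every tournament it enters, but most tournaments never include it, so weaker trees still get to reproduce.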
Results • Entry competitive with hand-crafted programs [1] • Ranked 3rd out of 27 entries in HaikuBot league competition (2004) • Average fitness: implies it can hold its own • Best fitness: implies it can defeat most opponents • Keep in mind: fitness scores reflect battles against top opponents
Closing Remarks • Impressive results, but lots of potential improvements: • Improved computational resources • Best run took about 10 days to complete! • Authors estimate additional adversaries would greatly improve generality • Parallelization techniques • Linear GP? • Co-evolution? • Attempted (along with strong typing and ADFs) to little effect • Best early strategies dominate, leaving no diversity; two-phase process? • Far from real-time, but proof that GP can succeed in complex environments • What if an evolved solution could further adapt, specializing to counter the player?
References • [1] Shichel, Y., Ziserman, E., & Sipper, M. (2004). GP-Robocode: Using Genetic Programming to Evolve Robocode Players. Ben-Gurion University. • [2] Ponsen, M. (2004). Improving Adaptive Game AI With Evolutionary Learning. Delft University of Technology. • [3] Ebner, M., & Tiede, T. (). Evolving Driving Controllers using Genetic Programming. Symposium on Computational Intelligence and Games.
Questions?