
Constructing Intelligent Agents via Neuroevolution



Presentation Transcript


  1. Constructing Intelligent Agents via Neuroevolution By Jacob Schrum schrum2@cs.utexas.edu

  2. Motivation • Intelligent agents are needed • Search-and-rescue robots • Mars exploration • Training simulations • Video games • Insight into nature of intelligence • Sufficient conditions for emergence of: • Cooperation • Communication • Multimodal behavior

  3. Talk Outline • Bio-inspired learning methods • Neural networks • Evolutionary computation • My research • Learning multimodal behavior • Modular networks in Ms. Pac-Man • Human-like behavior in Unreal Tournament • Future work • Conclusion

  4. Artificial Neural Networks • Brain = network of neurons • ANN = abstraction of brain • Neurons organized into layers Inputs Outputs
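The layered organization on slide 4 can be illustrated with a small forward pass. This is only a sketch: the `forward` function, the `(weights, biases)` layer representation, and the tanh activation are assumptions for illustration and are not taken from the talk.

```python
import math


def forward(layers, inputs):
    """Propagate inputs through a list of fully connected layers.

    Each layer is a (weights, biases) pair: weights has one row per neuron,
    and each row holds one weight per incoming activation.
    The tanh activation is an assumption made for this sketch.
    """
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
example_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
example_output = forward(example_layers, [0.2, 0.4])
```

The list returned by `forward` has one value per neuron in the last layer, which corresponds to the "Outputs" side of the slide's diagram.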

  5. What Can Neural Networks Do? • In theory, anything! • Universal Approximation Theorem • Can’t program: too complicated • In practice, learning/training is hard • Supervised: Backpropagation • Unsupervised: Self-Organizing Maps • Reinforcement Learning: Temporal-Difference and Evolutionary Computation

  6. Evolutionary Computation • Computational abstraction of evolution • Descent with modification (mutation) • Sexual reproduction (crossover) • Survival of the fittest (natural selection) • Evolution + Neural Nets = Neuroevolution • Population of neural networks • Mutation and crossover modify networks • Net used as control policy to evaluate fitness

  7. Neuroevolution Example Start With Parent Population

  8. Neuroevolution Example Start With Parent Population Evaluate and Assign Fitness 100 90 50 31 75 56 61

  9. Neuroevolution Example Start With Parent Population Evaluate and Assign Fitness 100 90 50 31 75 56 61 Clone, Crossover and Mutate To Get Child Population

  10. Neuroevolution Example Start With Parent Population Evaluate and Assign Fitness 100 90 50 31 75 56 61 Clone, Crossover and Mutate Children Are Now the New Parents Repeat Process: Fitness Evaluations 100 120 83 50 69 60 99 As the process continues, each successive population improves performance
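The evaluate, clone, crossover, mutate, repeat cycle from the example slides can be sketched in plain Python. This is a minimal illustration, not the speaker's implementation: the fixed 2-2-1 topology, the XOR-based toy fitness, truncation selection, and every name (`forward`, `fitness`, `mutate`, `crossover`, `evolve`) are assumptions chosen to keep the example self-contained.

```python
import math
import random

# Hypothetical fixed topology: 2 inputs, 2 hidden tanh units, 1 output.
# (2 inputs + bias) * 2 hidden + (2 hidden + bias) * 1 output = 9 weights.
N_WEIGHTS = 9

# Toy task (an assumption): approximate XOR.
CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def forward(w, x):
    """Run the tiny network encoded by the flat weight list w on input x."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return math.tanh(w[6] * h0 + w[7] * h1 + w[8])


def fitness(w):
    """Negative squared error on the toy task; higher is better."""
    return -sum((forward(w, x) - y) ** 2 for x, y in CASES)


def mutate(w, rng, sigma=0.5):
    """Perturb every weight with Gaussian noise."""
    return [wi + rng.gauss(0, sigma) for wi in w]


def crossover(a, b, rng):
    """Uniform crossover: take each weight from either parent."""
    return [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]


def evolve(pop_size=20, generations=50, seed=0):
    """Return the best fitness seen in each generation."""
    rng = random.Random(seed)
    parents = [[rng.uniform(-1, 1) for _ in range(N_WEIGHTS)]
               for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Evaluate and rank the parent population.
        ranked = sorted(parents, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        elite = ranked[: pop_size // 2]
        # Clone the champion, then fill up with mutated crossover children.
        children = [ranked[0]]
        while len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        # Children are now the new parents.
        parents = children
    return best_per_generation
```

Because the champion is cloned unchanged into each child population, the best fitness returned by `evolve()` never decreases from one generation to the next, which mirrors the slide's point that successive populations improve.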

  11. Neuroevolution Applications Double Pole Balancing F. Gomez and R. Miikkulainen, “2-D Pole Balancing With Recurrent Evolutionary Networks” ICANN 1998

  12. Neuroevolution Applications Finless Rocket Control F. Gomez and R. Miikkulainen, “Active Guidance for a Finless Rocket Using Neuroevolution” GECCO 2003

  13. Neuroevolution Applications Vehicle Crash Warning System N. Kohl, K. Stanley, R. Miikkulainen, M. Samples, and R. Sherony, "Evolving a Real-World Vehicle Warning System" GECCO 2006

  14. Neuroevolution Applications http://nerogame.org/ Training Video Game Agents K. O. Stanley, B. D. Bryant, I. Karpov, R. Miikkulainen, "Real-Time Evolution of Neural Networks in the NERO Video Game" AAAI 2006

  15. What is Missing? • NERO agents are specialists • Sniping from a distance • Aggressively rushing in • Humans can do all of this, and more • Multimodal behavior • Different behaviors for different situations • Human-like behavior • Preferred by humans

  16. What I do With Neuroevolution • Discover complex agent behavior • Discover multimodal behavior Contributions: • Use multi-objective evolution • Different objectives for different modes • Evolve modular networks • Networks with modules for each mode • Human-like behavior • Constrain evolution

  17. Pareto-based Multiobjective Optimization • High health but did not deal much damage • Tradeoff between objectives • Dealt a lot of damage, but lost lots of health
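The tradeoff on slide 17 rests on Pareto dominance: one solution dominates another if it is at least as good in every objective and strictly better in at least one. A small sketch follows; the function names and the (damage dealt, health remaining) scores are made up for illustration, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if score vector a Pareto-dominates b (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (damage dealt, health remaining) scores for five agents.
scores = [(90, 10), (10, 90), (50, 50), (40, 40), (10, 10)]
front = pareto_front(scores)
```

Here `front` keeps the high-damage agent, the high-health agent, and the balanced agent at (50, 50), since none of them is beaten in both objectives at once.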

  18. Non-dominated Sorting Genetic Algorithm II • Population P with size N; Evaluate P • Use mutation (& crossover) to get P′ size N; Evaluate P′ • Calculate non-dominated fronts of P ∪ P′ size 2N • New population size N from highest fronts of P ∪ P′ K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, "A Fast Elitist Non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II" PPSN VI, 2000
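The selection step on slide 18 (sort the combined parent and child populations into non-dominated fronts, then keep the best N) might look like the sketch below. It is a simplification under stated assumptions: all objectives are maximized, the function names are hypothetical, and the last front that does not fit is truncated by index, whereas the published NSGA-II breaks such ties with a crowding-distance measure that the slide does not cover.

```python
def dominates(a, b):
    """True if score vector a Pareto-dominates b (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def nondominated_fronts(scores):
    """Split the indices of scores into fronts; front 0 is non-dominated."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing still remaining
        # dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def select_next_population(scores, n):
    """Pick n indices from the combined scores, best fronts first.

    Simplification: the final partial front is cut off by index order
    rather than by crowding distance.
    """
    chosen = []
    for front in nondominated_fronts(scores):
        chosen.extend(front[: n - len(chosen)])
        if len(chosen) == n:
            break
    return chosen


# Hypothetical combined scores for parents and children (2N = 6, N = 5 here
# only to show a second front being used).
combined = [(90, 10), (10, 90), (50, 50), (40, 40), (10, 10), (60, 20)]
fronts = nondominated_fronts(combined)
survivors = select_next_population(combined, 5)
```

This naive front computation is quadratic per front; it is meant to show the idea, not the faster bookkeeping described in the cited paper.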

  19. Ms. Pac-Man • Popular classic game • Predator-prey scenario • Ghosts are predators • Until power pill is eaten • Multimodal behavior needed • Running from threats • Chasing edible ghosts • More?

  20. Modular Networks • Different areas of brain specialize • Structural modularity → functional modularity • Apply to evolved neural networks • Separate module → behavioral mode • Preference neurons (grey) arbitrate between modules • Use module with highest preference output
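Arbitration by preference neurons, as described on slide 20, amounts to picking the module whose preference output is highest and using that module's policy outputs. The sketch below assumes each module has already been evaluated and reports a `(policy_outputs, preference)` pair; that representation and the name `choose_module` are illustrative assumptions.

```python
def choose_module(module_outputs):
    """Select the module with the highest preference output.

    module_outputs is a list of (policy_outputs, preference) pairs, one per
    module. Returns (index of the chosen module, its policy outputs).
    Ties go to the lowest-indexed module.
    """
    best = max(range(len(module_outputs)),
               key=lambda i: module_outputs[i][1])
    return best, module_outputs[best][0]


# Hypothetical outputs from a two-module network on one time step.
outputs = [([0.1, 0.9], 0.2), ([0.8, 0.3], 0.7)]
chosen_index, chosen_policy = choose_module(outputs)
```

In this example the second module has the higher preference (0.7 versus 0.2), so its policy outputs drive the agent on this time step.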

  21. Module Mutation • Let evolution decide how many modules • Networks start with one module • New modules added by one of several module mutations: Previous, Duplicate, Random
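The slide names three module mutations but does not define them, so the sketch below only guesses at two of them from their names: Duplicate is read as copying an existing module, and Random as adding a module with fresh random weights. Previous is left out because its behavior cannot be inferred from the transcript. The dictionary-based module representation and all function names are hypothetical.

```python
import copy
import random


def random_module(n_hidden, n_outputs, rng):
    """A module here is a dict of policy weights plus preference weights,
    each defined over n_hidden hidden units (a simplifying assumption)."""
    return {
        "policy": [[rng.uniform(-1, 1) for _ in range(n_hidden)]
                   for _ in range(n_outputs)],
        "preference": [rng.uniform(-1, 1) for _ in range(n_hidden)],
    }


def new_network(n_hidden, n_outputs, rng):
    """Networks start with a single module."""
    return [random_module(n_hidden, n_outputs, rng)]


def mutate_duplicate(network, rng):
    """Assumed reading of 'Duplicate': append a deep copy of one module."""
    network.append(copy.deepcopy(rng.choice(network)))


def mutate_random(network, n_hidden, n_outputs, rng):
    """Assumed reading of 'Random': append a module with random weights."""
    network.append(random_module(n_hidden, n_outputs, rng))


rng = random.Random(0)
net = new_network(n_hidden=3, n_outputs=2, rng=rng)
mutate_duplicate(net, rng)
mutate_random(net, 3, 2, rng)
```

Under this reading, a duplicated module behaves exactly like its source until later weight mutations make the two diverge, while a random module starts with unrelated behavior.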

  22. Intelligent Module Usage • Evolution discovers a novel task division • Not programmed • Dedicates one module to luring (cyan) • Improves ghost eating when using other module

  23. Comparison With Other Work [1] A.M. Alhejali, S.M. Lucas: Evolving diverse Ms. Pac-Man playing agents using genetic programming. UKCI 2010. [2] A.M. Alhejali, S.M. Lucas: Using a training camp with Genetic Programming to evolve Ms Pac-Man agents. CIG 2011. [3] M.F. Brandstetter, S. Ahmadi: Reactive control of Ms. Pac Man using information retrieval based on Genetic Programming. CIG 2012. [4] G. Recio, E. Martín, C. Estébanez, Y. Sáez: AntBot: Ant Colonies for Video Games. TCIAIG 2012. [5] A.M. Alhejali, S.M. Lucas: Using genetic programming to evolve heuristics for a Monte Carlo Tree Search Ms Pac-Man agent. CIG 2013.

  24. Types of Intelligence • Evolved intelligent Ms. Pac-Man behavior • Surprising module usage • Evolution discovers the unexpected • Diverse collection of solutions • Still not human-like • Human-like vs. optimal • Human intelligence

  25. Modern Game: Unreal Tournament • 3D world with simulated physics • Multiple human and software agents interacting • Agents attack, retreat, explore, etc. • Multimodal behavior required to succeed

  26. Human-like Behavior: BotPrize • International competition at CIG conference • A Turing Test for video game bots • Judged as human over 50% of the time to win • After 5 years, we won in 2012 • Evolved combat behavior • Constrained to be human-like

  27. Guessing Game • Coleman: ???? • Milford: ???? • Moises: ???? • Lawerence: ???? • Clifford: ???? • Kathe: ???? • Tristan: ???? • Jackie: ????

  28. Judging Game

  29. Player Identities • Coleman: UT^2 (Our winning bot) • Milford: ICE-2010 (bot) • Moises: Discordia (bot) • Lawerence: Native UT2004 bot • Clifford: w00t (bot) • Kathe: Human • Tristan: Human • Jackie: Native UT2004 bot

  30. Human Subject Study • Six participants played the judging game • Recorded extensive post-game interviews • What criteria do humans claim to judge by?

  31. Lessons Learned • Don’t be too skilled • Evolved with accuracy restrictions • Disable elaborate dodging • Humans are “tenacious” • Opponent-relative actions • Encourage “focusing” on opponent • Don’t repeat mistakes • Database of human traces to get unstuck

  32. Bot Architecture

  33. Future Work • Evolving teamwork • Ghosts must cooperate to eat Ms. Pac-Man • Unreal Tournament supports team play • Domination, Capture the Flag, etc. • Interactive evolution • Evolve in response to human interaction • Adaptive opponents/assistants • Evolutionary art • Content generation http://picbreeder.org/

  34. Conclusion • Evolution discovers unexpected behavior • Modular networks learn multimodal behavior • Human behavior not optimal • Evolution can be constrained to be more human-like • Many directions for future research

  35. Questions? contact Jacob Schrum schrum2@cs.utexas.edu
