
Evolving Neural Network Agents in the NERO Video Game


Presentation Transcript


  1. Evolving Neural Network Agents in the NERO Video Game Authors: Kenneth O. Stanley, Bobby D. Bryant, and Risto Miikkulainen Presented by Yi Cheng Lin

  2. Outline • Introduction • The behavior of agents • Challenges to traditional reinforcement learning (RL) techniques • Real-time NeuroEvolution of Augmenting Topologies (rtNEAT) • NeuroEvolving Robotic Operatives (NERO) • Playing NERO • Conclusion

  3. Introduction • The world video game market in 2002 was between $15 billion and $20 billion • This paper introduces real-time NeuroEvolution of Augmenting Topologies (rtNEAT) • Its purpose is to let non-player characters (NPCs) interact with players and adapt during gameplay

  4. The behavior of agents • The behavior of agents in current games is often repetitive and predictable • Machine learning could potentially keep video games interesting by allowing agents to change and adapt • A major problem with learning in video games is that if behavior is allowed to change, the game content becomes unpredictable

  5. Challenges to traditional Reinforcement learning (RL) techniques • Large state/action space • Diverse behaviors • Consistent individual behaviors • Fast adaptation • Memory of past states

  6. Real-time NeuroEvolution of Augmenting Topologies (rtNEAT) • The rtNEAT method is based on NEAT, a technique for evolving neural networks for complex reinforcement learning tasks using a genetic algorithm • NEAT is based on three key ideas

  7. NEAT • First, tracking genes with historical markings allows easy crossover between different topologies • Each unique gene in the population is assigned a unique innovation number, and these numbers are inherited during crossover • Innovation is protected via speciation
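The historical-marking idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the function and variable names are assumed for the example.

```python
# Sketch of historical markings: each distinct structural mutation
# (a new connection between two nodes) gets a global innovation number,
# so crossover can align genes across genomes with different topologies.
innovation_counter = 0
innovation_db = {}  # (in_node, out_node) -> innovation number

def get_innovation(in_node, out_node):
    """Return the innovation number for a connection gene.

    The same structural mutation always receives the same number,
    which lets crossover line up matching genes by marking."""
    global innovation_counter
    key = (in_node, out_node)
    if key not in innovation_db:
        innovation_counter += 1
        innovation_db[key] = innovation_counter
    return innovation_db[key]
```

Because the number depends only on the (in, out) pair, two genomes that independently add the same connection end up with matching markings.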

  8. NEAT • Second, the reproduction mechanism for NEAT is explicit fitness sharing, where organisms in the same species must share the fitness of their niche, preventing any one species from taking over the population • Third, NEAT begins with a uniform population of simple networks with no hidden nodes
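Explicit fitness sharing, as described above, amounts to dividing each organism's raw fitness by the size of its species. A minimal sketch (names assumed):

```python
def adjusted_fitness(raw_fitness, species_size):
    # Explicit fitness sharing: an organism's fitness is divided by the
    # size of its species, so members of a large species each receive
    # less credit and no single species can take over the population.
    return raw_fitness / species_size

def species_adjusted(raw_fitnesses):
    # raw_fitnesses: raw fitness values for the members of one species.
    n = len(raw_fitnesses)
    return [adjusted_fitness(f, n) for f in raw_fitnesses]
```

A species of two organisms with raw fitnesses 4 and 8 thus gets adjusted fitnesses 2 and 4.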

  9. Running NEAT in Real Time

  10. rtNEAT • After every n ticks of the game clock, rtNEAT performs the following operations: Step 1: Remove the agent with the worst adjusted fitness from the population, assuming one has been alive sufficiently long to have been properly evaluated • It is also important not to remove agents that are too young
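Step 1 can be sketched as a filter-then-minimum over the population. This is an illustrative sketch under an assumed agent representation (fitness, time-alive pairs), not the paper's code:

```python
def worst_eligible(agents, min_time_alive):
    # agents: list of (adjusted_fitness, time_alive) pairs (assumed shape).
    # Agents alive fewer than min_time_alive ticks are too young to have
    # been properly evaluated, so they are protected from removal.
    eligible = [a for a in agents if a[1] >= min_time_alive]
    if not eligible:
        return None  # nothing old enough to replace this cycle
    return min(eligible, key=lambda a: a[0])
```

Note that a very unfit but young agent survives: with a minimum age of 50 ticks, the agent (1.0, 10) below is skipped and the worst *eligible* agent is removed instead.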

  11. rtNEAT Step 2: Re-estimate F for all species (F : average fitness) Step 3: Choose a parent species to create the new offspring with probability Pr(S_k) = F_k / Σ_j F_j, where F_k is the average fitness of species k and Σ_j F_j is the sum of all the species' average fitnesses
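Step 3 selects the parent species in proportion to average species fitness: each species' probability is its average fitness over the total of all averages. A minimal sketch (function name assumed):

```python
def parent_species_probs(species_avg_fitness):
    # species_avg_fitness: mapping species id -> average fitness.
    # Each species is chosen as parent with probability proportional
    # to its average fitness, i.e. F_k divided by the sum of all F_j.
    total = sum(species_avg_fitness.values())
    return {k: f / total for k, f in species_avg_fitness.items()}
```

With two species of average fitness 2 and 6, the second is three times as likely to parent the new offspring.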

  12. rtNEAT Step 4: Adjust the compatibility threshold Ct dynamically and reassign all agents to species • The advantage of this kind of dynamic compatibility thresholding is that it keeps the number of species relatively stable Step 5: Replace the old agent with the new one
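One simple way to realize the dynamic thresholding in Step 4 is to nudge Ct up when there are too many species and down when there are too few. The step size and function name here are assumptions for illustration, not values from the paper:

```python
def adjust_threshold(ct, num_species, target_species, delta=0.1):
    # Raising Ct makes genomes more likely to fall into the same
    # species (fewer species); lowering it splits them apart (more
    # species). Nudging Ct toward a target keeps the count stable.
    if num_species > target_species:
        return ct + delta
    if num_species < target_species:
        return max(delta, ct - delta)  # keep the threshold positive
    return ct
```

After each adjustment, all agents are reassigned to species under the new threshold, as the slide describes.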

  13. Determining Ticks Between Replacements • The appropriate frequency can be determined through a principled approach • Parameters: • n : the number of ticks between replacements • I : the fraction of the population that is too young and therefore cannot be replaced • m : the minimum time an agent must be alive before it can be replaced • |P| : the population size

  14. Determining Ticks Between Replacements • It is best to let the user choose I because in general it is the most critical parameter for performance • rtNEAT can then determine the number of ticks between replacements n needed to maintain the desired eligibility level • In NERO, 50% of the population remains eligible using this technique
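The reasoning behind this, in terms of the parameters listed above: one agent is replaced every n ticks, so at any moment roughly m/n agents are younger than the minimum age m, giving an ineligible fraction I = m / (n·|P|). Solving for n yields the replacement interval; a minimal sketch (function name assumed):

```python
def replacement_interval(m, pop_size, ineligible_fraction):
    # From I = m / (n * |P|): solving for n gives the number of ticks
    # between replacements that maintains the desired eligibility level.
    return m / (pop_size * ineligible_fraction)
```

For example, with a minimum age of 500 ticks, a population of 50, and half the population allowed to be ineligible (as in NERO), an agent should be replaced every 20 ticks.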

  15. NeuroEvolving Robotic Operatives (NERO) • Training Mode • The player sets up training exercises by placing objects on the field and specifying goals through several sliders • Battle Mode • The trained team of agents is pitted against another team to test its skills

  16. Avoiding turret fire

  17. Navigating a maze

  18. Conclusion • A real-time version of NEAT (rtNEAT) was developed to allow users to interact with evolving agents • Using this method, it was possible to build an entirely new kind of video game, NERO, where the characters adapt in real time in response to the player’s actions
