Competitive Relative Performance Evaluation of Neural Controllers for Competitive Game Playing With Teams of Real Mobile Robots

CRIM
Competitive Relative Performance Evaluation of Neural Controllers for Competitive Game Playing With Teams of Real Mobile Robots
A. L. Nelson and E. Grant
Center for Robotics and Intelligent Machines, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911
T. C. Henderson
Department of Computer Science, School of Computing, 3190 Merrill Engineering Building, University of Utah, Salt Lake City, Utah 84112
Nelson, Grant and Henderson, PerMIS 2002

Presentation Overview
• Introduction to and recent history of Evolutionary Robotics (ER)
• Absolute vs. relative fitness functions
• The CRIM ER Research Environment
• Evolvable neural network architecture and genetic representation
• Results: Evolution of game behaviors
• Results: Measuring evolved behavior quality

Relevant Results in Evolutionary Robotics
• 1996: Floreano, Mondada:
  • Homing behavior (Khepera)
  • Embodied evolution
• 1998: Jakobi, also Meyer:
  • Octopod locomotion with object avoidance
  • Evolution in simulation with transference to reality
• 1999: Pollack:
  • Phototaxis
  • Embodied evolution

Robot Task: competitive team game playing: ‘Capture the Flag’
A more complex behavioral task: Neural Controllers for Team Behavior

Fitness and Training Functions for Selection in Evolutionary Robotics
• Fitness function specification is the central issue in Evolutionary Robotics
• Most research to date has used hand-formulated absolute fitness functions
• We propose using competitive tournaments to determine the relative fitness of each individual in a population

Absolute Fitness Functions
• Used in almost all ER work
• Maximize or minimize a sum of hand-derived, problem-specific factors
• Example: Avoid objects and travel straight with differential steering
The Relative Competitive Fitness Function
• Used in this work
• For a member p ∈ P of an evolving population, fitness over the course of a complete tournament of games is given by
• F(p) = w(p) + d(p) + n(p)
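
The tournament-based scoring described on this slide can be illustrated with a minimal sketch. The slide does not define the terms w(p), d(p) and n(p), so the meanings used here (win points, draw points, and a placeholder third term left at zero) are assumptions, as are the round-robin pairing, the point values, and the hypothetical `play_game` stand-in, which simply picks an outcome at random instead of running a simulated game.

```python
import itertools
import random

def play_game(a, b, rng):
    """Hypothetical stand-in for one simulated game: returns 'a', 'b', or 'draw'."""
    return rng.choice(["a", "b", "draw"])

def tournament_fitness(population, rng, win_pts=3.0, draw_pts=1.0):
    """Score every individual relative to the others in a round-robin tournament.

    w, d and n mirror the three terms of F(p) = w(p) + d(p) + n(p). What each
    term measures is an assumption: win points, draw points, and a placeholder
    term that stays at zero in this sketch.
    """
    w = {p: 0.0 for p in population}
    d = {p: 0.0 for p in population}
    n = {p: 0.0 for p in population}
    for a, b in itertools.combinations(population, 2):
        outcome = play_game(a, b, rng)
        if outcome == "a":
            w[a] += win_pts
        elif outcome == "b":
            w[b] += win_pts
        else:
            d[a] += draw_pts
            d[b] += draw_pts
    return {p: w[p] + d[p] + n[p] for p in population}

fitness = tournament_fitness(["p0", "p1", "p2", "p3"], random.Random(0))
```

The point of the sketch is that an individual's score depends only on how it fares against the rest of the current population, not on a hand-written measure of task performance.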

The CRIM ER Research Environment
[Diagram labels: ANN, GA, Population, EvBot, Simulated Environment, Real Maze Environment]

Range Sensor Emulation, Real Video

The Genome and Mutation
• Neural network weight and connection information is stored in a matrix W of real numbers
• The chromosome:
• Composite mutation operator:
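
The chromosome and mutation formulas did not survive on this slide, so the following is only a rough sketch of what a composite mutation on a weight matrix W could look like. Everything specific in it is an assumption: a zero entry is taken to mean "no connection", and the operator combines Gaussian perturbation of existing weights with random addition or removal of connections. The function name `mutate` and the parameters `sigma`, `p_weight` and `p_toggle` are hypothetical.

```python
import random

def mutate(W, rng, sigma=0.1, p_weight=0.2, p_toggle=0.05):
    """Return a mutated copy of W (list of lists of floats).

    Assumed encoding: an entry of 0.0 means the connection is absent.
    """
    child = [row[:] for row in W]  # copy so the parent genome is untouched
    for i, row in enumerate(child):
        for j, w in enumerate(row):
            # Part 1 (assumed): perturb an existing weight with Gaussian noise.
            if w != 0.0 and rng.random() < p_weight:
                child[i][j] = w + rng.gauss(0.0, sigma)
            # Part 2 (assumed): add a missing connection or remove an existing one.
            if rng.random() < p_toggle:
                child[i][j] = rng.uniform(-1.0, 1.0) if child[i][j] == 0.0 else 0.0
    return child

W = [[0.0, 0.5], [-0.3, 0.0]]
child = mutate(W, random.Random(0))
```

Storing both weights and connectivity in one real-valued matrix means a single operator can change the network's structure as well as its parameters.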

The Genetic Algorithm
• Incomplete (μ+λ)-ES: greedy selection with mutation and replacement
• The next generation is given by the union of the following sets
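
The sets that make up the next generation are not recoverable from this slide. As a point of reference, a textbook (μ+λ) selection step, which is not necessarily the "incomplete" variant used in this work, can be sketched as follows. The function `plus_selection_step` and the toy fitness and mutation functions are hypothetical; in the tournament setting, `fitness_fn` would score the whole pool at once.

```python
import random

def plus_selection_step(parents, fitness_fn, mutate_fn, lam, rng):
    """One generation of textbook (mu+lambda) selection.

    Parents and mutated offspring compete together; the best mu survive.
    fitness_fn takes the whole pool and returns one score per member.
    """
    mu = len(parents)
    offspring = [mutate_fn(rng.choice(parents), rng) for _ in range(lam)]
    pool = parents + offspring
    scores = fitness_fn(pool)
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:mu]]

# Toy usage: individuals are floats and fitness rewards closeness to zero.
rng = random.Random(1)
parents = [rng.uniform(-5.0, 5.0) for _ in range(4)]
toy_fitness = lambda pool: [-abs(x) for x in pool]
toy_mutate = lambda x, rng: x + rng.gauss(0.0, 0.5)
next_gen = plus_selection_step(parents, toy_fitness, toy_mutate, lam=8, rng=rng)
```

Because parents stay in the pool, the best score in the next generation can never fall below the best score among the parents when fitness is absolute. With a relative, tournament-based fitness that guarantee holds only within a single evaluation of the pool.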

Results: Evolution of Behavior
• Wall avoidance
• Selective goal avoidance

Ranking Evolved Behavior Using Hand-coded Knowledge-based Controllers: Method
• Rank evolved controllers on a continuum with knowledge-based controllers designed to perform the same task
  • Good controller (Rule)
  • Bad controller (Random)
• Sets of paired games were played using controllers in real robots in the real world
• 5 random game initializations were generated
• These were used to play 10 games of ANN vs. rule-based controllers and 10 games of ANN vs. random controllers

Results
• Neural controllers won 3 out of 10 games against the good rule-based controller, while the rule-based controller won 7 (30% to 70%)
• The neural network controllers won 8 out of 10 games against the random controller, while the random controller won 0; two games were incomplete, so the split over the 8 completed games is 100% to 0%
• Equally matched controllers would receive a ranking of 50% to 50%
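
The percentages above follow from counting only completed games. A small worked example, assuming every completed game has a winner; the helper name `win_split` is hypothetical:

```python
def win_split(wins_a, wins_b):
    """Return the win percentages of two controllers over completed games only."""
    completed = wins_a + wins_b
    if completed == 0:
        raise ValueError("no completed games to rank")
    return 100.0 * wins_a / completed, 100.0 * wins_b / completed

neural_vs_rule = win_split(3, 7)    # (30.0, 70.0)
neural_vs_random = win_split(8, 0)  # (100.0, 0.0): two incomplete games excluded
```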

Results: Example Games Played With Real Robots Using Three Controller Types
[Panels: Rule vs. Neural; Random vs. Neural]

Conclusions
• A new evolutionary robotics platform with small, autonomous, computationally powerful robots has been developed
• Neural controllers were evolved in simulation using a relative fitness function for selection
• Evolved controllers were tested in the real world using real robots
• A post-evolution metric that ranked evolved controllers on a continuum with knowledge-based controllers was used

Ongoing Research
• Human-robot control interface for rule extraction and intelligent control testing
• Continued development and evaluation of training fitness evaluation
• Colony and swarm robot control research

Acknowledgements
• John Galeotti: Developed the EvBot Linux OS and hardware architecture
• This work was initially funded under a DARPA grant for distributed robotics
