DARPA ITO/MARS Project Update
Vanderbilt University
A Software Architecture and Tools for Autonomous Robots that Learn on Mission
K. Kawamura, M. Wilkes, R. A. Peters II, D. Gaines
Vanderbilt University Center for Intelligent Systems
http://shogun.vuse.vanderbilt.edu/CIS/IRL/
12 January 2000
Vanderbilt MARS Team
• Kaz Kawamura, Professor of Electrical & Computer Engineering. MARS responsibility - PI, Integration
• Dan Gaines, Asst. Professor of Computer Science. MARS responsibility - Reinforcement Learning
• Alan Peters, Assoc. Professor of Electrical Engineering. MARS responsibility - DataBase Associative Memory, Sensory EgoSphere
• Mitch Wilkes, Assoc. Professor of Electrical Engineering. MARS responsibility - System Status Evaluation
• Jim Baumann, Nichols Research. MARS responsibility - Technical Consultant
Sponsoring Agency: Army Strategic Defense Command
A Software Architecture and Tools for Autonomous Mobile Robots That Learn on Mission
GRAPHIC: agent diagram - Commander (CMDR), Environment (ENVIR), Self, Communications (COMM), and Learning agents linked via IMA to Squad agents 1 through N.
NEW IDEAS: Learning with a DataBase Associative Memory; Sensory EgoSphere; Attentional Network; Robust System Status Evaluation.
SCHEDULE: Year 1 - IMA agents and schema, IMA Test Demo; Year 2 - Learning algorithms, Final Demo (Demo III).
IMPACT: Mission-level interaction between the robot and a human commander. Enable automatic acquisition of skills and strategies. Simplify robot training via intuitive interfaces - program by example.
Project Goal
Develop a software control system for autonomous mobile robots that can:
• accept mission-level plans from a human commander,
• learn from experience to modify existing behaviors or to add new behaviors, and
• share that knowledge with other robots.
Project Approach
• Use IMA to map the problem to a set of agents.
• Develop System Status Evaluation (SSE) for self-diagnosis and to assess task outcomes for learning.
• Develop learning algorithms that use and adapt prior knowledge and behaviors and acquire new ones.
• Develop the Sensory EgoSphere, behavior and task descriptions, and memory association algorithms that enable learning on mission.
MARS Project: The Robots
• ATRV-Jr.
• ISAC
• HelpMate
The IMA Software Agent Structure of a Single Robot
GRAPHIC: Commander Agent, Environment Agent, Self Agent, Act./Learning Agent, and Communications Agent linked via IMA to Squad Agents 1 through n.
Robust System Status Analysis
• Timing information from communication between components and agents will be used.
• Timing patterns will be modeled.
• Deviations from normal indicate “discomfort.”
• Discomfort measures will be combined to provide system status information.
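The idea on this slide can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes each component's "normal" timing is summarized by the mean and standard deviation of its baseline inter-message intervals, measures discomfort as the normalized deviation from that baseline, and combines per-component discomforts by taking the maximum. The class and function names are hypothetical.

```python
import statistics

class TimingMonitor:
    """Models one component's normal inter-message timing and reports
    'discomfort' as the deviation of a new interval from that baseline."""

    def __init__(self, baseline_intervals):
        self.mean = statistics.mean(baseline_intervals)
        # Guard against a zero spread in the baseline data.
        self.std = statistics.stdev(baseline_intervals) or 1e-9

    def discomfort(self, interval):
        # Normalized deviation from the learned timing pattern.
        return abs(interval - self.mean) / self.std

def system_status(monitors, intervals):
    """Combine per-component discomfort into one status score;
    taking the max lets any single ailing component dominate."""
    return max(m.discomfort(i) for m, i in zip(monitors, intervals))
```

A component that normally reports every second but suddenly takes five produces a large discomfort score, flagging degraded system status.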
What Do We Measure?
• Visual Servoing Component: error vs. time
• Arm Agent: error vs. time, proximity to unstable points
• Camera Head Agent: 3D gaze point vs. time
• Tracking Agent: target location vs. time
• Vector Signals/Motion Links: log when data is updated
Planning/Learning Objectives
• Integrated Learning and Planning
  • learn skills, strategies, and world dynamics
  • handle large state spaces
  • transfer learned knowledge to new tasks
  • exploit a priori knowledge
• Combine Deliberative and Reactive Planning
  • exploit predictive models and a priori knowledge
  • adapt given actual experiences
  • make cost-utility trade-offs
Generate Abstract Map
• Nodes selected based on learned action models
• Each node represents a navigation skill
Generate Plan in Abstract Network
• Plan makes cost-utility trade-offs
• Plans updated during execution
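One simple way to make a cost-utility trade-off over an abstract map is to compute path costs with Dijkstra's algorithm and then choose the reachable goal whose utility most exceeds its path cost. This is a generic sketch under that assumption, not the project's planner; the graph encoding and function names are illustrative.

```python
import heapq

def shortest_costs(graph, start):
    """Dijkstra over the abstract map; edge weights are traversal costs.
    graph: {node: [(neighbor, cost), ...]}"""
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, node = heapq.heappop(pq)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return dist

def best_goal(graph, start, utilities):
    """Cost-utility trade-off: pick the reachable goal that
    maximizes utility minus path cost."""
    dist = shortest_costs(graph, start)
    return max((u - dist[g], g) for g, u in utilities.items() if g in dist)[1]
```

Because path costs come from the current map, re-running the same computation as edge costs are revised during execution naturally updates the plan.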
Planning/Learning Status
• Action Model Learning
  • adapted MissionLab to allow experimentation (terrain conditions)
  • using regression trees to build action models
• Plan Generation
  • developed prototype Spreading Activation Network
  • using it to evaluate the potential of SANs for plan generation
Role of ISAC in MARS
ISAC is a testbed for learning complex, autonomous behaviors by a robot under human tutelage.
• Inspired by the structure of vertebrate brains
• a fundamental human-robot interaction model
• sensory attention and memory association
• learning sensory-motor coordination (SMC) patterns
• learning the attributes of objects through SMC
System Architecture
GRAPHIC: IMA primitive agents (A) link the robot's hardware I/O and the software system; the Robot Self Agent and the Human Agent mediate between the robot and the human.
Next Up: Peer Agent
We are currently developing the peer agent, which encapsulates the robot’s understanding of and interaction with other (peer) robots.
System Architecture: High-Level Agents
GRAPHIC: self agent, human agent, environment agent, peer agents, and object agents.
Due to the flat connectivity of IMA primitives, all high-level agents can communicate directly if desired.
Robot Learning Procedure
• The human programs a task by sequencing component behaviors via speech and gesture commands.
• The robot records the behavior sequence as a finite state machine (FSM) and all sensory-motor time-series (SMTS).
• Repeated trials are run. The human provides reinforcement feedback.
• The robot uses Hebbian learning to find correlations in the SMTS and to delete spurious info.
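The Hebbian step above can be sketched with the classic rule: strengthen the weight between two signals whenever they are active together (w += lr * x * y). In this toy version (invented names, zero-mean signals assumed), a sensory channel correlated with the motor signal accumulates a large weight, while an uncorrelated channel's weight stays near zero and can be pruned as spurious.

```python
def hebbian_weights(channels, motor, lr=0.01):
    """Hebbian rule over sensory-motor time series: the weight between
    a sensory channel and the motor signal grows when they co-fire."""
    weights = [0.0] * len(channels)
    for t, y in enumerate(motor):
        for i, ch in enumerate(channels):
            weights[i] += lr * ch[t] * y
    return weights

def prune_spurious(weights, threshold):
    """Keep only the channel indices whose learned correlation weight
    exceeds the threshold; the rest are treated as spurious."""
    return [i for i, w in enumerate(weights) if abs(w) >= threshold]
```

For zero-mean signals this accumulated weight is proportional to the empirical correlation between channel and motor signal, which is exactly the quantity the thinning step needs.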
Robot Learning (cont’d)
• The robot extracts task-dependent SMC info from the behavior sequence and the Hebbian-thinned data.
• SMC occurs by associating sensory-motor events with behavior nodes in the FSMs.
• The FSM is transformed into a spreading activation network (SAN).
• The SAN becomes a task record in the DataBase Associative Memory (DBAM) and is subject to further refinement.
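The spreading-activation mechanism behind a SAN can be sketched generically: each node keeps its activation and receives a decayed fraction of its predecessors' activation on every pass, so activation injected at a source node flows along the network's links and ranks downstream behavior nodes. This is a minimal illustration with invented node names, not the project's SAN.

```python
def spread(links, activation, decay=0.5, steps=3):
    """Spreading activation: on each pass every node keeps its current
    activation and gains decay * activation(src) from each predecessor.
    links: {node: [successor, ...]}; activation: {node: initial level}."""
    act = dict(activation)
    for _ in range(steps):
        nxt = dict(act)
        for src, dsts in links.items():
            for dst in dsts:
                nxt[dst] = nxt.get(dst, 0.0) + decay * act.get(src, 0.0)
        act = nxt
    return act
```

Nodes closer to the activation source end up more active than nodes farther away, which is what lets a SAN favor the behaviors most relevant to the current goal.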