Demo for AAMAS-2012
GaTAC: A Scalable and Realistic Testbed for Multiagent Decision Making
Ekhlas Sonu, Prashant Doshi
Dept. of Computer Science, University of Georgia, Athens, GA, 30602, USA
[sonu,pdoshi]@cs.uga.edu
Objective
• To design and implement a realistic testbed for evaluating the performance of decision-making algorithms in a problem domain that is:
  • Relevant in cooperative, competitive and mixed settings
    • i.e. across different frameworks such as Dec-POMDP, I-POMDP, etc.
  • Scalable in problem size
    • Number of physical states
  • Flexible in agent capabilities
    • Number and type of actions and observations
  • Extensible in number of agents and adaptable to agent types
Motivation
• Recently, there has been substantial development in multi-agent decision-making algorithms, which has driven researchers to go beyond traditional toy problem domains such as the Tiger Problem, the Machine Maintenance Problem, Grid Meeting, etc.
• Some larger problem domains include Cooperative Box-Pushing, Mars Rover, etc.:
  • Applied in cooperative settings
A Desirable Problem Domain
• A desirable problem domain for multi-agent decision making must be:
  • Scalable in physical states
  • Flexible in agent capabilities (actions & observations)
  • Extensible in number of agents
  • Relevant to cooperative, competitive and mixed settings
  • Able to produce solutions rich in structure
  • Realistic, with popular appeal
Proposed Scenario: Autonomous Unmanned Aerial Vehicles
• Application:
  • Law enforcement [Murphy, Cycon; 1998]
  • Fighting forest fires [Casbeer et al.; 2005]
  • Border surveillance [Haddal, Gertler; 2010]
  • Wartime reconnaissance
• Uncertainty in AUAVs due to:
  • Uncertainty about physical state
  • Noisy actuators and sensors
• Added complexity: presence of other agents
  • May be cooperative or competitive
• Related research
  • Focuses on formulating flight trajectories [R. Bernard et al.; 2002, 2003. S.M. Li et al.; 2002]
An example decision making scenario with AUAVs
• We propose a problem domain involving Autonomous Uninhabited Aerial Vehicles (AUAVs)
• The operating theatre may be divided into sectors (as is common practice) and represented as a grid of a predetermined size
An example decision making scenario with AUAVs
• An example UAV recon problem may involve a UAV (I) (or a team of UAVs) trying to apprehend a target (T) (or a team of moving targets) while another team of UAVs (J) tries to help the target(s) escape to a safe house (S.H.)
• Of course, the exact problem description is flexible
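To illustrate how the number of physical states scales with grid size, here is a minimal Python sketch. It assumes (the slides do not specify this) that a physical state is the joint grid position of one pursuing UAV I, one target T and one opposing UAV J. The names `State`, `enumerate_states` and `count_states` are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class State:
    """Assumed physical state: grid cells (row, col) of I, T and J."""
    i_pos: tuple
    t_pos: tuple
    j_pos: tuple


def enumerate_states(rows, cols):
    """List every joint placement of I, T and J on a rows x cols grid."""
    cells = list(product(range(rows), range(cols)))
    return [State(i, t, j) for i, t, j in product(cells, repeat=3)]


def count_states(rows, cols, n_entities=3):
    """Closed-form count: each entity may occupy any cell independently."""
    return (rows * cols) ** n_entities
```

Under this assumption the state space grows as the cube of the number of sectors, so moving from a 3x3 to a 5x5 grid raises the count from 729 to 15625 states.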
GaTAC: Overview
Georgia Testbed for Autonomous Control of vehicles (GaTAC): a computer simulation framework for evaluating solutions to a UAV reconnaissance problem. It provides:
• Hyperrealistic 3D rendering of AUAVs acting in a real-world scenario
• Scalability in problem size and number of agents
• Flexibility in designing actions and observations of each agent
• Input:
  • Agent control functions (policies) for all agents, generated by any (multi-agent) decision-making algorithm
• Output:
  • Simulation of policies on a flight simulator
  • Results of simulations may be compared for policies generated by different algorithms using metrics such as number of captures, cumulative reward, etc.
GaTAC: Workflow
• We begin with a formal description of a UAV decision-making problem
• Formulate the problem as a .dpomdp/.ipomdp file
• Configure GaTAC for simulation (i.e. set up the environment)
GaTAC: Workflow
• Solve the .dpomdp/.ipomdp file using the algorithm of choice to obtain policies
• Policies for each agent are fed to GaTAC to be simulated and evaluated
GaTAC: Workflow
• Simulate the policies obtained by solving the .dpomdp/.ipomdp file, and evaluate the results using metrics such as number of successes, cumulative reward, etc.
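The evaluation step can be sketched in Python as follows. This is an illustrative sketch only: the episode record format (a list of per-step rewards plus a success flag) and the function name `summarize` are assumptions, not GaTAC's actual output format.

```python
from statistics import mean


def summarize(episodes):
    """Summarize simulated episodes of one policy.

    episodes: list of (step_rewards, succeeded) pairs, where step_rewards
    is a list of per-step rewards and succeeded is a bool (assumed format).
    """
    if not episodes:
        raise ValueError("need at least one episode")
    # Cumulative reward of an episode is the sum of its per-step rewards.
    returns = [sum(rewards) for rewards, _ in episodes]
    return {
        "episodes": len(episodes),
        "successes": sum(1 for _, ok in episodes if ok),
        "mean_cumulative_reward": mean(returns),
    }
```

Calling `summarize` once per algorithm on the same problem gives directly comparable success counts and mean cumulative rewards.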
GaTAC Components
• Each instance of GaTAC has three components:
  • Flight Simulator
    • Off-the-shelf open-source flight simulator on which policies are simulated
    • One instance of the flight simulator for each agent
  • Autonomous Control Module (ACM)
    • Controls each aircraft and makes it behave according to the policy on the flight simulator
  • Communication Module
    • Sends aircraft behavior from the ACM to the flight simulator
    • Communicates with other agents (if required)
• GaTAC instances may run on different machines
  • Connected using the communication module
Architecture
[Figure: a GaTAC instance, made up of the Communication Module, the Autonomous Control Module and the Flight Simulator, with communication between agents]
Flight Simulator
• FlightGear:
  • Open-source (written in C++)
  • Multi-platform
  • Hyperrealistic 3D graphics
  • 3D virtual map
  • Flexible, with choices of
    • Multiple aircraft models
    • Locations to act as operating environment
    • Weather conditions, time of day, etc.
  • 6 DoF flight dynamics model
    • Simulates effects of airflow on different parts of the aircraft
FlightGear in Operating Scenario
• FlightGear (FG) utilizes realistic 3D scenery available from TerraGear
• Provides multiple views of the flying aircraft
  • Cockpit view, tail view, etc.
• Multiple instances of FG may be linked together through external servers, which is ideal for multi-agent settings
Autonomous Control Module
Used to algorithmically control the aircraft and make it behave according to the policy, through 3 levels of hierarchy:
• Agent Actions on Grid
  • Actions constructed from high-level actions to represent the actions of agents in the problem at hand
• High-Level Actions: Takeoff, Fly-Straight, Turn, Change Altitude
  • Perform simple tasks that represent simple aircraft behaviors
• Low-Level Actions: Control Rudder, Throttle, Aileron, Roll, Pitch, etc.
  • Control the aircraft by adjusting parameters along the 6 DoF
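The three-level decomposition can be sketched in Python as below. Everything concrete here is an assumption made for illustration: the grid action names, the heading convention (degrees, 0 = north), and the control setpoints, which are untuned placeholders rather than values used by GaTAC.

```python
# Level 1 -> level 2: each grid action is a sequence of high-level actions.
# (name, argument) pairs; "turn" takes a target heading in degrees (assumed).
GRID_TO_HIGH = {
    "north": [("turn", 0), ("fly_straight", 1)],
    "east": [("turn", 90), ("fly_straight", 1)],
    "south": [("turn", 180), ("fly_straight", 1)],
    "west": [("turn", 270), ("fly_straight", 1)],
}


def turn_direction(current_deg, target_deg):
    """Return +1 for a right turn, -1 for a left turn, 0 if aligned."""
    diff = (target_deg - current_deg + 180) % 360 - 180
    return (diff > 0) - (diff < 0)


def expand_high_level(name, arg, heading_deg):
    """Level 2 -> level 3: high-level action to (control, setpoint) pairs.

    The setpoints are placeholder numbers, not tuned flight parameters.
    """
    if name == "turn":
        sign = turn_direction(heading_deg, arg)
        return [("aileron", 0.3 * sign), ("rudder", 0.1 * sign)]
    if name == "fly_straight":
        return [("aileron", 0.0), ("rudder", 0.0), ("throttle", 0.7)]
    raise ValueError(f"unknown high-level action: {name}")


def expand_grid_action(action, heading_deg):
    """Break a grid action down into its low-level control commands."""
    low_level = []
    for name, arg in GRID_TO_HIGH[action]:
        low_level.extend(expand_high_level(name, arg, heading_deg))
        if name == "turn":
            heading_deg = arg  # assume the turn completes before the next step
    return low_level
```

A real controller would close the loop on the flight dynamics reported by the simulator rather than emit fixed setpoints; the sketch only shows the layering.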
Communication Module
• Establishes communication channels between:
  • The Autonomous Control Module and FlightGear
  • Agents (if required, e.g. in team settings)
• Communication channels use UDP, httpd and XML
• Communicates low-level flight control data from an instance of the autonomous control module to the respective instance of FlightGear
• Communicates aircraft position to all other instances of GaTAC in real time (used to formulate observations)
Communication Module Functions
• Send control data from the ACM to FG
  • May adjust flight parameters such as thrust, rudder, aileron, altitude, etc.
• Receive the aircraft's flight dynamics in real time from FG and send them to the ACM for path correction
  • Position, aircraft orientation on 6 DoF, flight speed, altitude, etc.
• May be used to pass messages between GaTAC instances (when communication between agents is required)
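Sending control data over UDP can be sketched in Python with the standard `socket` module. The wire format below (one comma-separated ASCII line with a fixed field order) and the names `CONTROL_FIELDS`, `encode_controls`, `decode_controls` and `send_controls` are assumptions for illustration; the slides do not describe the actual packet layout, and the host and port must match whatever the receiving simulator instance is configured to listen on.

```python
import socket

# Assumed field order for one control packet (hypothetical).
CONTROL_FIELDS = ("throttle", "rudder", "aileron")


def encode_controls(controls):
    """Pack control values into one newline-terminated ASCII line."""
    line = ",".join(f"{controls[name]:.3f}" for name in CONTROL_FIELDS)
    return (line + "\n").encode("ascii")


def decode_controls(packet):
    """Inverse of encode_controls; raises ValueError on a malformed packet."""
    values = packet.decode("ascii").strip().split(",")
    if len(values) != len(CONTROL_FIELDS):
        raise ValueError("unexpected number of fields")
    return {name: float(v) for name, v in zip(CONTROL_FIELDS, values)}


def send_controls(controls, host, port):
    """Send one control packet as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_controls(controls), (host, port))
```

UDP gives no delivery guarantee, which is usually acceptable for control data that is resent every cycle, since a lost packet is superseded by the next one.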
GaTAC Control Algorithm
• Start FlightGear
• Read policy from file
• Obtain the action to perform from the policy
• Fly according to the policy
  • The agent action is systematically broken down into high-level and then low-level actions to control the aircraft algorithmically
• Get observations / next action
  • The next action may be obtained from the policy using the observation
• Observation = Successful?
  • No: repeat until the termination condition is reached
  • Yes: Mission Accomplished
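The control loop can be sketched in Python as follows. The policy representation (a dictionary mapping a node id to an action and an observation-to-next-node table), the `"success"` observation label and the `execute` callback standing in for the flight simulator are all assumptions for illustration, not GaTAC's actual policy file format.

```python
def run_policy(policy, start, execute, max_steps=100):
    """Follow a policy until success or a termination condition.

    policy:  {node: (action, {observation: next_node})} (assumed format)
    execute: callable that performs an action and returns an observation;
             in GaTAC this role is played by flying the aircraft.
    Returns (succeeded, trace), where trace lists (action, observation).
    """
    node = start
    trace = []
    for _ in range(max_steps):  # termination condition: step budget
        action, transitions = policy[node]
        observation = execute(action)  # fly the action, then observe
        trace.append((action, observation))
        if observation == "success":
            return True, trace  # mission accomplished
        if observation not in transitions:
            return False, trace  # policy has no branch for this observation
        node = transitions[observation]  # next action comes from the policy
    return False, trace
```

Replacing `execute` with a stub that replays recorded observations makes it possible to check a policy file offline before running it on the simulator.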
Conclusions
GaTAC:
• Can act as an open-source testbed for decision-theoretic agents
• May be used to compare different algorithms irrespective of decision-making framework (Dec-POMDP, I-POMDP, MTDP, etc.)
• Is extensible: no upper bound on the size of the problem
  • Number of physical states, number of agents, number and types of actions & observations
• Facilitates deployment of decision-theoretic agents in hyper-realistic real-world settings (cooperative, competitive, or mixed)
• Is easily configurable for simulating any UAV problem
• Provides for communication between agents
• May be extended to include a choice of locations and aircraft