Maintaining Shared Belief in a Large Multiagent Team Prasanna Velagapudi, Oleg Prokopyev, Katia Sycara & Paul Scerri University of Pittsburgh & Carnegie Mellon University
Large Multiagent Teams • 1000s of robots, agents, and people • Must collaborate to complete complex tasks • Necessarily distributed algorithms • Assume fully connected, peer-to-peer communications [Example domains: search and rescue, disaster response, UAV surveillance]
Maintaining Shared Belief • Agents need to share information about objects and uncertainty in the environment to perform their roles • Individual sensor readings are unreliable • Shared beliefs are used to reason about appropriate actions • Maintenance of mutual beliefs is key • Need effective means to propagate information • Cannot guarantee that network neighbors are spatially local • Agents that need information should be likely to get it
Why not share everything? [Cartoon: one UAV floods its teammate with every sand detection and "the potential locations and uncertainty for all sand-like objects" ("I detect sand." "That's nice." "Please don't…" "WHAT?!"), while the one message that actually matters, "Mind the SAM.", gets lost in the noise]
Maintaining Shared Belief • Key Idea: Not every change in belief is of equal value! • Propagate the most useful changes • Rosencrantz, Thrun, et al. UAI 2003 • Make a distributed estimate of usefulness
Maintaining Shared Belief • How do we measure usefulness? • Usefulness metrics are domain-specific • Define a value function for each agent that is maximized when it receives the information it needs • Interested agents are those that assign strictly positive value to a given piece of information
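To make this concrete, here is a sketch of what a domain-specific value function might look like in an object-tracking domain. The names and the functional form are illustrative assumptions, not the paper's definitions: value is positive only when a reading concerns the agent's area of responsibility and is more certain than the agent's current belief.

```python
from dataclasses import dataclass

@dataclass
class SensorReading:
    object_id: str
    location: tuple[float, float]   # (x, y) position estimate
    variance: float                 # uncertainty of that estimate

def value(agent_region: tuple[float, float, float, float],
          agent_variance: float, reading: SensorReading) -> float:
    """Hypothetical value function: how much the reading would reduce this
    agent's uncertainty, counted only for objects inside its region of interest."""
    xmin, ymin, xmax, ymax = agent_region
    x, y = reading.location
    if not (xmin <= x <= xmax and ymin <= y <= ymax):
        return 0.0
    return max(0.0, agent_variance - reading.variance)

def is_interested(agent_region, agent_variance, reading) -> bool:
    """Interested agents are exactly those assigning strictly positive value."""
    return value(agent_region, agent_variance, reading) > 0.0
```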
Token-Based Methods • Encapsulate each sensor reading into a token • Allows independent integration of information • Pass the token stochastically around the team • Shown to be an effective method of coordination in large-scale teams • Xu, et al. AAMAS 2005
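A token that encapsulates a single sensor reading needs only the reading plus a little routing state. The structure and field names below are assumptions for illustration, not the authors' implementation; stochastic propagation is shown as a uniform random choice of neighbor.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Token:
    reading: object                              # the encapsulated sensor reading
    ttl: int                                     # hop budget used by TTL-based policies
    visited: set = field(default_factory=set)    # agent ids already reached

def forward(token: Token, agent_id: int, neighbors: list[int]):
    """Integrate the reading locally, then pass the token to a uniformly random
    neighbor (the simplest form of stochastic routing). Returns the next agent,
    or None if there is nowhere to send it."""
    token.visited.add(agent_id)
    return random.choice(neighbors) if neighbors else None
```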
Token-Based Methods • Two big questions for information tokens: • Where should the token be sent? • Is it useful to the team to propagate the token?
Token Propagation • How should we determine when to stop sending in a large network? • Basic solution: constant TTL (time-to-live hop count) • Based solely on the value estimate of the originating agent • Cannot handle dynamic network effects • Needs to be tuned to team size and network topology
Token Propagation • Can we do better by using agent knowledge? • Agents in a team already accumulate knowledge about teammates and the world • Build a belief of the world from sensor information • Monitor traffic flow between themselves and neighbors • Design an algorithm to make use of this knowledge • Agents assign value to the information contained in a token • Communications have a fixed cost
Using Agent Knowledge • Codify knowledge into value estimation functions • Local value estimate based on local belief • Team value estimate based on relational information • Remaining value estimate based on team distribution
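One plausible way to codify the three estimators, again with hypothetical functional forms: the local estimate comes from the agent's own belief, the team estimate extrapolates it using relational information about neighbors, and the remaining estimate is whatever the token has not yet delivered out of the team total.

```python
def local_value(own_variance: float, reading_variance: float) -> float:
    """Local estimate: improvement to *this* agent's belief from the reading."""
    return max(0.0, own_variance - reading_variance)

def team_value(local: float, neighbor_similarity: float, n_teammates: int) -> float:
    """Team estimate: extrapolate the local value to teammates using relational
    information (here collapsed into a single similarity factor)."""
    return local * neighbor_similarity * n_teammates

def remaining_value(estimated_team_total: float, delivered_so_far: float) -> float:
    """Remaining estimate: value the team could still gain from further hops."""
    return max(0.0, estimated_team_total - delivered_so_far)
```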
Using Agent Knowledge • Define policies that use these estimators and constant-size statistics to determine if a token should be propagated
C-Policy • Constant TTL [Diagram: the token's TTL counts down (2 → 1 → 0) as it is forwarded from agent to agent (Agent 1, Agent 2, Agent 3); propagation stops when the TTL reaches zero]
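A minimal sketch of the C-policy: decrement a fixed hop counter and forward to a random neighbor until it reaches zero. Names and signature are hypothetical.

```python
import random

def c_policy_step(ttl: int, neighbors: list[int]):
    """Constant-TTL propagation: keep forwarding regardless of value until the
    hop budget is exhausted. Returns (next_agent, new_ttl)."""
    if ttl <= 0 or not neighbors:
        return None, 0              # stop propagating
    return random.choice(neighbors), ttl - 1
```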
S-Policy • Use the local value function and a threshold to determine whether to propagate more or less [Diagram: agents with v(s) > 0 extend the token's TTL while agents with v(s) = 0 let it decay, so the TTL rises and falls (e.g. 3, 4, 5, 8) as the token passes between interested and uninterested agents]
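Sketching the S-policy as described on the slide: each agent compares its local value estimate v(s) to a threshold and lengthens or shortens the token's remaining lifetime accordingly. The increment size below is an arbitrary illustrative choice.

```python
import random

def s_policy_step(ttl: int, v_s: float, threshold: float,
                  neighbors: list[int], bonus: int = 2):
    """Value-sensitive TTL: an agent that finds the token valuable grants extra
    hops; an uninterested agent simply burns one hop."""
    ttl = ttl + bonus if v_s > threshold else ttl - 1
    if ttl <= 0 or not neighbors:
        return None, 0
    return random.choice(neighbors), ttl
```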
S-OPT-Policy • Use all three estimators to determine optimal propagation [Diagram: the token carries constant-size statistics (u = interested agents seen, V = value collected, N = hops taken), updated at each agent; e.g. u = 1, V = 3.2, N = 1 after Agent 1 (v(s) = 3.2), growing to u = 3, V = 12.9, N = 4 after Agent 3 (v(s) = 9.2) and one uninterested agent]
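One way to read the S-OPT decision is as a cost/benefit test: carry constant-size statistics in the token (u interested agents seen, V value collected, N hops) and keep propagating only while the value still expected from the rest of the team exceeds the fixed cost of one more message. The extrapolation rule below is an assumption for illustration, not the paper's derivation.

```python
import random
from dataclasses import dataclass

@dataclass
class TokenStats:
    u: int = 0        # interested agents encountered so far
    V: float = 0.0    # total value collected so far
    N: int = 0        # hops taken so far

def s_opt_step(stats: TokenStats, v_s: float, team_size: int,
               comm_cost: float, neighbors: list[int]):
    """Propagate while the estimated value remaining in the unvisited part of
    the team exceeds the fixed cost of one more message."""
    stats.N += 1
    if v_s > 0.0:
        stats.u += 1
        stats.V += v_s
    # Extrapolate: assume the interest rate and average value observed so far
    # also hold for agents not yet visited (an illustrative assumption).
    avg_value = stats.V / stats.u if stats.u else 0.0
    interest_rate = stats.u / stats.N
    expected_remaining = avg_value * interest_rate * max(0, team_size - stats.N)
    if expected_remaining > comm_cost and neighbors:
        return random.choice(neighbors), stats
    return None, stats              # stop: another hop is not worth its cost
```

The appeal of a rule in this style is that the decision remains fully distributed: every quantity it needs either travels in the token or is already known locally.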
Metrics • Need measures of efficiency that are independent of team size and value function • The proportion of value attained • The total number of communications
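One natural, size-independent formulation of the two metrics (my notation; the paper may define the proportion of value f_v slightly differently):

```latex
% I: interested agents (v_i(s) > 0); R \subseteq I: interested agents the token reached
f_v = \frac{\sum_{i \in R} v_i(s)}{\sum_{i \in I} v_i(s)}, \qquad
C = \text{total number of messages sent while propagating the token}
```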
Analysis of Policies • Can analytically solve for expected values of metrics for C-policy and S-policy • C-policy solutions are straightforward • S-policy solutions may provide insight into policy characteristics • S-OPT-policy solutions are possible, but far more complex
Policy Simulation • Effects of varying network and policy parameters tested in a simple simulator • Homogeneous team of agents with binary value • Gaussian noise added to network estimators
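A toy version of such a simulator, sketched under assumptions: team size, interest probability, noise level, and the S-style TTL rule are hypothetical parameters, and the network is abstracted to a uniform random choice of next agent.

```python
import random

def simulate(team_size=100, p_interest=0.3, noise_sd=0.1, ttl=5, bonus=1, seed=0):
    """Toy run: a token walks a homogeneous team with binary value; the policy
    acts on value estimates corrupted by Gaussian noise, while the metrics
    count the true value delivered and the messages sent."""
    rng = random.Random(seed)
    interested = [rng.random() < p_interest for _ in range(team_size)]
    visited, delivered, messages = set(), 0, 0
    current = rng.randrange(team_size)
    while ttl > 0 and messages < 10 * team_size:      # hard cap as a safety net
        if current not in visited and interested[current]:
            delivered += 1                            # true value attained here
        visited.add(current)
        noisy = float(interested[current]) + rng.gauss(0.0, noise_sd)
        ttl += bonus if noisy > 0.5 else -1           # value-sensitive hop budget
        current = rng.randrange(team_size)            # stochastic routing
        messages += 1
    return delivered / max(1, sum(interested)), messages

fv, msgs = simulate()
print(f"proportion of value attained: {fv:.2f}, messages sent: {msgs}")
```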
Policy Simulation • S-OPT policy has a critical point in its parameterization
Policy Simulation • S and S-OPT policies keep working as team size increases • The number of messages increases linearly when the proportion of value collected is held constant
Policy Simulation • S policy sends too many messages when many agents are interested • C policy sends many messages even if very few agents are interested • S policy can break down if few agents are interested
Varying Interest • Model three possible team distributions • Low interest – Agents in the team are often not interested in the token • Medium interest – Roughly half the time, agents in the team are interested in the token • High interest – Agents in the team are often interested in the token • Use policy parameters that collect the same proportion of value (fv) in constant-interest simulation • Compare the number of communications
Varying Interest [Plots: proportion of value attained (higher is better) and number of communications (lower is better) for each policy under the three interest distributions]
Conclusions • Taking advantage of team knowledge can improve token performance • Belief propagation algorithms may improve existing routing techniques • Inherently distributed decision-making • Can dynamically deal with stochastic routing methods • Possible to analytically solve for expected behavior • Relaxing guarantees on information sharing allows for significant domain-specific optimization of belief propagation