Large Scale Coordination of Heterogeneous Agents
CMU-Robotics: Katia Sycara, Paul Scerri, Robin Glinton, Pras Velagapudi, Yonghong Wang
www.cs.cmu.edu/~softagents/muri_7
Overview
• Scaling distributed robot path planning
• Understanding the dynamics of information propagation in large heterogeneous teams
Second Year Review
Scaling Distributed Robot Planning
Katia Sycara, Paul Scerri, Pras Velagapudi
Planning for Large Teams
• Setting: multiple agents, graph world, no uncertainty
• Previous approaches
  • Coupled (e.g. composite robots) [Parsons '90; Svestka '98]
    • Intractable for large teams
  • Decoupled (e.g. prioritized planning) [Erdman '87; Berg '05]
    • Plan from higher to lower priority
    • Higher-priority robots are treated as dynamic obstacles by lower-priority robots
    • Sequential path computation (higher to lower priority robots); scales poorly
  • Reactive (e.g. Dynamic Networks) [Clark 2003; Chun 1999]
    • Runs at execution time
    • Poor-quality solutions
Distributed Prioritized Planning (DPP)
At each robot:
1. Compute initial path
2. Determine local priority
3. Broadcast path to team
4. Listen for other teammates' paths
5. If a higher-priority path is received, add it as an obstacle in space-time
6. Compute new collision-free path
7. Go to 3.
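The DPP loop can be sketched as a small synchronous simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the world is a square 4-connected grid, priority is simply the robot index (lower index means higher priority), breadth-first search in space-time stands in for A*, every robot has a distinct start cell, and a feasible path exists within the horizon. The names `plan`, `collides`, `position` and `dpp` are hypothetical.

```python
from collections import deque

MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait, or step to a 4-neighbour


def position(path, t):
    """Cell occupied at time t; a robot parks on its last cell."""
    return path[min(t, len(path) - 1)]


def collides(a, b, horizon):
    """True if paths a and b share a cell at the same time or swap cells."""
    for t in range(horizon):
        if position(a, t) == position(b, t):
            return True
        if position(a, t) == position(b, t + 1) and position(a, t + 1) == position(b, t):
            return True
    return False


def plan(start, goal, size, obstacles, horizon):
    """Shortest space-time path on a size x size grid that avoids the
    higher-priority paths in `obstacles` (BFS used here instead of A*)."""
    frontier = deque([[start]])
    seen = {(start, 0)}
    while frontier:
        path = frontier.popleft()
        t = len(path) - 1
        # Accept the goal only if parking there never blocks a higher-priority robot.
        if path[-1] == goal and not any(collides(path, o, horizon) for o in obstacles):
            return path
        if t >= horizon:
            continue
        for dx, dy in MOVES:
            nxt = (path[-1][0] + dx, path[-1][1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or (nxt, t + 1) in seen:
                continue
            # Check just this one step against each obstacle's step at the same time.
            blocked = any(
                collides([path[-1], nxt], [position(o, t), position(o, t + 1)], 2)
                for o in obstacles
            )
            if not blocked:
                seen.add((nxt, t + 1))
                frontier.append(path + [nxt])
    return None


def dpp(starts, goals, size, horizon=30):
    """Repeat broadcast / listen / replan rounds until no robot changes its path."""
    n = len(starts)
    paths = [plan(starts[i], goals[i], size, [], horizon) for i in range(n)]
    changed = True
    while changed:
        changed = False
        inbox = list(paths)  # every robot hears every teammate's latest path
        for i in range(n):
            higher = inbox[:i]  # assumption: lower index = higher priority
            if any(collides(paths[i], o, horizon) for o in higher):
                paths[i] = plan(starts[i], goals[i], size, higher, horizon)
                changed = True
    return paths
```

For example, `dpp([(0, 0), (2, 0)], [(2, 0), (0, 0)], size=3)` leaves the higher-priority robot on its straight path and makes the other robot detour around it.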
Reduced DPP (RDPP)
• DPP requires broadcasting messages to every teammate every time an agent replans
• This can be reduced with two assumptions
  • If a robot does not receive a message to the contrary, it can assume plans have not changed
  • Each robot sends its re-planned path only to robots of lower priority (but saves and considers the plans of robots with higher priority)
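The difference in message traffic can be made concrete with a small sketch of who receives a re-planned path under each scheme. The function name `recipients` and the convention that a smaller rank number means higher priority are assumptions made for illustration only.

```python
def recipients(sender, ranks, reduced):
    """Teammates that receive `sender`'s re-planned path.

    ranks maps robot name -> priority rank (assumed: 0 is the highest priority).
    """
    others = [r for r in ranks if r != sender]
    if not reduced:
        return others  # DPP: broadcast to every teammate
    # RDPP: only lower-priority robots treat this path as an obstacle
    return [r for r in others if ranks[r] > ranks[sender]]
```

With ranks `{"a": 0, "b": 1, "c": 2}`, robot `b` sends two messages per replan under DPP and one under RDPP.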
Sparse DPP
• Goal: reduce the number of messages even further than RDPP
• Step 1: each robot sends its path to w random teammates
• Step 2: each robot computes conflicts among the paths it receives and notifies the conflicting robots
• Step 3: use prioritized planning for the detected collisions
• Probability calculations (for w = k√n) showed that the exact probability of collision and the lower bound converge to an asymptote
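One plausible reading of the probability remark can be worked through numerically. The sketch below assumes each of n robots picks w distinct teammates uniformly at random and independently, and that a conflict between two robots is noticed if either receives the other's path or some third robot receives both. Under that model (an assumption, not taken from the slides) the detection probability for w = k√n approaches 1 - exp(-k²) as n grows. The function names are hypothetical.

```python
import math


def detection_probability(n, w):
    """Chance that a given pair's conflict is seen by at least one robot,
    when every robot sends its path to w distinct random teammates."""
    # Neither robot of the pair sends its path directly to the other.
    miss_direct = (1 - w / (n - 1)) ** 2
    if 2 * w > n - 2:
        miss_third = 0.0  # the two recipient sets must overlap
    else:
        # Both recipient sets are drawn from the n - 2 third parties and are disjoint.
        miss_third = math.comb(n - 2 - w, w) / math.comb(n - 2, w)
    return 1 - miss_direct * miss_third


def asymptote(k):
    """Large-n limit of the detection probability for w = k * sqrt(n)."""
    return 1 - math.exp(-k * k)
```

For n = 10000 and w = 100 (k = 1) the exact value is already within about 0.01 of the limit, which is consistent with the convergence the slide describes.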
Results: Varying Size of Team
Results: Varying Density of Map (# robots: 240)
Summary of Results
• DPP works as well as centralized planning
• RDPP
  • Takes many fewer sequential steps
  • In some instances may take longer in wall-clock time (due to uneven computation time of A*)
• Sparse DPP does poorly overall. Why?
  • Detecting collisions alone is insufficient; dependencies between agents must also be detected
  • Sparse DPP takes much longer to figure out dependencies
  • In RDPP, agents preemptively discover dependencies before collisions occur
Domains with Uncertainty
• Scaling path planning in uncertain and incompletely observed environments
• Full joint DEC-MDP problem = NEXP-hard
• Independent planners: can't account for teammates, will do poorly
• Existing work: exploits specific structure
  • I-MDP, ND-POMDP, TI-POMDP, TD-POMDP, EDI-Dec-MDP, EDI-CR, JESP, Factored MDP
  • Limited by exponentially scaling algorithm steps, or restrictive models
DEC-POMDP with Coordination Locales (DPCL)
• Subclass of Dec-POMDPs
• State space is factored into global tasks and local states (interactions occur in limited parts of the state space)
• Explicit enumeration of interactions
  • Same-time Coordination Locales (STCL)
    • Situations where the state or reward resulting from simultaneous action execution by robots cannot be described by the individual transitions and rewards (e.g. collisions)
  • Future-time Coordination Locales (FTCL)
    • Actions of one robot can impact the actions of others in the future, e.g. an agent changing the environment for another agent by clearing debris
A Simple Rescue Domain
[Figure: rescue map with legend entries Unsafe Cell, Rescue Agent, Clearable Debris, Narrow Corridor, Victim, Cleaner Agent]
TREMOR
Approximate algorithm that optimizes expected joint reward while exploiting limited coordination locales (CLs) [Varakantham 2009].
• Construct a tree of all possible assignments of tasks to agents
• Do a branch and bound search, using Dec-MDPs as the bound, to compute an optimal joint policy consistent with the task assignment
• To find the exact value, do iterative reward shaping
  1. Agents compute independent POMDP solutions
  2. Solution policies are jointly evaluated
  3. The marginal difference between expected and actual computed value is found for each Coordination Locale
  4. The difference is added as a shaping function to the local reward model
  5. Go to 1.
L-TREMOR
• TREMOR is effective with up to 10 agents
• However, TREMOR's scalability is limited:
  • Allocation tree grows exponentially
  • Solution of Dec-MDPs grows exponentially
  • Evaluation of POMDP policies is centralized
• L-TREMOR keeps the iterative, parallel parts of TREMOR and replaces the other parts
  • Branch and bound search is replaced by an auction
  • Joint evaluation is replaced by a sampling approximation
  • Prioritized reward shaping: do not apply reward shaping if the interacting robot has lower priority
Preliminary Results: Collision Only
[Figure panels: 3x70 Column, N = 100; 10x10 Square, N = 100; 3x5 Column, N = 10. Panel note: "Similar in structure to 3x5 column"]
Future Work
• Adding handling of both STCL and FTCL interactions
• Studying communication dynamics in L-TREMOR ("Reduced" and "Sparse" approaches)
• Finding analytic bounds on performance for particular classes of interaction
• Expanding to more realistic domains
• Working with larger team sizes (200+)
Understanding the Dynamics of Belief Propagation in Large Teams
Katia Sycara, Paul Scerri, Robin Glinton, Yonghong Wang
Motivation
• Large heterogeneous networked teams (1000s of robots, humans, software agents) exchanging beliefs to maintain a joint understanding of a situation
• Controlling information dynamics in large, concurrent, networked systems (1000s of nodes)
  • e.g. human/robot/agent teams, sensor networks, computer clusters/clouds, human organizations
• Nodes need accurate information from peers
  • Individuals need the right information at the right time
  • Enhanced task performance/planning/increased accuracy of computations
• Simultaneously minimize bandwidth usage
  • Can't broadcast everything to everyone (exponential explosion of messages)
• Investigate belief sharing dynamics to:
  • Mitigate adverse emergent effects
  • Leverage beneficial emergent effects
Information Fusion Model
• Single fact of interest to all members of a networked team
• Small number of agents with sensor access
• Agents communicate only conclusions (T, F) about the fact
• Agents use a Bayesian filter to combine sensor data and conclusions communicated by neighbors
• The same information can be received through multiple intermediaries
• System is characterized by information cascades
• System dynamics are highly non-linear
• Potential vulnerability: a single additional input can cause large portions of the team to draw the wrong conclusion, despite high sensor accuracy
• Currently examining human networks for such effects (Geo Game)
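The Bayesian filtering step in this model can be sketched for a single agent. This is an illustration under assumptions, not the authors' code: a neighbor's conclusion is treated like a sensor whose reliability is the trust parameter cp, the agent concludes T or F once its belief crosses a threshold, and it keeps its previous conclusion while the belief sits between the thresholds. The threshold value 0.8, the cp value used in the usage note, and all names are hypothetical.

```python
def bayes_update(belief, says_true, reliability):
    """Posterior P(fact is true) after one binary report of the given reliability."""
    like_if_true = reliability if says_true else 1.0 - reliability
    like_if_false = 1.0 - reliability if says_true else reliability
    numerator = belief * like_if_true
    return numerator / (numerator + (1.0 - belief) * like_if_false)


class Agent:
    def __init__(self, cp, threshold=0.8):
        self.cp = cp                # assumed trust placed in a neighbor's conclusion
        self.threshold = threshold  # hypothetical decision threshold
        self.belief = 0.5           # uninformed prior
        self.conclusion = None      # None until the belief becomes decisive

    def _update(self, says_true, reliability):
        self.belief = bayes_update(self.belief, says_true, reliability)
        new = self.conclusion
        if self.belief >= self.threshold:
            new = True
        elif self.belief <= 1.0 - self.threshold:
            new = False
        changed = new != self.conclusion
        self.conclusion = new
        return changed  # a changed conclusion is what would be sent to neighbors

    def sense(self, reading, accuracy):
        """Incorporate a direct sensor reading."""
        return self._update(reading, accuracy)

    def hear(self, neighbor_conclusion):
        """Incorporate a conclusion communicated by a neighbor."""
        return self._update(neighbor_conclusion, self.cp)
```

With cp = 0.7, one 55%-accurate sensor reading moves the belief to 0.55, and two agreeing neighbor conclusions push it past 0.8, at which point the agent adopts the conclusion and would pass it on. This is the mechanism by which one input can set off a cascade.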
Optimal System Performance
Studied belief sharing in concurrent networked systems
• Empirically discovered an extraordinarily efficient pattern of stochastic information exchange called scale-invariant dynamics
• Each node receives the most relevant information from every other node, e.g.
  • Initially 20% of nodes know the truth of a fact with 55% accuracy
  • After exchanging a number of messages that is O(number of nodes), 99% of nodes know the truth
What is scale-invariant dynamics?
[Figure: network of agents and sensors; a sensed value T spreads from agent to agent in cascades]
Dynamics: cascade distribution P(c)
• System dynamics characterized by "avalanches"
  • Chains of state changes precipitated by a single sensor input
[Figure: P(c) versus cascade size c, with c ranging from 1 to 1000]
Scale Invariant Dynamics
[Figure: ln(# of cascades of size c) versus ln(cascade size c)]
Performance relative to cp
• Dramatic performance peak!
• cp ("trust") is linearly related to α, the average number of new branches in a conclusion cascade
[Figure labels: Reliability, C80, cp]
Performance peak: Network Variation
• Position of peak varies with network details
Decentralized Adaptation of Dynamics: inducing scale invariance
• Identified the parameter α, the average number of neighbors that change belief upon receipt of a new sensed fact
• α is linearly related to cp and can be detected locally
• α = 1 corresponds to scale-invariant dynamics
• The local algorithm DACOR dynamically adapts to changes in the network by changing α to induce scale-invariant dynamics
• DACOR gives performance on par with systems where cp was manually tuned
Glinton, Scerri, Sycara, "Exploiting Scale Invariant Dynamics for Efficient Information Propagation in Large Teams", AAMAS 2010, 2nd place for Best Paper Award
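A simple branching process gives some intuition for why α = 1 is the special value. The sketch below is an analogy, not DACOR and not the team model itself: each agent that changes belief is assumed to flip each of a fixed number of neighbors independently with probability α / neighbours, so α is the mean number of new branches. For α < 1 the mean cascade size of such a process is 1 / (1 - α); as α approaches 1 that mean diverges and cascades of all sizes appear. The function name, the neighbor count and the size cap are hypothetical.

```python
import random


def cascade_size(alpha, neighbours=4, rng=random, cap=10_000):
    """Total number of agents that change belief in one simulated cascade."""
    active, size = 1, 1  # the agent that received the sensor input
    while active and size < cap:
        # Each currently changing agent independently flips each of its neighbours.
        flips = sum(rng.random() < alpha / neighbours
                    for _ in range(active * neighbours))
        active = flips
        size += flips
    return size
```

Averaging many runs with α = 0.5 gives a mean cascade size close to 2, while runs with α near 1 occasionally produce very large cascades, which is why the cap is there.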
More Realistic Models
• Asynchronous communication
  • Studied different models of message delay (e.g. fixed, normally distributed)
• Multiple facts with statistical dependencies
  • Probability of a fact being true depends on other facts of concern to other agents
• Scale-invariant effect persists
Why Scale Invariant Dynamics Results in Efficient Information Exchange
• Analytical explanation for the efficiency of scale-invariant dynamics
• Discovered how non-linear information exchange between agents transforms sensor readings into agent beliefs
• Found the system "response function", in the language of controls and signal processing
• We show that, for scale-invariant information exchange, the system response to an input sensor reading (how much it is weighted in the belief of a typical agent) is linear in the consistency (or relevance) of the reading to the beliefs of all other agents
Signal Processing View of Information Exchange
[Figure: a signal-processing filter shown alongside the multi-agent system]
Convolution
[Figure: sensor readings s1 to s9 paired with weights w1 to w9]
Multi-agent Convolution
[Figure: network of agents and sensors; labels T and S1]
Multi-agent Convolution
[Figure: network of agents and sensors; labels F, S1 and S2]
Cascade sequence distribution
• Scale-invariant cascade distribution: the probability that sensor reading s_t will be incorporated into the belief of an agent
• This probability is a linear function of the "coherence" of the sensor readings s_1 to s_t
Expected Number of Messages Exchanged
[Equation not preserved: expected number of messages M for n sensor readings]
Summary of Analytical Result
Scale-invariant dynamics:
• The number of agents that receive a sensor reading is a linear function of the relevance of that sensor reading to all agents
• Accuracy is increased because agents receive only the most relevant sensor readings
• The expected number of messages is on the order of the number of sensor readings
Geo Game: Model
• Agent network: G = (V, E)
  • V: set of agents
  • E: edges connecting agents
• Map: M = (L, R)
  • L: set of locations
  • R: set of roads connecting locations
• I = (i1, i2, ..., ik): set of items
• Each agent is assigned a set of items to collect
• Items are randomly distributed at different locations
• The distance of a road is the actual physical distance plus a random number
• Agents need to explore locations to collect their assigned items
Agent Coordination
• Agents share information about
  • items at locations
  • costs of roads
• Coordination in exploring unknown sites
  • Avoid exploring the same unknown sites
• Coordination in collecting items
  • Avoid conflicts when collecting the same items
  • If one item is available, the closest agent collects it
  • If multiple items are available, minimize the total cost of collecting them
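The item-collection rule can be illustrated as a small assignment problem. The sketch assumes that each available item goes to a different agent, that travel costs are already known, and that there are at least as many agents as items; it uses brute force, which is only sensible for a handful of agents. The function name and the cost-matrix layout are hypothetical, and the slides do not say how the game actually computes the assignment.

```python
from itertools import permutations


def assign_items(cost):
    """Return (total_cost, agents) where agents[i] collects item i.

    cost[a][i] is the travel cost for agent a to reach item i.
    Every ordered choice of distinct agents is tried, so the first
    minimum found is kept when several assignments tie.
    """
    n_agents, n_items = len(cost), len(cost[0])
    best = None
    for chosen in permutations(range(n_agents), n_items):
        total = sum(cost[a][i] for i, a in enumerate(chosen))
        if best is None or total < best[0]:
            best = (total, chosen)
    return best
```

With a single item this reduces to "the closest agent collects it"; with `cost = [[4, 1], [2, 3], [5, 5]]` the cheapest plan sends agent 1 to item 0 and agent 0 to item 1 for a total cost of 3.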
Results: Depth of communication
• Agents broadcast information about items at locations
• The broadcast depth varies from 1 to 6 hops
• Deeper broadcast increases performance
• After 3 or 4 hops, the benefit of more hops is small
• In the following experiments, the number of hops is set to 4
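A hop-limited broadcast over the agent network can be sketched with a breadth-first traversal. The adjacency-dictionary representation and the function name are assumptions for illustration; the slides do not describe how the game implements relaying.

```python
from collections import deque


def reached_within(graph, source, max_hops):
    """Map each agent that receives the broadcast to its hop distance.

    graph maps an agent to the list of its neighbours in G = (V, E).
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # the message is not relayed beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops
```

On a chain of six agents, a 4-hop broadcast from one end reaches five of them; the sixth is one hop too far.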
Results: Degree of information sharing
• Agents broadcast information to neighbors, which is then passed on to neighbors' neighbors, and so on for 4 hops
• Random: agent picks a random location to explore
• Myopic greedy: agent picks a nearby location that has the desired item; if no nearby location has the desired item, it picks a random one
• Global knowledge: each agent shares all information with all other agents
Does cooperation really help team performance?
• Cooperation might worsen team performance when there is too much uncertainty
• Road distance: R = R0 + r(u, d)
  • R0 = 1
  • u and d are randomly chosen from 0 to 200
  • Random part dominates fixed distance
• Divide 10000 steps into 10 stages of 1000 steps each
• Broadcasting road costs performs worse in the first 2 stages; after the 3rd stage, it performs better
Plans for Next Year
• Continue exploring scalability of path planning algorithms
• Compare results of network simulations with human performance in the Geo Game
• Insert cognitive models in the large-scale network simulations
• Explore interactions of economically rational agents in the simulations
• Explore the utility and potential applications of scale-invariant dynamics
  • Increase speed and accuracy of concurrent computations: self-coordinated to potentially avoid e.g. deadlock, process starvation
  • Coordinate information fusion processes from unstructured information sources
  • Robots coordinate requests for assistance to human operator so as to benefit their whole team (integrate with call center)
  • Regulate information exchange in human/machine systems: improve human task performance by providing the most relevant information from other nodes (integrate and compare with Geo Game)
Publications
Glinton, R., Sycara, K., Scerri, P. "Exploiting Scale Invariant Dynamics for Efficient Information Propagation in Large Teams", Proceedings of the 2010 Conference on Autonomous Agents and Multi-Agent Systems, Toronto, Canada, May 2010 (Runner Up for Best Paper Award).
Lisy, V., Zivan, R., Sycara, K., Pechoucek, M. "Deception in Networks of Mobile Sensing Agents", Proceedings of the 2010 Conference on Autonomous Agents and Multi-Agent Systems, Toronto, Canada, May 2010.
Scerri, P., Velagapudi, P., Sycara, K. "Analyzing the Theoretical Performance of Information Sharing", in Dynamics of Information Systems: Theory and Applications, Springer, 2010.
Paruchuri, P., Varakantham, P., Sycara, K., Scerri, P. "Effect of Human Biases on Human Agent Teams", International Conference on Intelligent Agent Technology, Toronto, Canada, Aug 30-Sep 3, 2010.
Velagapudi, P., Sycara, K., Scerri, P. "Decentralized Prioritized Planning in Large Multirobot Teams", IROS, Taipei, Taiwan, October 18-22, 2010.
Chakraborty, N., Sycara, K. "Reconfiguration Algorithms for Mobile Robotic Networks", ICRA, Alaska, May 8-11, 2010.
Publications (continued)
Zivan, R., Glinton, R., Sycara, K. "Distributed Constraint Optimization for Large Teams of Mobile Sensing Agents", Proceedings of the International Conference on Intelligent Agent Technology, Milan, Italy, September 15-18, 2009.
Yu, B., Li, C., Sycara, K. "An Incentive Mechanism for Message Relaying in Unstructured Peer to Peer Systems", Electronic Commerce Research and Applications, Volume 8, Issue 6, November-December 2009, pages 315-326.
Paruchuri, P., Glinton, R., Sycara, K., Scerri, P. "Effect of Humans on Belief Propagation in Large Heterogeneous Teams", in Hirsch, M., Pardalos, P. and Murphy, R. (eds), Dynamics of Information Systems, Springer, 2009.
Glinton, R., Paruchuri, P., Scerri, P., Sycara, K. "Self-organized Criticality of Belief Propagation in Large Heterogeneous Teams", in Hirsch, M., Pardalos, P. and Murphy, R. (eds), Dynamics of Information Systems, Springer, 2009.
Velagapudi, P., Prokopyev, O., Scerri, P., Sycara, K. "A Token-Based Approach to Sharing Beliefs in a Large Multiagent Team", Control and Information Systems, Springer, 2009.
Glinton, R., Sycara, K., Scerri, D., and Scerri, P. "The Statistical Mechanics of Belief Sharing in Multi Agent Systems", International Journal Information Fusion. DOI: 10.1016/j.inffus.2009.09.003
Publications (continued)
Velagapudi, P., Prokopyev, O., Sycara, K., Scerri, P. "Analyzing the Performance of Randomized Information Sharing", AAMAS 2009, Budapest, Hungary, May 15-19, 2009.
Glinton, Scerri, Paruchuri, Sycara "Analysis and Design of Information Dynamics in Large Scale Networked Systems", International Conference on the Dynamics of Information Systems, Gainesville, FL, Jan 28-30, 2009.
Paruchuri, Glinton, Scerri, Sycara "The Role of Humans in Information Propagation in Large Scale Heterogeneous Teams", International Conference on the Dynamics of Information Systems, Gainesville, FL, Jan 28-30, 2009.
Okamoto, S., Sycara, K. and Scerri, P. "Software Personal Assistants for Human Organizations", in Virginia Dignum (ed), Multi Agent Systems: Semantics and Dynamics of Organizational Models, IGI, ISBN: 1-60566-256-9, February 2009.
Glinton, R., Scerri, P., Sycara, K. "Towards the Understanding of Information Dynamics in Large Scale Networked Systems", Twelfth International Conference on Information Fusion 2009, Seattle, WA, July 6-9, 2009.
Owens, S., Sycara, K., Scerri, P. "Using Immersive 3D Terrain Models for Fusion of UAV Surveillance Imagery", in Proceedings of AIAA, 2009.
Publications (continued)
Chechetka, A., Sycara, K., Scerri, P. "Insights into the Impact of Social Networks on Evolutionary Games", in A. Yang, Y. Shin (eds.), Application of Complex Adaptive Systems, IDEA Group Inc, 2008.
Glinton, R., Scerri, P., Sycara, K. "Agent Organized Networks Redux", Proceedings of the AAAI, Chicago, IL, July 13-17, 2008.
Simonetto, A., Scerri, P., Sycara, K. "A Mobile Network for Mobile Sensors", Proceedings of the International Conference on Information Fusion, Cologne, Germany, June 30-July 3, 2008.
Okamoto, S., Scerri, P., Sycara, K. "The Impact of Vertical Specialization on Hierarchical Multi-Agent Systems", Proceedings of the AAAI, Chicago, IL, July 13-17, 2008.
Glinton, R., Scerri, P., Sycara, K. "Agent-Based Sensor Coalition Formation", Proceedings of the International Conference on Information Fusion, Cologne, Germany, June 30-July 3, 2008.
Xu, Y., Scerri, P., Lewis, M. and Sycara, K. "Information Sharing among Large Scale Teams", in Proceedings of FUSION'08, 2008.