This paper explores the impact of increased teamwork on multi-agent system outcomes, focusing on protocols and algorithms for distributed constraint optimization problems. The research delves into the complexities of teamwork in various domains, including meeting scheduling, traffic light coordination, and robotics. The study introduces the concept of "k-Optimality" and investigates the Distributed Coordination of Exploration and Exploitation (DCEE) algorithm, emphasizing the balance between exploration and exploitation for maximizing rewards in uncertain environments. Furthermore, the analysis of configuration hypercubes and the assessment of L-Movement provide insights into team coordination strategies. Overall, the study sheds light on the challenges and potential solutions for optimizing teamwork in multi-agent systems.
Lafayette College. Towards a Theoretic Understanding of DCEE. Scott Alfeld, Matthew E. Taylor, Prateek Tandon, and Milind Tambe. http://teamcore.usc.edu
Forward Pointer: When Should There be a “Me” in “Team”? Distributed Multi-Agent Optimization Under Uncertainty. Matthew E. Taylor, Manish Jain, Yanqin Jin, Makoto Yokoo, & Milind Tambe. Wednesday, 8:30 – 10:30, Coordination and Cooperation 1
Teamwork: Foundational MAS Concept • Joint actions improve the outcome • But they increase communication & computation • Over two decades of work • This paper: increased teamwork can harm the team • Even without considering communication & computation • Only considering team reward • Multiple algorithms, multiple settings • But why?
DCOPs: Distributed Constraint Optimization Problems • Multiple domains • Meeting scheduling • Traffic light coordination • RoboCup soccer • Multi-agent plan coordination • Sensor networks • Distributed • Robust to failure • Scalable • (In)Complete • Quality bounds
DCOP Framework [figure: constraint graph over agents a1, a2, a3] • Different “levels” of teamwork possible • Complete solution is NP-hard
DCEE: Distributed Coordination of Exploration and Exploitation • Environment may be unknown • Maximize on-line reward over some number of rounds • Exploration vs. exploitation • Demonstrated on a mobile ad-hoc network • Simulation [Released] & Robots [Released Soon]
DCOP: Distributed Constraint Optimization Problem
DCOP → DCEE Distributed Coordination of Exploration and Exploitation
DCEE Algorithm: SE-Optimistic (will build upon later) • Rewards on [1,200] • “If I move, I’d get R = 200” [figure: chain graph a1–a2–a3–a4 with current link rewards 50, 75, 99]
DCEE Algorithm: SE-Optimistic (will build upon later) • Rewards on [1,200] [figure: chain graph a1–a2–a3–a4 with current link rewards 50, 75, 99; per-agent estimates “If I move, I’d gain 275 / 251 / 101 / 125”] • Explore or exploit?
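The optimistic estimate can be sketched in a few lines (an assumption about the rule, not the authors' released code): each agent imagines every incident link jumping to the maximum possible reward of 200, so its estimated gain is its degree times 200 minus its current link rewards. With the chain rewards 50, 75, 99 this simple model gives gains of 150, 275, 226, and 101 for a1 through a4; the exact per-agent numbers on the slide depend on details of the original diagram that did not survive extraction.

```python
# Sketch of SE-Optimistic gain estimation on a chain a1-a2-a3-a4.
MAX_REWARD = 200

def se_optimistic_gain(agent, edges):
    """Estimated gain if `agent` moves: every incident link is optimistically
    assumed to jump to MAX_REWARD."""
    incident = [r for (u, v), r in edges.items() if agent in (u, v)]
    return len(incident) * MAX_REWARD - sum(incident)

# Chain graph with the slide's current link rewards.
edges = {("a1", "a2"): 50, ("a2", "a3"): 75, ("a3", "a4"): 99}

for a in ["a1", "a2", "a3", "a4"]:
    print(a, se_optimistic_gain(a, edges))  # 150, 275, 226, 101
```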
Success! [ATSN-09][IJCAI-09] • Both classes of (incomplete) algorithms • Simulation and on Robots • Ad hoc Wireless Network (Improvement if performance > 0)
k-Optimality • Increased coordination – originally DCOP formulation • In DCOP, increased k = increased team reward • Find groups of agents to change variables • Joint actions • Neighbors of moving group cannot move • Defines amount of teamwork (Higher communication & computation overheads)
“k-Optimality” in DCEE • k=1, 2, ... • Groups of size k form, those with the most to gain move (change the value of their variable) • A group can only move if no other agents in its neighborhood move
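One round of this group selection can be sketched as follows (a hypothetical greedy implementation under stated assumptions, not the paper's algorithm): candidate groups of size k are ranked by estimated joint gain, and a group is granted its move only if neither it nor its graph neighborhood overlaps an already-granted group.

```python
from itertools import combinations

def select_moving_groups(agents, neighbors, gains, k):
    """Greedily pick non-interfering groups of size k, highest gain first.

    `neighbors` maps each agent to its set of graph neighbors; `gains` maps
    each k-tuple of agents to its estimated joint gain.
    """
    groups = sorted(combinations(agents, k), key=lambda g: gains[g], reverse=True)
    blocked, chosen = set(), []
    for g in groups:
        # A group's "region" is its members plus their neighbors; it may move
        # only if that region is untouched by earlier winners.
        region = set(g) | {n for a in g for n in neighbors[a]}
        if region.isdisjoint(blocked):
            chosen.append(g)
            blocked |= region
    return chosen
```

On a chain a1–a2–a3–a4 with k = 2, granting (a1, a2) blocks a3 as well, so (a3, a4) cannot move in the same round.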
Example: SE-Optimistic-2 • Rewards on [1,200] [figure: chain graph a1–a2–a3–a4 with current link rewards 50, 75, 99; joint-gain calculations for candidate pairs, e.g. 275 + 250 − 150, 200 − 99, 251 + 275 − 150, 101 + 251 − 101, 125 + 275 − 125]
Sample coordination results • Omniscient: confirms DCOP result, as expected [figures: artificially supplied rewards (DCOP), complete graph vs. chain graph]
Physical Implementation • Create Robots • Mobile ad-hoc Wireless Network
Confirms Team Uncertainty Penalty • Averaged over 10 trials each • Trend confirmed! • (Huge standard error) [figure: total gain on chain and complete graphs]
Problem with “k-Optimal” • Unknown rewards • An agent cannot know whether moving will increase reward! • Define new term: L-Movement • The number of agents that can change variables per round • Independent of exploration algorithm • Graph dependent • An alternate measure of teamwork
L-Movement • Example: k = 1 algorithms • L is the size of the largest maximal independent set of the graph • NP-hard to calculate for a general graph • Harder for higher k • Consider ring & complete graphs, both with 5 vertices • Ring graph: maximal independent set is 2 • Complete graph: maximal independent set is 1 • For k = 1 • L = 1 for a complete graph • L = ⌊n/2⌋ for a ring graph with n vertices • General DCOP analysis tool?
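For the small graphs on this slide, L for k = 1 algorithms can be brute-forced directly (a sketch; the exponential search is only viable for tiny graphs, which is exactly the slide's point that the general problem is NP-hard):

```python
from itertools import combinations

def max_independent_set_size(vertices, edges):
    """Size of the largest independent set, by exhaustive search."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            # Independent: no edge has both endpoints inside the subset.
            if not any(u in subset and v in subset for u, v in edges):
                return size
    return 0

ring5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
complete5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]

print(max_independent_set_size(range(5), ring5))      # 2 = floor(5/2)
print(max_independent_set_size(range(5), complete5))  # 1
```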
Configuration Hypercube • No (partial) assignment is believed to be better than another • WLOG, agents can select their next value when exploring • Define configuration hypercube C • Each agent is a dimension • C[v1, …, vn] is the total reward when each agent i takes value vi • C cannot be calculated without exploration; values are drawn from a known reward distribution • Moving along an axis in the hypercube → an agent changing its value • Example: 3 agents (C is 3-dimensional) • Changing from C[a, b, c] to C[a, b, c′] → agent A3 changes from c to c′
How many agents can move? (1/2) • In a ring graph with 5 nodes • k = 1 : L = 2 • k = 2 : L = 3 • In a complete graph with 5 nodes • k = 1 : L = 1 • k = 2 : L = 2
How many agents can move? (2/2) • A configuration C[v1, …, vn] is reachable by an algorithm with movement L in s steps if and only if Σi vi ≤ s·L and maxi vi ≤ s • Example: C[2,2] is reachable for L = 1 iff s ≥ 4
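A plausible reading of this reachability condition (an assumption: a configuration whose coordinates count each agent's value changes is reachable iff the total number of changes fits in s rounds of L movers, and no single agent changes more than s times) can be checked directly:

```python
def reachable(config, L, s):
    """Is configuration C[v1, ..., vn] reachable in s steps with movement L?

    Assumes coordinate v_i counts how many times agent i has changed its
    value, starting from C[0, ..., 0].
    """
    return sum(config) <= s * L and max(config) <= s

print(reachable((2, 2), L=1, s=4))  # True: 4 total changes fit in 4 rounds
print(reachable((2, 2), L=1, s=3))  # False: only 3 single-agent moves fit
```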
L-Movement Experiments • For various DCEE problems, distributions, and L: • For steps s = 1…30: • Construct hypercube with s values per dimension • Find M, the max achievable reward in s steps, given L • Return average of 50 runs • Example: 2D hypercube [figure: two s × s grids] • Only half reachable if L = 1 • All locations reachable if L = 2
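One run of this experiment can be sketched as follows (an assumed reconstruction, not the released code; it takes the reachability test "sum of coordinates ≤ s·L and max coordinate ≤ s" as given, and draws i.i.d. uniform cell rewards):

```python
import itertools
import random

def random_cube(n_agents, s, rng):
    """Hypercube with s values per dimension; i.i.d. uniform cell rewards."""
    return {v: rng.random() for v in itertools.product(range(s), repeat=n_agents)}

def max_reachable_reward(cube, s, L):
    """Best reward among configurations reachable in s steps with movement L."""
    return max(r for v, r in cube.items() if sum(v) <= s * L and max(v) <= s)

rng = random.Random(0)
cube = random_cube(2, 5, rng)
# Larger L can only enlarge the reachable set, so its best reward is >=.
print(max_reachable_reward(cube, 5, 1) <= max_reachable_reward(cube, 5, 2))  # True
```

Averaging `max_reachable_reward` over many random cubes (50 in the slide) gives the curves compared on the next slides.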
Restricting to L-Movement: Complete Graph (L = 1 → 2) • k = 1 : L = 1 • k = 2 : L = 2 [figure: average maximum reward discovered]
Restricting to L-Movement: Ring Graph (L = 2 → 3) • k = 1 : L = 2 • k = 2 : L = 3 [figure: average maximum reward discovered]
Ring vs. Complete • Uniform distribution of rewards • 4 agents • Also with a different normal distribution [figures: results for ring and complete graphs]
k and L: 5-agent graphs • Increasing k changes L less in a ring than in a complete graph • The configuration hypercube is an upper bound • Posit a consistent negative effect • Suggests why increasing k has different effects: • Larger improvement in complete than in ring graphs when increasing k
L-Movement May Help Explain the Team Uncertainty Penalty • An algorithm with L = 2 can explore more of C than one with L = 1 • Independent of exploration algorithm! • Determined by k and graph structure • C is an upper bound – posit a constant negative effect • Any algorithm experiences diminishing returns as k increases • Consistent with DCOP results • L-Movement differs between k = 1 and k = 2 algorithms • Larger difference in graphs with more agents • For k = 1, L = 1 for a complete graph • For k = 1, L increases with the number of vertices in a ring graph
Thank you. Towards a Theoretic Understanding of DCEE. Scott Alfeld, Matthew E. Taylor, Prateek Tandon, and Milind Tambe. http://teamcore.usc.edu