Software Multiagent Systems: Lecture 10 Milind Tambe University of Southern California tambe@usc.edu
Announcements
• From now on, slides posted on our class web site (Password: teamcore)
• Homework answers will be sent out by email next week
DCOP Definition
[Cost table for each link: f(di,dj) takes the costs 1, 2, 2, 0 over the four combinations of the two domain values]
• Variables {x1, x2, …, xn} distributed among agents
• Domains D1, D2, …, Dn
• Link cost functions fij: Di x Dj → N
• Find assignment A* s.t. F(A*) is minimal, where F(A) = Σ fij(di, dj) with xi ← di, xj ← dj in A
[Figure: three example assignments to x1…x4 over the constraint graph, with total costs 0, 4, and 7]
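As a concrete illustration of the objective F(A), here is a minimal Python sketch that sums link costs over the edges of a constraint graph. The edge set, the value labels "w"/"b", and the placement of the 1/2/2/0 costs below are illustrative assumptions, not necessarily the exact example used in the slides.

```python
# Minimal sketch: evaluating the DCOP objective F(A) = sum of f_ij(d_i, d_j)
# over all links. The edge set, value labels, and cost table are assumptions
# made for illustration; they may differ from the slides' example graph.

F_LINK = {               # one shared link cost table with costs 1, 2, 2, 0
    ("w", "w"): 1,
    ("w", "b"): 2,
    ("b", "w"): 2,
    ("b", "b"): 0,
}

EDGES = [("x1", "x2"), ("x1", "x3"), ("x2", "x3"), ("x2", "x4")]  # hypothetical graph

def total_cost(assignment):
    """F(A): sum the link cost f_ij over every edge (xi, xj)."""
    return sum(F_LINK[(assignment[i], assignment[j])] for i, j in EDGES)

print(total_cost({"x1": "b", "x2": "b", "x3": "b", "x4": "b"}))  # 0 under these assumptions
```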
Branch and Bound Search • Familiar with branch and bound search?
Synchronous Branch and Bound (Hirayama97)
[Cost table as before; figure: a chain of agents x1…x4 extending a partial solution, with costs 0, 1, 3, and 4 = UB shown along the search]
• Agents prioritized into a chain
• Choose a value, send the partial solution (with cost) to the child
• When the cost exceeds the upper bound, backtrack
• An agent explores all its values before reporting to its parent
Concurrency? Asynchrony?
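The sketch below is a minimal, centralized Python simulation of the synchronous branch-and-bound idea on a chain (the chain order, constraint edges, and cost table are assumptions carried over from the sketch above, not the slide's exact example): each agent in priority order tries its values, the accumulated cost is passed down, and the search backtracks as soon as the cost reaches the current upper bound.

```python
# Centralized simulation of synchronous branch and bound down a chain of agents.
# Chain order, domains, edges, and cost table are illustrative assumptions.
import math

AGENTS = ["x1", "x2", "x3", "x4"]                       # priority chain order
DOMAIN = ["w", "b"]
F_LINK = {("w", "w"): 1, ("w", "b"): 2, ("b", "w"): 2, ("b", "b"): 0}
EDGES = [("x1", "x2"), ("x2", "x3"), ("x3", "x4")]      # assumed chain constraints

def cost_with_ancestors(agent, value, partial):
    """Cost added by 'agent = value' against already-assigned ancestors."""
    total = 0
    for i, j in EDGES:
        if j == agent and i in partial:
            total += F_LINK[(partial[i], value)]
        elif i == agent and j in partial:
            total += F_LINK[(value, partial[j])]
    return total

def sbb(level=0, partial=None, cost=0, upper=math.inf):
    """Each agent tries all its values; backtrack when cost reaches the upper bound."""
    if partial is None:
        partial = {}
    if level == len(AGENTS):
        return cost, dict(partial)                      # complete solution found
    agent, best, best_sol = AGENTS[level], upper, None
    for value in DOMAIN:
        new_cost = cost + cost_with_ancestors(agent, value, partial)
        if new_cost >= best:                            # prune: exceeds upper bound
            continue
        partial[agent] = value
        sub_cost, sub_sol = sbb(level + 1, partial, new_cost, best)
        if sub_cost < best:
            best, best_sol = sub_cost, sub_sol
        del partial[agent]
    return best, best_sol

print(sbb())   # optimal cost and assignment under the assumed chain (cost 0, all "b")
```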
DCOP before ADOPT • Branch and Bound • Backtrack condition - when cost exceeds upper bound • Problem – sequential, synchronous • Asynchronous Backtracking • Backtrack condition - when constraint unsatisfiable • Problem - only hard constraints allowed • Observation: Backtrack only when sub-optimality is proven
Adopt: Idea #1
• Weak backtracking: backtrack when the lower bound gets too high
Why lower bounds?
• Allows asynchrony!
• Yet allows quality guarantees
Downside?
• Backtrack before sub-optimality is proven
• Can't throw away solutions; need to revisit!
Adopt: Idea #2 • Solutions need revisiting • How could we do that? • Remember all previous solutions • Efficient reconstruction of abandoned solutions
Adopt Overview
[Figure: constraint graph over x1…x4 and its DFS tree ordering]
• Agents are ordered in a DFS tree
• Constraint graph need not be a tree
Adopt Overview
[Figure: constraint graph over x1…x4 (cost table as before) and its DFS tree, with VALUE, COST, and THRESHOLD message flows]
• Agents concurrently choose values
• VALUE messages sent down
• COST messages sent up, only to parent
• THRESHOLD messages sent down, only to child
Asynchronous, concurrent search
[Example over x1…x4, cost table as before]
• Each variable has two values: b and w
• Each is initialized with a lower bound of 0
Asynchronous, concurrent search (example trace)
[Figure: successive snapshots of the search over x1…x4]
• Concurrently choose values, send to descendants
• Concurrently report local costs, with context, e.g. x3 sends cost 2 with context x1=b, x2=b
• x1 switches to a "better?" value
• x2, x3 switch to their best values, report cost, with context
• x2 disregards x3's report (context mismatch)
• … eventually the optimal solution is reached
Asynchronous, concurrent search
Algorithm:
• Agents are prioritized into a tree
• Each agent:
• Initializes the lower bounds of its values to zero
• Concurrently chooses a value and sends it to all connected descendants
• Chooses the best value given what its ancestors chose, and immediately sends a cost message to its parent
• Cost = lower bound + cost with ancestors
• Costs asynchronously reach the parent
• Asynchronous costs: context attachment
(A schematic sketch of this message handling follows below.)
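Below is a schematic Python sketch of this message handling. It assumes a send(dest, msg) callback provided by some hypothetical messaging layer, and the class and attribute names are invented for illustration; thresholds, upper bounds, THRESHOLD messages, and termination detection from full Adopt are all omitted. The key point it illustrates is context attachment: COST reports carry the context they were computed in, so a parent can disregard stale reports (the "context mismatch" in the trace above).

```python
# Schematic sketch of an Adopt-style agent: VALUE messages flow down to
# connected descendants, COST messages (a lower bound plus its context) flow
# up to the parent. Simplified: no thresholds, upper bounds, or termination.

class AdoptAgentSketch:
    def __init__(self, name, domain, link_cost, parent, descendants, children):
        self.name = name
        self.domain = domain                   # e.g. ["w", "b"]
        self.link_cost = link_cost             # f(my_value, ancestor_value)
        self.parent = parent                   # parent in the DFS tree, or None
        self.descendants = descendants         # linked descendants (receive VALUEs)
        self.children = children               # tree children (their COSTs arrive here)
        self.context = {}                      # current view of ancestors' values
        # child_lb[d][c]: best lower bound reported by child c while this
        # agent's value was d (0 until a compatible COST message arrives)
        self.child_lb = {d: {c: 0 for c in children} for d in domain}
        self.value = domain[0]

    def delta(self, d):
        """Local cost of value d against the ancestors in the current context."""
        return sum(self.link_cost(d, v) for v in self.context.values())

    def lower_bound(self, d):
        return self.delta(d) + sum(self.child_lb[d].values())

    def on_value(self, sender, value, send):
        if self.context.get(sender) != value:  # context changed: reset child bounds
            self.context[sender] = value
            self.child_lb = {d: {c: 0 for c in self.children} for d in self.domain}
        self.backtrack(send)

    def on_cost(self, child, reported_lb, child_context, send):
        # Accept the report only if it was computed in a compatible context.
        my_view = dict(self.context, **{self.name: self.value})
        if all(my_view.get(k, v) == v for k, v in child_context.items()):
            self.child_lb[self.value][child] = max(
                self.child_lb[self.value][child], reported_lb)
        self.backtrack(send)

    def backtrack(self, send):
        # Weak backtracking (no thresholds here): sit on the value with the
        # smallest lower bound, tell descendants, report the bound to the parent.
        self.value = min(self.domain, key=self.lower_bound)
        for d in self.descendants:
            send(d, ("VALUE", self.name, self.value))
        if self.parent is not None:
            lb = min(self.lower_bound(d) for d in self.domain)
            send(self.parent, ("COST", self.name, lb, dict(self.context)))
```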
Weak Backtracking
• Suppose the parent has two values, "white" and "black"
• Explore "white" first: LB(w) = 0, LB(b) = 0
• Receive a cost msg: LB(w) = 2, LB(b) = 0
• Now explore "black": LB(w) = 2, LB(b) = 0
• Receive another cost msg: LB(w) = 2, LB(b) = 3
• Go back to "white": LB(w) = 2, LB(b) = 3
• … until the termination condition is true: LB(w) = 10 = UB(w), LB(b) = 12
Key Lemma for soundness/correctness
Lemma: Assuming no context change, an agent's reported cost is non-decreasing and is never greater than the actual cost.
Inductive proof sketch: Leaf agents never overestimate cost. Each agent sums the costs from its children, chooses its best value, and reports to its parent.
[Example, cost table as before: x2 receives costs from its children and computes a total cost of 2 + 1 + 2 = 5. Reporting 5 would be an OVERestimate! Instead, x2 switches to its unexplored value and reports a lower bound.]
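A more formal reading of the lemma, with notation introduced here for illustration (not taken from the slides): write LB_i(t) for the lower bound agent x_i reports at time t, and OPT_i for the actual optimal cost of x_i's subproblem under the fixed context.

```latex
% Paraphrase of the lemma: with the context fixed, reported bounds only
% tighten over time and never overshoot the true cost.
\[
  t \le t' \;\Rightarrow\; LB_i(t) \le LB_i(t'),
  \qquad\text{and}\qquad
  LB_i(t) \le OPT_i \ \text{for all } t .
\]
```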
Revisiting Abandoned Solutions
Problem:
• reconstructing from scratch: inefficient
• remembering solutions: expensive
Solution:
• remember only lower bounds: polynomial space
• use lower bounds to efficiently re-search
[Figure, chain ordering: parent with lower bound = 10 sends threshold = 10 to its single child]
Revisiting Abandoned Solutions
Solution:
• remember only lower bounds – polynomial space
• use lower bounds to efficiently re-search
Suppose the parent has two values, "a" and "b":
• Explore "a" first: LB(a) = 10, LB(b) = 0
• Now explore "b": LB(a) = 10, LB(b) = 11
• Return to "a": send threshold = 10 to the single child
Backtrack Thresholds
• Agent i received threshold = 10 from its parent
• Explore "white" first: LB(w) = 0, LB(b) = 0, threshold = 10
• Receive a cost msg: LB(w) = 2, LB(b) = 0, threshold = 10
• Stick with "white": LB(w) = 2, LB(b) = 0, threshold = 10
• Receive more cost msgs: LB(w) = 11, LB(b) = 0, threshold = 10
• Now try black
Key point: don't change value until LB(current value) > threshold (see the sketch below).
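A minimal Python sketch of the key point above (function and variable names are made up for illustration; in Adopt the actual bookkeeping is per value and per child): keep the current value until its lower bound exceeds the threshold, then switch to the value with the smallest lower bound.

```python
# Backtrack-threshold rule: don't abandon the current value until its
# lower bound exceeds the threshold handed down by the parent.

def choose_value(current, lb, threshold):
    """lb: dict mapping each value to its current lower bound."""
    if lb[current] <= threshold:
        return current                  # stick with the current value
    return min(lb, key=lb.get)          # switch only once LB(current) > threshold

# Trace matching the slide: threshold = 10, values "w" and "b".
print(choose_value("w", {"w": 2, "b": 0}, 10))    # -> "w"  (stick with white)
print(choose_value("w", {"w": 11, "b": 0}, 10))   # -> "b"  (LB(w) > threshold)
```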
Tree Ordering
• Parent has lower bound = 10 but multiple children: how should the threshold be split (thresh = ?, thresh = ?)?
• Idea: rebalance the threshold over time
• Time T1: thresh = 5 to each child
• Time T2: one child reports cost = 6
• Time T3: thresholds rebalanced to 6 and 4
(A minimal sketch of this rebalancing follows below.)
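The following is a minimal sketch of the rebalancing idea only (the allocation rule and names are assumptions; Adopt's real THRESHOLD bookkeeping maintains invariants per value and per child): start from an even split of the parent's threshold, then shift allocation toward any child whose reported lower bound already exceeds its share.

```python
# Sketch of threshold rebalancing across multiple children. Simplified
# illustration; assumes the children's lower bounds never exceed the
# parent's threshold in total.

def rebalance(threshold, child_lb):
    """child_lb: dict child -> reported lower bound.
    Returns child -> threshold share, summing to the parent's threshold."""
    n = len(child_lb)
    alloc = {c: threshold // n for c in child_lb}      # start from an even split
    first = next(iter(alloc))
    alloc[first] += threshold - sum(alloc.values())    # absorb rounding leftover
    for c, lb in child_lb.items():
        if alloc[c] < lb:                              # this child needs more
            need, alloc[c] = lb - alloc[c], lb
            for other in alloc:                        # take slack from the others
                if other != c and need > 0:
                    give = min(need, alloc[other] - child_lb[other])
                    alloc[other] -= give
                    need -= give
    return alloc

print(rebalance(10, {"c1": 0, "c2": 0}))   # {'c1': 5, 'c2': 5}  (time T1)
print(rebalance(10, {"c1": 6, "c2": 0}))   # {'c1': 6, 'c2': 4}  (time T3)
```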
Evaluation of Speedups
• Conclusions
• Adopt's asynchrony and parallelism yield significant efficiency gains
• Sparse graphs (density 2) are solved optimally and efficiently by Adopt
Metric: Cycles
• Cycle = one unit of algorithm progress in which all agents receive incoming messages, perform computation, and send outgoing messages
• Independent of machine speed, network conditions, etc.
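As an illustration of how this metric might be counted in a simple simulator (the agent interface and the quiescence-based stopping rule are assumptions, not the evaluation code behind these slides):

```python
# Counting cycles in a synchronous simulation: on every cycle each agent
# consumes the messages delivered so far, computes, and emits messages that
# arrive on the next cycle. Assumes agents expose step(inbox) -> [(dest, msg)].

def run_and_count_cycles(agents, max_cycles=10_000):
    inboxes = {name: [] for name in agents}
    cycles = 0
    while cycles < max_cycles:
        outgoing = []
        for name, agent in agents.items():
            inbox, inboxes[name] = inboxes[name], []   # deliver, then clear
            outgoing.extend(agent.step(inbox))         # compute + send
        cycles += 1
        if not outgoing:                               # quiescence: nothing left to send
            break
        for dest, msg in outgoing:                     # messages arrive next cycle
            inboxes[dest].append(msg)
    return cycles
```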
Number of Messages • Conclusion • Communication grows linearly • only local communication (no broadcast)
Bounded error approximation
[Figure: root with lower bound = 10 uses thresh = 10 + b]
• Motivation: quality control for approximate solutions
• Problem: the user provides an error bound b
• Goal: find any solution S where cost(S) ≤ cost(optimal solution) + b
• Adopt's ability to provide quality guarantees naturally leads to bounded error approximation!
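In terms of the threshold mechanism, one way to see why this works (a paraphrase under the usual Adopt termination condition, not a quote from the slides): the root relaxes its threshold by b and may stop as soon as its upper bound is within b of its lower bound.

```latex
% Bounded-error termination at the root (paraphrased): with
% threshold_root = LB_root + b, stopping once UB_root <= LB_root + b gives
\[
  \mathrm{cost}(S) \;\le\; UB_{\mathrm{root}} \;\le\; LB_{\mathrm{root}} + b
  \;\le\; \mathrm{cost}(A^*) + b .
\]
```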
Evaluation of Bounded Error
• Conclusion
• Varying b is an effective way to trade off time-to-solution against solution quality
Adopt summary – Key Ideas
• First-ever optimal, asynchronous algorithm for DCOP
• polynomial space at each agent
• Weak backtracking
• lower-bound-based search method
• parallel search in independent subtrees
• Efficient reconstruction of abandoned solutions
• backtrack thresholds to control backtracking
• Bounded error approximation
• sub-optimal solutions faster
• bound on worst-case performance
Discussion • Can we improve Adopt efficiency? • Can we allow n-ary constraints in Adopt? • Does Adopt preserve privacy? • What are some key applications of Adopt?
New Ideas for Efficiency • Communication Structure • Idea: Reach a solution faster if end-to-end messaging is shorter • Application: Shorter depth trees in ADOPT • Intelligent Preprocessing of Bounds • PASSUP heuristic: bounds via one-time message up the tree • PASSUP extended via a framework of several preprocessing heuristics
Performance (EAV) Orders of Magnitude Speedup!
OptAPO (2004)
J. Davin and P. J. Modi, "Impact of Problem Centralization in Distributed Constraint Optimization Algorithms," Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2005.
Defining DCOP Centralization
• Centralization: aggregating problem information into a single agent, where
• the information was initially distributed among multiple agents, and
• aggregation results in a larger local search space
• For example, constraints on external variables can be centralized.
Motivation
• Adopt and OptAPO:
• Adopt does no centralization.
• OptAPO does partial centralization.
• OptAPO completes in fewer cycles than Adopt for graph coloring
• But cycles do not capture the performance differences between algorithms with different levels of centralization
Metric: Cycles
• What is missing when we measure cycles?
Key Questions • How do we measure performance of DCOP algorithms that differ in their level of centralization? • How do Adopt and OptAPO compare when we use such a measure?
Results
• Tested on graph coloring problems, |D| = 3 (3-coloring)
• # Variables = 8, 12, 16, 20, with link density 2n or 3n
• 50 randomly generated problems for each size
• CCCs vs. cycles: OptAPO takes fewer cycles, but performs more constraint checks