Computational Coalition Formation
Talal Rahwan (University of Southampton, UK)
What is an Agent?
Hello… I am an intelligent entity…
• I am situated in an environment…
• I can perceive the environment
• I have goals…
• I have a list of available actions…
• I act autonomously to satisfy my goals and…
• most importantly… I am social!
Organizations in Multi-Agent Systems
• Hierarchies • Compounds • Markets • Matrix Organizations • Teams • Coalitions • Holarchies • Congregations • Federations • Societies
Coalition Formation: Main Characteristics
• Coalitions in general are goal-directed and short-lived
• No coordination among members of different coalitions
• The organizational structure within each coalition is flat
Applications of Coalition Formation
• Smart energy grids: intelligent appliances and energy storage devices coordinate for optimal energy use
• Electronic commerce: cooperation among buyers to obtain quantity discounts, and among sellers to maintain cartel pricing
• Disaster management: a UN report said: "Efforts by the United Nations in Haiti have lacked sufficient coordination"
Applications of Coalition Formation
• Distributed sensor networks: coalitions of sensors can work together to track targets of interest [Dang et al., 2006]
• Distributed vehicle routing: coalitions of delivery companies can be formed to reduce transportation costs by sharing deliveries [Sandholm and Lesser, 1997]
• Information gathering: several information servers can form coalitions to answer queries [Klusch and Shehory, 1996]
Cooperative Game Theory
• Game theory studies interactions between agents in situations known as games
• Coalition formation is studied in a branch of game theory called cooperative game theory
• In a cooperative game, agents benefit from cooperation, but is that all that makes a game a cooperative one?
Example: the Prisoner’s Dilemma • Two agents committed a crime. • Court does not have enough evidence to convict them of the crime, but can convict them of a minor offence (1 year in prison each) • If one suspect confesses (acts as an informer), he walks free, and the other suspect gets 4 years • If both confess, each gets 3 years • Agents have no way of communicating or making binding agreements
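Prisoners’ Dilemma: Matrix Representation
                 a2: quiet    a2: confess
  a1: quiet      (1, 1)       (4, 0)
  a1: confess    (0, 4)       (3, 3)
• Entries are prison terms in years, taken from the rules of the game, so lower is better
• Interpretation: the pair (x, y) at the intersection of row i and column j means that the row player gets x and the column player gets y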
Prisoners’ Dilemma: the Rational Outcome
• a1’s reasoning:
  • if a2 stays quiet, I should confess (0 years instead of 1)
  • if a2 confesses, I should confess, too (3 years instead of 4)
• a2 reasons in the same way
• Result: both confess and get 3 years in prison
• But: they could have got only 1 year each!
• So why don't they cooperate?
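The reasoning on this slide can be checked mechanically. The sketch below is illustrative only: it encodes the prison terms stated in the example, and the names `YEARS`, `best_response` and `stable_profiles` are hypothetical helpers, not part of the talk. It looks for action pairs in which each agent's action is a best response to the other's.

```python
# Prison terms in years for (a1, a2); lower is better. Values from the example.
YEARS = {
    ("quiet", "quiet"): (1, 1),
    ("quiet", "confess"): (4, 0),
    ("confess", "quiet"): (0, 4),
    ("confess", "confess"): (3, 3),
}
ACTIONS = ("quiet", "confess")


def best_response(player, other_action):
    """Action that minimises `player`'s prison term given the other's action."""
    def years(action):
        profile = (action, other_action) if player == 0 else (other_action, action)
        return YEARS[profile][player]
    return min(ACTIONS, key=years)


def stable_profiles():
    """Action pairs in which each action is a best response to the other."""
    return [
        (s1, s2)
        for s1 in ACTIONS
        for s2 in ACTIONS
        if best_response(0, s2) == s1 and best_response(1, s1) == s2
    ]


print(stable_profiles())  # [('confess', 'confess')]
```

Confessing is the best response to either action of the other agent, which is why (confess, confess) is the only pair that survives.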
Cooperative vs. Non-Cooperative Games
• In non-cooperative games, players cannot make binding agreements
• But what if binding agreements are possible?
• Cooperative games model scenarios where:
  • agents can benefit by cooperating
  • binding agreements are possible
Cooperative Games
• Does a coalition influence other co-existing coalitions?
  • Yes: Partition Function Game (PFG)
  • No: Characteristic Function Game (CFG)
• Can a player transfer part of its utility to another?
  • Yes: Transferable Utility (TU) Game
  • No: Non-Transferable Utility (NTU) Game
Example: Writing Papers
• n researchers working at n different universities can form groups to write papers
• the composition of a group determines the quality of the paper they produce
• each author receives a payoff from his own university:
  • promotion
  • bonus
  • teaching load reduction
• TU or NTU? NTU (payoffs are non-transferable)
• CFG or PFG? CFG (a group does not influence others)
Example: Growing Fruits
• n farmers can cooperate to grow fruit
• Each group grows apples or oranges
• A group of size k can grow f(k) tons of apples, or g(k) tons of oranges, where f() and g() are convex functions of k
• The market price of a fruit drops monotonically as the number of tons available in the market increases
• TU or NTU? TU (money is transferable)
• CFG or PFG? PFG (a group can influence another)
Example: Buying Ice-cream
• n children, each has some money
• A supermarket sells many ice-cream tubs, in different sizes:
  • Type 1 contains 500g, costs $7
  • Type 2 contains 750g, costs $9
  • Type 3 contains 1kg, costs $11
• Children have utility for ice-cream, and don't care about money
• The payoff of a group is the maximum amount of ice-cream the members of the group can buy by pooling their money
• TU or NTU? TU (ice-cream is transferable)
• CFG or PFG? CFG (many available tubs)
How is a Cooperative Game Played? • Although agents work together, they can still be selfish • We need to: • Partition the agents into coalitions • Divide the payoff of each coalition among its members such that the outcome is stable (i.e., no player, or group of players, has an incentive to deviate) • We may also want to ensure that the outcome is fair • We will now see how to formalize these ideas
Cooperative Games
• Does a coalition influence other co-existing coalitions?
  • Yes: Partition Function Game (PFG)
  • No: Characteristic Function Game (CFG) (focus of this talk)
• Can a player transfer part of its utility to another?
  • Yes: Transferable Utility (TU) Game
  • No: Non-Transferable Utility (NTU) Game
Transferable Utility Games Formalized
• A transferable utility game is a pair (A, v), where:
  • A = {a1, ..., an} is the set of players (or agents)
  • v: 2^A → R is the characteristic function
  • for each C ⊆ A, v(C) is the value of C, i.e., the payoff that members of C can attain by working together
• It is usually assumed that:
  • v(Ø) = 0
  • v(C) ≥ 0 for any C ⊆ A
  • v(C) ≤ v(D) for any C, D such that C ⊆ D
• The biggest possible coalition (the one containing all agents) is called the grand coalition
Ice-Cream Game: Characteristic Function
• Budgets: C: $6, M: $4, P: $3
• Tubs: w = 500g for p = $7, w = 750g for p = $9, w = 1000g for p = $11
• v(Ø) = v({C}) = v({M}) = v({P}) = 0
• v({C, M}) = 750, v({C, P}) = 750, v({M, P}) = 500
• v({C, M, P}) = 1000
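The characteristic function above can be derived from the budgets and tub prices. The following is a minimal sketch, assuming a coalition may buy any number of tubs of each type with its pooled money (the slides only say that many tubs are available); `max_grams` and `characteristic_function` are illustrative names.

```python
from itertools import combinations

TUBS = [(500, 7), (750, 9), (1000, 11)]  # (grams, price in $), from the slides
BUDGETS = {"C": 6, "M": 4, "P": 3}       # dollars per child, from the slides


def max_grams(budget, tubs=TUBS):
    """Most ice-cream (in grams) that `budget` dollars can buy.

    Assumption: any number of tubs of each type may be bought.
    """
    best = [0] * (budget + 1)
    for b in range(1, budget + 1):
        best[b] = best[b - 1]  # spending one dollar less is always feasible
        for grams, price in tubs:
            if price <= b:
                best[b] = max(best[b], best[b - price] + grams)
    return best[budget]


def characteristic_function(budgets):
    """Map every coalition (as a frozenset) to the value v(C)."""
    players = sorted(budgets)
    v = {}
    for size in range(len(players) + 1):
        for coalition in combinations(players, size):
            pooled = sum(budgets[p] for p in coalition)
            v[frozenset(coalition)] = max_grams(pooled)
    return v


v = characteristic_function(BUDGETS)
print(v[frozenset({"C", "M"})], v[frozenset({"M", "P"})], v[frozenset(BUDGETS)])
# 750 500 1000
```

With budgets of $4, $3 and $3 instead, the same function gives 500 for {C, M} and {C, P}, 0 for {M, P}, and 750 for the grand coalition.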
Transferable Utility Games: Outcome
An outcome of a TU game G = (A, v) is a pair (CS, x), where:
• CS = (C1, ..., Ck) is a coalition structure, i.e., a partition of A into coalitions:
  • C1 ∪ ... ∪ Ck = A
  • Ci ∩ Cj = Ø for i ≠ j
• x = (x1, ..., xn) is a payoff vector, which specifies the payoff of each agent:
  • xi ≥ 0 for all ai ∈ A
  • Σ_{ai ∈ C} xi = v(C) for all C ∈ CS
Superadditive Games
• Definition: a game G = (A, v) is called superadditive iff v(C ∪ D) ≥ v(C) + v(D) for any two disjoint coalitions C and D
• Example: v(C) = |C|^2:
  • v(C ∪ D) = (|C| + |D|)^2 ≥ |C|^2 + |D|^2 = v(C) + v(D)
• In superadditive games, any two coalitions can always merge without losing money; hence, we can assume that players form the grand coalition
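For small games, the definition can be tested by brute force over all pairs of disjoint coalitions. This is an illustrative sketch; `coalitions` and `is_superadditive` are hypothetical helper names, and the characteristic function is passed in as a Python callable on frozensets.

```python
from itertools import combinations


def coalitions(players):
    """Yield every non-empty coalition of `players` as a frozenset."""
    players = list(players)
    for size in range(1, len(players) + 1):
        for combo in combinations(players, size):
            yield frozenset(combo)


def is_superadditive(players, v):
    """True iff v(C ∪ D) >= v(C) + v(D) for all disjoint non-empty C, D."""
    all_coalitions = list(coalitions(players))
    return all(
        v(c | d) >= v(c) + v(d)
        for c in all_coalitions
        for d in all_coalitions
        if not (c & d)
    )


print(is_superadditive("abcd", lambda c: len(c) ** 2))  # True
print(is_superadditive("abcd", lambda c: 1))            # False
```

The second game gives every non-empty coalition the value 1, so merging two disjoint coalitions loses value and the check fails.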
Superadditive Games
• In superadditive games, the grand coalition forms, so the questions to answer change:
  • Non-superadditive game: Which coalition structure is optimal? How should we divide the payoffs?
  • Superadditive game: The grand coalition is always optimal! How should we divide the payoff of the grand coalition?
What Is a Good Outcome? • C: $4, M: $3, P: $3 • v(Ø) = v({C}) = v({M}) = v({P}) = 0 • v({C, M}) = 500, v({C, P}) = 500, v({M, P}) = 0 • v({C, M, P}) = 750 • This is a superadditive game • outcomes are payoff vectors • How should the players share the ice-cream? • if they share as (200, 200, 350), Charlie and Marcie can get more ice-cream by buying a 500g tub on their own, and splitting it equally • the outcome (200, 200, 350) is not stable!
Transferable Utility Games: Stability
• Definition: the core of a game is the set of all stable outcomes, i.e., outcomes that no coalition wants to deviate from. That is: core(G) = {(CS, x) | x(C) ≥ v(C) for any C ⊆ A}, where x(C) = Σ_{ai ∈ C} xi
• Note: G is not necessarily superadditive
• Example:
  • v({a1, a2, a3}) = 9
  • v({a4, a5}) = 4
  • but what if v({a2, a4}) = 7?
  • then (({a1, a2, a3}, {a4, a5}), (3, 3, 3, 3, 1)) is NOT in the core: a2 and a4 together receive 3 + 3 = 6 < 7, so they can deviate
Ice-Cream Game: Core • C: $4, M: $3, P: $3 • v(Ø) = v({C}) = v({M}) = v({P}) = 0, v({C, M, P}) = 750 • v({C, M}) = 500, v({C, P}) = 500, v({M, P}) = 0 • (200, 200, 350) is not in the core: • v({C, M}) > xC + xM • (250, 250, 250) is in the core: • no subgroup of players can deviate so that each member of the subgroup gets more • (750, 0, 0) is also in the core: • Marcie and Pattie cannot get more on their own!
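The three claims on this slide can be verified by listing, for a given payoff vector, every coalition whose value exceeds what its members currently receive. A minimal sketch, using the coalition values stated on the slide; `blocking_coalitions` and `in_core` are illustrative names.

```python
from itertools import combinations

PLAYERS = ("C", "M", "P")  # Charlie, Marcie, Pattie
# Coalition values from the slide; every coalition not listed here is worth 0.
VALUES = {
    frozenset("CM"): 500,
    frozenset("CP"): 500,
    frozenset("CMP"): 750,
}


def v(coalition):
    return VALUES.get(frozenset(coalition), 0)


def blocking_coalitions(x):
    """Coalitions C with v(C) > x(C), i.e. those that gain by deviating."""
    blockers = []
    for size in range(1, len(PLAYERS) + 1):
        for coalition in combinations(PLAYERS, size):
            if v(coalition) > sum(x[p] for p in coalition):
                blockers.append(set(coalition))
    return blockers


def in_core(x):
    """x must share out exactly v(grand coalition) and admit no blocker."""
    return sum(x.values()) == v(PLAYERS) and not blocking_coalitions(x)


print(blocking_coalitions({"C": 200, "M": 200, "P": 350}))  # [{'C', 'M'}] (set order may vary)
print(in_core({"C": 250, "M": 250, "P": 250}))              # True
print(in_core({"C": 750, "M": 0, "P": 0}))                  # True
```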
Games with Empty Core
• The core is a very attractive solution concept
• However, some games have empty cores
• G = (A, v)
  • A = {a1, a2, a3}, v(C) = 1 if |C| > 1 and v(C) = 0 otherwise
• Consider an outcome (CS, x):
  • if CS = ({a1}, {a2}, {a3}), the grand coalition can deviate
  • if CS = ({a1, a2}, {a3}), either a1 or a2 gets less than 1, so can deviate with a3
  • same argument for CS = ({a1, a3}, {a2}) or CS = ({a2, a3}, {a1})
  • suppose CS = ({a1, a2, a3}): x1 + x2 + x3 = 1, so xi > 0 for some ai, hence x(A\{ai}) < 1, yet v(A\{ai}) = 1, so A\{ai} can deviate
ε-Core
• If the core is empty, we may want to find approximately stable outcomes
• Need to relax the notion of the core:
  • core: (CS, x): x(C) ≥ v(C) for all C ⊆ A
  • ε-core: (CS, x): x(C) ≥ v(C) - ε for all C ⊆ A
• Is usually defined for superadditive games only
• Example: G = (A, v), A = {a1, a2, a3}, v(C) = 1 if |C| > 1, v(C) = 0 otherwise
  • 1/3-core is non-empty: (1/3, 1/3, 1/3) ∈ 1/3-core
  • ε-core is empty for any ε < 1/3: xi ≥ 1/3 for some i ∈ {1, 2, 3}, so x(A\{ai}) ≤ 2/3, while v(A\{ai}) = 1
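For a fixed payoff vector, the smallest ε that places it in the ε-core is simply the largest deficit v(C) - x(C) over all coalitions. The sketch below computes that quantity for the three-agent example; `min_epsilon` is an illustrative name, and exact fractions are used to avoid rounding issues. It does not check that x sums to the value of the grand coalition.

```python
from fractions import Fraction
from itertools import combinations

AGENTS = ("a1", "a2", "a3")


def v(coalition):
    """Example game from the slide: any coalition of 2 or more agents is worth 1."""
    return 1 if len(coalition) > 1 else 0


def min_epsilon(x):
    """Smallest ε such that x(C) >= v(C) - ε holds for every non-empty C."""
    worst = Fraction(0)
    for size in range(1, len(AGENTS) + 1):
        for coalition in combinations(AGENTS, size):
            deficit = v(coalition) - sum(x[a] for a in coalition)
            worst = max(worst, deficit)
    return worst


third = Fraction(1, 3)
print(min_epsilon({"a1": third, "a2": third, "a3": third}))              # 1/3
print(min_epsilon({"a1": Fraction(1, 2), "a2": Fraction(1, 2), "a3": 0}))  # 1/2
```

The equal split needs ε = 1/3, matching the slide; any unequal split leaves some pair with less than 2/3 and therefore needs a larger ε.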
What is a Good Outcome?
• Given 3 agents, the set of agents is A = {a1, a2, a3}
• The possible coalitions and their values are:
  • v({a1}) = v({a2}) = v({a3}) = 5
  • v({a1, a2}) = v({a1, a3}) = v({a2, a3}) = 12
  • v({a1, a2, a3}) = 24
• A solution of a coalitional game: a division <? ? ?> of the payoff
• Stability: the core
Characteristic Function Games
• Given 3 agents, the set of agents is A = {a1, a2, a3}
• Coalition values: 5 for each single agent, 12 for each pair, 24 for {a1, a2, a3}
• A solution of a coalitional game: stability, the core
  • A division of payoff such that no sub-coalition wants to deviate
• Example divisions of the 24: <10 7 7> and <13 7 4>
  • <10 7 7> is in the core: every pair receives at least 12 and every agent at least 5
  • <13 7 4> is not: a2 and a3 together receive 11 < 12
• a1: Great! I like this core division!
• a2: Wait! But it is not fair!
• a3: My contribution to every coalition in the game is the same as a1's
Characteristic Function Games
• Given 3 agents, the set of agents is A = {a1, a2, a3}
• Coalition values: 5 for each single agent, 12 for each pair, 24 for {a1, a2, a3}
• A solution of a coalitional game: fairness, the Shapley value
  • A unique division of payoff that meets the fairness criteria (axioms)
• Fairness criteria:
  • Symmetry
  • Null-player
  • Additivity
  • Efficiency
• For this game: <8 8 8>
Shapley Value – Definition
• The Shapley value of an agent is its marginal contribution (MC), averaged over all orderings in which the agents can join the grand coalition
• With values of 5 for each single agent, 12 for each pair and 24 for the grand coalition:
  Ordering      MC(a1)
  a1 a2 a3      +5
  a1 a3 a2      +5
  a2 a1 a3      +7
  a2 a3 a1      +12
  a3 a1 a2      +7
  a3 a2 a1      +12
• Sh(a1) = 48/6 = 8
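The averaging over orderings can be written out directly. This is a brute-force sketch that enumerates all n! orderings, so it only suits small games; `shapley_values` and `VALUE_BY_SIZE` are illustrative names, and the game is the three-agent example (5 per single agent, 12 per pair, 24 for all three).

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, v):
    """Average marginal contribution of each player over all join orders."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            # Marginal contribution of p to the agents that joined before it.
            totals[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: totals[p] / len(orderings) for p in players}


# In the example, a coalition's value depends only on its size.
VALUE_BY_SIZE = {0: 0, 1: 5, 2: 12, 3: 24}
sh = shapley_values(["a1", "a2", "a3"], lambda c: VALUE_BY_SIZE[len(c)])
print(sh["a1"], sh["a2"], sh["a3"])  # 8 8 8
```

The three values sum to 24, the value of the grand coalition, as the efficiency axiom requires.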
Coalition Formation Process
• Coalition structure generation: which coalition structure is optimal?
• Payoff distribution: how is the value of each coalition divided among its members?
Coalition Structure Generation (in Characteristic Function Games)
• Given 3 agents, the set of agents is {a1,a2,a3}
• Input: a value for every possible coalition
  • v({a1}) = 20, v({a2}) = 40, v({a3}) = 30
  • v({a1,a2}) = 70, v({a1,a3}) = 40, v({a2,a3}) = 65
  • v({a1,a2,a3}) = 95
• Output: a coalition structure in which the sum of values is maximized
  • {{a1},{a2},{a3}}: 20 + 40 + 30 = 90
  • {{a1,a2},{a3}}: 70 + 30 = 100
  • {{a2},{a1,a3}}: 40 + 40 = 80
  • {{a1},{a2,a3}}: 20 + 65 = 85
  • {{a1,a2,a3}}: 95
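For three agents the search space is small enough to enumerate. The sketch below is plain exhaustive search, not any of the algorithms discussed later in the talk; `partitions` and `best_structure` are illustrative names, and the coalition values are the ones from this slide.

```python
def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partial


def best_structure(agents, v):
    """Coalition structure with the largest total value, by exhaustive search."""
    def total(structure):
        return sum(v[frozenset(block)] for block in structure)
    best = max(partitions(list(agents)), key=total)
    return total(best), {frozenset(block) for block in best}


V = {
    frozenset({"a1"}): 20, frozenset({"a2"}): 40, frozenset({"a3"}): 30,
    frozenset({"a1", "a2"}): 70, frozenset({"a1", "a3"}): 40,
    frozenset({"a2", "a3"}): 65, frozenset({"a1", "a2", "a3"}): 95,
}
value, structure = best_structure(["a1", "a2", "a3"], V)
print(value)  # 100, attained by {{a1,a2},{a3}}
```

The number of partitions grows very quickly with the number of agents (5 for three agents, 15 for four), which is what motivates the more refined algorithms.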
Exercise
• What is the optimal coalition structure?
• Answer: {{1}, {2}, {3,4}}
Limit the permitted sizes, and use greedy algorithms (Shehory & Kraus 1999, 1996, 1995)
• Example: the possible coalitions of the agents in A = {1, 2, 3, 4, 5, 6}
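Integer programming (IP)
• Each column of Z encodes one coalition and each row one agent; for 4 agents:
       C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 C15
  1     1  0  1  0  1  0  1  0  1   0   1   0   1   0   1
  2     1  1  0  0  1  1  0  0  1   1   0   0   1   1   0
  3     1  1  1  1  0  0  0  0  1   1   1   1   0   0   0
  4     1  1  1  1  1  1  1  1  0   0   0   0   0   0   0
• For example, the column with entries (1, 1, 0, 1) encodes {1,2,4} and the column (0, 0, 1, 0) encodes {3}
• The integer model: max Σ_{i=1}^{2^n} v(Ci) · xi, s.t. Z · x = (1, …, 1)^T, xi ∈ {0, 1}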
Order-based genetic algorithms (OBGA) (S. Sen & P. Dutta 2000)
• A GA consists of 3 steps: (1) Evaluation (2) Selection (3) Re-combination
• [Figure: sample chromosomes, each an ordering of the agents 1 to 9; two parents are recombined into offspring, and the resulting coalitions can overlap, e.g. {{3, 2}, {1}, {5, 3}}]
Related Work
• Dynamic programming techniques: IDP [Rahwan & Jennings, An Improved Dynamic Programming Algorithm for Coalition Structure Generation, AAMAS 2008]
• Anytime with guarantees on solution quality: IP [Rahwan et al., An Anytime Algorithm for Optimal Coalition Structure Generation, AAAI 2007, JAIR 2009]
  Algorithm property                 IDP       IP
  Worst case performance             O(3^n)    O(n^n)
  Return solutions anytime           False     True
  Time to return optimal solution    Slow      Fast
The Dynamic Programming (DP) Algorithm
• Main observation: to find the optimal partition of a set of agents, it is sufficient to:
  • try the possible ways to split the set into two sets, and
  • for each of the two sets, find the optimal partition of that set
DP on 4 agents: for each coalition C, V(C) is compared with every split of C into two parts; f(C) is the best value found.
• Size 1:
  • V({1}) = 30, V({2}) = 40, V({3}) = 25, V({4}) = 45, so f = 30, 40, 25, 45
• Size 2:
  • {1,2}: V = 50; f({1})+f({2}) = 70; f = 70, best: {1} {2}
  • {1,3}: V = 60; f({1})+f({3}) = 55; f = 60, best: {1,3}
  • {1,4}: V = 80; f({1})+f({4}) = 75; f = 80, best: {1,4}
  • {2,3}: V = 55; f({2})+f({3}) = 65; f = 65, best: {2} {3}
  • {2,4}: V = 70; f({2})+f({4}) = 85; f = 85, best: {2} {4}
  • {3,4}: V = 80; f({3})+f({4}) = 70; f = 80, best: {3,4}
• Size 3:
  • {1,2,3}: V = 90; f({1})+f({2,3}) = 95; f({2})+f({1,3}) = 100; f({3})+f({1,2}) = 95; f = 100, best: {2} {1,3}
  • {1,2,4}: V = 120; f({1})+f({2,4}) = 115; f({2})+f({1,4}) = 120; f({4})+f({1,2}) = 115; f = 120, best: {1,2,4}
  • {1,3,4}: V = 100; f({1})+f({3,4}) = 110; f({3})+f({1,4}) = 105; f({4})+f({1,3}) = 105; f = 110, best: {1} {3,4}
  • {2,3,4}: V = 115; f({2})+f({3,4}) = 120; f({3})+f({2,4}) = 110; f({4})+f({2,3}) = 110; f = 120, best: {2} {3,4}
• Size 4:
  • {1,2,3,4}: V = 140; f({1})+f({2,3,4}) = 150; f({2})+f({1,3,4}) = 150; f({3})+f({1,2,4}) = 145; f({4})+f({1,2,3}) = 145; f({1,2})+f({3,4}) = 150; f({1,3})+f({2,4}) = 145; f({1,4})+f({2,3}) = 145; f = 150, best: {1,2} {3,4}
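The table above can be reproduced with a short program. This is a minimal sketch of the basic DP idea as described in the talk (compare V(C) with every two-way split, smallest coalitions first), not the authors' implementation; `dp_optimal_structure` is an illustrative name and the input values are those of the four-agent example.

```python
from itertools import combinations


def dp_optimal_structure(agents, v):
    """Return (optimal value, optimal coalition structure) by dynamic programming."""
    agents = sorted(agents)
    f, best_split = {}, {}
    for size in range(1, len(agents) + 1):
        for combo in combinations(agents, size):
            c = frozenset(combo)
            f[c], best_split[c] = v[c], None  # start with "do not split C"
            anchor, others = combo[0], combo[1:]
            # Enumerate each two-way split once by keeping `anchor` in `part`.
            for k in range(len(others)):
                for extra in combinations(others, k):
                    part = frozenset((anchor,) + extra)
                    rest = c - part
                    if f[part] + f[rest] > f[c]:
                        f[c], best_split[c] = f[part] + f[rest], (part, rest)

    def expand(c):
        """Recursively replace a coalition by its best split, if it has one."""
        if best_split[c] is None:
            return [c]
        left, right = best_split[c]
        return expand(left) + expand(right)

    grand = frozenset(agents)
    return f[grand], expand(grand)


RAW = {
    (1,): 30, (2,): 40, (3,): 25, (4,): 45,
    (1, 2): 50, (1, 3): 60, (1, 4): 80, (2, 3): 55, (2, 4): 70, (3, 4): 80,
    (1, 2, 3): 90, (1, 2, 4): 120, (1, 3, 4): 100, (2, 3, 4): 115,
    (1, 2, 3, 4): 140,
}
V = {frozenset(k): val for k, val in RAW.items()}
value, structure = dp_optimal_structure([1, 2, 3, 4], V)
print(value)                                   # 150
print(sorted(sorted(c) for c in structure))    # [[1], [2], [3, 4]]
```

Several splits of {1,2,3,4} tie at 150 in this example, but each of them expands to the same structure {1}, {2}, {3,4}.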
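The Coalition Structure Graph
• 4 coalitions:
  • {a1},{a2},{a3},{a4}: V = 140
• 3 coalitions:
  • {a1},{a2},{a3,a4}: V = 150 (optimal)
  • {a3},{a4},{a1,a2}: V = 120
  • {a1},{a3},{a2,a4}: V = 125
  • {a2},{a4},{a1,a3}: V = 145
  • {a1},{a4},{a2,a3}: V = 130
  • {a2},{a3},{a1,a4}: V = 145
• 2 coalitions:
  • {a1},{a2,a3,a4}: V = 145
  • {a1,a2},{a3,a4}: V = 130
  • {a2},{a1,a3,a4}: V = 140
  • {a1,a3},{a2,a4}: V = 130
  • {a3},{a1,a2,a4}: V = 145
  • {a1,a4},{a2,a3}: V = 135
  • {a4},{a1,a2,a3}: V = 135
• 1 coalition:
  • {a1,a2,a3,a4}: V = 140
• An edge connects two structures when one is obtained from the other by merging two coalitions, e.g. {a1},{a3} into {a1,a3}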
The Improved Dynamic Programming Algorithm (IDP) • We define a subset of edges E* • We prove that the edges in E* are sufficient to form a path to every node in the graph • We modify the original algorithm such that it only evaluates the movements through the edges in E*
The Coalition Structure Graph
• [Figure: the four-agent coalition structure graph shown again, without values, with the optimal structure marked]
DP on a second 4-agent example:
• Size 1:
  • V({1}) = 50, V({2}) = 30, V({3}) = 45, V({4}) = 35, so f = 50, 30, 45, 35
• Size 2:
  • {1,2}: V = 90; f({1})+f({2}) = 80; f = 90, best: {1,2}
  • {1,3}: V = 80; f({1})+f({3}) = 95; f = 95, best: {1} {3}
  • {1,4}: V = 65; f({1})+f({4}) = 85; f = 85, best: {1} {4}
  • {2,3}: V = 90; f({2})+f({3}) = 75; f = 90, best: {2,3}
  • {2,4}: V = 70; f({2})+f({4}) = 65; f = 70, best: {2,4}
  • {3,4}: V = 60; f({3})+f({4}) = 80; f = 80, best: {3} {4}
• Size 3:
  • {1,2,3}: V = 125; f({1})+f({2,3}) = 140; f({2})+f({1,3}) = 125; f({3})+f({1,2}) = 135; f = 140, best: {1} {2,3}
  • {1,2,4}: V = 120; f({1})+f({2,4}) = 120; f({2})+f({1,4}) = 115; f({4})+f({1,2}) = 125; f = 125, best: {4} {1,2}
  • {1,3,4}: V = 135; f({1})+f({3,4}) = 130; f({3})+f({1,4}) = 130; f({4})+f({1,3}) = 130; f = 135, best: {1,3,4}
  • {2,3,4}: V = 115; f({2})+f({3,4}) = 110; f({3})+f({2,4}) = 115; f({4})+f({2,3}) = 125; f = 125, best: {4} {2,3}
• Size 4:
  • {1,2,3,4}: V = 160; f({1})+f({2,3,4}) = 175; f({2})+f({1,3,4}) = 165; f({3})+f({1,2,4}) = 170; f({4})+f({1,2,3}) = 175; f({1,2})+f({3,4}) = 170; f({1,3})+f({2,4}) = 165; f({1,4})+f({2,3}) = 175; f = 175, best: {4} {1,2,3}
IDP on the same input, as shown on the slide:
• Size 3: no splits are evaluated, so f = V: f({1,2,3}) = 125, f({1,2,4}) = 120, f({1,3,4}) = 135, f({2,3,4}) = 115
• Size 4: V({1,2,3,4}) = 160; only the splits into two pairs are evaluated: f({1,2})+f({3,4}) = 170; f({1,3})+f({2,4}) = 165; f({1,4})+f({2,3}) = 175; f = 175, best: {1,4} {2,3}
Evaluation of IDP The total number of evaluations performed by IDP is only 38.7% of those performed by the original dynamic programming (DP) algorithm