Guiding dynamics in potential games. Avrim Blum Carnegie Mellon University Joint work with Maria-Florina Balcan and Yishay Mansour. [Cornell CSECON 2009]. [This talk based on results in “Improved Equilibria via Public Service Advertising”, SODA’09 and “The Price of Uncertainty”, ACM-EC’09].
Good equilibria, Bad equilibria Many games have both good and bad equilibria. In some places, everyone throws their trash on the street. In some, everyone puts their trash in the trash can. In some places, everyone drives their own car. In some, everybody uses and pays for good public transit.
Good equilibria, Bad equilibria
[Figure: two parallel edges from s to t, one of cost n and one of cost 1.]
Many games have both good and bad equilibria. A nice formal example is fair cost-sharing.
• n players in a weighted directed graph G. Player i wants to get from s_i to t_i, and players share the cost of the edges they use with the others using them.
• Good equilibrium: all use the edge of cost 1 (cost 1/n per player).
• Bad equilibrium: all use the edge of cost n (cost 1 per player).
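The two-edge example can be checked mechanically. The following is a minimal sketch, assuming players are indexed 0..n-1 and a profile is a list giving each player's chosen edge; the helper names `social_cost` and `is_nash` are illustrative and do not come from the talk.

```python
from collections import Counter
from fractions import Fraction

def social_cost(edge_costs, choice):
    """Total cost paid: every edge used by at least one player is paid for once."""
    return sum(edge_costs[e] for e in set(choice))

def is_nash(edge_costs, choice):
    """True if no player can strictly lower their fair share by switching edges."""
    loads = Counter(choice)
    for e in choice:
        current = Fraction(edge_costs[e], loads[e])
        for f in range(len(edge_costs)):
            # A deviator joins the players already on f, so the share is c_f / (load + 1).
            if f != e and Fraction(edge_costs[f], loads[f] + 1) < current:
                return False
    return True

n = 5
edge_costs = [1, n]   # edge 0 costs 1, edge 1 costs n
good = [0] * n        # everyone on the cheap edge: 1/n each
bad = [1] * n         # everyone on the expensive edge: 1 each
```

Both `good` and `bad` pass `is_nash`, while their social costs differ by a factor of n. The bad profile is only a weak equilibrium: a deviator would pay exactly 1 on the cheap edge, so there is no strict gain.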
Good equilibria, Bad equilibria
[Figure: cost-sharing network with sources s_1, …, s_n and sink t; edge costs 0, 1, and k ≪ n; the cost-k route is labeled “Shared transit” and the cost-1 edges “cars”.]
High-level questions
1. Can a helpful authority encourage behavior to move from bad to good?
• Model the authority as having some limited powers of persuasion.
2. In the reverse direction: if we get people into a good equilibrium (and players are selfish, reasonably myopic, etc.), then we would like to think behavior will stay there.
High-level questions 1. Can a helpful authority encourage behavior to move from bad to good? • Model as having some limited powers of persuasion 2. If game has small fluctuations in costs, or a few Byzantine players, (when) could behavior spiral out of control?
Direction 1: guiding from bad to good
[Figure: the shared-transit network, with the campaign message “Ride public transit”.]
“Public service advertising model”:
0. n players begin in some arbitrary configuration.
1. Authority launches public-service advertising campaign, proposing joint action s*. Each player i pays attention and follows with probability α. Call these the receptive players.
2. Remaining (non-receptive) players fall to some arbitrary equilibrium for themselves, given play of receptive players.
3. Campaign wears off. Entire set of players follows best-response dynamics from then on.
Direction 1: guiding from bad to good
Note #1: if α = 1 (every player follows the advice), can just propose best Nash equilibrium. Key issue: what if α < 1?
Direction 1: guiding from bad to good
Note #2: Can replace step 2 (non-receptive players settling into an equilibrium among themselves) with poly(n) steps of best-response for non-receptive players.
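The phases of the public service advertising model can be sketched on a parallel-edge cost-sharing game. This is an illustration under stated assumptions, not the construction from the paper: the function names (`best_response_dynamics`, `advertise`, `sample_receptive`) are made up here, and the "arbitrary equilibrium" of the non-receptive players is taken to be whichever one best-response reaches from the starting configuration, in the spirit of Note #2.

```python
import random
from fractions import Fraction

def best_response_dynamics(edge_costs, choice, movable):
    """Let players in `movable` switch edges until none can strictly improve."""
    choice = list(choice)
    improved = True
    while improved:
        improved = False
        for i in movable:
            loads = [choice.count(e) for e in range(len(edge_costs))]
            best_e = choice[i]
            best_c = Fraction(edge_costs[best_e], loads[best_e])
            for f in range(len(edge_costs)):
                if f == choice[i]:
                    continue
                c = Fraction(edge_costs[f], loads[f] + 1)  # share after joining f
                if c < best_c:
                    best_e, best_c = f, c
            if best_e != choice[i]:
                choice[i] = best_e
                improved = True
    return choice

def sample_receptive(n, alpha, rng):
    """Each player independently pays attention with probability alpha."""
    return {i for i in range(n) if rng.random() < alpha}

def advertise(edge_costs, start, proposal, receptive):
    n = len(start)
    # Phase 1: receptive players adopt the proposed joint action s*.
    choice = [proposal[i] if i in receptive else start[i] for i in range(n)]
    # Phase 2: non-receptive players settle among themselves (assumption: via best response).
    others = [i for i in range(n) if i not in receptive]
    choice = best_response_dynamics(edge_costs, choice, others)
    # Phase 3: campaign wears off; everyone best-responds.
    return best_response_dynamics(edge_costs, choice, list(range(n)))

edge_costs = [1, 6]      # cheap edge 0, expensive edge 1
start = [1] * 6          # everyone begins in the bad equilibrium
proposal = [0] * 6       # authority proposes the good equilibrium
receptive = sample_receptive(6, 0.5, random.Random(0))
final = advertise(edge_costs, start, proposal, receptive)
```

In this tiny game a single receptive player is enough to pull everyone onto the cheap edge, while an empty receptive set leaves play stuck in the bad equilibrium. The richer games in the talk are where the dependence on α becomes interesting.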
Fair Cost Sharing
[Figure: the shared-transit network.]
Cost sharing: (PoS = log(n), PoA = n)
If each player follows the advice with only probability α, then we get within O(log(n)/α) of OPT.
Proof Idea:
- Advertiser proposes OPT (any approximation also works).
- In any NE for non-receptive players, any such player i can’t improve by switching to his path P_i^OPT in OPT (whose cost to him depends on the # of receptive players on each edge e).
- Calculate total cost of these guaranteed options. Rearrange sum...
Fair Cost Sharing
Cost sharing: (PoS = log(n), PoA = n)
If each player follows the advice with only probability α, then we get within O(log(n)/α) of OPT.
Proof Idea (continued):
- Use: X ~ Bi(n,p).
- Take expectation, add back in cost of receptives: get O(OPT/α). (End of phase 2)
- Finally, in the last phase, a standard potential argument shows behavior cannot get worse by more than an additional log(n) factor. (End of phase 3)
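The formula attached to "X ~ Bi(n,p)" did not survive in this transcript. One standard fact that fits the step, offered here as an assumption rather than as the slide's content, is the identity E[1/(1+X)] = (1 - (1-p)^(n+1)) / ((n+1)p) ≤ 1/((n+1)p). If X counts the receptive players on an edge, 1/(1+X) is the share a deviating player would pay there, so its expectation is at most about 1/(pn). The sketch below checks the identity exactly with rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def expected_share(n, p):
    """Exact E[1/(1+X)] for X ~ Bi(n, p), summed directly over the binomial pmf."""
    q = 1 - p
    return sum(Fraction(comb(n, k)) * p**k * q**(n - k) / (k + 1)
               for k in range(n + 1))

def closed_form(n, p):
    """Closed form (1 - (1-p)^(n+1)) / ((n+1) p) for the same expectation."""
    return (1 - (1 - p)**(n + 1)) / ((n + 1) * p)

p = Fraction(1, 3)
for n in (1, 5, 12):
    assert expected_share(n, p) == closed_form(n, p)
```

Passing `p` as a `Fraction` keeps every quantity exact, so the comparison is an equality test rather than a floating-point approximation.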
Cost Sharing, Extension
Cost sharing + linear delays:
- Problem: can’t argue as if remaining non-receptive (NR) players didn’t exist, since they add to delays.
Proof Idea:
- Define shadow game: pure linear latency functions. Offset defined by equilibrium at end of phase 2 (# users on e at end of phase 2).
- This has good PoA.
- Behavior at end of phase 2 is an equilibrium for this game too.
- Show
Party affiliation games
[Figure: small graph with edges labeled + and -.]
• Given graph G, each edge labeled + or -.
• Vertices have two actions: RED or BLUE. Pay 1 for each + edge with endpoint of different color, and each - edge with endpoint of same color (+1 to keep ratios finite).
• Special cases:
  • All + edges is consensus game.
  • All - edges is cut-game.
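The cost rule above is short enough to write down directly. This is a minimal sketch, assuming edges are given as `(u, v, sign)` triples and that the "+1" is added to each player's own cost; the slide does not say where the +1 is applied, so the `offset` parameter is an assumption, as are the function names.

```python
def player_cost(edges, colors, v, offset=1):
    """Cost of vertex v: offset, plus 1 for each unhappy edge touching v."""
    cost = offset  # assumption: the "+1 to keep ratios finite" is per player
    for a, b, sign in edges:
        if v in (a, b):
            same = colors[a] == colors[b]
            # '+' edges want matching colors, '-' edges want different colors.
            if (sign == '+' and not same) or (sign == '-' and same):
                cost += 1
    return cost

def social_cost(edges, colors, offset=1):
    return sum(player_cost(edges, colors, v, offset) for v in colors)

def best_response(edges, colors, v):
    """Color that minimizes v's cost; keeps the current color on ties."""
    current = colors[v]
    other = "BLUE" if current == "RED" else "RED"
    flipped = dict(colors)
    flipped[v] = other
    if player_cost(edges, flipped, v) < player_cost(edges, colors, v):
        return other
    return current

# Consensus game on a triangle: all edges are '+'.
triangle = [(0, 1, '+'), (1, 2, '+'), (0, 2, '+')]
colors = {0: "BLUE", 1: "RED", 2: "RED"}
```

With every edge labeled `'+'` this is the consensus game, and with every edge labeled `'-'` it is the cut game, matching the two special cases listed above.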
Party affiliation games
Party Affiliation: (PoS = 1, PoA = Ω(n^2))
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n^2). (Assume degrees Ω(log n).)
Lower bound:
- Consensus game, two cliques, with relatively sparse edges between them (degree (1/2 - α)n/8 across the cut). Players “locked” into place.
Party affiliation games
Party Affiliation: (PoS = 1, PoA = Ω(n^2))
- Threshold behavior: for α > ½, can get ratio O(1), but for α < ½, ratio stays Ω(n^2). (Assume degrees Ω(log n).)
Upper bound:
- Split nodes into those incurring low cost vs. those incurring high cost under OPT.
- Show that low-cost nodes will switch to their behavior in OPT. For high-cost nodes, don’t care.
- Cost only improves in final best-response process.
High-level questions 1. Can a helpful authority encourage behavior to move from bad to good? • Model as having some limited powers of persuasion 2. If game has small fluctuations in costs, or a few Byzantine players, could behavior spiral out of control?
Direction #2 If game has small fluctuations in costs, or a few Byzantine players, could behavior spiral out of control? A few ways this could happen: • Small changes cause good equilibria to disappear, only bad ones left. (economy?) • Bad behavior by a few players causes pain for all (nukes) • Neither of above, but instead through more subtle interaction with dynamics…
Model
• Players follow best (or better) response dynamics.
• Costs of resources can fluctuate between moves: c_i^t ∈ [c_i/(1+ε), c_i(1+ε)] (alternatively, one or more Byzantine players who move between time steps).
• Play begins in a low-cost state.
• How bad can things get?
Price-of-Uncertainty(ε) of game = maximum ratio of eventual social cost to initial cost.
Model
• Players follow best (or better) response dynamics.
• Costs of resources can fluctuate between moves: c_i^t ∈ [c_i/(1+ε), c_i(1+ε)]
Price-of-Uncertainty(ε) of game = maximum ratio of eventual social cost to initial cost.
One way to look at this:
• Define graph: one node for each state. Edge u → v if perturbation can cause BR to move from u to v.
• What do the reachable sets look like?
Set-cover games
[Figure: resources with costs c_1, c_2, c_3, …, c_m.]
• Special case of fair cost-sharing.
• n players, m resources, with costs c_1,…,c_m. Each player has some allowable resources.
• Each player chooses some allowable resource.
• Players split cost with all others choosing same one.
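For a tiny set-cover game, the state graph under cost fluctuations can be explored by brute force. The sketch below is an illustration under stated assumptions: it uses better-response rather than best-response moves, it treats a move from resource j to k as possible whenever some perturbation in [c/(1+ε), c(1+ε)] makes it strictly improving (current resource at its highest cost, target at its lowest), and the instance and function names are hypothetical.

```python
from collections import deque
from fractions import Fraction

def social_cost(costs, state):
    """Sum of the base costs of all resources that at least one player uses."""
    return sum(costs[j] for j in set(state))

def perturbed_moves(costs, allowed, state, eps):
    """States reachable by one better-response move under some perturbation."""
    for i, j in enumerate(state):
        # Worst case for staying: current resource perturbed up to c_j (1 + eps).
        stay = costs[j] * (1 + eps) / state.count(j)
        for k in allowed[i]:
            if k == j:
                continue
            # Best case for moving: target resource perturbed down to c_k / (1 + eps).
            move = costs[k] / (1 + eps) / (state.count(k) + 1)
            if move < stay:
                yield state[:i] + (k,) + state[i + 1:]

def reachable(costs, allowed, start, eps):
    """Breadth-first search over the state graph from the initial state."""
    seen = {start}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for t in perturbed_moves(costs, allowed, s, eps):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

costs = [2, 3]                    # two resources
allowed = [{0, 1}, {0, 1}]        # both players may use either resource
start = (0, 0)                    # low-cost initial state, social cost 2
states = reachable(costs, allowed, start, Fraction(1))
worst_ratio = Fraction(max(social_cost(costs, s) for s in states),
                       social_cost(costs, start))
```

With ε = 0 the start state is the only reachable one, while ε = 1 opens up the whole state space in this instance. Enumerating states is exponential in the number of players, so this is only a way to build intuition on toy examples.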
Main results
Set-cover games:
• If ε = O(1/nm) then PoU = O(log n).
• However, for any constant ε > 0, PoU = Ω(n).
• Also, a single Byzantine player can take state from a PNE of cost O(OPT) to one of cost Ω(n·OPT).
Main results
General fair-cost-sharing games:
• If many players for each (s_i,t_i) pair (n_i = Ω(m)), then PoU = O(1) even for constant ε > 0.
• Open for general number of players.
Matroid congestion games: (strategy sets are bases of a matroid, e.g., set-cover where each player chooses k resources)
• If ε = O(1/nm) then PoU = O(log n) for fair cost-sharing.
• In general, if ε = O(1/nm) then PoU = O(GAP).
In both cases, require best-response. Better-response not enough (unlike set-cover).
Also results for other classes of games.
Set-Cover games (upper bound)
[Figure: stacks of chips on sets j and k, with costs c_j and c_k.]
For upper bound, think of players in sets as a stack of chips.
• View ith position in stack j as having cost c_j/i. Load chips with value equal to initial cost.
• When player moves from j to k, move top chip. Cost of position goes up by at most (1+ε)^2.
• At most mn different positions. So, following the path of any chip and removing loops, cost of final set is at most (1+ε)^(2nm) times its value.
So, if ε = O(1/nm) then PoU = O(log n).
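One way to read the last step of the chip argument, written out as a worked bound. This is a sketch of the arithmetic and not the paper's proof: the constant a in ε ≤ a/(nm), the notation n_j for the initial number of players on set j, and H_n for the nth harmonic number are introduced here for illustration.

```latex
% Multiplicative blow-up along a loop-free chip path of at most nm positions:
\[
  (1+\epsilon)^{2nm} \;\le\; e^{2\epsilon nm} \;\le\; e^{2a}
  \qquad \text{when } \epsilon \le \frac{a}{nm}.
\]
% Total value loaded onto the chips at the start (sum over initially used sets j):
\[
  \sum_{j} \sum_{i=1}^{n_j} \frac{c_j}{i}
  \;=\; \sum_{j} c_j H_{n_j}
  \;\le\; H_n \sum_{j} c_j
  \;=\; O(\log n) \cdot \text{(initial cost)}.
\]
% Each set used at the end is charged to the chip at the bottom of its stack, so
\[
  \text{final cost} \;\le\; (1+\epsilon)^{2nm} \sum_{\text{chips}} \text{value}
  \;\le\; e^{2a}\, H_n \cdot \text{(initial cost)}
  \;=\; O(\log n) \cdot \text{(initial cost)}.
\]
```

The log n factor comes entirely from the harmonic sum in the initial chip values. The perturbation contributes only the constant e^{2a}.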
Matroid games
In matroid games, can think of each player as controlling a set of chips.
• Nice property of best response in matroids:
  • Can always order the move so that each individual chip is doing better-response.
  • Apply previous argument.
• Fails for better-response.
  • Here, can get player to do a kind of binary counting, bad even for exponentially-small ε.
Open questions and directions
Looking at: how can we help players find their way to a good state? And how dangerous could small fluctuations be in knocking them out?
Getting to good states: nice line of work on how players might be able to do it all by themselves. [Blume, Young, Shamma, Marden, Beggs…]
• Noisy best-response / noisy adaptive play.
• Distribution in limit favors good states, like simulated annealing.
• But, time could be exponential (subway).
Open questions and directions
To reach good states quickly, need to give players more information about the game they are playing. More general, self-interested models for this?