Price of Total Anarchy. June 2008. Slides by Israel Shalom. Based on “Regret Minimization and the Price of Total Anarchy” by Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett and Aaron Roth.
Agenda • Preliminaries • Game Theory Basics • Regret Minimization • Hotelling games • Valid games • Atomic congestion games • Algorithmic efficiency
Games in Strategic Form • The game has $k$ players • Each player $i$ has a set $A_i$ of available pure strategies • $A = A_1 \times \dots \times A_k$ marks the strategy profiles • Individual utility (payoff) functions $\alpha_i : A \to \mathbb{R}$
Games in Strategic Form – cont’d • Examples: • Rock, Paper, Scissors • Prisoner’s Dilemma
Mixed Strategies • Users can play “mixed strategies” as well – a probability distribution over , we mark this as • marks the mixed strategy profiles • The payoffs are now defined as the expected value of over the randomness of the players • Sometimes marked by
Best Response and Nash Equilibria • Lowercase letters will usually denote elements: • $a_i \in A_i$, $s_i \in S_i$, … • We denote by $s_{-i}$ the selected strategies of the players other than i ($s = s_i \oplus s_{-i}$) • A strategy $s_i$ is a best response to $s_{-i}$ if for all $s_i' \in S_i$: $\alpha_i(s_i \oplus s_{-i}) \ge \alpha_i(s_i' \oplus s_{-i})$ • A strategy profile $s$ is a Nash Equilibrium if for all i, $s_i$ is a best response to $s_{-i}$. • A pure equilibrium may or may not exist, but in every finite game there is at least one mixed Nash Equilibrium.
Nash Equilibria • Examples: • Rock, Paper, Scissors • Mixed equilibrium: ([1/3,1/3,1/3], [1/3, 1/3, 1/3]) • Prisoner’s Dilemma • Pure equilibrium (Confess, Confess)
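The Rock, Paper, Scissors example can be checked numerically. The sketch below assumes the usual zero-sum scoring (+1 win, -1 loss, 0 tie), which the slides do not spell out, and the helper names are hypothetical. It relies on the fact that a mixed strategy is a best response exactly when no pure deviation does better.

```python
ACTIONS = ["R", "P", "S"]
BEATS = {"R": "S", "P": "R", "S": "P"}  # key beats value

def rps_payoff(a, b):
    """Payoff of the player choosing a against b (assumed scoring)."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

def expected_payoff(mix, other_mix):
    """Expected payoff of a mixed strategy against another mixed strategy."""
    return sum(p * q * rps_payoff(a, b)
               for a, p in mix.items()
               for b, q in other_mix.items())

def is_best_response(mix, other_mix, tol=1e-9):
    """True if no pure strategy earns more than mix against other_mix."""
    best_pure = max(expected_payoff({a: 1.0}, other_mix) for a in ACTIONS)
    return expected_payoff(mix, other_mix) >= best_pure - tol

uniform = {a: 1 / 3 for a in ACTIONS}
# The game is symmetric, so checking one side covers both players.
uniform_is_equilibrium = is_best_response(uniform, uniform)
rock_rock_is_equilibrium = is_best_response({"R": 1.0}, {"R": 1.0})
```

Here `uniform_is_equilibrium` comes out true, while the pure profile (Rock, Rock) fails because Paper is a profitable deviation.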
Social Optimum • Sometimes, we’ll define a social utility (welfare) function, similar to payoffs: $\gamma : A \to \mathbb{R}$ • Choices that would make sense: the sum $\gamma(a) = \sum_i \alpha_i(a)$ or the minimum $\gamma(a) = \min_i \alpha_i(a)$ • For mixed strategies, we’ll look at the expected value (analogous to payoff in mixed strategies) • The socially optimum strategy profile (and OPT) are: $s^* = \arg\max_{s \in S} \gamma(s)$, $OPT = \gamma(s^*)$ • We are assuming a maximizing game throughout; minimization is analogous
Price of Anarchy • Let $N \subseteq S$ mark all the Nash Equilibria in the game • The price of anarchy is defined as the ratio of the optimum to the worst NE: $POA = \max_{s \in N} \frac{OPT}{\gamma(s)}$
Price of Anarchy • Prisoner’s Dilemma • OPT = 2, the cost of the worst NE is 6, so the price of anarchy is 6/2 = 3 • Notice that the fraction is flipped (minimization game)
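The Prisoner’s Dilemma figures can be reproduced by brute force. The cost table below is an assumption chosen to be consistent with the slide’s OPT = 2 and equilibrium cost 6; the slide’s own table did not survive, and the function names are hypothetical.

```python
from itertools import product

ACTIONS = ["Silent", "Confess"]
# Assumed costs in years of prison: (row player, column player).
COST = {
    ("Silent", "Silent"): (1, 1),
    ("Silent", "Confess"): (5, 0),
    ("Confess", "Silent"): (0, 5),
    ("Confess", "Confess"): (3, 3),
}

def social_cost(profile):
    """Sum social cost: total years served by both players."""
    return sum(COST[profile])

def is_pure_nash(profile):
    """True if no player can strictly lower his own cost by deviating."""
    for player in (0, 1):
        for alternative in ACTIONS:
            deviated = list(profile)
            deviated[player] = alternative
            if COST[tuple(deviated)][player] < COST[profile][player]:
                return False
    return True

profiles = list(product(ACTIONS, repeat=2))
opt = min(social_cost(p) for p in profiles)
nash = [p for p in profiles if is_pure_nash(p)]
worst_ne = max(social_cost(p) for p in nash)
# Minimization game, so the ratio is worst equilibrium cost over optimum.
price_of_anarchy = worst_ne / opt
```

With this table the only pure equilibrium is (Confess, Confess) and the ratio is 3.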
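Regret Minimization • Let $a^1, \dots, a^T$ mark the strategy profiles played in T steps • We define the regret of player i in a maximization game: $\max_{a_i \in A_i} \frac{1}{T} \sum_{t=1}^{T} \left( \alpha_i(a_i \oplus a^t_{-i}) - \alpha_i(a^t) \right)$ • Intuitively, this is “how much more i could have gained on average had he played a single strategy throughout the game”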
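The regret definition translates directly into code. This is a minimal sketch for one player; the Rock, Paper, Scissors scoring and all names are assumptions used only for illustration.

```python
def average_regret(payoff, my_actions, my_plays, others_plays):
    """Best fixed action's average payoff minus the realized average payoff.

    payoff(a, o) is the player's payoff for action a when the others play o.
    """
    T = len(my_plays)
    realized = sum(payoff(a, o) for a, o in zip(my_plays, others_plays)) / T
    best_fixed = max(sum(payoff(a, o) for o in others_plays) / T
                     for a in my_actions)
    return best_fixed - realized

BEATS = {"R": "S", "P": "R", "S": "P"}

def rps(a, b):
    """Assumed scoring: +1 win, -1 loss, 0 tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

# Always Rock against an opponent who always plays Rock: Paper would have won
# every round, so the average regret is 1.
regret_vs_rock = average_regret(rps, "RPS", ["R"] * 4, ["R"] * 4)
# Always Rock against a cycling opponent: no fixed action does better.
regret_vs_cycle = average_regret(rps, "RPS", ["R"] * 3, ["R", "P", "S"])
```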
Regret Minimization • When a player i uses a regret-minimizing algorithm, for any sequence of plays by the other players, we have the property that the expected regret of player i after T steps is at most $R(T)$ • Where: • $R(T)$ vanishes as $T \to \infty$ • $T_\epsilon$ marks the number of steps before $R(T) \le \epsilon$ • The expectation is over the algorithm’s randomness • In other words, the expected value of regret vanishes • Notice that this is for maximizing games
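One standard algorithm with this property is the exponential weights (Hedge) rule. The slides do not name a particular algorithm, so this is a swapped-in illustration. The sketch tracks the expected payoff exactly instead of sampling, and uses the textbook learning rate $\eta = \sqrt{8 \ln N / T}$, for which the average regret is at most $\sqrt{\ln N / (2T)}$ when payoffs lie in [0, 1].

```python
import math

def hedge_average_regret(gains, eta):
    """gains[t][a] in [0, 1] is what action a would earn at step t.

    Returns the average regret of Hedge's expected payoff.
    """
    n_actions = len(gains[0])
    weights = [1.0] * n_actions
    expected_total = 0.0
    for g in gains:
        total = sum(weights)
        probs = [w / total for w in weights]
        expected_total += sum(p * x for p, x in zip(probs, g))
        # Reward actions that did well this step.
        weights = [w * math.exp(eta * x) for w, x in zip(weights, g)]
    best_fixed = max(sum(g[a] for g in gains) for a in range(n_actions))
    return (best_fixed - expected_total) / len(gains)

def rps_gains(opponent_moves):
    """Rescaled Rock, Paper, Scissors gains: win 1, tie 0.5, loss 0."""
    beats = {"R": "S", "P": "R", "S": "P"}
    def gain(a, b):
        if a == b:
            return 0.5
        return 1.0 if beats[a] == b else 0.0
    return [[gain(a, b) for a in "RPS"] for b in opponent_moves]

T = 2000
# A hypothetical opponent: Rock three steps out of five, Scissors otherwise.
opponent = ["R" if t % 5 < 3 else "S" for t in range(T)]
eta = math.sqrt(8 * math.log(3) / T)
regret = hedge_average_regret(rps_gains(opponent), eta)
bound = math.sqrt(math.log(3) / (2 * T))
```

The computed `regret` stays below `bound`, which shrinks as T grows, matching the vanishing $R(T)$ above.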
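Regret Minimization • This implies that for any sequence $a^1, \dots, a^T$, if player i is regret-minimizing, then for all $T \ge T_\epsilon$ and every $a_i \in A_i$: $\mathbb{E}\left[\frac{1}{T}\sum_{t=1}^{T} \alpha_i(a^t)\right] \ge \frac{1}{T}\sum_{t=1}^{T} \alpha_i(a_i \oplus a^t_{-i}) - \epsilon$ • The price of total anarchy is defined as: $POTA = \max \frac{OPT}{\frac{1}{T}\sum_{t=1}^{T} \gamma(a^t)}$ • Where the max is taken over $T$ and $a^1, \dots, a^T$ that are play profiles with the regret-minimization property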
Regret and NE • Notice that when playing a Nash Equilibrium, all players have zero regret • Otherwise there would be a better “constant” response, and a player could improve by moving to it • Therefore, the price of total anarchy in any game is an upper bound on the price of anarchy • [Figure: Regret-minimizing strategies and NE]
Advantages of Regret Minimization • Computational • Nash Equilibria are hard (PPAD-hard) to calculate – even for small action spaces • There are efficient regret minimization algorithms for polynomial number of actions • Motivational • No particular reason for players to converge down to NE • There might be multiple equilibria, and agents may individually prefer different ones • Byzantine players’ actions are not taken into account in NE • Regret-minimization considers only local information, much more practical
Agenda • Preliminaries • Hotelling games • Definition • POA/POTA • Generalization • Valid games • Atomic congestion games • Algorithmic efficiency
Hotelling - Game Definition • Souvenir stand owners in Paris: • There are tourists every day, and each buys from whichever stand they find first • Each stand owner wishes to maximize his own sales • We want “fairness”: the social welfare function is the minimum, over the stand owners, of the sales made • Formally: • We have an n-vertex graph $G = (V, E)$ • Each of the $k$ sellers locates himself at a vertex • Each day, a tourist at each vertex goes to the closest seller • If there is a “tie” between the sellers, they split the gains • Minimum utility: $\gamma(a) = \min_i \alpha_i(a)$
Hotelling - Optimum Solution • Notice that the sum of payoffs is always exactly n • Therefore, the social optimum is achieved when all k players have equal payoffs • This can happen if all players play on the same vertex • Therefore $OPT = n/k$
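The payoff rule and the constant-sum observation can be checked on a small graph. This is a minimal sketch assuming a connected, unweighted graph given as an adjacency dictionary; the path graph and the function names are hypothetical.

```python
from collections import deque
from fractions import Fraction

def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def hotelling_payoffs(adj, positions):
    """positions[i] is seller i's vertex; ties split a tourist evenly."""
    dists = [bfs_distances(adj, p) for p in positions]
    payoffs = [Fraction(0)] * len(positions)
    for v in adj:
        reach = [d[v] for d in dists]  # assumes a connected graph
        best = min(reach)
        closest = [i for i, r in enumerate(reach) if r == best]
        for i in closest:
            payoffs[i] += Fraction(1, len(closest))
    return payoffs

# A path on n = 6 vertices: 0 - 1 - 2 - 3 - 4 - 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

same_vertex = hotelling_payoffs(path, [2, 2])  # both sellers get n/k = 3
lopsided = hotelling_payoffs(path, [0, 1])     # payoffs 1 and 5
with_tie = hotelling_payoffs(path, [0, 2])     # vertex 1 is split
```

In every case the payoffs sum to n = 6, and only the co-located profile reaches the fairness optimum n/k = 3.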
Hotelling – POA • Theorem 3.1 The price of anarchy in the Hotelling game is (2k – 2)/k • Proof • Since the optimum is n/k, it suffices to show that in any Nash equilibrium S all players gain at least n/(2k – 2) • Assume the contrary, that player i gains less than that in S • Consider player i “leaving” the game. The total payoff is still n, so the average payoff for the remaining players is now n/(k-1) • There must be at least one player h gaining at least the average, playing the vertex $v_h$ • Player i can assure n/(2k – 2) by moving to $v_h$, since he then splits h’s customers evenly with h • Contradiction to Nash equilibrium
Theorem 3.1 – cont’d • We are left with showing tightness • Consider a game with k-1 stars • k-1 players play at the centers of their own stars, and player k plays uniformly over all the star centers • This is a NE • The randomizing player earns n/(2k - 2) • [Figure: stars numbered 1, 2, 3, …, k-1]
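Hotelling – POTA • Let $o^t$ be the strategy of playing an arbitrary (uniformly random) strategy from the strategies in $s^t_{-i}$. • Define $\Delta^{t \to u}$ as the expected payoff of player i when he plays $o^t$ against $s^u_{-i}$ • Notice that $\Delta^{t \to t} \ge n/(2k-2)$, since when player i is removed, the rest have average payoff of $n/(k-1)$ • Lemma 3.4 For all i, for all $1 \le t, u \le T$, $\Delta^{t \to u} + \Delta^{u \to t} \ge n/(k-1)$. (Trivial for t = u)
Lemma 3.4 - Proof • Consider an imaginary $(2k-2)$-player game • Each player other than i is replicated twice: once as a time-t player and once as a time-u player, with strategies $s^t_{-i}$ and $s^u_{-i}$ • The total payoff is still n, so the average payoff is $n/(2k-2)$ • If player i replaces a random time-t player, his expected payoff is the average payoff of the time-t players • If we further remove the other time-t players, his payoff only improves, so $\Delta^{t \to u}$ is at least the average payoff of the time-t players
[Figure: The Imaginary Game, n=10, k=4, showing the time-t players, the time-u players, player i replacing a time-t player in the imaginary game, and player i replacing a time-t player with the other time-t players removed]
Lemma 3.4 – cont’d • The same argument holds for replacing a time-u player, so $\Delta^{u \to t}$ is at least the average payoff of the time-u players • The two averages add up to twice the overall average: $\Delta^{t \to u} + \Delta^{u \to t} \ge 2 \cdot \frac{n}{2k-2} = \frac{n}{k-1}$
Hotelling – POTA • Theorem 3.2 Each regret minimizing player has at least n/(2k-2) payoff (up to $\epsilon$) • Proof • Given a sequence of T plays, select a random time u • The average expected payoff had we played $o^u$ throughout is: $\frac{1}{T}\sum_{t=1}^{T} \Delta^{u \to t}$ • Averaging over the different u, we reach: $\frac{1}{T^2}\sum_{u=1}^{T}\sum_{t=1}^{T} \Delta^{u \to t}$
Hotelling – POTA • We reached $\frac{1}{T^2}\sum_{u,t} \Delta^{u \to t} = \frac{n}{2k-2} + \frac{1}{T^2}\sum_{u,t}\left(\Delta^{u \to t} - \frac{n}{2k-2}\right)$ • The second term is non-negative due to Lemma 3.4 (pair each $(u,t)$ with $(t,u)$) • There is a value of u that achieves at least the average • For that u, if player i mixes uniformly between the strategies in $s^u_{-i}$, he’ll achieve at least $n/(2k-2)$ • A regret minimizing player achieves this expected payoff (up to $\epsilon$)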
Hotelling – POTA • Corollary: The price of total anarchy in the Hotelling game is (2k-2)/k, matching the price of anarchy • Notice that in the proof we haven’t made any assumptions about how the other players behave, so the proof holds even in the presence of Byzantine players making arbitrary (or adversarial) decisions!
Generalized Hotelling Game • Notice that in the proof we have used only three features of the Hotelling game: • Constant sum – the sum of utilities is constant • Symmetric – the “names” of the stand owners don’t matter • Monotone – any player can “leave” the game and the sum does not change • We call such games, with the “fairness” social utility, generalized Hotelling games. • Theorem 3.6: In any k-player generalized Hotelling game, the price of total anarchy among regret minimizing players is (2k-2)/k, even in the presence of arbitrarily many Byzantine players.
Non-Convergence • Consider the game with: • Players {0, …, k-1} • k-1 n-vertex stars, with centers at $v_0, \dots, v_{k-2}$, and an isolated vertex $v_{k-1}$ • Consider • Each player’s payoff • No single vertex has expected payoff more than • No regrets • However, this is not Nash! Players at the isolated vertex will deviate! • [Figure: stars numbered 1, 2, 3, …, k-1 and an isolated vertex k]
Agenda • Preliminaries • Hotelling games • Valid games • Definition • Market sharing game • POA/POTA • Byzantine players • Atomic congestion games • Algorithmic efficiency
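Valid Games – Definitions • Consider a k-player maximization game • For each player, there is a groundset of actions $V_i$ • Player i plays $a_i \in A_i$ from some feasible set $A_i \subseteq 2^{V_i}$ • Definitions • Let $V = \bigcup_i V_i$ • The discrete derivative of $f$ at $X \subseteq V$ in the direction $D \subseteq V \setminus X$ is $f'_D(X) = f(X \cup D) - f(X)$ • The function $f$ is said to be submodular if for $A \subseteq B$, $f'_D(A) \ge f'_D(B)$ for all $D \subseteq V \setminus B$ • This should remind us of “concavity”: decreasing marginal utility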
Submodularity • Adding something to a smaller set makes a bigger difference • [Figure: sets A and B inside the ground set V, with items house, car, jacuzzi, villa, high-def]
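Submodularity can be tested by brute force on a small ground set. The sketch below checks the discrete-derivative condition for a coverage function, a standard submodular example that is an assumption here and not taken from the slides. All names are hypothetical.

```python
from itertools import combinations

def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def derivative(f, X, D):
    """Discrete derivative of f at X in the direction D."""
    return f(X | D) - f(X)

def is_submodular(f, ground):
    """Check f'_D(A) >= f'_D(B) for all A <= B and D disjoint from B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for D in subsets(ground - B):
                if derivative(f, A, D) < derivative(f, B, D):
                    return False
    return True

# Hypothetical coverage instance: each item covers some points.
COVERS = {"x": {1, 2}, "y": {2, 3}, "z": {3, 4}}

def coverage(S):
    """Number of distinct points covered by the items in S."""
    return len(set().union(*(COVERS[s] for s in S)))

def squared_size(S):
    """Increasing marginal gains, so not submodular."""
    return len(S) ** 2
```

`is_submodular(coverage, COVERS)` holds, while `squared_size` fails because adding an item to a larger set gains more.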
Valid Games – Definitions • We will notate $s^{<i}$ as the strategies of players with index smaller than i. We will also use both this and $s_i$ as complete strategies (as in applying $\gamma$ over them), meaning that the remaining players play the empty set • Definition 4.2: A game with private utility functions $\alpha_i$ and social utility function $\gamma$ is valid if: • $\gamma$ is submodular • For all i, s: $\alpha_i(s) \ge \gamma'_{s_i}(s \oplus \emptyset_i)$ - private fairness • For all s: $\sum_i \alpha_i(s) \le \gamma(s)$ - social fairness
Valid Games – Example • Market sharing game (Goemans et al., 2005) • Players are ISPs • Markets are towns • Each market has a price and a value • Each player can “enter” any market he has an edge towards, subject to a budget constraint • A player’s payoff per market is the value divided by the number of entrants • Sum social utility • Equivalently, the sum of the values of the entered markets • [Figure: bipartite graph of players and markets, with labels 5, 3, 9, 2]
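The equivalence between the sum of payoffs and the total value of the entered markets is easy to verify. In this sketch the market names and values are hypothetical, and prices and budgets are left out for brevity.

```python
from fractions import Fraction

# Hypothetical market values; prices and budget constraints are omitted.
VALUE = {"m1": 5, "m2": 3, "m3": 9, "m4": 2}

def payoffs(entries):
    """entries[i] is the set of markets player i enters.

    Each market's value is split evenly among the players who entered it.
    """
    entrants = {}
    for chosen in entries:
        for m in chosen:
            entrants[m] = entrants.get(m, 0) + 1
    return [sum(Fraction(VALUE[m], entrants[m]) for m in chosen)
            for chosen in entries]

def social_utility(entries):
    """Sum of the values of all markets entered by at least one player."""
    entered = set().union(*entries)
    return sum(VALUE[m] for m in entered)

entries = [{"m1", "m3"}, {"m3", "m4"}]
individual = payoffs(entries)       # 5 + 9/2 and 9/2 + 2
total = social_utility(entries)     # 5 + 9 + 2
```

Here the payoffs sum to exactly the social utility, so the social fairness condition holds with equality.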
Valid Games – Price of Anarchy • Vetta, 2002: In a valid game, if $s$ is a NE strategy, and $\Omega$ is the optimal strategy then: $OPT = \gamma(\Omega) \le 2\gamma(s) - \sum_i \gamma'_{s_i}(\Omega \cup s^{<i})$ • Corollary: if $\gamma$ is non-decreasing, then we have POA ≤ 2 (The derivatives are always non-negative) • Theorem 4.3, Corollary 4.2 (no proofs) POTA matches POA in valid games (up to $\epsilon$)
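Valid Games – Byzantine Players • Theorem 4.5 In a valid game with nondecreasing social welfare, if k players minimize regret with strategies $a^1, \dots, a^T$ while the Byzantine players play strategies $b^1, \dots, b^T$, the average social welfare is, for any $\epsilon > 0$ and large enough T: $\frac{1}{T}\sum_{t=1}^{T} \gamma(a^t \cup b^t) \ge \frac{OPT}{2} - \epsilon$ • Proof. Assume the contrary, $\frac{1}{T}\sum_{t=1}^{T} \gamma(a^t \cup b^t) < \frac{OPT}{2} - \epsilon$
Theorem 4.5 – cont’d • For every step t, with $\Omega$ the optimal play of the k players: • $OPT = \gamma(\Omega) \le \gamma(\Omega \cup a^t \cup b^t)$ (non-decreasing) • $= \gamma(a^t \cup b^t) + \sum_{i=1}^{k} \gamma'_{\Omega_i}(a^t \cup b^t \cup \Omega^{<i})$ (gradually inserting) • $\le \gamma(a^t \cup b^t) + \sum_{i=1}^{k} \gamma'_{\Omega_i}(a^t_{-i} \cup b^t)$ (submodularity) • $\le \gamma(a^t \cup b^t) + \sum_{i=1}^{k} \alpha_i(\Omega_i \oplus a^t_{-i} \oplus b^t)$ (private fairness)
[Figure: Gradual insertion, with sets A and B and items house, car, jacuzzi, villa]
Theorem 4.5 – cont’d • $OPT \le \frac{1}{T}\sum_t \gamma(a^t \cup b^t) + \frac{1}{T}\sum_t \sum_i \alpha_i(\Omega_i \oplus a^t_{-i} \oplus b^t)$ (summing over t) • $\frac{1}{T}\sum_t \sum_i \alpha_i(\Omega_i \oplus a^t_{-i} \oplus b^t) > \frac{OPT}{2} + \epsilon > \frac{1}{T}\sum_t \gamma(a^t \cup b^t) + 2\epsilon$ (assumption – the first term is less than half) • $\ge \frac{1}{T}\sum_t \sum_i \alpha_i(a^t \oplus b^t) + 2\epsilon$ (social fairness) • $\sum_i \frac{1}{T}\sum_t \left(\alpha_i(\Omega_i \oplus a^t_{-i} \oplus b^t) - \alpha_i(a^t \oplus b^t)\right) > 2\epsilon$ (rearranging sum)
Theorem 4.5 – cont’d • At least one player must match the average, so for him we have $\frac{1}{T}\sum_t \left(\alpha_i(\Omega_i \oplus a^t_{-i} \oplus b^t) - \alpha_i(a^t \oplus b^t)\right) > \frac{2\epsilon}{k}$ • Contradictory to regret minimization! • Note that the bound is compared to the old OPT (without the Byzantine players) • But it’s fair: Byzantine players may be acting even against their own interest, so we can’t say anything about them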
Agenda • Preliminaries • Hotelling games • Valid games • Atomic congestion games • Definition • Sum social utility – POTA • Makespan utility – Lower bounds • Algorithmic efficiency
Congestion Games • A congestion game is a minimization game, with k players • For each player, there is a set of facilities $V_i$ • Player i plays $a_i \in A_i$ from some feasible set $A_i \subseteq 2^{V_i}$ • In weighted games, player i has a weight $w_i$ • For unweighted games, we assume $w_i = 1$ • The load on facility e is defined as $l_e = \sum_{i : e \in a_i} w_i$ • Each facility e has an associated latency function $f_e$ • Player i playing $a_i$ experiences cost $\alpha_i(a) = \sum_{e \in a_i} f_e(l_e)$
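The load and cost definitions can be sketched in a few lines. The facility names and latency functions below are hypothetical and only illustrate the bookkeeping.

```python
def loads(profile, weights=None):
    """profile[i] is the set of facilities used by player i."""
    if weights is None:
        weights = [1] * len(profile)  # unweighted game
    load = {}
    for facilities, w in zip(profile, weights):
        for e in facilities:
            load[e] = load.get(e, 0) + w
    return load

def player_costs(profile, latency, weights=None):
    """Each player pays the latency of every facility he uses, at its load."""
    load = loads(profile, weights)
    return [sum(latency[e](load[e]) for e in facilities)
            for facilities in profile]

# Hypothetical two-facility instance.
latency = {"top": lambda x: x, "bottom": lambda x: 2 * x + 1}

shared = player_costs([{"top"}, {"top"}], latency)       # load 2 on "top"
split = player_costs([{"top"}, {"bottom"}], latency)     # load 1 on each
weighted = player_costs([{"top"}, {"top"}], latency, weights=[3, 1])
```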
Atomic Congestion Games • We’ll consider a specific kind of congestion game • Unweighted • Linear latencies – $f_e(x) = a_e x + b_e$ • We will use sum social utility: $\gamma(a) = \sum_i \alpha_i(a)$ • Previously known results: • POA for pure strategies is 2.5 (Awerbuch et al., 2005) • POA for mixed strategies is also 2.5 (Christodoulou and Koutsoupias, 2007) • Theorem 5.1: POTA in this setting is 2.5 • This implies the previously known upper bounds!
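A small instance shows that the ratio 2.5 is actually reached by a pure equilibrium. The three-player construction below is recalled from the literature on linear congestion games and is an assumption here, not something taken from the slides. Every facility has latency $f_e(x) = x$.

```python
from itertools import product

def strategies(i):
    """Player i's two strategies over facilities h0..h2 and g0..g2."""
    first = frozenset({f"h{i}", f"g{i}"})
    second = frozenset({f"g{(i + 1) % 3}", f"h{(i - 1) % 3}", f"h{(i + 1) % 3}"})
    return [first, second]

def costs(profile):
    """Per-player cost with identity latency: sum of loads on used facilities."""
    load = {}
    for facilities in profile:
        for e in facilities:
            load[e] = load.get(e, 0) + 1
    return [sum(load[e] for e in facilities) for facilities in profile]

def is_pure_nash(profile):
    """True if no player can strictly lower his cost by switching strategy."""
    current = costs(profile)
    for i in range(3):
        for alternative in strategies(i):
            deviated = list(profile)
            deviated[i] = alternative
            if costs(deviated)[i] < current[i]:
                return False
    return True

profiles = list(product(*(strategies(i) for i in range(3))))
opt = min(sum(costs(p)) for p in profiles)
worst_ne = max(sum(costs(p)) for p in profiles if is_pure_nash(p))
ratio = worst_ne / opt
```

When everyone plays the first strategy the total cost is 6. When everyone plays the second it is 15 and no player can strictly improve, so the ratio is 15/6 = 2.5.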
Theorem 5.1 – Proof • Let $a^*$ be the optimal play • Since we have no regret, for all i: $\sum_t \alpha_i(a^t) \le \sum_t \alpha_i(a^*_i \oplus a^t_{-i})$ • Summing over the players, and rearranging the sum: • Or more simply:
Theorem 5.1 – cont’d • Geometric mean is smaller than arithmetic mean, so: • Recall our equation (1) (2) (3)
Theorem 5.1 – cont’d • Multiplying both sides by two: • Further relaxing the inequality: • We’re done!
Parallel Link Congestion Game • Consider n identical links and k weighted players • Each player selects which link to use (single link) • Each player pays the sum of the weights on the link
Parallel Link Congestion Games – cont’d • Claim: In the parallel link congestion game with the social cost function being the maximum expected job latency, POTA is 2 • Proof. • Rescale the weights, so that OPT = 1 • The total weight is at most n, and each weight is at most 1 • The total latency over T plays is at most Tn, so there is at least one link e* with total latency at most T, that is, average latency $l(e^*) \le 1$ • A regret minimizing player will be competitive with moving to e* • We expect at most $l(e^*) + w_i \le 2$
Parallel Links Congestion Game – cont’d • The unweighted case with the sum social utility is called the “load balancing game” • It’s a special case of the atomic congestion games discussed before, thus we will have POA and POTA of at most 2.5 • If k >> n (a likely case) and the server speeds are relatively bounded, we can say even more • Theorem 5.6 (no proof): In this setting, POTA is 1 + o(1) • Corollary 5.7: In this setting, POA is 1 + o(1), even for mixed strategies