Recent progress in computing approximate Nash equilibria Paul W. Goldberg Dept. of Computer Science University of Liverpool
Nash equilibrium
2 players, each with a set of n pure strategies.
• For each pair (i, j), a payoff is specified for each player: R(i, j) for player 1 and C(i, j) for player 2.
• These payoffs can be placed into two n×n matrices R and C.
• We want probability distributions x and y over the players' strategies such that their expected payoffs cannot be increased by either player changing his distribution:
  xᵀRy ≥ x′ᵀRy for all distributions x′ over player 1's strategies
  xᵀCy ≥ xᵀCy′ for all distributions y′ over player 2's strategies
Nash equilibrium
Example: rock-paper-scissors, a zero-sum game. Player 1's payoff matrix R (player 2's is C = −R):

        R    P    S
   R    0   -1    1
   P    1    0   -1
   S   -1    1    0

The unique Nash equilibrium: both players mix uniformly, with probabilities (1/3, 1/3, 1/3).
Nash equilibrium
Example: a modified rock-paper-scissors, in which one payoff is raised from 1 to 2 (player 2's payoff when his paper meets player 1's rock). The game is no longer zero-sum, and the equilibrium is no longer uniform: the mixture probabilities become 1/3, 5/12 and 1/4 rather than 1/3 each. (Thanks to Rahul Savani's on-line NE program.)
Computing Nash equilibria
• Some pre-history: Nash equilibria are "hard" (PPAD-complete) to compute exactly
• But there are notions of approximate NE… (ε-Nash equilibrium)
• So, for what values of ε can we compute approximate NE?
• (obvious analogy with approximation algorithms for NP-complete problems)
ε-Nash equilibrium
• exact NE: "no incentive to deviate"
• ε-NE: gain of at most ε when you deviate
• let x and y denote the row and column players' mixed strategies; let eᵢ be the vector with 1 in component i, zero elsewhere
• For all i, xᵀRy ≥ eᵢᵀRy − ε.
• For all j, xᵀCy ≥ xᵀCeⱼ − ε.
• Assume payoffs are re-scaled into [0,1].
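To make the definition concrete, here is a minimal sketch (Python/NumPy; the talk itself contains no code, and the function name is mine) of testing whether a given pair (x, y) is an ε-NE:

```python
import numpy as np

def is_eps_nash(R, C, x, y, eps):
    """Test whether (x, y) is an eps-Nash equilibrium of the bimatrix game
    (R, C), payoffs assumed re-scaled into [0, 1]."""
    row_regret = np.max(R @ y) - x @ R @ y   # max_i e_i'Ry  minus  x'Ry
    col_regret = np.max(x @ C) - x @ C @ y   # max_j x'Ce_j  minus  x'Cy
    return row_regret <= eps and col_regret <= eps

# Rock-paper-scissors, re-scaled into [0, 1]: uniform play is an exact NE.
R = (np.array([[0., -1., 1.], [1., 0., -1.], [-1., 1., 0.]]) + 1) / 2
C = 1 - R                              # zero-sum after re-scaling
u = np.ones(3) / 3
print(is_eps_nash(R, C, u, u, 1e-9))   # True
```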
A simple algorithm [Daskalakis, Mehta and Papadimitriou, WINE 2006]
(illustrated in the talk on a 3×3 example bimatrix)
● 1. Player 1 chooses an arbitrary strategy i and gives it probability ½.
● 2. Player 2 chooses a best response j to i and gives it probability 1.
● 3. Player 1 chooses a best response k to j and gives it the remaining probability ½.
Since payoffs lie in [0,1], each player's regret is at most ½: k is a best response to all of player 2's probability mass, and j is a best response to half of player 1's. So the result is a ½-NE.
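A minimal sketch of this procedure, continuing the Python above (the function name `dmp_half_approx` is mine, and the "arbitrary" initial strategy is fixed to 0):

```python
def dmp_half_approx(R, C, i=0):
    """The simple algorithm: returns a pair (x, y) that is a 1/2-NE."""
    n, m = R.shape
    j = int(np.argmax(C[i, :]))    # player 2's pure best response to i
    k = int(np.argmax(R[:, j]))    # player 1's pure best response to j
    x = np.zeros(n)
    x[i] = 0.5
    x[k] += 0.5                    # handles the case k == i
    y = np.zeros(m)
    y[j] = 1.0
    return x, y

x, y = dmp_half_approx(R, C)
print(is_eps_nash(R, C, x, y, 0.5))   # True: a 1/2-NE by construction
```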
Can we improve this algorithm? (i.e., is there an "incremental" improvement?) E.g., player 1 did not choose a "good" strategy to begin with…
No! [Feder, Nazerzadeh and Saberi, EC 2007]: To get a better approximation than ½, strategies need support of size Θ(log n), where n is the number of strategies.
Proof sketch:
● 1. Consider a zero-sum win-lose n×n game chosen u.a.r. (each payoff entry is 0 or 1, independently and uniformly).
● 2. If player 1 uses the uniform distribution, he gets payoff about ½, whatever player 2 does…
● 3. If player 1 uses only one strategy, player 2's best response leaves him with nothing!
● 4. Indeed, if player 1 mixes just 2 strategies, w.h.p. player 2 has a response that leaves player 1 with nothing…
Similarly for any constant-sized support, and indeed for any support smaller than (say) log(n)/2, in general.
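A small numerical illustration of the phenomenon (not from the talk; the parameters are arbitrary). In a random win-lose game, w.h.p. some column is all-zero on any fixed small support, while the uniform distribution earns about ½ against every column:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
A = rng.integers(0, 2, size=(n, n)).astype(float)   # player 1's payoffs, u.a.r. 0/1

# Uniform mixing: even player 2's most harmful column leaves payoff near 1/2.
print(np.min(np.ones(n) / n @ A))

# Small supports (the first k rows stand in for an arbitrary k-subset):
# w.h.p. some column has only zeros on the support, so the best response pays 0.
for k in (1, 2, 5):
    print(k, np.min(np.ones(k) / k @ A[:k, :]))
```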
How big a support do you need?
• O(log n) is also an upper bound (for any constant ε) [Althöfer 1994; Lipton, Markakis and Mehta, EC 2003 (extended the result from 2-player to multi-player games)]
• Define an "empirical NE" as follows: draw N samples from x and y; replace x, y with the resulting empirical distributions x̂ and ŷ.
Example
If N = 100, an empirical NE for rock-paper-scissors might look like this: one player plays (R, P, S) with probabilities (0.36, 0.29, 0.35), the other with probabilities (0.27, 0.30, 0.43), rather than exactly (1/3, 1/3, 1/3) each.
From player 1's perspective, suppose player 2 replaces y with an empirical distribution ŷ based on N = O(log(n)/ε²) samples. With high probability, every pure strategy i gets about the same payoff as before: eᵢᵀRŷ = eᵢᵀRy + O(ε). Moreover ŷ has support of size at most N = O(log(n)/ε²), so if we do the same thing with x we get the desired result.
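A quick sketch of the sampling step (Python as before; the instance and parameters are made up). The payoffs of all n pure strategies against the empirical distribution concentrate around their payoffs against the original one:

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 500, 0.1
R = rng.random((n, n))                 # an arbitrary payoff matrix in [0, 1]
y = rng.dirichlet(np.ones(n))          # some mixed strategy for player 2

N = int(np.log(n) / eps**2)            # sample size of the order of the LMM bound
samples = rng.choice(n, size=N, p=y)
y_hat = np.bincount(samples, minlength=n) / N   # empirical dist., support <= N

# Largest change in any pure strategy's payoff: O(eps) with high probability.
print(np.max(np.abs(R @ y_hat - R @ y)))
```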
Support enumeration
Note that it follows that for any ε we can find an ε-NE in time n^O(log(n)/ε²), quasi-polynomial for constant ε, by exhaustively checking all small supports. (This was pointed out in the Lipton et al. paper; another context where support enumeration "works" is on randomly-generated games [Bárány, Vempala and Vetta, FOCS '05].)
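One way to read that as an algorithm (a sketch under my reading of the argument, with illustrative constants, not code from the cited papers): a k-uniform ε-NE, i.e. one whose probabilities are all multiples of 1/k, exists for k = O(log(n)/ε²), so enumerating all k-uniform pairs and testing each with `is_eps_nash` from above must succeed:

```python
from itertools import combinations_with_replacement
import numpy as np

def support_enumeration_eps_nash(R, C, eps):
    """Exhaustive search over k-uniform strategy pairs; n^O(log n / eps^2) time.
    Only feasible for tiny n -- an illustration, not a practical method."""
    n = R.shape[0]
    k = max(1, int(np.ceil(np.log(max(n, 2)) / eps**2)))  # constants omitted
    multisets = list(combinations_with_replacement(range(n), k))
    for rows in multisets:
        x = np.bincount(rows, minlength=n) / k
        for cols in multisets:
            y = np.bincount(cols, minlength=n) / k
            if is_eps_nash(R, C, x, y, eps):
                return x, y
```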
Breaking the ε=½ Barrier [Bosse, Byrka and Markakis, WINE 07]
Recall that player 1's initial strategy i may be poor; and we now know that switching to an alternative pure strategy won't necessarily help.
The original game is (R, C); solve the zero-sum game (R−C, C−R); let x₀ and y₀ be player 1's and player 2's strategies in the solution.
Let α be a parameter of the algorithm; if x₀ and y₀ form an α-NE, use them, else continue…
Let j be player 2's best response to x₀; player 2 uses pure strategy j. (BTW, assume player 2's regret in (x₀, y₀) is at least player 1's.)
Let k be player 1's pure best response to j; player 1 uses a mixture of x₀ and k. The mixture coefficient of k is (1−r)/(2−r), where r is player 2's (i.e. the larger) regret in the solution to the zero-sum game.
The optimal choice of α is (3−√5)/2 = 0.382… .
Comment: why does this work? When player 2 changes his mind (from using y₀) he is to some extent helping player 1; y₀ arose from a game in which player 2 tries to hurt player 1 as well as help himself.
In the paper they tweak the algorithm to reduce the ε-value down to 0.364. In fact, a previous paper had already obtained 0.384+ζ…
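The first step, solving the zero-sum game (R−C, C−R), is a standard linear program; here is a minimal sketch with SciPy (`solve_zero_sum` is my name; by the minimax theorem, y₀ comes from the other player's maximin problem):

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Maximin strategy and game value for the row player of the zero-sum
    game with row-player payoff matrix A."""
    n, m = A.shape
    # Variables (x_1..x_n, v): maximize v s.t. (A^T x)_j >= v for every column j.
    c = np.zeros(n + 1)
    c[-1] = -1.0                               # linprog minimizes, so use -v
    A_ub = np.hstack([-A.T, np.ones((m, 1))])  # v - (A^T x)_j <= 0
    A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)  # sum(x) = 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n], res.x[-1]

# For the game (R, C) at hand (R, C as in the earlier sketches):
x0, _ = solve_zero_sum(R - C)          # player 1's strategy in (R-C, C-R)
y0, _ = solve_zero_sum((C - R).T)      # player 2's strategy, by symmetry
```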
0.384+ζ approximation [Daskalakis, Mehta and Papadimitriou, EC 2007]
General idea: construct an LP that is satisfied by approximate solutions (x, y) to the game (R, C).
Suppose (x*, y*) is a NE with payoffs v₁, v₂ to players 1 and 2 respectively, and suppose (x, y) is an empirical NE for N = 4/ζ². We can assume we have been given v₁, v₂ and (x, y).
● 1. Check that xᵀRy ≈ v₁ (and similarly for the column player).
● 2. Find (x′, y′) that satisfy
  xᵀRy′ ≥ v₁ − 3ζ/2
  for all i, eᵢᵀRy′ ≤ v₁ + ζ/2
  x′ᵀRy ≥ v₁ − 3ζ/2
  plus a similar set of constraints for the C matrix.
● 3. If max(v₁, v₂) ≥ ⅓ then return a certain mixture of x with x′ and of y with y′; else return (x′, y′).
If v₁ and v₂ are both < ⅓, the constraints ensure that there is not too much to gain by defecting to any pure strategy. If v₁ (say) is at least ⅓, they ensure that the mixture distribution has a good performance.
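Step 2 is a linear feasibility problem in (x′, y′) once v₁, v₂ and (x, y) are fixed. A rough sketch of the constraint structure (my reading of the constraints above, reusing `linprog` from the earlier sketch, not the paper's exact program); variables are stacked as z = (x′, y′):

```python
def step2_feasibility(R, C, x, y, v1, v2, zeta):
    """Search for (x', y') satisfying the step-2 constraints; None if infeasible."""
    n = R.shape[0]
    Z = np.zeros(n)
    A_ub, b_ub = [], []
    for M, v in ((R, v1), (C, v2)):
        A_ub.append(np.concatenate([Z, -(x @ M)]))   # x M y' >= v - 3*zeta/2
        b_ub.append(-(v - 1.5 * zeta))
        A_ub.append(np.concatenate([-(M @ y), Z]))   # x' M y >= v - 3*zeta/2
        b_ub.append(-(v - 1.5 * zeta))
    for i in range(n):
        A_ub.append(np.concatenate([Z, R[i, :]]))    # e_i R y' <= v1 + zeta/2
        b_ub.append(v1 + 0.5 * zeta)
        A_ub.append(np.concatenate([C[:, i], Z]))    # x' C e_i <= v2 + zeta/2
        b_ub.append(v2 + 0.5 * zeta)
    A_eq = np.vstack([np.concatenate([np.ones(n), Z]),    # x' is a distribution
                      np.concatenate([Z, np.ones(n)])])   # y' is a distribution
    res = linprog(np.zeros(2 * n), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=np.ones(2), bounds=[(0, None)] * (2 * n))
    return (res.x[:n], res.x[n:]) if res.success else None
```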
Conclusions • The algorithms are not randomized, but the analysis often uses randomness • plenty of open problems…