Bilinear Games: Polynomial Time Algorithms for Rank Based Subclasses. Ruta Mehta, Indian Institute of Technology, Bombay. Joint work with Jugal Garg and Albert X. Jiang.
Rock-Paper-Scissors: A Play • The winner gets $1.
Bimatrix Game • Strategy sets: $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$. • Payoff matrices: $A$ for player 1, $B$ for player 2. • Steady state: no player gains by unilateral deviation.
Bimatrix Game • Strategy sets: $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$, with payoff matrices $A$, $B$. • Rock-Paper-Scissors has no steady state in pure strategies.
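Mixed Play • $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$. • $\Delta_1 = \{(r_1, p_1, c_1) \ge 0 : r_1 + p_1 + c_1 = 1\}$, $\Delta_2 = \{(r_2, p_2, c_2) \ge 0 : r_2 + p_2 + c_2 = 1\}$. • Payoff matrices $A$, $B$. • With mixed play, a steady state exists.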
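The two slides above can be checked directly. The sketch below assumes the usual payoffs (the winner gains 1 and the loser pays 1, so $B = -A$); the slides only state that the winner gets $1, and the helper names are illustrative. It enumerates pure profiles to confirm that none is a steady state, then checks that uniform mixing is one.

```python
from fractions import Fraction
from itertools import product

# Row player's payoffs; order is (Rock, Paper, sCissors), matching S1 = S2 = {R, P, C}.
# Assumption: the winner gains 1 and the loser pays 1, so B = -A.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
B = [[-a for a in row] for row in A]

def payoff(M, x, y):
    """Expected payoff x^T M y for mixed strategies x and y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(3) for j in range(3))

def pure(i):
    """Pure strategy i written as a point of the simplex."""
    return [Fraction(int(k == i)) for k in range(3)]

def is_nash(x, y):
    """True if no pure deviation improves either player's payoff.

    Checking pure deviations suffices because payoffs are linear in
    each player's own strategy.
    """
    u1, u2 = payoff(A, x, y), payoff(B, x, y)
    ok1 = all(payoff(A, pure(i), y) <= u1 for i in range(3))
    ok2 = all(payoff(B, x, pure(j)) <= u2 for j in range(3))
    return ok1 and ok2

pure_equilibria = [(i, j) for i, j in product(range(3), repeat=2)
                   if is_nash(pure(i), pure(j))]
uniform = [Fraction(1, 3)] * 3

print(pure_equilibria)            # list of pure steady states
print(is_nash(uniform, uniform))  # is uniform mixing a steady state?
```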
John Nash (1951) • Finite game: finitely many players, each with finitely many strategies. • Nash: every finite game has a steady state in mixed strategies, henceforth called a Nash equilibrium (NE). • Proved using the Kakutani fixed point theorem: highly non-constructive.
Nash Equilibrium Computation • Papadimitriou (JCSS'94): the class PPAD, for problems where existence of a solution is guaranteed, such as fixed points, Sperner's Lemma, and Nash equilibrium. • Chen and Deng (FOCS'06): computing a NE of a two-player game is PPAD-hard. • Chen, Deng and Teng (CDT, FOCS'06): even approximation is PPAD-hard.
Rank and Computation • Kannan and Theobald (SODA'07): define the rank of $(A, B)$ as $\mathrm{rank}(A+B)$; FPTAS for fixed rank games. • Polynomial time algorithms for exact Nash: Dantzig (1963): zero-sum (rank-0) is equivalent to LP; AGMS (STOC'11): rank-1 games.
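To make the rank definition concrete, here is a small sketch that computes $\mathrm{rank}(A+B)$ with exact arithmetic. The function names and the vectors `a`, `b` are illustrative choices, not from the slides; the rank-1 example is built as $B = -A + a\,b^{T}$.

```python
from fractions import Fraction

def matrix_rank(M):
    """Rank via Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank position.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from all rows below the pivot.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [u - factor * v for u, v in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def game_rank(A, B):
    """Rank of the game (A, B), defined as rank(A + B)."""
    S = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(A, B)]
    return matrix_rank(S)

# Zero-sum example (B = -A): A + B is the zero matrix.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B_zero_sum = [[-v for v in row] for row in A]

# Hypothetical rank-1 example: B = -A + a b^T for illustrative vectors a, b.
a, b = [1, 2, 3], [1, 0, 2]
B_rank1 = [[-A[i][j] + a[i] * b[j] for j in range(3)] for i in range(3)]

print(game_rank(A, B_zero_sum), game_rank(A, B_rank1))
```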
Bilinear Games • Bimatrix game with polyhedral strategy sets. • Two players: 1 and 2. • Polyhedral strategy sets: $X = \{x \mid Ex = e,\ x \ge 0\}$, $Y = \{y \mid Fy = f,\ y \ge 0\}$. • Payoff matrices: $A$, $B$. • Bilinear payoff: $(x, y)$ fetches $x^{T}Ay$ to player 1 and $x^{T}By$ to player 2. • Motivation: Koller et al. (STOC'94), for two-player extensive form games with perfect recall.
Nash Equilibrium in Bilinear Games • NE: no player gains by unilateral deviation. • Existence: corollary of Glicksberg's result. • Symmetric game: $B = A^{T}$ and $Y = X$. • $(x, y)$ is a symmetric profile if $y = x$. • Existence of symmetric NE: an adaptation of Nash's proof for symmetric bimatrix games.
Bilinear Contains: • Bimatrix, polymatrix, Bayesian, etc. • Bimatrix: $X = \Delta_1$, $Y = \Delta_2$. • Polymatrix: $n$ players; each pair plays a bimatrix game. Player $i$ has a finite strategy set $S_i$ and mixed strategy set $\Delta_i$. Goal of $i$: choose $x_i$ from $\Delta_i$ to maximize total payoff. [Figure: players $i$ and $j$ joined by an edge labeled $A_{ij}$.]
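Polymatrix to Bilinear • $M = |S_1| + \dots + |S_n|$. • $X = \{(x_1, \dots, x_n) \mid x_i \in \Delta_i\}$, $Y = X$. • $A$ is an $M \times M$ matrix [shown on the slide, with block rows indexed by $i$ and block columns by $j$], and $B = A^{T}$. • A symmetric NE of $(A, B)$ maps to a NE of the polymatrix game.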
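The matrix on this slide did not survive extraction. A natural reading, assumed here and not confirmed by the text, is that $A$ is the block matrix whose $(i, j)$ block is the pairwise payoff matrix $A_{ij}$, with zero diagonal blocks. The sketch below builds that matrix for a made-up three-player instance and checks that, at a stacked profile $x$, the bilinear value $x^{T}Ax$ equals the sum of the players' polymatrix payoffs. All names and numbers are illustrative.

```python
from fractions import Fraction

def build_bilinear_matrix(pair_payoffs, sizes):
    """Stack pairwise matrices A_ij into one M x M block matrix (assumed layout).

    Block (i, j) holds A_ij; diagonal blocks stay zero.
    """
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    M = sum(sizes)
    A = [[0] * M for _ in range(M)]
    for (i, j), Aij in pair_payoffs.items():
        for r, row in enumerate(Aij):
            for c, v in enumerate(row):
                A[offsets[i] + r][offsets[j] + c] = v
    return A, offsets

def player_payoff(i, profile, pair_payoffs):
    """Polymatrix payoff of player i: sum over j of x_i^T A_ij x_j."""
    total = 0
    for (p, j), Aij in pair_payoffs.items():
        if p == i:
            total += sum(profile[i][r] * Aij[r][c] * profile[j][c]
                         for r in range(len(Aij)) for c in range(len(Aij[0])))
    return total

# Made-up instance: 3 players with 2 strategies each.
sizes = [2, 2, 2]
pair_payoffs = {
    (0, 1): [[1, 0], [0, 2]], (1, 0): [[2, 1], [0, 1]],
    (0, 2): [[0, 3], [1, 0]], (2, 0): [[1, 1], [2, 0]],
    (1, 2): [[2, 0], [0, 1]], (2, 1): [[0, 1], [3, 0]],
}
profile = [[Fraction(1, 2), Fraction(1, 2)],
           [Fraction(1, 4), Fraction(3, 4)],
           [Fraction(1), Fraction(0)]]

A, offsets = build_bilinear_matrix(pair_payoffs, sizes)
x = [v for xi in profile for v in xi]  # stacked vector (x_1, ..., x_n) in X
bilinear_value = sum(x[r] * A[r][c] * x[c]
                     for r in range(len(x)) for c in range(len(x)))
polymatrix_total = sum(player_payoff(i, profile, pair_payoffs) for i in range(3))

print(bilinear_value == polymatrix_total)
```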
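Best Response (Koller et al.) • Fix a strategy $y$ of player 2. • Player 1 solves the LP: max $x^{T}(Ay)$ s.t. $Ex = e$, $x \ge 0$. Its dual: min $e^{T}p$ s.t. $p^{T}E \ge (Ay)^{T}$. • At optimal: $\exists p$ s.t. $A_i y \le p^{T}E_i$ for all $i$, and $x_i > 0 \Rightarrow A_i y = p^{T}E_i$. • Given $x \in X$, for player 2 we get, at optimal: $\exists q$ s.t. $x^{T}B_j \le q^{T}F_j$ for all $j$, and $y_j > 0 \Rightarrow q^{T}F_j = x^{T}B_j$. • Here $A_i$ is row $i$ of $A$, $B_j$ is column $j$ of $B$, and $E_i$, $F_j$ are columns of $E$ and $F$.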
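In the bimatrix special case ($X = \Delta_1$) the constraint $Ex = e$ is the single row $x_1 + \dots + x_m = 1$, so the dual variable $p$ is one number and the dual optimum is $p = \max_i (Ay)_i$. The sketch below works this out for Rock-Paper-Scissors against an arbitrary, illustrative $y$ (payoffs assumed to be +1 for a win and -1 for a loss) and checks that the primal and dual values agree on a best response.

```python
from fractions import Fraction

# Rock-paper-scissors payoffs for player 1 (assumed: win +1, lose -1).
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
# Illustrative fixed strategy of player 2 (not from the slides).
y = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# Payoff of each pure strategy of player 1 against y.
Ay = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]

# Dual optimum: the smallest p with p >= (Ay)_i for every i.
p = max(Ay)

# Complementary slackness: x_i > 0 only where (Ay)_i = p.
best = [i for i in range(3) if Ay[i] == p]
x = [Fraction(0)] * 3
x[best[0]] = Fraction(1)

# Primal value x^T (A y); it should match the dual value e^T p = p.
primal = sum(x[i] * Ay[i] for i in range(3))

print(Ay, p, best, primal)
```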
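Best Response Polytopes (BRPs) • $(x, y)$ is a NE iff $\exists p$: $Ay \le E^{T}p$ and $x_i > 0 \Rightarrow A_i y = p^{T}E_i$; and $\exists q$: $x^{T}B \le q^{T}F$ and $y_j > 0 \Rightarrow q^{T}F_j = x^{T}B_j$. • For feasible points: $x^{T}(Ay - E^{T}p) \le 0$ and $(x^{T}B - q^{T}F)y \le 0$, so $x^{T}(A+B)y - e^{T}p - f^{T}q \le 0$.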
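The combined inequality follows from the two sign conditions in a few steps. The derivation below is a reconstruction from the constraints stated on the slides; note that the last term comes out as $f^{T}q$, because $q^{T}Fy = q^{T}f$ when $Fy = f$.

```latex
\begin{align*}
x^{T}(Ay - E^{T}p) &\le 0 && (x \ge 0,\ Ay \le E^{T}p)\\
(x^{T}B - q^{T}F)\,y &\le 0 && (y \ge 0,\ x^{T}B \le q^{T}F)\\
x^{T}(A+B)y - (Ex)^{T}p - q^{T}(Fy) &\le 0 && \text{(sum of the two lines above)}\\
x^{T}(A+B)y - e^{T}p - f^{T}q &\le 0 && (Ex = e,\ Fy = f)
\end{align*}
```

Equality holds exactly when both complementarity conditions hold, which is the NE condition.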
Nash Equilibrium in BRPs • NE iff $x^{T}(Ay - E^{T}p) = 0$ and $(x^{T}B - q^{T}F)y = 0$, i.e. $x^{T}(A+B)y - e^{T}p - f^{T}q = 0$. • Assumption: $P$ and $Q$ are non-degenerate. • If $(u, v) \in P \times Q$ gives a NE, then $(u, v)$ is a vertex.
QP Formulation • max $x^{T}(A+B)y - e^{T}p - f^{T}q$ s.t. $(y, p) \in P$, $(x, q) \in Q$. • Optimal value is 0. • Only vertex solutions.
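As a sanity check on the claim that the optimal value is 0, the sketch below evaluates the QP objective in the bimatrix special case, where $e = f = 1$ and $p$, $q$ are single numbers. It plugs in the tightest feasible $p = \max_i (Ay)_i$ and $q = \max_j (x^{T}B)_j$. The 2x2 game is an illustrative coordination game, not one from the slides.

```python
from fractions import Fraction

def qp_objective(A, B, x, y):
    """x^T(A+B)y - p - q with the tightest feasible p, q (bimatrix case, e = f = 1)."""
    m, n = len(A), len(A[0])
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    xB = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    p, q = max(Ay), max(xB)  # smallest p, q with Ay <= p and x^T B <= q
    total = sum(x[i] * (A[i][j] + B[i][j]) * y[j]
                for i in range(m) for j in range(n))
    return total - p - q

# Illustrative 2x2 coordination game.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]

at_pure_ne = qp_objective(A, B, [1, 0], [1, 0])   # both pick the first strategy
at_non_ne = qp_objective(A, B, [1, 0], [0, 1])    # miscoordinated profile
at_mixed_ne = qp_objective(A, B,
                           [Fraction(2, 3), Fraction(1, 3)],
                           [Fraction(1, 3), Fraction(2, 3)])

print(at_pure_ne, at_non_ne, at_mixed_ne)
```

The objective is 0 at the two equilibria and strictly negative at the miscoordinated profile, matching the inequality that the objective never exceeds 0 on feasible points.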
Our Results • Rank-1 games: rank(A+B)=1 • Extend Adsul et al. algorithm for exact NE. • Fixed rank games: rank(A+B)=k • Extend FPTAS of Kannan et al. • Rank of A or B is constant • Enumerate all NE in polynomial time.
Rank-1 Case • Zero-sum, i.e. $\mathrm{rank}(A+B) = 0$: LP formulation (Charnes'53). • If $\mathrm{rank}(A+B) = 1$ then $A + B = a\,b^{T}$. • The QP formulation becomes: max $(x^{T}a)(b^{T}y) - e^{T}p - f^{T}q$ s.t. $(y, p) \in P$, $(x, q) \in Q$.
Rank-1 Case • Replace $(x^{T}a)$ by $z$. Recall $B = -A + a\,b^{T}$. • The condition $x^{T}(A+B)y - e^{T}p - f^{T}q = 0$ becomes $z(b^{T}y) - e^{T}p - f^{T}q = 0$. • $N$ = points of $P \times Q'$ with $z(b^{T}y) - e^{T}p - f^{T}q = 0$. • $N$ forms paths and cycles, since $z$ gives one degree of freedom. • NE of $(A, B)$: points in the intersection of $N$ and $z - x^{T}a = 0$.
Parameterized LP • LP($z$): max $z(b^{T}y) - e^{T}p - f^{T}q$ s.t. $(y, p) \in P$, $(x, z, q) \in Q'$. • Given any $c$, the optimal value of LP($c$) is 0. • OPT($c$) lies on $N$; letting $N(c)$ = {points of $N$ with $z = c$}, OPT($c$) = $N(c)$. • $N$ is a single path on which $z$ is monotonic.
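Rank-1: The Algorithm • NE: intersection of $N$ and the hyperplane $H$: $z - x^{T}a = 0$. • $c_1 = a_{\min}$, $c_2 = a_{\max}$. [Figure: path $N$ and hyperplane $H$ with sides $H^{+}$ and $H^{-}$; marked points $N(c_1)$, $N(c_2)$ and the NE.]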
Rank-1: Binary Search Algorithm • NE of $(A, B)$: points in the intersection of $N$ and $H$. • $c = (c_1 + c_2)/2$. If $N(c)$ is in $H^{-}$, then set $c_1 = c$; else set $c_2 = c$. [Figure: path $N$ and hyperplane $H$ with sides $H^{+}$ and $H^{-}$; marked points $N(c_1)$, $N(c)$, $N(c_2)$ and the NE.]
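The control flow of the binary search can be sketched as follows. This is only a skeleton under stated assumptions: `x_dot_a_at` is a hypothetical stand-in for "solve LP($c$), take the optimum $N(c)$, and return $x^{T}a$ at that point"; $H^{-}$ is assumed to be the side where $z - x^{T}a < 0$; and a numeric tolerance replaces the exact termination argument. The toy oracle is purely illustrative and is not derived from any game.

```python
def binary_search_on_path(x_dot_a_at, c1, c2, tol=1e-9, max_iter=200):
    """Bisect on z = c until N(c) (approximately) meets H: z - x^T a = 0.

    x_dot_a_at(c) is a placeholder oracle: in the slides' algorithm it would
    solve LP(c) and report x^T a at the optimum N(c).
    """
    for _ in range(max_iter):
        c = (c1 + c2) / 2
        gap = c - x_dot_a_at(c)  # sign tells which side of H the point N(c) is on
        if abs(gap) <= tol:
            return c
        if gap < 0:              # N(c) in H- (assumed: z - x^T a < 0): raise c1
            c1 = c
        else:                    # N(c) in H+: lower c2
            c2 = c
    return (c1 + c2) / 2

def toy_oracle(c):
    """Illustrative stand-in for x^T a at N(c); crosses H at c = 2/3."""
    return 1.0 - c / 2

root = binary_search_on_path(toy_oracle, 0.0, 1.0)
print(round(root, 6))
```

Starting from $c_1 = a_{\min}$ and $c_2 = a_{\max}$ keeps the two endpoints on opposite sides of $H$, which is what makes the interval halving valid.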
Analysis • Terminates because: $z$ is monotonic on $N$; the increase in $z$ on each edge is lower bounded by $1/d$, where $d$ is polynomial sized in the input. • Time complexity: solve LP($c$) to get $N(c)$ in each pivot; $\log(d) \cdot \log(a_{\max} - a_{\min})$ pivots.
Conclusions • Bilinear games: bimatrix with polytopal strategy sets; fairly general, containing polymatrix, Bayesian, etc. • Polynomial time algorithms for rank based subclasses. • Open problems: designing a Lemke-Howson type algorithm; degree, index, stability concepts; computation of approximate equilibrium.