A Polynomial Time Algorithm for 2-Player Rank 1 Games

A Polynomial Time Algorithm for 2-Player Rank 1 Games. Ruta Mehta. Based on joint work with Bharat Adsul, Jugal Garg and Milind Sohoni.

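Two-player game: Alice chooses among m strategies (rows), Bob among n strategies (columns); A and B are the m × n payoff matrices of Alice and Bob.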

Mixed strategy (randomize): each player plays a probability distribution over his or her own strategies. Goal: maximize expected payoff.

Nash Equilibrium. John Nash (1951): Given a finite game, there exists a tuple of mixed-strategy vectors, one for each player, such that no player gains by deviating unilaterally.

Computation. k-Nash: computing a Nash equilibrium of a finite k-player game. 2-Nash: rational solutions; PPAD-complete (Papadimitriou’92, DGP’06, CD’06). k-Nash, $k \ge 3$: algebraic, possibly irrational solutions (Nash’51); FIXP-complete (Etessami & Yannakakis’07).

2-Nash • von Neumann (1928): In zero-sum games ($A + B = 0$), min-max strategies are stable. • Linear programming duality: in P (Dantzig’51, Adler’10). • Kannan & Theobald (2005): defined the rank of a game (A, B) as rank(A+B). • Zero-sum is exactly rank 0. FPTAS for fixed-rank games. • Can we solve rank 1 games in polynomial time?

Difficulty in Rank 1 Games. Disconnected set of equilibria (KT’05). Exponentially many disconnected equilibria (von Stengel’12).

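2-Nash. Alice: m strategies, payoff matrix A. Bob: n strategies, payoff matrix B. • Mixed strategies: probability distribution vectors $x \in \Delta_m$ for Alice and $y \in \Delta_n$ for Bob.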

2-Nash • A pair of mixed strategies $(x, y)$, $x$ for Alice and $y$ for Bob, is a Nash equilibrium iff • $x$ fetches the best payoff to Alice against $y$, • $y$ fetches the best payoff to Bob against $x$.

2-Nash Characterization. Suppose Bob plays $y$. Then for Alice: • Expected payoff from her $i$-th strategy is $(Ay)_i$. • Best strategies are $\arg\max_i (Ay)_i$. • Her mixed strategy $x$ achieves max payoff iff $x_i > 0 \Rightarrow (Ay)_i = \max_k (Ay)_k$ for all $i$. Related to complementarity.

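For Bob: fixing $x$ for Alice, Bob’s payoff from his $j$-th strategy is $(x^T B)_j$. Best strategies are $\arg\max_j (x^T B)_j$. His mixed strategy $y$ achieves max payoff iff $y_j > 0 \Rightarrow (x^T B)_j = \max_k (x^T B)_k$ for all $j$. Related to complementarity.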

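For mixed strategies $x$ of Alice and $y$ of Bob, define $\pi_1 = \max_i (Ay)_i$ and $\pi_2 = \max_j (x^T B)_j$. Complementarity: $x_i\,(\pi_1 - (Ay)_i) = 0$ for all $i$, and $y_j\,(\pi_2 - (x^T B)_j) = 0$ for all $j$. Complementarity $\Leftrightarrow$ Nash equilibrium.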
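The complementarity characterization can be checked numerically. The following is a minimal sketch, not code from the talk: the function name `is_nash`, the tolerance, and the example game are illustrative assumptions, with `A` and `B` the payoff matrices of Alice and Bob as nested lists and `x`, `y` assumed to be valid probability vectors.

```python
def is_nash(A, B, x, y, tol=1e-9):
    """Check complementarity for mixed strategies x (Alice) and y (Bob).

    A and B are m-by-n payoff matrices as nested lists (assumed layout:
    rows are Alice's strategies, columns are Bob's).
    """
    m, n = len(A), len(A[0])
    # Alice's expected payoff from each pure strategy i against y: (A y)_i
    alice = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    # Bob's expected payoff from each pure strategy j against x: (x^T B)_j
    bob = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    pi1, pi2 = max(alice), max(bob)
    # A strategy played with positive probability must be a best one.
    alice_ok = all(x[i] <= tol or pi1 - alice[i] <= tol for i in range(m))
    bob_ok = all(y[j] <= tol or pi2 - bob[j] <= tol for j in range(n))
    return alice_ok and bob_ok


# Hypothetical example: Alice wants to match, Bob wants to mismatch.
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))  # uniform play by both
print(is_nash(A, B, [1.0, 0.0], [1.0, 0.0]))  # Bob would rather switch
```

The check costs O(mn) time, which is why verifying a proposed equilibrium is easy even though finding one is hard in general.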

2-Nash Quadratic Program

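Polyhedra. $P = \{(y, \pi_1) : Ay \le \pi_1 \mathbf{1},\ y \ge 0,\ \sum_j y_j = 1\}$ (variable vector $y$, scalar variable $\pi_1$). $Q = \{(x, \pi_2) : x^T B \le \pi_2 \mathbf{1}^T,\ x \ge 0,\ \sum_i x_i = 1\}$ (variable vector $x$, scalar variable $\pi_2$).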

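Complementarity: $x_i\,(\pi_1 - (Ay)_i) = 0$ for all $i$, $y_j\,(\pi_2 - (x^T B)_j) = 0$ for all $j$. At a point (in particular a vertex) of the two polyhedra satisfying complementarity, $(x, y)$ is a NE and $\pi_1, \pi_2$ are the payoffs.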

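2-Nash as a quadratic program. max: $x^T (A+B)\, y - \pi_1 - \pi_2$ s.t. $Ay \le \pi_1 \mathbf{1}$, $x^T B \le \pi_2 \mathbf{1}^T$, with $x$, $y$ probability vectors. Here $x^T (A+B)\, y$ is the sum of payoffs and $\pi_1 + \pi_2$ is at least the sum of max payoffs, so the objective is at most zero. The objective is zero $\Leftrightarrow$ complementarity holds $\Leftrightarrow$ $(x, y)$ is a NE and $\pi_1, \pi_2$ are the max payoffs.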
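Why the objective can never be positive can be seen in a short derivation. The notation is assumed, since the transcript lost the slide formulas: $x$, $y$ are the mixed strategies of Alice and Bob, and $\pi_1$, $\pi_2$ are scalars with $(Ay)_i \le \pi_1$ for all $i$ and $(x^T B)_j \le \pi_2$ for all $j$.

```latex
\begin{align*}
x^T A y &= \sum_i x_i (Ay)_i \;\le\; \pi_1 \sum_i x_i = \pi_1, \\
x^T B y &= \sum_j (x^T B)_j\, y_j \;\le\; \pi_2 \sum_j y_j = \pi_2, \\
x^T (A+B)\, y - \pi_1 - \pi_2
  &= -\sum_i x_i \bigl(\pi_1 - (Ay)_i\bigr) - \sum_j y_j \bigl(\pi_2 - (x^T B)_j\bigr) \;\le\; 0 .
\end{align*}
```

Every summand in the last line is nonnegative, so the objective is zero exactly when each summand vanishes, which is the complementarity condition.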

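Rank 1 Game: $A + B = \alpha \beta^T$ with $\alpha \in \mathbb{R}^m$, $\beta \in \mathbb{R}^n$. 2-Nash becomes max: $(x^T \alpha)(\beta^T y) - \pi_1 - \pi_2$ s.t. $Ay \le \pi_1 \mathbf{1}$, $x^T B \le \pi_2 \mathbf{1}^T$, with $x$, $y$ probability vectors. The objective is bilinear: a product of two linear terms. Rank 1 QP is NP-hard in general.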
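Whether a game has rank at most 1 can be tested directly by trying to write the sum of the payoff matrices as an outer product. This is a minimal sketch under assumptions: the helper name `rank1_factors`, the tolerance, and the example matrices are hypothetical, and the vectors are called `alpha` and `beta` only for illustration.

```python
def rank1_factors(M, tol=1e-9):
    """Return (alpha, beta) with M[i][j] == alpha[i] * beta[j], or None if rank(M) > 1."""
    m, n = len(M), len(M[0])
    # Find any clearly nonzero entry to use as a pivot.
    pivot = next(((i, j) for i in range(m) for j in range(n) if abs(M[i][j]) > tol), None)
    if pivot is None:
        # Zero matrix (rank 0, the zero-sum case): the zero vectors work.
        return [0.0] * m, [0.0] * n
    r, c = pivot
    beta = list(M[r])                               # pivot row
    alpha = [M[i][c] / M[r][c] for i in range(m)]   # pivot column, scaled so alpha[r] == 1
    ok = all(abs(M[i][j] - alpha[i] * beta[j]) <= tol
             for i in range(m) for j in range(n))
    return (alpha, beta) if ok else None


# Hypothetical payoff matrices whose sum has rank 1.
A = [[3, 1], [0, 2]]
B = [[-1, 3], [2, 2]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
print(rank1_factors(S))                   # factors of [[2, 4], [2, 4]]
print(rank1_factors([[1, 0], [0, 1]]))    # identity has rank 2
```

The factorization is not unique: scaling `alpha` by a nonzero constant and `beta` by its inverse gives the same matrix.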

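Think Big! Fix $A$ and $\beta$, and consider the space of rank 1 games $S = \{(A,\ -A + \alpha \beta^T) : \alpha \in \mathbb{R}^m\}$. For a game in $S$, 2-Nash is max: $(x^T \alpha)(\beta^T y) - \pi_1 - \pi_2$ s.t. $Ay \le \pi_1 \mathbf{1}$, $x^T(-A + \alpha \beta^T) \le \pi_2 \mathbf{1}^T$, with $x$, $y$ probability vectors.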

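Think Big! Consider the game space $S$ of games $(A,\ -A + \alpha \beta^T)$. Idea: replace $x^T \alpha$ by a scalar parameter $\lambda$. LP($\lambda$): max: $\lambda (\beta^T y) - \pi_1 - \pi_2$ s.t. $Ay \le \pi_1 \mathbf{1}$, $-x^T A + \lambda \beta^T \le \pi_2 \mathbf{1}^T$, with $x$, $y$ probability vectors. The solutions of LP($\lambda$), over all $\lambda$, capture all NE of $S$ through complementarity.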

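Claim: every NE $(x, y)$ of a game $(A,\ -A + \alpha \beta^T)$ in $S$ is a solution of LP($\lambda$) for $\lambda = x^T \alpha$. Proof: At any feasible point, the cost of LP($\lambda$) is at most zero. $(x, y, \pi_1, \pi_2)$, with $\pi_1, \pi_2$ the equilibrium payoffs, is feasible. By complementarity, the cost is exactly zero at it.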
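The two facts used in the proof can be illustrated numerically. The sketch below rests on assumed notation, since the slide formulas did not survive extraction: games have the form $(A, -A + \alpha\beta^T)$, $\lambda$ stands in for $x^T\alpha$, and LP($\lambda$) has cost $\lambda(\beta^T y) - \pi_1 - \pi_2$ under the constraints $Ay \le \pi_1 \mathbf{1}$ and $-x^T A + \lambda\beta^T \le \pi_2 \mathbf{1}^T$. The hypothetical helper `lp_cost` picks the smallest feasible $\pi_1, \pi_2$ for given $x$, $y$, so it returns the best cost attainable at that $(x, y)$.

```python
def lp_cost(A, beta, lam, x, y):
    """Best cost of the assumed LP(lambda) at fixed mixed strategies x and y."""
    m, n = len(A), len(A[0])
    # Smallest pi1 with (A y)_i <= pi1 for every row i.
    pi1 = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    # Smallest pi2 with (-x^T A + lam * beta)_j <= pi2 for every column j.
    pi2 = max(lam * beta[j] - sum(x[i] * A[i][j] for i in range(m)) for j in range(n))
    return lam * sum(beta[j] * y[j] for j in range(n)) - pi1 - pi2


# Hypothetical game: B = -A + alpha beta^T = [[0, 1], [1, 0]].
A = [[1, 0], [0, 1]]
alpha, beta = [1, 1], [1, 1]
x, y = [0.5, 0.5], [0.5, 0.5]                     # an equilibrium of (A, B)
lam = sum(xi * ai for xi, ai in zip(x, alpha))    # lambda = x^T alpha
print(lp_cost(A, beta, lam, x, y))                # cost at the equilibrium
print(lp_cost(A, beta, 1.0, [1, 0], [1, 0]))      # cost at a non-equilibrium pair
```

In this example the cost is zero at the equilibrium and strictly negative at the non-equilibrium pair, matching the two steps of the proof.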

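Goal: a NE of the given game $(A,\ -A + \alpha \beta^T)$, called game $\alpha$ for short. Claim: if $(x, y)$ is a solution of LP($\lambda$), then for every $\alpha'$ s.t. $x^T \alpha' = \lambda$, $(x, y)$ is a NE of game $\alpha'$. These $\alpha'$ form an $(m-1)$-dimensional space in $S$. If one of them is $\alpha$, then done!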

Goal: NE of game $\alpha$. Let $f(\lambda) = x^T \alpha$, where $(x, y)$ is a solution of LP($\lambda$). If $f(\lambda) = \lambda$ then done! NE of game $\alpha$ $\leftrightarrow$ fixed points of $f$. $f$ is continuous.

1-D Fixed Point. [Figure: a one-dimensional fixed-point search on the interval [0, 1]; the Lower and Upper bounds are narrowed at each step.] And so on until the difference becomes small enough.
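The narrowing of lower and upper bounds can be sketched as a generic bisection for a fixed point of a continuous function. This is an illustration, not the talk's exact procedure: the name `bisect_fixed_point`, the tolerance, and the stand-in function `math.cos` are assumptions, and the sketch requires `f(lo) >= lo` and `f(hi) <= hi` so that a fixed point is bracketed.

```python
import math


def bisect_fixed_point(f, lo, hi, tol=1e-9, max_iter=200):
    """Find t in [lo, hi] with f(t) close to t, for continuous f.

    Assumes f(lo) >= lo and f(hi) <= hi, so f(t) - t changes sign on [lo, hi].
    """
    if f(lo) < lo or f(hi) > hi:
        raise ValueError("need f(lo) >= lo and f(hi) <= hi to bracket a fixed point")
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if f(mid) >= mid:
            lo = mid   # a fixed point lies in [mid, hi]
        else:
            hi = mid   # a fixed point lies in [lo, mid]
    return (lo + hi) / 2


# Stand-in example: cos maps [0, 1] into itself and has a single fixed point there.
t = bisect_fixed_point(math.cos, 0.0, 1.0)
print(t, math.cos(t))
```

Each step halves the interval, so the number of iterations grows only with the number of bits of accuracy requested.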

This was in hindsight

Kohlberg and Mertens (1986): for finite-player games, the game space is homeomorphic to its Nash correspondence, the set of pairs $(g, x)$ where $x$ is a NE of game $g$ (a continuous bijection with a continuous inverse). Uses: stability and index (Shapley’74, KM’86, GW’97); designing homotopy methods (Govindan & Wilson’03, …). No extension to subspaces known.

We Show That: the game space $S$ of rank 1 games is homeomorphic to its Nash correspondence.

Rank 2 Games Open: A polynomial time algorithm.

PPAD = P?

Linear programming. Simplex method (Dantzig’47): practical, although exponential in the worst case. In P or NP-hard? Khachiyan’79: polynomial time solvable.

Though exponential in worst case

vs. Stumbling Block

PPAD = P? vs. Stumbling Block?? Our result Not in general

Thanks
