On the Approximation Performance of Fictitious Play in Finite Games
Paul W. Goldberg (U. Liverpool), Rahul Savani (U. Liverpool), Troels Bjerre Sørensen (U. Warwick), Carmine Ventre (U. Liverpool)
Penalty-kick practice
Scenario: Every day two friends meet to practice penalty kicks.
• Players: a kicker and a goalie
• Actions: the kicker shoots to the right, left or center of the goal; the goalie dives to the right, left or center of the goal
Penalty-kick game
[Figure: 3 × 3 payoff matrix with rows and columns labeled R, C, L; entries not recoverable]
Q: How would a goalie "learn" to play this game?
Fictitious play [Brown, 51]
FP rule: Best respond to the empirical distribution of play of the opponent.
Day:             1  2  3  4  5  6  7  8  9  10
Kicker's action: R  C  L  L  L  L  C  L  L  L
Empirical distribution after 10 days: R 1/10, C 2/10, L 7/10
By the FP rule, the goalie dives to L.
Is it a "good" choice? I.e., is FP a good algorithm strategically?
Where does the name come from?
• FP can also be seen as an algorithm for playing the game just once
  • Simulate what would happen in the repeated version of the game up to some predetermined round r
  • Output the empirical distribution
  • In the above example, for r = 10, the empirical distribution of the kicker is: R w.p. 1/10, C w.p. 2/10, L w.p. 7/10
• FP is a very simple iterative algorithm
• It is sometimes advocated as a model of bounded rationality
• Is FP strategically "good"?
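The FP rule and its run-once variant can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: the function names `empirical_distribution`, `best_response` and `fictitious_play`, the tie-breaking rule (lowest index), the arbitrary first-round actions, and the matching-pennies payoffs are all assumptions made for illustration. The payoff matrix of the penalty-kick game itself is not available here, so it is not used.

```python
from collections import Counter
from fractions import Fraction


def empirical_distribution(actions):
    """Fraction of rounds in which each action was played."""
    counts = Counter(actions)
    return {a: Fraction(c, len(actions)) for a, c in counts.items()}


def best_response(payoff, opp_counts):
    """Index of the action with the highest payoff against the opponent's counts.

    payoff[k][l] is this player's payoff for own action k against opponent
    action l. Ties go to the lowest index (an assumption; the FP rule itself
    does not fix a tie-breaking rule).
    """
    values = [sum(p * c for p, c in zip(row, opp_counts)) for row in payoff]
    return max(range(len(values)), key=values.__getitem__)


def fictitious_play(A, B, rounds, first=(0, 0)):
    """Simulate FP for `rounds` rounds on the bimatrix game (A, B).

    A[i][j] and B[i][j] are the row and column player's payoffs.
    Returns the two empirical distributions as lists of floats.
    """
    m, n = len(A), len(A[0])
    # Transpose B so the column player's own actions index the rows.
    B_T = [[B[i][j] for i in range(m)] for j in range(n)]
    row_counts, col_counts = [0] * m, [0] * n
    i, j = first  # round 1 is arbitrary: there is no history to respond to
    for t in range(rounds):
        if t > 0:
            # Both players best respond to the history up to the previous round.
            i, j = best_response(A, col_counts), best_response(B_T, row_counts)
        row_counts[i] += 1
        col_counts[j] += 1
    return [c / rounds for c in row_counts], [c / rounds for c in col_counts]


# The ten-day example from the slides: the opponent's observed actions.
history = ["R", "C", "L", "L", "L", "L", "C", "L", "L", "L"]
dist = empirical_distribution(history)

# Hypothetical constant-sum game (matching pennies with payoffs in [0, 1]).
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
x, y = fictitious_play(A, B, rounds=5000)
```

On the ten-day history, `dist` reproduces the 1/10, 2/10, 7/10 split. Matching pennies is constant-sum, so the empirical distributions `x` and `y` should drift toward the uniform mixture as the number of rounds grows.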
Fictitious play and Nash equilibria
• The empirical distribution of play defined by FP converges to Nash equilibria for
  • constant-sum games [Robinson, 51]
  • non-degenerate 2 × 2 games [Miyasawa, 61]
  • 2 × n games [Berger, 05]
• ... but it does not converge in general [Shapley, 64]
[Figure: 3 × 3 game on actions R, C, L; details not recoverable]
Fictitious play and approximate NEs
• The strategic performance of FP is analyzed by means of approximate NEs
  • NE = no incentive to deviate
  • Ɛ-NE = little (at most Ɛ) incentive to deviate
• The concept gained relevance with the PPAD-hardness of computing exact NEs [Daskalakis, Goldberg & Papadimitriou, 06] + [Chen & Deng, 06]
• Payoffs are normalized to [0,1] and the approximation is additive
• [Conitzer, 09] proves:
  • For any game, Ɛ ≤ (r+1)/(2r) at round r
  • There exists an infinite game for which Ɛ = (r+1)/(2r)
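The additive notion of Ɛ used above can be made concrete with a short sketch: Ɛ is the largest gain any player could obtain by switching to a best response. The helper names `expected_payoff` and `epsilon` and the matching-pennies matrices are illustrative assumptions, not part of the talk.

```python
def expected_payoff(M, x, y):
    """Expected payoff under matrix M when row plays mix x and column plays mix y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def epsilon(A, B, x, y):
    """Smallest eps for which (x, y) is an additive eps-NE of the game (A, B).

    A[i][j] and B[i][j] are the row and column player's payoffs, assumed to be
    normalized to [0, 1].
    """
    m, n = len(A), len(A[0])
    # Best pure deviation for each player against the other's mixed strategy.
    row_best = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    col_best = max(sum(x[i] * B[i][j] for i in range(m)) for j in range(n))
    row_regret = row_best - expected_payoff(A, x, y)
    col_regret = col_best - expected_payoff(B, x, y)
    return max(row_regret, col_regret)


# Hypothetical example: matching pennies with payoffs in [0, 1].
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
eps_uniform = epsilon(A, B, [0.5, 0.5], [0.5, 0.5])  # exact equilibrium
eps_pure = epsilon(A, B, [1, 0], [1, 0])             # column player wants to switch
```

Since a best response can always be taken to be pure, checking pure deviations is enough to compute Ɛ.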
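Approximation guarantee of FP
[Figure: timeline of rounds and players' actions; segments labeled Ɛ = 0 and Ɛ = 1]
• By the FP rule, $s_i$ is a best response to the mixture of the opponent's first i-1 actions (Ɛ = 0 against those); against the remaining r-i+1 actions it may lose up to 1 (Ɛ = 1)
• $s_i$ is played with probability 1/r, so its contribution to Ɛ is at most $(r-i+1)/r^2$
• Ɛ of FP is at most $\sum_{i=1}^{r} \frac{r-i+1}{r^2} = \frac{r+1}{2r}$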
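The arithmetic behind this bound is easy to check numerically. The sketch below sums the per-round terms (r-i+1)/r² with exact fractions and compares the result with (r+1)/(2r); the function name `fp_worst_case_epsilon` is an illustrative choice.

```python
from fractions import Fraction


def fp_worst_case_epsilon(r):
    """Sum of the per-round terms (r - i + 1) / r^2 for i = 1, ..., r."""
    return sum(Fraction(r - i + 1, r * r) for i in range(1, r + 1))


# For r = 10 the sum is 55/100 = 11/20, which equals (r + 1) / (2r).
bound_10 = fp_worst_case_epsilon(10)
```

The value tends to 1/2 from above as r grows, which is why ½ is the natural threshold in what follows.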
Approximation for finite games?
[Figure: timeline of rounds and players' actions; segments labeled Ɛ = 0 and Ɛ = 1]
• Re-using strategies may guarantee a significantly better approximation for FP
• Experimentally, Shapley's game (for which FP does not converge) has Ɛ ≈ 1/4
Ɛ for playing $s_i$ at round i is less than $(r-i+1)/r^2$
Our contribution
• We define a class of 4n × 4n symmetric games, n being a parameter, for which we show that FP fails to obtain any constant Ɛ < ½
• Specifically, we prove a lower bound of ½ - O(1/n^{1-δ}) for any δ > 0
• We also give a "matching" upper bound of ½ - O(1/n)
The game: row player's payoff matrix
[Figure: row player's payoff matrix, shown for n = 5; entries not recoverable]
• n = 5
• α > 1, β < 1
• Blank entries stand for 0
• The column player's payoff matrix is the transpose of the above
• Players share the same sequence of actions (simpler analysis)
The role of α and β
• α > 1, β < 1
Next ideas of the analysis
• α and β govern the ratio between the probabilities of two consecutive actions
• Ratios are such that
  • probabilities increase in geometric progression, so the last n different actions played occupy all but an exponentially small fraction of the probability mass; hence the best response has payoff around β(1 - 1/2^n) ≈ 1 - O(1/n^{1-δ})
  • the probability distribution does not allocate much probability to any individual strategy; hence the payoff from the FP distribution is around α/2 ≈ 1/2
Upper bound
[Figure: timeline of rounds and players' actions]
• Ɛ for an action is defined by its last occurrence in the sequence
• The maximum Ɛ is given by the sequence m1, ..., m1, ..., mn, ..., mn, in which each action is played r/n times in a row
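One way to see where a bound of the form ½ - O(1/n) can come from is to evaluate the block sequence numerically. The sketch below is an illustrative reading of the slide and not the authors' proof. It assumes that n counts the distinct actions, that r is a multiple of n, and that an action whose last occurrence is at round i loses at most (r-i+1)/r, as in the per-round argument for general FP. The function name `block_sequence_bound` is hypothetical.

```python
from fractions import Fraction


def block_sequence_bound(n, r):
    """Bound on eps for the sequence m1 x (r/n), ..., mn x (r/n).

    Assumption: an action last played at round i is a best response to the
    first i - 1 rounds, so it loses at most (r - i + 1) / r overall.
    """
    if r % n != 0:
        raise ValueError("r must be a multiple of n")
    block = r // n
    total = Fraction(0)
    for k in range(1, n + 1):
        last = k * block                       # last round in which m_k is played
        regret = Fraction(r - last + 1, r)     # bound on the loss of m_k
        total += Fraction(1, n) * regret       # m_k has probability 1/n
    return total


# Hypothetical parameters: 5 actions, 100 rounds.
bound = block_sequence_bound(5, 100)
```

Under these assumptions the sum equals 1/2 - 1/(2n) + 1/r, which approaches ½ - 1/(2n) as the number of rounds grows.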
Conclusions
• FP is not good from a strategic point of view: for finite games, its approximation guarantee with respect to NEs is poor
• There is a class of finite games for which a cyclic behavior persists, leading to a poor guarantee (independently of the number of iterations)
• I.e., a fully rational player will always beat his boundedly rational friend
Open problems
• Is ½ a limit to the approximation performance obtainable by simple or decentralized algorithms?
  • Cf. the algorithm of [Daskalakis, Mehta & Papadimitriou, 09] vs. more complex centralized algorithms achieving an approximation better than ½
• Consider a more general class of algorithms
  • E.g., uncoupled dynamics as defined by [Hart & Mas-Colell, 03] + [Hart & Mas-Colell, 06]