
Equilibrium Selection in Stochastic Games


Presentation Transcript


  1. Equilibrium Selection in Stochastic Games • By Marcin Kadluczka, Dec 2nd 2002, CS 594 – Piotr Gmytrasiewicz

  2. Agenda • Definition of finite discounted stochastic games • Stationary equilibrium • Linear tracing procedure • Stochastic tracing procedures • Examples of different equilibria depending on the type of stochastic tracing

  3. Finite discounted stochastic games • N is the finite set of players (N = {1, 2, …, n}) • The state space has a finite number of states
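The formal definition on this slide was an equation image that did not survive the transcript. A minimal sketch of the usual tuple, with all symbols assumed rather than taken from the original slide:

```latex
% A finite discounted stochastic game (all symbols are assumptions, not the slide's originals)
\Gamma = \bigl(N,\ \Omega,\ \{A^i_\omega\}_{i\in N,\,\omega\in\Omega},\ \{u^i_\omega\}_{i\in N,\,\omega\in\Omega},\ \pi,\ \delta\bigr),
\qquad
\pi(\,\cdot \mid \omega, a) \in \Delta(\Omega),
\qquad
\delta \in [0,1),
```

where A^i_ω is player i's finite action set in state ω, u^i_ω the instantaneous payoff function, π the transition probability, and δ the discount factor.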

  4. Rules of the game • [Diagram: in the current state at time t each of players 1..n chooses an action and receives a reward; the game then transitions to the state at time t+1 according to the transition probability.]

  5. Other assumptions • Perfect recall: at each stage every player remembers all past actions chosen by all players and all past states that occurred • Difference from normal-form games: the game does not consist of a single play; it jumps to the next state according to the transition probability measure and continues dynamically • Rewards account for future states, not only immediate payoffs

  6. Pure & Mixed strategy • Pure strategy • Mixed strategy • If a mixed strategy is played, the instantaneous expected payoff of player i and the transition probability are denoted as sketched below
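The denoted expressions were figures in the original slide. A plausible reconstruction of the standard forms, with the notation assumed:

```latex
% Expected instantaneous payoff and transition probability under a mixed profile
% \sigma_\omega = (\sigma^1_\omega, \dots, \sigma^n_\omega) in state \omega (notation assumed)
u^i_\omega(\sigma_\omega) = \sum_{a \in A_\omega} \Bigl(\prod_{j \in N} \sigma^j_\omega(a^j)\Bigr) u^i_\omega(a),
\qquad
\pi(\omega' \mid \omega, \sigma_\omega) = \sum_{a \in A_\omega} \Bigl(\prod_{j \in N} \sigma^j_\omega(a^j)\Bigr) \pi(\omega' \mid \omega, a).
```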

  7. Stationary strategy payoffs • History: the set of possible histories up to stage k consists of all sequences of visited states and chosen actions • Behavior strategy • Stationary strategy • Payoffs
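The payoff formula itself was an image in the original. A sketch of the standard total discounted payoff, with notation assumed:

```latex
% Total expected discounted payoff of player i under profile \sigma, starting from state \omega
% (standard form; notation assumed)
u^i(\omega, \sigma) = \mathbb{E}_{\omega,\sigma}\Bigl[\sum_{k=0}^{\infty} \delta^{k}\, u^i_{\omega_k}(a_k)\Bigr].
```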

  8. Equilibrium • General equilibrium: a strategy-tuple σ is an equilibrium if and only if σ^i is a best response to σ^{-i} for all i • Stationary equilibrium (Nash Eq.) • Payoff for stationary equilibrium
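Spelled out as a best-response condition (a sketch; notation assumed):

```latex
% \sigma is a (stationary) equilibrium iff no player gains from a unilateral deviation
\forall i \in N,\ \forall \bar{\sigma}^i,\ \forall \omega \in \Omega:\qquad
u^i(\omega, \sigma^i, \sigma^{-i}) \;\ge\; u^i(\omega, \bar{\sigma}^i, \sigma^{-i}).
```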

  9. Comparison with other games • Comparison to normal-form games • Comparison to MDPs: more than one agent; if the strategy is stationary, the two coincide • Comparison to Bayesian games: no discounting in Bayesian games; types → states; beliefs are captured by the prior

  10. Linear tracing procedure • Corresponding normal-form game: we fix the state • Prior probability distribution (the prior): each player's expectation about the other players' strategy choices over the pure strategies • Each player has the same assumption about the others – an important assumption

  11. Linear tracing procedure cont'd • Family of one-parameter games • Payoff function (see the sketch below)
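The payoff function on this slide was an image. A sketch of the Harsanyi–Selten linear tracing form, which the later calculations are consistent with; notation assumed:

```latex
% One-parameter family of games indexed by t \in [0,1], with common prior p:
% at t = 0 each player best-responds to the prior, at t = 1 the original game is recovered
u^i_t(\sigma) \;=\; t\, u^i(\sigma^i, \sigma^{-i}) \;+\; (1-t)\, u^i(\sigma^i, p^{-i}),
\qquad t \in [0,1].
```

The linear tracing procedure then follows a path of equilibria of these games from t = 0 to t = 1.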

  12. Linear tracing procedure cont'd • The set of equilibrium points in the one-parameter family: typically a collection of pieces of one-dimensional curves, though in degenerate cases it may contain isolated points and/or higher-dimensional curves • Feasible path • Linear tracing procedure • Well-defined l.t.p.: the path leads from t = 0 to t = 1

  13. Stochastic tracing procedure • Assumption: the game and the prior p are given • Stochastic game • Total expected discounted payoffs • Stochastic tracing procedure T(Γ, p)

  14. Alternative ways of extending the payoff function to stochastic games • There are four ways to define player beliefs (see the sketch after this list): • Correlation within states – C(S): all opponents play the same strategy • Absence of correlation within states – I(S): each opponent can play a different strategy • Correlation across time – C(T): each player plays the same strategy across time • Absence of correlation across time – I(T): over time each player can change its strategy
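A sketch of how the within-state alternatives differ, written to match the Example 1 calculations later in the deck (slides 19 and 22) rather than the paper's formal definitions; notation assumed:

```latex
% As used in the C(S) calculations (slide 19): each opponent j is expected, independently,
% to play the t-mixture of its equilibrium strategy \sigma^j and its prior p^j
u^{i}_{t,\,\mathrm{C(S)}} \;=\; u^i\bigl(\sigma^i,\ (\,t\,\sigma^j + (1-t)\,p^j\,)_{j \ne i}\bigr);

% As used in the I(S) calculations (slide 22): with probability t the opponents jointly play
% \sigma^{-i}, with probability 1-t they jointly play the prior p^{-i}
u^{i}_{t,\,\mathrm{I(S)}} \;=\; t\, u^i(\sigma^i, \sigma^{-i}) \;+\; (1-t)\, u^i(\sigma^i, p^{-i}).
```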

  15. Alternatives cont'd • Alternative 1: C(S), I(T) • Alternative 2: C(S), C(T)

  16. Alternatives cont'd • Alternative 3: I(S), I(T) • Alternative 4: I(S), C(T)

  17. Example 1 – C(S) versus I(S) • Prior = • Equilibria: • Starting point:

  18. Ex1: C(S) solution

  19. Ex1: C(S) calculations • (s1,s2,s3;1): Player 1 expects player 2 to play (1/2(1-t)+t, 1/2(1-t)) and player 3 to play (2/3(1-t)+t, 1/3(1-t)); expected payoff: (1/2(1-t)+t)(2/3(1-t)+t)·2 = 1/3(1+t)(2+t) • (s1,s2,s3;2): Player 2 expects player 1 to play (1/6(1-t)+t, 5/6(1-t)) and player 3 to play (2/3(1-t)+t, 1/3(1-t)); expected payoff: (1/6(1-t)+t)(2/3(1-t)+t)·2 = 1/9(1+5t)(2+t) • (s1,s2',s3;1): Player 1 expects player 2 to play s2' with probability 1/2(1-t)+t and s2 with probability 1/2(1-t), and player 3 to play (2/3(1-t)+t, 1/3(1-t)); expected payoff: (1/2(1-t))(2/3(1-t)+t)·2 = 1/3(1-t)(2+t)
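A quick symbolic check of the algebraic simplifications on this slide (the underlying game is not reconstructed here; only the stated arithmetic is verified):

```python
# Verify the C(S) expected-payoff simplifications from slide 19 with sympy.
import sympy as sp

t = sp.symbols('t')

# Left-hand sides exactly as written on the slide.
lhs = {
    "(s1,s2,s3;1)":  (sp.Rational(1, 2) * (1 - t) + t) * (sp.Rational(2, 3) * (1 - t) + t) * 2,
    "(s1,s2,s3;2)":  (sp.Rational(1, 6) * (1 - t) + t) * (sp.Rational(2, 3) * (1 - t) + t) * 2,
    "(s1,s2',s3;1)": (sp.Rational(1, 2) * (1 - t))     * (sp.Rational(2, 3) * (1 - t) + t) * 2,
}

# Claimed factored forms.
rhs = {
    "(s1,s2,s3;1)":  sp.Rational(1, 3) * (1 + t) * (2 + t),
    "(s1,s2,s3;2)":  sp.Rational(1, 9) * (1 + 5 * t) * (2 + t),
    "(s1,s2',s3;1)": sp.Rational(1, 3) * (1 - t) * (2 + t),
}

for label in lhs:
    # Each pair must be algebraically identical.
    assert sp.simplify(lhs[label] - rhs[label]) == 0
    print(label, sp.factor(lhs[label]))
```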

  20. Ex1: C(S) trajectory

  21. Ex1: I(S) solution

  22. Ex1: I(S) calculations • (s1,s2,s3;1): Player 1 expects players 2 & 3 to play s2 & s3 with probability t, and to play according to the prior with probability (1-t); expected payoff: ((1-t)(1/2)(2/3)+t)·2 = 2/3(1-t)+2t • (s1,s2,s3;2): Player 2 expects players 1 & 3 to play s1 & s3 with probability t, and to play according to the prior with probability (1-t); expected payoff: ((1-t)(1/6)(2/3)+t)·2 = 2/9(1-t)+2t • (s1,s2',s3;1): Player 1 expects players 2 & 3 to play s2' & s3 with probability t (but the payoff is then 0), and to play according to the prior with probability (1-t); expected payoff: ((1-t)(1/2)(2/3))·2 = 2/3(1-t)

  23. Ex1: I(S) trajectory

  24. Example 2 – C(T) versus I(T) • Equilibria: • Prior: • Starting point: • [Payoff and transition probability tables not reproduced]

  25. Ex2: C(T) solution • [Tables of transition probabilities for player 1 and player 2 not reproduced]

  26. Ex2: C(T) trajectory

  27. Ex2: I(T) trajectory

  28. Summary • Definition of stochastic games • The linear tracing procedure was presented • Some extensions were shown, with examples • C(S), I(T) is probably the best extension for computing strategies

  29. Reference • “Equilibrium Selection in Stochastic Games” by P. Jean-Jacques Herings and Ronald J.A.P. Peeters

  30. Questions?
