
Stochastic Games

Stochastic Games. Mr Sujit P Gujar. e-Enterprise Lab Computer Science and Automation IISc, Bangalore. Agenda. Stochastic Game Special Class of Stochastic Games Analysis : Shapley’s Result. Applications. Repeated Game.


Presentation Transcript


  1. Stochastic Games Mr Sujit P Gujar. e-Enterprise Lab Computer Science and Automation IISc, Bangalore.

  2. Agenda • Stochastic Game • Special Class of Stochastic Games • Analysis: Shapley’s Result • Applications

  3. Repeated Game • When players interact by playing a similar stage game (such as the prisoner's dilemma) numerous times, the game is called a repeated game.

  4. Stochastic Game • A stochastic game is a repeated game with probabilistic (stochastic) transitions. • The game has several states. • Transition probabilities depend on the actions of the players. • A two-player stochastic game is also called a "2½-player" game: the two players plus chance.

  5. Repeated Prisoner’s Dilemma • Consider the game tree for the PD repeated twice: a first iteration, followed by a second-iteration subgame after each first-stage outcome. Assume each player has the same two options at each information set: {C, D}. • What is Player 1’s strategy set? It is the cross product of the choice sets at all of his information sets: {C,D} x {C,D} x {C,D} x {C,D} x {C,D}, which gives 2^5 = 32 possible strategies.
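The count of 32 can be checked by brute force. The sketch below is illustrative only: it assumes Player 1 has one information set at the first stage and one for each of the four possible first-stage outcomes, and the names `info_sets` and `strategies` are made up for this example.

```python
from itertools import product

ACTIONS = ("C", "D")

# Player 1's information sets: one at the start, plus one for each of the
# four possible first-stage outcomes (own action, opponent's action).
first_stage_outcomes = list(product(ACTIONS, repeat=2))
info_sets = ["start"] + first_stage_outcomes

# A pure strategy assigns one action to every information set.
strategies = [
    dict(zip(info_sets, choice))
    for choice in product(ACTIONS, repeat=len(info_sets))
]

print(len(info_sets))   # 5
print(len(strategies))  # 32
```

Each additional repetition multiplies the number of information sets, which is why enumerating strategies quickly becomes impractical.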

  6. Issues in Analyzing Repeated Games • How do we solve infinitely repeated games? • Strategies are infinite in number. • We need to compare sums of infinite streams of payoffs.

  7. Stochastic Game: The Big Match • Every day player 2 chooses a number, 0 or 1. • Player 1 tries to predict it and wins a point if he is correct. • This continues as long as player 1 predicts 0. • But if he ever predicts 1, all future choices for both players are required to be the same as that day's choices.
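The rules above can be sketched as a small simulation. This is a minimal illustration, assuming a finite horizon and average payoff per day; the function name `play_big_match` and the policy signature are hypothetical, not taken from the slides.

```python
def play_big_match(predict, choose, horizon):
    """Average payoff to player 1 over `horizon` days.

    predict(day, history) -> 0 or 1   (player 1's prediction)
    choose(day, history)  -> 0 or 1   (player 2's number)
    history is the list of past (prediction, number) pairs.
    """
    history = []
    locked = None  # once player 1 predicts 1, that day's pair is frozen
    total = 0
    for day in range(horizon):
        if locked is None:
            pair = (predict(day, history), choose(day, history))
            if pair[0] == 1:
                locked = pair  # all future choices repeat this pair
        else:
            pair = locked
        total += 1 if pair[0] == pair[1] else 0
        history.append(pair)
    return total / horizon


always0 = lambda day, history: 0
always1 = lambda day, history: 1

print(play_big_match(always0, always0, 10))  # 1.0
print(play_big_match(always1, always0, 10))  # 0.0
```

Predicting 1 is irreversible: a wrong guess on that day fixes the payoff at 0 forever, a correct one fixes it at 1.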

  8. The Big Match • N = {1,2}: players. • S = {0, 1*, 2*}: state space. • Action sets: s_0 = {0,1}, s_1 = {0}, s_2 = {1}. • A = payoff matrix; P00, P01, P02 = transition matrices (the matrix entries are not preserved in this transcript).

  9. The "Big Match" game was introduced by Gillette (1957) as a difficult example. • D. Blackwell and T. S. Ferguson, "The Big Match," The Annals of Mathematical Statistics, Vol. 39, No. 1 (Feb. 1968), pp. 159-163.

  10. Scenario

  11. Stationary Strategies • Enumerating all pure and mixed strategies is cumbersome and redundant. • Stationary strategies are behavior strategies that prescribe for a player the same probabilities for his choices every time the same position is reached, by whatever route. • $x = (x^1, x^2, \ldots, x^N)$, where each $x^k = (x^k_1, x^k_2, \ldots, x^k_{m_k})$ is a probability distribution over the choices available in position $k$.

  12. Notation • Given a matrix game B: • val[B] = the minimax value to the first player. • X[B] = the set of optimal strategies for the first player. • Y[B] = the set of optimal strategies for the second player. • It can be shown that, for B and C of the same dimensions, $|\mathrm{val}[B] - \mathrm{val}[C]| \le \max_{i,j} |b_{ij} - c_{ij}|$.
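As a small illustration of val[B] and of the inequality above, here is a sketch restricted to 2x2 matrix games, where the value has a closed form. The helper name `val_2x2` and the matrices `B` and `C` are made up for this example; larger games would need a linear program instead.

```python
def val_2x2(B):
    """Minimax value to the row player (the maximizer) of a 2x2 matrix game."""
    (a, b), (c, d) = B
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # Saddle point in pure strategies.
        return maximin
    # No saddle point: both players mix, and the standard 2x2 formula applies.
    return (a * d - b * c) / (a + d - b - c)


B = [[1, 0], [0, 1]]
C = [[1.2, 0], [0, 1]]

gap = abs(val_2x2(B) - val_2x2(C))
max_entry_diff = max(abs(B[i][j] - C[i][j]) for i in range(2) for j in range(2))

print(val_2x2(B))             # 0.5
print(gap <= max_entry_diff)  # True
```

Perturbing one entry by 0.2 moves the value by less than 0.05 here, consistent with the bound.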

  13. When we start in position k, we obtain a particular game, $\Gamma^k$. • We will refer to the stochastic game as the collection $\Gamma = \{\Gamma^k \mid k = 1, 2, \ldots, N\}$. Define,

  14. Shapley’s Results¹ • ¹ L. S. Shapley, "Stochastic Games," PNAS 39 (1953), pp. 1095-1100.

  15. Let, denote the collection of games whose pure strategies are the stationary strategies of . The payoff function of these new games must satisfy,

  16. Shapley’s Result,
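Shapley's 1953 paper characterizes the value of a stochastic game as a fixed point: the value at each position equals the value of a matrix game whose entries are the immediate payoff plus the expected value of the position reached next. The sketch below iterates that map. It is an illustration under stated assumptions, not the presenter's code: every position is assumed to have a 2x2 game, and the names `shapley_iteration`, `A` and `P` and the sample data are hypothetical.

```python
def val_2x2(B):
    """Minimax value to the row player of a 2x2 matrix game."""
    (a, b), (c, d) = B
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return maximin  # saddle point in pure strategies
    return (a * d - b * c) / (a + d - b - c)


def shapley_iteration(A, P, sweeps=200):
    """Approximate the value of each position by repeated substitution.

    A[k][i][j]    : payoff in position k under actions (i, j)
    P[k][i][j][l] : probability of moving from k to l under (i, j);
                    these sum to less than 1, the rest is the stop probability.
    """
    n = len(A)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [
            val_2x2([
                [A[k][i][j] + sum(P[k][i][j][l] * v[l] for l in range(n))
                 for j in range(2)]
                for i in range(2)
            ])
            for k in range(n)
        ]
    return v


# One position, stop probability 0.5 after every stage.
A = [[[1, 0], [0, 1]]]
P = [[[[0.5], [0.5]], [[0.5], [0.5]]]]
values = shapley_iteration(A, P)
print(round(values[0], 6))  # 1.0
```

In this one-position example the fixed point solves v = 0.5 + 0.5 v, so v = 1. The iteration converges because the game stops with positive probability at every stage.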

  17. Applications • When N = 1,¹ • By setting all $s^k_{ij} = s > 0$, we get a model of an infinitely repeated game in which future payments are discounted by a factor (1-s). • If we set $n_k = 1$ for all k, the result is a "dynamic programming model". • ¹ J. von Neumann, Ergebnisse eines Math. Kolloquiums, 8, 73-83 (1937).
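To make the discounting interpretation concrete: if the game stops with probability s after every stage, the payment at stage t is collected with probability (1-s)^t, so a constant payment c per stage has expected total c/s. The sketch below checks this numerically; the function name `discounted_total` is made up for this example.

```python
def discounted_total(payoffs, s):
    """Expected total payoff when the game stops with probability s after each stage."""
    return sum(p * (1 - s) ** t for t, p in enumerate(payoffs))


s = 0.1
stream = [1.0] * 2000          # long truncation of a constant infinite stream
approx = discounted_total(stream, s)
closed_form = 1.0 / s          # geometric series: sum of (1-s)^t = 1/s

print(abs(approx - closed_form) < 1e-6)  # True
```

This is how the stop probability lets infinite payoff streams be compared: every stream with bounded payments has a finite expected total.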

  18. Example • Consider the game with N = 1, • A = • P2 = • P1 = • x=(0.61,0.39) • y=(0.39, 0.61) • x=(0.6,0.4) • y=(0.4, 0.6)

  19. Thank You!!