The Modified Stochastic Game

Eilon Solan, Tel Aviv University, with Omri N. Solan, Tel Aviv University

Multiplayer Absorbing Games

• $I$ = a finite set of players.
• $A^i$ = a finite set of actions of player $i$; $A := A^1 \times \dots \times A^I$.
• $r : A \to \mathbb{R}^I$ = nonabsorbing payoff function.
• $p : A \to [0,1]$ = probability of absorption.
• $r^* : A \to \mathbb{R}^I$ = absorbing payoff function.

Example: the Big Match (starred entries are absorbing)

$$\begin{pmatrix} 0 & 1 \\ 1^* & 0^* \end{pmatrix}$$

Multiplayer Absorbing Games: Discounted Payoff

$$\gamma_\lambda^i(x) = \lambda r^i(x) + (1-\lambda)\Big( p(x)\, r^{*,i}(x) + (1-p(x))\, \gamma_\lambda^i(x) \Big)$$

The absorbing term is labeled on the slide as the expected absorbing payoff.

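Multiplayer Absorbing Games: Discounted Payoff

• $r : A \to \mathbb{R}^I$ = nonabsorbing payoff.
• $p : A \to [0,1]$ = probability of absorption.
• $r^* : A \to \mathbb{R}^I$ = absorbing payoff.

The discounted payoff satisfies the recursion

$$\gamma_\lambda^i(x) = \lambda r^i(x) + (1-\lambda)\Big( p(x)\, r^{*,i}(x) + (1-p(x))\, \gamma_\lambda^i(x) \Big),$$

and solving for $\gamma_\lambda^i(x)$ gives

$$\gamma_\lambda^i(x) = \frac{\lambda r^i(x) + (1-\lambda)\, p(x)\, r^{*,i}(x)}{1-(1-\lambda)(1-p(x))} = \frac{\lambda r^i(x) + (1-\lambda)\, p(x)\, r^{*,i}(x)}{\lambda + (1-\lambda)\, p(x)}.$$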
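As a quick numerical check of the closed form, the sketch below compares it with a direct iteration of the recursion. It treats $r^i(x)$, $p(x)$ and $r^{*,i}(x)$ as plain numbers for one fixed action profile and one player; the function names and the sample values are illustrative assumptions, not part of the talk.

```python
def discounted_payoff(lam, r, p, r_star):
    """Closed form: (lam*r + (1-lam)*p*r_star) / (lam + (1-lam)*p), for lam in (0, 1]."""
    return (lam * r + (1 - lam) * p * r_star) / (lam + (1 - lam) * p)


def discounted_payoff_by_iteration(lam, r, p, r_star, steps=10_000):
    """Iterate g <- lam*r + (1-lam)*(p*r_star + (1-p)*g), starting from g = 0."""
    g = 0.0
    for _ in range(steps):
        g = lam * r + (1 - lam) * (p * r_star + (1 - p) * g)
    return g


# Hypothetical sample values: nonabsorbing payoff 1, absorbing payoff 0,
# absorption probability 0.25, discount factor 0.1.
closed = discounted_payoff(0.1, r=1.0, p=0.25, r_star=0.0)
iterated = discounted_payoff_by_iteration(0.1, r=1.0, p=0.25, r_star=0.0)
```

The iteration is a contraction with factor $(1-\lambda)(1-p(x)) < 1$, so for $\lambda > 0$ both computations should agree up to floating-point error.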

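Multiplayer Absorbing Games: Discounted Equilibrium

• $r : A \to \mathbb{R}^I$ = nonabsorbing payoff.
• $p : A \to [0,1]$ = probability of absorption.
• $r^* : A \to \mathbb{R}^I$ = absorbing payoff.

For a discounted equilibrium $x_\lambda$,

$$\gamma_\lambda^i(x_\lambda) = \frac{\lambda r^i(x_\lambda) + (1-\lambda)\, p(x_\lambda)\, r^{*,i}(x_\lambda)}{\lambda + (1-\lambda)\, p(x_\lambda)} \ge v_\lambda^i \quad (= \text{min-max value}).$$

Write $x_0$ for the limit of $x_\lambda$ as $\lambda \to 0$.

• If $p(x_\lambda) = o(\lambda)$, then $\lim \gamma_\lambda(x_\lambda) = r(x_0)$, and $x_0$ is a uniform ε-equilibrium.
• If $p(x_\lambda) = \omega(\lambda)$, then $\lim \gamma_\lambda(x_\lambda) = \lim r^*(x_\lambda)$, and $x_\lambda$ is a uniform ε-equilibrium provided $\lambda$ is sufficiently small.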

Multiplayer Absorbing Games: Discounted Equilibrium

• If $p(x_\lambda) = \Theta(\lambda)$, then $\lim \gamma_\lambda(x_\lambda)$ is a convex combination of the limits of $r^*(x_\lambda)$ and $r(x_\lambda)$.

When $|I| = 2$, we have
(a) $r(x_0) \ge \lim \gamma_\lambda(x_\lambda)$, or
(b) $\lim r^*(x_\lambda^1, x_0^2) \ge \lim \gamma_\lambda(x_\lambda)$, or
(c) $\lim r^*(x_0^1, x_\lambda^2) \ge \lim \gamma_\lambda(x_\lambda)$.

There is a uniform ε-equilibrium (Vrieze and Thuijsman 1989).
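The three regimes for $p(x_\lambda)$ can be illustrated numerically. The sketch below holds the payoffs fixed (a simplifying assumption: in the talk they depend on $x_\lambda$) and only varies how fast the absorption probability vanishes; the helper names and the rates $\lambda^2$, $\sqrt{\lambda}$ and $3\lambda$ are hypothetical choices.

```python
def discounted_payoff(lam, r, p, r_star):
    """Closed-form discounted payoff of an absorbing game at a fixed profile."""
    return (lam * r + (1 - lam) * p * r_star) / (lam + (1 - lam) * p)


def payoff_near_limit(p_of_lam, r=1.0, r_star=0.0, lam=1e-8):
    """Evaluate the payoff at a very small lam as a proxy for the limit lam -> 0."""
    return discounted_payoff(lam, r, p_of_lam(lam), r_star)


# p = o(lam): absorption is negligible, the payoff tends to r.
slow = payoff_near_limit(lambda lam: lam ** 2)
# p = omega(lam): absorption dominates, the payoff tends to r_star.
fast = payoff_near_limit(lambda lam: lam ** 0.5)
# p = Theta(lam): the limit is a convex combination of r and r_star.
balanced = payoff_near_limit(lambda lam: 3 * lam)
```

With $p = c\lambda$ and constant payoffs, the closed form tends to $(r + c\, r^*)/(1 + c)$, so the weights in the convex combination are $1/(1+c)$ and $c/(1+c)$.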

Modified Absorbing Games

• $r : A \to \mathbb{R}^I$ = nonabsorbing payoff; $R^i(x) := \min\{ r^i(x), v_0^i \}$.
• $p : A \to [0,1]$ = probability of absorption.
• $r^* : A \to \mathbb{R}^I$ = absorbing payoff.

Modified Discounted Payoff

$$\Gamma_\lambda^i(x) := \frac{\lambda R^i(x) + (1-\lambda)\, p(x)\, r^{*,i}(x)}{\lambda + (1-\lambda)\, p(x)}$$

Theorem: $V_\lambda^i := \min_{x^{-i}} \max_{x^i} \Gamma_\lambda^i(x^i, x^{-i})$ satisfies $V_0^i = v_0^i$.

Theorem: The modified game admits a discounted stationary equilibrium.
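A minimal sketch of the modification, again with scalar inputs for one fixed profile: the nonabsorbing payoff is capped at $v_0^i$ before it enters the closed form. The value `v0` is supplied by hand here; computing the actual limit value of a game is outside the scope of this illustration, and all names and numbers are assumptions.

```python
def discounted_payoff(lam, r, p, r_star):
    """Original discounted payoff gamma at a fixed profile."""
    return (lam * r + (1 - lam) * p * r_star) / (lam + (1 - lam) * p)


def modified_discounted_payoff(lam, r, p, r_star, v0):
    """Modified payoff Gamma: same formula with R = min(r, v0) in place of r."""
    capped = min(r, v0)
    return (lam * capped + (1 - lam) * p * r_star) / (lam + (1 - lam) * p)


# Hypothetical numbers: the stage payoff 2 exceeds the cap v0 = 1.
original = discounted_payoff(0.1, r=2.0, p=0.25, r_star=0.0)
modified = modified_discounted_payoff(0.1, r=2.0, p=0.25, r_star=0.0, v0=1.0)
# When r is already below the cap, the modification changes nothing.
unchanged = modified_discounted_payoff(0.1, r=0.5, p=0.25, r_star=0.0, v0=1.0)
```

Because $R^i(x) \le r^i(x)$ and the denominator is positive, the modified payoff never exceeds the original one.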

Modified Stochastic Games 0

Attempt 1: The modified payoff is the minimum of the stage payoff and the stage max-min value.

[Figure] Original game: 0, 2. Modified game: 0, 1.

The max-min value changed!

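Modified Stochastic Games 1

Let $v_1^i, \dots, v_L^i$ be the different limit max-min values of player $i$.

$$t_\lambda(s_1,\sigma;l) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{n=1}^{\infty} \lambda(1-\lambda)^{n-1}\, \mathbf{1}_{\{v_0^i(s_n) = v_l^i\}} \Big]$$

$$u_\lambda^i(s_1,\sigma;l) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{n=1}^{\infty} \lambda(1-\lambda)^{n-1}\, r^i(s_n,a_n)\, \mathbf{1}_{\{v_0^i(s_n) = v_l^i\}} \Big]$$

$$\Gamma_\lambda^i(s_1,\sigma) := \sum_{l=1}^{L} \min\big\{ u_\lambda^i(s_1,\sigma;l),\; v_l^i\, t_\lambda(s_1,\sigma;l) \big\}$$

Here $v_l^i$ is the max-min value.

The modified game is the normal-form game $\big( I, (\Sigma^i)_{i \in I}, (\Gamma_\lambda^i(s_1,\sigma))_{i \in I} \big)$.

Note: The modified game depends on the initial state.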
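To make the first modification concrete, the sketch below evaluates $t_\lambda$, $u_\lambda^i$ and $\Gamma_\lambda^i$ for the Markov chain induced by a fixed stationary strategy profile. Everything here is an assumption made for illustration: the transition matrix `P`, the expected stage payoffs `r`, and the limit max-min values `v0` are supplied by hand and are not derived from an actual game, and the infinite sum is truncated at `horizon` stages.

```python
def modified_payoff_first(P, r, v0, s1, lam, horizon=5000):
    """Return (Gamma, t, u) for the chain started at state s1.

    P[s][s2] : transition probability from s to s2 under the fixed profile
    r[s]     : expected stage payoff of player i in state s
    v0[s]    : limit max-min value of player i in state s (taken as given)
    """
    levels = sorted(set(v0))               # the distinct values v_1, ..., v_L
    t = {v: 0.0 for v in levels}           # discounted time spent at each level
    u = {v: 0.0 for v in levels}           # discounted payoff earned at each level
    n_states = len(P)
    dist = [1.0 if s == s1 else 0.0 for s in range(n_states)]
    weight = lam                           # lam * (1 - lam)**(n - 1) at stage n
    for _ in range(horizon):
        for s, prob in enumerate(dist):
            t[v0[s]] += weight * prob
            u[v0[s]] += weight * prob * r[s]
        # Advance the state distribution by one stage.
        dist = [sum(dist[s] * P[s][s2] for s in range(n_states))
                for s2 in range(n_states)]
        weight *= 1 - lam
    gamma = sum(min(u[v], v * t[v]) for v in levels)
    return gamma, t, u


# Hypothetical two-state chain: state 0 moves to the absorbing state 1
# with probability 1/2 at every stage.
P = [[0.5, 0.5], [0.0, 1.0]]
r = [2.0, 0.0]
v0 = [1.0, 0.0]
gamma, t, u = modified_payoff_first(P, r, v0, s1=0, lam=0.1)
```

In this toy chain the payoff collected in state 0 exceeds $v_l^i\, t_\lambda$ for the level $v_l^i = 1$, so the minimum caps it; the total of `t` over all levels is approximately 1 because the weights $\lambda(1-\lambda)^{n-1}$ sum to 1.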

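Modified Stochastic Games 2

Let $\tau_k$ be the $k$-th time at which the limit max-min value of player $i$ changes:

$$\tau_0 := 1, \qquad \tau_{k+1} := \min\{ n > \tau_k : v_0^i(s_n) \ne v_0^i(s_{\tau_k}) \}$$

$$t_\lambda(s_1,\sigma;k) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{n=1}^{\infty} \lambda(1-\lambda)^{n-1}\, \mathbf{1}_{\{\tau_k \le n < \tau_{k+1}\}} \,\Big|\, H(\tau_k) \Big]$$

$$u_\lambda^i(s_1,\sigma;k) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{n=1}^{\infty} \lambda(1-\lambda)^{n-1}\, r^i(s_n,a_n)\, \mathbf{1}_{\{\tau_k \le n < \tau_{k+1}\}} \,\Big|\, H(\tau_k) \Big]$$

$$\Gamma_\lambda^i(s_1,\sigma) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{k=0}^{\infty} \min\big\{ u_\lambda^i(s_1,\sigma;k),\; v_0^i(s_{\tau_k})\, t_\lambda(s_1,\sigma;k) \big\} \Big]$$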

Results

First modified game:

$$\Gamma_\lambda^i(s_1,\sigma) := \sum_{l=1}^{L} \min\big\{ u_\lambda^i(s_1,\sigma;l),\; v_l^i\, t_\lambda(s_1,\sigma;l) \big\}$$

Second modified game:

$$\Gamma_\lambda^i(s_1,\sigma) := \mathbf{E}_{s_1,\sigma}\Big[ \sum_{k=0}^{\infty} \min\big\{ u_\lambda^i(s_1,\sigma;k),\; v_0^i(s_{\tau_k})\, t_\lambda(s_1,\sigma;k) \big\} \Big]$$

Theorem: In both modified games, $V_0^i(s_1) = v_0^i(s_1)$.

Question: Does there exist a stationary strategy that is almost optimal for all initial states?

Theorem: The first modified game admits a discounted stationary equilibrium (that depends on the initial state). The second modified game admits a more complex equilibrium.

Theorem: Analogous results hold for the min-max modification.

Monovex Sets

Definition: A set $X$ in $\mathbb{R}^d$ is monovex if for every $x, y$ in $X$ there is a continuous monotone path from $x$ to $y$ in $X$.

Question: Is every monovex set contractible?

Theorem: In the first modified game, if the other players play stationary strategies, then player $i$ has an optimal stationary best response. Moreover, the set of his stationary best responses is monovex.
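The notion of a monotone path can be illustrated on piecewise-linear paths. The sketch below assumes that "monotone" means each coordinate is weakly monotone (nondecreasing or nonincreasing) along the path; the slide does not spell this out, so treat that reading, the function name, and the sample paths as assumptions. It only checks monotonicity of a given polyline, not whether the path stays inside a set $X$.

```python
def is_monotone_path(points):
    """True if every coordinate is weakly monotone along the polyline.

    points: list of equal-length tuples, the vertices of a piecewise-linear path.
    Linear interpolation between vertices preserves coordinate-wise monotonicity,
    so checking the vertices is enough for such a path.
    """
    if len(points) < 2:
        return True
    for d in range(len(points[0])):
        coords = [pt[d] for pt in points]
        steps = [b - a for a, b in zip(coords, coords[1:])]
        nondecreasing = all(step >= 0 for step in steps)
        nonincreasing = all(step <= 0 for step in steps)
        if not (nondecreasing or nonincreasing):
            return False
    return True


staircase = is_monotone_path([(0, 0), (1, 0), (1, 1)])   # both coordinates go up
tent = is_monotone_path([(0, 0), (1, 1), (2, 0)])        # second coordinate goes up, then down
diagonal = is_monotone_path([(0, 1), (1, 0)])            # one up, one down
```

Under this reading every convex set is monovex, since a straight segment is monotone in each coordinate, while an L-shaped set is monovex without being convex.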

Monovex Sets

Theorem: Every upper semi-continuous set-valued function from a compact convex subset of $\mathbb{R}^d$ to itself with nonempty monovex values has a fixed point.

To be continued: IHP, February 15, 2016, 10 AM

תודה רבה · Thank you · شكرا · Merci
