Uri Zwick Tel Aviv University. Simple Stochastic Games Mean Payoff Games Parity Games. Zero sum games. Mixed strategies. Max-min theorem. …. Stochastic games [Shapley (1953)]. Mixed positional (memoryless) optimal strategies. Simple Stochastic games (SSGs).
Uri Zwick, Tel Aviv University
Simple Stochastic Games, Mean Payoff Games, Parity Games
Zero-sum games
Mixed strategies. Max-min theorem. …
Stochastic games [Shapley (1953)]
Mixed positional (memoryless) optimal strategies.
Simple Stochastic Games (SSGs)
Every game has only one row or column.
Pure positional (memoryless) optimal strategies.
Simple Stochastic Games (SSGs): graphic representation
Vertex legend: M = MAX, m = min, R = RAND.
The players construct an (infinite) path e0, e1, …
Variants: terminating version, non-terminating version, discounted version.
Fixed-duration games are easily solved using dynamic programming.
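The dynamic programming remark can be illustrated with a minimal sketch. It assumes a hypothetical encoding in which every vertex has a kind (MAX, MIN, or RAND) and a list of outgoing edges with payoffs, that RAND picks an outgoing edge uniformly at random, and that the value of a play is the total payoff collected over a fixed number of moves. None of these details are spelled out on the slide.

```python
from fractions import Fraction

# Hypothetical encoding: vertex -> (kind, [(successor, payoff), ...]).
def fixed_duration_values(game, steps):
    """Value of every vertex when the play lasts exactly `steps` moves."""
    values = {v: Fraction(0) for v in game}  # zero moves left: nothing collected
    for _ in range(steps):
        new_values = {}
        for v, (kind, edges) in game.items():
            # Payoff of the edge plus the value of the remaining, shorter game.
            options = [Fraction(payoff) + values[succ] for succ, payoff in edges]
            if kind == "MAX":
                new_values[v] = max(options)
            elif kind == "MIN":
                new_values[v] = min(options)
            else:  # RAND: uniform average over the outgoing edges (assumption)
                new_values[v] = sum(options) / len(options)
        values = new_values
    return values

# Illustrative three-vertex game (made up for this sketch).
game = {
    "a": ("MAX", [("b", 1), ("c", 0)]),
    "b": ("MIN", [("a", 0), ("c", 4)]),
    "c": ("RAND", [("a", 2), ("c", 0)]),
}
two_step = fixed_duration_values(game, 2)
```

Each round of the loop extends the horizon by one move, so the running time is proportional to the number of moves times the number of edges.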
Simple Stochastic Games (SSGs): graphic representation, example
[Figure: example game graph with a marked start vertex. Vertex legend: M = MAX, m = min, R = RAND.]
Simple Stochastic Games (SSGs): reachability version [Condon (1992)]
Vertex legend: M = MAX, m = min, R = RAND, plus a 0-sink and a 1-sink.
No weights.
Objective: maximize / minimize the probability of getting to the 1-sink.
All probabilities are ½.
Technical assumption: the game halts with probability 1.
Simple Stochastic Games (SSGs): basic properties
Every vertex in the game has a value v.
Both players have positional optimal strategies.
Positional strategy for MAX: choice of an outgoing edge from each MAX vertex.
Decision version: is the value at least v?
“Solving” binary SSGs
The values v_i of the vertices of a game are the unique solution of a system of equations, one per vertex.
The values are rational numbers requiring only a linear number of bits.
Corollary: the decision version is in NP ∩ co-NP.
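For reference, the standard optimality equations for a binary reachability SSG can be sketched as follows. The symbols j and k, denoting the two successors of vertex i, are notation introduced here, and uniqueness of the solution relies on the assumption that the game halts with probability 1.

```latex
v_i =
\begin{cases}
\max\{v_j, v_k\} & \text{if } i \text{ is a MAX vertex},\\
\min\{v_j, v_k\} & \text{if } i \text{ is a min vertex},\\
\tfrac{1}{2}\,(v_j + v_k) & \text{if } i \text{ is a RAND vertex},\\
0 & \text{if } i \text{ is the 0-sink},\\
1 & \text{if } i \text{ is the 1-sink}.
\end{cases}
```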
Markov Decision Processes (MDPs)
Vertex legend: M = MAX, m = min, R = RAND.
Theorem [Derman (1970)]: values and optimal strategies of an MDP can be found by solving an LP.
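One common way to write such an LP is sketched below. It assumes a binary reachability game that halts with probability 1 and in which only min and RAND vertices remain. The setting and the symbols j, k for the two successors of vertex i are assumptions made for this sketch, not taken from the slide.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_i v_i \\
\text{subject to}\quad
& v_i \le v_j,\;\; v_i \le v_k && \text{for each min vertex } i,\\
& v_i = \tfrac{1}{2}\,(v_j + v_k) && \text{for each RAND vertex } i,\\
& v_{0\text{-sink}} = 0,\quad v_{1\text{-sink}} = 1 .
\end{aligned}
```

Under these assumptions, an optimal positional strategy for min can be read off by choosing, at each min vertex, a successor whose value attains the minimum.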
NP ∩ co-NP: another proof
Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP.
To show that the value is at least v, guess an optimal strategy for MAX.
Find an optimal counter-strategy for min by solving the resulting MDP.
Is the problem in P?
Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]
Vertex legend: M = MAX, m = min, R = RAND.
Non-terminating version; discounted version; reachability SSGs (PZ’96).
MPGs: pseudo-polynomial algorithm (PZ’96).
Mean Payoff Games (MPGs)[Ehrenfeucht, Mycielski (1979)] Value – average of the cycle
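As an illustration of "average of the cycle", here is a minimal sketch that assumes a deterministic game and that both players have already fixed positional strategies. The dictionaries `choice` and `payoff` are hypothetical encodings introduced for this example.

```python
from fractions import Fraction

def mean_payoff(start, choice, payoff):
    """Follow the unique path from `start` and average the payoffs on its cycle.

    choice[v] is the successor picked at v by whichever player owns v;
    payoff[(u, v)] is the integer payoff of edge (u, v).
    """
    position = {}  # vertex -> index at which it was first visited
    path = []
    v = start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = choice[v]
    cycle = path[position[v]:]  # the part of the path that repeats forever
    total = sum(payoff[(u, choice[u])] for u in cycle)
    return Fraction(total, len(cycle))

# Illustrative data: a -> b -> c -> b -> c -> ...
choice = {"a": "b", "b": "c", "c": "b"}
payoff = {("a", "b"): 10, ("b", "c"): 3, ("c", "b"): -1}
value_from_a = mean_payoff("a", choice, payoff)
```

The payoff on the edge leaving `a` does not affect the result, since only the edges on the repeated cycle contribute to the long-run average.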
Parity Games (PGs)
[Figure: vertices of players EVEN and ODD labeled with priorities, e.g. 8 and 3.]
EVEN wins if the largest priority seen infinitely often is even.
Equivalent to many interesting problems in automata and verification:
Non-emptiness of ω-tree automata
Modal μ-calculus model checking
Parity Games (PGs) → Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)]
[Figure: vertices of players EVEN and ODD labeled with priorities, e.g. 8 and 3.]
Change priority k to payoff (−n)^k.
Move payoffs to outgoing edges.
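A minimal sketch of this transformation follows, assuming n is the number of vertices and that the payoff has the form (−n)^k, which is how the reduction is usually stated. The dictionary and edge-list encodings are hypothetical.

```python
def parity_to_mean_payoff(priority, edges):
    """Turn vertex priorities into edge payoffs.

    priority[v] is the priority of vertex v and `edges` is a list of (u, v)
    pairs. A vertex of priority k gets payoff (-n)**k, and that payoff is
    moved onto each of its outgoing edges.
    """
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for (u, v) in edges}

# Illustrative three-vertex game.
priority = {"x": 2, "y": 1, "z": 3}
edges = [("x", "y"), ("y", "x"), ("y", "z"), ("z", "x")]
weights = parity_to_mean_payoff(priority, edges)

# Cycle x -> y -> x has largest priority 2 (even), so its total is positive.
even_cycle = weights[("x", "y")] + weights[("y", "x")]
# Cycle x -> y -> z -> x has largest priority 3 (odd), so its total is negative.
odd_cycle = weights[("x", "y")] + weights[("y", "z")] + weights[("z", "x")]
```

With this choice the vertices of largest priority on a simple cycle dominate the sum of all the others, so the sign of the cycle's total payoff matches the parity of its largest priority.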
Simple Stochastic Games (SSGs): additional properties
An SSG is said to be binary if the outdegree of every non-sink vertex is 2.
A switch is a change of a strategy at a single vertex.
A switch is profitable for MAX if it increases the value of the game (sum of values of all vertices).
A strategy is optimal iff no switch is profitable.
A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
Start with an arbitrary strategy σ for MAX.
Choose a random vertex i ∈ V_MAX.
Find the optimal strategy σ′ for MAX in the game in which the only outgoing edge from i is (i, σ(i)).
If switching σ′ at i is not profitable, then σ′ is optimal.
Otherwise, let σ ← (σ′)^i, where (σ′)^i denotes σ′ switched at i, and repeat.
A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
Figure labels: MAX vertices; "All correct!"; "Would never be switched!"
There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1, 2, …, i.
Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]
Figure labels: vertices of highest priority (even); vertices from which EVEN can force the game to enter A; first recursive call; second recursive call.
In the worst case, both recursive calls are on games of size n−1.
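The two recursive calls can be sketched as below. This is a minimal illustration of the recursive scheme commonly attributed to McNaughton and Zielonka, not the exact formulation on the slide. The encoding is hypothetical: `owner[v]` is 0 for EVEN and 1 for ODD, `edges[v]` lists the successors of v, and every vertex is assumed to have at least one successor.

```python
def attractor(vertices, edges, owner, player, target):
    """Vertices of `vertices` from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            succ = [w for w in edges[v] if w in vertices]
            if owner[v] == player:
                pulled = any(w in attr for w in succ)  # player picks a good edge
            else:
                pulled = all(w in attr for w in succ)  # opponent cannot escape
            if pulled:
                attr.add(v)
                changed = True
    return attr

def solve(vertices, edges, owner, priority):
    """Return (W0, W1), the winning sets of EVEN (0) and ODD (1)."""
    if not vertices:
        return set(), set()
    top = max(priority[v] for v in vertices)
    player = top % 2  # the player who benefits from the highest priority
    A = {v for v in vertices if priority[v] == top}
    B = attractor(vertices, edges, owner, player, A)
    win = list(solve(vertices - B, edges, owner, priority))  # first recursive call
    if not win[1 - player]:
        win[player] = set(vertices)  # the opponent wins nowhere
        return win[0], win[1]
    C = attractor(vertices, edges, owner, 1 - player, win[1 - player])
    win = list(solve(vertices - C, edges, owner, priority))  # second recursive call
    win[1 - player] = win[1 - player] | C
    return win[0], win[1]

# Illustrative game: EVEN owns x (priority 2, self-loop); ODD owns y
# (priority 1) and may either stay at y or move to x.
edges = {"x": ["x"], "y": ["y", "x"]}
owner = {"x": 0, "y": 1}
priority = {"x": 2, "y": 1}
even_wins, odd_wins = solve({"x", "y"}, edges, owner, priority)
```

Both recursive calls are made on strictly smaller vertex sets, which is where the exponential worst case mentioned on the slide comes from.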
Deterministic subexponential algorithm for PGs [Jurdzinski, Paterson, Z (2006)]
Idea: look for small dominions!
Figure label: second recursive call.
Dominions of size s can be found in O(n^s) time.
Dominion: a (small) set from which one of the players can win without the play ever leaving this set.
Open problems
• Polynomial algorithms?
• Faster subexponential algorithms for parity games?
• Deterministic subexponential algorithms for MPGs and SSGs?
• Faster pseudo-polynomial algorithms for MPGs?