1.39k likes | 1.41k Views
This article explores partial-observation stochastic games and strategies to win when belief fails. It discusses repair/synthesis as a game, the role of hidden and visible states, and observation-based strategies.
E N D
Partial-Observation Stochastic Games: How to Win when Belief Fails Laurent DoyenLSV, ENS Cachan & CNRS & Krishnendu ChatterjeeIST Austria Luxembourg, December 8th, 2011
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Wrong!
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; s0 ≡ got_lock = 0 s1 ≡ got_lock = 1 Inc ≡ got_lock++ dec ≡ got_lock-- void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; Repair/synthesis as a game: • System vs. Environment • Turn-based game graph • ω-regular objective void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; • A winning strategy may use the value of L to update got_lock accrodingly: • if L==0 then play s0 (got_lock = 0) • if L==1 then play s1 (got_lock = 1) void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; Repair/synthesis as a game of partial observation: visible hidden void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } States that differ only by the value of L have the same observation
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s1; s0; void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Partial-Observation Stochastic Games
Examples • Poker • - partial-observation • - stochastic
Examples • Poker • - partial-observation • - stochastic • Bonneteau
Bonneteau 2 black card, 1 red card Initially, all are face down Goal: find the red card
Bonneteau 2 black card, 1 red card Initially, all are face down Goal: find the red card • Rules: • Player 1 points a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if pointed card is red
Bonneteau 2 black card, 1 red card Initially, all are face down Goal: find the red card • Rules: • Player 1 points a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if pointed card is red
Bonneteau 2 black card, 1 red card Initially, all are face down Goal: find the red card • Rules: • Player 1 points a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if pointed card is red
Observation-based strategy This strategy is observation-based, e.g. after it plays
Observation-based strategy This strategy is observation-based, e.g. after it plays
Optimal ? This strategy is winning with probability 2/3
Game Model • This game is: • turn-based • (almost) non-stochastic • player 2 has perfect observation
Interaction General case: concurrent & stochastic Player 1’s move Player 2’s move Players choose their moves simultaneously and independently
Interaction -player games General case: concurrent & stochastic Probability distribution on successor state Player 1’s move Player 2’s move
Interaction Turn-based games Special cases: • player-1 state • player-2 state
Partial-observation General case: 2-sided partial observation Two partitions and Observations: partitions induced by coloring
Partial-observation General case: 2-sided partial observation Two partitions and Observations: partitions induced by coloring Player 2’s view Player 1’s view
Partial-observation Observations: partitions induced by coloring Special case: 1-sided partial observation or
Strategies & objective A strategy for Player is a function that maps histories (sequences of observations) to probability distribution over actions.
Strategies & objective A strategy for Player is a function that maps histories (sequences of observations) to probability distribution over actions. randomized History-depedent
Strategies & objective A strategy for Player is a function that maps histories (sequences of observations) to probability distribution over actions. Reachability objective: Winning probability of :
Qualitative analysis The following problem is undecidable: (already for probabilistic automata [Paz71]) Decide if there exists a strategy for player 1 that is winning with probability at least ½. [Paz71] Paz. Introduction to Probabilistic Automata. Academic Press 1971.
Qualitative analysis The following problem is undecidable: (already for probabilistic automata [Paz71]) Decide if there exists a strategy for player 1 that is winning with probability at least ½. Qualitative analysis: • Almost-sure: … winning with probability 1 • Positive: … winning with probability > 0
Applications in verification tank control • Control with inaccurate digital sensors • multi-process control with private variables • multi-agent protocols • planning with uncertainty/unknown void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }
Example 1 Player 1 partial, player 2 perfect
Example 1 Player 1 partial, player 2 perfect No pure strategy of Player 1 is winning with probability 1(example from [CDHR06]).
Example 1 Player 1 partial, player 2 perfect No pure strategy of Player 1 is winning with probability 1(example from [CDHR06]).
Example 1 Player 1 partial, player 2 perfect Player 1 wins with probability 1, and needs randomization Belief-based-only randomized strategies are sufficient
Example 2 Player 1 partial, player 2 perfect