Laurent Doyen LSV, ENS Cachan & CNRS

Partial-Observation Stochastic Games: How to Win when Belief Fails
Laurent Doyen (LSV, ENS Cachan & CNRS) & Krishnendu Chatterjee (IST Austria)
Luxembourg, December 8th, 2011

Example
void main () {
  int got_lock = 0;
  do {
1:  if (*) {
2:    lock ();
3:    got_lock++;
    }
4:  if (got_lock != 0) {
5:    unlock ();
    }
6:  got_lock--;
  } while (*);
}
void lock () { assert(L == 0); L = 1; }
void unlock () { assert(L == 1); L = 0; }
Wrong!

Example
void main () {
  int got_lock = 0;
  do {
1:  if (*) {
2:    lock ();
3:    s0 | s1 | inc | dec;
    }
4:  if (got_lock != 0) {
5:    unlock ();
    }
6:  s0 | s1 | inc | dec;
  } while (*);
}
void lock () { assert(L == 0); L = 1; }
void unlock () { assert(L == 1); L = 0; }
s0 ≡ got_lock = 0
s1 ≡ got_lock = 1
inc ≡ got_lock++
dec ≡ got_lock--

Example
Repair/synthesis as a game:
• System vs. Environment
• Turn-based game graph
• ω-regular objective

Example
A winning strategy may use the value of L to update got_lock accordingly:
• if L==0 then play s0 (got_lock = 0)
• if L==1 then play s1 (got_lock = 1)

Example
Repair/synthesis as a game of partial observation: part of the state is visible, part is hidden.
States that differ only by the value of L have the same observation.

Example
void main () {
  int got_lock = 0;
  do {
1:  if (*) {
2:    lock ();
3:    s1;
    }
4:  if (got_lock != 0) {
5:    unlock ();
    }
6:  s0;
  } while (*);
}
void lock () { assert(L == 0); L = 1; }
void unlock () { assert(L == 1); L = 0; }
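The effect of the repair can be checked by brute force. The following is a minimal sketch, not the authors' tool: it assumes the nondeterministic tests `if (*)` are resolved by the bits of a hypothetical `choices` argument, bounds the loop by a hypothetical `iters` parameter, and records a failed assertion in a flag instead of aborting. The names `run` and `safe_for_all_choices` are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Shared lock variable L and a flag recording a failed assertion. */
static int L;
static bool violated;

static void lock(void)   { if (L != 0) violated = true; L = 1; }
static void unlock(void) { if (L != 1) violated = true; L = 0; }

/*
 * Runs `iters` iterations of the loop. Bit i of `choices` resolves the
 * nondeterministic test `if (*)` of iteration i (assumption of this sketch).
 * repaired == false: original updates (got_lock++ at 3, got_lock-- at 6).
 * repaired == true : repaired updates (s1 at 3, s0 at 6).
 * Returns true if no assertion of lock()/unlock() was violated.
 */
bool run(unsigned choices, int iters, bool repaired)
{
    assert(iters >= 0 && iters <= 16);
    int got_lock = 0;
    L = 0;
    violated = false;
    for (int i = 0; i < iters; i++) {
        if ((choices >> i) & 1u) {
            lock();
            if (repaired) got_lock = 1; else got_lock++;
        }
        if (got_lock != 0) {
            unlock();
        }
        if (repaired) got_lock = 0; else got_lock--;
    }
    return !violated;
}

/* Checks every resolution of the nondeterministic choices for `iters` iterations. */
bool safe_for_all_choices(int iters, bool repaired)
{
    for (unsigned c = 0; c < (1u << iters); c++) {
        if (!run(c, iters, repaired)) {
            return false;
        }
    }
    return true;
}
```

In the original program, two iterations that both skip the lock already fail: got_lock becomes -1 after the first iteration, so unlock() is called while L == 0. With s1 and s0, every iteration ends with got_lock == 0 and L == 0.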

Partial-Observation Stochastic Games

Examples
• Poker
  - partial-observation
  - stochastic
• Bonneteau

Bonneteau
2 black cards, 1 red card
Initially, all are face down
Goal: find the red card
Rules:
• Player 1 points to a card
• Player 2 flips one of the remaining black cards
• Player 1 may change his mind, and wins if the pointed card is red

Bonneteau: Game Model

Game Model

Observations (for player 1)

Observation-based strategy This strategy is observation-based, e.g. after it plays

Optimal?
This strategy is winning with probability 2/3
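The value 2/3 can be reproduced by enumeration. The sketch below makes two assumptions that are not stated on the slide: the red card is placed uniformly at random among the three positions, and the strategy in question is "point a card, then always change your mind". Player 2 is treated as adversarial, so a position of the red card counts as won only if Player 1 wins against every legal flip. The name `winning_positions` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define CARDS 3

/*
 * Returns the number of positions of the red card (out of CARDS) from which
 * Player 1 wins against every legal flip of Player 2, when Player 1 first
 * points card 0 and then either changes his mind or keeps his card.
 */
int winning_positions(bool change_mind)
{
    const int pointed = 0;
    int wins = 0;
    for (int red = 0; red < CARDS; red++) {
        bool wins_all = true;
        for (int flipped = 0; flipped < CARDS; flipped++) {
            /* Player 2 must flip a remaining black card. */
            if (flipped == pointed || flipped == red) {
                continue;
            }
            /* Card indices sum to 0 + 1 + 2, so the third card is the rest. */
            int final_card = change_mind ? (0 + 1 + 2) - pointed - flipped
                                         : pointed;
            assert(final_card != flipped);
            if (final_card != red) {
                wins_all = false;
            }
        }
        if (wins_all) {
            wins++;
        }
    }
    return wins;
}
```

Under these assumptions, changing one's mind wins from 2 of the 3 equally likely positions, and keeping the first card wins from only 1 of them.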

Game Model
This game is:
• turn-based
• (almost) non-stochastic
• player 2 has perfect observation

Interaction
General case: concurrent & stochastic
Players choose their moves simultaneously and independently
[Figure labels: Player 1’s move, Player 2’s move]

Interaction
-player games
General case: concurrent & stochastic
Probability distribution on successor state
[Figure labels: Player 1’s move, Player 2’s move]
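One way to read the concurrent and stochastic case is as a table that maps a state and a pair of moves to a probability distribution on successor states. The sketch below is an illustration under that assumption. The table `DELTA`, the two-state example game (matching moves lead to state 1, otherwise the game stays in state 0) and the function `step_prob` are hypothetical and do not come from the slides.

```c
#include <assert.h>

#define NUM_STATES 2
#define NUM_MOVES  2

/*
 * DELTA[s][a][b][t]: probability of moving from state s to state t when
 * Player 1 plays a and Player 2 plays b (hypothetical example game).
 */
static const double DELTA[NUM_STATES][NUM_MOVES][NUM_MOVES][NUM_STATES] = {
    /* state 0: matching moves reach state 1, otherwise stay in state 0 */
    { { {0.0, 1.0}, {1.0, 0.0} },
      { {1.0, 0.0}, {0.0, 1.0} } },
    /* state 1: absorbing */
    { { {0.0, 1.0}, {0.0, 1.0} },
      { {0.0, 1.0}, {0.0, 1.0} } }
};

/*
 * Probability of moving from s to t when the players choose their moves
 * independently, Player 1 according to x and Player 2 according to y.
 */
double step_prob(int s, const double x[NUM_MOVES],
                 const double y[NUM_MOVES], int t)
{
    assert(s >= 0 && s < NUM_STATES && t >= 0 && t < NUM_STATES);
    double p = 0.0;
    for (int a = 0; a < NUM_MOVES; a++) {
        for (int b = 0; b < NUM_MOVES; b++) {
            p += x[a] * y[b] * DELTA[s][a][b][t];
        }
    }
    return p;
}
```

Because the choices are independent, the probability of a pair of moves is the product x[a] * y[b]. In this example, a uniform choice of Player 1 reaches state 1 with probability 1/2 whatever Player 2 plays.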

Interaction
Turn-based games
Special cases:
• player-1 state
• player-2 state

Partial-observation
General case: 2-sided partial observation
Two partitions of the state space, one for each player
Observations: partitions induced by coloring
[Figure labels: Player 2’s view, Player 1’s view]

Partial-observation
Observations: partitions induced by coloring
Special case: 1-sided partial observation (one of the two players has perfect observation)

Strategies & objective
A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions.
Strategies are in general randomized and history-dependent.

Strategies & objective
Reachability objective:
Winning probability of :
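The definition of a strategy translates directly into a function type. The sketch below is illustrative only: observations are encoded as integers, there are two actions, and the names `Dist`, `Strategy`, `alternate`, `mix_on_last` and `prob` are hypothetical. It shows one pure history-dependent strategy and one randomized strategy that looks only at the last observation.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ACTIONS 2

/* A probability distribution over actions. */
typedef struct { double p[NUM_ACTIONS]; } Dist;

/* A strategy maps a history (sequence of observations) to a distribution. */
typedef Dist (*Strategy)(const int *obs_history, size_t len);

/* Pure and history-dependent: the action alternates with the history length. */
Dist alternate(const int *obs_history, size_t len)
{
    (void)obs_history; /* only the length of the history is used */
    Dist d = { {0.0, 0.0} };
    d.p[len % NUM_ACTIONS] = 1.0;
    return d;
}

/* Randomized, depends on the last observation only. */
Dist mix_on_last(const int *obs_history, size_t len)
{
    assert(len > 0);
    Dist d;
    if (obs_history[len - 1] == 0) {
        d.p[0] = 0.5; d.p[1] = 0.5;   /* toss a fair coin */
    } else {
        d.p[0] = 1.0; d.p[1] = 0.0;   /* play action 0 */
    }
    return d;
}

/* Probability that strategy s plays `action` after the given history. */
double prob(Strategy s, const int *obs_history, size_t len, int action)
{
    assert(action >= 0 && action < NUM_ACTIONS);
    return s(obs_history, len).p[action];
}
```

A pure strategy is the special case where every returned distribution puts probability 1 on a single action.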

Qualitative analysis
The following problem is undecidable (already for probabilistic automata [Paz71]):
Decide if there exists a strategy for player 1 that is winning with probability at least ½.
[Paz71] Paz. Introduction to Probabilistic Automata. Academic Press, 1971.

Qualitative analysis
Qualitative analysis asks instead whether there exists a strategy for player 1 that is:
• Almost-sure winning: winning with probability 1
• Positive winning: winning with probability > 0

Applications in verification
• Control with inaccurate digital sensors
• multi-process control with private variables
• multi-agent protocols
• planning with uncertainty/unknown
[Figure: tank control]

Example 1
Player 1 partial, player 2 perfect
No pure strategy of Player 1 is winning with probability 1 (example from [CDHR06]).

Example 1
Player 1 partial, player 2 perfect
Player 1 wins with probability 1, and needs randomization
Belief-based-only randomized strategies are sufficient
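A belief-based strategy depends only on the set of states that Player 1 considers possible. The sketch below assumes the usual subset-construction definition of this belief, which the slides do not spell out: take all possible successors of the current belief under the played action, then keep those compatible with the new observation. The four-state game encoded in `POST` is a hypothetical example, not the game of Example 1, and `update_belief` is an illustrative name.

```c
#include <assert.h>

#define NUM_STATES  4
#define NUM_ACTIONS 2

/* A set of states, encoded as a bit mask: bit s is set iff state s belongs. */
typedef unsigned StateSet;
#define S(i) (1u << (i))

/*
 * POST[s][a]: states reachable with positive probability from s when
 * Player 1 plays a, over all moves of Player 2 (hypothetical game).
 * Observations in this example: {0}, {1, 2} and {3}.
 */
static const StateSet POST[NUM_STATES][NUM_ACTIONS] = {
    { S(1) | S(2), S(1) | S(2) },  /* state 0: hidden branching */
    { S(3),        S(1)        },  /* state 1 */
    { S(2),        S(3)        },  /* state 2 */
    { S(3),        S(3)        }   /* state 3: absorbing */
};

/*
 * New belief after playing `action` from `belief` and then receiving
 * `observation` (given as the set of states carrying that observation).
 */
StateSet update_belief(StateSet belief, int action,
                       const StateSet post[][NUM_ACTIONS],
                       StateSet observation)
{
    assert(action >= 0 && action < NUM_ACTIONS);
    StateSet next = 0;
    for (int s = 0; s < NUM_STATES; s++) {
        if (belief & S(s)) {
            next |= post[s][action];
        }
    }
    return next & observation;
}
```

In this example, playing action 0 from the belief {1, 2} yields the belief {2} if the observation {1, 2} is seen again, and {3} if the observation {3} is seen.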

Example 2 Player 1 partial, player 2 perfect
