
Partial-Observation Stochastic Games: How to Win when Belief Fails

This article explores partial-observation stochastic games and strategies to win when belief fails. It discusses repair/synthesis as a game, the role of hidden and visible states, and observation-based strategies.

tyrellb


Presentation Transcript


  1. Partial-Observation Stochastic Games: How to Win when Belief Fails. Laurent Doyen (LSV, ENS Cachan & CNRS) and Krishnendu Chatterjee (IST Austria). Luxembourg, December 8th, 2011

  2. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }

  3. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Wrong!

  4. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Candidate statements for lines 3 and 6: s0 ≡ got_lock = 0; s1 ≡ got_lock = 1; inc ≡ got_lock++; dec ≡ got_lock--

  5. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Repair/synthesis as a game: • System vs. Environment • Turn-based game graph • ω-regular objective

  6. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }

  7. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } • A winning strategy may use the value of L to update got_lock accordingly: • if L==0 then play s0 (got_lock = 0) • if L==1 then play s1 (got_lock = 1)

  8. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Repair/synthesis as a game of partial observation: got_lock is visible, L is hidden. States that differ only by the value of L have the same observation.
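The observation described on this slide can be sketched as a projection that drops the hidden variable. The state encoding below (a tuple of program location, got_lock, and L) and the names State and observation are assumptions made for illustration; they do not come from the slides.

```python
from collections import namedtuple

# Hypothetical state encoding: program location, visible variable got_lock,
# hidden variable L. None of these names appear in the original slides.
State = namedtuple("State", ["pc", "got_lock", "L"])


def observation(state):
    """Return what the repair player sees: everything except the hidden L."""
    return (state.pc, state.got_lock)


a = State(pc=4, got_lock=1, L=0)
b = State(pc=4, got_lock=1, L=1)  # differs from a only in L
c = State(pc=4, got_lock=0, L=1)  # differs from a in a visible variable

print(observation(a) == observation(b))  # same observation
print(observation(a) == observation(c))  # different observation
```

A strategy that is a function of observation(state) cannot branch on L, which is why the strategy of the previous slide (test L, then play s0 or s1) is not available under partial observation.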

  9. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } s0 | s1 | inc | dec; s0 | s1 | inc | dec; void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }

  10. Example void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: s0 | s1 | inc | dec; } 4: if (got_lock != 0) { 5: unlock (); } 6: s0 | s1 | inc | dec; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; } Repair: s1 at line 3, s0 at line 6.
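The annotations "s1; s0;" on this slide suggest the repair got_lock = 1 at line 3 and got_lock = 0 at line 6. The following sketch checks that reading by exploring every resolution of the nondeterministic choices (*) in a Python model of the loop. It assumes L is initially 0, which the slides do not state, and the helper names run_iteration and is_safe are hypothetical.

```python
def run_iteration(got_lock, L, take_branch, stmt3, stmt6):
    """Run one loop iteration. Return the new (got_lock, L),
    or None if an assertion in lock() or unlock() fails."""
    if take_branch:                 # line 1: if (*)
        if L != 0:                  # lock(): assert(L == 0)
            return None
        L = 1
        got_lock = stmt3(got_lock)  # line 3
    if got_lock != 0:               # line 4
        if L != 1:                  # unlock(): assert(L == 1)
            return None
        L = 0
    got_lock = stmt6(got_lock)      # line 6
    return (got_lock, L)


def is_safe(stmt3, stmt6, max_states=1000):
    """Explore all loop-head states reachable from got_lock = 0, L = 0
    (initial L is an assumption). Return False on any assertion failure."""
    start = (0, 0)
    seen = {start}
    frontier = [start]
    while frontier:
        got_lock, L = frontier.pop()
        for take_branch in (False, True):
            nxt = run_iteration(got_lock, L, take_branch, stmt3, stmt6)
            if nxt is None:
                return False
            if nxt not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError("state bound exceeded")
                seen.add(nxt)
                frontier.append(nxt)
    return True


# The four candidate statements from the slides.
s0 = lambda g: 0
s1 = lambda g: 1
inc = lambda g: g + 1
dec = lambda g: g - 1

print(is_safe(inc, dec))  # original program (got_lock++ / got_lock--)
print(is_safe(s1, s0))    # candidate repair: s1 at line 3, s0 at line 6
```

In this model the original program fails after two iterations that skip the branch (got_lock becomes -1, so unlock() is called while L is 0), while the repaired loop always returns to got_lock = 0, L = 0. Neither repaired statement reads L, so the repair is observation-based.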

  11. Partial-Observation Stochastic Games

  12. Examples • Poker • - partial-observation • - stochastic

  13. Examples • Poker • - partial-observation • - stochastic • Bonneteau

  14. Bonneteau 2 black cards, 1 red card. Initially, all are face down. Goal: find the red card

  15. Bonneteau 2 black cards, 1 red card. Initially, all are face down. Goal: find the red card • Rules: • Player 1 points to a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if the pointed card is red

  16. Bonneteau 2 black cards, 1 red card. Initially, all are face down. Goal: find the red card • Rules: • Player 1 points to a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if the pointed card is red

  17. Bonneteau 2 black cards, 1 red card. Initially, all are face down. Goal: find the red card • Rules: • Player 1 points to a card • Player 2 flips one remaining black card • Player 1 may change his mind, wins if the pointed card is red

  18. Bonneteau: Game Model

  19. Bonneteau: Game Model

  20. Game Model

  21. Game Model

  22. Game Model

  23. Game Model

  24. Game Model

  25. Observations (for player 1)

  26. Observations (for player 1)

  27. Observations (for player 1)

  28. Observations (for player 1)

  29. Observations (for player 1)

  30. Observation-based strategy This strategy is observation-based, e.g. after it plays

  31. Observation-based strategy This strategy is observation-based, e.g. after it plays

  32. Optimal ? This strategy is winning with probability 2/3
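The 2/3 figure can be reproduced by enumerating the Bonneteau game exactly. This sketch assumes the red card's position is uniformly random and that the strategy in question is "always change your mind"; the transcript does not spell out either point, and the function name win_probability is hypothetical.

```python
from fractions import Fraction


def win_probability(switch):
    """Exact probability that Player 1 ends up pointing to the red card.

    Assumes the red card is placed uniformly at random. Player 1 first
    points to card 0 (by symmetry the first choice does not matter)."""
    cards = [0, 1, 2]
    first_pick = 0
    total = Fraction(0)
    for red in cards:
        # Player 2 must flip a black card other than the pointed one.
        flippable = [c for c in cards if c != first_pick and c != red]
        for flipped in flippable:
            # Averaging over Player 2's options is an assumption; the
            # outcome is the same whichever black card is flipped.
            weight = Fraction(1, 3) * Fraction(1, len(flippable))
            if switch:
                final = next(c for c in cards if c not in (first_pick, flipped))
            else:
                final = first_pick
            if final == red:
                total += weight
    return total


print(win_probability(switch=True))   # 2/3
print(win_probability(switch=False))  # 1/3
```

Both strategies depend only on what Player 1 has observed (his own pick and the flipped card), never on the hidden position of the red card.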

  33. Game Model • This game is: • turn-based • (almost) non-stochastic • player 2 has perfect observation

  34. Interaction General case: concurrent & stochastic Player 1’s move Player 2’s move Players choose their moves simultaneously and independently

  35. Interaction -player games General case: concurrent & stochastic Probability distribution on successor state Player 1’s move Player 2’s move
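One round of a concurrent stochastic game can be sketched as follows: both players pick a move without seeing the other's, and the pair of moves together with the current state determines a probability distribution on successors. The two-state game, its probabilities, and the names DELTA and step are hypothetical and only illustrate the shape of the transition function.

```python
import random

STATES = ["s", "t"]
ACTIONS = ["a", "b"]


def build_delta():
    """Hypothetical transition function:
    (state, move of Player 1, move of Player 2) -> distribution on successors."""
    delta = {}
    for a1 in ACTIONS:
        for a2 in ACTIONS:
            if a1 == a2:
                # Matching moves reach "t" with probability 1/2.
                delta[("s", a1, a2)] = {"s": 0.5, "t": 0.5}
            else:
                delta[("s", a1, a2)] = {"s": 1.0}
            delta[("t", a1, a2)] = {"t": 1.0}  # "t" is absorbing
    return delta


DELTA = build_delta()


def step(state, a1, a2, rng):
    """Sample a successor state from the distribution for (state, a1, a2)."""
    dist = DELTA[(state, a1, a2)]
    successors = list(dist)
    weights = [dist[t] for t in successors]
    return rng.choices(successors, weights=weights, k=1)[0]


rng = random.Random(0)
move1 = rng.choice(ACTIONS)  # chosen independently of move2
move2 = rng.choice(ACTIONS)
print(step("s", move1, move2, rng))
```

A turn-based game is the special case where, in every state, the distribution depends on the move of only one of the two players.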

  36. Interaction Turn-based games Special cases: • player-1 state • player-2 state

  37. Partial-observation General case: 2-sided partial observation. Two partitions of the state space, one for each player. Observations: partitions induced by coloring

  38. Partial-observation General case: 2-sided partial observation. Two partitions of the state space, one for each player. Observations: partitions induced by coloring. Player 2’s view. Player 1’s view

  39. Partial-observation Observations: partitions induced by coloring. Special case: 1-sided partial observation, where one of the two players has perfect observation
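The phrase "partitions induced by coloring" can be made concrete with a small sketch: each player has a coloring of the states, and two states fall in the same observation exactly when they receive the same color. The state set, the two colorings, and the function name partition_from_coloring are assumptions for illustration.

```python
from collections import defaultdict


def partition_from_coloring(states, color):
    """Group states by color; each group is one observation."""
    classes = defaultdict(set)
    for s in states:
        classes[color(s)].add(s)
    return {frozenset(c) for c in classes.values()}


states = [1, 2, 3, 4]

# Hypothetical colorings: Player 1 sees only the parity of the state,
# Player 2 sees the state itself (perfect observation).
player1_view = partition_from_coloring(states, lambda s: s % 2)
player2_view = partition_from_coloring(states, lambda s: s)

print(sorted(sorted(block) for block in player1_view))  # [[1, 3], [2, 4]]
print(sorted(sorted(block) for block in player2_view))  # [[1], [2], [3], [4]]
```

With these colorings the game is 1-sided: Player 2's partition consists of singletons, so only Player 1 has partial observation.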

  40. Strategies & objective A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions.

  41. Strategies & objective A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions. Strategies are randomized and history-dependent.

  42. Strategies & objective A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions. Reachability objective: reach a target set of states. Winning probability of a strategy: the probability that the target is reached.
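The symbols on these slides did not survive in the transcript. One common way to write the definitions is sketched below; the notation (O_i for Player i's observations, A_i for the actions, T for the target set, D(A_i) for distributions over A_i) is an assumption and may differ from the authors' own.

```latex
% Assumed notation, not recovered from the slides.
% Strategy of Player i: observation histories to distributions over actions.
\sigma_i : \mathcal{O}_i^{+} \to \mathcal{D}(A_i)

% Reachability objective for a target set T \subseteq S:
\mathsf{Reach}(T) = \{\, s_0 s_1 s_2 \dots \in S^{\omega} \mid \exists k \ge 0 : s_k \in T \,\}

% Winning probability from s_0 once both strategies are fixed:
\Pr\nolimits_{s_0}^{\sigma_1,\sigma_2}\bigl(\mathsf{Reach}(T)\bigr)
```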

  43. Qualitative analysis The following problem is undecidable, already for probabilistic automata [Paz71]: decide whether there exists a strategy for player 1 that is winning with probability at least ½. [Paz71] Paz. Introduction to Probabilistic Automata. Academic Press 1971.

  44. Qualitative analysis The following problem is undecidable, already for probabilistic automata [Paz71]: decide whether there exists a strategy for player 1 that is winning with probability at least ½. Qualitative analysis: • Almost-sure: … winning with probability 1 • Positive: … winning with probability > 0
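The three decision problems on this slide can be written side by side. The sketch assumes a reachability objective with target set T, an initial state s_0, and strategies σ1, σ2 for the two players; this notation is not taken from the slides.

```latex
% Assumed notation: \Pr^{\sigma_1,\sigma_2}_{s_0} is the probability measure
% on plays from s_0 once both strategies are fixed.

% Quantitative threshold problem (undecidable):
\exists \sigma_1 \;\forall \sigma_2 :\;
  \Pr\nolimits^{\sigma_1,\sigma_2}_{s_0}\bigl(\mathsf{Reach}(T)\bigr) \ge \tfrac{1}{2}

% Almost-sure winning:
\exists \sigma_1 \;\forall \sigma_2 :\;
  \Pr\nolimits^{\sigma_1,\sigma_2}_{s_0}\bigl(\mathsf{Reach}(T)\bigr) = 1

% Positive winning:
\exists \sigma_1 \;\forall \sigma_2 :\;
  \Pr\nolimits^{\sigma_1,\sigma_2}_{s_0}\bigl(\mathsf{Reach}(T)\bigr) > 0
```

For a probabilistic automaton there is no second player, so the quantifier over σ2 disappears.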

  45. Applications in verification tank control • Control with inaccurate digital sensors • multi-process control with private variables • multi-agent protocols • planning with uncertainty/unknown void main () { int got_lock = 0; do { 1: if (*) { 2: lock (); 3: got_lock++; } 4: if (got_lock != 0) { 5: unlock (); } 6: got_lock--; } while (*); } void lock () { assert(L == 0); L = 1; } void unlock () { assert(L == 1); L = 0; }

  46. Example 1 Player 1 partial, player 2 perfect

  47. Example 1 Player 1 partial, player 2 perfect. No pure strategy of Player 1 is winning with probability 1 (example from [CDHR06]).

  48. Example 1 Player 1 partial, player 2 perfect. No pure strategy of Player 1 is winning with probability 1 (example from [CDHR06]).

  49. Example 1 Player 1 partial, player 2 perfect Player 1 wins with probability 1, and needs randomization Belief-based-only randomized strategies are sufficient
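A belief is usually taken to be the set of states Player 1 considers possible given the observations so far, and a belief-based strategy chooses its (randomized) move from the current belief alone. The sketch below shows the standard subset-style belief update under that reading; the four-state game and the names update_belief, SUCCESSORS and OBS_OF are hypothetical and are not the example drawn on the slide.

```python
def update_belief(belief, action, observation, successors, obs_of):
    """New belief: all successors of the old belief under `action`
    that are consistent with the observation just received."""
    return frozenset(
        t
        for s in belief
        for t in successors.get((s, action), ())
        if obs_of[t] == observation
    )


# Hypothetical game: possible successors per (state, action),
# and the observation attached to each state.
SUCCESSORS = {
    (1, "a"): {2, 3},
    (2, "a"): {4},
    (3, "a"): {3},
    (4, "a"): {4},
}
OBS_OF = {1: "o1", 2: "o2", 3: "o2", 4: "o3"}

b0 = frozenset({1})
b1 = update_belief(b0, "a", "o2", SUCCESSORS, OBS_OF)
b2 = update_belief(b1, "a", "o2", SUCCESSORS, OBS_OF)

print(sorted(b1))  # [2, 3]
print(sorted(b2))  # [3]
```

Since there are finitely many beliefs, a belief-based strategy needs only finite memory, which is what makes the sufficiency claim on this slide useful algorithmically.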

  50. Example 2 Player 1 partial, player 2 perfect
