
Games where you can play optimally without any memory


Presentation Transcript


  1. Games where you can play optimally without any memory. Authors: Hugo Gimbert and Wieslaw Zielonka. Presented by Moria Abadi

  2. Arena and Play. [Figure: an arena with Max and Min states and an example play; color(play) = blue blue yellow …]

  3. Payoff Mapping of a Player. The player wins payoff u(x) in the play x; u(x) ≤ u(y) means that y is at least as good for the player as x.

  4. Example 1 – Parity Game. Max wins 1 if the highest color visited infinitely often is odd; otherwise his payoff is 0.
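
A play arising from memoryless strategies on a finite arena is ultimately periodic, so it can be written as a finite prefix followed by a cycle repeated forever. Below is a minimal sketch of the parity payoff on such plays, assuming a (prefix, cycle) list representation; the representation and the function name are illustrative, not from the slides.

```python
def parity_payoff(prefix, cycle):
    """Parity payoff of the play prefix · cycle^ω over integer colors.

    The colors visited infinitely often are exactly the colors of the cycle,
    so Max wins 1 iff the highest color occurring in the cycle is odd.
    """
    assert cycle, "the cycle must be non-empty"
    return 1 if max(cycle) % 2 == 1 else 0

print(parity_payoff([2], [1, 3]))  # 1: the highest color seen infinitely often is 3 (odd)
print(parity_payoff([], [1, 2]))   # 0: the highest color seen infinitely often is 2 (even)
```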

  5. Example 2 – Sup Game. Max wins the highest value seen during the play.
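
The sup payoff is just as easy to evaluate on ultimately periodic plays, since every value of the play appears either in the prefix or in the cycle. A tiny sketch under the same assumed (prefix, cycle) representation:

```python
def sup_payoff(prefix, cycle):
    """Sup payoff of the play prefix · cycle^ω: the highest value seen during the play."""
    assert cycle, "the cycle must be non-empty"
    return max(prefix + cycle)

print(sup_payoff([0, 5], [1, 2]))  # 5: the highest value already appears in the prefix
```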

  6. Example 4 – Mean Payoff Game. Max wins the limiting average of the colors, (c1 + … + cn)/n as n → ∞; this limit does not always exist.

  7. Example 4 – Mean Payoff Game. [Figure: an example play/arena with colors 1 1 1 1 2 0 1 0 1 0 0 0 0 1 …]
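
To see why the plain limit of averages can fail to exist, consider a play made of ever-longer blocks of 1s and 0s: the running averages keep oscillating. The sketch below is illustrative only (the block-doubling play and all function names are assumptions, not taken from the slides); it also shows that on an ultimately periodic play the limit does exist and equals the average over one cycle.

```python
from itertools import islice

def blocks_play():
    """The play 1 00 1111 00000000 …: blocks of 1s and 0s of doubling length."""
    length, bit = 1, 1
    while True:
        for _ in range(length):
            yield bit
        length, bit = length * 2, 1 - bit

def running_average(play, n):
    """Average of the first n colors of the play."""
    return sum(islice(play, n)) / n

# The averages approach 2/3 after each block of 1s and 1/3 after each block of 0s,
# so they oscillate forever and have no limit (lim sup = 2/3, lim inf = 1/3).
for n in (1, 3, 7, 15, 31, 63, 127):
    print(n, round(running_average(blocks_play(), n), 3))

def mean_payoff(prefix, cycle):
    """On prefix · cycle^ω the prefix is negligible in the long run, so the
    limit of the averages exists and equals the average over one cycle."""
    assert cycle, "the cycle must be non-empty"
    return sum(cycle) / len(cycle)

print(mean_payoff([5, 5], [0, 1]))  # 0.5
```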

  8. Preference Relation of a Player. ⊑ is a complete preorder relation on C^ω. x ⊑ y means that y is at least as good for the player as x. x ⊏ y denotes x ⊑ y but not y ⊑ x. A payoff mapping u induces ⊑: x ⊑ y iff u(x) ≤ u(y).
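
A payoff-induced preference relation is straightforward to write down. A small sketch with illustrative names, reusing the parity payoff of slide 4 (redefined here so the snippet is self-contained):

```python
def induced_preorder(u):
    """Preference relation induced by a payoff mapping u:
    x ⊑ y iff u(x) <= u(y), and then x ⊏ y iff u(x) < u(y)."""
    def below(x, y):           # x ⊑ y: y is at least as good as x
        return u(x) <= u(y)
    def strictly_below(x, y):  # x ⊏ y: x ⊑ y but not y ⊑ x
        return u(x) < u(y)
    return below, strictly_below

# Plays as (prefix, cycle) pairs, parity payoff as on slide 4.
parity = lambda play: 1 if max(play[1]) % 2 == 1 else 0
below, strictly_below = induced_preorder(parity)
print(strictly_below(([], [0, 2]), ([], [1])))  # True: payoff 0 is strictly worse than payoff 1
```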

  9. Antagonistic Games
  • x ⊑⁻¹ y iff y ⊑ x
  • ⊑ is the preference relation of Max
  • ⊑⁻¹ is the preference relation of Min

  10. Games, Strategies
  • A game is a pair (G, ⊑)
  • G is a finite arena G = (SMax, SMin, E)
  • ⊑ is a preference relation for player Max
  • σ is a strategy for Max, τ is a strategy for Min
  • pG(t, σ, τ) is the play in G with source t consistent with both σ and τ.

  11. pG(t,#,#) is a play # and #are optimal if: For Max and Min it is not worth to exchange his strategy unilaterally Optimal Strategies Intuition

  12. Optimal Strategies – Definition. A game (G, ⊑) is given. σ# and τ# are optimal if, for all states s and all strategies σ and τ:
  color(pG(s, σ, τ#)) ⊑ color(pG(s, σ#, τ#)) ⊑ color(pG(s, σ#, τ))
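
To make the definition concrete, here is a small brute-force sketch: it enumerates the memoryless strategies of both players on a toy arena and checks the two inequalities of the definition for a mean payoff objective. The `ARENA` encoding, the choice of payoff and all names are illustrative assumptions, not the paper's notation; the deviations σ and τ range over memoryless strategies only, which is a simplification of the definition (for mean payoff it is enough, since a best response to a memoryless strategy can itself be taken memoryless).

```python
from itertools import product

# A toy arena: each state has an owner ("Max" or "Min") and outgoing edges (color, target).
ARENA = {
    "s": ("Max", [(0, "a"), (1, "b")]),
    "a": ("Min", [(2, "s"), (0, "a")]),
    "b": ("Min", [(1, "b"), (3, "s")]),
}

def play(source, sigma, tau):
    """Colors of the unique play from `source` consistent with the memoryless
    strategies sigma (Max) and tau (Min), returned as a (prefix, cycle) pair."""
    choice = {**sigma, **tau}            # state -> index of the chosen edge
    seen, colors, state = {}, [], source
    while state not in seen:
        seen[state] = len(colors)
        color, nxt = ARENA[state][1][choice[state]]
        colors.append(color)
        state = nxt
    i = seen[state]                      # the cycle starts at the first repeated state
    return colors[:i], colors[i:]

def mean_payoff(prefix, cycle):
    """Long-run average of the colors: only the cycle matters."""
    return sum(cycle) / len(cycle)

def strategies(owner):
    """All memoryless strategies of `owner`: one fixed outgoing edge per owned state."""
    states = [s for s, (o, _) in ARENA.items() if o == owner]
    for picks in product(*(range(len(ARENA[s][1])) for s in states)):
        yield dict(zip(states, picks))

def is_optimal(sigma_opt, tau_opt):
    """The two inequalities of slide 12, checked against memoryless deviations."""
    for s in ARENA:
        value = mean_payoff(*play(s, sigma_opt, tau_opt))
        if any(mean_payoff(*play(s, sig, tau_opt)) > value for sig in strategies("Max")):
            return False                 # Max could gain by deviating
        if any(mean_payoff(*play(s, sigma_opt, tau)) < value for tau in strategies("Min")):
            return False                 # Min could gain by deviating
    return True

pairs = [(sg, tu) for sg in strategies("Max") for tu in strategies("Min") if is_optimal(sg, tu)]
# Prints ({'s': 1}, {'a': 1, 'b': 0}): Max moves to b, Min stays in the self-loops.
print(pairs[0] if pairs else "no optimal memoryless pair found")
```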

  13. The Main Question. Under which conditions do Max and Min have optimal memoryless strategies for all games? Some conditions on ⊑ will be defined: Max and Min have optimal memoryless strategies iff ⊑ satisfies these conditions. Examples: parity games, mean payoff games, …

  14. [L]. Rec(C) – all languages of finite words over C recognizable by finite automata. For L ∈ Rec(C), Pref(L) – all prefixes of the words in L. [L] = { x ∈ C^ω | every finite prefix of x is in Pref(L) }

  15. [L] Example

  16. Lemma 3. [L ∪ M] = [L] ∪ [M]. Proof idea: if some prefix of x is not in Pref(M), then every prefix of x is in Pref(L); if some prefix of x is not in Pref(L), then every prefix of x is in Pref(M).

  17. Co-accessible Automaton
  • From any state there is a (possibly empty) path to a final state.
  [Figure: an example automaton over C = {0, 1} with initial state i.]
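
Co-accessibility is just backward reachability from the final states. A minimal sketch, assuming an automaton given as a set of states, a list of (source, color, target) transitions and a set of final states; this encoding and the function name are assumptions for illustration:

```python
def is_co_accessible(states, transitions, finals):
    """True iff from every state some final state is reachable.

    `transitions` is a list of (source, color, target) triples; the colors are
    irrelevant for co-accessibility, only the underlying graph matters.
    """
    backward = {}
    for src, _color, dst in transitions:
        backward.setdefault(dst, []).append(src)
    # Backward BFS from the final states over reversed edges.
    reach_final, frontier = set(finals), list(finals)
    while frontier:
        q = frontier.pop()
        for p in backward.get(q, []):
            if p not in reach_final:
                reach_final.add(p)
                frontier.append(p)
    return reach_final == set(states)

# Every state can reach the final state "f": co-accessible.
print(is_co_accessible({"i", "f"}, [("i", 0, "f"), ("f", 1, "i")], {"f"}))  # True
# A sink "d" with no path to "f" breaks co-accessibility.
print(is_co_accessible({"i", "f", "d"},
                       [("i", 0, "f"), ("f", 1, "i"), ("i", 1, "d"), ("d", 0, "d")],
                       {"f"}))  # False
```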

  18. Lemma 4. Let A = (Q, i, F, Δ) be a co-accessible finite automaton recognizing a language L. Then [L] = { color(p) | p is an infinite path in A with source(p) = i }.
  One inclusion: for an infinite path p = e0e1e2… from i, for every n there is a path from target(en) to a final state, so color(e0…en) ∈ Pref(L); hence color(p) ∈ [L].

  19. Lemma 4 (continued). The other inclusion: let x = c0c1c2… ∈ [L]. For every n there is a path from i matching c0…cn, and since A is finite, by König's lemma there is an infinite path p from i with color(p) = x.

  20. Extension of ⊑ and ⊏ to sets. For X, Y ⊆ C^ω:
  X ⊑ Y iff ∀x ∈ X ∃y ∈ Y, x ⊑ y
  X ⊏ Y iff ∀y ∈ Y ∃x ∈ X, x ⊏ y

  21. Monotony
  • ⊑ is monotone if for all M, N ∈ Rec(C) and all x ∈ C*: [xM] ⊑ [xN] ⟹ ∀y ∈ C*, [yM] ⊑ [yN]
  Intuitively: at each moment during the play, the optimal choice between two possible futures does not depend on the preceding finite play.
  [Figure: prefixes x and y leading to the futures M and N.]

  22. Example of a non-monotone ⊑ (C = ℝ)
  [Figure: finite plays x and y and futures v and w with real-valued colors.]
  u(xv) = 2/5, u(xw) = 1, u(yv) = 6/5, u(yw) = 1, so u(xv) < u(xw) while u(yw) < u(yv): which future is better depends on the preceding finite play, so ⊑ is not monotone.

  23. Selectivity
  • ⊑ is selective if for all x ∈ C* and all M, N, K ∈ Rec(C): [x(M ∪ N)*K] ⊑ [xM*] ∪ [xN*] ∪ [xK]
  Intuitively: the player cannot improve his payoff by switching between different behaviors.
  [Figure: behaviors M, N, K available after the finite play x.]

  24. Example of a non-selective ⊑ (C = {0, 1})
  u(x) = 1 if the colors 0 and 1 both occur infinitely often in x, 0 otherwise.
  M = {1^k | 0 ≤ k}, N = {0^k | 0 ≤ k}
  (01)^ω ∈ [(M ∪ N)*], [M*] = {1^ω}, [N*] = {0^ω}
  u((01)^ω) > u(1^ω) and u((01)^ω) > u(0^ω): switching between the behaviors M and N improves the payoff, so ⊑ is not selective.
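
For ultimately periodic plays this payoff is easy to evaluate, because a color occurs infinitely often exactly when it occurs in the cycle. A small check of the inequalities above, under the same assumed (prefix, cycle) representation used earlier:

```python
def both_colors_payoff(prefix, cycle):
    """Payoff of slide 24 on the play prefix · cycle^ω over C = {0, 1}:
    1 if both colors occur infinitely often (i.e. both occur in the cycle), else 0."""
    assert cycle, "the cycle must be non-empty"
    return 1 if 0 in cycle and 1 in cycle else 0

u_alt   = both_colors_payoff([], [0, 1])  # (01)^ω, an element of [(M ∪ N)*]
u_ones  = both_colors_payoff([], [1])     # 1^ω, the only element of [M*]
u_zeros = both_colors_payoff([], [0])     # 0^ω, the only element of [N*]
print(u_alt, u_ones, u_zeros)             # 1 0 0
assert u_alt > u_ones and u_alt > u_zeros # switching between M and N pays off
```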

  25. The Main Theorem. Given a preference relation ⊑, both players have optimal memoryless strategies for all games (G, ⊑) over finite arenas G if and only if the relations ⊑ and ⊑⁻¹ are monotone and selective.

  26. Proof of the Necessary Condition. Given a preference relation ⊑, if both players have optimal memoryless strategies for all games (G, ⊑) over finite arenas G, then the relations ⊑ and ⊑⁻¹ are monotone and selective.

  27. Simplification 1
  [Figure: the symmetric games (A, ⊑, σ#) for Max over states SA and (B, ⊑⁻¹, τ#) for Min over states SB.]
  By this Max/Min symmetry it is enough to prove the claim only for ⊑.

  28. Simplification 2
  • It turns out that already for one-player games, if Max has an optimal strategy then ⊑ has to be monotone and selective.
  [Figure: two-player arenas vs. one-player arenas.]

  29. Lemma 5. Suppose that player Max has optimal memoryless strategies for all games (G, ⊑) over finite one-player arenas G = (SMax, Ø, E). Then ⊑ is monotone and selective.

  30. Proof of Monotony. Let x, y ∈ C* and M, N ∈ Rec(C) with [xM] ⊑ [xN]. We shall prove [yM] ⊑ [yN].
  • Ax and Ay are deterministic co-accessible automata recognizing {x} and {y}
  • AM and AN are co-accessible automata recognizing M and N
  • W.l.o.g. AM and AN have no transition with the initial state as a target

  31. Proof of Monotony (continued). x, y ∈ C*, M, N ∈ Rec(C), [xM] ⊑ [xN] ⟹ [yM] ⊑ [yN].
  If [M] = Ø – trivial.
  If [M] ≠ Ø and [N] ≠ Ø, then by Lemma 4 there is an infinite path from the initial state of AM and of AN.

  32. The Automaton A
  [Figure: a one-player arena built from Ax and Ay which, at a state t, continues into either AM or AN.]
  It recognizes x(M ∪ N); all plays from i are in [x(M ∪ N)] = [xM] ∪ [xN].

  33. Proof of Monotony (conclusion). x, y ∈ C*, M, N ∈ Rec(C), [xM] ⊑ [xN] ⟹ [yM] ⊑ [yN].
  [Figure: the arenas built from Ax and Ay sharing the state t and the automata AM, AN.]
  p – the play consistent with the optimal memoryless strategy σ# starting from the copy of x; q – the play consistent with σ# starting from the copy of y.
  Then color(q) ∈ [yN] and [yM] ⊑ color(q), hence [yM] ⊑ [yN].

  34. Proof of the Sufficient Condition. Given monotone and selective preference relations ⊑ and ⊑⁻¹, both players have optimal memoryless strategies for all games (G, ⊑) over finite arenas G.

  35. Arena Number
  • G = (S, E)
  • nG = |E| − |S|
  • Each state has at least one outgoing transition ⟹ nG ≥ 0
  • The proof is by induction on nG

  36. Induction Basis: for an arena G with nG = 0, every state has exactly one outgoing transition, so the strategies are unique (and memoryless).
  Induction Hypothesis: let G be an arena and let ⊑ be monotone and selective; suppose Max and Min have optimal memoryless strategies in all games (H, ⊑) over arenas H such that nH < nG. Claim: then Max has an optimal memoryless strategy in (G, ⊑).

  37. # • We need to find # such that (#,#) optimal • We will find #m which requires memory such that (#, #m) optimal • Permuting Max and Min we will find (#m, #) optimal • (#, #m) and (#m, #) are optimal  (#,#) optimal

  38. Induction Step
  [Figure: the arena G split at a state t into subarenas G0 and G1.]
  (σ#i, τ#i) – optimal memoryless strategies in Gi, given by the induction hypothesis.

  39. Induction Step (continued)
  Ki – the colors of finite plays in Gi from t consistent with τ#i.
  Ki ∈ Rec(C) and ⊑ is monotone ⟹ ∀x ∈ C* [xK0] ⊑ [xK1] or ∀x ∈ C* [xK1] ⊑ [xK0].
  W.l.o.g. ∀x ∈ C* [xK1] ⊑ [xK0]. So let σ# = σ#0.

  40. # G G0 t G1 #0(target(p)) if last transition from t was to G0 #1(target(p)) if last transition from t was to G1

  41. color(pG(s,,#))color(pG(s,#,#))color(pG(s,#,))color(pG(s,,#))color(pG(s,#,#))color(pG(s,#,)) G G0 t G1

  42. color(pG(s,#,#))color(pG(s,#,)) G G0 t G1 All plays are in G0

  43. color(pG(s,,#))color(pG(s,#,#)) G G0 t G1 pG(s,,#) traverse the state t All plays are in G0

  44. color(pG(s,,#))color(pG(s,#,#)) G G0 Mi colors of finite plays from in Gi from t to t consistent with #i t G1 x - color of the shortest path to t consistent with # color(pG(s,,#)) [x(M0M1)*(K0K1)]  [x(M0)*] [x(M1)*][x(K0K1)] (Mi*)Ki color(pG(s, ,#))  [x(K0K1)] = [xK0][xK1]  [xK0]

  45. color(pG(s,,#))color(pG(s,#,#)) G G0 t G1 color(pG(s, ,#))  [xK0]  color(pG0(s,#0,#0)) = color(pG(s,#,#))

  46. A Very Important Corollary. Suppose that ⊑ is such that, for each finite arena G = (SMax, SMin, E) controlled by one player (SMax = Ø or SMin = Ø), this player has an optimal memoryless strategy in (G, ⊑). Then for all finite two-player arenas G both players have optimal memoryless strategies in the games (G, ⊑).

  47. Mean Payoff Game
  [Figure: an example mean payoff game.]
