Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs

Rosemary Emery-Montemerlo, joint work with Geoff Gordon, Jeff Schneider and Sebastian Thrun. July 21, 2004, AAMAS 2004.

Presentation Transcript


  1. Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs
     Rosemary Emery-Montemerlo, joint work with Geoff Gordon, Jeff Schneider and Sebastian Thrun
     July 21, 2004, AAMAS 2004

  2. Robot Teams

  3. Robot Teams
     • With limited communication, existing paradigms for decentralized robot control are not sufficient
     • Game theoretic methods are necessary for multi-robot coordination under these conditions

  4. Decentralized Decision Making

  5. Decentralized Decision Making

  6. Decentralized Decision Making

  7. Decentralized Decision Making

  8. Decentralized Decision Making

  9. Decentralized Decision Making

  10. Decentralized Decision Making

  11. Decentralized Decision Making

  12. Decentralized Decision Making

  13. Decentralized Decision Making
     • A robot cannot choose actions based only on joint observations consistent with its own sensor readings
     • It must consider all joint observations that are consistent with its possible sensor readings

  14. Relationship Between Decision Theoretic Models
     [Figure: diagram relating MDP, POMDP, and a third model marked "?"; labels: State Space, Belief Space, Distribution over Belief Space]

  15. Models of Multi-Agent Systems
     • Partially observable stochastic games
        • Generalization of stochastic games to partially observable worlds
     • Related models
        • DEC-POMDP [Bernstein et al., 2000]
        • MTDP [Pynadath and Tambe, 2002]
        • I-POMDP [Gmytrasiewicz and Doshi, 2004]
        • POIPSG [Peshkin et al., 2000]

  16. Partially Observable Stochastic Games
     • $\text{POSG} = \{I, S, A, Z, T, R, O\}$
     • $I$ is the set of agents, $I = \{1, \dots, n\}$
     • $S$ is the set of states
     • $A$ is the set of actions, $A = A_1 \times \dots \times A_n$
     • $Z$ is the set of observations, $Z = Z_1 \times \dots \times Z_n$
     • $T$ is the transition function, $T: S \times A \to S$
     • $R$ is the reward function, $R: S \times A \to \mathbb{R}$
     • $O$ are the observation emission probabilities, $O: S \times Z \times A \to [0,1]$
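The tuple on this slide maps naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `POSG`, its field names, and the toy instance are all hypothetical, and the transition function is assumed to return a distribution over next states (the slide only gives its signature).

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple

State = int
JointAction = Tuple[str, ...]
JointObs = Tuple[str, ...]


@dataclass
class POSG:
    """Hypothetical container for the POSG tuple {I, S, A, Z, T, R, O}."""
    agents: List[int]                  # I
    states: List[State]                # S
    actions: List[List[str]]           # actions[i] is A_i
    observations: List[List[str]]      # observations[i] is Z_i
    # Assumption: T returns a distribution over next states.
    transition: Callable[[State, JointAction], Dict[State, float]]
    reward: Callable[[State, JointAction], float]            # common payoff
    observe: Callable[[State, JointObs, JointAction], float]  # O(s, z, a)

    def joint_actions(self) -> List[JointAction]:
        # A = A_1 x ... x A_n
        return list(product(*self.actions))

    def joint_observations(self) -> List[JointObs]:
        # Z = Z_1 x ... x Z_n
        return list(product(*self.observations))


# Toy two-agent instance, purely illustrative.
toy = POSG(
    agents=[0, 1],
    states=[0, 1],
    actions=[["left", "right"], ["left", "right"]],
    observations=[["nothing", "intruder"], ["nothing", "intruder"]],
    transition=lambda s, a: {s: 1.0},
    reward=lambda s, a: 1.0 if a[0] == a[1] else 0.0,
    observe=lambda s, z, a: 0.25,
)
```

Because the joint action and observation sets are Cartesian products, they grow exponentially with the number of agents, which is part of why the full game is expensive to solve.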

  17. Solving POSGs
     • POSGs are computationally infeasible to solve

  18. Solving POSGs
     • We can approximate a POSG as a series of smaller Bayesian games
     [Figure: the full POSG alongside the one-step lookahead game at time t (Bayesian game)]

  19. Bayesian Games
     • Private information relevant to game
        • Uncertainty in utility
     • Type
        • Encapsulates private information
        • We limit ourselves to games with a finite number of types
     • In robot example
        • Type 1: Robot doesn't see anything
        • Type 2: Robot sees intruder at location x

  20. Bayesian Games
     • $\text{BG} = \{I, \Theta, A, p(\Theta), u\}$
     • $\Theta$ is the joint type space, $\Theta = \Theta_1 \times \dots \times \Theta_n$
     • $\theta$ is a specific joint type, $\theta = \{\theta_1, \dots, \theta_n\}$
     • $p(\Theta)$ is the common prior on the distribution over $\Theta$
     • $u$ is the utility function, $u = \{u_1, \dots, u_n\}$
        • $u_i(a_i, a_{-i}, (\theta_i, \theta_{-i}))$
     • $\sigma_i$ is a strategy for player $i$
        • Defines what player $i$ does for each of its possible types
        • Actions are individual actions, not joint actions

  21. Bayesian-Nash Equilibrium
     • Set of best response strategies
     • Each agent tries to maximize its expected utility conditioned on its probability distribution over the other agents' types $p(\Theta)$
     • Each agent has a policy $\sigma_i$ that, given $\sigma_{-i}$, maximizes $u_i(\sigma_i, \sigma_{-i}, \theta_{-i})$
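The slides do not say how an equilibrium is computed. For a two-player game in which both players share one utility function, one common approach is alternating maximization: hold one player's strategy fixed, replace the other's with a best response for each of its types, and repeat until neither changes. The sketch below assumes that technique; the function names, the toy types and actions, and the utility function are hypothetical. The result is a pure-strategy Bayes-Nash equilibrium, but it is only a local optimum of the shared expected utility.

```python
def expected_utility(sigma, prior, u):
    """Expected common payoff of a joint strategy under the prior over joint types."""
    return sum(p * u(sigma[0][t1], sigma[1][t2], t1, t2)
               for (t1, t2), p in prior.items())


def best_response(player, sigma, types, actions, prior, u):
    """Best action per own type, holding the other player's strategy fixed."""
    other = 1 - player

    def value(own_type, action):
        total = 0.0
        for joint_type, p in prior.items():
            if joint_type[player] != own_type:
                continue
            joint_action = [None, None]
            joint_action[player] = action
            joint_action[other] = sigma[other][joint_type[other]]
            total += p * u(joint_action[0], joint_action[1], *joint_type)
        return total

    response = {}
    for t in types[player]:
        best = sigma[player][t]
        for a in actions[player]:
            # Switch only on strict improvement so the loop terminates.
            if value(t, a) > value(t, best) + 1e-12:
                best = a
        response[t] = best
    return response


def alternating_maximization(types, actions, prior, u, max_iters=100):
    # Start every type on the first listed action (an arbitrary choice).
    sigma = [{t: actions[i][0] for t in types[i]} for i in range(2)]
    for _ in range(max_iters):
        changed = False
        for i in (0, 1):
            updated = best_response(i, sigma, types, actions, prior, u)
            if updated != sigma[i]:
                sigma[i] = updated
                changed = True
        if not changed:
            break
    return sigma


# Toy game loosely modeled on the robot example: two types, two actions each.
types = [["nothing", "intruder"], ["nothing", "intruder"]]
actions = [["search", "chase"], ["search", "chase"]]
prior = {(t1, t2): 0.25 for t1 in types[0] for t2 in types[1]}


def u(a1, a2, t1, t2):
    # Hypothetical common payoff: one point per robot acting appropriately.
    right = {"nothing": "search", "intruder": "chase"}
    return float(a1 == right[t1]) + float(a2 == right[t2])


sigma = alternating_maximization(types, actions, prior, u)
```

Note that each strategy maps a player's own type to an individual action, matching the point above that strategies are defined over individual rather than joint actions.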

  22. POSG to Bayesian Game Approximation
     • $\{I, S, A, Z, T, R, O\}$ to $\{I, \Theta, A, p(\Theta), u\}^t$
     • $I = I$
     • $A = A$
     • Type space $\Theta_i^t$ = all possible histories of agent $i$'s actions and observations up to time $t$
     • $p(\Theta)^t$ calculated from $S_0$, $A$, $T$, $Z$, $O$, $\Theta^{t-1}$
        • Prune low probability types
     • Each joint type $\theta$ maps to a joint belief
     • $u$ given by heuristic and $u_i = u_j$
        • QMDP
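The QMDP heuristic named on this slide scores a belief by solving the underlying fully observable MDP and then averaging its Q-values under the belief. The sketch below shows that standard computation in plain Python; the array layout (`T[a][s][s2]`, `R[s][a]`), the discount factor, and the two-state toy problem are assumptions for illustration, and an action index here would stand for a joint action.

```python
def qmdp_q_values(T, R, gamma=0.9, iters=500):
    """Value iteration on the fully observable MDP.

    T[a][s][s2] is the probability of moving from s to s2 under action a.
    R[s][a] is the immediate reward. Returns Q[s][a].
    """
    n_states = len(R)
    n_actions = len(R[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iters):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
    return Q


def qmdp_utility(belief, Q):
    """QMDP value of each action at a belief: sum over s of b(s) * Q(s, a)."""
    n_actions = len(Q[0])
    return [sum(b * Q[s][a] for s, b in enumerate(belief))
            for a in range(n_actions)]


# Toy problem: action 0 stays put, action 1 moves to the rewarding state 1.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: move to (or remain in) state 1
]
R = [[0.0, 0.0], [1.0, 1.0]]

Q = qmdp_q_values(T, R)
values = qmdp_utility([1.0, 0.0], Q)  # belief concentrated on state 0
```

QMDP assumes all uncertainty disappears after one step, so it undervalues information-gathering actions; that is the usual caveat when it is used as a utility estimate.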

  23. Algorithm
     Agents i and j each run the same loop in parallel (shown for agent i; agent j substitutes j for i):
     • Initialize: $t = 0$, $h_i = \{\}$, $p(\Theta^0)$; $\sigma^0 = \text{solveGame}(\Theta^0, p(\Theta^0))$
     • Make observation: $h_i = obs_i^t \cup a_i^{t-1} \cup h_i$
     • Determine type: $\theta_i^t = \text{bestMatch}(h_i, \Theta_i^t)$
     • Execute action: $a_i^t = \sigma_i^t(\theta_i^t)$
     • Propagate forward: $\Theta^{t+1}$, $p(\Theta^{t+1})$
     • Find policy for $t+1$: $\sigma^{t+1} = \text{solveGame}(\Theta^{t+1}, p(\Theta^{t+1}))$; $t = t + 1$
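The propagate-forward step grows the set of histories at every time step, which is why low-probability types are pruned. The sketch below illustrates only that bookkeeping under strong simplifying assumptions: histories are tuples, a hypothetical `successors` function supplies the conditional probability of each one-step extension, and pruning uses a fixed threshold followed by renormalization. The slides do not specify these details.

```python
def propagate(type_probs, successors):
    """Extend every history by one step and weight it by its conditional probability."""
    extended = {}
    for history, p in type_probs.items():
        for step, q in successors(history):
            extended[history + (step,)] = p * q
    return extended


def prune_types(type_probs, threshold=0.01):
    """Drop histories below the threshold, then renormalize the survivors."""
    kept = {h: p for h, p in type_probs.items() if p >= threshold}
    total = sum(kept.values())
    if total == 0.0:
        raise ValueError("threshold pruned every type")
    return {h: p / total for h, p in kept.items()}


def successors(history):
    # Hypothetical observation model: the intruder is rarely seen.
    return [("nothing", 0.95), ("intruder", 0.05)]


types_t = {(): 1.0}              # empty history at t = 0
for _ in range(2):               # two propagate steps, without pruning in between
    types_t = propagate(types_t, successors)
pruned = prune_types(types_t)    # the 0.05 * 0.05 history falls below 0.01
```

In this toy run four histories exist after two steps and one is discarded, so the next Bayesian game is solved over three types instead of four.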

  24. Robotic Team Tag
     • Version of Team Tag
     • Environment is portion of Gates Hall
     • Full teammate observability
     • Opponent can be captured by a single robot in any state
     • QMDP used as heuristic
     • Two Pioneer-class robots

  25. Robot Policies

  26. Lady And The Tiger [Nair et al. 2003]

  27. Contributions
     • Algorithm for finding approximate solutions to POSGs with common payoffs
     • Tractability achieved by modeling POSG as a sequence of Bayesian games
     • Performs comparably to the full POSG for a small finite-horizon problem
     • Improved performance over 'blind' application of utility heuristic in more complex problems
     • Successful real-time game-theoretic controller for indoor robots

  28. Questions?
     • remery@cs.cmu.edu
     • www.cs.cmu.edu/~remery

  29. Back-Up Slides

  30. Lady And The Tiger [Nair et al. 2003]

  31. Robotic Team Tag
     • $I = \{1, 2\}$
     • $S = S_1 \times S_2 \times S_{\text{opponent}}$
     • $S_i = \{s_0, \dots, s_{28}\}$, $S_{\text{opponent}} = \{s_0, \dots, s_{28}, s_{\text{tagged}}\}$
     • $|S| = 25230$
     • $A_i = \{N, S, E, W, \text{Tag}\}$
     • $Z_i = [\{s_i, -1\}, s_{-i}, a_{-i}]$
     • $T$: adjacent cells
     • $O$: see opponent if on same cell
     • $R$: minimize capture time
     • Modified from [Pineau et al. 2003]
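The state count on this slide follows directly from the product structure of the state space: 29 cells for each robot and 29 cells plus a tagged state for the opponent. A short check, with variable names chosen here for illustration:

```python
from itertools import product

robot_cells = list(range(29))                      # s0 .. s28
opponent_states = list(range(29)) + ["tagged"]     # s0 .. s28 plus the tagged state

# S = S_1 x S_2 x S_opponent
joint_states = list(product(robot_cells, robot_cells, opponent_states))
n_states = len(joint_states)
print(n_states)  # 29 * 29 * 30 = 25230
```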

  32. Environment

  33. Robotic Team Tag Results

  34. Robotic Team Tag Results
