Twenty-Second Conference on Artificial Intelligence, AAAI 2007. Improved State Estimation in Multiagent Settings with Continuous or Large Discrete State Spaces. Prashant Doshi, Dept. of Computer Science, University of Georgia. Speaker: Yifeng Zeng, Aalborg University, Denmark.
State Estimation Single agent setting Physical State (Loc, Orient,...)
State Estimation Multiagent setting Interactive state Physical State (Loc, Orient,...) (See AAMAS 05)
State Estimation in Multiagent Settings • Estimate the interactive state • Ascribe intentional models (POMDPs) to other agents • Update the other agents' beliefs (see JAIR'05)
Previous Approach • Interactive particle filter (I-PF; see AAMAS'05, AAAI'05) • Generalizes PF to multiagent settings • Approximate simulation of the state estimation • Limitations of the I-PF • Large no. of particles needed even for small state spaces • Distributes particles over the physical state and model spaces • Poor performance when the physical state space is large or continuous
Factoring the State Estimation Update the physical state space Update other agent's model
Factoring the State Estimation • Sample particles from just the physical state space • Substitute in state estimation • Implement using PF • Perform as exactly as possible • Rao-Blackwellisation of the I-PF
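The factoring above can be illustrated with a generic Rao-Blackwellised particle filter step. This is a minimal sketch on a toy scalar model, not the RB-IPF from the talk: the physical state `s` is sampled, while a second variable `m` (standing in for the other agent's model) is kept as a Gaussian per particle and updated analytically with a Kalman step. The random-walk dynamics, the observation model `y = s + m + noise`, and all names and noise parameters (`rbpf_step`, `s_noise`, `q`, `r`) are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def rbpf_step(particles, y, rng, s_noise=0.5, q=0.1, r=0.2):
    """One Rao-Blackwellised update.

    particles: list of (s, m_mean, m_var); s is sampled, m is tracked exactly.
    Toy model (assumed): s' = s + N(0, s_noise^2), m' = m + N(0, q),
    y = s' + m' + N(0, r).
    Returns (resampled particles, normalized weights before resampling).
    """
    proposed, weights = [], []
    for s, m_mean, m_var in particles:
        # Sample only the physical state.
        s_new = s + rng.gauss(0.0, s_noise)
        # Kalman predict for m, conditional on the sampled s_new.
        m_pred_var = m_var + q
        innov_var = m_pred_var + r
        # Particle weight: predictive likelihood of y with m integrated out.
        weights.append(normal_pdf(y, s_new + m_mean, innov_var))
        # Kalman correct for m.
        gain = m_pred_var / innov_var
        m_post_mean = m_mean + gain * (y - s_new - m_mean)
        m_post_var = (1.0 - gain) * m_pred_var
        proposed.append((s_new, m_post_mean, m_post_var))
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(weights)] * len(weights)
    else:
        weights = [w / total for w in weights]
    # Multinomial resampling proportional to the weights.
    resampled = rng.choices(proposed, weights=weights, k=len(proposed))
    return resampled, weights
```

Because `m` is integrated out analytically, the particles only need to cover the physical state space, which is the motivation given on the slide.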
Assumptions on Distributions • Prior beliefs • Singly nested and conditional linear Gaussian (CLG) • Transition functions • Deterministic or CLG • Observation functions • Softmax or CLG • Why these distributions? • Good statistical properties • Well-known methods for learning these distributions from data • Applications in target tracking, fault diagnosis
Belief Update over Models Step 1: Update other agent's level 0 beliefs • Product of a Gaussian and Softmax • Use variational approximation of softmax (see Jordan '99) • Softmax is lower-bounded tightly by a Gaussian-form function • Update is then analogous to the Kalman filter
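A small sketch of why the variational bound makes the update Kalman-like, restricted to the scalar two-class case where the softmax reduces to the logistic sigmoid. It uses the standard quadratic lower bound sigma(x) >= sigma(xi) * exp((x - xi)/2 - lambda(xi) * (x^2 - xi^2)) with lambda(xi) = tanh(xi/2) / (4 xi). The function names and the scalar setting are assumptions for illustration; the talk's multi-class, multi-dimensional update is not reproduced here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lam(xi):
    """lambda(xi) = tanh(xi / 2) / (4 xi), with its limit 1/8 as xi -> 0."""
    if abs(xi) < 1e-8:
        return 0.125
    return math.tanh(xi / 2.0) / (4.0 * xi)


def sigmoid_lower_bound(x, xi):
    """Exponentiated-quadratic lower bound on sigmoid(x), tight at x = +/- xi."""
    return sigmoid(xi) * math.exp((x - xi) / 2.0 - lam(xi) * (x * x - xi * xi))


def gaussian_times_bound(mu, var, xi):
    """Mean and variance of the Gaussian proportional to N(x; mu, var) * bound(x, xi).

    Collecting terms in x: precision = 1/var + 2*lambda(xi),
    precision * mean = mu/var + 1/2.
    """
    precision = 1.0 / var + 2.0 * lam(xi)
    new_var = 1.0 / precision
    new_mu = new_var * (mu / var + 0.5)
    return new_mu, new_var
```

Since the bound is an exponentiated quadratic in `x`, its product with a Gaussian prior is again an unnormalized Gaussian, so the posterior parameters follow from adding precisions, as in a Kalman correction step.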
Belief Update over Models Step 2: Update belief over other's beliefs • Solve other's models – compute other's policy • Large variance – Listen (L) • Obtain piecewise distributions: updated belief over other's belief = updated Gaussian if the prior belief supports the action, 0 otherwise • Approximate piecewise with Gaussian using ML
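The last bullet of Step 2 can be illustrated in one dimension. The maximum-likelihood Gaussian fit to a target density matches that density's mean and variance, so for a Gaussian that is zeroed outside an interval the fit is given by the truncated-normal moments. This is a sketch under the assumption that the region where the prior belief supports the action is a single interval `[a, b]`; the function name and that interval form are hypothetical, not taken from the talk.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _z_pdf(z):
    """z * pdf(z), defined as 0 at infinite z to avoid inf * 0."""
    return 0.0 if math.isinf(z) else z * _pdf(z)


def fit_gaussian_to_truncated(mu, sigma, a, b):
    """Mean and variance of N(mu, sigma^2) restricted to [a, b] and renormalized.

    These are the parameters of the ML (moment-matched) Gaussian approximation
    of the piecewise density.
    """
    alpha = (a - mu) / sigma
    beta = (b - mu) / sigma
    mass = _cdf(beta) - _cdf(alpha)
    if mass <= 0.0:
        raise ValueError("interval carries no probability mass")
    shift = (_pdf(alpha) - _pdf(beta)) / mass
    mean = mu + sigma * shift
    var = sigma * sigma * (1.0 + (_z_pdf(alpha) - _z_pdf(beta)) / mass - shift * shift)
    return mean, var
```

The fitted Gaussian places some mass outside `[a, b]`, which is the approximation error whose effect the later slide on sensitivity refers to.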
Belief Update over Models Step 3: Form a mixture of Gaussians • Each Gaussian is for the optimal action and possible observation of the other agent • Weight each Gaussian with the likelihood of receiving the observation • Number of mixture components grows without bound • components after one step • components after t steps
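The weighting in Step 3 can be sketched as follows. Each component is a one-dimensional Gaussian `(mean, var)` and is paired with the likelihood of the observation it corresponds to; the likelihoods are normalized to give mixture weights. The function names and the example numbers are hypothetical and only illustrate the bookkeeping.

```python
def form_mixture(components, obs_likelihoods):
    """Pair each (mean, var) component with a normalized observation likelihood.

    Returns a list of (weight, mean, var) whose weights sum to 1.
    """
    if len(components) != len(obs_likelihoods):
        raise ValueError("one likelihood is needed per component")
    total = sum(obs_likelihoods)
    if total <= 0.0:
        raise ValueError("likelihoods must have positive total mass")
    return [
        (lik / total, mean, var)
        for (mean, var), lik in zip(components, obs_likelihoods)
    ]


def mixture_mean(mixture):
    """Mean of the mixture: weighted average of the component means."""
    return sum(w * mean for w, mean, _ in mixture)


# Illustrative values only: two components, one per possible observation.
example = form_mixture([(0.0, 1.0), (2.0, 0.5)], [0.85, 0.15])
```

Every update step applies this to each existing component, which is why the component count keeps growing unless components are merged or pruned.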
Comparative Performance • Compare accuracy of state estimation with I-PF (L1 metric) • Continuous multi-agent tiger problem • Public good problem with punishment • RB-IPF focuses particles on the large physical state space • Updates beliefs over other's models more accurately (supporting plots in paper)
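For reference, an L1 error between an estimated and a reference belief can be computed on a shared grid as below. This is a sketch of the generic metric only; how the paper discretizes the beliefs and which reference it compares against are not stated on the slide, so the grid spacing `dx` and the function name are assumptions.

```python
def l1_distance(p, q, dx=1.0):
    """L1 distance between two densities sampled on the same grid.

    p, q: sequences of density (or probability) values at the same grid points.
    dx: grid spacing; leave at 1.0 for discrete probability vectors.
    """
    if len(p) != len(q):
        raise ValueError("both beliefs must be evaluated on the same grid")
    return dx * sum(abs(a - b) for a, b in zip(p, q))
```

For discrete probability vectors the value lies between 0 (identical beliefs) and 2 (disjoint support).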
Comparative Performance • Compare run times with I-PF (Linux, Xeon 3.4GHz, 4GB RAM) • Sensitivity to Gaussian approximation of piecewise distribution
Discussion • How restrictive are the assumptions on the distributions? • Can we generalize RB-IPF, like I-PF? • Will RB-IPF scale to a large number of update steps? • Closed-form mixtures are needed • Is RB-IPF applicable to multiply-nested beliefs? • Recursive application may not improve performance over I-PF