Approximate Solutions of Interactive Dynamic Influence Diagrams Using Model Clustering

Twenty-Second Conference on Artificial Intelligence (AAAI’07) • Yifeng Zeng, Aalborg University, Denmark • Prashant Doshi, Univ. of Georgia, USA • Qiongyu Chen, National University of Singapore

Outline • Interactive Dynamic Influence Diagrams (I-DIDs) • Curses of History and Dimensionality • Model Clustering • Computational Savings and Error Bound • Experimental Results

Interactive Dynamic Influence Diagrams (I-DIDs) (Doshi et al. AAMAS’07) • Graphical models for decision-making in multiagent settings • Sequential decision-making over multiple time steps in multiagent settings • Generalize dynamic IDs to multiagent domains • Differ from MAIDs (Koller&Milch01) and NIDs (Gal&Pfeffer04) • Online solutions to I-POMDPs (Gmytrasiewicz&Doshi, JAIR’05) • Allow nested modeling of agents

Overview of I-ID • A generic level l Interactive ID (I-ID) for agent i situated with one other agent j • Model node: Mj,l-1 • Models of agent j at level l-1 • Policy link: dashed line • Distribution over the other agent’s actions given its models • Beliefs on Mj,l-1 • P(Mj,l-1|s) • Update? [Figure: level l I-ID with nodes S, Oi, Ai, Ri, Aj, and Mj,l-1]

Details of the Model Node • Members of the model node • Different chance nodes are solutions of models mj,l-1 • Mod[Mj] represents the different models of agent j • CPT of the chance node Aj is a multiplexer • Aj assumes the distribution of one of the action nodes (Aj1, Aj2), depending on the value of Mod[Mj] • The models mj,l-1^1 and mj,l-1^2 could be I-IDs or IDs [Figure: model node Mj,l-1 expanded into S, Mod[Mj], the models mj,l-1^1 and mj,l-1^2 with their action nodes Aj1 and Aj2, and the chance node Aj]
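The multiplexer CPT can be illustrated with a minimal sketch. The model names, the probabilities, and the helper functions below are hypothetical; the action labels L, OL, and OR are borrowed from the tiger example that appears later in the slides. Each row of the CPT is simply the action distribution obtained by solving one model, and weighting the rows by the belief over models gives the distribution over Aj.

```python
# Hypothetical solutions: each model of j maps to a distribution over j's actions.
solutions = {
    "m1": {"L": 1.0, "OL": 0.0, "OR": 0.0},
    "m2": {"L": 0.0, "OL": 0.0, "OR": 1.0},
}

def multiplexer_cpt(solutions):
    """CPT of Aj: the row for Mod[Mj] = m is the action distribution of model m."""
    return {m: dict(dist) for m, dist in solutions.items()}

def marginal_action_dist(cpt, belief_over_models):
    """P(Aj) = sum over m of P(Aj | Mod[Mj] = m) * P(m)."""
    actions = list(next(iter(cpt.values())).keys())
    return {
        a: sum(cpt[m][a] * p for m, p in belief_over_models.items())
        for a in actions
    }

cpt = multiplexer_cpt(solutions)
# Assumed belief of agent i over j's two models.
p_aj = marginal_action_dist(cpt, {"m1": 0.25, "m2": 0.75})
```

With the assumed belief of 0.25 on m1 and 0.75 on m2, Aj puts probability 0.25 on L and 0.75 on OR.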

Interactive Dynamic Influence Diagrams (I-DIDs) [Figure: two time slices of a level l I-DID with nodes S, Oi, Ai, Ri, Aj, and Mj,l-1 at times t and t+1, with the model update link labeled]

Semantics of Model Update Link • These models differ in their initial beliefs, each of which is the result of j updating its beliefs due to its actions and possible observations [Figure: Mod[Mj^t] holds two models, mj,l-1^t,1 and mj,l-1^t,2, with action nodes Aj1, Aj2 and observations Oj1, Oj2; Mod[Mj^t+1] holds four updated models, mj,l-1^t+1,1 to mj,l-1^t+1,4, with action nodes Aj1 to Aj4]

Curses of History and Dimensionality • Curse of history of agent j • Primary complexity of solving I-DIDs is due to the large number of models that must be solved over time • Curse of dimensionality • At time step t: • Nested property of modeling • More agents • N+1 agent setting: (NM)^l models (M bounds the number of models at each level)

Model Clustering • Idea: Prune the model space to K representative models from M candidate models, K << M, at each time step • Approach • Cluster models • k-means clustering method (MacQueen67) • Note: k is not equal to K • Clusters contain models that are likely behaviorally equivalent • Select K representative models from the clusters

Selection of Initial Means • Facilitate clustering of behaviorally equivalent models • Behaviorally equivalent regions • Prescribe the same optimal behavior for j • [0,0.1], [0.1,0.9], [0.9,1] • Select region boundary points as initial means • 0, 0.1, 0.9, 1 [Figure: value versus P(TR) for actions L, OL, and OR, with the sensitivity points marked]

Selection of Initial Means • Sensitivity points • Models that induce policies that are different from those by surrounding models • Vertices of the belief simplex • One dimension: 0, 1 • Two dimensions: [0,0], [0,1],[1,0], and [1,1]

LP for Computing Sensitivity Points • SPs are non-dominated points on intersections between value functions [Figure: value functions with a non-dominated intersection marked as an SP]
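The idea of a non-dominated intersection can be sketched in one dimension. This is not the LP from the slide: it is a simpler pairwise-intersection check that works only when the belief is a single number P(TR). The three lines are an assumed single-agent tiger value function in which listening costs 1, opening the correct door pays 10, and opening the wrong door costs 100; only the 10 and -1 appear on the slides, and the -100 is an assumption.

```python
from itertools import combinations

# value(p) = intercept + slope * p, where p = P(TR). Numbers are assumptions.
lines = {
    "L": (-1.0, 0.0),       # listen
    "OL": (-100.0, 110.0),  # open left: -100 if tiger left, +10 if tiger right
    "OR": (10.0, -110.0),   # open right: +10 if tiger left, -100 if tiger right
}

def value(line, p):
    intercept, slope = line
    return intercept + slope * p

def sensitivity_points(lines, tol=1e-9):
    """Intersections of value lines that lie on the upper envelope."""
    points = []
    for (_, f), (_, g) in combinations(lines.items(), 2):
        if abs(f[1] - g[1]) < tol:  # parallel lines never intersect
            continue
        p = (g[0] - f[0]) / (f[1] - g[1])
        if not 0.0 <= p <= 1.0:
            continue
        best = max(value(h, p) for h in lines.values())
        if value(f, p) >= best - tol:  # keep only non-dominated intersections
            points.append(round(p, 9))
    return sorted(set(points))

sps = sensitivity_points(lines)
```

Under these assumed rewards the OL and OR lines also cross at 0.5, but that point lies below the L line and is discarded, which leaves the boundaries 0.1 and 0.9 listed on the earlier slide.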

Example of Iterative Clustering • Initial means • Iteration 1 ... Iteration n • Select K=10 [Figure: models along the P(TR) axis, with ticks at 0, 0.1, 0.9, and 1, shown at each stage]

K Model Selection Algorithm • Select initial means • Compute SPs • Clustering • Cluster models • Re-compute means • Selection • Select K nearest models
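The clustering loop can be sketched as plain k-means over one-dimensional beliefs. This is an illustration under assumptions rather than the authors' implementation: models are represented only by a belief P(TR), distance is the absolute difference, and the function name and sample beliefs are hypothetical. The initial means are the region boundaries from the earlier slide.

```python
def cluster_models(beliefs, initial_means, max_iters=100):
    """k-means on 1-D beliefs: assign to nearest mean, re-compute means, repeat."""
    means = list(initial_means)
    clusters = []
    for _ in range(max_iters):
        clusters = [[] for _ in means]
        for b in beliefs:
            nearest = min(range(len(means)), key=lambda c: abs(b - means[c]))
            clusters[nearest].append(b)
        # An empty cluster keeps its previous mean.
        new_means = [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:  # converged: assignments no longer move the means
            break
        means = new_means
    return means, clusters

# Hypothetical candidate models, identified by their belief P(TR).
beliefs = [0.02, 0.04, 0.3, 0.45, 0.7, 0.96, 0.98]
means, clusters = cluster_models(beliefs, [0.0, 0.1, 0.9, 1.0])
```

Seeding with the sensitivity points and simplex vertices is what steers each cluster toward one behaviorally equivalent region; the loop itself is ordinary k-means.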

Approximate Solution of I-DID • Exact algorithm • Expansion phase • Expand all M models over time • Look-ahead phase • Approximation – Modify exact algorithm • Prune model space using KModelSelection • Maintain only K models over time • Look-ahead phase

Computational Savings and Error Bound • (NM)^l vs. (NK)^l • M grows exponentially over time • Retain K models (M_K) and discard M-K models (M_/K) • Error bounded by finding the model among the K retained models that is the closest to the discarded one (PBVI; Pineau et al. 03)
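A small numeric sketch may help make both points concrete. The model counts follow the (NM)^l and (NK)^l expressions on the slide; the nearest-model lookup uses the absolute difference between one-dimensional beliefs, which is an assumption, as are the function names and all the numbers.

```python
def model_count(other_agents, models_per_level, level):
    """(N * M)^l: models to solve in an N+1 agent setting nested to level l."""
    return (other_agents * models_per_level) ** level

def nearest_retained(discarded, retained):
    """Retained model (belief) closest to a discarded one."""
    return min(retained, key=lambda b: abs(b - discarded))

def worst_case_gap(discarded_models, retained):
    """Largest distance from any discarded model to its nearest retained model."""
    return max(abs(d - nearest_retained(d, retained)) for d in discarded_models)

# Assumed setting: one other agent, level 2, M = 100 candidates, K = 10 retained.
exact_models = model_count(1, 100, 2)
approx_models = model_count(1, 10, 2)

# Hypothetical retained and discarded beliefs P(TR).
retained = [0.1, 0.5, 0.9]
discarded = [0.2, 0.55, 0.8]
gap = worst_case_gap(discarded, retained)
```

In this assumed setting the count drops from 10000 to 100 models, and the worst-case gap of about 0.1 is the kind of distance that a PBVI-style bound scales with.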

Error Bound Let Error bound for agent j Expected error bound for agent i

Empirical Results • Two Problem Domains • Multiagent tiger • Multiagent machine maintenance • Comparison with • Exact solution of I-DID for different M • Interactive particle filtering on I-DID • Measure • Average rewards solving the level 1 I-DIDs • Variance over 50 runs • Run time

Run Time Comparison • Slower than the I-PF • Reason: convergence step • Solve I-DIDs up to 8 horizons

Future Work • Variants of model clustering • Application domains • Compose our package for I-DIDs

Thank You!

Notes • Updated set of models at time step (t+1) will have at most |Mj^t| |Aj| |Ωj| models • |Mj^t|: number of models at time step t • |Aj|: largest space of actions • |Ωj|: largest space of observations • New distribution over the updated models uses • original distribution over the models • probability of the other agent performing the action, and • receiving the observation that led to the updated model
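The model update described in these notes can be sketched as a triple loop over models, actions, and observations. Everything concrete here is an assumption made for illustration: models are reduced to a belief P(TR), j always listens, j's observation depends only on its own action, and the growl observations GL and GR are 85% accurate as in the usual single-agent tiger problem.

```python
def update_models(models, policy, obs_prob, belief_update):
    """models: {belief: prob}. Returns {updated belief: prob} at the next step."""
    updated = {}
    for b, p_m in models.items():
        for a, p_a in policy(b).items():
            if p_a == 0.0:
                continue
            for o, p_o in obs_prob(b, a).items():
                if p_o == 0.0:
                    continue
                b_next = round(belief_update(b, a, o), 6)
                # Model prior * P(action | model) * P(observation | model, action).
                updated[b_next] = updated.get(b_next, 0.0) + p_m * p_a * p_o
    return updated

ACCURACY = 0.85  # assumed growl accuracy

def policy(b):
    return {"L": 1.0}  # assumption: j listens at every belief

def obs_prob(b, a):
    p_gr = ACCURACY * b + (1 - ACCURACY) * (1 - b)
    return {"GR": p_gr, "GL": 1 - p_gr}

def belief_update(b, a, o):
    like_tr = ACCURACY if o == "GR" else 1 - ACCURACY
    like_tl = 1 - like_tr
    return like_tr * b / (like_tr * b + like_tl * (1 - b))

updated = update_models({0.5: 1.0}, policy, obs_prob, belief_update)
```

Starting from a single model with belief 0.5, the update yields two models with beliefs 0.15 and 0.85, each with probability 0.5. With one model, three actions, and two observations the stated bound allows up to six.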

One Example

K Model Selection • Initial Means • Sensitivity points + Vertices of the belief simplex • Iteration • Re-compute the cluster mean • Assign new models to clusters • Selection • Select K models • Kn: In proportion to the size of cluster n
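The selection step can be sketched as follows. The slide only says that Kn is proportional to the size of cluster n; the largest-remainder rounding used here to make the quotas sum to exactly K, the choice of the models nearest each cluster mean, and the sample clusters are all assumptions.

```python
def quotas(cluster_sizes, K):
    """Split K across clusters in proportion to size (largest-remainder rounding)."""
    total = sum(cluster_sizes)
    raw = [K * s / total for s in cluster_sizes]
    q = [int(r) for r in raw]
    # Hand the remaining slots to the clusters with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda n: raw[n] - q[n], reverse=True)
    for n in order[: K - sum(q)]:
        q[n] += 1
    return q

def select_models(clusters, K):
    """From each cluster keep the Kn models closest to the cluster mean."""
    sizes = [len(c) for c in clusters]
    selected = []
    for c, k_n in zip(clusters, quotas(sizes, K)):
        if not c:
            continue
        mean = sum(c) / len(c)
        selected.extend(sorted(c, key=lambda b: abs(b - mean))[:k_n])
    return selected

# Hypothetical clusters of beliefs P(TR), assuming K <= total number of models.
clusters = [[0.02, 0.04, 0.05], [0.3, 0.45], [0.7], [0.9, 0.96, 0.98, 0.99]]
kept = select_models(clusters, 5)
```

Proportional quotas keep more representatives from densely populated regions of the belief space, at the cost of sometimes dropping a small cluster entirely, as happens to the single-model cluster here.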
