History-Dependent Graphical Multiagent Models
Quang Duong, Michael P. Wellman, Satinder Singh (Computer Science and Engineering, University of Michigan, USA)
Yevgeniy Vorobeychik (Computer and Information Sciences, University of Pennsylvania, USA)
Modeling Dynamic Multiagent Behavior
• Design a representation that:
  • expresses a joint probability distribution over agent actions over time
  • supports inference (e.g., prediction)
  • exploits locality of interaction
• Our solution:
  • history-dependent graphical multiagent models (hGMMs)
Example: Consensus Voting [Kearns et al. ’09], shown from agent 1’s perspective
[Figure: six-agent voting network (agents 1 to 6) at t = 10 s, with a time axis]
Graphical Representations
• Exploit locality in agent interactions
• MAIDs [Koller & Milch ’01], NIDs [Gal & Pfeffer ’08], action-graph games [Jiang et al. ’08]
• Graphical games [Kearns et al. ’01] and Markov random fields for graphical games [Daskalakis & Papadimitriou ’06]
Graphical Multiagent Models (GMMs) [Duong, Wellman and Singh UAI-08]
• Nodes: agents
• Edges: dependencies between agents
• Neighborhood Ni includes i and its neighbors
• Accommodates multiple sources of belief about agent behavior for static (one-shot) scenarios
[Figure: example graph with agents 1 to 6]
[Equation labels: joint probability distribution of the system’s actions; potential of each neighborhood’s joint actions; normalization]
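The equation on this slide did not survive extraction, but its three labels (joint distribution, neighborhood potential, normalization) are consistent with a product-of-potentials factorization. A reconstruction under that assumption, where the symbols π_i (potential of neighborhood N_i) and Z (normalization constant) are assumed names rather than taken from the transcript:

```latex
\Pr(a) \;=\; \frac{1}{Z} \prod_{i} \pi_i\!\left(a_{N_i}\right),
\qquad
Z \;=\; \sum_{a'} \prod_{i} \pi_i\!\left(a'_{N_i}\right)
```

Here a is the joint action of all agents and a_{N_i} is its restriction to neighborhood N_i.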
Contribution
Extend the static GMM to model dynamic joint behaviors by conditioning on local history.
History-dependent GMM (hGMM)
• Extends the static GMM: conditions joint agent behavior on an abstracted history of actions
• Directly captures joint behavior using a limited action history
[Equation labels: joint probability distribution of the system’s actions at time t; abstracted history; potential of each neighborhood’s joint actions at t; neighborhood-relevant abstracted history; normalization]
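A minimal sketch of what the labels on this slide describe, assuming the form Pr(a^t | H^t) ∝ Π_i π_i(a^t_Ni | H^t_Ni). The function names, the `sticky` potential, and the horizon-based abstraction are illustrative assumptions, not the authors' implementation; the brute-force enumeration is only practical for tiny examples.

```python
from itertools import product


def hgmm_step(neighborhoods, potential, history, horizon, actions=(0, 1)):
    """Return {joint_action: Pr(a^t | abstracted history)} by brute force."""
    # Abstraction (assumed): remember only the last `horizon` joint actions.
    recent = history[-horizon:] if horizon > 0 else []
    weights = {}
    for a in product(actions, repeat=len(neighborhoods)):
        w = 1.0
        for nbhd in neighborhoods:
            local_action = tuple(a[j] for j in nbhd)
            # Neighborhood-relevant history: past actions of N_i's members only.
            local_history = [tuple(past[j] for j in nbhd) for past in recent]
            w *= potential(local_action, local_history)
        weights[a] = w
    z = sum(weights.values())  # normalization over all joint actions
    return {a: w / z for a, w in weights.items()}


def sticky(local_action, local_history):
    """Hypothetical potential: favors local actions the neighborhood played recently."""
    return 1.0 + sum(past == local_action for past in local_history)


# Hypothetical 3-agent chain 0 - 1 - 2; each neighborhood includes the agent itself.
neighborhoods = [(0, 1), (0, 1, 2), (1, 2)]
history = [(0, 0, 0), (0, 0, 1), (0, 0, 0)]
dist = hgmm_step(neighborhoods, sticky, history, horizon=2)
```

With this toy potential, joint actions that repeat recent neighborhood behavior receive more probability mass than ones that do not.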
Joint vs. Individual Behavior Models
Autonomous agents’ behaviors are independent given the complete history. Agent i’s actions depend on past observations, as specified by its strategy function σi(Ht).
• Individual behavior models (IBMM): assume conditional independence of agent behavior given the complete history: Pr(at | Ht) = Πi σi(Ht)
History is often abstracted or summarized (limited horizon h, frequency function f, etc.), which induces correlations in observed behavior.
• Joint behavior models (hGMM): make no independence assumption
[Figure: agents 1, 2, 3 with individual strategies σ1(Ht1), σ2(Ht2), σ3(Ht3)]
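The claim that abstraction induces correlation can be illustrated with a toy calculation. In this sketch, two hypothetical complete histories collapse to the same abstracted history; agents act independently given either complete history, yet the resulting mixture is not a product of its marginals. All numbers are invented for illustration.

```python
from itertools import product


def product_joint(marginals):
    """Joint over binary actions for independent agents: Pr(a) = prod_i sigma_i(a_i)."""
    joint = {}
    for a in product((0, 1), repeat=len(marginals)):
        p = 1.0
        for sigma_i, a_i in zip(marginals, a):
            p *= sigma_i[a_i]
        joint[a] = p
    return joint


# Given complete history A, both agents lean toward 0; given B, toward 1.
joint_given_a = product_joint([{0: 0.9, 1: 0.1}, {0: 0.9, 1: 0.1}])
joint_given_b = product_joint([{0: 0.1, 1: 0.9}, {0: 0.1, 1: 0.9}])

# Assumption: A and B are equally likely given the shared abstracted history.
mixture = {a: 0.5 * joint_given_a[a] + 0.5 * joint_given_b[a] for a in joint_given_a}

# Each agent's marginal under the mixture is uniform, so an individual
# behavior model conditioned on the abstraction predicts 0.25 per joint action.
independent_guess = product_joint([{0: 0.5, 1: 0.5}, {0: 0.5, 1: 0.5}])

print(mixture[(0, 0)], independent_guess[(0, 0)])
```

The mixture puts probability 0.41 on both agents choosing 0, while the independent model built from the same marginals gives 0.25; a joint model can represent the gap, an individual one cannot.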
Voting Consensus Simulation
• Simulation (treated as the true model): smooth fictitious play [Camerer and Ho ’99]
  • agents respond probabilistically, in proportion to expected rewards (given the reward function and beliefs about others’ behavior)
• Note:
  • This generative model is an individual behavior model
  • Given abstracted history, joint behavior models may better capture behavior even when it is generated by an individual behavior model
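A rough sketch of one agent's response rule as the slide describes it: beliefs about neighbors come from their past actions, and the agent picks each action with probability proportional to its expected reward. The belief smoothing, the "reward only if all neighbors agree" rule, and every name here are assumptions for illustration; smooth fictitious play is also often formulated with a logit response instead of a proportional one.

```python
import random


def empirical_beliefs(neighbor_history, actions=(0, 1), prior=1.0):
    """Belief that a neighbor plays each action: smoothed empirical frequency."""
    counts = {a: prior for a in actions}
    for a in neighbor_history:
        counts[a] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}


def response_distribution(reward, neighbor_histories, actions=(0, 1)):
    """Choose each action with probability proportional to its expected reward."""
    beliefs = [empirical_beliefs(h, actions) for h in neighbor_histories]
    expected = {}
    for a in actions:
        p_agree = 1.0
        for b in beliefs:
            p_agree *= b[a]  # assumed: reward only if every neighbor also picks a
        expected[a] = reward[a] * p_agree
    total = sum(expected.values())
    if total == 0:
        return {a: 1.0 / len(actions) for a in actions}
    return {a: e / total for a, e in expected.items()}


def sample_action(dist, rng):
    """Draw one action from a {action: probability} dictionary."""
    acts = list(dist)
    return rng.choices(acts, weights=[dist[a] for a in acts], k=1)[0]


reward = {0: 0.6, 1: 0.4}                     # hypothetical consensus payoffs
neighbor_histories = [[0, 0, 1], [0, 0, 0]]   # two neighbors' past votes
dist = response_distribution(reward, neighbor_histories)
action = sample_action(dist, random.Random(0))
```

In this example the agent votes 0 with probability 0.9, because action 0 both pays more and is what its neighbors have mostly played.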
Voting Consensus Models
[Equations not shown; labels:]
• Individual Behavior Multiagent Model (IBMM):
  • reward for action ai, regardless of neighbors’ actions
  • frequency with which action ai was previously chosen by each of i’s neighbors
• Joint Behavior Multiagent Model (hGMM):
  • frequency with which aNi was previously chosen by neighborhood Ni
  • expected reward for aNi, discounted by the number of dissenting neighbors
• Normalization
Model Learning and Evaluation
• Given a sequence of joint actions over m time periods, X = {a0, …, am}, model M induces the log likelihood LM(X; θ)
  • θ: the model’s parameters
• Potential function learning:
  • assumes a known graphical structure
  • employs gradient descent
• Evaluation:
  • computes LM(X; θ) to evaluate M
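The evaluation step can be sketched as follows, assuming the model is exposed as a hypothetical `predict(history)` callable that returns a dictionary from joint actions to probabilities, and that the first joint action a0 is treated as given. Neither detail is stated on the slide.

```python
import math


def log_likelihood(sequence, predict):
    """Sum over t >= 1 of log Pr(a^t | a^0, ..., a^(t-1)) under `predict`.

    Raises ValueError if the model assigns zero probability to an observed action.
    """
    total = 0.0
    for t in range(1, len(sequence)):
        dist = predict(sequence[:t])
        total += math.log(dist[sequence[t]])
    return total


def uniform_model(history):
    """Hypothetical baseline: uniform over the joint actions of two binary agents."""
    return {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}


X = [(0, 1), (0, 0), (0, 0), (0, 0)]
ll = log_likelihood(X, uniform_model)
```

Any model that maps a history to a distribution over joint actions can be scored the same way, which is what makes the log likelihood usable for comparing hGMMs against IBMMs on held-out runs.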
Experiments
• 10 agents
• Payoffs for the red and blue consensus outcomes are drawn i.i.d. between 0 and 1; the payoff is 0 otherwise
• Maximum node degree d
• Each run ends at T = 100 or when the vote converges
• 20 smooth fictitious play game runs generated for each game configuration (10 for training, 10 for testing)
Results
Evaluation: log likelihood for hGMM / log likelihood for IBMM
• Green: hGMM > IBMM
• Yellow: hGMM < IBMM
[Figure: hGMMs outperform IBMMs in predicting outcomes for shorter history lengths.]
A shorter history horizon means more abstraction of history, hence more induced behavior correlation, hence hGMM > IBMM.
[Figure: hGMMs outperform IBMMs in predicting outcomes across different values of d.]
Asynchronous Belief Updates
• The advantage of hGMMs over IBMMs grows with longer summarization intervals v (which induce more behavior correlations)
Direct Sampling
• Compute the joint distribution of actions as the empirical distribution of the training data
• Evaluation: log likelihood for hGMM / log likelihood for direct sampling
• Direct sampling is computationally more expensive but less powerful
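One way to read "empirical distribution of the training data" is a lookup table of next joint actions, keyed by the most recent joint actions. The sketch below makes that assumption; the conditioning on the last `horizon` steps and the function name `fit_direct` are hypothetical, since the slide does not say how history enters.

```python
from collections import Counter, defaultdict


def fit_direct(training_runs, horizon=1):
    """Empirical Pr(next joint action | last `horizon` joint actions)."""
    counts = defaultdict(Counter)
    for run in training_runs:
        for t in range(horizon, len(run)):
            key = tuple(run[t - horizon:t])  # abstracted history used as table key
            counts[key][run[t]] += 1
    table = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        table[key] = {a: c / total for a, c in ctr.items()}
    return table


# Two hypothetical training runs of a 2-agent game.
runs = [
    [(0, 1), (0, 0), (0, 0)],
    [(0, 1), (1, 1), (1, 1)],
]
table = fit_direct(runs, horizon=1)
```

The table needs an entry for every observed history key, and the number of possible joint actions grows exponentially with the number of agents, which is in line with the slide's remark about computational cost.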
Conclusions
• hGMMs support efficient and effective inference about system dynamics, using abstracted history, for scenarios exhibiting locality
• hGMMs provide better predictions of dynamic behaviors than IBMMs and direct fictitious play sampling
• Approximation does not deteriorate performance
• Future work:
  • More domain applications: authentic voting experimental results, other scenarios
  • (Fully) dynamic GMM that allows reasoning about unobserved past states