Temporal Action-Graph Games: A New Representation for Dynamic Games. Albert Xin Jiang (University of British Columbia), Kevin Leyton-Brown (University of British Columbia), Avi Pfeffer (Charles River Analytics).


Temporal Action-Graph Games: A New Representation for Dynamic Games

Albert Xin Jiang

University of British Columbia

Kevin Leyton-Brown

University of British Columbia

Avi Pfeffer

Charles River Analytics


Game Theory In One Slide

  • A game:

    • an interaction between two or more self-interested agents

    • each agent independently chooses a strategy

    • each agent derives utility from the resulting strategy profile

  • Strategies:

    • simultaneous-move games: choosing from a set of actions

    • dynamic games: choosing actions at multiple points in time; conditioned on observations

    • can randomize over actions

  • Reasoning about games:

    • often involves computing solution concepts, e.g. Nash equilibrium






Overview

  • AGGs

  • TAGGs

  • Computation

  • Experiments


AGGs

  • played on a set of action nodes

  • each agent chooses an action node from a subset of action nodes

  • for each action node, an action count is tallied

  • utility dependence expressed by the action graph

    • utility of an agent depends only on

      • action chosen by the agent

      • action counts on the neighbors of the chosen action (configuration)

  • representation is compact when action graph has small in-degrees
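The bullets above can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' implementation: the graph `NEIGHBORS`, the payoff rule in `node_utility`, and all function names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy action graph: each action node's only neighbor is itself.
NEIGHBORS = {"L1": ("L1",), "L2": ("L2",), "L3": ("L3",)}

def node_utility(action, configuration):
    # Assumed payoff for illustration: minus the total count on the neighbors.
    return -sum(configuration)

def agg_utilities(action_profile):
    """Return each agent's utility from the action counts on its neighbors."""
    counts = Counter(action_profile)  # tally an action count per node
    utilities = []
    for action in action_profile:
        # The configuration only looks at neighbors of the chosen action node.
        configuration = tuple(counts[nb] for nb in NEIGHBORS[action])
        utilities.append(node_utility(action, configuration))
    return utilities

print(agg_utilities(["L1", "L1", "L2", "L3", "L1"]))  # [-3, -3, -1, -1, -3]
```

Because each utility lookup reads only the counts on the chosen node's neighbors, the table behind `node_utility` stays small whenever in-degrees are small.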


Example: simultaneous-move tollbooth game

[Figure: lane nodes L1, L2, L3]

  • tollbooth with 3 lanes

    • 5 cars arrive

    • cars choose lanes simultaneously

    • utility depends on number of cars choosing same lane

  • context-specific independence (CSI): different independencies hold under different contexts (here, the player’s own action)

  • anonymity: utility depends on the numbers of agents taking certain actions, not their identities
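To see how much anonymity can save in this example, the following sketch enumerates every action profile of the 5-car, 3-lane game and collapses each one to its vector of lane counts. The variable names are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import comb

LANES = ("L1", "L2", "L3")
N_CARS = 5

# Every way the 5 cars can pick lanes, identities included.
profiles = list(product(LANES, repeat=N_CARS))

# The same outcomes with identities forgotten: one count per lane.
configurations = {tuple(Counter(p)[lane] for lane in LANES) for p in profiles}

print(len(profiles))        # 243 action profiles (3 ** 5)
print(len(configurations))  # 21 distinct count vectors
print(comb(N_CARS + len(LANES) - 1, len(LANES) - 1))  # 21, stars-and-bars check
```

With anonymity, utilities need to be specified per count vector rather than per profile, and the gap widens quickly as the number of cars grows.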


Example: Dynamic Tollbooth Game

  • tollbooth with 3 lanes

    • 20 cars arrive in 4 waves of 5 cars each

    • in each wave, cars choose lanes simultaneously

    • driver can observe number of cars in each lane

    • a driver’s utility depends on the number of cars that chose the same lane, either before that driver or at the same time

  • Extending AGGs to multiple time steps:

    • Action counts accumulate over time

    • Need to be able to specify agents’ observations

    • Model uncertainty using chance variables
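A minimal sketch of accumulating action counts over waves, assuming (for illustration only) that a driver's payoff is minus the cumulative count on the chosen lane once the driver's own wave has moved. The function name `play_waves` and the payoff rule are hypothetical, and chance variables are left out.

```python
def play_waves(waves, lanes=("L1", "L2", "L3")):
    """Play fixed lane choices wave by wave, accumulating action counts."""
    counts = {lane: 0 for lane in lanes}
    observations, payoffs = [], []
    for wave in waves:
        # Counts visible to drivers in this wave, before anyone in it moves.
        observations.append(dict(counts))
        for lane in wave:
            counts[lane] += 1  # choices within a wave are simultaneous
        # Assumed payoff: minus the cars in the same lane, earlier or same wave.
        payoffs.append([-counts[lane] for lane in wave])
    return observations, payoffs, counts

waves = [["L1", "L1", "L2", "L3", "L3"], ["L2", "L2", "L2", "L1", "L3"]]
observations, payoffs, final_counts = play_waves(waves)
print(observations[1])  # {'L1': 2, 'L2': 1, 'L3': 2}
print(payoffs[1])       # [-4, -4, -4, -3, -3]
```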


Defining TAGGs

  • A TAGG is a tuple consisting of:

    • N: set of players

    • T: duration

    • set of actions

    • set of chance variables

    • set of decisions

    • set of utility functions


TAGGs

  • A decision D

    • Action set: a subset of the set of actions

    • Observation set O[D]: a set of actions, decisions and chance variables

    • Set of payoff times pt(D)

  • Utility function

    • One for each action at each time step

    • Set of parents

    • Utility depends only on its parents’ instantiation at its own time step

    • Evaluated at payoff times of decisions


Strategies

  • A behavior strategy at decision D is a mapping from an instantiation of O[D] at time t(D)-1 to a distribution over its action set.

  • A behavior strategy for player i is a tuple consisting of a behavior strategy for each of her decisions

  • Behavior strategy profile: tuple of behavior strategies for all players
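As a concrete illustration, a behavior strategy at one tollbooth decision can be written as a function from the observed lane counts to a distribution over lanes. The strategy below (randomize uniformly over the least crowded lanes) is purely an assumed example, as are the names.

```python
def least_crowded_strategy(observed_counts):
    """Map an observation (lane -> count) to a distribution over lanes."""
    fewest = min(observed_counts.values())
    best = [lane for lane, c in observed_counts.items() if c == fewest]
    # Spread probability evenly over the least crowded lanes.
    return {lane: (1.0 / len(best) if lane in best else 0.0)
            for lane in observed_counts}

# A behavior strategy profile: one behavior strategy per decision
# (here, hypothetically, every car in a wave uses the same rule).
profile = [least_crowded_strategy] * 5

print(least_crowded_strategy({"L1": 2, "L2": 0, "L3": 0}))
# {'L1': 0.0, 'L2': 0.5, 'L3': 0.5}
```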


Induced Bayes Net

  • Induced BN of a TAGG given a strategy profile

    • Formally describes how the TAGG is played out

    • Decisions, chance variables and utilities correspond to random variables in the BN

    • Action counts are time-dependent: we have a separate action count variable for each action at each time step

    • A decision-payoff variable represents the utility of decision D received at one of its payoff times

    • Expected utility of a player is the sum of the expected values of her decisions’ decision-payoff variables.

    • Can similarly define induced MAID of a TAGG



TAGGs and MAIDs

  • Any TAGG is utility-equivalent to its induced MAID

    • However, the induced MAID (and BN) has large in-degree, making its representation exponentially larger

  • For the other direction, any MAID can be efficiently encoded as a TAGG with same space complexity


Overview

  • AGGs

  • TAGGs

  • Computation

  • Experiments


Computing Expected Utility

  • Computing expected utility of a player, given a behavior strategy profile

    • An essential step in many game-theoretic computations

  • Can be cast as an inference problem on the induced BN: compute the expected values of the decision-payoff variables

    • Can apply standard BN inference algorithm

    • TAGGs have additional structure that can be exploited to speed up computation
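For reference, here is a brute-force baseline for a single simultaneous wave: it sums probability times payoff over every action profile, which is exponential in the number of cars. The payoff rule (minus the number of cars in the player's lane) and all names are assumptions for illustration; this is the naive computation that structured inference aims to avoid.

```python
from itertools import product

LANES = ("L1", "L2", "L3")

def expected_utility(player, strategies):
    """Brute-force EU of `player`; strategies[i] maps lane -> probability."""
    eu = 0.0
    for profile in product(LANES, repeat=len(strategies)):
        prob = 1.0
        for strategy, lane in zip(strategies, profile):
            prob *= strategy[lane]  # players randomize independently
        # Assumed payoff: minus the number of cars in the player's own lane.
        eu += prob * -profile.count(profile[player])
    return eu

uniform = {lane: 1 / 3 for lane in LANES}
print(expected_utility(0, [uniform] * 5))  # about -2.333, i.e. -(1 + 4/3)
```

The check in the comment follows from linearity of expectation: the player's own car plus four others that each join the same lane with probability 1/3.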


Exploiting anonymity

  • Induced BN has large in-degrees for action-count variables

    • Their CPDs are counting functions with causal independence structure [Heckerman & Breese 1996]

    • Can reduce in-degree by transforming the BN

      • Create nodes representing intermediate counts
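The idea of intermediate counts can be sketched for a single action: instead of one count node with every player as a parent, a chain of partial counts is built in which each link depends only on the previous partial count and one more player. The code below is a simplified illustration under that assumption (one action, independent players), not the transformation from the talk itself.

```python
def count_distribution(probs):
    """Distribution of how many players pick an action.

    probs[i] is the probability that player i picks the action.
    """
    dist = [1.0]  # partial count over zero players is 0 with certainty
    for p in probs:
        # Next intermediate count: previous partial count plus one player.
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this player does not pick the action
            nxt[k + 1] += mass * p     # this player picks the action
        dist = nxt
    return dist

print(count_distribution([0.5, 0.5]))  # [0.25, 0.5, 0.25]
```

Each step touches a table over at most n + 1 count values, whereas a count node with n binary parents would need a conditional table with 2 ** n parent configurations.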


Exploiting Temporal Structure

  • A network satisfies the Markov property if parents of variables at time t are at t or t-1.

    • Parts of the transformed BN (action counts) satisfy MP

    • Can transform it to one satisfying MP by duplicating variables

  • Adapt the interface algorithm for Dynamic BNs

    • interface: set of variables in time t that have children in time t+1

      • d-separates “past” from “future”

    • the algorithm eliminates variables in temporal order, keeping a distribution over the interface variables
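A rough sketch of a forward pass in this spirit for the tollbooth setting, where the interface at each time step is taken to be the vector of cumulative lane counts. It is a simplified stand-in rather than the interface algorithm as presented: it assumes no chance variables, that every car in a wave uses the same strategy, and that the strategy observes only the counts from the previous step. All names are hypothetical.

```python
from collections import defaultdict

N_LANES = 3

def step(belief, strategy, n_cars):
    """Advance one time step; belief maps count vectors to probabilities."""
    new_belief = defaultdict(float)
    for counts, prob in belief.items():
        dist = strategy(counts)  # all cars observe the previous counts
        partial = {counts: prob}
        for _ in range(n_cars):  # add cars one at a time (intermediate counts)
            nxt = defaultdict(float)
            for c, p in partial.items():
                for lane in range(N_LANES):
                    if dist[lane] > 0:
                        c2 = c[:lane] + (c[lane] + 1,) + c[lane + 1:]
                        nxt[c2] += p * dist[lane]
            partial = nxt
        for c, p in partial.items():
            new_belief[c] += p  # keep only the distribution over the interface
    return dict(new_belief)

def pick_emptiest(counts):
    # Assumed strategy: randomize uniformly over the least crowded lanes.
    fewest = min(counts)
    best = [i for i, c in enumerate(counts) if c == fewest]
    return [1 / len(best) if i in best else 0.0 for i in range(N_LANES)]

belief = {(0, 0, 0): 1.0}
for t in range(4):  # 4 waves of 5 cars each
    belief = step(belief, pick_emptiest, n_cars=5)
print(len(belief), sum(belief.values()))
```

Only the distribution over count vectors is carried from one step to the next, so the work per step depends on the number of reachable count vectors rather than on the number of full action histories.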


Tollbooth example

  • interface at t: the action count variables at time t


Computation, Ctd

  • Further exploiting the structure of transformed BN within the same time step

  • We can exploit CSI for further speedup

  • Our algorithm computes EU in poly time if

    • the number of interface variables at each time step is bounded

    • inference over the chance variables can be done efficiently

  • Our methods can be applied to speed up computation of Nash equilibria


Overview

  • AGGs

  • TAGGs

  • Computation

  • Experiments


Experiments

  • Compute EU for tollbooth games (3 lanes)

  • Approach 1: standard clique tree algorithm on induced BN

  • Approach 2: same clique tree algorithm on transformed BN

  • Approach 3: our algorithm


Experiments: Tollbooth (5 cars per time step, varying T)


Experiments: Tollbooth (T=3, varying # cars per time step)


Conclusions

  • Temporal Action-Graph Games (TAGGs)

    • novel compact representation for dynamic games

    • extends AGGs to dynamic setting

    • compactly expresses a wider variety of utility structures, including CSI and anonymity

    • exploits such structure for efficient computation

      • expected utility

      • Nash equilibrium


Game Theory In One Slide

  • A game:

    • an interaction between two or more self-interested agents

    • each agent independently chooses a strategy

    • each agent derives utility from the resulting strategy profile

  • Strategies:

    • simultaneous-move games: choosing from a set of actions

    • dynamic games: choosing actions at multiple points in time; conditioned on observations

    • can randomize over actions

  • Best Response:

    • I play a strategy that maximizes my own utility, given a particular strategy profile for the other agents

  • Nash Equilibrium:

    • a strategy profile with the property that every agent’s strategy is a best response to the strategies of the others
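The best-response condition can be checked directly for pure strategies in a small two-player game. The sketch below uses an assumed 2-car, 2-lane tollbooth in which each car's payoff is minus the number of cars in its lane; the payoff table and function name are illustrative.

```python
def is_pure_nash(payoffs, profile):
    """payoffs[i][a0][a1] is player i's utility; profile = (a0, a1)."""
    a0, a1 = profile
    n0, n1 = len(payoffs[0]), len(payoffs[0][0])
    # Player 0 cannot gain by deviating while player 1's action is fixed.
    best0 = all(payoffs[0][a0][a1] >= payoffs[0][d][a1] for d in range(n0))
    # Player 1 cannot gain by deviating while player 0's action is fixed.
    best1 = all(payoffs[1][a0][a1] >= payoffs[1][a0][d] for d in range(n1))
    return best0 and best1

# Assumed payoffs: -2 if both cars share a lane, -1 if they split.
tollbooth = [
    [[-2, -1], [-1, -2]],  # car 0
    [[-2, -1], [-1, -2]],  # car 1
]

print(is_pure_nash(tollbooth, (0, 1)))  # True: the cars use different lanes
print(is_pure_nash(tollbooth, (0, 0)))  # False: either car would switch
```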


Representations of Games

  • Goal: modeling real-world systems

  • Standard Representations

    • normal form for simultaneous-move games: specify utilities for every action profile

    • extensive form (game trees) for dynamic games: specify utilities for every possible sequence of actions

    • such representations ignore structure in the games’ utilities

    • sizes grow exponentially in parameters of the games

  • Compact Representations

    • simultaneous-move: Graphical Games [Kearns, Littman & Singh 2001], Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004][Jiang & Leyton-Brown 2006]

    • dynamic: Multi-Agent Influence Diagrams (MAIDs) [Koller & Milch 2001]
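A back-of-the-envelope comparison makes the exponential growth concrete. The normal form stores one utility per player per action profile. As a point of contrast, the second function counts the numbers needed when, as assumed here for a tollbooth-like game, a lane's payoff depends only on how many cars chose that lane. Both function names are hypothetical.

```python
def normal_form_size(n_players, n_actions):
    # One utility per player for every action profile.
    return n_players * n_actions ** n_players

def count_table_size(n_players, n_actions):
    # Assumed structure: one utility per action per possible count (1..n).
    return n_actions * n_players

for n in (5, 10, 20):
    print(n, normal_form_size(n, 3), count_table_size(n, 3))
# first row: 5 1215 15
```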


Representing Dynamic Games

  • Multi-Agent Influence Diagrams (MAIDs): directed acyclic graph over

    • Chance nodes

    • Decision nodes: a player chooses an action, observing instantiation of parents

    • Utility nodes: utility for a player, function of instantiation of parents

  • MAIDs are compact when utility functions exhibit strict independencies

    • Do not capture other kinds of structure


Temporal Action-Graph Games

  • TAGGs: compact representation for dynamic games

    • Extension of AGGs to dynamic setting

    • Can efficiently encode arbitrary MAIDs

    • Can compactly represent dynamic games with CSI and/or anonymity structure

    • Efficient computation


TAGGs at a high level

  • Played over a series of time steps 1 … T

    • At each time step, an AGG is played by a set of players, on a set of action nodes

    • Action counts on the action nodes are accumulated over time

  • Model uncertainty using chance variables

  • Decisions: player at each decision may observe a set of previous decisions, chance variables and action counts

  • Utility functions: action-specific and time-specific


TAGG: tollbooth example

  • The players in N correspond to the cars

  • T=4

  • One action node for each lane

  • At each time step, we have 5 decisions

    • Action set is the entire set of actions (all lanes)

    • Payoff time same as decision time: pt(D) = {t(D)}

  • The utility function of each action (lane) A has A as its only parent

  • Size of representation
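The tollbooth TAGG described above could be written down as plain data along the following lines. This is only a sketch: the `Decision` class, its field names, and the choice to let each decision observe all lane counts are assumptions made for the example, not a format defined in the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    player: int
    time: int             # t(D)
    action_set: tuple     # subset of the action nodes
    observations: tuple   # O[D]; assumed here: counts on all lanes
    payoff_times: tuple   # pt(D)

LANES = ("L1", "L2", "L3")   # one action node per lane
T, CARS_PER_WAVE = 4, 5

decisions = [
    Decision(player=(t - 1) * CARS_PER_WAVE + i, time=t, action_set=LANES,
             observations=LANES, payoff_times=(t,))  # pt(D) = {t(D)}
    for t in range(1, T + 1) for i in range(CARS_PER_WAVE)
]

# One utility function per action per time step, with that action as sole parent.
utility_parents = {(lane, t): (lane,) for lane in LANES for t in range(1, T + 1)}

print(len(decisions), len(utility_parents))  # 20 12
```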

