Temporal Action-Graph Games: A New Representation for Dynamic Games

### Temporal Action-Graph Games: A New Representation for Dynamic Games

Albert Xin Jiang

University of British Columbia

Kevin Leyton-Brown

University of British Columbia

Avi Pfeffer

Charles River Analytics

Game Theory In One Slide

- A game:
- an interaction between two or more self-interested agents
- each agent independently chooses a strategy
- each agent derives utility from the resulting strategy profile

- Strategies:
- simultaneous-move games: choosing from a set of actions
- dynamic games: choosing actions at multiple points in time; conditioned on observations
- can randomize over actions

- Reasoning about games:
- Often involves computing solution concepts, e.g. Nash equilibrium

Overview

- AGGs
- TAGGs
- Computation
- Experiments

AGGs

- played on a set of action nodes
- each agent chooses an action node from a subset of action nodes
- for each action node, an action count is tallied
- utility dependence expressed by the action graph
- utility of an agent depends only on
- action chosen by the agent
- action counts on the neighbors of the chosen action (configuration)

- representation is compact when action graph has small in-degrees
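The AGG utility rule above can be sketched in a few lines. This is only an illustration: the action graph, the payoff function, and the names `neighbors` and `agg_utility` are assumptions made for the example and do not come from the slides.

```python
from collections import Counter

# Hypothetical action graph: neighbors[a] lists the action nodes whose
# counts the utility of choosing `a` depends on.
neighbors = {"a": ["a", "b"], "b": ["b"], "c": ["c", "a"]}


def example_payoff(config):
    # Assumed payoff for illustration: more agents on neighboring nodes is worse.
    return -sum(config)


utility_fn = {action: example_payoff for action in neighbors}


def agg_utility(chosen, action_profile):
    """Utility of an agent who chose `chosen`, given everyone's chosen nodes."""
    counts = Counter(action_profile)                      # tally action counts
    config = tuple(counts[b] for b in neighbors[chosen])  # counts on neighbors only
    return utility_fn[chosen](config)


profile = ["a", "a", "b", "c"]
print(agg_utility("a", profile))  # -3  (two agents on a, one on b)
print(agg_utility("b", profile))  # -1  (one agent on b)
```

Each utility function only needs entries for the configurations over its own neighborhood, which is why small in-degrees keep the representation small.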

Example: simultaneous-move tollbooth game

[Figure: lanes L1, L2, L3]

- tollbooth with 3 lanes
- 5 cars arrive
- cars choose lanes simultaneously
- utility depends on number of cars choosing same lane

- context-specific independence (CSI): different independencies hold under different contexts (the player’s own action)
- anonymity: utility depends on the numbers of agents taking certain actions, not their identities
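As a rough size comparison for this example, the following sketch counts utility entries in the normal form and in an AGG. It assumes that every car has the same three lanes available, that a lane's utility depends only on that lane's own count, and that all cars share the same utility for a lane. The variable names are illustrative.

```python
n_cars, n_lanes = 5, 3

# Normal form: one utility per player for every joint action profile.
normal_form_entries = n_cars * n_lanes ** n_cars

# AGG under the assumptions above: for each lane, one utility value per
# possible count on that lane (1..n_cars, since the chooser is counted).
agg_entries = n_lanes * n_cars

print(normal_form_entries, agg_entries)  # 1215 15
```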

Example: Dynamic Tollbooth Game

- tollbooth with 3 lanes
- 20 cars arrive in 4 waves of 5 cars each
- in each wave, cars choose lanes simultaneously
- each driver can observe the number of cars already in each lane
- utility depends on the number of cars choosing the same lane, either before the driver or at the same time

- Extending AGGs to multiple time steps:
- Action counts accumulate over time
- Need to be able to specify agents’ observations
- Model uncertainty using chance variables
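A small simulation can make the accumulation of action counts concrete. The sketch below is an assumption-laden illustration: the strategy `shortest_lane_strategy` is hypothetical (it is not an equilibrium claimed by the slides), and every car in a wave is assumed to observe the lane counts from before its wave.

```python
import random

LANES = ["L1", "L2", "L3"]


def shortest_lane_strategy(observed_counts, rng):
    # Hypothetical strategy: pick uniformly among the currently shortest lanes.
    shortest = min(observed_counts.values())
    return rng.choice([lane for lane in LANES if observed_counts[lane] == shortest])


def simulate(waves=4, cars_per_wave=5, seed=0):
    """Return the cumulative lane counts after each wave."""
    rng = random.Random(seed)
    counts = {lane: 0 for lane in LANES}
    history = []
    for _ in range(waves):
        observed = dict(counts)  # snapshot seen by every car in this wave
        choices = [shortest_lane_strategy(observed, rng) for _ in range(cars_per_wave)]
        for lane in choices:     # simultaneous moves: update after all have chosen
            counts[lane] += 1
        history.append(dict(counts))
    return history


for t, snapshot in enumerate(simulate(), start=1):
    print(t, snapshot)
```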

Defining TAGGs

- A TAGG is a tuple (N, T, A, C, D, U)
- N: set of players
- T: duration
- A: set of actions
- C: set of chance variables
- D: set of decisions
- U: set of utility functions

TAGGs

- A decision D
- Action set: a subset of the set of actions
- Observation set O[D]: of actions, decisions and chance vars
- Set of payoff times pt(D)

- Utility function
- One for each action at each time step
- Set of parents
- Utility depends only on its parents’ instantiation at that time step
- Evaluated at payoff times of decisions
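One way to hold these components in code is sketched below. The class and field names (`Decision`, `UtilityFunction`, `TAGG`, `validate`) are hypothetical, and the tiny instance (two time steps, two cars per step) is made up for illustration; the slides do not prescribe any data structure.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Decision:
    name: str
    player: int
    time: int                        # t(D)
    action_set: FrozenSet[str]       # subset of the TAGG's actions
    observation_set: FrozenSet[str]  # O[D]: names of observed variables
    payoff_times: FrozenSet[int]     # pt(D)


@dataclass(frozen=True)
class UtilityFunction:
    action: str
    time: int
    parents: Tuple[str, ...]
    fn: Callable[[Tuple[int, ...]], float]


@dataclass
class TAGG:
    players: List[int]
    duration: int
    actions: FrozenSet[str]
    chance_vars: FrozenSet[str]
    decisions: List[Decision]
    utilities: List[UtilityFunction]

    def validate(self) -> bool:
        steps = range(1, self.duration + 1)
        decisions_ok = all(
            d.player in self.players
            and d.time in steps
            and d.action_set <= self.actions
            and all(t in steps for t in d.payoff_times)
            for d in self.decisions
        )
        # One utility function for each action at each time step.
        keys = {(u.action, u.time) for u in self.utilities}
        utilities_ok = keys == {(a, t) for a in self.actions for t in steps}
        return decisions_ok and utilities_ok


lanes = frozenset({"L1", "L2", "L3"})
game = TAGG(
    players=[0, 1, 2, 3],
    duration=2,
    actions=lanes,
    chance_vars=frozenset(),
    decisions=[
        Decision(f"D{i}", i, 1 + i // 2, lanes, lanes, frozenset({1 + i // 2}))
        for i in range(4)
    ],
    utilities=[
        UtilityFunction(a, t, (a,), lambda counts: -float(counts[0]))
        for a in sorted(lanes) for t in (1, 2)
    ],
)
print(game.validate())  # True
```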

Strategies

- A behavior strategy at decision D is a mapping from an instantiation of O[D] at time t(D)-1 to a distribution over its action set.
- A behavior strategy for player i is a tuple consisting of a behavior strategy for each of her decisions
- Behavior strategy profile: tuple of behavior strategies for all players
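A behavior strategy at a single decision can be sketched as a lookup from an observed instantiation to a distribution over actions. The helper names `make_behavior_strategy` and `sample_action`, the table contents, and the uniform default are assumptions for illustration only.

```python
import random


def make_behavior_strategy(table, default):
    """Return a function: observation instantiation -> distribution over actions."""
    for dist in list(table.values()) + [default]:
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")
    return lambda observation: table.get(observation, default)


def sample_action(strategy, observation, rng):
    """Draw one action from the distribution the strategy assigns to `observation`."""
    dist = strategy(observation)
    actions = sorted(dist)
    return rng.choices(actions, weights=[dist[a] for a in actions], k=1)[0]


uniform = {"L1": 1 / 3, "L2": 1 / 3, "L3": 1 / 3}
# Observation here is the tuple of lane counts seen by the driver.
strategy = make_behavior_strategy({(5, 0, 0): {"L2": 0.5, "L3": 0.5}}, default=uniform)

rng = random.Random(0)
print(sample_action(strategy, (5, 0, 0), rng))  # either "L2" or "L3"
```

A behavior strategy for a player would then be a tuple of such functions, one per decision.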

Induced Bayes Net

- Induced BN of a TAGG given a strategy profile
- Formally describes how the TAGG is played out
- Decisions, chance variables and utilities correspond to random variables in the BN
- Action counts are time-dependent: we have a separate action count variable for each action at each time step
- A decision-payoff variable represents the utility of decision D received at a payoff time
- Expected utility of a player is the sum of the expected values of her decisions’ decision-payoff variables.
- Can similarly define induced MAID of a TAGG

TAGGs and MAIDs

- Any TAGG is utility-equivalent to its induced MAID
- However, the induced MAID (and BN) has large in-degrees, so its representation size is exponentially larger

- For the other direction, any MAID can be efficiently encoded as a TAGG with same space complexity

Computing Expected Utility

- Computing expected utility of a player, given a behavior strategy profile
- An essential step in many game-theoretic computations

- Can be cast as an inference problem on the induced BN: compute the expected values of the decision-payoff variables
- Can apply a standard BN inference algorithm
- TAGGs have additional structure that can be exploited to speed up computation

Exploiting anonymity

- Induced BN has large in-degrees for action-count variables
- Their CPDs are counting functions with causal independence structure [Heckerman&Breese 1996]
- Can reduce in-degree by transforming the BN
- Create nodes representing intermediate counts
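The idea of intermediate counts can be illustrated with a running partial sum. The sketch below assumes that, within one time step, each agent chooses the action independently with a known probability; `count_distribution` is a hypothetical helper, not code from the authors. Each partial count depends only on the previous partial count and one agent's choice, so every step has in-degree two instead of one node with all agents as parents.

```python
def count_distribution(probs):
    """Distribution of how many agents choose the action.

    probs[i] is the (assumed independent) probability that agent i chooses it.
    dist[k] holds P(partial count == k) over the agents processed so far.
    """
    dist = [1.0]  # before any agent is processed, the count is 0
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this agent does not choose the action
            nxt[k + 1] += mass * p     # this agent chooses the action
        dist = nxt
    return dist


print(count_distribution([0.5, 0.5]))  # [0.25, 0.5, 0.25]
```

For n agents this takes on the order of n squared operations, rather than enumerating all 2^n joint choices.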

Exploiting Temporal Structure

- A network satisfies the Markov property if parents of variables at time t are at t or t-1.
- Parts of the transformed BN (action counts) satisfy MP
- Can transform it to one satisfying MP by duplicating variables

- Adapt the interface algorithm for Dynamic BNs
- interface: set of variables in time t that have children in time t+1
- d-separates “past” from “future”

- algorithm eliminates variables in the temporal order; keeping distribution over interface variables


Tollbooth example

- interface at t: the action count variables at time t
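For the tollbooth game, a forward pass that keeps only a distribution over the cumulative lane counts might look like the sketch below. It is a simplified illustration in the spirit of the interface idea, not the authors' algorithm. It assumes all cars in a wave use the same strategy, that the strategy depends only on the observed cumulative counts, and that a car's utility is a function of the final count in its lane after its own wave. The names `compositions`, `multinomial_prob`, and `expected_utilities` are hypothetical.

```python
from collections import defaultdict
from math import factorial


def compositions(n, parts):
    """All tuples of `parts` non-negative integers summing to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest


def multinomial_prob(ks, ps):
    """Probability that independent choices with lane probabilities ps give counts ks."""
    coef = factorial(sum(ks))
    prob = 1.0
    for k, p in zip(ks, ps):
        coef //= factorial(k)
        prob *= p ** k
    return coef * prob


def expected_utilities(strategy, utility, lanes=3, cars=5, T=4):
    """Expected utility of a car in each wave, plus the final interface distribution."""
    interface = {(0,) * lanes: 1.0}  # distribution over cumulative lane counts
    eu_per_wave = []
    for _ in range(T):
        nxt = defaultdict(float)
        eu = 0.0
        for counts, mass in interface.items():
            ps = strategy(counts)  # lane probabilities given observed counts
            for ks in compositions(cars, lanes):
                pr = mass * multinomial_prob(ks, ps)
                if pr == 0.0:
                    continue
                new = tuple(c + k for c, k in zip(counts, ks))
                nxt[new] += pr
                # Average utility over the cars of this wave (symmetric strategies).
                eu += pr * sum(k * utility(c) for k, c in zip(ks, new)) / cars
        interface = dict(nxt)  # the past is summarized by the counts alone
        eu_per_wave.append(eu)
    return eu_per_wave, interface


eus, final = expected_utilities(lambda counts: (1 / 3, 1 / 3, 1 / 3), lambda c: -c)
print([round(x, 3) for x in eus])
```

The work per time step depends on the number of distinct count vectors, not on the number of joint action histories, which is the point of summarizing the past through the interface.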

Computation, Ctd

- Further exploiting the structure of transformed BN within the same time step
- We can exploit CSI for further speedup
- Our algorithm computes EU in poly time if
- the number of interface variables at each time step is bounded
- inference over the chance variables can be done efficiently

- Our methods can be applied to speed up computation of Nash equilibria

Experiments

- Compute EU for tollbooth games (3 lanes)
- Approach 1: standard clique tree algorithm on induced BN
- Approach 2: same clique tree algorithm on transformed BN
- Approach 3: our algorithm

Experiments: Tollbooth (5 cars per time step, varying T)

Experiments: Tollbooth (T=3, varying # cars per time step)

Conclusions

- Temporal Action-Graph Games (TAGGs)
- novel compact representation for dynamic games
- extends AGGs to dynamic setting
- compactly express wider variety of utility structure, including CSI and anonymity
- exploit such structure for efficient computation
- expected utility
- Nash equilibrium

Game Theory In One Slide

- A game:
- an interaction between two or more self-interested agents
- each agent independently chooses a strategy
- each agent derives utility from the resulting strategy profile

- Strategies:
- simultaneous-move games: choosing from a set of actions
- dynamic games: choosing actions at multiple points in time; conditioned on observations
- can randomize over actions

- Best Response:
- I play a strategy that maximizes my own utility, given a particular strategy profile for the other agents

- Nash Equilibrium:
- a strategy profile with the property that every agent’s strategy is a best response to the strategies of the others
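The best-response condition can be checked directly for pure strategies in a small simultaneous-move game. The sketch below is restricted to pure strategy profiles, and the two-car, two-lane game with utility equal to minus the number of cars in one's own lane is an assumed toy example. The names `is_pure_nash` and `lane_utility` are hypothetical.

```python
def is_pure_nash(profile, action_sets, utility, tol=1e-12):
    """True if no player can gain by unilaterally switching to another action."""
    for i, actions in enumerate(action_sets):
        current = utility(i, profile)
        for a in actions:
            deviation = profile[:i] + (a,) + profile[i + 1:]
            if utility(i, deviation) > current + tol:
                return False  # player i has a profitable deviation
    return True


def lane_utility(i, profile):
    # Assumed payoff: minus the number of cars (including i) in i's lane.
    return -sum(1 for a in profile if a == profile[i])


action_sets = [("L1", "L2"), ("L1", "L2")]
print(is_pure_nash(("L1", "L2"), action_sets, lane_utility))  # True
print(is_pure_nash(("L1", "L1"), action_sets, lane_utility))  # False
```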

Representations of Games

- Goal: modeling real-world systems
- Standard Representations
- normal form for simultaneous-move games: specify utilities for every action profile
- extensive form (game trees) for dynamic games: specify utilities for every possible sequence of actions
- such representations ignore structure in the games’ utilities
- sizes grow exponentially in parameters of the games

- Compact Representations
- simultaneous-move: Graphical Games [Kearns, Littman & Singh 2001], Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004][Jiang & Leyton-Brown 2006]
- dynamic: Multi-Agent Influence Diagrams (MAIDs) [Koller & Milch 2001]

Representing Dynamic Games

- Multi-Agent Influence Diagrams (MAIDs): directed acyclic graph over
- Chance nodes
- Decision nodes: a player chooses an action, observing instantiation of parents
- Utility nodes: utility for a player, function of instantiation of parents

- MAIDs are compact when utility functions exhibit strict independencies
- They do not capture other kinds of structure

Temporal Action-Graph Games

- TAGGs: compact representation for dynamic games
- Extension of AGGs to dynamic setting
- Can efficiently encode arbitrary MAIDs
- Can compactly represent dynamic games with CSI and/or anonymity structure
- Efficient computation

TAGGs at a high level

- Played over a series of time steps 1 … T
- At each time step, an AGG is played by a set of players, on a set of action nodes
- Action counts on the action nodes are accumulated over time

- Model uncertainty using chance variables
- Decisions: player at each decision may observe a set of previous decisions, chance variables and action counts
- Utility functions: action-specific and time-specific

TAGG: tollbooth example

- Players in N correspond to the cars
- T=4
- One action node for each lane
- At each time step, we have 5 decisions
- Action set is the entire set
- Payoff time same as decision time: pt(D) = {t(D)}

- Utility function has A as its only parent
- Size of representation
