Continuation Methods for Structured Games Ben Blum Christian Shelton Daphne Koller Stanford University
Outline • Game Theory Description • Nash Equilibria: Solutions to Games • How to Find Them: Continuation Methods • Normal Form Games: Govindan and Wilson 2002 • Graphical Games • Results for Graphical Games • Multi-Agent Influence Diagrams • Overview of MAIDs • Continuation methods for MAIDs
Game Theory • Normal-form games • Model the joint behavior of multiple agents • All players move once, simultaneously • Each player’s payoff depends on actions of all others • Representation exponential in number of players • Structured games (graphical games, MAIDs) • Computer science’s contribution to game theory: exploit structure, independencies • More compact, elegant representations
Nash Equilibria • Strategy profile • Assigns a strategy to every player • A pure strategy chooses one action • A mixed strategy is a distribution over pure strategies • A Nash equilibrium is a “solution” to the game • A strategy profile where no player can improve his payoff by unilaterally deviating from his strategy • Not the perfect notion of a solution, but useful
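The equilibrium condition above can be checked numerically. The following is a minimal sketch for the two-player case only, assuming payoffs are given as NumPy matrices; the function name `is_nash` and the tolerance are illustrative choices, not part of the talk. It relies on the standard fact that checking pure-strategy deviations is sufficient.

```python
import numpy as np

def is_nash(payoff_row, payoff_col, x, y, tol=1e-9):
    """Return True if mixed strategies (x, y) are a Nash equilibrium of a 2-player game.

    payoff_row[i, j] and payoff_col[i, j] are the payoffs to the row and column
    player when row plays action i and column plays action j.
    """
    payoff_row = np.asarray(payoff_row, dtype=float)
    payoff_col = np.asarray(payoff_col, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Expected payoff of each pure action against the opponent's mixed strategy.
    row_action_values = payoff_row @ y
    col_action_values = x @ payoff_col
    # Expected payoff of the strategies actually being played.
    row_value = x @ row_action_values
    col_value = col_action_values @ y
    # Equilibrium: no unilateral pure deviation improves either player's payoff.
    return bool(row_action_values.max() <= row_value + tol
                and col_action_values.max() <= col_value + tol)

# Matching pennies: the only equilibrium is for both players to mix uniformly.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
uniform_is_nash = is_nash(A, B, [0.5, 0.5], [0.5, 0.5])
pure_is_nash = is_nash(A, B, [1.0, 0.0], [1.0, 0.0])
```

Here the uniform profile passes the check, while the pure profile fails because the column player gains by switching actions.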
Why Compute Equilibria? • Descriptive power • Describes stable outcomes of systems • Useful for testing accuracy of an economic model • See if model’s equilibria correspond to real behavior • Fast computation of equilibria required • Prescriptive power • Choose the way an agent should act • The minimum requirement for an optimal strategy • Lets us prune the continuous space of strategies to a few discrete possibilities
Big Computational Idea • Govindan and Wilson ’02: Continuation method for normal-form games • Perturb game by a vector of bonuses • Pay each player an additional bonus for each of their actions, independent of what the other players do • If the perturbation is large, the game is easily solvable • Unique pure equilibrium in which each player chooses the action with the highest bonus
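The claim that a large perturbation makes the game trivially solvable can be illustrated on a small random game. This is a sketch under assumptions: the helper names `perturb` and `pure_equilibria`, the 3x3 game, the bonus values, and the scale 1000 are all made up for illustration, and the brute-force enumeration is only for checking, not the method used in the talk.

```python
import numpy as np

def perturb(payoffs, bonuses, lam):
    """Add lam * bonuses[p][a] to player p's payoff whenever p plays action a."""
    perturbed = []
    for p, (u, b) in enumerate(zip(payoffs, bonuses)):
        shape = [1] * u.ndim
        shape[p] = -1  # broadcast the bonus along player p's own action axis
        perturbed.append(u + lam * np.asarray(b, dtype=float).reshape(shape))
    return perturbed

def pure_equilibria(payoffs):
    """Enumerate pure-strategy Nash equilibria by brute force (small games only)."""
    found = []
    for profile in np.ndindex(*payoffs[0].shape):
        stable = True
        for p, u in enumerate(payoffs):
            index = list(profile)
            index[p] = slice(None)  # vary only player p's action
            if u[tuple(index)].max() > u[profile]:
                stable = False  # player p has a profitable deviation
                break
        if stable:
            found.append(profile)
    return found

rng = np.random.default_rng(0)
# game[p][a0, a1] is player p's payoff when player 0 plays a0 and player 1 plays a1.
game = [rng.uniform(-1, 1, size=(3, 3)) for _ in range(2)]
bonus = [np.array([0.2, 0.9, 0.5]), np.array([0.7, 0.1, 0.3])]

heavily_perturbed = perturb(game, bonus, lam=1000.0)
equilibria = pure_equilibria(heavily_perturbed)
best_bonus_profile = tuple(int(np.argmax(b)) for b in bonus)
```

Because the scaled bonus gaps dwarf the original payoffs (which lie in [-1, 1]), each player's best action no longer depends on the opponent, so the only equilibrium is the profile of highest-bonus actions.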
Big Idea: The Picture • Choose a random bonus vector b, perturb game g by $\lambda b$, with $\lambda$ a scale factor • Follow the path back from the heavily perturbed game to the original game g • Finds equilibria of g • Solutions at vertical axis (where $\lambda = 0$) • Multiple equilibria found with a single ray b
Continuation Methods • General framework for satisfying continuous constraints • Start at easily solved perturbed problem, then trace the solution back to the original problem • Task Specification • $\lambda$ is the scale factor for perturbation: 1 is fully perturbed, 0 is unperturbed • Formulate set of constraints on the solution, s, to the $\lambda$-perturbed problem: $F(s, \lambda) = 0$ • F continuous, zero iff s is a solution to the perturbed problem at $\lambda$.
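Continuation Methods • General Methodology • Start with $(s, \lambda)$, $\lambda = 1$, such that $F(s, \lambda) = 0$ • Take time derivative: $\frac{dF}{dt} = \nabla_s F \, \frac{ds}{dt} + \frac{\partial F}{\partial \lambda} \, \frac{d\lambda}{dt} = 0$ • If $\lambda$ and s change by small amounts in these directions, F is unchanged • Follow differential system with small discrete steps • Requires inverting the matrix $\nabla_s F$ at each step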
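The methodology above can be sketched as a predictor-corrector loop. This is a toy illustration, not the game-solving algorithm: the constraint function is a made-up scalar homotopy, the names `trace_path`, `jac_s`, and `jac_lam` are hypothetical, and the sketch assumes the matrix $\nabla_s F$ stays invertible so the path can be parameterized by $\lambda$ directly (the real method follows paths that may turn back in $\lambda$).

```python
import numpy as np

def trace_path(F, jac_s, jac_lam, s_start, step=0.01, newton_iters=5):
    """Follow F(s, lam) = 0 from lam = 1 down to lam = 0.

    Assumes grad_s F is invertible along the whole path (no turning points).
    """
    s = np.array(s_start, dtype=float)
    lam = 1.0
    while lam > 0.0:
        d_lam = -min(step, lam)
        # Predictor: grad_s F ds + dF/dlam dlam = 0, solved for ds.
        ds = -np.linalg.solve(jac_s(s, lam), jac_lam(s, lam)) * d_lam
        s, lam = s + ds, lam + d_lam
        # Corrector: a few Newton steps in s at fixed lam to remove drift.
        for _ in range(newton_iters):
            s = s - np.linalg.solve(jac_s(s, lam), F(s, lam))
    return s

# Toy problem: solve s^3 + s - 2 = 0 (root s = 1) by deforming the
# trivially solvable equation s - 5 = 0 into it as lam goes from 1 to 0.
def F(s, lam):
    return (1 - lam) * (s**3 + s - 2) + lam * (s - 5.0)

def jac_s(s, lam):
    return np.array([[(1 - lam) * (3 * s[0]**2 + 1) + lam]])

def jac_lam(s, lam):
    return -(s**3 + s - 2) + (s - 5.0)

root = trace_path(F, jac_s, jac_lam, s_start=[5.0])
```

The linear solve in the predictor is the "inverting the matrix at each step" cost mentioned on the slide; the Newton corrector is an added safeguard in this sketch.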
Continuation Method for Games • Set of constraints, F, based on homeomorphism between space of games and their equilibria [Kohlberg & Mertens ’86] • One constraint for every action, a, of each player, p • In the Jacobian $\nabla_s F$, the entry at $(a, a')$ corresponds to the payoff to player p when p and p’ deviate from s by playing $a$ and $a'$ respectively
Implementation of GNM • Implemented Govindan and Wilson’s normal form algorithm (the Global Newton Method) • Much faster than other leading algorithms for normal-form games • Available on the web under the GNU General Public License • http://dags.stanford.edu/Games/gametracer.html • Still too slow for large games • Calculation of the Jacobian $\nabla_s F$ requires exponential time
Graphical Games • How to reduce exponential representation? • Exploit structure! • Only connect players whose payoffs depend on each other • Each player has a payoff matrix, a function of neighbors and self only • Representation: each payoff matrix is exponential in d rather than in the number of players, where d is number of neighbors
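To make the savings concrete, the two representation sizes can be compared by counting payoff entries. This is a rough sketch: the function names are hypothetical, and the 36 players, 2 actions per player, and 4 neighbors per player are assumed numbers chosen for illustration (border players in a real grid have fewer neighbors).

```python
def normal_form_size(num_players, num_actions):
    """Payoff entries in normal form: one payoff per player per joint action."""
    return num_players * num_actions ** num_players

def graphical_game_size(num_players, num_actions, num_neighbors):
    """Payoff entries when each table covers only a player and its d neighbors."""
    return num_players * num_actions ** (num_neighbors + 1)

# Assumed example: 36 players, 2 actions each, at most 4 neighbors per player.
full = normal_form_size(36, 2)
local = graphical_game_size(36, 2, 4)
```

Under these assumptions the normal form needs 36 * 2**36 entries, while the graphical representation needs 1152.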
Example Graphical Games • Grid game • Territorial issues • Road game • Landowners along a road
Solving Graphical Games • Want to find equilibria in graphical games • Graphical structure is a useful way to represent large games • Current exact algorithms are impractical • Enforce unreasonable restrictions [Kearns, Littman, Singh ’01] • 2 actions per player • Tree structure • Too slow • Approximate equilibria are problematic [Vickrey & Koller ’01] • No guarantee of an exact equilibrium in the vicinity • Granularity must be crude for reasonable execution time • Best of both worlds: a general exact algorithm
Our Algorithm • Based on Govindan and Wilson’s • Uses the same continuation method constraint function • Trace solutions along a ray of perturbed games • Efficient computation using structure • Game structure allows us to compute the components of the Jacobian $\nabla_s F$ locally • If two players aren’t adjacent, payoffs don’t depend on each other so derivative is zero • Otherwise, can use the local game matrix • Exponential in family size, NOT in game size
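The local computation can be sketched as follows: a player's expected payoff for each of its own actions depends only on its neighbors' mixed strategies, so it can be read off the local payoff table. The function name `deviation_payoffs`, the axis layout, and the numbers are assumptions for illustration, not the authors' code.

```python
import numpy as np

def deviation_payoffs(local_payoff, neighbor_strategies):
    """Expected payoff to a player for each own action, given neighbors' mixed strategies.

    local_payoff has one axis for the player's own action, followed by one axis
    per neighbor, in the same order as neighbor_strategies.
    """
    values = np.asarray(local_payoff, dtype=float)
    # Average out neighbor axes one at a time, from the last axis to the first.
    for strategy in reversed(neighbor_strategies):
        values = values @ np.asarray(strategy, dtype=float)
    return values

# A player with 2 actions and two neighbors with 2 actions each:
# payoff[a, n1, n2] is the payoff for own action a and neighbor actions n1, n2.
payoff = np.arange(8).reshape(2, 2, 2)
neighbor_1 = [0.5, 0.5]   # mixes uniformly
neighbor_2 = [1.0, 0.0]   # always plays its first action
values = deviation_payoffs(payoff, [neighbor_1, neighbor_2])
```

The work is proportional to the size of the local table, which grows with the number of neighbors rather than with the number of players in the game.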
Graphical Game Results • 6x6 grid (intractable for most algorithms): 27s • Results for different road sizes: [plot not preserved] • Equilibrium error • $10^{-4}$ (Vickrey & Koller) • $<10^{-14}$ (GNM)
Multi-Agent Influence Diagrams • Directed acyclic graph, like a BN • Three types of nodes • Chance nodes: acts of nature • Decision nodes: acts of players • Utility nodes: payoffs for a player, can’t be parents • Multiple agents (players) • Payoff is expected sum of owned utility nodes • Strategies: entries in the CPTs of owned decision nodes
Tree Killer Example • Alice (dark gray) must decide whether to poison Bob’s tree to get a better patio view • Bob (light gray) must decide whether to call a tree doctor
MAIDs vs. Other Games • MAIDs correspond to extensive form games (game trees) • Different from normal form games: sequential actions • Different homeomorphism and constraint function needed • Different strategy representation required
Finding Equilibria in MAIDs • Another continuation method • Based on Govindan and Wilson’s extensive form constraint function • A MAID induces an extensive form game • Exponentially larger than the MAID itself • Induced strategy profiles are just as compact as in the MAID • We can therefore use the extensive form constraint function, with all computations done inside the MAID • New compact strategy representation: non-exclusion probabilities
Non-Exclusion Probabilities • Assumes a game with perfect recall • No player “forgets” anything that he has learned • If decision $D'$ comes after decision $D$, then $D$ and the parents of $D$ are parents of $D'$ • Non-exclusion probability representation: • For player i, topologically sort decision variables • One non-exclusion probability for each instantiation of … • For outcome z, …
Non-Exclusion Constraints • Strategy representation now has additional constraints • [Figure: non-exclusion probability tables and the corresponding decision node CPTs for decision nodes A1 (actions a0, a1), B1 (actions b0, b1), and A2 (actions a2, a3), where A2 has parents A1 and B1]
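One way to picture the relationship between decision node CPTs and non-exclusion probabilities is the sketch below. It rests on an assumption not spelled out in the surviving slide text: that a non-exclusion probability is the product of a player's own decision probabilities along a path, as in a sequence-form realization plan. The node names follow the slide's example (A1, B1, A2), but all numbers are made up.

```python
import numpy as np

# Hypothetical CPTs for one player's two decisions (values invented for illustration).
cpt_a1 = np.array([0.3, 0.7])                   # P(A1 = a0), P(A1 = a1)
cpt_a2 = np.array([[[0.9, 0.1], [0.5, 0.5]],    # P(A2 | A1 = a0, B1 = b0), (A1 = a0, B1 = b1)
                   [[0.2, 0.8], [0.6, 0.4]]])   # P(A2 | A1 = a1, B1 = b0), (A1 = a1, B1 = b1)

# Assumed definition: multiply the player's own choice probabilities along the path.
ne_a1 = cpt_a1                                  # indexed [a1]
ne_a2 = cpt_a1[:, None, None] * cpt_a2          # indexed [a1, b1, a2]

# Additional constraint: for each (a1, b1), the entries over A2 sum to the entry for a1.
sums_over_a2 = ne_a2.sum(axis=2)

# The CPT can be recovered by dividing by the earlier entry (when it is nonzero).
recovered_cpt_a2 = ne_a2 / ne_a1[:, None, None]
```

Under this reading, the entries are no longer free probabilities: they must satisfy the sum constraints checked by `sums_over_a2`, which is one interpretation of the "additional constraints" on the slide.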
Calculation of Jacobian • One component of F for each non-exclusion probability • In the Jacobian, each element is again the payoff to one player when he and another deviate • This can be calculated by changing the decision node CPTs (no longer probabilities) • Zero out entries leading to other outcomes • Run “inference” to find reward node expectations
What’s next? • Implement a MAID algorithm • We have a continuation method constraint function F • Calculation of the Jacobian of F can be done with standard probabilistic inference in the MAID • Calculation of the retraction operator is linear in the strategy representation, because constraints are orthogonal; can project onto each one in turn • Now we just have to implement it
Conclusions • Applied new methods from economics to structured games • Graphical games • Fastest general algorithm • Exact • MAIDs • Adapted extensive form theoretical framework to calculations entirely within the MAID • Software available for download