Graphical Models. Michael Kearns, Michael L. Littman, Satinder Singh. Presenter: Shay Cohen.
So far we have seen… • Players' payoffs and the games are represented in tabular form • n agents with 2 actions: n matrices of exponential size, 2^n entries each • Needed: more compact representations and algorithms for manipulating them
Graphical models (informally) • An n-player game is given by an undirected graph with n vertices and n matrices • A player's payoff is determined only by its own action and the actions of its neighbors • “Local games” compose the “global game”
Examples • Games with geographical aspects involved (salespersons) • Topology of computer networks with a limited set of neighbors • … and so on
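Reminder… • n-player two-action game: n matrices M_1, …, M_n of size 2^n • M_i(x) specifies the payoff to player i for pure strategy x; M_i(p) is the expected payoff under mixed strategy p • Nash equilibrium: M_i(p) ≥ M_i(p[i:p’]) (for all i and for all p’), where p[i:p’] is p with player i's strategy replaced by p’ • ε-Nash equilibrium: M_i(p) + ε ≥ M_i(p[i:p’])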
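The definitions on this slide can be checked numerically. The following is a minimal sketch, not code from the presentation: the names `expected_payoff` and `is_epsilon_nash`, the dictionary layout of the matrices, and the convention that `p[j]` is the probability of action 1 are all assumptions made for illustration. Because the expected payoff is linear in a player's own mixed strategy, it is enough to test the two pure deviations.

```python
from itertools import product


def expected_payoff(M_i, p):
    """Expected payoff of one player under the mixed profile p.

    M_i maps each pure profile x (a tuple of 0/1 actions) to a payoff.
    p[j] is the probability that player j plays action 1 (assumed convention).
    """
    total = 0.0
    for x in product((0, 1), repeat=len(p)):
        prob = 1.0
        for x_j, p_j in zip(x, p):
            prob *= p_j if x_j == 1 else 1.0 - p_j
        total += prob * M_i[x]
    return total


def is_epsilon_nash(M, p, eps=0.0):
    """True if no player gains more than eps by a unilateral deviation."""
    for i, M_i in enumerate(M):
        current = expected_payoff(M_i, p)
        # Payoff is linear in player i's own strategy, so checking the
        # two pure deviations covers every mixed deviation as well.
        for pure in (0.0, 1.0):
            deviation = list(p)
            deviation[i] = pure
            if expected_payoff(M_i, deviation) > current + eps:
                return False
    return True


# Matching pennies: player 0 wants to match, player 1 wants to mismatch.
pennies = [
    {(0, 0): 1, (0, 1): -1, (1, 0): -1, (1, 1): 1},
    {(0, 0): -1, (0, 1): 1, (1, 0): 1, (1, 1): -1},
]
print(is_epsilon_nash(pennies, [0.5, 0.5]))
print(is_epsilon_nash(pennies, [1.0, 1.0]))
```

Note that `expected_payoff` enumerates all 2^n pure profiles, which is exactly the exponential cost that the graphical representation is meant to avoid.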
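Graphical Games • Graphical game: (G,M) • G is an undirected graph on n vertices • M is a set of n matrices M_i, each giving the payoff of player i as a function of its own action and its neighbors' actions • Size of M_i is 2^k when the neighborhood of i (including i itself) has k vertices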
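To see how much the graphical representation saves, the sketch below counts matrix entries for both representations. The helper names and the path-graph example are hypothetical, and each neighborhood is assumed to include the player itself.

```python
def tabular_size(n):
    """Entries in the tabular form: n matrices with 2^n entries each."""
    return n * 2 ** n


def graphical_size(neighborhoods):
    """Entries in the graphical form.

    neighborhoods[i] is the set of neighbors of player i, including i
    itself (assumed convention), so player i's matrix has 2^|set| entries.
    """
    return sum(2 ** len(nb) for nb in neighborhoods)


def path_neighborhoods(n):
    """Neighborhoods of a path graph 0 - 1 - ... - (n-1)."""
    return [{j for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]


print(tabular_size(10))
print(graphical_size(path_neighborhoods(10)))
```

For a path on 10 vertices the two end players have 4-entry matrices and the eight interior players have 8-entry matrices, against 1024 entries per player in tabular form.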
Algorithm TreeNash • Works in two passes: the downstream pass and the upstream pass • Downstream: passes indicator tables (with witnesses) from the leaves to the root • Upstream: selects witnesses from the root to the leaves (see the attached appendix)
TreeNash – more details • Downstream: a parent U sends its child V a binary-valued table T(v,u) s.t. T(v,u)=1 ⟺ there is an NE for the subtree rooted at U, with V fixed to play v, in which U=u (v, u are mixed strategies) • Upstream: a child V fixes V=v and, for each of its parents U_i, a value u_i s.t. T(v,u_i)=1
Downstream in general • W – child, V – current node, U – parents • T(w,v)=1 ⟺ for some u=(u_1,…,u_k): T(v,u_i)=1 for all i, and V=v is a b.r. to U=u, W=w (b.r. – best response)
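How? - Downstream [Figure: tree in which the leaves U and V are the parents of W, and W is the parent of the root Z] • T(w,v)=1 ⟺ v is a b.r. to w • T(w,u)=1 ⟺ u is a b.r. to w • T(z,w)=1 ⟺ for some (u,v): T(w,u)=1, T(w,v)=1, and W=w is a b.r. to U=u, V=v, Z=z • T(z)=1 ⟺ for some w: T(z,w)=1 and Z=z is a b.r. to W=w (b.r. – best response)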
How? – Upstream [Figure: same tree, with leaves U and V, their child W, and the root Z] • Choose Z=z, W=w s.t. T(z,w)=1 • Choose U=u, V=v s.t. T(w,u)=1 and T(w,v)=1 (the witnesses for T(z,w)=1)
TreeNash • Theorem: TreeNash computes a Nash equilibrium for the tree game (G,M) • Non-deterministic choices: following every possible choice of witnesses finds all NE • But the tables are indexed by continuous values… How do we compute them?
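Approximate TreeNash • Tables will be of finite size: mixed strategies are restricted to the τ-grid {0, τ, 2τ, …, 1} • All computations of best responses become computations of ε-best responses on the grid • Each table has O(1/τ^2) entries, therefore the running time is polynomial in 1/τ for a fixed number of parents k, but exponential in k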
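The two passes restricted to a grid can be sketched as follows. This is an illustrative brute-force version under stated assumptions, not the presenters' implementation: the function name `approx_tree_nash`, the `parents`/`child`/`payoff` dictionaries, the convention that a strategy value is the probability of action 1, and the ordering of each local matrix's key as (parents..., the node itself, its child) are all hypothetical choices. Each table entry stores a witness (the parents' values), which the upstream pass then follows from the root back to the leaves.

```python
from itertools import product


def approx_tree_nash(parents, child, payoff, order, tau_steps, eps):
    """Grid-based downstream/upstream passes on a tree game (sketch).

    order lists the nodes leaves-first, root last.
    payoff[v] maps a tuple of 0/1 actions, ordered as
    (v's parents..., v, v's child if any), to v's payoff.
    """
    grid = [i / tau_steps for i in range(tau_steps + 1)]

    def eps_best_response(v, v_val, parent_vals, child_val):
        has_child = child[v] is not None
        probs = list(parent_vals) + [None] + ([child_val] if has_child else [])
        me = len(parent_vals)
        pay = [0.0, 0.0]  # expected payoff of v's two pure actions
        for x in product((0, 1), repeat=len(probs)):
            weight = 1.0
            for j, (x_j, p_j) in enumerate(zip(x, probs)):
                if j != me:
                    weight *= p_j if x_j else 1.0 - p_j
            pay[x[me]] += weight * payoff[v][x]
        mixed = (1.0 - v_val) * pay[0] + v_val * pay[1]
        return mixed + eps >= max(pay)

    # Downstream pass: tables[v][(child value, v value)] = witness tuple
    # holding one value per parent of v.
    tables = {}
    for v in order:
        child_vals = grid if child[v] is not None else [None]
        table = {}
        for c_val, v_val in product(child_vals, grid):
            for u_vals in product(grid, repeat=len(parents[v])):
                ok = all((v_val, u_val) in tables[u]
                         for u, u_val in zip(parents[v], u_vals))
                if ok and eps_best_response(v, v_val, u_vals, c_val):
                    table[(c_val, v_val)] = u_vals
                    break
        tables[v] = table

    # Upstream pass: pick any root entry and follow the stored witnesses.
    root = order[-1]
    if not tables[root]:
        return None
    (_, root_val), _ = next(iter(tables[root].items()))
    profile = {}
    stack = [(root, None, root_val)]
    while stack:
        v, c_val, v_val = stack.pop()
        profile[v] = v_val
        for u, u_val in zip(parents[v], tables[v][(c_val, v_val)]):
            stack.append((u, v_val, u_val))
    return profile


# Two-node tree: leaf V is the parent of root W.
# V wants to match W, W wants to mismatch V.
pairs = list(product((0, 1), repeat=2))
payoff = {
    "V": {(v, w): 1.0 if v == w else 0.0 for v, w in pairs},
    "W": {(v, w): 1.0 if v != w else 0.0 for v, w in pairs},
}
parents = {"V": [], "W": ["V"]}
child = {"V": "W", "W": None}
print(approx_tree_nash(parents, child, payoff, ["V", "W"], tau_steps=2, eps=0.01))
```

The inner search over `product(grid, repeat=len(parents[v]))` is where the exponential dependence on the number of parents shows up; the sketch keeps only one witness per entry, so it returns a single approximate equilibrium rather than all of them.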
Approximate TreeNash (2) • Lemma: Let p be a NE for (G,M) and let q be the nearest (in L1 metric) mixed strategy on the τ-grid. Then, provided τ is sufficiently small, q is an ε-NE for (G,M), where ε depends on τ and on the maximum neighborhood size k
Approximate TreeNash (3) • Theorem: For any ε>0, let τ be sufficiently small relative to ε (the bound depends on k). Then ApproximateTreeNash computes an ε-NE for the tree game (G,M).
Exact TreeNash • Tables will be made of finite unions of rectangles • Each table T(v,u) will be represented by a v-list: an ordered list of v-intervals partitioning [0,1], where each v-interval comes with a subset of [0,1] made of disjoint u-intervals on which T(v,u)=1
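One possible way to store such a table is a list of v-intervals, each paired with its u-intervals. The sketch below is an assumption about the data layout, not the representation used in the presentation; `table_value` and the example table are hypothetical, and all intervals are treated as closed.

```python
def table_value(v_list, v, u):
    """Return 1 if (v, u) lies in one of the stored rectangles, else 0.

    v_list is a list of ((v_lo, v_hi), [(u_lo, u_hi), ...]) entries,
    with every interval treated as closed.
    """
    for (v_lo, v_hi), u_intervals in v_list:
        if v_lo <= v <= v_hi:
            if any(u_lo <= u <= u_hi for u_lo, u_hi in u_intervals):
                return 1
    return 0


# Hypothetical leaf table: U wants to match its child V, and values are
# probabilities of action 1. U must play 0 when v < 0.5, may play
# anything when v = 0.5, and must play 1 when v > 0.5.
leaf_table = [
    ((0.0, 0.5), [(0.0, 0.0)]),
    ((0.5, 0.5), [(0.0, 1.0)]),
    ((0.5, 1.0), [(1.0, 1.0)]),
]
print(table_value(leaf_table, 0.2, 0.0))
print(table_value(leaf_table, 0.2, 0.7))
print(table_value(leaf_table, 0.5, 0.7))
```

The example uses three (degenerate) rectangles, which is the shape a leaf's best-response region takes in a two-action game.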
Exact TreeNash (2) • Assume the parents' tables T(v,u_1), …, T(v,u_k) share a common v-list (by merging) • Downstream: how do we compute T(w,v) from them while keeping the rectangle representation?
Exact TreeNash (3) • Fix a v-interval and, for each parent, a set of intervals appropriate to that v-interval • The region where T(w,v)=1 is of the form W × I - why? • What is the region W for which some v in the interval is a b.r. to u, w?
Exact TreeNash (4) • Denote expected payoff of V • Lemma: If then W is either empty, a continuous interval in [0,1] or union of two intervals.
Exact TreeNash (5) • It can be shown that the tables at the leaves can be represented using at most 3 rectangles • Therefore the rectangle representation can be maintained, but its size is exponential in the number of vertices • Witnesses can be found easily, because the representation is finite
ExactTreeNash • Theorem: ExactTreeNash computes a Nash equilibrium for the tree game (G,M). The algorithm runs in time exponential in the number of vertices of G
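Polynomial algorithm • Also uses a downstream pass and an upstream pass • Passes breakpoint policies (W child of V): W-breakpoints 0 = w_0 < w_1 < … < w_t = 1 with associated V-values v_1, …, v_t • Interpretation (“b.p. for V”): if w_{i-1} < w < w_i then V plays v_i; if w = w_i then V plays any value between v_i and v_{i+1}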
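A breakpoint policy can be read as a lookup from the child's value to the values V is allowed to play. The sketch below assumes a policy is stored as a sorted list of W-breakpoints from 0 to 1 plus one V-value per open segment, with V free to play anything between the two adjacent values when W sits exactly on a breakpoint. That storage format and the name `policy_response` are assumptions for illustration.

```python
def policy_response(breakpoints, values, w):
    """Range (lo, hi) of values V may play when its child W plays w.

    breakpoints: [w_0, ..., w_t] with w_0 = 0 and w_t = 1, increasing.
    values:      [v_1, ..., v_t], one value per open segment (assumed layout).
    """
    t = len(values)
    # Strictly inside a segment: V's value is fully determined.
    for i in range(1, t + 1):
        if breakpoints[i - 1] < w < breakpoints[i]:
            return (values[i - 1], values[i - 1])
    # Exactly on a breakpoint: anything between the adjacent values.
    i = breakpoints.index(w)
    left = values[i - 1] if i >= 1 else values[0]
    right = values[i] if i < t else values[t - 1]
    return (min(left, right), max(left, right))


# Hypothetical policy: V plays 0 while w < 0.5, 1 once w > 0.5,
# and anything in [0, 1] when w is exactly 0.5.
breakpoints = [0.0, 0.5, 1.0]
values = [0.0, 1.0]
print(policy_response(breakpoints, values, 0.2))
print(policy_response(breakpoints, values, 0.5))
print(policy_response(breakpoints, values, 0.8))
```

Storing one value per segment, rather than a full two-dimensional table, is what keeps the message passed between neighbors small.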
How? - Downstream • Ingredients for building V's policy: - the ordered set of breakpoints of V’s parents - for each such breakpoint, the set of values that W can play that allow V to play any strategy, given that breakpoint - for b in {0,1}, the set of values W can play such that, if V’s parents play according to V=b, then V=b is a best response
How? - Downstream • Lemma: each such set of W-values is either empty, a single interval or the union of two intervals • Lemma: together these sets cover [0,1] • Construct the policy for V by covering [0,1] with them; this produces a set of at most 2+l breakpoints • How do we start at the leaves?
How? - Upstream • Add a dummy root with constant payoff and no influence on the real root • Once we select a value for the child, the values for the parents are determined by the policies
Running time • Sorting the breakpoints and computing a new breakpoint policy takes time polynomial in t (t – number of breakpoints) • The number of breakpoints is bounded by 2n, therefore the total running time is polynomial in n
Summary • The first framework gave us: 1. An approximate NE for graphical games that are trees, in polynomial time 2. An exact NE for trees in exponential time (with a representation of ALL the NE) • The second algorithm: an exact NE for trees in polynomial time