Bayesian Games © Petteri Nurmi 2003
Bayesian Games • Definitions and Static Bayesian Games • Example of solving a static Bayesian game • Sender-Receiver Bayesian Games • Equilibrium concepts • An example of Bayesian games in Computer Science: Ad Hoc Networks: Modelling cooperation
What is a Bayesian game? • A strategic-form game with incomplete information • What is incomplete information? • Some players don't know the payoffs of the others • Incomplete information is not the same as imperfect information • Imperfect information = players don't perfectly observe the actions of the others • Classic reference: Harsanyi J., 1967-1968, Games with incomplete information played by Bayesian players, Management Science 14: 159-182; 320-334; 486-502
Harsanyi's Model • The game is transformed into a game of imperfect information • A prior move by nature is added • Nature's move determines each player's "type" • The equilibrium of this game is the Bayes-Nash equilibrium • John C. Harsanyi (29.5.1920 – 2000) • Nobel prize in 1994 together with John F. Nash Jr. and Reinhard Selten "for their pioneering analysis of equilibria in the theory of non-cooperative games"
Mathematical Intermezzo • A Bayesian game consists of the following components • A (finite) set of players $N = \{1, \dots, n\}$ • An action set $A_i$ for each player; $A = \times_{i \in N} A_i$ • A type set $\Theta_i$ for each player; $\Theta = \times_{i \in N} \Theta_i$ • A probability function $p_i : \Theta_i \to \Delta(\Theta_{-i})$ • A payoff function $u_i : A \times \Theta \to \mathbb{R}$
Type? • In Harsanyi's model nature selects each player's type • What is a type? • Any private information (i.e. information that is not common knowledge) that is relevant to the player's decision making • Such as? • The player's payoff function • The player's beliefs about other players' payoff functions • The player's beliefs about what other players believe his beliefs are • … and so on
Type continued • Let $\theta_i \in \Theta_i$ denote player i's type • Type $\theta_i$ is observed by player i only • $p(\theta_{-i} \mid \theta_i)$ denotes player i's conditional probability about his opponents' types $\theta_{-i} = (\theta_1, \dots, \theta_{i-1}, \theta_{i+1}, \dots, \theta_n)$ given his own type $\theta_i$ • We assume that the marginal $p_i(\theta_i)$ is strictly positive
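When the types are drawn from a joint prior, the conditional belief of a given type can be computed mechanically. The following is a minimal Python sketch under that assumption; the function name `conditional_belief`, the dictionary representation of the prior, and the example numbers are all illustrative and do not come from the slides.

```python
from typing import Dict, Hashable, Tuple

TypeProfile = Tuple[Hashable, ...]


def conditional_belief(prior: Dict[TypeProfile, float], i: int,
                       own_type: Hashable) -> Dict[TypeProfile, float]:
    """Return player i's belief about the other players' types, given own_type."""
    # Keep only the type profiles in which player i has the given type.
    matching = {t: q for t, q in prior.items() if t[i] == own_type}
    marginal = sum(matching.values())  # marginal probability of own_type
    if marginal <= 0:
        # The slides assume a strictly positive marginal, so conditioning is defined.
        raise ValueError("marginal probability of own type must be positive")
    # Drop player i's own coordinate and renormalise.
    return {t[:i] + t[i + 1:]: q / marginal for t, q in matching.items()}


# Hypothetical two-player prior with correlated types (numbers are made up).
prior = {
    ("HIGH", "a"): 0.3,
    ("HIGH", "b"): 0.1,
    ("LOW", "a"): 0.2,
    ("LOW", "b"): 0.4,
}
belief_high = conditional_belief(prior, 0, "HIGH")
```

Here `belief_high` assigns probability of roughly 0.75 to the opponent type `"a"` and 0.25 to `"b"`, because only the two `"HIGH"` rows of the prior are kept and renormalised.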
Strategy • A pure-strategy space $S_i$ represents the choices of actions • $\sigma_i(\theta_i)$ is the strategy player i chooses when his type is $\theta_i$ • Mathematically, a strategy is a mapping from the set of types to the set of actions • $\sigma_i(\theta_i) = a_i$; $\sigma_i : \Theta_i \to A_i$, $\forall i \in N$ (pure) • $\sigma_i(a_i \mid \theta_i) = \alpha$; $\sigma_i : \Theta_i \to \Delta(A_i)$, $\forall i \in N$ (mixed) • $\alpha$ is the (conditional) probability that player i chooses action $a_i$ when his type is $\theta_i$
Equilibrium • Bayesian equilibrium = Bayes-Nash equilibrium • Player i maximizes her expected utility conditional on her type $\theta_i$: $s_i(\theta_i) \in \arg\max_{s_i' \in S_i} \sum_{\theta_{-i}} p(\theta_{-i} \mid \theta_i)\, u_i\big(s_i', s_{-i}(\theta_{-i}), (\theta_i, \theta_{-i})\big)$
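The maximisation in the equilibrium condition can be sketched for a finite two-player game with pure strategies. This is only an illustration under those assumptions: the function `interim_best_responses` and its argument layout are hypothetical, and the usage lines borrow player II's payoffs from the factory-entry example that the slides present next, with a made-up belief and a made-up strategy for player I.

```python
from typing import Callable, Dict, Hashable, List, Sequence


def interim_best_responses(
    own_type: Hashable,
    own_actions: Sequence[Hashable],
    belief: Dict[Hashable, float],                # belief over the opponent's types
    opponent_strategy: Dict[Hashable, Hashable],  # opponent type -> opponent action
    payoff: Callable[[Hashable, Hashable, Hashable, Hashable], float],
) -> List[Hashable]:
    """Actions that maximise the expected payoff of one type of a player."""

    def expected_payoff(action: Hashable) -> float:
        # Sum over opponent types, weighted by the conditional belief.
        return sum(
            prob * payoff(action, opponent_strategy[opp_type], own_type, opp_type)
            for opp_type, prob in belief.items()
        )

    values = {a: expected_payoff(a) for a in own_actions}
    best = max(values.values())
    return [a for a in own_actions if values[a] == best]


# Player II's payoffs in the entry example, indexed by (player I action, player II action).
PLAYER_II_PAYOFF = {("B", "Enter"): -1, ("B", "Don't"): 0,
                    ("DB", "Enter"): 1, ("DB", "Don't"): 0}


def payoff_II(own_action, opp_action, own_type, opp_type):
    # Player II's payoff depends only on the two actions in this example.
    return PLAYER_II_PAYOFF[(opp_action, own_action)]


# Assumed for illustration: cost is HIGH with probability 0.25, and player I
# builds only when the cost is LOW.
belief_about_I = {"HIGH": 0.25, "LOW": 0.75}
strategy_I = {"HIGH": "DB", "LOW": "B"}
best = interim_best_responses("X", ["Enter", "Don't"], belief_about_I, strategy_I, payoff_II)
```

With these numbers entering yields an expected payoff of -0.5 against 0 for staying out, so `best` contains only `"Don't"`.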
Example • Consider the following game • Player I decides whether to build a new factory (B) or not (DB) • Simultaneously player II decides whether to enter or not • Player I's decision depends on her building cost, which is unknown to player II • Payoffs are listed as (player I, player II)
Building cost HIGH:
         Enter     Don't
  B      0, -1     2, 0
  DB     2, 1      3, 0
Building cost LOW:
         Enter     Don't
  B      1.5, -1   3.5, 0
  DB     2, 1      3, 0
Example cont. • This can be seen as a Bayesian game • Set of players $N = \{\text{I}, \text{II}\}$ • Action sets $A_1 = \{\text{B}, \text{DB}\}$; $A_2 = \{\text{Enter}, \text{Don't}\}$ • Type sets $\Theta_1 = \{\text{HIGH}, \text{LOW}\}$; $\Theta_2 = \{X\}$ • Player II's type space is a singleton, so we can ignore it • Let $c_l$ denote the low-cost type and $c_h$ the high-cost type
Example cont. • A strategy for player I is an action for EACH of its types • For the high-cost type of player I, don't build (DB) is a dominant strategy: DB yields 2 instead of 0 if player II enters, and 3 instead of 2 if player II stays out
Example cont. • The best response of player I's low-cost type depends on player II's strategy • Let y be the probability that player II enters: $u_1(B; y; c_l) = 1.5y + 3.5(1 - y) = 3.5 - 2y$ and $u_1(DB; y; c_l) = 2y + 3(1 - y) = 3 - y$ • Player I's low-cost type prefers building if $3.5 - 2y \ge 3 - y$, i.e. IF $y \le 1/2$
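Example cont. • For player II we must consider the possibility that the cost is actually low • Let p be the prior probability that player I's cost is high and x the probability that the low-cost type builds; the high-cost type never builds • $u_2(E; x) = p + (1 - p)\,[-x + (1 - x)] = 1 - 2(1 - p)x$, while staying out yields 0 • Entering is a best response when $u_2(E; x) \ge 0$, i.e. when $x \le \frac{1}{2(1 - p)} =: w$ • Now we need to compare the best-response correspondences; for player II the correspondence is $y^*(x) = \{1\}$ if $x < w$, $[0, 1]$ if $x = w$, $\{0\}$ if $x > w$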
Example cont. • In a similar way we get the best-response correspondence for player I • A Bayesian equilibrium is a point in the intersection of the two best-response correspondences (for a fixed value of the prior p)
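The intersection of the two best-response correspondences can be checked numerically. The sketch below is one possible way to do it, not the method of the slides: it takes p to be the prior probability of high cost, x the probability that the low-cost type builds and y the probability that player II enters, derives the payoffs from the example's payoff tables, and tests a short list of candidate points. All function names are hypothetical.

```python
from fractions import Fraction

ZERO, HALF, ONE = Fraction(0), Fraction(1, 2), Fraction(1)


def best_response_low_cost(y):
    """Interval (lo, hi) of optimal build probabilities x for the low-cost type."""
    # Building yields 3.5 - 2y, not building yields 3 - y: build iff y < 1/2.
    if y < HALF:
        return (ONE, ONE)
    if y > HALF:
        return (ZERO, ZERO)
    return (ZERO, ONE)


def best_response_entrant(x, p):
    """Interval (lo, hi) of optimal entry probabilities y for player II."""
    # Entering yields 1 - 2(1 - p)x (the high-cost type never builds); staying out yields 0.
    gain = 1 - 2 * (1 - p) * x
    if gain > 0:
        return (ONE, ONE)
    if gain < 0:
        return (ZERO, ZERO)
    return (ZERO, ONE)


def is_equilibrium(x, y, p):
    """True if x and y are mutual best responses for prior p."""
    lo_x, hi_x = best_response_low_cost(y)
    lo_y, hi_y = best_response_entrant(x, p)
    return lo_x <= x <= hi_x and lo_y <= y <= hi_y


def equilibria(p):
    """Check the two pure candidates and the mixed candidate (x = w, y = 1/2)."""
    p = Fraction(p)
    candidates = [(ZERO, ONE), (ONE, ZERO)]
    if p < 1:
        w = 1 / (2 * (1 - p))
        if w <= 1:
            candidates.append((w, HALF))
    # Note: at the boundary case p = 1/2 there is a continuum of equilibria
    # (x = 1 with any y in [0, 1/2]); this candidate list does not enumerate it.
    return [(x, y) for x, y in candidates if is_equilibrium(x, y, p)]
```

For example, `equilibria(Fraction(3, 4))` returns only the point where the low-cost type does not build and player II enters, while `equilibria(Fraction(1, 4))` additionally returns the point where the low-cost type builds and player II stays out, and a mixed point with x = 2/3 and y = 1/2.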
Sender-Receiver Games • The game has two players: a sender and a receiver • The sender sends a signal to the receiver, who then chooses an appropriate action • Player I (the sender) has private information about his type • Player II (the receiver) has only one type, which is considered common knowledge • "A move by nature"
Strategies in S-R games • A pure strategy for the sender is a mapping $m : \Theta \to M$ from types to messages • Let $\sigma(m \mid \theta)$ be the probability that a type-$\theta$ sender sends message m (a mixed strategy for the sender) • For the receiver, let $\rho(a \mid m)$ be a mixed strategy (the probability of choosing action a if the message is m) • On-the-path messages: $M^+(\sigma) = \{m \in M : \exists\, \theta \in \Theta,\ \sigma(m \mid \theta) > 0\} = \bigcup_{\theta \in \Theta} \operatorname{supp} \sigma(\cdot \mid \theta)$
Payoffs in S-R games • Sender's (expected) payoff: $\sum_{a \in A} \rho(a \mid m)\, u(m, a, \theta)$ • For the receiver? • Must consider every type and every message • $E[v(a, \sigma)] = \sum_{m \in M} \sum_{\theta \in \Theta} p(\theta)\, \sigma(m \mid \theta)\, v(m, a, \theta)$
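The two expected payoffs can be written as short sums over dictionaries. This is a rough sketch that follows the sums as stated on the slide, assuming finite sets of types, messages and actions; the function names, the dictionary layout and the toy payoff function `match` are hypothetical.

```python
from typing import Callable, Dict, Hashable

Dist = Dict[Hashable, float]
Payoff = Callable[[Hashable, Hashable, Hashable], float]  # (message, action, type)


def sender_expected_payoff(rho: Dict[Hashable, Dist], u: Payoff,
                           message: Hashable, theta: Hashable) -> float:
    """Sum over actions of rho(a | m) * u(m, a, theta)."""
    return sum(prob * u(message, a, theta) for a, prob in rho[message].items())


def receiver_expected_payoff(prior: Dist, sigma: Dict[Hashable, Dist],
                             v: Payoff, action: Hashable) -> float:
    """Sum over messages and types of p(theta) * sigma(m | theta) * v(m, a, theta)."""
    return sum(
        prior[theta] * prob * v(m, action, theta)
        for theta, message_dist in sigma.items()
        for m, prob in message_dist.items()
    )


# Toy data (made up): two equally likely types, a separating sender strategy,
# and a receiver who answers each message with a different action.
prior = {"t1": 0.5, "t2": 0.5}
sigma = {"t1": {"m1": 1.0}, "t2": {"m2": 1.0}}
rho = {"m1": {"a1": 1.0}, "m2": {"a2": 1.0}}


def match(message, action, theta):
    # Hypothetical payoff: 1 if the action fits the type, 0 otherwise.
    return 1.0 if (action, theta) in {("a1", "t1"), ("a2", "t2")} else 0.0


sender_value = sender_expected_payoff(rho, match, "m1", "t1")
receiver_value = receiver_expected_payoff(prior, sigma, match, "a1")
```

In this toy setting the type-`t1` sender obtains 1.0 from sending `m1`, and the fixed action `a1` is worth 0.5 to the receiver because it fits only one of the two equally likely types.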
Bayes' rule in action • For any on-the-path message m, the receiver's posterior belief that player I is of type $\theta$ is $p_B(\theta \mid m) = \dfrac{p(\theta)\, \sigma(m \mid \theta)}{\sum_{\theta' \in \Theta} p(\theta')\, \sigma(m \mid \theta')}$ • A rule for updating PRIOR beliefs into POSTERIOR beliefs!
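Bayes' rule for the receiver translates directly into code. The following is a small sketch assuming finite type and message sets stored as dictionaries; the function name `posterior`, the choice to return `None` for off-the-path messages, and the example numbers are assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional

Dist = Dict[Hashable, float]


def posterior(prior: Dist, sigma: Dict[Hashable, Dist],
              message: Hashable) -> Optional[Dist]:
    """Posterior over sender types after observing message, or None if off the path."""
    # Numerator of Bayes' rule for every type: prior times probability of the message.
    joint = {theta: prior[theta] * sigma[theta].get(message, 0.0) for theta in prior}
    total = sum(joint.values())  # probability that the message is sent at all
    if total == 0:
        return None  # off-the-path message: Bayes' rule does not pin down beliefs
    return {theta: q / total for theta, q in joint.items()}


# Hypothetical example: the "strong" type always sends m1, the "weak" type mixes.
prior = {"strong": 0.6, "weak": 0.4}
sigma = {"strong": {"m1": 1.0}, "weak": {"m1": 0.5, "m2": 0.5}}
after_m1 = posterior(prior, sigma, "m1")
after_m2 = posterior(prior, sigma, "m2")
after_m3 = posterior(prior, sigma, "m3")
```

After `m1` the receiver believes the sender is strong with probability of about 0.75, after `m2` the sender must be weak, and `m3` is never sent, so the function returns `None` instead of a belief.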
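Bayes Equilibrium in S-R games • In a Sender-Receiver game a Bayesian equilibrium is a triple $(\sigma, \rho, p_B) \in (\Delta(M))^{\Theta} \times (\Delta(A))^{M} \times (\Delta(\Theta))^{M}$ satisfying the following conditions: • For all types $\theta \in \Theta$: $\operatorname{supp} \sigma(\cdot \mid \theta) \subseteq M^*(\rho, \theta)$, the sender's best-response messages given $\rho$ • For all on-the-path messages $m \in M^+(\sigma)$: $\operatorname{supp} \rho(\cdot \mid m) \subseteq A^*(p_B, m)$, the receiver's best-response actions given the posterior belief • The conditional posterior belief system $p_B$ is consistent with Bayes' rule whenever possible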
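Perfect Bayesian Equilibrium • In a Sender-Receiver game a perfect Bayesian equilibrium is a triple $(\sigma, \rho, p_B) \in (\Delta(M))^{\Theta} \times (\Delta(A))^{M} \times (\Delta(\Theta))^{M}$ satisfying the following conditions: • For all types $\theta \in \Theta$: $\operatorname{supp} \sigma(\cdot \mid \theta) \subseteq M^*(\rho, \theta)$, the sender's best-response messages given $\rho$ • For all messages $m \in M$, not only the on-the-path ones: $\operatorname{supp} \rho(\cdot \mid m) \subseteq A^*(p_B, m)$, the receiver's best-response actions given the posterior belief • The conditional posterior belief system $p_B$ is consistent with Bayes' rule whenever possible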
Other types of Equilibrium • Perfect Bayesian equilibrium in Multi-Stage Games • Posterior beliefs are independent, and all types of player i have the same beliefs • Bayes' rule is used to update beliefs (history information?) • "no signalling what you don't know" • Posterior beliefs need to be consistent with a common joint distribution • Extensive-Form games: Sequential Equilibrium • See Fudenberg D., and Tirole J., Game Theory, The MIT Press, 1991, p. 337-341 • Trembling-Hand Perfect Equilibrium (p. 351-356) • Proper equilibrium (p. 356-359)
Bayesian games in CS • Ad Hoc Networks • Auctions • Social learning • The web search game • Voting
Ad hoc networks • MANET = Mobile Ad Hoc Network • A set of mobile hosts, each with a transceiver • No base stations; no fixed network infrastructure • Multi-hop communication • Routing and packet forwarding take place in a dynamic network topology • Game Theory and MANET? • Routing mechanisms for "selfish cooperation"
Cooperation in MANET • Reference: Modelling cooperation in Mobile Ad Hoc Networks: A formal description of Selfishness; Urpi A., Bonuccelli M., and Giordano S. • Modelling Ad Hoc Networks with Bayesian games • The nodes are the players • Nodes have to periodically select whether to forward or not • Nodes have incomplete information about the total traffic in the network • Nodes have local information about their neighbourhood
Cooperation in MANET cont. • Important issues for each node • Energy consumption • The packets are forwarded by someone • "A shared medium" • Packets are sent to every node that is within the transmission range • Prior to choosing its next action, a node has an opportunity to analyze its neighbours' past behaviour • A node must decide to whom to send packets and whose packets to discard
The Model • Time is discrete and divided into timeslots (frames) $t_1, \dots, t_n$ • Node i has the following information at the beginning of frame $t_k$ • $N_i(t_k)$: the set of neighbours, assumed to be fixed during a single frame • $B_i(t_k)$: the remaining energy units (in the battery) • $T_{ij}(t_k)$: the traffic node i generated as a source and has to send to node j during frame $t_k$ (for each node j in node i's neighbourhood) • $F_{ij}(t_{k-1})$: the number of packets that j forwarded for i during the previous frame • $R_{ij}(t_{k-1})$: the number of packets i received from j during the previous frame • $\hat{R}_{ij}(t_{k-1})$: the number of packets i received from j during the previous frame as the final destination
The forwarding game • Nodes are the players • Player i's type is its energy class $e(i)$, represented by a weight $\alpha_{e(i)}$ with $0 \le \alpha_{e(i)} \le 1$ • As her action, player i sets $S_{ij}(t_k)$, the number of packets she will send to node j, and $F_{ij}(t_k)$, the number of packets received from j during the previous frame that she will forward • Player i's payoff is $\alpha_{e(i)} W_i(t_k) + (1 - \alpha_{e(i)})\, G_i(t_k)$ • Where • $W_i(t_k)$ is a measure of the energy spent successfully • $G_i(t_k)$ is the ratio of sent packets over packets that player i wanted to send
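The payoff of a node is a weighted average of its two objectives, which is easy to state in code. The sketch below is illustrative only: the slides do not specify how the energy measure W is computed, so it is taken as an input, and the handling of a node with nothing to send is an assumption. Function and parameter names are hypothetical.

```python
def delivery_ratio(sent: int, wanted: int) -> float:
    """G: packets actually sent divided by packets the node wanted to send."""
    if wanted == 0:
        return 1.0  # assumption: a node with nothing to send counts as fully served
    return sent / wanted


def forwarding_payoff(alpha: float, energy_measure: float, ratio: float) -> float:
    """alpha * W + (1 - alpha) * G for a node whose energy class has weight alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * energy_measure + (1.0 - alpha) * ratio


# Made-up numbers: a node that cares mostly about throughput (alpha = 0.25),
# with W = 0.8 and 6 of 10 wanted packets sent.
example = forwarding_payoff(alpha=0.25, energy_measure=0.8, ratio=delivery_ratio(6, 10))
```

With these numbers the payoff is about 0.65. A weight close to 1 makes the node care mainly about energy, while a weight close to 0 makes it care mainly about getting its own packets delivered.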
The forwarding game cont. • Player i has a prior belief about the energy class of every player j in its neighbourhood • A node tries to maximize its payoff function, i.e. SELFISHNESS • We need to analyze the game as a repeated (dynamic) game and provide a utility function that makes it profitable for player i to cooperate
Problems • How to get the forwarding information? • A badly defined utility function and/or policy leads to self-destruction • The usage of time slots • There is no synchronization! • Too simple decision space? • Possible other constraints
Problems cont. • Malicious and selfish users? • Need a stronger policy • Punishing vs. encouraging • Punishing is more suitable because • How to reward agents? (better throughput in a network with no authority?) • Punishing works against both malicious and selfish users, whereas encouraging/rewarding only promotes cooperation • Theorem: Cooperation can be enforced in a mobile ad hoc network, provided that enough members agree on it and that no node has to forward more traffic than it generates.
Additional References • Cooperation in wireless ad hoc networks. Srinivasan V., Nuggehalli P., Chiasserini C-F, and Ramesh R. R., In Proceedings of IEEE Infocom 2003 http://citeseer.nj.nec.com/568937.html • Game Theoretic analysis of security in mobile ad hoc networks. Michiardi P., and Molva R. Technical Report RR-02-070, Institut Eurecom 2002.
Summary
Types of Bayesian games • Static Bayesian games • Dynamic Bayesian games • Sender-Receiver Games • Extensive Form Games • Multi-Stage Games • Equilibrium concepts • Bayesian Equilibrium = Bayes-Nash Equilibrium • Bayes Equilibrium (in dynamic games) • Perfect Bayesian Equilibrium
Applications of static Bayesian games • Packet forwarding in Ad Hoc networks • Voting mechanisms • Auction mechanisms • = MULTI-AGENT SYSTEMS • Requires: • Simultaneous competition • Multiple agents with incomplete information • The competition can also be non-simultaneous if the agents/players don't know each other's decisions (but have the same beliefs that affect their decision-making).
Applications of dynamic Bayesian games • Many economic applications • Design model for network protocols? • Design model for multiprocessor architectures? • Bayesian games are a suitable tool for modelling situations in which two or more agents interact and the prior information is incomplete.
References • Fudenberg D., and Tirole J., Game Theory, The MIT Press, 1991 • Kockesen L., Bayesian Games, http://www.columbia.edu/~lk290/ugbayes.pdf • Ratliff J., Static Games of Incomplete Information • Myatt D. P., Who Am I Playing? Incomplete Information and Bayesian Games, http://malroy.econ.ox.ac.uk/dpm/MPhilGameTheory/IncompleteStrategic.pdf • Urpi A., Bonuccelli M., and Giordano S., Modelling cooperation in mobile ad hoc networks: a formal description of selfishness
Additional Material • Eyster E., and Rabin M., Cursed Equilibrium, 2000 • Jackson M., and Kalai E., Social Learning in Recurring Games • Khoussainov R., and Kushmerick N., Playing the Web Search Game • Tennenholtz M., Robust Decision-Making in Multi-Agent Systems
The End • Bayesian Games by Petteri Nurmi • http://www.cs.helsinki.fi/u/ptnurmi/papers.html