Bayesian Games

Bayesian Games © Petteri Nurmi 2003

Bayesian Games
• Definitions and Static Bayesian Games
• Example of solving a static Bayesian game
• Sender-Receiver Bayesian Games
• Equilibrium concepts
• An example of Bayesian games in Computer Science: Ad Hoc Networks: Modelling cooperation

What is a Bayesian game?
• A strategic form game with incomplete information
• What is incomplete information?
  • Some players don't know the payoffs of the others
• Incomplete information is not imperfect information
  • Imperfect information = players don't observe the actions of others correctly
• Classic reference:
  • Harsanyi J., 1967-1968, Games with incomplete information played by Bayesian players, Management Science 14: 159-182; 320-334; 486-502

Harsanyi's Model
• The game is transformed into a game of imperfect information
  • A prior move by nature
  • Nature's move determines each player's "type"
• The equilibrium of this game is the Bayes-Nash equilibrium
• John C. Harsanyi (29.5.1920 – 2000)
  • Nobel prize in 1994 together with John F. Nash Jr. and Reinhard Selten
  • "for their pioneering analysis of equilibria in the theory of non-cooperative games"

Mathematical Intermezzo
• A Bayesian game consists of the following components:
  • A (finite) set of players $N = \{1, \dots, n\}$
  • An action set $A_i$ for each player; $A = \times_{i \in N} A_i$
  • A type set $\Theta_i$ for each player; $\Theta = \times_{i \in N} \Theta_i$
  • A probability function $p_i : \Theta_i \to \Delta(\Theta_{-i})$
  • A payoff function $u_i : A \times \Theta \to \mathbb{R}$

Type?
• In Harsanyi's model nature selects each player's type. What is a type?
  • Any private information (i.e. information that is not common knowledge) that is relevant to the player's decision making
• Such as?
  • The player's payoff function
  • The player's beliefs about other players' payoff functions
  • Beliefs about what other players believe his beliefs are
  • … and so on

Type continued
• Let $\theta_i \in \Theta_i$ denote player i's type
• Type $\theta_i$ is observed by player i only
• $p(\theta_{-i} \mid \theta_i)$ denotes player i's conditional probability about his opponents' types $\theta_{-i} = (\theta_1, \dots, \theta_{i-1}, \theta_{i+1}, \dots, \theta_n)$ given his own type $\theta_i$
• We assume that the marginal $p_i(\theta_i)$ is strictly positive

Strategy
• A pure-strategy space $S_i$ represents choices of actions
• $\sigma_i(\theta_i)$ is the strategy player i chooses when his type is $\theta_i$
• Mathematically, a strategy is a mapping from the set of types to the set of actions:
  • Pure: $s_i : \Theta_i \to A_i$ with $s_i(\theta_i) \in A_i$, for all $i \in N$
  • Mixed: $\sigma_i : \Theta_i \to \Delta(A_i)$ with $\sigma_i(a_i \mid \theta_i) = \alpha$, for all $i \in N$
  • $\alpha$ is the (conditional) probability that player i chooses action $a_i$ when his type is $\theta_i$
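Since a pure strategy assigns one action to every type, a player with finitely many types and actions has (number of actions)^(number of types) pure strategies. The following is a minimal sketch of that enumeration; the function name `pure_strategies` and the type and action labels are assumptions chosen for illustration, not part of the slides' formalism.

```python
from itertools import product


def pure_strategies(types, actions):
    """Enumerate all pure strategies, i.e. all mappings from types to actions.

    Each strategy is returned as a dict {type: action}.
    """
    return [dict(zip(types, choice)) for choice in product(actions, repeat=len(types))]


# Illustrative labels (assumed): a player with two types and two actions.
strategies = pure_strategies(["HIGH", "LOW"], ["B", "DB"])
for s in strategies:
    print(s)
```

With two types and two actions the list contains four mappings, one for each way of assigning an action to each type.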

Equilibrium
• Bayesian equilibrium = Bayes-Nash equilibrium
• Player i maximizes her expected utility conditional on her type $\theta_i$:
  $$s_i(\theta_i) \in \arg\max_{s_i' \in S_i} \sum_{\theta_{-i}} p(\theta_{-i} \mid \theta_i)\, u_i\big(s_i', s_{-i}(\theta_{-i}), (\theta_i, \theta_{-i})\big)$$

Example
• Consider the following game
• Player I decides whether to build a new factory (B) or not (DB)
• Simultaneously, player II decides whether to enter or not
• Player I's decision depends on her building cost, which is unknown to player II

Payoffs (player I, player II) when player I's cost is HIGH:
          Enter      Don't
  B       0, -1      2, 0
  DB      2, 1       3, 0

Payoffs (player I, player II) when player I's cost is LOW:
          Enter      Don't
  B       1.5, -1    3.5, 0
  DB      2, 1       3, 0

Example cont.
• This can be seen as a Bayesian game
  • Set of players N = { I, II }
  • Action sets A1 = { B, DB } ; A2 = { Enter, Don't }
  • Type sets Θ1 = { HIGH, LOW } ; Θ2 = { X }
• Player II has a singleton set as the type space, so we can ignore it
• Let c_l denote the low-cost type and c_h the high-cost type

Example cont.
• A strategy for player I is an action for EACH of its types
• For the high-cost type of player I, don't build (DB) is a dominant strategy: 2 > 0 if player II enters and 3 > 2 if she does not

Payoffs (player I, player II), HIGH cost:
          Enter      Don't
  B       0, -1      2, 0
  DB      2, 1       3, 0

Payoffs (player I, player II), LOW cost:
          Enter      Don't
  B       1.5, -1    3.5, 0
  DB      2, 1       3, 0

Example cont.
• The best response for player I's low-cost type depends on player II's strategy. Let y be the probability that player II enters:
  $u_1(B; y; c_l) = 1.5y + 3.5(1 - y) = 3.5 - 2y$
  $u_1(DB; y; c_l) = 2y + 3(1 - y) = 3 - y$
• Player I's low-cost type prefers building if $3.5 - 2y \ge 3 - y$, i.e. if $y \le 1/2$

Payoffs (player I, player II), LOW cost:
          Enter      Don't
  B       1.5, -1    3.5, 0
  DB      2, 1       3, 0

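Example cont.
• For player II we must first consider the possibility that the cost is actually low. Let p be the prior probability that the cost is high and x the probability that the low-cost type builds. Entering yields
  $u_2(E; x) = p \cdot 1 + (1 - p)\,[-x + (1 - x)] = 1 - 2(1 - p)x$
  while not entering yields 0, so entering is optimal if $x \le \frac{1}{2(1 - p)} =: w$
• Now we need to compare the best-response correspondences. For player II the correspondence is
  $$y^*(x) = \begin{cases} \{1\} & x < w \\ [0, 1] & x = w \\ \{0\} & x > w \end{cases}$$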

Example cont.
• In a similar way we get the best-response correspondence for player I's low-cost type: $x^*(y) = \{1\}$ if $y < 1/2$, $[0, 1]$ if $y = 1/2$, and $\{0\}$ if $y > 1/2$
• The Bayesian equilibria are the intersections of the two best-response correspondences (for a fixed value of the prior p)
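The intersection of the two best-response correspondences can be checked numerically. The sketch below is an illustration under stated assumptions: p is taken to be the prior probability of the HIGH cost type, x the probability that the low-cost type builds, and y the probability that player II enters. Payoffs are read directly from the example's tables, the high-cost type is fixed at its dominant action DB, and all function names are hypothetical. Only three candidate profiles are tested, so the continuum of equilibria that appears at the boundary case p = 1/2 is not enumerated.

```python
from fractions import Fraction

# Payoffs (player I, player II) from the example tables, keyed by (action I, action II).
LOW = {("B", "E"): (Fraction(3, 2), -1), ("B", "D"): (Fraction(7, 2), 0),
       ("DB", "E"): (2, 1), ("DB", "D"): (3, 0)}
HIGH = {("B", "E"): (0, -1), ("B", "D"): (2, 0),
        ("DB", "E"): (2, 1), ("DB", "D"): (3, 0)}


def u1_low(action, y):
    """Expected payoff of the low-cost type when player II enters with probability y."""
    return y * LOW[(action, "E")][0] + (1 - y) * LOW[(action, "D")][0]


def u2_enter(x, p):
    """Player II's expected payoff from entering (not entering always yields 0).

    The high-cost type plays its dominant action DB; the low-cost type builds
    with probability x.
    """
    high_part = HIGH[("DB", "E")][1]
    low_part = x * LOW[("B", "E")][1] + (1 - x) * LOW[("DB", "E")][1]
    return p * high_part + (1 - p) * low_part


def is_equilibrium(x, y, p):
    """Check that x and y are mutual best responses."""
    gain_build = u1_low("B", y) - u1_low("DB", y)
    gain_enter = u2_enter(x, p)
    ok_x = (gain_build > 0 and x == 1) or (gain_build < 0 and x == 0) or gain_build == 0
    ok_y = (gain_enter > 0 and y == 1) or (gain_enter < 0 and y == 0) or gain_enter == 0
    return ok_x and ok_y


def equilibria(p):
    """Return the candidate profiles (x, y) that are Bayesian equilibria for prior p."""
    p = Fraction(p)
    candidates = [(Fraction(0), Fraction(1)), (Fraction(1), Fraction(0))]
    if p < 1:
        w = 1 / (2 * (1 - p))
        if w <= 1:
            candidates.append((w, Fraction(1, 2)))  # mixed candidate
    return [(x, y) for x, y in candidates if is_equilibrium(x, y, p)]


print(equilibria(Fraction(1, 4)))
print(equilibria(Fraction(3, 4)))
```

Exact rational arithmetic is used so that the indifference conditions of the mixed candidate are tested without floating-point tolerance.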

Sender-Receiver Games
• The game has two players: a sender and a receiver
• The sender sends a signal to the receiver, who then chooses an appropriate action
• Player I (the sender) has private information about his type
• Player II (the receiver) has only one type, which is considered common knowledge
• "A move by nature"

Strategies in S-R games
• A pure strategy for the sender is a mapping $m : \Theta \to M$ from types to messages
• Let $\sigma(m \mid \theta)$ be the probability that a type-$\theta$ sender sends message m; this defines a mixed strategy for the sender
• For the receiver, let $\rho(a \mid m)$ be a mixed strategy (the probability of choosing action a if the message is m)
• On-the-path messages: $M^+(\sigma) = \{ m : \exists\, \theta \in \Theta,\ \sigma(m \mid \theta) > 0 \} = \bigcup_{\theta \in \Theta} \operatorname{supp} \sigma(\cdot \mid \theta)$

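Payoffs in S-R games
• Sender's (expected) payoff: $\sum_{a \in A} \rho(a \mid m)\, u(m, a, \theta)$
• For the receiver?
  • Must consider every type and every message
  • $E(v(a, \sigma)) = \sum_{m \in M} \sum_{\theta \in \Theta} p(\theta)\, \sigma(m \mid \theta)\, v(m, a, \theta)$, where $p(\theta)$ is the prior over types and v is the receiver's payoff function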
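The sender's expected payoff is a probability-weighted sum over the receiver's actions, which is straightforward to compute for finite sets. The sketch below assumes the receiver's mixed strategy is stored as a dict of dicts; the messages, actions, types, and payoff numbers are invented purely for illustration, and `best_messages` is a hypothetical helper that returns the messages maximizing this payoff for a given type.

```python
from fractions import Fraction


def sender_expected_payoff(message, theta, rho, u):
    """Sum over actions a of rho(a | m) * u(m, a, theta)."""
    return sum(prob * u(message, action, theta) for action, prob in rho[message].items())


def best_messages(theta, messages, rho, u):
    """Messages that maximize the type-theta sender's expected payoff against rho."""
    values = {m: sender_expected_payoff(m, theta, rho, u) for m in messages}
    best = max(values.values())
    return {m for m, value in values.items() if value == best}


# Illustrative data (assumed): sending "high" is costly, more so for the weak type.
def u(message, action, theta):
    reward = 2 if action == "accept" else 0
    cost = {"strong": 1, "weak": 3}[theta] if message == "high" else 0
    return reward - cost


rho = {
    "high": {"accept": Fraction(1), "reject": Fraction(0)},
    "low": {"accept": Fraction(1, 4), "reject": Fraction(3, 4)},
}

for theta in ("strong", "weak"):
    print(theta, best_messages(theta, ["high", "low"], rho, u))
```

In this made-up instance the two types prefer different messages, so the receiver's strategy would separate them.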

Bayes' rule in action
• For any on-the-path message m, the receiver's posterior belief that player I is of type θ is $p_B(\theta \mid m)$:
  $$p_B(\theta \mid m) = \frac{p(\theta)\, \sigma(m \mid \theta)}{\sum_{\theta' \in \Theta} p(\theta')\, \sigma(m \mid \theta')}$$
• POSTERIOR rule for updating PRIOR beliefs!
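As a small numerical illustration of this update, the sketch below computes the receiver's posterior from a prior and a sender strategy stored as dictionaries. The type names, messages, and probabilities are assumptions made up for the example. For a message that no type sends (an off-the-path message) the denominator is zero and Bayes' rule does not apply, which the function signals with an exception.

```python
from fractions import Fraction


def posterior(prior, sigma, message):
    """Receiver's posterior over sender types after observing `message`.

    prior: {type: p(type)}; sigma: {type: {message: sigma(message | type)}}.
    """
    joint = {theta: prior[theta] * sigma[theta].get(message, 0) for theta in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("off-the-path message: Bayes' rule does not apply")
    return {theta: weight / total for theta, weight in joint.items()}


# Illustrative numbers (assumed): the strong type always sends "a",
# the weak type mixes evenly between "a" and "b".
prior = {"strong": Fraction(3, 4), "weak": Fraction(1, 4)}
sigma = {
    "strong": {"a": Fraction(1)},
    "weak": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
}

print(posterior(prior, sigma, "a"))
print(posterior(prior, sigma, "b"))
```

Observing "b" reveals the weak type with certainty, while "a" only shifts the belief towards the strong type.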

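Bayes Equilibrium in S-R games
• In a Sender-Receiver game the Bayesian equilibrium is a triple $(\sigma, \rho, p_B) \in (\Delta(M))^{\Theta} \times (\Delta(A))^{M} \times (\Delta(\Theta))^{M}$ satisfying the following conditions:
  • For all types $\theta \in \Theta$: $\operatorname{supp} \sigma(\theta) \subset M^*(\rho, \theta)$, the set of the type-$\theta$ sender's best messages against ρ
  • For all on-the-path messages $m \in M^+(\sigma)$: $\operatorname{supp} \rho(m) \subset A^*(p_B, m)$, the set of the receiver's best actions given the posterior belief after m
  • The conditional posterior belief system $p_B$ is consistent with Bayes' rule whenever possible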

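Perfect Bayesian Equilibrium
• In a Sender-Receiver game the perfect Bayesian equilibrium is a triple $(\sigma, \rho, p_B) \in (\Delta(M))^{\Theta} \times (\Delta(A))^{M} \times (\Delta(\Theta))^{M}$ satisfying the following conditions:
  • For all types $\theta \in \Theta$: $\operatorname{supp} \sigma(\theta) \subset M^*(\rho, \theta)$, the set of the type-$\theta$ sender's best messages against ρ
  • For all messages $m \in M$, on or off the path: $\operatorname{supp} \rho(m) \subset A^*(p_B, m)$, the set of the receiver's best actions given the posterior belief after m
  • The conditional posterior belief system $p_B$ is consistent with Bayes' rule whenever possible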

Other types of Equilibrium
• Perfect Bayesian equilibrium in Multi-Stage Games
  • Posterior beliefs are independent, and all types of player i have the same beliefs
  • Bayes' rule is used to update beliefs (history information?)
  • "No signalling what you don't know"
  • Posterior beliefs need to be consistent with a common joint distribution
• Extensive-Form games: Sequential Equilibrium
  • See Fudenberg, D., and J. Tirole, Game Theory, 1991, The MIT Press, p. 337-341
• Trembling-Hand Perfect Equilibrium (p. 351-356)
• Proper equilibrium (p. 356-359)

Bayesian games in CS
• Ad Hoc Networks
• Auctions
• Social learning
• The web search game
• Voting

Ad hoc networks
• MANET = Mobile Ad Hoc Network
  • A set of mobile hosts, each with a transceiver
  • No base stations; no fixed network infrastructure
  • Multi-hop communication
  • Routing and packet forwarding take place in a dynamic network topology
• Game Theory and MANET?
  • Routing mechanisms for "selfish cooperation"

Cooperation in MANET
• Reference: Modelling cooperation in Mobile Ad Hoc Networks: A formal description of Selfishness; Urpi A., Bonuccelli M., and Giordano S.
• Modelling Ad Hoc Networks with Bayesian games
  • The nodes are the players
  • Nodes have to periodically select whether to forward or not
  • Nodes have incomplete information about the total traffic in the network
  • Nodes have local information about their neighbourhood

cont.
• Important issues for each node
  • Energy consumption
  • The packets are forwarded by someone
• "A shared medium"
  • Packets are sent to every node that is within the transmission range
• Prior to choosing its next action, a node has an opportunity to analyze its neighbours' past behaviour
• A node must decide to whom to send packets and whose packets to discard

The Model
• Time is discrete and divided into timeslots (frames) $t_1, \dots, t_n$
• Node i has the following information at the beginning of frame $t_k$:
  • $N_i(t_k)$: the set of neighbours, assumed to be fixed during a single frame
  • $B_i(t_k)$: the remaining energy units (in the battery)
  • $T_{ij}(t_k)$: the traffic node i generated as a source and has to send to node j during frame $t_k$ (for each node j in node i's neighbourhood)
  • $F_{ij}(t_{k-1})$: the number of packets that j forwarded for i during the previous frame
  • $R_{ij}(t_{k-1})$: the number of packets i received from j during the previous frame
  • $\hat{R}_{ij}(t_{k-1})$: the number of packets i received from j during the previous frame as a final destination

The forwarding game
• Nodes are the players
• Player i's type is its energy class $e(i) = \alpha$, where $0 \le \alpha \le 1$
• As an action, player i sets $S_{ij}(t_k)$, the number of packets she will send to node j, and $F_{ij}(t_k)$, the number of packets received from j during the previous frame that she will forward
• Player i's payoff is $\alpha_{e(i)} W_i(t_k) + (1 - \alpha_{e(i)})\, G_i(t_k)$, where
  • $W_i(t_k)$ is a measure of the energy spent successfully
  • $G_i(t_k)$ is the ratio of sent packets over packets that player i wanted to send
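The payoff is a convex combination of the two measures, weighted by the node's energy class. The sketch below only illustrates that arithmetic: the slide does not say how W_i is computed, so it is taken as a given number, and the function names as well as the convention for a node that wanted to send nothing are assumptions.

```python
def sent_ratio(sent, wanted):
    """G_i: ratio of sent packets over packets the node wanted to send.

    Assumption: a node that wanted to send nothing is treated as fully
    satisfied (ratio 1.0); the slides do not specify this case.
    """
    if wanted == 0:
        return 1.0
    return sent / wanted


def forwarding_payoff(alpha, w, g):
    """Payoff alpha * W_i + (1 - alpha) * G_i for energy-class weight alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("energy-class weight must lie in [0, 1]")
    return alpha * w + (1.0 - alpha) * g


# Illustrative values (assumed): half of the wanted packets were sent and
# all spent energy counted as successful.
print(forwarding_payoff(0.5, 1.0, sent_ratio(1, 2)))
```

A weight close to 1 makes the node care mostly about how well its energy was spent, while a weight close to 0 makes it care mostly about getting its own packets delivered.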

The forwarding game cont.
• Player i has a prior belief about the energy class of every player j in its neighbourhood
• A node tries to maximize its payoff function: SELFISHNESS
• We need to analyze the game as a repeated (dynamic) game and provide a utility function that makes it profitable for player i to cooperate

Problems
• How to get the forwarding information?
• A badly defined utility function and/or policy leads to self-destruction
• The usage of time slots
  • There is no synchronization!
• Too simple a decision space?
• Possible other constraints

Problems cont.
• Malicious and selfish users?
  • Need a stronger policy
• Punishing vs. encouraging
  • Punishing is more suitable because:
    • How to reward agents? (Better throughput in a network with no authority?)
    • Punishing suits both malicious and selfish users; encouraging/rewarding is suitable only for encouraging cooperation
• Theorem: Cooperation can be enforced in a mobile ad hoc network, provided that enough members agree on it and that no node has to forward more traffic than it generates.

Additional References
• Srinivasan V., Nuggehalli P., Chiasserini C-F., and Rao R. R., Cooperation in wireless ad hoc networks. In Proceedings of IEEE Infocom 2003. http://citeseer.nj.nec.com/568937.html
• Michiardi P., and Molva R., Game theoretic analysis of security in mobile ad hoc networks. Technical Report RR-02-070, Institut Eurecom, 2002.

Types of Bayesian games
• Static Bayesian games
• Dynamic Bayesian games
  • Sender-Receiver Games
  • Extensive Form Games
  • Multi-Stage Games
• Equilibrium concepts
  • Bayesian Equilibrium = Bayes-Nash Equilibrium
  • Bayes Equilibrium (in dynamic games)
  • Perfect Bayes Equilibrium

Applications of static Bayesian games
• Packet forwarding in ad hoc networks
• Voting mechanisms
• Auction mechanisms
• = MULTI-AGENT SYSTEMS
• Requires:
  • Simultaneous competition
  • Multiple agents with incomplete information
• The competition can also be non-simultaneous if the agents/players don't know each other's decisions (but have the same beliefs that affect their decision-making)

Applications of dynamic Bayesian games
• Many economic applications
• Design model for network protocols?
• Design model for multiprocessor architectures?
• Bayesian games are a suitable tool for modelling situations where there is interaction between two or more agents and the prior information is incomplete

References
• Fudenberg, D., and J. Tirole, Game Theory, 1991, The MIT Press
• Kockesen L., Bayesian Games, http://www.columbia.edu/~lk290/ugbayes.pdf
• Ratliff J., Static Games of Incomplete Information
• Myatt D. P., Who Am I Playing? Incomplete Information and Bayesian Games, http://malroy.econ.ox.ac.uk/dpm/MPhilGameTheory/IncompleteStrategic.pdf
• Urpi A., Bonuccelli M., Giordano S., Modelling cooperation in mobile ad hoc networks: a formal description of selfishness

Additional Material
• Eyster E., and M. Rabin, Cursed Equilibrium, 2000
• Jackson M., Kalai E., Social Learning in Recurring Games
• Khoussainov R., and N. Kushmerick, Playing the Web Search Game
• Tenneholtz M., Robust Decision-Making in Multi-Agent Systems
