A Tutorial on Game Theory Daniel B. Neill Carnegie Mellon University April 2004
Outline • Lecture 1: • Basics of rational decision theory. • Games in normal form; pure and mixed Nash equilibria. • Games in extensive form; backward induction. • Lecture 2: • Bayesian games. • Auctions and negotiation. • Game theory vs. game playing. • Lecture 3: • Evolution and learning in games.
What is game theory? • Game theory can be defined as “the study of rational decision-making in situations of conflict and/or cooperation.” • What is a decision? • What is a rational decision? • What do we mean by conflict and cooperation?
What is game theory? (2) • The study of rational decision-making in situations of conflict and/or cooperation. • Decision: a player’s choice of what action to take, given some information about the state of the world. • The consequences of a player’s decision will be a function of his action, the actions of other players (if applicable) and the current state.
What is game theory? (3) • The study of rational decision-making in situations of conflict and/or cooperation. • Rational: a rational player will choose the action which he expects to give the best consequences, where “best” is according to his set of preferences. • For example, people typically prefer more money to less money, or pleasure to pain.
Rational decisions • A decision maker is assumed to have a fixed range of alternatives to choose from, and his choice influences the outcome of the situation. • Each possible outcome is associated with a real number: its utility. This can be subjective (how much the outcome is desired) or objective (how good the outcome actually is for the player). • In any case, the basic assumption of game theory is: A rational player will make the decision that maximizes his expected utility.
Types of decision situation • Decision making under certainty: would you prefer to be paid $100 or punched in the nose? • Consequences C(A) of each action A are known. • A rational agent chooses the action with the highest utility U(C(A)). • For most people, U(paid $100)>U(punched in nose) so they would choose the former. Note that we are only considering the rationality of actions, not preferences: a person who prefers a punch in the nose can still be rational under our definition!
Types of decision situation (2) • Decision making under risk: would you wager $100 on the flip of a fair coin? • For each action, a probability distribution over possible consequences P(C | A) is known. • A rational agent chooses the action with highest expected utility, EU(A) = Σ_C P(C | A) U(C). • For most people, ½ U(gain $100) + ½ U(lose $100) < 0 (the utility of declining), so they would not take this wager. Money is not utility, since most people are “risk-averse”!
Types of decision situation (3) • Decision making under uncertainty: would you rather go to the movies or to the beach? • Agents are assumed to have a subjective probability distribution over possible states of nature P(S). • The consequence of an action is assumed to be a deterministic function C(A, S) of the action A and the state S. • A rational agent chooses the action with the highest subjective expected utility, SEU(A) = Σ_S P(S) U(C(A, S)).
Types of decision situation (4) • Decision making under uncertainty: would you rather go to the movies or to the beach? • I believe there is a 40% chance that it will rain. • I will enjoy a movie whether it rains or not: U(C(movie, sun)) = U(C(movie, rain)) = 1. • I will not enjoy the beach if it is rainy, but I will have a great time if it is sunny: U(C(beach, rain)) = -1, U(C(beach, sun)) = 2. • SEU(movie) = 1. • SEU(beach) = 0.4(-1) + 0.6(2) = 0.8. I’m going to the movies!
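The subjective-expected-utility arithmetic above can be reproduced in a few lines; the utilities and the 40% rain probability are the slide's own numbers. A minimal sketch:

```python
# Subjective expected utility for the movies-vs-beach decision.
# Utilities and the 40% rain probability are taken from the slide.
p_rain = 0.4

utility = {
    ("movie", "rain"): 1, ("movie", "sun"): 1,
    ("beach", "rain"): -1, ("beach", "sun"): 2,
}

def seu(action):
    """SEU(A) = sum over states of P(state) * U(C(action, state))."""
    return p_rain * utility[(action, "rain")] + (1 - p_rain) * utility[(action, "sun")]

for action in ("movie", "beach"):
    print(action, seu(action))        # movie 1.0, beach 0.8

best = max(("movie", "beach"), key=seu)   # the rational choice: "movie"
```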
What is game theory? (4) • The study of rational decision-making in situations of conflict and/or cooperation. • Game theory typically deals with situations in which multiple decision-makers interact: the consequences for each player are affected not only by his choice but also by the choices of other players. • In zero sum games (ex. chess), one player’s gain is the other’s loss, while in non-zero sum games (ex. business agreements), it is possible for both players to simultaneously gain or lose.
Zero sum games • Examined in depth by Von Neumann and Morgenstern in the 1920s-1940s. • Most important result: the minimax theorem, which states that under common assumptions of rationality, each player will make the choice that maximizes his minimum expected utility. • This choice may be a pure strategy (always making the same choice) or a mixed strategy (a random choice between pure strategies).
Zero sum example [2×2 payoff matrix: P1 chooses A or B, P2 chooses a or b; payoffs not recoverable from the slide image.] Solution: P1 always prefers B; P2 (knowing this) prefers Bb to Ba. The value of the game is -1!
Mixed strategies Now consider a simple “soccer” game: there are two players, the kicker and the goalie. The kicker can shoot the ball either left or right; the goalie can dive either left or right. If the goalie chooses correctly, he blocks the ball; if the goalie chooses wrong, it’s a goal! What should each player do? [2×2 payoff matrix: kicker shoots L or R vs. goalie dives L or R.]
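A sketch of this game as a zero-sum matrix, assuming the kicker scores +1 for a goal and -1 for a block (the slide's exact payoffs are not shown, so treat these numbers as an assumption). At the 50/50 mix, neither player can gain by deviating:

```python
# "Soccer" game as a zero-sum matrix.  Rows: kicker shoots L/R;
# columns: goalie dives L/R; entries are the kicker's (assumed) payoff.
payoff = [[-1, 1],    # shoot L: blocked if goalie dives L, goal if R
          [1, -1]]    # shoot R: goal if goalie dives L, blocked if R

def kicker_value(p_shoot_left, q_dive_left):
    """Kicker's expected payoff when kicker mixes p and goalie mixes q."""
    p, q = p_shoot_left, q_dive_left
    return (p * q * payoff[0][0] + p * (1 - q) * payoff[0][1]
            + (1 - p) * q * payoff[1][0] + (1 - p) * (1 - q) * payoff[1][1])

# At the 50/50 mix the value is 0 whatever the opponent does,
# so ((1/2, 1/2), (1/2, 1/2)) is the mixed equilibrium.
print(kicker_value(0.5, 0.0), kicker_value(0.5, 1.0))   # 0.0 0.0
print(kicker_value(0.0, 0.5), kicker_value(1.0, 0.5))   # 0.0 0.0
```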
Non-zero sum games • Can be cooperative (players can make enforceable agreements: “I’ll cooperate if you do”) or non-cooperative (no prior agreements can be enforced). • In non-cooperative games, an agreement must be self-enforcing, in that players have no incentive to deviate from it. • Most important concept: Nash Equilibrium. • A combination of strategy choices such that no player can increase his utility by changing strategies. • Nash’s Theorem: every finite game has at least one NE, possibly in mixed strategies.
Non-zero sum example [2×2 payoff matrix: P1 chooses A or B, P2 chooses a or b.] Aa is a Nash equilibrium: each player gets 4 at Aa, but only 2 if he plays B instead! Are there any other Nash equilibria for this game?
Formal definitions • A game in normal form consists of: • A list of players i = 1…n • A finite set of strategies Si for each player i • A utility function ui for each player i, where ui : (S1 × S2 × … × Sn) → R. (ui maps a combination of players’ pure strategies to the payoff for player i). • Normal form gives no indication of the order of players’ moves; for this we need extensive form (more on this later). For now, assume that all players choose strategies simultaneously!
Formal definitions (2) • A Nash equilibrium is a set of strategies s1 ∈ S1, …, sn ∈ Sn, such that for each player i: si = arg max_s ui (s1, …, si-1, s, si+1, …, sn) • In other words, each player’s strategy si is a strategy s that maximizes his payoff, given the other players’ strategies; no player can do better by switching strategies.
Formal definitions (3) • A mixed strategy σi is a probability distribution over player i’s pure strategies si. • For example, if A and B are the pure strategies for P1, then σ1 might be (¼ A, ¾ B). • Then the utility of a mixed strategy profile is the expected payoff over the induced distribution on pure strategy profiles: ui (σ1, …, σn) = Σ_(s1, …, sn) σ1(s1) ··· σn(sn) ui (s1, …, sn). • Nash equilibrium in mixed strategies: a set of mixed strategies σ1 … σn, such that for each player i: σi = arg max_σ ui (σ1, …, σi-1, σ, σi+1, …, σn)
Computing Nash equilibria • A strategy si is strictly dominated if there exists another strategy ti in Si such that player i always scores better with ti than with si. • Assumption: A rational player will never play a strictly dominated strategy! • Moreover, since the other players know this, we can iteratively delete strictly dominated strategies. • order of deletion doesn’t matter • we will never delete a NE • if, after we do this, there is only one combination of strategies left, this is the unique NE of the game.
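The iterated-deletion procedure above can be sketched directly. The example matrices here are hypothetical (a prisoner's-dilemma-style game), chosen only to exercise the code; they are not a game from the slides:

```python
# Iterated deletion of strictly dominated strategies, two-player normal form.
# u1[r][c] / u2[r][c] give each player's payoff for row r, column c.

def strictly_dominated(payoffs, rows, cols, by_row):
    """Return one surviving strategy strictly dominated by another, or None."""
    own, other = (rows, cols) if by_row else (cols, rows)
    get = (lambda s, o: payoffs[s][o]) if by_row else (lambda s, o: payoffs[o][s])
    for s in own:
        for t in own:
            if t != s and all(get(t, o) > get(s, o) for o in other):
                return s
    return None

def iterated_elimination(u1, u2):
    rows, cols = set(range(len(u1))), set(range(len(u1[0])))
    while True:
        r = strictly_dominated(u1, rows, cols, by_row=True)
        if r is not None:
            rows.discard(r); continue
        c = strictly_dominated(u2, rows, cols, by_row=False)
        if c is not None:
            cols.discard(c); continue
        return rows, cols        # no more deletions possible

# Hypothetical example: strategy 1 (defect) strictly dominates for both players.
u1 = [[3, 0], [5, 1]]
u2 = [[3, 5], [0, 1]]
print(iterated_elimination(u1, u2))   # ({1}, {1}): a unique NE survives
```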
Strict domination example [3×3 game: P1 chooses A, B, or C; P2 chooses a, b, or c. The deletion sequence, reconstructed from the slide’s series of matrices:] • C strictly dominates B: delete B. • a strictly dominates c: delete c. • a strictly dominates b: delete b. • A strictly dominates C: delete C. Thus Aa is the unique Nash equilibrium!
Strict domination example (2) [The same deletion sequence, ending at Aa.] If both players are rational, and this is common knowledge, Aa will be the outcome of the game!
Computing Nash equilibria (2) • Strict domination is great… when it applies. • But for some games, no strategies can be eliminated by strict domination. What to do now? • We can check whether each combination of pure strategies is a Nash equilibrium: • Can any player do better by switching strategies? If not, then it’s a NE. [3×3 payoff matrix: P1 chooses A, B, or C; P2 chooses a, b, or c.] A nice example from Andrew: where is the Nash equilibrium?
Computing Nash equilibria (3) • Here’s a neat little trick for finding pure strategy NE in two player games: • For each column, color the box(es) with maximum payoff to P1 red. • For each row, color the box(es) with maximum payoff to P2 blue. • The Nash equilibria are the set of squares colored both red and blue (purple in our picture). [The same 3×3 matrix: P1 chooses A, B, or C; P2 chooses a, b, or c.] A nice example from Andrew: where is the Nash equilibrium?
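The coloring trick translates naturally to code: mark each player's best responses and intersect the two sets. The example matrices are hypothetical, not the slide's game:

```python
# The red/blue coloring trick for pure-strategy Nash equilibria.
# u1, u2 are the two players' payoff matrices (rows = P1, columns = P2).

def pure_nash_equilibria(u1, u2):
    n_rows, n_cols = len(u1), len(u1[0])
    # "Red": for each column, the rows maximizing P1's payoff.
    red = {(r, c) for c in range(n_cols)
           for r in range(n_rows)
           if u1[r][c] == max(u1[i][c] for i in range(n_rows))}
    # "Blue": for each row, the columns maximizing P2's payoff.
    blue = {(r, c) for r in range(n_rows)
            for c in range(n_cols)
            if u2[r][c] == max(u2[r][j] for j in range(n_cols))}
    return sorted(red & blue)   # the "purple" squares

# Hypothetical 2x2 game with two pure equilibria.
u1 = [[2, 0], [1, 1]]
u2 = [[2, 1], [0, 1]]
print(pure_nash_equilibria(u1, u2))   # [(0, 0), (1, 1)]
```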
Computing Nash equilibria (4) • What if there are no pure strategy equilibria? • By Nash’s Theorem, a mixed strategy equilibrium must exist. • For a mixed strategy equilibrium, each player chooses a mixture of strategies that makes the other players indifferent between the strategies over which they are mixing. • If P1 chooses (½ A, ½ B), P2 is indifferent between a and b, and if P2 chooses (½ a, ½ b), P1 is indifferent between A and B. • Thus ((½ A, ½ B), (½ a, ½ b)) is a NE. [2×2 payoff matrix: P1 chooses A or B, P2 chooses a or b.] This game can be thought of as a variant on “matching pennies,” where the winner gets four points and the loser none.
Computing Nash equilibria (5) • What about this game? • Aa and Bb are both pure strategy NE. Are there any mixed NE? • Assume there is a mixed strategy NE with P1(A) = x, and P2(a) = y. • For P1 to be indifferent between A and B: 4y + 0(1-y) = 2y + 1(1-y) ⇒ y = ⅓. • For P2 to be indifferent between a and b: 4x + 0(1-x) = 3x + 1(1-x) ⇒ x = ½. [Payoffs (P1, P2), as implied by these equations: Aa = (4,4), Ab = (0,3), Ba = (2,0), Bb = (1,1).] Mixed strategy NE: ((½ A + ½ B), (⅓ a + ⅔ b)).
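The two indifference equations above can be checked with exact arithmetic:

```python
# Solving the slide's indifference conditions exactly.  y = P2's probability
# of playing a; x = P1's probability of playing A.
from fractions import Fraction

# P1 indifferent between A and B: 4y = 2y + (1 - y)  =>  3y = 1.
y = Fraction(1, 3)
assert 4 * y == 2 * y + (1 - y)

# P2 indifferent between a and b: 4x = 3x + (1 - x)  =>  2x = 1.
x = Fraction(1, 2)
assert 4 * x == 3 * x + (1 - x)

print(x, y)   # 1/2 1/3  ->  NE: ((1/2 A + 1/2 B), (1/3 a + 2/3 b))
```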
Extensive form games • An extensive form game is a “game tree”: a rooted tree where each non-terminal node represents a choice that a player must make, and each terminal node gives payoffs for all players. • For example, in this 3-player game, first P1 must choose between L and R. • Assume he chooses R: then P3 must choose between X and Y. • Assume P3 chooses Y: then player 1 scores 3, players 2 and 3 score 5, and the game ends. [Game tree: P1 chooses L or R; after L, P2 chooses A, B, or C; after R, P3 chooses X or Y; after X, P2 chooses U or D. Terminal payoffs (P1, P2, P3): (3,5,5), (1,0,1), (0,0,8), (7,3,4), (9,6,1), (0,5,6).]
Solving extensive form games • We use a procedure called “backward induction,” reasoning backward from the terminal nodes of the tree. • At node a, P2 would maximize his utility by choosing C, scoring 3 points instead of 0. • At node b, P2 would choose U. • Now, what would P3 choose at his decision node? [The same game tree: P1 chooses L or R; after L, P2 (node a) chooses A, B, or C; after R, P3 chooses X or Y; after X, P2 (node b) chooses U or D. Terminal payoffs (P1, P2, P3): (3,5,5), (1,0,1), (0,0,8), (7,3,4), (9,6,1), (0,4,6).]
Solving extensive form games • Solution depends on common knowledge of rationality: • Since P3 knows P2 is rational and will choose U at node b, P3 knows he will only get 1 if he chooses X. Thus he chooses Y instead. • Now, P1 knows he will get 7 if he chooses L, or 3 if he chooses R. Thus he chooses L. • The value of the game is (7, 3, 4). • Backward induction gives a unique solution to any extensive form game with perfect information (see below) and no ties in payoffs.
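Backward induction itself is a short recursion. The tree below uses terminal payoffs reconstructed from the slide's figure, so treat the exact numbers as an assumption; the procedure is the standard one just described:

```python
# Backward induction on the 3-player game tree.  Payoffs are reconstructed
# from the slide figure (an assumption); the algorithm itself is standard.

def backward_induction(node):
    """Return (payoff_vector, move chosen at this node)."""
    if isinstance(node, tuple):            # terminal node: payoffs (P1, P2, P3)
        return node, None
    player, moves = node                   # internal node: [player_index, {move: subtree}]
    best_move, best_payoffs = None, None
    for move, subtree in moves.items():
        payoffs, _ = backward_induction(subtree)
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_move, best_payoffs = move, payoffs
    return best_payoffs, best_move

tree = [0, {                               # P1 (index 0) chooses L or R
    "L": [1, {"A": (1, 0, 1),              # node a: P2 chooses A, B, or C
              "B": (0, 0, 8),
              "C": (7, 3, 4)}],
    "R": [2, {"X": [1, {"U": (9, 6, 1),    # P3 chooses X or Y; node b: P2 chooses U or D
                        "D": (0, 4, 6)}],
              "Y": (3, 5, 5)}]}]

value, first_move = backward_induction(tree)
print(first_move, value)                   # L (7, 3, 4)
```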
Games with imperfect information In games with perfect information, at each node in the tree, the player knows exactly where in the tree he is. In games with imperfect information, this may not be true. For example, in this game, if P1 chooses L or M, P2 must choose U or D, without knowing whether he is at node a or node b. Nodes a and b are part of the same “information set.” Games with imperfect information tend to be much harder to solve! [Game tree: P1 chooses L, M, or R; after L (node a) or M (node b), P2 chooses U or D. Payoffs (P1, P2): L,U = (4,1); L,D = (1,4); M,U = (1,4); M,D = (4,1); R = (k,3).] What should P1 choose if k = 3? And if k = 2?
Strategies in extensive form games • A pure strategy si for player i consists of a choice for each of player i’s information sets. • In a game with perfect information, each information set consists of a single decision node. • In the 3-player game tree from before, P1 has 2 strategies: (L) and (R). P2 has 6 strategies: (A, U), (A, D), (B, U), (B, D), (C, U), and (C, D). • Mixed strategies σi are defined by randomizing over pure strategies as before; pure and mixed Nash equilibria are defined as above.
Transforming games from extensive to normal form [Normal form of the imperfect-information game with k = 2. P1’s strategies: L, M, R; P2’s strategies: U, D. Payoffs (P1, P2): L,U = (4,1); L,D = (1,4); M,U = (1,4); M,D = (4,1); R,U = R,D = (2,3).] This game has no pure Nash equilibria; its mixed equilibrium is ((½ L + ½ M), (½ U + ½ D)).
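The stated mixed equilibrium can be verified numerically. The payoffs used below, (L,U) = (4,1), (L,D) = (1,4), (M,U) = (1,4), (M,D) = (4,1), and (2,3) for R, are reconstructed from the slide figure, so treat them as an assumption:

```python
# Checking the mixed equilibrium of the flattened (k = 2) game.
from fractions import Fraction

half = Fraction(1, 2)
u1 = {("L", "U"): 4, ("L", "D"): 1, ("M", "U"): 1, ("M", "D"): 4,
      ("R", "U"): 2, ("R", "D"): 2}
u2 = {("L", "U"): 1, ("L", "D"): 4, ("M", "U"): 4, ("M", "D"): 1,
      ("R", "U"): 3, ("R", "D"): 3}

# P1's payoff against P2's (1/2 U + 1/2 D) mix:
p1_payoff = {s: half * u1[(s, "U")] + half * u1[(s, "D")] for s in ("L", "M", "R")}
print(p1_payoff)   # L and M both give 5/2, R only 2: mixing over L, M is a best reply

# P2's payoff against P1's (1/2 L + 1/2 M) mix:
p2_payoff = {t: half * u2[("L", t)] + half * u2[("M", t)] for t in ("U", "D")}
print(p2_payoff)   # U and D both give 5/2: P2 is indifferent, so this is an equilibrium
```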
Transforming games from extensive to normal form (2) [The same game with k = 3, so that R yields (3,3) against either of P2’s strategies.] This game has no pure Nash equilibria; its mixed equilibria are (R, (qU + (1-q)D)) for ⅓ ≤ q ≤ ⅔.
Transforming games from extensive to normal form (3) [Here P2 observes P1’s move, so his strategies specify a choice at each node: Uu, Ud, Du, Dd (first letter: choice after L; second: choice after M). Payoffs as before: L,U = (4,1); L,D = (1,4); M,u = (1,4); M,d = (4,1); R = (k,3).] This game has a pure Nash equilibrium (R, Du) for all k > 1.
“Implausible” threats A rational agent comes up to you holding a grenade, and says “Give me $100 or I’ll blow us both up.” Do you believe him? [Game tree: you choose Pay or Don’t pay; he then chooses whether to detonate. Payoffs (you, him): Pay, no detonation = (-100, 100); Don’t pay, no detonation = (0, 0); detonation = (-∞, -∞) either way.] What if you don’t know that he’s rational? Another nice example from Andrew’s slides!
What to do when there are multiple Nash equilibria? NE says nothing about which equilibrium should be played. Various refinements of NE have been proposed, with the goal of separating “reasonable” from “unreasonable” equilibria. For example, Ab and Ba are both NE of the game at right. We would like to eliminate Ab, since P2 is playing a weakly dominated strategy. Assume a “trembling hand”: what if P1 will accidentally play B at equilibrium Ab, or A at equilibrium Ba, with some small probability? [2×2 payoff matrix: P1 chooses A or B, P2 chooses a or b; payoffs not recoverable from the slide image.] Ba is a “perfect equilibrium” (Selten, 1975), and Ab is not.
What to do when there are multiple Nash equilibria? (2) In “coordination games” such as this one, traditional refinements fail; for instance, both equilibria are “perfect.” How to choose between equilibria? Choose the “Pareto dominant” NE (Aa)? Choose the “risk dominant” NE (Bb)? Maybe evolutionary games can help! In a population playing this coordination game, where players’ choices evolve over time, which strategy is more likely to dominate the population? More on this in lecture 3! [2×2 coordination game: P1 chooses A or B, P2 chooses a or b; payoffs not recoverable from the slide image.] Pareto dominant: higher payoffs if the opponent coordinates. Risk dominant: higher payoffs if the opponent randomizes 50/50.
Outline • Lecture 1: • Basics of rational decision theory. • Games in normal form; pure and mixed Nash equilibria. • Games in extensive form; backward induction. • Lecture 2: • Bayesian games. • Auctions and negotiation. • Game theory vs. game playing. • Lecture 3: • Evolution and learning in games.
Bayesian games • All the games we have examined so far are games of complete information: players have common knowledge of the structure of the game and the payoffs. • What if the players do not know some of the parameters of the game? • For example, consider the “entry” game: • P1 and P2 are businesses; P1 currently controls the market. • P1 must choose whether to invest money in upgrading its product (invest or wait). • P2 must choose whether to enter the market (enter or don’t). • If P2 enters the market, it will obtain a decent market share only if P1 has not invested. • The catch: P2 doesn’t know P1’s upgrade cost!
Solving Bayesian games • Harsanyi’s solution: • Each player has a type, representing all the private information the player has that is not common knowledge. • Players do not know other players’ types, but they do know what possible types a player can have. • The game starts with a random move by Nature, which assigns a type to each player. • The probability distribution over types is assumed to be common knowledge. • This transforms the game from incomplete information to imperfect information, and this we can solve!
Solving Bayesian games (2) • For simplicity, let us assume P1 has two possible types: the cost of investment is either high or low, with a corresponding payoff table for each case. [Two 2×2 tables, “High cost” and “Low cost”: P1 chooses invest or wait, P2 chooses enter or don’t.] • Further assumption: Pr(high) = ¼, Pr(low) = ¾. • P2 only has a single type. • What should each player do in this situation?
Solving Bayesian games (3) • What should each player do? • If P1’s cost is high, he should always wait. • If P1’s cost is low, he should always invest. • Thus P2 knows that P1 will invest with probability ¾ and wait with probability ¼. • So P2’s expected gain from entering is ¾ (-1) + ¼ (1) = -½ < 0, and P2 should not enter. • This game has a unique “Bayes-Nash equilibrium” in pure strategies: s1(high) = wait; s1(low) = invest; s2 = don’t enter.
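The equilibrium check is a one-line expected-value computation, using only the slide's numbers (Pr(high) = ¼, and P2 gains -1 from entering against an investing incumbent, +1 against a waiting one, 0 from staying out):

```python
# Checking the Bayes-Nash equilibrium arithmetic for the entry game.
from fractions import Fraction

p_high, p_low = Fraction(1, 4), Fraction(3, 4)

# High-cost P1 waits, low-cost P1 invests, so P1 invests with probability 3/4.
p_invest = p_low

gain_enter = p_invest * (-1) + (1 - p_invest) * (+1)
print(gain_enter)       # -1/2: entering is worse than staying out (payoff 0)
assert gain_enter < 0   # so P2's best reply is "don't enter"
```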
Solving Bayesian games (4) • Here’s a harder case (with modified payoff tables): • If P1’s cost is high, he should always wait. • Let x = P(P1 invests | low cost), and y = P(P2 enters). • If P1’s cost is low, his expected gain from investing is 3 - (2y + 4(1-y)) = 2y - 1. • P2’s expected gain from entering is: ¾ (-1(x) + 1(1-x)) + ¼ (1) = 1 - 3x/2. • Thus we know: x > ⅔ ⇒ y = 0; x < ⅔ ⇒ y = 1; y > ½ ⇒ x = 1; y < ½ ⇒ x = 0. • Mixed BNE: σ1(high) = wait; σ1(low) = invest w/ prob ⅔; σ2 = enter w/ prob ½.
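The indifference conditions pin down the mixed BNE exactly, using the slide's two expressions:

```python
# Solving the harder entry game's indifference conditions with exact fractions.
# x = P(low-cost P1 invests), y = P(P2 enters).
from fractions import Fraction

# Low-cost P1 indifferent: gain from investing 2y - 1 = 0.
y = Fraction(1, 2)
assert 2 * y - 1 == 0

# P2 indifferent: gain from entering 1 - 3x/2 = 0.
x = Fraction(2, 3)
assert 1 - Fraction(3, 2) * x == 0

print(x, y)   # 2/3 1/2: invest with prob 2/3 if low cost; enter with prob 1/2
```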
Bayesian games: formal definition • A Bayesian game consists of: • A list of players i = 1…n. • A finite set of types Θi for each player i. • A finite set of actions Ai for each player i. • A utility function ui for each player i, where ui : A × Θ → R. (ui maps a combination of players’ actions and types to the payoff for player i.) • A prior distribution on types, P(θ) for θ ∈ Θ. • Each pure strategy si ∈ Si for player i is a mapping si : Θi → Ai: a strategy maps types to actions.
Bayesian games: formal definition (2) • A Bayes-Nash equilibrium is a combination of strategies s1 ∈ S1, …, sn ∈ Sn, such that for each player i and each possible type θi ∈ Θi: si(θi) = arg max_s Σ_θ-i ui (s1(θ1), …, si-1(θi-1), s(θi), si+1(θi+1), …, sn(θn)) P(θ-i | θi). • At a BNE, no player type can increase his expected payoff (over the distribution of possible opponents’ types) by changing strategies. • Mixed strategies are distributions over pure strategies as before; mixed BNE are defined similarly.
What if the opponent’s set of strategies is unknown? • Consider the following variation on the rock-scissors-paper game: • There are three “taboo” cards, labeled “Rock,” “Scissors,” and “Paper” respectively. • Both players draw a card, then play rock-scissors-paper, with a catch: neither player is allowed to play the action on his taboo card. • What should each player do?
“Taboo” Rock-Scissors-Paper (1) First consider the simplest variant, where each player gets to see the other’s card. There are six possible games, depending on who draws which card. For example, if P1 draws Rock and P2 draws Scissors, we have the game shown here: [P1 (taboo Rock) plays S or P; P2 (taboo Scissors) plays R or P. P1’s payoffs: S vs R = -1, S vs P = +1, P vs R = +1, P vs P = 0.] Mixed NE: P1 plays (⅓ S, ⅔ P), and P2 plays (⅓ R, ⅔ P). Expected payoffs: (⅓, -⅓). In half of these games, P1 has an advantage (value of the game = ⅓) and in half P2 has an advantage (value of the game = -⅓).
“Taboo” Rock-Scissors-Paper (2) Next consider an unfair variant, where P2 gets to see P1’s card, but not vice-versa. The game is symmetric w.r.t. P1’s draw; assume wlog that he draws Rock, giving the Bayesian game shown here: [With probability ½, P2 holds Scissors and plays R or P; with probability ½, P2 holds Paper and plays R or S. P1 (taboo Rock) plays S or P without knowing P2’s card.] Bayesian Nash equilibrium: σ1 = (⅓ S, ⅔ P); σ2(S) = (⅔ R, ⅓ P); σ2(P) = (S). Work this out for practice! The value of the game is -1/6: P2 does have an advantage!
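Under the stated equilibrium strategies, the value can be computed by summing over P2's card and both players' mixes; the value works out to -1/6. This is a verification sketch, not part of the original deck (the payoff function below is just ordinary rock-scissors-paper):

```python
# Value of the unfair taboo game under the stated equilibrium.
# P1 (holding Rock) mixes (1/3 S, 2/3 P); P2 sees P1's card, holds S or P with
# probability 1/2 each, playing (2/3 R, 1/3 P) with S and always S with P.
from fractions import Fraction

def rsp(a, b):
    """P1's payoff in rock-scissors-paper: +1 win, 0 tie, -1 loss."""
    beats = {("R", "S"), ("S", "P"), ("P", "R")}
    return 1 if (a, b) in beats else (-1 if (b, a) in beats else 0)

third = Fraction(1, 3)
sigma1 = {"S": third, "P": 2 * third}             # P1's mix (taboo: Rock)
sigma2 = {"S": {"R": 2 * third, "P": third},      # P2's mix, keyed by P2's card
          "P": {"S": Fraction(1)}}

value = sum(Fraction(1, 2) * p1 * p2 * rsp(a, b)
            for card, mix in sigma2.items()
            for a, p1 in sigma1.items()
            for b, p2 in mix.items())
print(value)   # -1/6: P2's peek is worth a sixth of a point per game
```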
“Taboo” Rock-Scissors-Paper (3) • Now let’s make it more fun: neither player gets to see the opponent’s card! • What should each player do in this situation? • Play a couple games yourself and try to work out the solution… answer on next slide. • Hint: the game is symmetric, with respect to the players and with respect to the cards drawn.
“Taboo” Rock-Scissors-Paper (4) By symmetry, we know P1,R(S) = P2,R(S) = P1,S(P) = P2,S(P) = P1,P(R) = P2,P(R) = x, where Pi,C(m) denotes the probability that player i plays move m when holding taboo card C. Then P1,R(P) = P2,R(P) = P1,S(R) = P2,S(R) = P1,P(S) = P2,P(S) = 1-x. Wlog assume P1 draws R; P2 does not know this. Then P2 has S or P with equal probability; what should P1 do? Expected payoff for P1,R(S) = ½ (-P2,S(R) + P2,S(P)) + ½ (-P2,P(R)) = -½ (1-x) + ½x - ½x = ½(x-1). Expected payoff for P1,R(P) = ½ (P2,S(R)) + ½ (P2,P(R) - P2,P(S)) = ½ (1-x) + ½x - ½ (1-x) = ½x.
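The derivation can be finished from the two expressions above: ½x exceeds ½(x-1) by exactly ½ for every x, so playing P always beats playing S, and the symmetric equilibrium must have x = 0 (each player always plays the card that beats his taboo card; by symmetry the value of the game is then 0). A quick check:

```python
# E[S | taboo R] = (x - 1)/2 and E[P | taboo R] = x/2, as derived above.
from fractions import Fraction

def payoff_S(x):
    return (x - 1) / 2

def payoff_P(x):
    return x / 2

# P's edge over S is exactly 1/2 for every x, so no mix makes a player
# indifferent: the equilibrium has x = 0.
for x in (Fraction(k, 10) for k in range(11)):
    assert payoff_P(x) - payoff_S(x) == Fraction(1, 2)
print("P beats S by", Fraction(1, 2), "for every x")
```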