Punishment, Detection, and Forgiveness in Repeated Games
The Stage Game • Prisoners’ dilemma structure applies in many situations • Lovers or roommates • Colluding oligopolists • Arms control agreements • Common-pool resources • Cooperating vampire bats • …many more
Working Example: Prisoners’ Dilemma • [Payoff matrix not recovered: rows are Player 1's actions (Cooperate, Defect), columns are Player 2's actions (Cooperate, Defect)] • Assume T>R>0
Stage Game • In the stage game, Defect is a dominant strategy for both players. • So the only Nash equilibrium has them both playing Defect and each getting a payoff of 0. • Both would be better off if they both cooperated, but how to enforce that?
Repeated Play • Suppose that after each round of play, players are told their payoff on the previous round and, with probability d (0<d<1), they go on to play another round. • Can we get cooperative play by having each player threaten to punish a defection?
Punishment and forgiveness • Grim trigger (no forgiveness): I will cooperate until you defect, but if you ever defect, I will defect in all future rounds. • Conditional N-period punishment: If you defect, I will start to defect and I will keep defecting until I have seen you cooperate N times in a row. Then I will cooperate so long as you do not defect.
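The two punishment rules can be written as functions of the other player's past actions. This is a minimal sketch, not part of the original slides: the action labels "C" and "D", the function names, and the list-of-actions history are all assumptions made for illustration.

```python
# Actions are the strings "C" (cooperate) and "D" (defect).
# Each strategy maps the opponent's past actions to my next action.

def grim_trigger(opponent_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"


def n_period_punishment(opponent_history, n):
    """After any defection, defect until the opponent cooperates n times in a row."""
    if "D" not in opponent_history:
        return "C"
    # Position of the opponent's most recent defection.
    last_defect = len(opponent_history) - 1 - opponent_history[::-1].index("D")
    # Number of consecutive cooperations seen since that defection.
    streak = len(opponent_history) - 1 - last_defect
    return "C" if streak >= n else "D"


if __name__ == "__main__":
    history = ["C", "D", "C", "C"]
    print(grim_trigger(history))            # never forgives
    print(n_period_punishment(history, 2))  # forgives after two cooperations
```

The only difference between the two rules is whether a defection can ever be "worked off": grim trigger looks at the whole history, the N-period rule only at what has happened since the latest defection.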
Symmetric SPNE with Grim Trigger • Suppose that the other player is playing Grim Trigger. • If you play Grim Trigger as well, then you will both cooperate as long as the game continues and you will each receive an expected payoff of $R \times (1 + d + d^2 + d^3 + d^4 + \cdots) = R/(1-d)$
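The closed form R/(1-d) comes from weighting the payoff in each round by the probability that the game reaches that round. A small numerical check, with hypothetical function names and example values chosen only for illustration:

```python
def partial_sum_payoff(R, d, rounds):
    """Sum of R * d**t for t = 0 .. rounds-1.

    d**t is the probability that the game is still going in round t+1,
    so each term is the expected payoff contributed by that round.
    """
    return sum(R * d**t for t in range(rounds))


def closed_form_payoff(R, d):
    """Limit of the partial sums: R / (1 - d), valid for 0 <= d < 1."""
    return R / (1 - d)


if __name__ == "__main__":
    R, d = 6, 0.9  # example values, not taken from the slides
    for rounds in (1, 10, 100, 500):
        print(rounds, partial_sum_payoff(R, d, rounds))
    print("closed form:", closed_form_payoff(R, d))
```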
When does grim trigger sustain cooperation? • If you defect and the other guy is playing Grim Trigger, you will get a payoff of T>R the first time that you defect. But after this, the other guy will always play Defect. The best you can do, then, is to always defect as well. • You both get zero when you both defect, so the expected payoff from defecting is just T+0=T • So both playing grim trigger and always cooperating is a SPNE if T<R/(1-d) • For example, if d=.9, grim trigger sustains cooperation if T<10R.
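The comparison above can be checked numerically. The sketch below assumes, as on the slide, that mutual defection pays 0; the function names and the example values of T and R are illustrative only.

```python
def cooperate_forever_payoff(R, d):
    """Expected payoff when both players cooperate in every round: R / (1 - d)."""
    return R / (1 - d)


def defect_against_grim_payoff(T):
    """Payoff T in the round of the defection, then 0 in every later round."""
    return T


def grim_trigger_sustains_cooperation(T, R, d):
    """True when a one-time defection does not pay: T < R / (1 - d)."""
    return defect_against_grim_payoff(T) < cooperate_forever_payoff(R, d)


if __name__ == "__main__":
    d, R = 0.9, 1
    # With d = 0.9 the condition is T < 10R.
    print(grim_trigger_sustains_cooperation(9, R, d))
    print(grim_trigger_sustains_cooperation(11, R, d))
```

Values of T exactly at the boundary 10R are avoided in the example because floating-point rounding of 1 - 0.9 makes that comparison unreliable.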
What did we learn? • Cooperation can be sustained if T<R/(1-d) • Equivalently if (1-d)T<R • This is the case if temptation is not too big and if the probability d of playing again is not too small.
Forgiveness • Does a punishment strategy have to be as unrelenting as grim trigger? • In the real world, why might it not be a good idea to have an unforgiving punishment? • What if you get a noisy signal about the other player’s action? • What if the other player made a one-time mistake or was subjected to unusual temptation? • This question is much wrestled with in religion and in politics.
Forgiveness and religion • There is a tension in religious prohibitions. • To make people act as the priests would like them to, it might seem useful to tell them that they will be eternally punished for actions the priests don’t like. • But if they do that, people who have violated the rules might as well continue to do so, since they are damned anyway. • Hence religions often claim a “forgiving” deity in hopes of bringing lost sheep back into the herd.
What if temptation varies over time? • Suppose that in the play of this game, the
The “Folk Theorem”: A general result • The “good news”: In a repeated game with complete information, where the probability d that it will be continued to the next round is sufficiently close to 1, an efficient outcome can always be sustained as a subgame perfect Nash equilibrium.
More about the Folk Theorem • The “not-so-good news”: In a repeated game of complete information with d close to 1, not only can an efficient outcome be sustained as a Nash equilibrium, so can almost anything else. • Possible explanation for why men wear neckties or women wear absurdly painful high heels.
Details of a Folk Theorem • Consider a repeated game with an inefficient Nash equilibrium. • Consider a strategy called Strategy A: “Do some quite arbitrary sequence of plays” so long as everybody else does their specified drill. If anyone fails to do so, revert to your inefficient Nash equilibrium action. • If everybody prefers the result when all follow the arbitrary sequence to the inefficient Nash equilibrium, then for d close to 1, the strategy profile “Everybody uses Strategy A” is a subgame perfect Nash equilibrium.
Tit for Tat: a more forgiving strategy • What if both players play the following strategy in the infinitely repeated P.D.? • Cooperate on the first round. Then on any round do what the other guy did on the previous round. • Suppose the other guy plays tit for tat. • If I play tit for tat too, what will happen?
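One way to answer the question is to let two strategies play each other for a few rounds. This is an illustrative sketch: the "C"/"D" labels, the `play` helper, and the `defect_once` strategy are assumptions introduced here, not part of the slides.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous action."""
    return opponent_history[-1] if opponent_history else "C"


def defect_once(opponent_history):
    """Hypothetical deviation: defect in round 1, cooperate ever after."""
    return "D" if not opponent_history else "C"


def play(strategy_1, strategy_2, rounds):
    """Return the two action sequences when the strategies meet for `rounds` rounds."""
    history_1, history_2 = [], []
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only earlier rounds.
        action_1 = strategy_1(history_2)
        action_2 = strategy_2(history_1)
        history_1.append(action_1)
        history_2.append(action_2)
    return history_1, history_2


if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 5))
    print(play(defect_once, tit_for_tat, 5))
```

Two tit-for-tat players cooperate in every round. Against a player who defects once and then cooperates, tit for tat punishes in exactly one round (round 2) and then returns to cooperation.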
Payoffs • If you play tit for tat when the other guy is playing tit for tat, you get an expected payoff of $R(1 + d + d^2 + d^3 + d^4 + \cdots) = R/(1-d)$ • Suppose instead that you choose to play “Always defect” when the other guy plays tit for tat. • You will get $T + P(d + d^2 + d^3 + d^4 + \cdots) = T + Pd/(1-d)$, where P is the payoff when both defect (P=0 in the working example). • Same comparison as with Grim Trigger: tit for tat is a better response to tit for tat than always defect if d>(T-R)/(T-P)
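The threshold d>(T-R)/(T-P) can be checked against the two payoff formulas directly. A minimal sketch with hypothetical function names; the numbers T=10, R=6, P=0 are example values chosen for illustration.

```python
def tit_for_tat_payoff(R, d):
    """Both players use tit for tat and cooperate forever: R / (1 - d)."""
    return R / (1 - d)


def always_defect_vs_tft_payoff(T, P, d):
    """T in round 1, then P in every later round: T + P * d / (1 - d)."""
    return T + P * d / (1 - d)


def threshold_vs_always_defect(T, R, P):
    """Tit for tat beats always-defect when d exceeds (T - R) / (T - P)."""
    return (T - R) / (T - P)


if __name__ == "__main__":
    T, R, P = 10, 6, 0
    print("threshold:", threshold_vs_always_defect(T, R, P))
    for d in (0.3, 0.5):
        print(d, tit_for_tat_payoff(R, d), always_defect_vs_tft_payoff(T, P, d))
```

With P=0 the threshold reduces to (T-R)/T, which is the grim trigger condition (1-d)T<R rearranged.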
Another try • Sucker punch him and then get him to forgive you. • If the other guy is playing tit for tat and you play D on the first round, then C ever after, you will get a payoff of T on the first round, S (the payoff from cooperating against a defector) on the second round, and then R forever. • Expected payoff is $T + Sd + d^2 R(1 + d + d^2 + d^3 + d^4 + \cdots) = T + Sd + d^2 R/(1-d)$.
Which is better? • Tit for tat and “Cheat and ask forgiveness” give the same payoff from round 3 on. • Cheat and ask forgiveness gives T in round 1 and S in round 2. • Tit for tat gives R in all rounds. • So tit for tat is better if R+dR>T+dS, which means d(R-S)>T-R, or d>(T-R)/(R-S). • If T=10, R=6, and S=1, this holds if d>4/5. • But if T=10, R=5, and S=1, this would be the case only if d>5/4, which can’t happen. In this case, tit for tat could not be a Nash equilibrium.
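The two numerical cases can be reproduced with a short script. This is a sketch under the slide's assumptions (the other player uses tit for tat, and the game continues with probability d); the function names are hypothetical.

```python
def tit_for_tat_payoff(R, d):
    """Cooperate in every round against tit for tat: R / (1 - d)."""
    return R / (1 - d)


def cheat_then_apologize_payoff(T, S, R, d):
    """T in round 1, S in round 2, then R forever: T + S*d + d^2 * R / (1 - d)."""
    return T + S * d + d**2 * R / (1 - d)


def forgiveness_threshold(T, R, S):
    """Tit for tat is the better reply when d exceeds (T - R) / (R - S)."""
    return (T - R) / (R - S)


if __name__ == "__main__":
    print(forgiveness_threshold(10, 6, 1))  # 0.8
    print(forgiveness_threshold(10, 5, 1))  # 1.25, above any probability
    for d in (0.7, 0.9):
        print(d, tit_for_tat_payoff(6, d), cheat_then_apologize_payoff(10, 1, 6, d))
```

In the second case the threshold exceeds 1, so the deviation pays for every d below 1, which matches the slide's conclusion.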