Robert Axelrod’s The Evolution of Cooperation: A Computer Game for Political Science
Kentaro Toyama, Microsoft Research India
Indian Institute of Science, August 10, 2005
Outline
• Prisoner’s Dilemma
• Two Contests
• Some Analysis
• Real-World Scenarios
• Agent-Based Simulation
• Discussion
Robert Axelrod
• Professor of Political Science and Public Policy at the University of Michigan, Ann Arbor
• First paper on cooperation published in 1980
• Book, The Evolution of Cooperation, published in 1984 to wide acclaim
• Best known for this and related work; still active in this area and publishing new research
• http://www-personal.umich.edu/~axe/
The Prisoner’s Dilemma
• Two-player game
• Non-zero-sum
• Model for many real-world scenarios
• Story based on two criminals caught by police and interrogated separately…
The Prisoner’s Dilemma
Payoff matrix (each cell: Player A’s payoff, Player B’s payoff)
                  B cooperates    B defects
A cooperates      -2, -2          -5, 0
A defects         0, -5           -4, -4
Think of payoffs as the number of years of life lost, spent in jail.
The Prisoner’s Dilemma
Payoff matrix (each cell: Player A’s payoff, Player B’s payoff)
                  B cooperates    B defects
A cooperates      3, 3            0, 5
A defects         5, 0            1, 1
(For ease of thinking, add 5 to each payoff. The larger the payoff, the better.)
The Prisoner’s Dilemma
Payoff matrix (each cell: Player A’s payoff, Player B’s payoff)
                  B cooperates    B defects
A cooperates      3, 3            0, 5
A defects         5, 0            1, 1
• If Player A cooperates, Player B should defect (5 beats 3).
• If Player A defects, Player B should defect (1 beats 0).
• No matter what the other player does, a rational, self-interested player will defect. (Mutual defection is a Nash equilibrium.)
• The Dilemma: There is a joint strategy, mutual cooperation, that would give better payoffs to both players. (The Nash equilibrium is not Pareto-optimal.)
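The dominance argument on this slide can be checked mechanically. The following is a minimal sketch, assuming the shifted payoffs (3, 0, 5, 1) from the slide; the function names are illustrative and not part of the talk.

```python
from itertools import product

C, D = "C", "D"

# PAYOFF[(a_move, b_move)] = (payoff to Player A, payoff to Player B),
# using the shifted values from the slide.
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

def best_reply_for_a(b_move):
    """Move that maximizes A's payoff when B's move is fixed."""
    return max((C, D), key=lambda a: PAYOFF[(a, b_move)][0])

def best_reply_for_b(a_move):
    """Move that maximizes B's payoff when A's move is fixed."""
    return max((C, D), key=lambda b: PAYOFF[(a_move, b)][1])

def nash_equilibria():
    """Outcomes where each move is a best reply to the other."""
    return [(a, b) for a, b in product((C, D), repeat=2)
            if best_reply_for_a(b) == a and best_reply_for_b(a) == b]

def pareto_dominated(outcome):
    """True if another outcome is at least as good for both and differs."""
    pa, pb = PAYOFF[outcome]
    return any(qa >= pa and qb >= pb and (qa, qb) != (pa, pb)
               for qa, qb in PAYOFF.values())
```

Under these payoffs, defecting is the best reply to either move, the only equilibrium is mutual defection, and that outcome is Pareto-dominated by mutual cooperation.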
Other Kinds of Games
(each cell: payoff pair for the two players)
Exploitation
          C        D
C         1, 1     1, 2
D         0, 0     0, 1
Linked Fates
          C        D
C         4, 4     2, 3
D         3, 2     1, 1
Chicken
            Swerve    Straight
Swerve      2, 2      1, 3
Straight    3, 1      0, 0
Matching Coins
          Heads     Tails
Heads     1, -1     -1, 1
Tails     -1, 1     1, -1
The Prisoner’s Dilemma
Payoff matrix (Player B’s payoff in each cell)
                  B cooperates                      B defects
A cooperates      R = 3 (reward for cooperation)    T = 5 (temptation to defect)
A defects         S = 0 (sucker’s payoff)           P = 1 (punishment for defection)
T > R > P > S
R > (S + T) / 2
The Prisoner’s Dilemma
Payoff matrix (each cell: Player A’s payoff, Player B’s payoff)
                  B cooperates    B defects
A cooperates      30, 3           0, 5
A defects         50, 0           10, 1
T > R > P > S
R > (S + T) / 2
Payoffs do not have to be symmetrical.
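The two inequalities can be tested for each player separately, which is why asymmetric payoffs still qualify. A minimal sketch follows; the function name is an assumption made for illustration. The second inequality rules out the case where two players do better by taking turns exploiting each other than by cooperating steadily.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two conditions from the slide for one player's payoffs.

    T: temptation to defect, R: reward for cooperation,
    P: punishment for defection, S: sucker's payoff.
    """
    ordered = T > R > P > S
    # Same as R > (S + T) / 2, written without division.
    cooperation_beats_alternation = 2 * R > S + T
    return ordered and cooperation_beats_alternation

# Player B's payoffs on the slide, then Player A's (scaled by 10).
player_b_ok = is_prisoners_dilemma(T=5, R=3, P=1, S=0)
player_a_ok = is_prisoners_dilemma(T=50, R=30, P=10, S=0)
```

Both calls return True for the slide's numbers. A matrix such as T=10, R=3, P=1, S=0 would fail the second condition, because alternating exploitation would average 5 per round.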
PD as a Model for Real-Life Scenarios
• Competitive advertising
• “Tragedy of the Commons”
• Research collaboration
• Biological relationships
• Warfare
• Driving in traffic
Iterated Prisoner’s Dilemma
• Two players
• Prisoner’s Dilemma played repeatedly
• History of previous interactions remembered by each player
• No other outside knowledge
Iterated Prisoner’s Dilemma
Two-game iteration: the same payoff matrix (3, 3 / 0, 5 / 5, 0 / 1, 1) is played twice.
• No matter what the other player does, a rational, self-interested player will defect on the second (last) game.
• Both players know this, so on the first game, both players will defect, as well.
Iterated Prisoner’s Dilemma
N-game iteration: the same payoff matrix is played N times.
• No matter what the other player does, a rational, self-interested player will defect on the last (Nth) game.
• Both players know this, so both will also defect on the game before it, and, by the same backward induction, on every earlier game.
• A rational, self-interested player should defect all N times.
Iterated Prisoner’s Dilemma
If the number of iterations is uncertain…
• Best strategy is no longer clear!
• Unlike, e.g., chess, there is no single “best strategy”: it depends on the strategy of the other player.
Contest #1
• Call for entries to game theorists
• All entrants told of preliminary experiments
• 15 strategies = 14 entries + 1 RANDOM
• Round-robin tournament: each strategy plays every other strategy and a copy of itself (its “twin”)
• Each game 200 iterations
• Games run 5 times against each strategy
• Scores averaged over all games
And, the winner is… TIT FOR TAT
• “Cooperate on first move, thereafter reciprocate opponent’s previous action”
• Shortest program submitted
• By psychologist Anatol Rapoport
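The rule is short enough to write out in full. The following is a minimal sketch, assuming the 3/0/5/1 payoffs and the 200-iteration games described for Contest #1; the `play` helper and the ALWAYS DEFECT opponent are illustrative and are not the original tournament code.

```python
C, D = "C", "D"

# PAYOFF[(a_move, b_move)] = (payoff to A, payoff to B)
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    """Illustrative opponent that never cooperates."""
    return D

def play(strategy_a, strategy_b, rounds=200):
    """Play one iterated game and return (score of A, score of B)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

twin_scores = play(tit_for_tat, tit_for_tat)        # (600, 600)
defector_scores = play(tit_for_tat, always_defect)  # (199, 204)
```

Against its twin, TIT FOR TAT earns the full 3 points every round. Against ALWAYS DEFECT it loses only the first round, so it finishes 5 points behind: it never beats an opponent head to head, yet it is hard to exploit.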
Analysis: “Nice” Guys Finish First
• Top 8 strategies never defect first.
Analysis: To Forgive, Divine
• Top two rules are willing to cooperate even after defections, if the other player is “contrite”.
• DOWNING, the “Kingmaker”
- Tries to learn behavior of other player; starts by defecting twice.
- Hurts strategies that are unforgiving.
Other Interesting Strategies
• TIT FOR TWO TATS
- Retaliate only if previous two are D’s
- Could have won tournament, if entered
• NICE DOWNING
- Like DOWNING, but start with C’s
- Could have won tournament, if entered
• Variations on TIT FOR TAT
- Did well, but none beat TIT FOR TAT
Contest #2
• Same setup as Contest #1, except…
• Entries from first-round contestants as well as an open call in a magazine
• 63 strategies = 62 entries + 1 RANDOM
• Each game ran for an uncertain number of iterations: after every move, the game ended with probability 0.00346
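The ending probability 0.00346 looks arbitrary, but a little arithmetic shows what it implies. The sketch below assumes the game length is geometrically distributed, with an independent 0.00346 chance of stopping after each move; the function names are illustrative.

```python
import math

END_PROBABILITY = 0.00346  # chance that the game stops after any given move

def survival_probability(rounds, p_end=END_PROBABILITY):
    """Probability that the game is still running after `rounds` moves."""
    return (1.0 - p_end) ** rounds

def median_length(p_end=END_PROBABILITY):
    """Smallest number of moves by which the game has ended with probability >= 1/2."""
    return math.ceil(math.log(0.5) / math.log(1.0 - p_end))

def expected_length(p_end=END_PROBABILITY):
    """Mean of a geometric distribution with success probability p_end."""
    return 1.0 / p_end
```

Under this assumption the median game length comes out at 200 moves, the fixed length used in Contest #1, while the mean is about 289 moves. No player can know which move is the last, so the backward-induction argument for always defecting no longer applies.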
And, the winner is… TIT FOR TAT, again!
• (Again, by Anatol Rapoport)
Analysis: Contest #1 Lessons Validated
• 14 of top 15 strategies never defect first.
• 14 of bottom 15 strategies were not “nice”.
• Forgiveness important.
Analysis: Be Retaliatory
• Some entries tried to take advantage of “nice” strategies:
- TRANQUILIZER: cooperate first; if the other cooperates too, throw in a few defections.
- TESTER: defect first; if the other doesn’t retaliate, cooperate twice, then alternate defection and cooperation. If the other ever defects, do TIT FOR TAT.
• Strategies that were unresponsive to defections got taken advantage of.
• Top strategies retaliate quickly.
Analysis: Sneakiness Doesn’t Pay
• Entries that try to take advantage of “nice” strategies don’t gain as much as they lose.
• TRANQUILIZER: 27th place in tournament (out of 63).
• TESTER: 46th place in tournament (out of 63).
Robustness of TIT FOR TAT
• In six variations of Contest #2, TIT FOR TAT took first place in five and second place in one.
• In a population simulation with 63 strategies, TIT FOR TAT emerges as the winner.
• In a genetic algorithm experiment (1987), TIT-FOR-TAT-like algorithms prevailed.
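A population simulation of this kind can be sketched in a few lines. The version below is a toy, not the 63-strategy experiment: it assumes just three strategies, 200-round games with the 3/0/5/1 payoffs, and an update rule in which each strategy's share of the next generation is proportional to its current share times its average score. All names and parameters are illustrative.

```python
C, D = "C", "D"
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else C

def always_defect(mine, theirs):
    return D

def always_cooperate(mine, theirs):
    return C

# Toy population; the real experiment used all 63 tournament entries.
STRATEGIES = {
    "TIT FOR TAT": tit_for_tat,
    "ALWAYS DEFECT": always_defect,
    "ALWAYS COOPERATE": always_cooperate,
}

def score(strategy_a, strategy_b, rounds=200):
    """Total payoff earned by strategy_a in one game against strategy_b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        total += PAYOFF[(a, b)][0]
        hist_a.append(a)
        hist_b.append(b)
    return total

def evolve(shares, generations=50, rounds=200):
    """Update population shares in proportion to share-weighted scores."""
    names = list(shares)
    table = {(i, j): score(STRATEGIES[i], STRATEGIES[j], rounds)
             for i in names for j in names}
    for _ in range(generations):
        fitness = {i: sum(shares[j] * table[(i, j)] for j in names)
                   for i in names}
        mean = sum(shares[i] * fitness[i] for i in names)
        shares = {i: shares[i] * fitness[i] / mean for i in names}
    return shares

final_shares = evolve({name: 1 / 3 for name in STRATEGIES})
```

In this toy run ALWAYS DEFECT does well only while ALWAYS COOPERATE is plentiful; as its victims dwindle it collapses, and TIT FOR TAT ends with the largest share.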
Stability of TIT FOR TAT
• A population of TIT FOR TAT strategists cannot be invaded by a single player using a different strategy.
• Nor can a population of ALWAYS DEFECT strategists.
• But! A cluster of TIT FOR TATs can invade ALWAYS DEFECT,* while the converse is not true.
* Under certain conditions that imply that the future is sufficiently important for all players.
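The footnote's condition can be made concrete with a worked calculation. The sketch below assumes the 5/3/1/0 payoffs and introduces a parameter w, the probability that two players meet again after each move; w is notation from Axelrod's book, not from this slide, and the function names are illustrative. Scores are expected totals over a game of uncertain length.

```python
T, R, P, S = 5, 3, 1, 0

def value_tft_vs_tft(w):
    """Mutual cooperation forever: R + wR + w^2 R + ..."""
    return R / (1 - w)

def value_alld_vs_alld(w):
    """Mutual defection forever: P + wP + w^2 P + ..."""
    return P / (1 - w)

def value_tft_vs_alld(w):
    """TIT FOR TAT is suckered once, then both defect."""
    return S + w * P / (1 - w)

def value_alld_vs_tft(w):
    """ALWAYS DEFECT exploits once, then both defect."""
    return T + w * P / (1 - w)

def tft_stability_threshold():
    """Smallest w at which no lone newcomer can do better among TIT FOR TATs."""
    return max((T - R) / (T - P), (T - R) / (R - S))

def min_cluster_fraction(w):
    """Share p of a TIT FOR TAT newcomer's games that must be with other
    TIT FOR TATs so that p*V(TFT|TFT) + (1-p)*V(TFT|ALLD) > V(ALLD|ALLD)."""
    return ((value_alld_vs_alld(w) - value_tft_vs_alld(w))
            / (value_tft_vs_tft(w) - value_tft_vs_alld(w)))
```

With these payoffs the threshold is w = 2/3. At w = 0.9, a lone defector among TIT FOR TATs earns 14 against the natives' 30, so it cannot invade. TIT FOR TAT newcomers in a world of defectors need only about 1 in 21 of their games (under 5%) to be with each other to outscore the natives. A cluster of defectors gains nothing by meeting each other, since they earn only 10 per game among themselves.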
General Lessons
• Don’t be envious. (It doesn’t matter if others win.)
- TIT FOR TAT never scores more than the other player.
• Be nice. (Don’t defect first.)
- The best way to do well is to cooperate with others who are also nice.
• Retaliate swiftly.
- Or, others will take advantage.
• Forgive.
- Feuds are costly. Defections shouldn’t prevent cooperation later on.
• Don’t be too clever.
- Too much cleverness looks RANDOM.
Trench Warfare
• Common form of battle in World War I
• Armies in deep trenches on either side of battle line
• Machine guns and artillery
• Prolonged engagement with same group of enemy troops
Trench Warfare is an IPD
Payoff matrix
                        You cooperate            You shoot to kill
They cooperate          Both live.               You live and win a medal.
They shoot to kill      You die and they win.    Both die.
For a single round, no matter what the enemy does, it’s better to shoot to kill.
But, for an indefinite number of rounds…?
Trench Warfare
Cooperation spontaneously evolved:
• “If the British shelled the Germans, the Germans replied, and the damage was equal.”
• “[A British staff officer was] astonished to observe German soldiers walking about within rifle range…”
• “These people … did not know there was a war on. Both sides … believed in … ‘live and let live’.”
• “Suddenly a salvo arrived but did no damage. Naturally both sides got down and our men started swearing at the Germans, when all at once a brave German got on to his parapet and shouted out ‘We are very sorry about that; we hope no one was hurt. It is not our fault, it is that damned Prussian artillery.’”
Fig Tree and Fig Wasp
Payoff matrix (each cell: outcome for the tree / outcome for the wasp)
                        Wasp lays eggs and pollinates                          Wasp lays eggs without pollinating
Tree lets fig ripen     Bear fruit; breed good wasps. / Die, but have kids.    No fruit; breed bad wasps. / Live, and have kids.
Tree cuts off fig       Bear fruit. / Die, no kids.                            No fruit. / Live, no kids.
For a single round, trees should cut off figs, and wasps should lay eggs without pollinating.
But, for an indefinite number of rounds…?
Agent-Based Modeling
• Simulation as a scientific method
• Simulation allows hypothesis discovery, verification, and prediction.
• Simulation is particularly valuable when many agents interact and are expected to adapt.
Other Modeled Social Theories
• 1963 Cyert & March: Behavioral theory of the firm
• 1974 Schelling: Segregated neighborhoods
• 1980 Axelrod: Cooperation
• 2003 Axelrod: Ethnocentrism
Summary
• Iterated Prisoner’s Dilemma as a model for many different types of interaction
• There is no single optimal strategy in an IPD game, but TIT FOR TAT is strong, robust, and stable.
• In real-world IPD scenarios, TIT-FOR-TAT-like strategies naturally evolve, even among antagonists and unintelligent players.
• Agent-based modeling is a powerful tool for modeling populations in social and biological sciences.
TIT FOR TAT and Ethics
• Mahabharata (~3000 BC): “One should not behave towards others in a way which is disagreeable to oneself. This is the essence of morality. All other activities are due to selfish desire.”
• Hammurabi’s Code (~1750 BC): “If a man put out the eye of another man, his eye shall be put out. If he break another man's bone, his bone shall be broken.”
• The Golden Rule (~30 AD): “Do unto others as you would have them do unto you.”
• Kant’s Categorical Imperative: “Act so that the maxim of action may be capable of becoming a universal law.”
• Garrett Hardin (“The Tragedy of the Commons”, 1968): “Conscience is self-eliminating.”