Cooperating Without Looking
Suppose a friend asks you to proofread a paper… You hesitate while thinking about how big a pain it is and say, “Hmm. Um. Well, OK.” You get less credit than if you had agreed without hesitation.
A colleague asks you to attend his talk. You ask, “Is it related to my research?” before agreeing to attend. You get less credit than if you had agreed without asking.
Why do you get less credit for cooperating when you deliberate (“look”)?
This cannot be explained by existing models of repeated games, like the repeated prisoner’s dilemma. In such models, players attend only to your past actions, not to your “deliberation” process.
Intuitively… Cooperators who don’t look (CWOL) can be trusted to cooperate even when the temptation to defect is high. But how do we know this added trust is worth the cost of missed opportunities to defect?
Outline going forward… 1) Describe a simple model, “the envelope game” 2) Find (natural, intuitive) conditions under which CWOL is an equilibrium of this game 3) Show that even if agents are not consciously choosing their strategies, but instead strategies are learned or evolved, CWOL still emerges 4) Interpret these results in terms of some less straightforward social applications, such as why we: admire principled people, abhor taboo tradeoffs, and are blinded by love. This analysis will yield novel predictions and lead to some useful prescriptions.
First… We model variation in the costs of cooperation as follows: • With probability p, the Low Temptation “card” is chosen and stuffed in the envelope. • With probability 1-p, the High Temptation card is chosen.
Second… • We model player 1’s choice of whether to “look” • Player 1 chooses whether or not to open the envelope • Crucially, we assume others (player 2) can observe whether the envelope was opened
Third… Player 1 then chooses whether or not to cooperate. Player 2 is again able to observe.
Fourth… We model others’ “trust” in player 1: player 2 chooses whether to continue the interaction or exit. If he continues, the game repeats with probability w. If he exits, both get 0 in all future periods.
We assume the payoffs have the following properties: • Cooperation is costly for Player 1, especially when the temptation is high • Both Players like cooperative interactions, but Player 2 would prefer no interaction to one in which Player 1 sometimes defects
Here’s the payoff matrix at each stage (player 1’s payoff listed first):

        Low Temptation    High Temptation
C       a, b              a, b
D       cL, d             cH, d

• Our assumptions then amount to:
• 1) cH > cL > a > 0
• 2) b*p + d*(1–p) < 0 < b
Since this is a repeated game, strategies must specify actions for any history. Example for player 1: Look on even rounds; cooperate if you didn’t look, or if the temptation is low and it is a prime-numbered round. Example for player 2: Exit if player 1 looks three times in a row or defects twice in a row.
We are especially interested in these strategies for player 1 CWOL: Don’t look and cooperate CWL: Look and cooperate regardless of the temptation ONLYL: Look and cooperate only if the temptation is low ALLD: Look or don’t look, defect regardless of the temptation
And for player 2 CWOL: Exit if player 1 looks or defects EID: Exit if player 1 defects ONLYL: Exit if player 1 defects when the temptation is low ALLD: Exit
Next, let’s think about payoffs over the course of the entire game.

For the strategy pair
Player 1: Look or don’t look, defect regardless of the temptation (ALLD)
Player 2: Exit (ALLD)
Player 1 expects p·cL + (1-p)·cH in the first round, 0 thereafter
Player 2 gets d in the first round, 0 thereafter

For the strategy pair
Player 1: Don’t look and cooperate (CWOL)
Player 2: Exit if player 1 looks or defects (CWOL)
Player 1 gets a + w·a + w²·a + w³·a + … = a/(1-w)
Player 2 gets b + w·b + w²·b + w³·b + … = b/(1-w)

For the strategy pair
Player 1: Look and cooperate only if the temptation is low (ONLYL)
Player 2: Exit if player 1 defects (EID)
Player 1 gets a·p + cH·(1-p) + p·w·(a·p + cH·(1-p)) + (p·w)²·(a·p + cH·(1-p)) + … = (a·p + cH·(1-p))/(1-p·w)
Player 2 gets b·p + d·(1-p) + p·w·(b·p + d·(1-p)) + (p·w)²·(b·p + d·(1-p)) + … = (b·p + d·(1-p))/(1-p·w)
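These closed forms are just geometric series. A quick Python sketch can check them against truncated sums; the parameter values here are illustrative, not taken from the model.

```python
# Sketch: verify the closed-form repeated-game payoffs against truncated
# geometric series. All parameter values below are illustrative.

def discounted_sum(stage, cont, rounds=5_000):
    """Sum stage * cont**t for t = 0..rounds-1 (truncated geometric series)."""
    return sum(stage * cont**t for t in range(rounds))

p, w = 0.9, 0.95                       # P(low temptation), continuation prob
a, b, cL, cH, d = 1.0, 1.0, 1.5, 10.0, -10.0

# CWOL vs CWOL: stage payoff a (resp. b) every round, continue with prob w.
assert abs(discounted_sum(a, w) - a / (1 - w)) < 1e-6
assert abs(discounted_sum(b, w) - b / (1 - w)) < 1e-6

# ONLYL vs EID: stage payoff mixes cooperation (prob p) and defection
# (prob 1-p); the game continues only when the temptation was low, so the
# effective continuation probability is p*w.
stage1 = p * a + (1 - p) * cH          # player 1's expected stage payoff
stage2 = p * b + (1 - p) * d           # player 2's expected stage payoff
assert abs(discounted_sum(stage1, p * w) - stage1 / (1 - p * w)) < 1e-6
assert abs(discounted_sum(stage2, p * w) - stage2 / (1 - p * w)) < 1e-6
print("closed forms match truncated series")
```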
The strategy pair (CWOL) Player 1: Don’t look and cooperate Player 2: Exit if player 1 looks or defects is an equilibrium provided a/(1-w) > cLp + cH(1-p)
Check whether player 1 can benefit by deviating: If CWOL, she gets a/(1-w). If she deviates to look, she might as well defect, in which case she expects cL·p + cH·(1-p) today and 0 ever after. Check whether player 2 can benefit by deviating: If CWOL, he gets b/(1-w). If he exits when player 1 doesn’t look, he gets 0 ever after. CWOL is an equilibrium iff a/(1-w) > cL·p + cH·(1-p). Interpretation: CWOL is an equilibrium iff the EXPECTED gains from defecting today are less than the value of maintaining a cooperative interaction.
Let’s contrast this with the equilibrium conditions for “cooperate with looking” (CWL) to see when looking matters. Player 1: Look and cooperate regardless of the temptation. Player 2: Exit if player 1 defects. Now player 1 may be tempted to deviate when she knows the temptation is high, in which case she would get cH. So we need a/(1-w) > cH. Interpretation: CWL is an equilibrium iff the MAXIMAL gains from defecting today are less than the value of maintaining a cooperative interaction. Hence we predict “looking” will matter when the expected gains from defecting are small but the maximal gains are large.
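A minimal sketch of the two conditions, with hypothetical parameter values chosen so that the expected temptation is small but the maximal temptation is large:

```python
# Sketch: the region where "looking" matters is where CWOL is an equilibrium
# but CWL is not. Parameter values below are illustrative only.

def cwol_holds(a, cL, cH, p, w):
    # CWOL: value of continued cooperation must beat the EXPECTED
    # one-shot gain from defecting.
    return a / (1 - w) > cL * p + cH * (1 - p)

def cwl_holds(a, cH, w):
    # CWL: value of continued cooperation must beat the MAXIMAL
    # one-shot gain from defecting.
    return a / (1 - w) > cH

a, cL, cH, p, w = 1.0, 1.5, 10.0, 0.9, 0.8
print(cwol_holds(a, cL, cH, p, w))  # True: expected temptation is small
print(cwl_holds(a, cH, w))          # False: maximal temptation is large
```

With these numbers the continuation value a/(1-w) = 5 sits between the expected defection payoff (2.35) and the maximal one (10), so only the non-looker can sustain cooperation.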
Likewise, if we relax our assumption b*p + d*(1–p) < 0, we get another equilibrium (ONLYL). Player 1: Look and cooperate only if the temptation is low. Player 2: Exit if player 1 defects when the temptation is low. For looking to matter we ALSO need that defection is sufficiently bad for player 2 that he doesn’t want to interact with player 1s who defect even rarely.
Summary so far: CWOL is an equilibrium. Comparison to CWL and ONLYL tells us that avoiding/detecting looking is most valuable when… the average temptation is low but the maximal temptation is high, and defection is harmful.
Use the replicator dynamic. Some reminders… -Infinite populations -At any point in time, each strategy has a certain frequency -Payoffs are determined by expected play against opponents drawn according to these frequencies -Strategies reproduce in proportion to their payoffs
Also… Since replicator requires few strategies, we restrict to the following ones
Since we now have a population of player 1s and a population of player 2s, we classify equilibria by grouping those that are “behaviorally equivalent”…
Simulation: For each of many parameter values… for each of 5,000 trials… we seed the population with random mixtures of strategies, numerically estimate the replicator dynamic (which is an ODE), wait for the population to stabilize, and then classify the outcomes (ignoring small numerical errors).
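A rough sketch of one such trial, simplified to four player-1 strategies and two player-2 strategies, with illustrative parameters and Euler integration (the actual analysis sweeps many parameter values and random seeds):

```python
# Sketch: two-population replicator dynamic for the envelope game,
# restricted to P1 strategies {CWOL, CWL, ONLYL, ALLD} and P2 strategies
# {CWOL: exit if look or defect, EID: exit if defect}.
# All parameter values are illustrative, not taken from the source.

p, w = 0.9, 0.95                  # P(low temptation), continuation prob
a, b, cL, cH, d = 1.0, 1.0, 1.5, 10.0, -10.0
e_def = p * cL + (1 - p) * cH     # expected one-shot defection payoff

# Long-run payoffs for each (P1 strategy, P2 strategy) pair, from the
# closed forms derived earlier. Rows: CWOL, CWL, ONLYL, ALLD.
U1 = [
    [a / (1 - w),        a / (1 - w)],                     # CWOL
    [a,                  a / (1 - w)],                     # CWL (P2 CWOL exits on look)
    [p*a + (1-p)*cH,     (p*a + (1-p)*cH) / (1 - p*w)],    # ONLYL
    [e_def,              e_def],                           # ALLD
]
U2 = [
    [b / (1 - w),        b / (1 - w)],
    [b,                  b / (1 - w)],
    [p*b + (1-p)*d,      (p*b + (1-p)*d) / (1 - p*w)],
    [d,                  d],
]

x = [0.25, 0.25, 0.25, 0.25]      # P1 population shares
y = [0.5, 0.5]                    # P2 population shares
dt, steps = 0.01, 20_000

for _ in range(steps):
    f1 = [sum(U1[i][j] * y[j] for j in range(2)) for i in range(4)]
    f2 = [sum(U2[i][j] * x[i] for i in range(4)) for j in range(2)]
    avg1 = sum(x[i] * f1[i] for i in range(4))
    avg2 = sum(y[j] * f2[j] for j in range(2))
    # Replicator: each share grows at its payoff minus the population average.
    x = [x[i] + dt * x[i] * (f1[i] - avg1) for i in range(4)]
    y = [y[j] + dt * y[j] * (f2[j] - avg2) for j in range(2)]
    sx = sum(x); x = [max(v, 0.0) / sx for v in x]   # renormalize vs drift
    sy = sum(y); y = [max(v, 0.0) / sy for v in y]

print([round(v, 3) for v in x])   # CWOL's share (first entry) dominates here
```

With these particular numbers CWOL takes over from a uniform starting mixture; the interesting question, which the full simulation addresses, is for which parameter regions and starting mixtures that happens.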
We find… the population ends up at CWOL fairly often in the relevant parameter region.
So, we showed CWOL is an equilibrium, showed that it emerges in the dynamics, and derived equilibrium conditions. Next, we want to discuss some applications.
Before we do, what sorts of things might constitute not looking at the proximal level? Some examples… Making decisions quickly, intuitively, based on gut feelings, emotions, or heuristics. Why? Because there is no chance to carefully deliberate over costs and benefits. Making decisions based on principles or ideologies, e.g., “I never flake on my friends.” Why? Because one acts based on something other than the costs and benefits in this particular situation.
Consistent with CWOL, intuitive decision making is detectable via pupil dilation, flushed skin, elevated heart rate, stuttering, etc. Similar evidence is needed for principles.
What applications can these proximal mechanisms help us explain?
First application… Why do we admire principled people, and not those who are “strategic”? E.g., we don’t like a politician who determines his stance on gay marriage after checking the polls. Such politicians come off as “sleazy.”
Explanation based on CWOL: If a politician is only pro gay marriage when the polls indicate it is popular, he probably doesn’t “believe in the cause.” If he doesn’t believe in the cause, he can’t be trusted to fight for it when it becomes unpopular. So gay rights activists can’t trust him. Activists for other causes can’t trust him either: if you are not into one cause you purportedly “fight for,” perhaps you are not really “into” any cause.
Note: The model assumes only one kind of looking. Of course we want our politicians to behave strategically about some things. CWOL suggests we want them not to look at the benefit to themselves, but to look at the benefit to the country.
Not just in politicians… we tend to find “strategic” behavior gross more generally: like this cop from The Wire who prefers “better stats” to solving murders. Do you respect him? Can you explain why not without CWOL?
In contrast, we respect principled people, like this statesman in The West Wing who returns a card that could save his life, out of principle. We admire him for being principled… even though turning down the card helps no one. Again, can you explain this without CWOL?
But we don’t always care if others are principled, and we don’t always adopt principles Can we predict when?
CWOL predicts we will be principled, and will care whether others are principled, when… 1) incentives are usually aligned 2) but there are rare instances with an opportunity to greatly harm the other at a large benefit to oneself. E.g., you won’t care whether your doubles partner, chess competitor, or student is principled; you will care whether your wife is. Can you think of a good test of this?
Next application… “authentic altruism” Are all acts of altruism motivated (at the proximal level) by self interest? E.g., anticipation of warm glow, avoidance of awkwardness or guilt as in the paper, “Avoiding the Ask” What about Mother Teresa and Gandhi? Were they “authentic” altruists? Does authentic altruism exist?
CWOL explains why authentic altruism may evolve/be learned: it makes you even more trustworthy And, as before, CWOL tells us when to expect authentic altruism…
Note that just because we can “explain” why Mother Teresa could be authentically altruistic doesn’t make her altruism any less authentic. It simply explains why we find her behavior beautiful! She still was altruistic without considering the costs or benefits to herself. And she still can be trusted to act that way when she knows it would make her unhappy.
Next application… why do we give anonymously and in “one-shot” scenarios? For example… in double-blind lab experiments, when helping strangers, etc.
CWOL offers two possible explanations. First is principles… it is the “right thing” to give, regardless of whether the situation is one-shot or anonymous. Second is intuition… people might adopt the habit of cooperating intuitively, without attending to costs/benefits such as whether a cooperative situation is one-shot or anonymous.