Influence of Payoffs on Decision Dynamics: Optimality, Behavior, Physiology, and the Competing Accumulator Model

How Payoffs Influence Decision Dynamics: Optimality, Behavior, Physiology, and the Competing Accumulator Model. Jay McClelland, Stanford University. With Juan Gao, Bill Newsome, Phil Holmes, and Marius Usher

A High-Stakes, Time-Critical Decision • A diffuse form is coming toward you rapidly: What should you do? • You could shoot at it, but it may be your friend • You can hold your fire, but it might shoot you! • If you must decide right NOW, what should you do? • How can we optimize our choices, in time-critical, uncertain situations? • How do we take relative costs and benefits of different choices into account, when we must make decisions with different amounts of evidence?

A Starting Place: Near-Optimal Reward Bias in a Perceptual Decision Task in Primates

Experimental Paradigm (Rorie, Gao, McClelland & Newsome, 2010)

Behavioral Results for Two Monkeys

An Account for the Behavioral Data Based on Signal Detection Theory

Choice Data Exhibit Near-Optimal Bias (Feng, Holmes, Rorie & Newsome, 2009)

Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe?

Timeline of the Experiment

Proportion of Choices toward Higher Reward

Sensitivity varies with time

Optimal criterion varies with sensitivity
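The dependence of the optimal criterion on sensitivity can be illustrated with a minimal sketch under textbook signal detection assumptions: equal-variance Gaussian evidence with unit variance, means at +d'/2 for the high-reward alternative and -d'/2 for the low-reward alternative, and equal priors. The function names and the 2:1 default reward ratio are illustrative assumptions, not values taken from the slides.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def optimal_criterion(d_prime, reward_high=2.0, reward_low=1.0):
    """Reward-maximizing criterion on the evidence axis.

    Assumed setup: choose the high-reward alternative when x > criterion.
    The likelihood-ratio rule gives criterion = -ln(reward_high / reward_low) / d'.
    """
    if d_prime <= 0:
        return -math.inf  # no stimulus information: always choose high reward
    return -math.log(reward_high / reward_low) / d_prime

def expected_reward(criterion, d_prime, reward_high=2.0, reward_low=1.0):
    """Expected reward per trial for a given criterion (equal priors)."""
    p_correct_high = phi(d_prime / 2.0 - criterion)  # P(x > c | high-reward stimulus)
    p_correct_low = phi(criterion + d_prime / 2.0)   # P(x < c | low-reward stimulus)
    return 0.5 * (reward_high * p_correct_high + reward_low * p_correct_low)

# The criterion shifts further toward the low-reward side as sensitivity falls.
for d in (0.5, 1.0, 2.0):
    print(d, optimal_criterion(d), expected_reward(optimal_criterion(d), d))
```

Under these assumptions the optimal shift is inversely proportional to d', so a participant responding early (low sensitivity) should show a larger bias than one responding late.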

Similar logic applies when there is more than one difficulty level

Optimal vs. Actual Bias

Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • Not Perfectly • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe?

Steps toward a model • A model of accuracy dynamics • The inhibition-dominant Leaky Competing Accumulator Model • And its one-dimensional reduction • Incorporating reward bias • First in the one-dimensional model • Then in the full model

Usher and McClelland (2001) Leaky Competing Accumulator Model [Diagram: accumulators y1 and y2 with inputs I1 and I2] • Addresses the process of deciding between two alternatives based on external input, with leakage, mutual inhibition, and noise:
$dy_1/dt = I_1 - \gamma y_1 - \beta f(y_2) + \xi_1$
$dy_2/dt = I_2 - \gamma y_2 - \beta f(y_1) + \xi_2$
$f(y) = [y]^+$
• Participant chooses the most active accumulator when the go cue occurs • This is equivalent to choosing response 1 iff $y_1 - y_2 > 0$ • Let $y = y_1 - y_2$. While $y_1$ and $y_2$ are positive, the model reduces to:
$dy/dt = I - \lambda y + \xi$, where $I = I_1 - I_2$, $\lambda = \gamma - \beta$, $\xi = \xi_1 - \xi_2$
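The two-accumulator equations on this slide can be explored with a minimal Euler-Maruyama simulation. This is a sketch, not the authors' code: the function names, parameter values, time step, and noise scaling are assumptions chosen for illustration. Following the slide, rectification is applied only inside the inhibition term, so activations themselves may go negative.

```python
import numpy as np

def simulate_lca(I1, I2, leak, inhibition, noise_sd=1.0,
                 T=1.0, dt=0.001, n_trials=1000, seed=0):
    """Simulate two leaky competing accumulators up to the go cue at time T.

    Returns an array of shape (n_trials, 2) with final activations (y1, y2).
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    inputs = np.array([I1, I2], dtype=float)
    y = np.zeros((n_trials, 2))
    for _ in range(n_steps):
        rectified = np.maximum(y, 0.0)   # f(y) = [y]+
        other = rectified[:, ::-1]       # each unit is inhibited by the other unit
        drift = inputs - leak * y - inhibition * other
        noise = noise_sd * np.sqrt(dt) * rng.standard_normal(y.shape)
        y = y + drift * dt + noise
    return y

def prob_choose_1(y):
    """Fraction of trials on which accumulator 1 is the more active at the go cue."""
    return float(np.mean(y[:, 0] > y[:, 1]))

# Inhibition-dominant example: inhibition exceeds leak.
final = simulate_lca(I1=2.0, I2=1.0, leak=1.0, inhibition=3.0)
print(prob_choose_1(final))
```

Setting inhibition larger than leak makes the difference between the accumulators self-amplifying, which is the inhibition-dominant regime the later slides focus on.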

Kiani, Hanks and Shadlen (2008) • Random motion stimuli of different coherences. • Stimulus duration follows an exponential distribution. • ‘Go’ cue can occur at stimulus offset; response must occur within 500 msec to earn reward.

The earlier the pulse, the more it matters (Kiani et al., 2008)

These results rule out leak dominance (marked X); inhibition dominance is still viable

The Full Non-Linear LCAi Model [Figure: y1, y2] Although the value of the difference variable is not well captured by the linear approximation, the sign of the difference is approximated very closely.

What Kind of Dynamic Model Can Account for the Data? • A model of accuracy dynamics • The Inhibition Dominant LCA • Incorporating reward bias • First in the one-dimensional model • Then in the full model

Three Hypotheses • Reward acts as an input from reward cue onset until the end of the integration period • Reward influences the state of the accumulators before the onset of the stimulus • Reward introduces an offset into the decision
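One way to see how the three hypotheses differ is to compute, in the one-dimensional linear reduction, how far reward shifts the decision variable relative to its standard deviation at each go-cue time. The sketch below uses the closed-form mean and variance of that linear process with zero initial variance. It is an illustration under assumptions: the function name, parameter values, and the term "normalized bias" are hypothetical, and for simplicity the reward input is taken to start at stimulus onset rather than at reward cue onset.

```python
import numpy as np

def normalized_bias(t, hypothesis, lam=-2.0, sigma=1.0, strength=0.5):
    """Reward-induced shift of y divided by the standard deviation of y at time t.

    lam is leak minus inhibition (negative means inhibition-dominant) and must
    be nonzero; t must be positive. All values are illustrative assumptions.
    """
    t = np.asarray(t, dtype=float)
    # Variance of the linear process driven by white noise of intensity sigma
    var = sigma**2 * (1.0 - np.exp(-2.0 * lam * t)) / (2.0 * lam)
    sd = np.sqrt(var)
    if hypothesis == "input":        # reward as a constant extra input
        shift = strength * (1.0 - np.exp(-lam * t)) / lam
    elif hypothesis == "initial":    # reward sets the starting state
        shift = strength * np.exp(-lam * t)
    elif hypothesis == "offset":     # reward adds a fixed offset at decision time
        shift = strength * np.ones_like(t)
    else:
        raise ValueError(f"unknown hypothesis: {hypothesis}")
    return shift / sd

times = np.array([0.1, 0.5, 1.0, 2.0])
for h in ("input", "initial", "offset"):
    print(h, np.round(normalized_bias(times, h), 3))
```

With inhibition dominance in this sketch, the input hypothesis gives a bias that grows from zero, the fixed offset gives a bias that decays toward zero, and the initial-state hypothesis gives a bias that starts large and settles at a nonzero level.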

Matches the pattern of the data!

Consistent Evidence from Physiology (Rorie et al., 2010) [Figure panels: HL, HH]

Fits Based on Linear Model

Fitted Parameters How optimal is each S’s yr given the other parameters?

Fits based on full LCAi

Relationship between response speed and choice accuracy

High-Threshold LCAi

Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • not perfectly • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe? • the LCAi with reward affecting the starting point of accumulation provides a good account

Some Future Directions • What prevents participants from achieving a higher degree of optimality? • What are the sources of individual differences in performance, and are there ways to train participants to perform better? • Can Cognitive Neuroscience methods help us further elucidate the mechanisms underlying performance, or shed light on how performance might be optimized?

Preliminary Simulation of High-Threshold LCA

Our Experiment: A Stronger Manipulation • Three motion conditions crossed with 8 coherences. • Data shown are percent correct, averaged across coherences. • We include a switch condition with 6.4% and 12.8% coherences only (no right answer). • Each participant has at least 600 trials per data point over at least 10 sessions. • Conditions: 1) Early 2) Late 3) Constant 4) Switch [Figure axis: Stimulus Duration]

Results: S1 MT

Results: S2 CS

Results: S3 SC

Different levels of activation of correct and incorrect responses in Inhibition-dominant LCA [Figure: final time slice; errors vs. correct]
