This study examines how payoffs influence decision-making processes in time-critical and uncertain situations. It explores the optimization of choices based on different amounts of evidence and the consideration of relative costs and benefits. The research utilizes the Competing Accumulator Model to explain decision dynamics.
How Payoffs Influence Decision Dynamics: Optimality, Behavior, Physiology, and the Competing Accumulator Model. Jay McClelland, Stanford University. With Juan Gao, Bill Newsome, Phil Holmes, and Marius Usher
A High-Stakes, Time-Critical Decision • A diffuse form is coming toward you rapidly: What should you do? • You could shoot at it, but it may be your friend • You can hold your fire, but it might shoot you! • If you must decide right NOW, what should you do? • How can we optimize our choices, in time-critical, uncertain situations? • How do we take relative costs and benefits of different choices into account, when we must make decisions with different amounts of evidence?
A Starting Place: Near-Optimal Reward Bias in a Perceptual Decision Task in Primates
Experimental Paradigm (Rorie, Gao, McClelland & Newsome, 2010)
An Account for the Behavioral Data Based on Signal Detection Theory
Choice Data Exhibits Near-Optimal Bias (Feng, Holmes, Rorie & Newsome, 2009)
Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe?
Similar logic applies when there is more than one difficulty level
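The signal-detection logic behind the optimal reward bias can be sketched numerically. The example below is only an illustration under stated assumptions, not the analysis used in the cited papers: it assumes equal-variance Gaussian evidence with means +m and -m, equally likely stimuli, reward only for correct responses, and a hypothetical 2:1 reward ratio. All function names are made up for this sketch. It finds the reward-maximizing criterion by grid search, for one difficulty level and for a mixture of two.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_reward(criterion, means, reward_pos, reward_neg, sd=1.0):
    """Mean reward per trial when 'pos' is chosen iff evidence x > criterion.

    means: positive stimulus means, one per (equally likely) difficulty level.
    On each trial the evidence is drawn around +m or -m with equal probability.
    Only correct responses are rewarded (an assumption of this sketch).
    """
    total = 0.0
    for m in means:
        p_correct_pos = 1.0 - normal_cdf((criterion - m) / sd)  # P(x > c | +m)
        p_correct_neg = normal_cdf((criterion + m) / sd)        # P(x < c | -m)
        total += 0.5 * reward_pos * p_correct_pos + 0.5 * reward_neg * p_correct_neg
    return total / len(means)

def best_criterion(means, reward_pos, reward_neg, sd=1.0, lo=-5.0, hi=5.0, steps=10001):
    """Grid search for the criterion that maximizes expected reward."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda c: expected_reward(c, means, reward_pos, reward_neg, sd))

def analytic_criterion(mean, reward_pos, reward_neg, sd=1.0):
    """Closed-form optimum for a single difficulty level (likelihood-ratio rule)."""
    return (sd ** 2 / (2.0 * mean)) * math.log(reward_neg / reward_pos)

if __name__ == "__main__":
    print(best_criterion([1.0], 2.0, 1.0), analytic_criterion(1.0, 2.0, 1.0))
    print(best_criterion([0.5, 2.0], 2.0, 1.0))
```

With a single difficulty level the grid search recovers the closed-form criterion. With several levels mixed together, one criterion must serve all of them, so the optimum falls between the single-level optima: a compromise that is too small for the hardest stimuli and too large for the easiest.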
Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • Not Perfectly • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe?
Steps toward a model • A model of accuracy dynamics • The inhibition-dominant Leaky Competing Accumulator Model • And its one-dimensional reduction • Incorporating reward bias • First in the one-dimensional model • Then in the full model
Usher and McClelland (2001) Leaky Competing Accumulator Model
• Addresses the process of deciding between two alternatives based on external input, with leakage, mutual inhibition, and noise:
dy1/dt = I1 - γ y1 - β f(y2) + ξ1
dy2/dt = I2 - γ y2 - β f(y1) + ξ2
f(y) = [y]+
• Participant chooses the most active accumulator when the go cue occurs
• This is equivalent to choosing response 1 iff y1 - y2 > 0
• Let y = (y1 - y2). While y1 and y2 are positive, the model reduces to: dy/dt = I - λ y + ξ [I = I1 - I2; λ = γ - β; ξ = ξ1 - ξ2]
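The two accumulator equations above can be simulated directly. The following is a minimal sketch, not the authors' code: it uses a simple Euler-Maruyama step, starts both accumulators at zero, treats the noise as independent Gaussian white noise, and follows the slide's equations as written (rectification is applied only inside the inhibition term). The parameter values in the usage lines are arbitrary illustrations with inhibition larger than leak, not fitted values.

```python
import math
import random

def simulate_lca(I1, I2, leak, inhibition, noise_sd, duration, dt=0.01, rng=None):
    """Integrate the two-accumulator model up to `duration`; return (y1, y2)."""
    rng = rng or random.Random()
    y1 = y2 = 0.0                      # assumed starting state
    n_steps = int(round(duration / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        f1, f2 = max(y1, 0.0), max(y2, 0.0)   # f(y) = [y]+
        dy1 = (I1 - leak * y1 - inhibition * f2) * dt + noise_sd * sqrt_dt * rng.gauss(0.0, 1.0)
        dy2 = (I2 - leak * y2 - inhibition * f1) * dt + noise_sd * sqrt_dt * rng.gauss(0.0, 1.0)
        y1 += dy1
        y2 += dy2
    return y1, y2

def choice_probability(n_trials, seed=0, **params):
    """Fraction of trials on which accumulator 1 is the more active at the go cue."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        y1, y2 = simulate_lca(rng=rng, **params)
        wins += y1 > y2
    return wins / n_trials

if __name__ == "__main__":
    p = choice_probability(500, seed=1, I1=1.5, I2=1.0, leak=1.0,
                           inhibition=3.0, noise_sd=0.3, duration=1.0)
    print(f"P(choose 1) = {p:.3f}")
```

Varying `duration` in this sketch mimics placing the go cue at different times after stimulus onset, which is the manipulation the accuracy-dynamics questions turn on.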
Kiani, Hanks and Shadlen (2008): Random motion stimuli of different coherences. Stimulus duration follows an exponential distribution. The ‘go’ cue can occur at stimulus offset; the response must occur within 500 msec to earn reward.
The earlier the pulse, the more it matters (Kiani et al., 2008)
These results rule out leak dominance. [Figure labels: "Still viable", "X"]
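Why the pulse-timing result speaks against leak dominance can be seen from the solution of the one-dimensional linear model. The derivation below is a sketch that assumes the linear reduction holds throughout (both accumulators positive) and that integration starts from y(0) = 0; here λ denotes the net leak, that is, leak minus inhibition.

```latex
\begin{align*}
\frac{dy}{dt} &= I(t) - \lambda\, y + \xi(t), \qquad y(0) = 0 \\
y(T) &= \int_0^T e^{-\lambda (T - s)} \bigl[\, I(s) + \xi(s) \,\bigr]\, ds \\
\delta y(T) &= \Delta\, e^{-\lambda (T - s)}
  \quad \text{(effect at readout time $T$ of a brief pulse of size $\Delta$ at time $s$)}
\end{align*}
```

With leak dominance (λ > 0) the weight on a pulse decays with the time elapsed since it arrived, so later pulses would matter more. With inhibition dominance (λ < 0) the weight grows with elapsed time, so earlier pulses matter more, which is the pattern reported by Kiani et al. (2008).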
The Full Non-Linear LCAi Model: Although the value of the difference variable is not well captured by the linear approximation, the sign of the difference is approximated very closely. [Figure axes: y1, y2]
What Kind of Dynamic Model Can Account for the Data? • A model of accuracy dynamics • The Inhibition Dominant LCA • Incorporating reward bias • First in the one-dimensional model • Then in the full model
Three Hypotheses • Reward acts as an input from reward cue onset until the end of the integration period • Reward influences the state of the accumulators before the onset of the stimulus • Reward introduces an offset into the decision
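In the one-dimensional linear model, the three hypotheses predict different time courses for the reward bias. The sketch below is an illustration under assumptions, not the fitted model: it treats the difference variable as dy = (I - net_leak * y) dt + noise_sd dW, takes the reward input in the first hypothesis to start at stimulus onset (a simplification, since the slide has it start at reward cue onset), and expresses each bias as a shift of the readout divided by the readout's standard deviation. All names and parameter values are hypothetical.

```python
import math

def readout_sd(net_leak, noise_sd, t):
    """SD of y(t) for the linear model with a fixed starting value."""
    if net_leak == 0.0:
        return noise_sd * math.sqrt(t)
    var = noise_sd ** 2 * (1.0 - math.exp(-2.0 * net_leak * t)) / (2.0 * net_leak)
    return math.sqrt(var)

def shift_from_input(reward_input, net_leak, t):
    """Hypothesis 1: reward adds a constant input during integration."""
    if net_leak == 0.0:
        return reward_input * t
    return reward_input * (1.0 - math.exp(-net_leak * t)) / net_leak

def shift_from_start(start_offset, net_leak, t):
    """Hypothesis 2: reward sets the accumulator state before stimulus onset."""
    return start_offset * math.exp(-net_leak * t)

def shift_from_offset(decision_offset, net_leak, t):
    """Hypothesis 3: reward adds a fixed offset at the decision stage."""
    return decision_offset

def normalized_bias(shift_fn, size, net_leak, noise_sd, t):
    """Reward-induced shift of the readout in units of its standard deviation."""
    return shift_fn(size, net_leak, t) / readout_sd(net_leak, noise_sd, t)

if __name__ == "__main__":
    for t in (0.25, 0.5, 1.0, 2.0):
        row = [normalized_bias(fn, 1.0, -2.0, 1.0, t)
               for fn in (shift_from_input, shift_from_start, shift_from_offset)]
        print(t, [round(v, 3) for v in row])
```

With a negative net leak (inhibition dominance), the normalized bias from a constant input grows with time toward an asymptote, the bias from a starting-point offset declines toward a nonzero asymptote, and the bias from a fixed decision offset fades toward zero. Those distinct time courses are what allow the hypotheses to be compared against bias measured at different response times.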
Consistent Evidence from Physiology (Rorie et al., 2010) [Figure panels: HL, HH]
Fitted Parameters How optimal is each S’s yr given the other parameters?
Our Questions • Can participants optimize their reward bias • when they must respond at different times after stimulus onset, based on different amounts of accumulated information? • not perfectly • If they deviate from optimality • can we develop a model of decision dynamics that explains the performance we do observe? • the LCAi with reward affecting the starting point of accumulation provides a good account
Some Future Directions • What prevents participants from achieving a higher degree of optimality? • What are the sources of individual differences in performance, and are there ways to train participants to perform better? • Can Cognitive Neuroscience methods help us further elucidate the mechanisms underlying performance, or shed light on how performance might be optimized?
Our Experiment: A Stronger Manipulation • Conditions: 1) Early 2) Late 3) Constant 4) Switch • Three motion conditions crossed with 8 coherences. Data shown are percent correct, averaged across coherences. • We include a switch condition with 6.4% and 12.8% coherences only (no right answer). • Each participant has at least 600 trials per data point over at least 10 sessions. [Figure axis: Stimulus Duration]
Results: S1 MT
Results: S2 CS
Results: S3 SC
Different levels of activation of correct and incorrect responses in Inhibition-dominant LCA [Figure: final time slice; errors vs. correct]