10 likes | 157 Views
Neural Correlates of Learning Models in a Dynamic Decision-Making Task. Darrell A. Worthy 1 , Marissa A. Gorlick 2 , Jeanette Mumford 2 , Akram Bakkour 2 , Russell A. Poldrack 2 & W. Todd Maddox 2 1. Texas A&M University 2. University of Texas at Austin . fMRI Results. Discussion. Results.
E N D
Neural Correlates of Learning Models in a Dynamic Decision-Making Task Darrell A. Worthy1, Marissa A. Gorlick2, Jeanette Mumford2, Akram Bakkour2, Russell A. Poldrack2 & W. Todd Maddox2 1. Texas A&M University 2. University of Texas at Austin fMRI Results Discussion Results Learning Models Method Introduction fMRI Results Method Introduction Introduction Introduction Introduction Introduction Introduction Introduction Introduction Choice-Independent Results Activation Modulated by Prediction Errors We used a reward structure where the option that provides larger immediate rewards is sub-optimal. The Increasing option provides smaller immediate rewards but increases delayed rewards. Performance Recent work has highlighted the value of using learning models in fMRI analyses. A common approach in decision-making tasks has been to use prediction errors from reinforcement learning (RL ) models as regressors. Recently we have found that a heuristic win-stay-lose-shift WSLS model can often provide a better fit to the data than a RL model. This is especially true in dynamic decision-making tasks where participants must learn how current choices affect future outcomes. Our goal is to investigate how prediction errors from RL and WSLS models correlate with neural activity. Despite providing a better fit we did not find activation that correlated with prediction errors from the WSLS model. Correlates of RL model’s prediction errors Increasing-Optimal Task We subtracted the fit of each learning model from the fit of the Baseline model. Higher values indicate a better fit. Prediction error’s from the RL model corresponded with activation in left parietal, left frontal pole, and in the left cerebellum. Performed task in GE 3T scanner. Third level statistics were run cluster corrected -z threshold 2.3 -cluster p threshold .05. Comparison when accounting for WSLS prediction errors We examined activity when prediction errors from the RL model were greater than prediction errors from the WSLS model. Data were fit with a Baseline model, RL model, and a WSLS model. The Baseline model assumes fixed choice probabilities (no learning) and has one free parameter, p(increasing)5. The RL model updates Expected Values of each option based on a prediction error: to develop probabilities for selecting each option (a) using a Softmax decision rule6: WSLS has two free parameters, probability of staying with the same option if reward is ≥ reward on previous trial, and probability of shifting if reward is < the previous trial. Choice-Independent 15 healthy young adults (18-27 yrs.) participated. Used the Mars Farming Task.5 Performed 12 25 trial blocks over 4 functional runs. Each block separated by 16-second fixation block. Both learning models fit better than the Baseline model, with the WSLS model fitting slightly better than the RL model. When WSLS prediction errors were accounted for neural activity in the left cerebellum was modulated by the RL model’s prediction errors. Without Models – All Trials > Fixation Striatal activation was not modulated by the RL model’s prediction errors. We could not identify areas that were modulated by prediction errors from the WSLS model. Activity in the medial and lateral left cerebellum corresponded with prediction errors from the RL model. Some recent work also supports a role for the cerebellum in decision-making under uncertainty. The neo-cerebellum may monitor the consequences of actions by tracking prediction errors from an RL model. Model-Based Predictions RL Model – Prediction errors would correlate with striatal activity. WSLS Model - Prediction errors may correlate with frontal activity. We found task-related activation a) in the striatum (bilateral nucleus accumbens, bilateral putamen, and right caudate), b) in the anterior cingulate, and c) in the lateral orbitofrontal cortex. Participants had to decide which ‘oxygen extraction system’ led to the greatest cumulative oxygen collection. 1. Marschner, A., Mell, T., Wartenburger, I., Villringer, A., Reischies, F.M., Heekeren, H.R., (2005). Reward-based decision-making and aging. Brain Research Bulletin, 67, 382-390. 2. Mell, T. Heekeren, H.R., Marschner, A., Wartenburger, I., Villringer, A., & Reicschies, F.M., (2005). Effect of aging on stimulus-reward association learning. Neuropsychologia, 43, 554-563. 3 Cook, I.A., Bookheimer, S.Y., Mickes, L., Leuchter, A.F., Kumar, A. (2007) Aging and brain activation with working memory tasks: an fMRI study of connectivity. International Journal of Geriatric Psychiatry, 22, 332-342. 4. Park, D.C., & Reuter-Lorenz, P. (2009). The adaptive brain: Aging and neurocognitive scaffolding. Annual Review of Psychology, 60, 173-196. 5. Worthy, D.A., Gorlick, M.A., Pacheco, J.L., Schnyer, D.M., & Maddox, W.T. (2011). With age comes wisdom: Decision-making in older and younger adults. Psychological Science, 22, 1375-1380. Glascher, J., Daw, N., Dayan, P., O’Doherty, J. P. (2010). States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron, 66, 585-595. http://worthylab.tamu.edu/Publications worthyda@tamu.edu Supported by NIMH grant MH077708 to WTM and a supplement to NIMH grant MH077708 to DAW.