Learning, Volatility and the ACC

Learning, Volatility and the ACC • Tim Behrens • FMRIB + Psychology, University of Oxford • FIL - UCL.

[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8), CON. Kennerley et al., Nature Neuroscience, 2006]

[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8), CON and ACCs. Kennerley et al., Nature Neuroscience, 2006]

ACCg: Monkeys will sacrifice food opportunities to look at other monkeys. Rudebeck et al., Science, 2006

ACCg: Interest in other individuals is reduced after ACC gyrus lesion. Rudebeck et al., Science, 2006

Anatomy - Differences in connections between ACCs and ACCg.
• Connections unique to the sulcus are mainly with motor regions:
  • Primary motor cortex
  • Premotor cortex
  • Parietal motor areas
  • Spinal cord
• ACCs has information about our own actions.

Anatomy - Differences in connections between ACCs and ACCg.
• Connections unique to the gyrus are mainly with regions that process emotional and biological stimuli:
  • Periaqueductal grey
  • Hypothalamus
  • STS/STG
• Insula/temporal pole connections are stronger to the gyrus.
• ACCg has access to information about other agents.

Anatomy - Shared connections between ACCs and ACCg.
• Some shared connections:
  • Orbitofrontal cortex
  • Amygdala
  • Ventral striatum
• ACCg and ACCs are strongly interconnected.
• Both regions have access to and influence over reward and value processing.

ACC Sulcus and learning about your actions.

[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8), CON and ACCs. Kennerley et al., Nature Neuroscience, 2006]

What determines the integration length? [Figure: reward history weight (β) against trials into the past (i-1 to i-8), CON.] Kennerley et al., Nat Neurosci, 2006; Sugrue et al., Science, 2005

[Figure: reward history weight (β) against trials into the past (i-1 to i-8), CON.]
• VOLATILE: reward probabilities change approximately every 25 trials.
• STABLE: reward probabilities change only after hundreds of trials.
Kennerley et al., Nat Neurosci, 2006; Sugrue et al., Science, 2005

Reinforcement learning • We need to continually re-appraise the value of an action based on each new experience. [Diagram: prediction ($V_t$), outcome, prediction error δ, new prediction ($V_{t+1}$) updated by α × δ]

Updating beliefs on the basis of new information: $V_{t+1} = V_t + (\alpha \times \delta)$
• The learning rate (α) is the weight given to the current information.
• The prediction error (δ) is the information available from this event.
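The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than code from the talk: the function name `update_value`, the starting value of 0.5, the outcome sequence and the learning rate of 0.1 are all assumptions chosen for the example.

```python
def update_value(value, outcome, learning_rate):
    """One step of the rule V_{t+1} = V_t + alpha * delta."""
    prediction_error = outcome - value           # delta: outcome minus prediction
    return value + learning_rate * prediction_error


# Hypothetical sequence of outcomes (1 = rewarded, 0 = unrewarded).
value = 0.5                                      # assumed initial prediction
for outcome in [1, 1, 0, 1]:
    value = update_value(value, outcome, learning_rate=0.1)
```

With a learning rate of 0.1, each outcome moves the prediction only a tenth of the way towards what was just observed, so a single surprising trial has a small effect.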

The learning rate and the value of information. $V_{t+1} = V_t + (\alpha \times \delta)$ The learning rate should represent the value of the current information for guiding future beliefs.
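One way to see why the learning rate should track the value of new information is to run the same update rule on reward schedules that reverse at different rates. The sketch below is an illustrative assumption, not the analysis from the talk: the reward probabilities (0.8 and 0.2), the two learning rates (0.05 and 0.3), the 400-trial block standing in for "hundreds of trials", and the name `tracking_error` are all made up for the example. Only the 25-trial volatile block length comes from the slides.

```python
import random


def tracking_error(learning_rate, block_length, n_trials=4000, seed=0):
    """Mean squared gap between the learner's prediction and the true
    reward probability, when that probability reverses every block."""
    rng = random.Random(seed)        # fixed seed so runs are repeatable
    p_reward = 0.8                   # assumed starting reward probability
    value = 0.5                      # assumed initial prediction
    squared_error = 0.0
    for t in range(n_trials):
        if t > 0 and t % block_length == 0:
            p_reward = 1.0 - p_reward            # reversal: 0.8 <-> 0.2
        squared_error += (value - p_reward) ** 2
        outcome = 1.0 if rng.random() < p_reward else 0.0
        value += learning_rate * (outcome - value)   # V <- V + alpha * delta
    return squared_error / n_trials


for label, block in [("volatile", 25), ("stable", 400)]:
    for alpha in (0.05, 0.3):
        print(label, alpha, round(tracking_error(alpha, block), 4))
```

Under these assumed settings, the higher learning rate should track the true probability more closely when reversals come every 25 trials, and the lower learning rate should do better when reversals are rare, because each individual outcome then says less about whether the world has changed.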

Relationship with integration length [Figure: panels for α = 0.01, α = 0.1 and α = 0.4]
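The link between learning rate and integration length follows from unrolling the update rule. Writing δ as outcome minus prediction gives V_{t+1} = (1 − α) V_t + α × outcome_t, so the outcome i trials back carries weight α (1 − α)^(i−1). The sketch below computes these weights; the function name `history_weights` and the eight-trial window are assumptions for illustration.

```python
def history_weights(learning_rate, n_back=8):
    """Weight that the rule V_{t+1} = V_t + alpha * delta places on the
    outcome i trials into the past, for i = 1 .. n_back."""
    return [learning_rate * (1 - learning_rate) ** (i - 1)
            for i in range(1, n_back + 1)]


for alpha in (0.01, 0.1, 0.4):
    print(alpha, [round(w, 3) for w in history_weights(alpha)])
```

A large α concentrates the weight on the last few trials (for α = 0.4 the first three weights are approximately 0.4, 0.24 and 0.144), whereas a small α gives a nearly flat weighting that integrates over a long history.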

[Figure: stable condition; displayed values 37 and 63. Behrens et al., Nature Neuroscience, 2007]

$V_{t+1} = V_t + (\alpha \times \delta)$ Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007

Changes in reward estimates occur throughout the task, as do changes in volatility estimates. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007

[Figure: Decide and Monitor phases; effect of Monitor × Volatility. Behrens et al., Nature Neuroscience, 2007]

ACC effect size predicts learning rate across subjects. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007

ACC Gyrus and learning about your social partners.

ACCg: Interest in other individuals is reduced after ACC gyrus lesion. Rudebeck et al., Science, 2006

Rudebeck et al., Science, 2006

Learning about other agents [Figure: displayed values 37 and 63.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Sources of information
• Value of action information: probability that the correct colour is blue.
• Value of social information: probability that the confederate's advice is good.
Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Social information is integrated over time - behaviour

Reward prediction error. $V_{t+1} = V_t + (\alpha \times \delta)$, with δ = Reward − Expectation. [Figure: effect size over time, relative to outcome.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Prediction error on a social partner. $V_{t+1} = V_t + (\alpha \times \delta)$, with δ = Lie event − Lie prediction. [Figure: effect size over time, relative to outcome.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

The value of information and the ACC. $V_{t+1} = V_t + (\alpha \times \delta)$ [Figure panels: value of reward information; value of social information.]

Combining information to drive behaviour. $V_{t+1} = V_t + (\alpha \times \delta)$

Conclusions
• ACC codes a learning signal when information is observed.
• This signal predicts the speed of learning.
• Learning from our own actions and learning from others' actions are processed in parallel in ACCs and ACCg.
• The outputs of these parallel learning processes are combined in the reward system.

Acknowledgments
• Matthew Rushworth
• Mark Woolrich
• Laurence Hunt
• Mark Walton
