A possible representation of reward in the learning of saccades

Cornelius Weber and Jochen Triesch
Frankfurt Institute for Advanced Studies, Johann Wolfgang Goethe Universität, Frankfurt am Main, Germany
Presentation at EpiRob 2006, September, Paris

Contents
• saccade learning: supervised or reward-driven?
• separate control of horizontal & vertical saccades

Learning Signals
• generic: unsupervised & reinforcement learning; exploration, development, emergence; learn from environment
• specific: supervised learning; imitation; genetic description or instructor

Saccades in the adult are quite inaccurate
(figure labels: undershoot, overshoot)
Data taken from: A. Lewis, R. Garcia and L. Zhaoping (2003) The distribution of visual objects on the retina: connecting eye movements and cone distributions. Journal of Vision, 3, 893-905.

Saccade Learning Signal
• Generic? performance
• Specific? vectorial error

Saccade control downstream of the SC
• SC: superior colliculus
• LLBN: long-lead burst neuron
• EBN: excitatory burst neuron
• IBN: inhibitory burst neuron
• VI: abducens nucleus
• NPH/MVN: cells in nucleus prepositus hypoglossi or medial vestibular nucleus
• OPN: omnipause neuron
• La: latch neurons
• Tr: trigger signal
Figure source: D. Sparks (2002) The brainstem control of saccadic eye movements. Nat Rev Neurosci, 3: 952-64.

Site of plasticity
• SC pre-saccadic activation patch looks like adaptation fields
• adaptation is downstream
• exact error signal unknown
Figure source: M.A. Frens and A.J. Van Opstal (1997) Monkey superior colliculus activity during short-term saccadic adaptation. Brain Research Bulletin 43(5): 473-84.

Figure source: E.A. Vessel (2004) Behavioral and Neural Investigation of Perceptual Effect. www.cns.nyu.edu/~vessel/pubs/

Sensory neuron responses are modulated by reward
• in V1 of adult rat: M.G. Shuler, M.F. Bear (2006) Reward Timing in the Primary Visual Cortex. Science 311, 1606-9.
• in the inferior colliculus (IC) of adult monkey: R.R. Metzger, N.T. Greene, K.K. Porter, J.M. Groh (2006) Effects of Reward and Behavioral Context on Neural Activity in the Primate Inferior Colliculus. J Neurosci 26(28), 7468-76.

Foveal stimuli are magnified on the SC
(figure panels: retina, SC)

Vectorial error vs. Reward signal

Constant-sized error allows no feedback of learning progress (Robinson, 2003)
• target shifts 1° backward relative to saccade endpoint
Figure source: F. Robinson, C. Noto, S. Bevans (2003) Effect of visual error size on saccade adaptation in monkey. J Neurophysiol, 90: 1235-44.

Constant-sized error allows no feedback of learning progress
Figure source: F. Robinson, C. Noto, S. Bevans (2003) Effect of visual error size on saccade adaptation in monkey. J Neurophysiol, 90: 1235-44.

Constant-sized error allows no feedback of learning progress
(figure axes: gain change, error size)
Figure source: F. Robinson, C. Noto, S. Bevans (2003) Effect of visual error size on saccade adaptation in monkey. J Neurophysiol, 90: 1235-44.
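A minimal sketch of why a constant-sized error carries no information about learning progress. It compares a conventional fixed back-step with a target that always lands 1° behind the saccade endpoint. The scalar gain model, the learning rate and the function name `simulate_adaptation` are assumptions made for illustration; they are not taken from Robinson et al. or from the model presented here.

```python
def simulate_adaptation(n_trials, target=10.0, step=1.0, lr=0.02, constant_error=True):
    """Adapt a scalar saccade gain from the post-saccadic visual error.

    Hypothetical toy model: endpoint = gain * target, and the gain moves in
    proportion to the signed error seen after the saccade.
    """
    gain = 1.0
    errors, gains = [], []
    for _ in range(n_trials):
        endpoint = gain * target
        if constant_error:
            # Target is re-positioned relative to where the eye actually landed.
            shifted_target = endpoint - step
        else:
            # Conventional paradigm: target steps back by a fixed amount in space.
            shifted_target = target - step
        error = shifted_target - endpoint  # negative means the eye overshot
        gain += lr * error / target
        errors.append(error)
        gains.append(gain)
    return errors, gains


if __name__ == "__main__":
    err_const, gain_const = simulate_adaptation(100, constant_error=True)
    err_fixed, gain_fixed = simulate_adaptation(500, constant_error=False)
    print("constant error: last error", err_const[-1], "last gain", gain_const[-1])
    print("fixed back-step: last error", err_fixed[-1], "last gain", gain_fixed[-1])
```

In the fixed back-step case the error shrinks as the gain adapts, so the error itself reports progress. In the constant-error case the error stays at the imposed size however far the gain has moved.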

Oblique saccades are a “sum” of horizontal and vertical components
Figure source: lecture "Modelling of sensorimotor systems" by S. Glasauer, Ludwig-Maximilians-Universität München, www.nefo.med.uni-muenchen.de/~sglasauer
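As a small illustration, an oblique saccade can be split into a horizontal and a vertical command and recombined. The sketch treats the combination as a plain vector sum, which is an idealization; the function names and the degree-based convention (0° = rightward, 90° = upward) are assumptions for illustration.

```python
import math


def decompose(amplitude_deg, direction_deg):
    """Split a saccade vector into (horizontal, vertical) components in degrees."""
    theta = math.radians(direction_deg)
    return amplitude_deg * math.cos(theta), amplitude_deg * math.sin(theta)


def combine(horizontal, vertical):
    """Recombine the two components into (amplitude, direction) in degrees."""
    amplitude = math.hypot(horizontal, vertical)
    direction = math.degrees(math.atan2(vertical, horizontal))
    return amplitude, direction


if __name__ == "__main__":
    h, v = decompose(10.0, 30.0)
    print("horizontal:", h, "vertical:", v)
    print("recombined:", combine(h, v))
```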

Horizontal and vertical control circuits are separate
• riMLF: rostral interstitial nucleus of the medial longitudinal fasciculus
• NIC: interstitial nucleus of Cajal
• MRF: midbrain reticular formation
• PPRF: paramedian pontine reticular formation
• NPH: nucleus prepositus hypoglossi
• Med. RF: medullary reticular formation
• III: oculomotor nucleus
• IV: trochlear nucleus
• VI: abducens nucleus
Figure source: D. Sparks (2002) The brainstem control of saccadic eye movements. Nat Rev Neurosci, 3: 952-64.

Visual field topography in the SC

Model architecture
Model assumption:
• success-based learning for vertical saccades
• vectorial error for horizontal saccades

Algorithm for vectorial error based learning (horizontal)
$D = a_i - a_c$
$\Delta w_h \approx D \, a_{SC} \, m_h$
(figure labels: $a_{SC}$, $m_h$)
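The horizontal rule can be sketched as a weight update driven by a signed error. The sketch reads the slide's rule as a three-factor product of the error D, the SC activation a_SC and the motor activation m_h. Everything else is an assumption for illustration: a one-dimensional SC map with Gaussian tuning, rightward targets only, m_h read directly as saccade amplitude, and a_i and a_c modeled as rectified undershoot and overshoot signals so that D equals the residual error. None of these choices are stated in the source.

```python
import math
import random

N_UNITS = 20      # hypothetical number of SC units along the horizontal axis
MAX_ECC = 20.0    # hypothetical largest eccentricity covered by the map (deg)
SIGMA = 2.0       # hypothetical tuning width of an SC unit (deg)
CENTERS = [(j + 0.5) * MAX_ECC / N_UNITS for j in range(N_UNITS)]


def sc_activation(target):
    """Gaussian activation patch a_SC centred on the target eccentricity."""
    return [math.exp(-((c - target) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]


def saccade_amplitude(weights, a_sc):
    """Motor activation m_h, read here directly as saccade amplitude in degrees."""
    return sum(w * a for w, a in zip(weights, a_sc))


def train_horizontal(n_trials=20000, lr=0.002, seed=0):
    rng = random.Random(seed)
    weights = [1.0] * N_UNITS
    for _ in range(n_trials):
        target = rng.uniform(2.0, 18.0)
        a_sc = sc_activation(target)
        m_h = saccade_amplitude(weights, a_sc)
        residual = target - m_h
        # Assumed error channels: a_i signals undershoot, a_c signals overshoot.
        a_i = max(residual, 0.0)
        a_c = max(-residual, 0.0)
        d = a_i - a_c
        for j in range(N_UNITS):
            weights[j] += lr * d * a_sc[j] * m_h
    return weights


if __name__ == "__main__":
    w = train_horizontal()
    for t in (5.0, 10.0, 15.0):
        print("target", t, "saccade", saccade_amplitude(w, sc_activation(t)))
```

Because the error is signed, each trial tells the learner in which direction to change the weights, so no exploration noise is needed in this sketch.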

Algorithm for performance reward based learning (vertical)
$T = a_{post} - a_{pre}$
$\Delta w_v \approx T \, a_{SC} \, m_v$
(figure labels: $a_{SC}$, $m_v$)
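A hedged sketch of how a scalar reward T = a_post - a_pre could drive learning of vertical saccades. It is not the slide's rule verbatim. To turn a scalar reward into a usable learning signal, it uses node perturbation: the update correlates T, minus a learned baseline, with the exploration noise added to the motor command, rather than with m_v itself. The SC activation function that grows as the target nears the fovea, the Gaussian population code, upward-only targets and all parameter values are assumptions for illustration.

```python
import math
import random

N_UNITS = 20       # hypothetical number of SC units along the vertical axis
MAX_ECC = 20.0     # hypothetical largest eccentricity covered by the map (deg)
SIGMA = 2.0        # hypothetical tuning width of an SC unit (deg)
FOVEA_SCALE = 3.0  # hypothetical constant of the magnification-like falloff (deg)
CENTERS = [(j + 0.5) * MAX_ECC / N_UNITS for j in range(N_UNITS)]


def sc_population(target):
    """Gaussian population code a_SC for the pre-saccadic target position."""
    return [math.exp(-((c - target) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]


def sc_total_activation(eccentricity):
    """Assumed overall SC activation: larger the closer the target is to the fovea."""
    return FOVEA_SCALE / (abs(eccentricity) + FOVEA_SCALE)


def mean_command(weights, a_sc):
    """Noise-free motor command, read as saccade amplitude in degrees."""
    return sum(w * a for w, a in zip(weights, a_sc))


def train_vertical(n_trials=30000, lr=0.2, lr_baseline=0.1, noise_sd=1.0, seed=0):
    rng = random.Random(seed)
    weights = [1.0] * N_UNITS
    baseline_w = [0.0] * N_UNITS  # predicts the expected reward for each target
    for _ in range(n_trials):
        target = rng.uniform(2.0, 18.0)
        a_sc = sc_population(target)
        xi = rng.gauss(0.0, noise_sd)           # exploration noise
        m_v = mean_command(weights, a_sc) + xi  # executed saccade
        a_pre = sc_total_activation(target)
        a_post = sc_total_activation(target - m_v)
        t_reward = a_post - a_pre               # T = a_post - a_pre
        advantage = t_reward - mean_command(baseline_w, a_sc)
        for j in range(N_UNITS):
            weights[j] += lr * advantage * xi * a_sc[j]
            baseline_w[j] += lr_baseline * advantage * a_sc[j]
    return weights


if __name__ == "__main__":
    w = train_vertical()
    for t in (5.0, 10.0, 15.0):
        print("target", t, "saccade", mean_command(w, sc_population(t)))
```

The reward only says whether a saccade turned out better or worse than expected. The direction of improvement has to be recovered from the trial-to-trial noise, which is why this kind of signal is more generic but slower to learn from than a signed error.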

Learnt weights and model errors

Conclusion
Two possible implementations of feedback for learning:
Performance-based reward
• possible for vertical saccades
• more generic
Vectorial error
• for horizontal saccades
• simple implementation
• specific to brain sub-system
* redoing Robinson (2003) experiment for vertical saccades could tell

Figure source: “Attention and Eye Movement in young Infants: Neural Control and Development” by J.E. Richards and S.K. Hunter; http://cogsci.webedu.ccu.edu.tw/Attention_and_Development_93/Attention and Eye Movement in young Infants.ppt

Sub-cortical and cortical visual systems
• primary visual system, functional after 2 months of age
• secondary visual system, mature at birth
• cerebellum serves both
Figure source: J.J. Hopp and A.F. Fuchs (2004) The characteristics and neuronal substrate of saccadic eye movement plasticity. Progress in Neurobiology, 72: 27-53.
