A possible representation of reward in the learning of saccades
Cornelius Weber and Jochen Triesch
Frankfurt Institute for Advanced Studies, Johann Wolfgang Goethe Universität, Frankfurt am Main, Germany
Presentation at EpiRob 2006, September, Paris
Contents
• saccade learning: supervised or reward-driven?
• separate control of horizontal & vertical saccades
Learning Signals
• generic: unsupervised & reinforcement learning; exploration, development, emergence; learn from environment
• specific: supervised learning; imitation; genetic description or instructor
Saccades in the adult are quite inaccurate
Figure labels: undershoot, overshoot
Data taken from: A. Lewis, R. Garcia and L. Zhaoping (2003) The distribution of visual objects on the retina: connecting eye movements and cone distributions. Journal of Vision, 3, 893-905.
Saccade Learning Signal
• Generic? performance
• Specific? vectorial error
Saccade control downstream of the SC
• SC: superior colliculus
• LLBN: long-lead burst neuron
• EBN: excitatory burst neuron
• IBN: inhibitory burst neuron
• VI: abducens nucleus
• NPH/MVN: cells in nucleus prepositus hypoglossi or medial vestibular nucleus
• OPN: omnipause neuron
• La: latch neurons
• Tr: trigger signal
Figure source: D. Sparks (2002) The brainstem control of saccadic eye movements. Nat Rev Neurosci, 3: 952-64.
Site of plasticity
• SC pre-saccadic activation patch looks like adaptation fields
• adaptation is downstream
• exact error signal unknown
Figure source: M.A. Frens and A.J. Van Opstal (1997). Monkey superior colliculus activity during short-term saccadic adaptation. Brain Research Bulletin 43(5): 473-84.
Figure source: E.A. Vessel (2004) Behavioral and Neural Investigation of Perceptual Effect. www.cns.nyu.edu/~vessel/pubs/
Sensory neuron responses are modulated by reward
• in V1 of adult rat: M.G. Shuler, M.F. Bear (2006) Reward Timing in the Primary Visual Cortex. Science 311, 1606-9.
• in the inferior colliculus (IC) of adult monkey: R.R. Metzger, N.T. Greene, K.K. Porter, J.M. Groh (2006) Effects of Reward and Behavioral Context on Neural Activity in the Primate Inferior Colliculus. J Neurosci 26(28), 7468-76.
Foveal stimuli are magnified on the SC
Figure labels: retina, SC
Constant-sized error allows no feedback of learning progress (Robinson, 2003)
• target shifts 1° backward relative to saccade endpoint
Figure axes: gain change, error size
Figure source: F. Robinson, C. Noto, S. Bevans (2003) Effect of visual error size on saccade adaptation in monkey. J Neurophysiol, 90: 1235-44.
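The point of the constant-error paradigm can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' model or Robinson's analysis: the saccade is reduced to a single gain value, the update is assumed to be proportional to the visual error, and the function name, learning rate and trial count are hypothetical. Because the target always ends up 1° behind the saccade endpoint, the error never shrinks and every update step has the same size, so the error itself carries no information about how far adaptation has progressed.

```python
def adapt_gain_constant_error(gain=1.0, error_deg=-1.0, rate=0.01, trials=5):
    """Toy gain adaptation with a clamped post-saccadic error.

    Assumption: the gain update is proportional to the visual error.
    error_deg = -1.0 stands for the target landing 1 degree behind
    the saccade endpoint on every trial.
    """
    history = [gain]
    for _ in range(trials):
        # The error is clamped by the experimenter, so it is identical on
        # every trial regardless of how much the gain has already changed.
        gain += rate * error_deg
        history.append(gain)
    return history


history = adapt_gain_constant_error()
# Per-trial gain changes: all equal, because the error never shrinks.
steps = [later - earlier for earlier, later in zip(history, history[1:])]
```

In a natural setting the error would decrease as the gain approaches its ideal value; here `steps` stays constant, which is what makes the paradigm useful for probing the learning signal.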
Oblique saccades are a “sum” of horizontal and vertical components
Figure source: lecture "Modelling of sensorimotor systems" by S. Glasauer, Ludwig-Maximilians-Universität München, www.nefo.med.uni-muenchen.de/~sglasauer
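As a purely geometric illustration of the “sum” of components, the sketch below splits a saccade vector into horizontal and vertical parts and recombines them. It assumes a saccade can be described by an amplitude and a direction in degrees; the function names are hypothetical, and the sketch says nothing about the neural dynamics of the two channels.

```python
import math


def decompose_saccade(amplitude_deg, direction_deg):
    """Split an oblique saccade into horizontal and vertical components."""
    theta = math.radians(direction_deg)
    horizontal = amplitude_deg * math.cos(theta)
    vertical = amplitude_deg * math.sin(theta)
    return horizontal, vertical


def compose_saccade(horizontal, vertical):
    """Recombine the two components into amplitude and direction (degrees)."""
    amplitude = math.hypot(horizontal, vertical)
    direction = math.degrees(math.atan2(vertical, horizontal))
    return amplitude, direction


# A 10 degree saccade directed 30 degrees above the horizontal meridian.
h, v = decompose_saccade(10.0, 30.0)
amplitude, direction = compose_saccade(h, v)
```

Treating the two components independently is what allows a model to assign a different learning rule to each.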
Horizontal and vertical control circuits are separate
• riMLF: rostral interstitial nucleus of the medial longitudinal fasciculus
• NIC: interstitial nucleus of Cajal
• MRF: midbrain reticular formation
• PPRF: paramedian pontine reticular formation
• NPH: nucleus prepositus hypoglossi
• Med. RF: medullary reticular formation
• III: oculomotor nucleus
• IV: trochlear nucleus
• VI: abducens nucleus
Figure source: D. Sparks (2002) The brainstem control of saccadic eye movements. Nat Rev Neurosci, 3: 952-64.
Model architecture
Model assumption:
• success-based learning for vertical saccades
• vectorial error for horizontal saccades
Algorithm for vectorial error based learning (horizontal)
• error signal: $D = a_i - a_c$
• weight update: $\Delta w_h \approx D \, a_{SC}$
Figure labels: $a_{SC}$, $m_h$
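A minimal sketch of the horizontal rule, reading the slide as Δw_h ≈ D · a_SC with D = a_i - a_c. Several things here are assumptions rather than details from the slide: a_SC is treated as a list of SC unit activations, a_i and a_c as two scalar activations (the slide does not define them), m_h as a weighted sum of a_SC, and the learning rate `eta` and all function names are hypothetical.

```python
def motor_output(weights, a_sc):
    """Assumed readout: motor command as a weighted sum of SC activations."""
    return sum(w * a for w, a in zip(weights, a_sc))


def update_horizontal_weights(w_h, a_sc, a_i, a_c, eta=0.1):
    """Error-driven update: delta_w_h = eta * D * a_SC, with D = a_i - a_c."""
    d = a_i - a_c  # signed error signal D from the slide
    return [w + eta * d * a for w, a in zip(w_h, a_sc)]


# Hypothetical values for illustration only.
a_sc = [0.0, 1.0, 0.5]   # SC activation pattern
w_h = [0.0, 0.0, 0.0]    # initial horizontal weights
w_h_new = update_horizontal_weights(w_h, a_sc, a_i=0.8, a_c=0.2)
m_h = motor_output(w_h_new, a_sc)  # horizontal motor output after the update
```

Because D is signed, the same rule can both increase and decrease the weights, and only units that were active in the SC change their weights.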
Algorithm for performance reward based learning (vertical)
• reward signal: $T = a_{post} - a_{pre}$
• weight update: $\Delta w_v \approx T \, a_{SC} \, m_v$
Figure labels: $a_{SC}$, $m_v$
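A minimal sketch of the vertical rule, reading the slide as Δw_v ≈ T · a_SC · m_v with T = a_post - a_pre. As assumptions not stated on the slide: a_SC is a list of SC unit activations, a_pre and a_post are scalar activations before and after the saccade, m_v is the scalar vertical motor command that was actually executed, and the learning rate `eta` and the function name are hypothetical.

```python
def update_vertical_weights(w_v, a_sc, m_v, a_pre, a_post, eta=0.1):
    """Reward-modulated update: delta_w_v = eta * T * a_SC * m_v."""
    t = a_post - a_pre  # scalar reward signal T from the slide
    return [w + eta * t * a * m_v for w, a in zip(w_v, a_sc)]


# Hypothetical values for illustration only.
a_sc = [0.0, 1.0, 0.5]   # SC activation pattern
w_v = [0.2, 0.2, 0.2]    # initial vertical weights
m_v = 0.3                # executed vertical motor command

# Activation increased after the saccade: the executed command is reinforced.
w_rewarded = update_vertical_weights(w_v, a_sc, m_v, a_pre=0.4, a_post=0.9)

# Activation decreased after the saccade: the executed command is weakened.
w_punished = update_vertical_weights(w_v, a_sc, m_v, a_pre=0.9, a_post=0.4)
```

Unlike the signed error D in the horizontal rule, T only says whether the outcome improved, not in which direction the command was wrong, so the direction of the weight change comes from the executed command m_v.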
Conclusion
Two possible implementations of feedback for learning:
Performance-based reward
• possible for vertical saccades
• more generic
Vectorial error
• for horizontal saccades
• simple implementation
• specific to brain sub-system *
* redoing Robinson (2003) experiment for vertical saccades could tell
Figure source: “Attention and Eye Movement in young Infants: Neural Control and Development” by J.E. Richards and S.K. Hunter; http://cogsci.webedu.ccu.edu.tw/Attention_and_Development_93/Attention and Eye Movement in young Infants.ppt
Sub-cortical and cortical visual systems
• primary visual system, functional after 2 months of age
• secondary visual system, mature at birth
• cerebellum serves both
Figure source: J.J. Hopp and A.F. Fuchs (2004) The characteristics and neuronal substrate of saccadic eye movement plasticity. Progress in Neurobiology, 72: 27-53.