How Gaze Patterns are Learned. Neuroeconomics. Fixation on Collider. Learning to Adjust Gaze. Changes in fixation behavior are fairly fast, happening over 4-5 encounters (fixations on Rogue get longer, on Safe shorter). Shorter Latencies for Rogue Fixations.
How Gaze Patterns are Learned Neuroeconomics
Learning to Adjust Gaze • Changes in fixation behavior are fairly fast, happening over 4-5 encounters (fixations on Rogue get longer, on Safe shorter).
Shorter Latencies for Rogue Fixations • Rogues are fixated earlier after they appear in the field of view. This change is also rapid.
Neural Circuitry for Saccades [Diagram labels: planning movements; target selection; saccade decision; saccade command; inhibits SC; substantia nigra pc (dopamine); signals to muscles]
Neural Substrate for Learning Neurons at all levels of saccadic eye movement circuitry are sensitive to reward. Neurons in substantia nigra pc in basal ganglia release dopamine. These neurons signal expected reward. This provides the neural substrate for learning gaze patterns in natural behavior, and for modeling these processes using Reinforcement Learning.
Dopaminergic neurons in basal ganglia signal expected reward (Schultz, 2000). [Figure labels: SNpc; expected reward is absent; response to unexpected reward; increased firing for earlier or later reward]
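A minimal sketch of how a reward prediction error could produce the response pattern listed on this slide. It is a generic textbook-style update, not the model behind the Schultz (2000) recordings; the function names, the learning rate `alpha`, and the reward size are all assumptions made for illustration.

```python
def prediction_error(reward, expected):
    """Reward prediction error: received reward minus expected reward."""
    return reward - expected


def train(n_trials, reward=1.0, alpha=0.2):
    """Learn the expected reward for a cue over repeated rewarded trials.

    alpha is a hypothetical learning rate, not a measured quantity.
    """
    expected = 0.0
    errors = []
    for _ in range(n_trials):
        delta = prediction_error(reward, expected)
        errors.append(delta)
        expected += alpha * delta  # move the expectation toward the outcome
    return expected, errors


expected, errors = train(50)
first_error = errors[0]   # reward is unexpected: large positive error
late_error = errors[-1]   # reward is well predicted: error close to zero
# The learned cue is shown but the reward is withheld: negative error.
omission_error = prediction_error(0.0, expected)
```

In this sketch the error is large for an unexpected reward, shrinks as the reward becomes predictable, and turns negative when an expected reward is absent.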
Neural Circuitry for Saccades [Diagram labels: planning movements; target selection; saccade decision; saccade command; inhibits SC; substantia nigra pc; signals to muscles] Substantia nigra pc modulates caudate.
Neurons at all levels of saccadic eye movement circuitry are sensitive to reward. • LIP: lateral intraparietal cortex. Neurons involved in initiating a saccade to a particular location have a bigger response if the reward is bigger or more likely. • SEF: supplementary eye fields • FEF: frontal eye fields • Caudate nucleus in basal ganglia
Cells in caudate signal both saccade direction and expected reward (Hikosaka et al., 2000). Monkey makes a saccade to a stimulus; some directions are rewarded.
This provides the neural substrate for learning gaze patterns in natural behavior, and for modeling these processes using Reinforcement Learning (e.g., Sprague, Ballard, Robinson, 2007).
Modelling Natural Behavior in Virtual Environments. Virtual environments allow direct comparison of human behavior and model predictions in the same natural context. Use Reinforcement Learning models with an embodied agent acting in the virtual environment.
Modelling behaviors using virtual agents (Sprague, Ballard, Robinson, 2007; Rothkopf, 2008) • Assume behavior is composed of a set of sub-tasks.
Model agent after learning • Pick up litter • Follow walkway • Avoid obstacles
Controlling the Sequence of Fixations • Choose the task that reduces uncertainty of reward the most. [Figure labels: obs; can; side]
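The selection rule on this slide can be sketched as a toy scheduler. This is only an illustration under stated assumptions, not the Sprague, Ballard and Robinson (2007) implementation: the `SubTask` fields, the product of reward weight and uncertainty used as a proxy for "reduction in uncertainty of reward", the growth rates, and the reset value `floor` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    name: str
    reward_weight: float  # hypothetical importance of the sub-task
    uncertainty: float    # hypothetical variance of its state estimate
    growth: float         # hypothetical variance added per step without a fixation


def expected_gain(task):
    """Assumed proxy for how much a fixation reduces uncertainty about reward."""
    return task.reward_weight * task.uncertainty


def choose_fixation(tasks):
    """Pick the sub-task whose fixation is expected to help the most."""
    return max(tasks, key=expected_gain)


def step(tasks, floor=0.1):
    """Fixate one sub-task; its uncertainty resets, the others keep growing."""
    target = choose_fixation(tasks)
    for task in tasks:
        if task is target:
            task.uncertainty = floor
        else:
            task.uncertainty += task.growth
    return target.name


tasks = [
    SubTask("avoid obstacles", reward_weight=3.0, uncertainty=1.0, growth=0.5),
    SubTask("follow walkway", reward_weight=1.0, uncertainty=1.0, growth=0.3),
    SubTask("pick up litter", reward_weight=2.0, uncertainty=1.0, growth=0.4),
]
sequence = [step(tasks) for _ in range(6)]
```

With these made-up numbers gaze alternates among the sub-tasks: a highly weighted task is revisited often, while a low-weight task is still fixated once its uncertainty has grown large enough.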
[Figure labels: avatar path; human path] Reward weights estimated from human behavior using Inverse Reinforcement Learning (Rothkopf, 2008).
Detection of signs at intersections results from frequent looks. Shinoda et al. (2001) • Task instructions: “Follow the car.” or “Follow the car and obey traffic rules.” [Figure: time fixating intersection; labels: Road, Car, Roadside, Intersection]
How well do human subjects detect unexpected events? Shinoda et al. (2001) • Detection of briefly presented Stop signs: Intersection P = 1.0, Mid-block P = 0.3 • Greater probability of detection in probable locations • Suggests subjects learn where to attend/look.
[Diagram labels: neural signals; Broca’s area; Wernicke’s area; speech sounds; neural decoder; speech synthesizer]
Different regions of motor cortex control different parts of the body
Record from cells in motor cortex while the monkey controls a robot arm with a joystick. • Find the preferred direction for each cell in the region that is active. • Find the “population vector” = vector sum of preferred directions, each weighted by that cell's firing rate. • Send this signal to the arm. • Add in a correction while the monkey is learning. • Gradually reduce the correction.
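The population-vector step above can be sketched as follows. This is a minimal illustration, not the decoder used in the experiments: the cosine tuning curve, the baseline and gain values, and the eight evenly spaced preferred directions are assumptions chosen so the example is easy to check.

```python
import math


def cosine_tuned_rate(preferred, movement, baseline=10.0, gain=8.0):
    """Assumed tuning curve: a cell fires most for its preferred direction."""
    return baseline + gain * math.cos(movement - preferred)


def population_vector(preferred_angles, firing_rates, baseline=0.0):
    """Sum each cell's preferred-direction unit vector, weighted by its
    firing rate relative to baseline."""
    x = sum((r - baseline) * math.cos(a)
            for a, r in zip(preferred_angles, firing_rates))
    y = sum((r - baseline) * math.sin(a)
            for a, r in zip(preferred_angles, firing_rates))
    return x, y


# Eight hypothetical cells with evenly spaced preferred directions.
preferred = [2 * math.pi * k / 8 for k in range(8)]
movement = math.radians(60)  # intended movement direction
rates = [cosine_tuned_rate(p, movement) for p in preferred]

x, y = population_vector(preferred, rates, baseline=10.0)
decoded_deg = math.degrees(math.atan2(y, x))
```

Under these assumptions the decoded direction matches the intended movement, even though no single cell prefers exactly that direction.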