Visual Perception: what do we want to explain?

How do we get visual information from the world and use it to control behavior? What neural processes underlie visually guided behavior?

Traditional sub-areas: visual sensitivity, color vision, spatial vision, temporal vision, binocular vision/depth perception, texture perception, motion perception, surfaces, segmentation, object perception, attention, perceptual learning.

Visual control of movement: eye movements, reaching, attention.


Sources:
Kandel, Schwartz & Jessell, Principles of Neural Science, McGraw-Hill, 4th ed.
Gazzaniga, Ivry, Mangun, Cognitive Neuroscience, Norton, 3rd ed.
Squire, Berg, Bloom, du Lac, Ghosh, Spitzer, Fundamental Neuroscience, Academic Press, 3rd ed.
Rosenbaum, Human Motor Control, Academic Press, 2nd ed.

Readings:
Class 1: Gazzaniga et al., Chapter 5, pp. 177-188; Kandel et al., Chapters 27, 28
Class 2: Squire et al., Ch. 46, 47

The Eye and Retina

Cone photoreceptors are densely packed in the central fovea. Note: despite the lower density of cones in the peripheral retina, color vision is basically the same across the visual field.

Visual acuity matches photoreceptor density. (Figure: relative visual acuity; receptor density.)

Eye movements

Retinotopic Organization and Cortical Magnification
The brain uses more physical space for signals from the fovea than from the periphery. Adjacent points in the world project to adjacent points in cortex.
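A rough numerical sketch of cortical magnification follows. It assumes the commonly used inverse-linear form M(E) = a / (E + e2), where E is eccentricity in degrees. The function name and the parameter values (a_mm, e2_deg) are illustrative assumptions, not figures from the lecture.

```python
def cortical_magnification(eccentricity_deg, a_mm=17.3, e2_deg=0.75):
    """Millimetres of cortex per degree of visual angle at a given eccentricity.

    Assumed inverse-linear form M(E) = a / (E + e2). The default parameter
    values are illustrative assumptions, not taken from the lecture.
    """
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return a_mm / (eccentricity_deg + e2_deg)


if __name__ == "__main__":
    # Magnification falls steeply from the fovea toward the periphery.
    for ecc in (0.0, 1.0, 5.0, 20.0):
        print(f"{ecc:5.1f} deg -> {cortical_magnification(ecc):6.2f} mm/deg")
```

Whatever the exact parameters, the point is qualitative: a degree of visual angle at the fovea gets far more cortex than a degree in the periphery.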

Signals from each eye are adjacent in LGN but remain segregated in different layers. Convergence occurs in V1. Two kinds of cells in the retina project to different layers in LGN: M = magno = big, P = parvo = small. (K = konio.)

Major transformations of the light signal in the retina:
1. Temporal filtering: reduced response to high temporal frequencies. Temporal integration: a strong 1 msec flash is equivalent to a weaker 50 msec flash.
2. Light adaptation: sensitivity regulation, i.e. adjustment of the operating range to the mean light level. (Light level: 10^10 range; ganglion cells: 10^2 range.)
3. Anatomical organization of photoreceptors provides high acuity in the fovea with rapid fall-off in the periphery (photoreceptor density).
4. Convergence of photoreceptors onto ganglion cells also leads to acuity limitations in the peripheral retina (1 cone per midget cell in the fovea).
5. Organization of the 3 cone photoreceptor types into color-opponent signals (Luminance, Red-Green, Yellow-Blue).
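The temporal integration point can be illustrated with a toy calculation: an idealized integrator that sums light over a fixed window responds to intensity x duration, so a brief strong flash and a longer weak flash of equal total energy are equivalent. The function name, the intensity units, and the 100 msec window are assumptions made for this sketch.

```python
def integrated_response(intensity, duration_ms, window_ms=100.0):
    """Total light collected by an idealized integrator.

    Within the integration window the response depends only on
    intensity * duration. window_ms is an assumed value for illustration.
    """
    effective_ms = min(duration_ms, window_ms)
    return intensity * effective_ms


# A strong 1 msec flash versus a 50x weaker 50 msec flash (arbitrary units).
strong_brief = integrated_response(intensity=50.0, duration_ms=1.0)
weak_long = integrated_response(intensity=1.0, duration_ms=50.0)
print(strong_brief, weak_long)
```

Beyond the window the trade-off breaks down: lengthening the flash further no longer increases the integrated response in this toy model.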

Magno and parvo cells have different spatial and temporal sensitivities. The function of the different M and P pathways is unclear. Note: attempts to isolate a pathway psychophysically were unsuccessful.

Visual consequences of lesions at different locations in the visual pathway: contralateral hemianopsia (cf. homonymous hemianopia); foveal sparing.

Visual cortex is a layered structure (6 layers). LGN inputs arrive in Layer 4. Layers 2 and 3 output to higher visual areas. Layers 5 and 6 output to sub-cortical areas (e.g. superior colliculus). Massive feedback projection from Layer 6 to LGN: the 800 lb gorilla. (Also extensive feedback inputs from extra-striate cortex.)

Cells in V1 respond to moving or flashing oriented bars. Little response to diffuse flashes or steady lights. (Note: this established the paradigm for neurophysiological investigation; cf. natural vision.)

LGN cells have circular receptive fields, like the retina. It is not clear what the role of the LGN is. (Murray Sherman: gates input to cortex.) Oriented cells emerge in V1, probably composed of appropriately aligned LGN cells as shown. (Usrey: dual cell recordings.)

Orderly anatomical organization in V1
Cells in V1 are organized into columns. Orientation preference gradually changes as one progresses across cortex. Cells at different depths have the same orientation preference.
Binocular convergence: cells respond more or less to R and L eye inputs. Ocular dominance varies smoothly across the cortical surface, orthogonal to the orientation variation.

Regular large scale organization of orientation preference across cortical surface. Does this simplify signal processing?

What is V1 doing?
Early idea: edge detectors, a basis for more complex patterns.
Later (1970s-80s): spatial frequency channels. Any spatial pattern can be composed of a sum of sinusoids.
Late 90s to now: the main idea about V1 is that it represents an efficient recoding of the information in the visual image. Images are not random. Random images would require a point-by-point representation, like a camera. Images have clusters of similar pixels, and cells are designed to pick this up. Cells extract information about spatial variation at different scales (clusters of different sizes). We can think of receptive fields as “basis functions” (an alphabet of elemental image components that capture clustering in local image regions).
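The claim that any spatial pattern can be composed of a sum of sinusoids can be checked on a simple case. The sketch below builds a sharp-edged square-wave grating from its odd sine harmonics (a standard Fourier series) and measures how the approximation improves as harmonics are added. The function names and the sampling range are choices made for this illustration.

```python
import math


def square_wave_approx(x, n_harmonics):
    """Partial Fourier sum of a unit square wave with period 1 at position x."""
    total = 0.0
    for i in range(n_harmonics):
        k = 2 * i + 1  # a square wave contains odd harmonics only
        total += (4.0 / (math.pi * k)) * math.sin(2.0 * math.pi * k * x)
    return total


def max_error(n_harmonics, samples=200):
    """Largest deviation from the ideal value (+1), sampled away from the edges."""
    worst = 0.0
    for j in range(samples):
        x = 0.1 + 0.3 * j / (samples - 1)  # positions in [0.1, 0.4]
        worst = max(worst, abs(square_wave_approx(x, n_harmonics) - 1.0))
    return worst


if __name__ == "__main__":
    for n in (1, 3, 10, 50):
        print(f"{n:3d} harmonics: max error {max_error(n):.4f}")
```

Low spatial frequencies carry the coarse light/dark alternation; the higher harmonics sharpen the edges.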

Approximating an image patch with basis functions
The outputs of 64 cells in the LGN can be coded with only twelve V1 cells, where each cell has 64 synapses. (LGN: thalamic nucleus. V1: striate cortex.)

The neural coding library of learned RFs
Because there are more RFs than we need (overcomplete: 192 vs 64), the number of cells that need to send spikes at any moment is sparse (12 vs 64).
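A minimal sketch of sparse coding with an overcomplete library, using the slide's sizes (64 inputs, 192 cells, 12 active). Two assumptions are swapped in for illustration: random unit vectors stand in for the learned RFs, and greedy matching pursuit stands in for whatever selection process the cortex uses. All names are hypothetical.

```python
import math
import random


def unit_vector(rng, dim):
    """Random unit-length vector (a stand-in for one learned receptive field)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def energy(v):
    return sum(x * x for x in v)


def matching_pursuit(patch, dictionary, n_active):
    """Greedily pick up to n_active atoms; return ({index: coefficient}, residual)."""
    residual = list(patch)
    coeffs = {}
    for _ in range(n_active):
        # Find the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(scores[i]))
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


rng = random.Random(0)
N_INPUTS, N_CELLS, N_ACTIVE = 64, 192, 12
dictionary = [unit_vector(rng, N_INPUTS) for _ in range(N_CELLS)]

# Build a test patch as a weighted sum of 12 randomly chosen atoms.
patch = [0.0] * N_INPUTS
for cell in rng.sample(range(N_CELLS), N_ACTIVE):
    weight = rng.uniform(0.5, 1.5)
    patch = [p + weight * a for p, a in zip(patch, dictionary[cell])]

coeffs, residual = matching_pursuit(patch, dictionary, N_ACTIVE)
explained = 1.0 - energy(residual) / energy(patch)
print(f"{len(coeffs)} active cells explain {explained:.1%} of the patch energy")
```

Each greedy step removes energy from the residual, so a handful of active cells out of 192 can account for much of a 64-value patch.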

More complex analysis of image properties in higher visual areas (extra-striate)
Defining visual areas: retinotopic responses, anatomical projections.
Note: the old simplistic view (one area, one attribute) is not true. Areas are selective in complex and poorly understood ways. Note the case of Mike May.

Mike May: world speed record for downhill skiing by a blind person. Lost vision at age 3 (scarred corneas). Optically 20/20, functionally 20/500 (cf. amblyopia).
Answer to Molyneux’s question: Mike May couldn’t tell the difference between a sphere and a cube. He improved, but does it logically rather than perceptually (cf. other cases).
Color: an orange thing on a basketball court must be a ball.
Motion: can detect moving objects and distinguish different speeds (structure from motion).
Note: fMRI shows no activity in infero-temporal cortex (corresponding to pattern recognition), but there is activity in MT and MST (motion areas) and V4 (color). Other parts of the brain take over when a cortical area is inactive.
Cannot recognize faces (eyes and movement of the mouth are distracting). Can’t perceive distance very well. Can’t recognize perspective. No size constancy or lightness constancy; segmentation of the scene into objects and shadows is difficult.
Vision is most useful for catching balls and for finding things if he drops them.

Connections are bi-directional. (Felleman and Van Essen 85. Figure label: Hippocampus.)

Major sub-division into dorsal and ventral pathways

Cells in MT are sensitive to motion in particular directions. Cells are also tuned for particular speeds.

Methods for measuring motion sensitivity: % motion and direction range.
Direction range: sample randomly from directions over e.g. a 90 deg range.
MT lesions lead to deficits in motion perception, though often only a transient loss.
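The two stimulus manipulations can be sketched as a generator of per-dot motion directions. This is a hedged illustration: the function name and parameters are hypothetical, and combining both manipulations in one function is a convenience of the sketch, not necessarily how the experiments were run.

```python
import random


def dot_directions(n_dots, percent_motion, signal_direction_deg,
                   direction_range_deg, rng):
    """Return one motion direction (in degrees) per dot.

    percent_motion: percentage of dots that carry the signal.
    direction_range_deg: signal dots sample uniformly from a range of this
    width centered on signal_direction_deg (0 means identical directions).
    The remaining dots move in uniformly random directions.
    """
    n_signal = round(n_dots * percent_motion / 100.0)
    half = direction_range_deg / 2.0
    directions = []
    for i in range(n_dots):
        if i < n_signal:
            d = signal_direction_deg + rng.uniform(-half, half)
        else:
            d = rng.uniform(0.0, 360.0)
        directions.append(d % 360.0)
    return directions


if __name__ == "__main__":
    rng = random.Random(1)
    # 30% of dots move upward (90 deg) within a 90 deg range; the rest are noise.
    sample = dot_directions(10, 30, 90.0, 90.0, rng)
    print([round(d) for d in sample])
```

Lowering percent_motion or widening direction_range_deg both make the global direction harder to judge, which is what makes them useful for measuring thresholds.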

The Aperture Problem
Cells in V1 can only detect the component of motion orthogonal to the preferred orientation of the receptive field, so the output is ambiguous. MT is thought to resolve this ambiguity by combining motion signals from different V1 cells. Integration of features (corners) is also used.
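One way to see why combining cells helps: each oriented cell constrains only the velocity component along its normal, but two cells with different orientations give two linear equations in the two unknown velocity components. The sketch below solves that pair (often called intersection of constraints). It is an illustration of the geometry under these assumptions, not a claim about how MT actually computes; the function names are hypothetical.

```python
import math


def normal_speed(velocity, normal_deg):
    """Velocity component along a unit normal: all that one oriented cell can measure."""
    nx = math.cos(math.radians(normal_deg))
    ny = math.sin(math.radians(normal_deg))
    return velocity[0] * nx + velocity[1] * ny


def intersect_constraints(normal1_deg, speed1, normal2_deg, speed2):
    """Solve v . n1 = s1 and v . n2 = s2 for the 2-D velocity v (Cramer's rule)."""
    a, b = math.cos(math.radians(normal1_deg)), math.sin(math.radians(normal1_deg))
    c, d = math.cos(math.radians(normal2_deg)), math.sin(math.radians(normal2_deg))
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("normals are parallel; the velocity stays ambiguous")
    vx = (speed1 * d - b * speed2) / det
    vy = (a * speed2 - c * speed1) / det
    return vx, vy


if __name__ == "__main__":
    true_velocity = (3.0, 1.0)
    s1 = normal_speed(true_velocity, 0.0)
    s2 = normal_speed(true_velocity, 60.0)
    print(intersect_constraints(0.0, s1, 60.0, s2))
```

A single measurement is consistent with many velocities: (3.0, 1.0) and (3.0, -5.0) give the same normal speed for a cell whose normal points along 0 deg.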

Two ways of perceiving motion.
A. When the eyes are held still, the image of a moving object traverses the retina. Information about movement depends upon sequential firing of receptors in the retina. (MT)
B. When the eyes follow an object, the image of the moving object falls on one place on the retina and the information is conveyed by movement of the eyes or the head. (MST: output of cells goes to brainstem regions controlling pursuit eye movements.)

Motion of the body in the world introduces characteristic motion patterns in the image. MST is sensitive to these patterns.

Lesions in monkey MST lead to deficits in pursuit eye movements. Right occipito-parietal lesions in humans lead to similar deficits in pursuit eye movements.

Figure labels: dorsal: optic flow patterns. Ventral: output to pursuit system; motion of animate agents.

http://www.michaelbach.de/ot/col_equilu/index.html

MST has input from the vestibular system. Thus the cells have information about self motion from sources other than the flow field. Many cortical areas have inputs from eye movement signals as well, even as early as V1. Presumably this is responsible for the ability of the visual system to process image information independent of image motion on the retina.

Perception of Depth
Monocular cues: familiar size, occlusion, geometric perspective, shading, motion parallax.
Global (distance) and local (shape) aspects.

Stereopsis
Note the developmental sensitivity of stereo vision: strabismus, amblyopia.
Stereo sensitivity is one of the hyperacuities. Motion parallax is a little less sensitive but probably important because it is ubiquitous.
Panum’s fusional areas/binocular rivalry.

Disparity: a measure of depth difference.
Disparity = angle a - angle b
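A worked example of disparity as a difference of two angles. The sketch assumes that angles a and b are the angles subtended at the two eyes by two points lying straight ahead at different distances, and it assumes an interocular separation of 6.5 cm. Both are assumptions of this illustration, as are the function names.

```python
import math


def vergence_angle_deg(distance_m, interocular_m=0.065):
    """Angle (deg) between the two eyes' lines of sight to a point straight ahead."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))


def disparity_deg(distance_a_m, distance_b_m, interocular_m=0.065):
    """Disparity = angle a - angle b for two points on the midline."""
    return (vergence_angle_deg(distance_a_m, interocular_m)
            - vergence_angle_deg(distance_b_m, interocular_m))


if __name__ == "__main__":
    # Two points 1 cm apart in depth, viewed from about 1 m.
    d = disparity_deg(1.00, 1.01)
    print(f"disparity: {d:.4f} deg ({d * 3600:.0f} arcsec)")
```

For small angles the disparity is close to I x (depth difference) / (distance squared) in radians, so the same depth step yields a much smaller disparity at far viewing distances.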

Neural computation of disparity is complex and not well understood. Disparity signals are found in V1, V2, and MT.

Investigation of stereo vision using random dot stereograms
Such stimuli have no monocular information and so are experimentally useful for isolating stereo processes, but they have the disadvantage that they are harder to see in depth than usual stimuli.
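A minimal sketch of how such a stimulus can be built: both eyes get the same random dots, except that a central square is shifted horizontally in one eye's image. The image size, square size, and shift below are arbitrary illustrative values, and the function name is hypothetical.

```python
import random


def random_dot_stereogram(size, square, shift, rng):
    """Return (left, right) binary images as lists of rows.

    A central square of the left image is copied into the right image
    displaced leftward by `shift` pixels. The strip uncovered by the shift
    is filled with fresh random dots, so neither image alone shows a square.
    """
    lo = (size - square) // 2
    hi = lo + square
    if not (0 < shift <= square) or lo - shift < 0:
        raise ValueError("shift must be positive and fit inside the image")
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    for y in range(lo, hi):
        for x in range(lo, hi):
            right[y][x - shift] = left[y][x]
        for x in range(hi - shift, hi):
            right[y][x] = rng.randint(0, 1)
    return left, right


if __name__ == "__main__":
    left, right = random_dot_stereogram(size=20, square=8, shift=2,
                                        rng=random.Random(3))
    for l_row, r_row in zip(left, right):
        print("".join(".#"[v] for v in l_row), " ",
              "".join(".#"[v] for v in r_row))
```

The only cue to the square is the correlation between the two images at a horizontal offset, which is why each eye's view alone carries no shape information.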

Perception of Surfaces
Subjective contours: cells in V2 respond to subjective contours.

The general idea about visual processing is that it is organized in a hierarchical fashion, with progressively more abstract representation of information. However, there is no real understanding of how this happens in terms of neural circuits. Some recent modeling is suggestive and usually relies on extensive perceptual learning. Note that this conceptualization is entirely feedforward.

MT/MST (motion), V4 (color), infero-temporal cortex

Cortical specialization
