Sensory and Motor Systems (G80.2202) Psychophysics of Early & Mid-level Vision

Instructor: Nava Rubin

Early Psychophysics (history-wise and level-wise)

Weber’s Law (1834): let $i$ denote stimulation intensity, and let $\Delta i$ denote the minimal increase in intensity that an observer can detect; the following holds:

$\Delta i / i = \text{constant}$

Fechner’s insight (1860): this corresponds to ‘constant increments of sensation’, $\Delta s$. Therefore:

$\Delta s = k \, \Delta i / i$

From here we can deduce the form of $s(i)$: integrate, $s = k \log(i) + C$
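A minimal numerical sketch of Fechner's argument, assuming a hypothetical Weber fraction of 0.1 (the slide gives no value). If each just-noticeable increment counts as one unit of sensation, the number of steps between two intensities depends only on their ratio, which is what the logarithmic form expresses. The function names are illustrative only.

```python
import math

WEBER_FRACTION = 0.1  # assumed value for illustration; not from the lecture


def count_jnd_steps(i_start, i_end, weber_fraction=WEBER_FRACTION):
    """Count just-noticeable increments needed to climb from i_start to i_end.

    Each step raises the intensity by delta_i = weber_fraction * i (Weber's law).
    """
    steps = 0
    intensity = i_start
    while intensity * (1.0 + weber_fraction) <= i_end:
        intensity *= 1.0 + weber_fraction
        steps += 1
    return steps


def fechner_sensation(i, i_start, weber_fraction=WEBER_FRACTION):
    """Closed form s = k * log(i / i_start), with k = 1 / log(1 + weber_fraction)."""
    return math.log(i / i_start) / math.log(1.0 + weber_fraction)


if __name__ == "__main__":
    # Equal intensity ratios should give equal step counts.
    for low, high in [(10, 100), (100, 1000), (1000, 10000)]:
        print(low, high, count_jnd_steps(low, high), round(fechner_sensation(high, low), 2))
```

Counting steps and evaluating the closed form agree up to rounding down to a whole number of steps, so each tenfold increase in intensity adds roughly the same amount of "sensation".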

Weber–Fechner law: S = k ln(i) [sensation ∝ log(intensity of stimulation)]
Example of a behavioral derivation of a neurally-based law (measured physiologically only later*)
Ernst Weber (1795–1878), Gustav Fechner (1801–1887)

Luminance Gain Control (aka “light adaptation”)
[Figure (schematic): sensitivity and normalized RGC response (0 to 1) as a function of luminance in the RF center (1 to 1000), for background luminances of 100, 1000 and 10,000; “center” and “surround” regions labeled.]
Since luminance can potentially vary over an extremely wide range, the visual system (specifically, the RGCs) adjusts its sensitivity to match the locally prevalent luminance. This is done by roughly dividing the (within-RF) luminance by the local mean luminance of the immediate surround (a few degrees outside the RF).
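The divisive step described here can be sketched as follows. This is a deliberately bare illustration: real RGC responses also saturate and adapt over time, and the function names are hypothetical. It shows only that dividing by the local mean makes a spot of fixed relative intensity produce the same drive at any background level.

```python
def rgc_drive(center_luminance, surround_mean_luminance):
    """Divisive luminance gain control: center luminance relative to the local mean."""
    if surround_mean_luminance <= 0:
        raise ValueError("surround mean luminance must be positive")
    return center_luminance / surround_mean_luminance


def drives_across_backgrounds(relative_increment, backgrounds):
    """Drive for a spot that is (1 + relative_increment) times each background."""
    return [rgc_drive(b * (1.0 + relative_increment), b) for b in backgrounds]


if __name__ == "__main__":
    # A spot 20% brighter than its background, at three background levels.
    print(drives_across_backgrounds(0.2, [100.0, 1000.0, 10000.0]))
```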

Contrast Gain Control
Contrast gain control begins in the retina and is strengthened at subsequent stages of the visual system. It roughly divides the responses by a measure that grows with the locally prevalent root-mean-square (r.m.s.) contrast, i.e., the standard deviation of the stimulus luminance divided by the mean luminance.
http://www.uni-mannheim.de/fakul/psycho/irtel/cvd/C4700.html
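A small sketch of the r.m.s. contrast measure and of one possible divisive form. The slide says only that responses are divided by a measure that grows with r.m.s. contrast; the specific denominator used here (a hypothetical `semisaturation` constant plus the r.m.s. contrast) is an assumption chosen for illustration.

```python
import statistics


def rms_contrast(luminances):
    """Standard deviation of luminance divided by mean luminance."""
    mean = statistics.mean(luminances)
    if mean <= 0:
        raise ValueError("mean luminance must be positive")
    return statistics.pstdev(luminances) / mean


def normalized_response(linear_response, local_rms_contrast, semisaturation=0.05):
    """Divide a response by a term that grows with local r.m.s. contrast.

    The semisaturation constant is an assumed parameter, not from the lecture.
    """
    return linear_response / (semisaturation + local_rms_contrast)


if __name__ == "__main__":
    print(rms_contrast([90.0, 110.0]))    # low-luminance patch
    print(rms_contrast([900.0, 1100.0]))  # same pattern, ten times brighter
```

Because the measure is a ratio, scaling all luminances by a common factor leaves the r.m.s. contrast unchanged, while a higher local contrast lowers the normalized response to a fixed input.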

Young & Helmholtz: Experiments in Additive Colors (schematic)
“The Trichromatic Theory of Color”

Color metamers: Two different spectral distributions that produce the same perceived color (in a given observer)
Equivalently: two different spectral distributions that produce the same stimulation of the L, M and S cones (of a given observer)
Example: Yellow (~570 nm, or a mix of red and green)
[Figure labels: ‘S’, ‘M’, ‘L’]
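A toy construction of a metamer pair, under loud assumptions: the cone sensitivities are made-up Gaussians (not measured cone fundamentals), and spectra are sampled at only four wavelengths. With three cone types and four samples there is one direction in spectrum space that all three cones are blind to; adding it to a spectrum changes the light but not the cone responses.

```python
import math

WAVELENGTHS = [450.0, 500.0, 550.0, 600.0]      # nm, coarse illustrative sampling
PEAKS = {"L": 560.0, "M": 530.0, "S": 420.0}    # assumed peak wavelengths


def cone_sensitivity(wavelength_nm, peak_nm, width_nm=40.0):
    """Hypothetical Gaussian sensitivity curve (an assumption, not real data)."""
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)


def cone_responses(spectrum):
    """L, M, S responses: sensitivity-weighted sums over the sampled spectrum."""
    return [sum(p * cone_sensitivity(w, PEAKS[c]) for p, w in zip(spectrum, WAVELENGTHS))
            for c in ("L", "M", "S")]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def invisible_direction():
    """Vector orthogonal to all three sensitivity rows (signed 3x3 minors)."""
    rows = [[cone_sensitivity(w, PEAKS[c]) for w in WAVELENGTHS] for c in ("L", "M", "S")]
    return [(-1) ** j * det3([[row[k] for k in range(4) if k != j] for row in rows])
            for j in range(4)]


spectrum_a = [1.0, 1.0, 1.0, 1.0]
direction = invisible_direction()
scale = 0.5 / max(abs(d) for d in direction)    # keeps spectrum_b non-negative
spectrum_b = [a + scale * d for a, d in zip(spectrum_a, direction)]

if __name__ == "__main__":
    print(spectrum_a, cone_responses(spectrum_a))
    print(spectrum_b, cone_responses(spectrum_b))
```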

Ewald Hering (1834–1918):
• Why does red produce a greenish after-effect? (and vice versa)
• Why does yellow produce a bluish after-effect? (and vice versa)
• Why do we perceive the superposition of ‘basic’ colors as “white”?
  - What does ‘white’ mean? (Is it a property of the ‘outside’ world, or a property of our perceptual machinery?)
Hering’s Theory:
• The visual system generates color signals in opponent pairs (yellow-blue, red-green, white-black).
• At the time, it was seen by many to compete with the trichromatic theory, but Hering held that both theories could be valid.
• We now know he was correct: the two theories simply describe visual processes that occur at different levels. But it was not until much later in the twentieth century that neural experiments proved him correct.

Color adaptation (‘after-effects’): a demo

The Atomistic Approach to Psychophysics: the search for “atoms” of perception
Wilhelm Wundt (1832–1920)

Limitations of the “atomistic” approach: Color

Color Constancy: the tendency of surfaces to preserve their perceived color even when their emission spectrum changes dramatically (because of a change in the spectrum of the light they are reflecting)

Limitations of the “atomistic” approach: Brightness
(Figure: E.H. Adelson, MIT)

The Atomistic Approach, take 2: explaining visual perceptual phenomena with independent filters / channels
Fergus Campbell (1924–1993)

Detection Thresholds and Linear Systems Analysis
Graham N & Nachmias J (1971). Detection of grating patterns containing two spatial frequencies: a comparison of single-channel and multiple-channels models. Vision Research 11(3), 251-9. (script 1)

Detection Thresholds and Linear Systems Analysis
[Figure panels: components; ‘single channel’ prediction; ‘multi-channel’ prediction]
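The contrast between the two predictions can be sketched for a compound of a frequency f and its third harmonic 3f. This is a simplified reading, not the paper's actual model: the single-channel rule is taken to depend on the peak of the summed waveform (so it is phase dependent), and the multi-channel rule says the compound is detected when either component reaches its own threshold (so it is phase independent). Probability summation between channels is ignored, and all thresholds are hypothetical.

```python
import math


def compound_peak(amp_f, amp_3f, relative_phase, samples=3600):
    """Peak of amp_f*sin(x) + amp_3f*sin(3x + relative_phase) over one period."""
    return max(
        amp_f * math.sin(x) + amp_3f * math.sin(3 * x + relative_phase)
        for x in (2 * math.pi * n / samples for n in range(samples))
    )


def single_channel_detects(amp_f, amp_3f, relative_phase, peak_criterion):
    """One broadband channel: detection depends on the peak of the summed waveform."""
    return compound_peak(amp_f, amp_3f, relative_phase) >= peak_criterion


def multi_channel_detects(amp_f, amp_3f, threshold_f, threshold_3f):
    """Independent channels: detection when either component reaches its own threshold."""
    return amp_f >= threshold_f or amp_3f >= threshold_3f


if __name__ == "__main__":
    for phase in (0.0, math.pi):
        print(phase, compound_peak(1.0, 1.0, phase))
```

With equal component amplitudes the peak differs by roughly 30% between the two phase relations, so a single-channel observer would show different thresholds for the two compounds, while a multi-channel observer would not.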

Results:

The appeal of the ‘multiple channel’ approach is tightly linked to the expectation that the response of the system to a ‘compound’ stimulus can be predicted from its response to the constituent components (e.g., in the case of a visual pattern, from the response to its Fourier components).
• Such a system is called a linear system: R(A + B + C + …) = R(A) + R(B) + R(C) + …
• Another way to put it is that the channels are expected to be non-interacting, i.e. that the response of one channel (to its own component) does not depend on the input to the other channels.
• How valid is this expectation [assumption] for sensation and perception?
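The superposition property can be checked numerically on toy response functions. Both functions below are hypothetical stand-ins: a weighted sum (linear) and the same sum passed through a saturating nonlinearity. A single failing pair of stimuli is enough to reject linearity, whereas a passing pair does not prove it.

```python
def linear_channel(stimulus, weights):
    """Weighted sum of the stimulus samples: a linear response."""
    return sum(w * s for w, s in zip(weights, stimulus))


def saturating_channel(stimulus, weights):
    """Same weighted sum followed by a compressive nonlinearity (assumed form)."""
    drive = linear_channel(stimulus, weights)
    return drive / (1.0 + abs(drive))


def obeys_superposition(response, a, b, weights, tol=1e-9):
    """Check R(A + B) == R(A) + R(B) for one pair of stimuli."""
    combined = [x + y for x, y in zip(a, b)]
    lhs = response(combined, weights)
    rhs = response(a, weights) + response(b, weights)
    return abs(lhs - rhs) < tol


if __name__ == "__main__":
    a, b, w = [1.0, 0.0, 2.0], [0.0, 3.0, 1.0], [0.5, -0.25, 1.0]
    print(obeys_superposition(linear_channel, a, b, w))
    print(obeys_superposition(saturating_channel, a, b, w))
```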

Limitations of Linear Systems Approach: the role of relative phase
Piotrowski LN & Campbell FW (1982). A demonstration of the visual importance and flexibility of spatial-frequency amplitude and phase. Perception 11(3), 337-46.
Why is relative phase so crucial to appearance? Hint: what is the Fourier spectrum of an edge? (and of 1/f noise?)
And why is threshold detection nonetheless linear? (script 2)
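One way to explore the hint numerically, offered as a sketch rather than the intended answer: a square-wave edge is a sum of odd harmonics with amplitudes falling as 1/k and phases aligned. Keeping the same amplitudes but randomizing the phases leaves the power unchanged while removing the sharp edge. The harmonic cutoff, sample count and random seed are arbitrary choices.

```python
import math
import random

HARMONICS = list(range(1, 40, 2))  # odd harmonics 1, 3, ..., 39 (arbitrary cutoff)


def waveform(x, phases):
    """Sum of odd harmonics with 1/k amplitudes and the given phases."""
    return sum(math.sin(k * x + p) / k for k, p in zip(HARMONICS, phases))


def max_abs_slope(phases, samples=2048):
    """Steepest slope over one period; the derivative is sum of cos(k*x + p)."""
    return max(
        abs(sum(math.cos(k * x + p) for k, p in zip(HARMONICS, phases)))
        for x in (2 * math.pi * n / samples for n in range(samples))
    )


def mean_power(phases, samples=2048):
    """Mean squared value over one period; depends on amplitudes, not phases."""
    return sum(waveform(2 * math.pi * n / samples, phases) ** 2
               for n in range(samples)) / samples


aligned = [0.0] * len(HARMONICS)                       # edge-like (square wave)
rng = random.Random(0)
scrambled = [rng.uniform(0.0, 2 * math.pi) for _ in HARMONICS]  # noise-like

if __name__ == "__main__":
    print("power:", mean_power(aligned), mean_power(scrambled))
    print("max slope:", max_abs_slope(aligned), max_abs_slope(scrambled))
```

The two signals share an amplitude spectrum, so any measure that ignores phase cannot tell them apart, yet only the phase-aligned one contains an edge.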

Different levels of visual processing [(i) the boundaries are not 100% sharp; (ii) there is no universal agreement on definitions]
Low-level: processes that are achieved by an array of filters that have relatively small receptive fields, tile the visual field (w/ overlap), and are non- or minimally-interacting. Examples: center-surround contrast detection in LGN; orientation selectivity in V1.
Mid-level: processes that (i) group visual information about surface fragments that are disjoint in space and/or time; (ii) segment visual information into separate spatial and temporal entities.
High-level: visual recognition processes; rely on prior knowledge of specific objects or classes of objects (their visual properties, semantic and/or lexical knowledge).
Example . . . . .

One Object or Two Sets of Lines? (Adapted from Lorenceau and Shiffrar 1992)
[Demo controls: More lines; Show all?; Will dots help?]

Mid-level visual processing: revisit definition
Low-level: processes that are achieved by an array of filters that have relatively small receptive fields, tile the visual field (w/ overlap), and are non- or minimally-interacting. Examples: center-surround contrast detection in LGN; orientation selectivity in V1.
Mid-level: processes that (i) group visual information about surface fragments that are disjoint in space and/or time*; (ii) segment visual information into separate spatial and temporal entities.
→ Requires compilation of visual information from spatially and/or temporally disparate sources. a.k.a.: “Perceptual Organization”; “Gestalt processing”; …
* Note: earliest in the visual pathway (i.e., retina), even physically contiguous surface portions may not be represented as a unitary entity (‘thing’), and therefore an overall change in neural representation may need to occur in cortex.
High-level: visual recognition processes; rely on prior knowledge of specific objects or classes of objects (their visual properties, semantic and/or lexical knowledge).

The Gestalt Psychology Movement (Wertheimer, Köhler, Koffka, …):
Perceptions are Gestalts*
* “a whole that is more than the sum of its parts”
Put differently: PERCEPTION IS FUNDAMENTALLY NON-LINEAR (the atomistic approach is doomed)
Emphasis on “perceptual organization”

Sensory and Motor Systems (G80.2202) Psychophysics of Early & Mid-level Vision Part ii

Motion Integration and Segmentation: Plaids (Wallach 1935, 1976; Adelson & Movshon 1982; Hupé & Rubin 2003)
Reminder: show diff Alphas 1

Edges in Motion: Segmentation & integration in real-world images (Rubin and Albert, VSS 2001)

Global Motion Processing:
Local velocity measurements are ambiguous and do not convey veridical information about the object’s global motion.
“The aperture problem” (Marr & Ullman 1981) is present not only for straight lines. It is really just a subset of:
“The correspondence problem” (Ullman 1979)
“The identity problem” (Wallach 1935, 1976)
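The ambiguity can be made concrete with a little vector arithmetic. Viewed through a small aperture, a moving straight contour reveals only the velocity component normal to it, so any two object velocities that differ by motion along the contour look the same. The second function shows one commonly described way to combine two such 1D measurements into a 2D velocity (an intersection-of-constraints solution); it is included as an illustration and not as a claim about how the visual system does it. All names are hypothetical.

```python
import math


def contour_normal(orientation_deg):
    """Unit normal to a contour whose direction makes this angle with the x-axis."""
    theta = math.radians(orientation_deg)
    return (-math.sin(theta), math.cos(theta))


def normal_speed(velocity, orientation_deg):
    """The only component a local detector can measure: velocity along the normal."""
    nx, ny = contour_normal(orientation_deg)
    return velocity[0] * nx + velocity[1] * ny


def intersection_of_constraints(orientation1_deg, speed1, orientation2_deg, speed2):
    """Solve n1 . v = speed1 and n2 . v = speed2 for the 2D velocity v."""
    n1 = contour_normal(orientation1_deg)
    n2 = contour_normal(orientation2_deg)
    det = n1[0] * n2[1] - n1[1] * n2[0]
    if abs(det) < 1e-12:
        raise ValueError("parallel contours leave the velocity ambiguous")
    vx = (speed1 * n2[1] - n1[1] * speed2) / det
    vy = (n1[0] * speed2 - speed1 * n2[0]) / det
    return (vx, vy)


if __name__ == "__main__":
    true_velocity = (3.0, 1.0)
    s1 = normal_speed(true_velocity, 30.0)
    s2 = normal_speed(true_velocity, 120.0)
    print(intersection_of_constraints(30.0, s1, 120.0, s2))
```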

1D Cues, 2D cues and 3D motion

1D and 2D motion cues (From Pack et al. 2003)

Using short-bar stimuli and a reverse-correlation technique, Pack et al. (2003) showed that the responses of end-stopped cells in V1 reliably signal the 2D motion direction of a bar’s endpoints, regardless of its orientation (i.e., these cells do not suffer from “the aperture problem”).
[Figure panels: end-stopped cell; non end-stopped cell]

Back to Plaid Perception:
Motion Integration (“Coherency”) vs. Motion Segmentation (“Transparency”)
[Figure labels: 2D motion signals; 1D motion signals]

Back to ‘One Object, or Two Pairs of Lines?’
Scene segmentation affects the assignment of local motion cues as ‘intrinsic’ vs. ‘extrinsic’ (after Shimojo, Silverman & Nakayama, VR 1989)
(Demo adapted from Lorenceau & Shiffrar, VR 1992)
[Demo controls: More lines; Show all?; Will dots help?]
