This research investigates the neural processes that facilitate the perception of 3-D shapes from 2-D images and explores the critical image features and the activation of prior assumptions in these processes. The study focuses on texture patterns and perspective projections to reveal the mechanisms underlying shape perception.
EARLY NEURAL PROCESSES THAT FACILITATE 3-D SHAPE PERCEPTION Qasim Zaidi Graduate Program in Vision Science State University of New York College of Optometry
From 2-D images to 3-D percepts
[Figure: The Future Building, Manhattan; north view and south view (Griffiths & Zaidi, 2000)]
Such misperceptions suggest that the prior assumptions required to select particular 3-D shapes from 2-D retinal images are activated automatically by specific image features and cannot be corrected by cognitive or tactile knowledge.
Questions for Psychophysics & Neurobiology: What are the critical image features? How is the activation of prior assumptions embodied in neural processes?
PERCEPTION OF 3-D SHAPE FROM TEXTURE
General Assumption: Texture on the surface is statistically homogeneous, so inhomogeneities in the image, e.g. gradients of size, density, and compression (Gibson, Blake, etc.), are due to the projection of non-fronto-parallel segments.
Inverse Optics Method: Estimate the projective transform and reverse it (Garding, Malik, Rosenholtz, Clerc, Mallat).
Problem: Surface texture is generally not homogeneous on carved or stretched solids.
Our Solution: Parsing the image into orientation and frequency modulations reveals critical image features that can be detected by neural filters. (Li & Zaidi, 2000, 2001a&b, 2003, 2004, 2006)
TEXTURE PATTERNS (+FFT) DESIGNED TO SEPARATE ORIENTATION & FREQUENCY MODULATIONS I. SUMS OF SINUSOIDAL GRATINGS (i.e. ORIENTED COMPONENTS): HORIZONTAL-VERTICAL PLAID OCTOTROPIC PLAID OCTO PLAID MINUS HORIZONTAL II. DOT PATTERNS (i.e. ISOTROPIC ELEMENTS): GLOBALLY ISOTROPIC, UNIFORM SIZE ORIENTED ALONG H & V, UNIFORM SIZE ORIENTED ALONG H & V, RANDOM SIZE
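The grating-based textures listed on this slide can be sketched as sums of sinusoidal gratings. The following is a minimal illustration, not the stimulus code used in the study: the function names, image size, spatial frequency, and the convention that 0 deg means horizontal bars are all assumptions made for the example.

```python
import numpy as np

# Component orientations of the octotropic plaid, in degrees.
OCTO_ORIENTATIONS = [0, 22.5, 45, 67.5, 90, -67.5, -45, -22.5]

def grating(size, cycles, orientation_deg):
    """Sinusoidal grating on a size x size grid.

    Assumed convention: orientation_deg is the orientation of the bars,
    with 0 = horizontal bars (luminance varies only along the row axis).
    """
    y, x = np.mgrid[0:size, 0:size] / size
    theta = np.deg2rad(orientation_deg)
    # Luminance varies along the normal to the bars.
    u = -x * np.sin(theta) + y * np.cos(theta)
    return np.cos(2 * np.pi * cycles * u)

def plaid(size, cycles, orientations_deg):
    """Average of equal-contrast gratings at the given orientations."""
    return sum(grating(size, cycles, o) for o in orientations_deg) / len(orientations_deg)

# Hypothetical size (256 px) and frequency (16 cycles per image).
hv_plaid = plaid(256, 16, [0, 90])
octo_plaid = plaid(256, 16, OCTO_ORIENTATIONS)
octo_minus_h = plaid(256, 16, [o for o in OCTO_ORIENTATIONS if o != 0])
```

Taking `np.abs(np.fft.fftshift(np.fft.fft2(octo_plaid)))` should show the eight components as separate peaks, which is presumably what the "+FFT" panels display.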
TEXTURES FOLDED INTO SINUSOIDAL CORRUGATIONS (Folding does not alter the texture on the surface.) Perspective projections were viewed monocularly on a CRT monitor such that the retinal image was identical to that of the 3-dimensional surface.
PERSPECTIVE IMAGES OF DEVELOPABLE CORRUGATIONS: SUMS OF GRATINGS
[Figure panels: H-V PLAID; OCTO PLAID; OCTO MINUS H]
• Orientation flows are formed by projections of the horizontal component (parallel to the axis of maximum curvature). The vertical component projects with frequency modulations.
• Orientation flows converge at concavities, diverge at convexities, and are mirror opposites for opposite slants.
• In the absence of this component, texture gradients and frequency modulations are insufficient to convey veridical shape.
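The convergence and divergence of orientation flows can be illustrated with a pinhole projection of horizontal surface lines. This is a minimal sketch under assumed geometry (a corrugation z(x) = A cos(2πx/λ) with z positive toward the viewer, a viewer on the optical axis, and hypothetical parameter values); it is not the rendering procedure used for the slides.

```python
import numpy as np

def project_horizontal_contour(y0, x, amplitude, wavelength, view_dist, focal=1.0):
    """Perspective image of the surface line y = y0 on a corrugation.

    Assumed geometry: z(x) = amplitude * cos(2*pi*x / wavelength), with z
    positive toward the viewer, who sits at z = view_dist on the optical axis.
    """
    z = amplitude * np.cos(2 * np.pi * x / wavelength)
    depth = view_dist - z  # distance of each surface point along the line of sight
    return focal * x / depth, focal * y0 / depth

# Hypothetical parameters: sample the peak (convex, near) and trough (concave, far).
amplitude, wavelength, view_dist = 0.5, 4.0, 5.0
x = np.array([0.0, wavelength / 2])
_, upper = project_horizontal_contour(+1.0, x, amplitude, wavelength, view_dist)
_, lower = project_horizontal_contour(-1.0, x, amplitude, wavelength, view_dist)
separation = upper - lower  # image distance between the two lines at each x
```

Here `separation[0]` (at the convexity) is larger than `separation[1]` (at the concavity), so lines that are parallel on the surface spread apart over near regions and crowd together over far regions.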
PLANAR & PERSPECTIVE COMPONENTS OF OCTOTROPIC PLAID
[Figure rows: texture components; perspective images of developable corrugations. Columns: 0, 22.5, 45, 67.5, 90, -67.5, -45, -22.5]
Only the orientation flows of the 0 component can distinguish concavities from convexities, and right slants from left slants.
Frequency modulations & perspective projections Spatial frequency on the developable surface is homogeneous, therefore the projected frequency increases with increasing slant. As a function of slant, concave and convex curvatures both exhibit high-low-high frequency gradients.
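The high-low-high frequency profile can be checked numerically. The sketch below assumes a corrugation z(x) = A cos(2πx/λ), measures frequency across the corrugation, and uses the foreshortening approximation that image frequency scales as 1/cos(slant); it deliberately ignores the distance term of perspective. All names and parameter values are hypothetical.

```python
import numpy as np

def local_slant(x, amplitude, wavelength):
    """Slant (radians) of z(x) = amplitude * cos(2*pi*x / wavelength)."""
    dzdx = -amplitude * (2 * np.pi / wavelength) * np.sin(2 * np.pi * x / wavelength)
    return np.arctan(np.abs(dzdx))

def projected_frequency(surface_freq, slant):
    """Foreshortening compresses the texture by cos(slant), raising frequency."""
    return surface_freq / np.cos(slant)

amplitude, wavelength = 0.5, 4.0  # hypothetical values
x_convex = np.linspace(-wavelength / 4, wavelength / 4, 101)  # half-cycle around a peak
x_concave = x_convex + wavelength / 2                          # half-cycle around a trough
f_convex = projected_frequency(1.0, local_slant(x_convex, amplitude, wavelength))
f_concave = projected_frequency(1.0, local_slant(x_concave, amplitude, wavelength))
```

Under these assumptions `f_convex` and `f_concave` are the same high-low-high curve, which is why frequency modulations alone cannot separate the two signs of curvature.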
Image spatial frequency is a function of distance and slant of textured surfaces
[Figure panels: texture pattern; distance: 108, 131, 208%; left slant: 20, 40, 60 deg; right slant: 20, 40, 60 deg]
Notice: Axis of change in frequency, frequency gradients, and element shapes.
PERSPECTIVE IMAGES OF DEVELOPABLE CORRUGATIONS DOT PATTERNS UNIFORM & ISOTROPIC UNIFORM & H-V RANDOM & H-V Veridical curvatures are conveyed by signature orientation flows. When signature orientation flows are not visible, relative spatial frequencies are used to infer relative distance, despite changes in element shapes that signal slant. This leads to non-veridical percepts.
CARVED SURFACES: CONSTANT-Z SOLIDS Inhomogeneous surface textures. Constant-Z: the solid contains identical planar patterns along the Z-axis (this can easily be generalized to patterns that are statistically homogeneous along the Z-axis).
PERSPECTIVE PROJECTIONS OF CONSTANT-Z CARVED CORRUGATIONS FREQUENCY MODULATIONS The frequency on the surface of the cut decreases with increasing slant (Inhomogeneous surface texture), but the frequency in the projection increases with increasing slant, hence the frequency in the image varies mainly due to distance from the observer.
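The cancellation described here can be written out in a few lines. This is a sketch under simplifying assumptions (a small surface patch, foreshortening by cos(slant), and image frequency proportional to viewing distance); the function and parameter names are hypothetical.

```python
import numpy as np

def carved_image_frequency(plane_freq, slant, distance, ref_distance):
    """Image frequency of a constant-Z solid cut at a given slant.

    plane_freq is the frequency of the planar pattern in each fronto-parallel
    slice; slant is in radians; distances share any common unit.
    """
    surface_freq = plane_freq * np.cos(slant)     # cut surface is stretched by 1/cos(slant)
    foreshortened = surface_freq / np.cos(slant)  # projection compresses by cos(slant)
    return foreshortened * distance / ref_distance  # farther patches look finer

slants = np.deg2rad([0.0, 20.0, 40.0, 60.0])
same_distance = carved_image_frequency(2.0, slants, 1.0, 1.0)
```

With distance held fixed, `same_distance` equals the planar frequency at every slant, so any remaining frequency variation in the image comes from the distance factor.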
PERSPECTIVE IMAGES OF CONSTANT-Z CARVED CORRUGATIONS GRATING PATTERNS H-V PLAID OCTO PLAID OCTO MINUS H Veridical curvatures are conveyed by visible signature orientation flows despite non-homogeneous surface textures. Signature orientation flows are physically present but not visible for the octotropic plaid. Hence, visibility of flows is necessary for 3-D shape perception.
COMPONENTS OF OCTOTROPIC PLAID
[Figure rows: texture components; perspective images of developable corrugations; perspective images of constant-Z carved corrugations. Columns: 0, 22.5, 45, 67.5, 90, -67.5, -45, -22.5]
Orientation modulations of many components can distinguish curvatures. Orientation modulations of the 0 component overlap with the orientations of the 22.5 and -22.5 components.
Constant-Z carved corrugation with octotropic plaid pattern
[Figure panels: octo plaid; +22.5, -22.5; 0, ±45, ±67.5, 90]
Visibility of signature orientation flows is masked by neighboring components. Unmasking the flows makes the signs of curvature distinguishable.
PERSPECTIVE IMAGES OF CONSTANT-Z CARVED CORRUGATIONS DOT PATTERNS UNIFORM & ISOTROPIC UNIFORM & H-V RANDOM & H-V Veridical curvatures are conveyed by signature orientation flows. Frequency modulations are a function of distance and convey much weaker percepts.
PERSPECTIVE IMAGES OF CARVED DEPTH PLAIDS CONCAVE CONVEX V-SADDLE H-SADDLE
TEXTURED MATERIAL STRETCHED ONTO CORRUGATIONS Texture on surface can be inhomogeneous, but orientation modulations of horizontal component are unaffected by stretching.
Shape from natural textures (Brodatz set) Shape from phase-randomized natural textures
FFTs of flat natural textures (crit = 6 deg, sur = 18 deg). Measures of discrete orientation energy parallel to the axis of maximum curvature can predict which textures will convey differences in signs of curvatures and slants.
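One way such an orientation-energy measure could be computed from the FFT of a flat texture is sketched below. The slide does not define the measure, so everything here is an assumption: "crit" and "sur" are read as half-widths (in degrees) of a critical and a surround orientation band, the measure is taken as the ratio of Fourier power in the two bands, and orientations are expressed in array coordinates with 0 deg meaning horizontal bars.

```python
import numpy as np

def orientation_energy(image, center_deg, half_width_deg):
    """Fourier power of components whose bar orientation lies within
    half_width_deg of center_deg (0 = horizontal bars, array coordinates)."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image - image.mean()))) ** 2
    h, w = image.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    # The frequency vector of a grating is perpendicular to its bars.
    bar_orientation = np.rad2deg(np.arctan2(fy, fx)) - 90.0
    diff = (bar_orientation - center_deg + 90.0) % 180.0 - 90.0  # wrap to [-90, 90)
    mask = (np.abs(diff) <= half_width_deg) & ((fx ** 2 + fy ** 2) > 0)
    return power[mask].sum()

def critical_energy_ratio(image, axis_deg=0.0, crit_deg=6.0, sur_deg=18.0):
    """Assumed measure: power in the critical band relative to the surround band."""
    sur = orientation_energy(image, axis_deg, sur_deg)
    return orientation_energy(image, axis_deg, crit_deg) / sur if sur > 0 else 0.0

# Usage with a synthetic texture: 8 cycles of horizontal bars on a 64 x 64 grid.
rows = np.arange(64)[:, None] / 64
horizontal = np.cos(2 * np.pi * 8 * rows) * np.ones((1, 64))
ratio = critical_energy_ratio(horizontal)
```

For this synthetic texture all of the power lies at the critical orientation, so the ratio is close to 1; a texture with little energy near the axis of maximum curvature would give a ratio near 0 under the same assumed definition.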
ORIENTATION FLOWS & 3-D SHAPE
• Signature patterns of orientation flows:
  • Occur generically at the locations of concave, convex, slanted and saddle curvatures of developable, carved and stretched surfaces.
  • Automatically evoke percepts of concave, convex, slanted and saddle shapes.
  • Are the only image features that are physically distinct for concave, convex, slanted and saddle curvatures across many classes of surface textures.
• Since striate cortex parses visual stimuli into local orientations and spatial frequencies, these results suggest that extra-striate neural filters matched to orientation flows could extract 3-D shape from texture cues.
Orientation flow templates for identifying and locating 3-D concave and convex orientation shapes
SUPERVISED HEBBIAN LEARNING
• No linear procedure was able to extract orientation flows.
• Hebbian learning gave decent matched filters only if the critical orientations were privileged.
• Oriented cells are not treated as independent filters.
Orientation flow visibility as a function of surface slant. When a textured surface is slanted out of the fronto-parallel plane, the component parallel to the slant becomes perceptually more salient than the other components. Is this due to an independent decrease in the visibility of the other orientations, or does it reflect a change in the interdependence of orientation processing?
Retinal/LGN origin of cross-orientation inhibition in the cat Model Intracellular recordings Based on intra-cellular excitation (Priebe & Ferster, 2006) and temporal dynamics (Li et al, 2006), cross-orientation suppression in cat seems mostly due to compressive contrast-response functions.
Feed-forward models of COS do not predict frequency specificity Cortical interactions?
Measuring responses in V1 that can aid in extracting orientation flows
[Figure: trial timeline with trial start, fixation, eye in window, 750 ms display of stimuli (labels S0 to S5 and S2'), variable T1, and trial end]
Physiology
• Species: 4 macaque monkeys
• Anesthesia, analgesia, paralysis: propofol, sufentanil, vecuronium
• Recording locations: V1, infragranular V2
• Electrodes: Thomas tetrodes
• Visual stimulation: monocular, dominant eye, 120 Hz CRT
• Spike sorting: off-line
• Histology: recovery of electrolytic lesions
[Figure panels: preferred orientation for neuron; Sf grad.; Orient. flow; V2]
RESPONSE MEASURE
Spike count and mean spike rate are not linked to the time structure of the stimulus and are often not sensitive enough to discriminate between two sets of neural responses.
• Calculate the spike spectrum for two conditions, including the ON and OFF phases of the stimulus cycles.
• Retain only those values for which the spectra are significantly different.
• Sum the significant residuals across all frequencies and normalize by the number of frequencies.
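The three steps of this response measure might be sketched as follows. The slide does not say how the spectra are estimated or which significance test is applied, so this example makes assumptions throughout: responses are binned spike counts of shape (trials, time bins), the spectrum is the per-trial FFT amplitude, and "significantly different" is approximated by a per-frequency Welch t statistic with a hypothetical threshold `t_crit`.

```python
import numpy as np

def spectral_response_difference(trials_a, trials_b, t_crit=2.0):
    """Signed spectral difference between two conditions.

    trials_a, trials_b: arrays of shape (n_trials, n_time_bins) of binned
    spike counts spanning whole stimulus cycles (ON and OFF phases).
    t_crit is a hypothetical threshold standing in for the unspecified test.
    """
    spec_a = np.abs(np.fft.rfft(trials_a, axis=1))  # amplitude spectrum per trial
    spec_b = np.abs(np.fft.rfft(trials_b, axis=1))
    residual = spec_a.mean(axis=0) - spec_b.mean(axis=0)
    se = np.sqrt(spec_a.var(axis=0, ddof=1) / len(spec_a)
                 + spec_b.var(axis=0, ddof=1) / len(spec_b))
    # Frequencies with zero variance get t = 0 and are treated as not significant.
    t = np.divide(residual, se, out=np.zeros_like(residual), where=se > 0)
    significant = np.abs(t) > t_crit
    # Sum the significant residuals and normalize by the number of frequencies.
    return residual[significant].sum() / residual.size

# Usage with synthetic data: condition A carries a modulation that B lacks.
rng = np.random.default_rng(0)
time = np.arange(64)
modulated = rng.normal(size=(20, 64)) + 5 * np.sin(2 * np.pi * 4 * time / 64)
unmodulated = rng.normal(size=(20, 64))
score = spectral_response_difference(modulated, unmodulated)
```

The sign of the result indicates which condition carries more stimulus-locked spectral energy; swapping the two arguments flips the sign.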
Conclusions
• It is worth parsing images in many different ways, as each reveals different possible critical features.
• Image features are likely to be invariant for non-rigid shapes (clothes, animal skins, etc.) if they are not limited to developable surfaces.
• Templates for complex image features give few false positives.
• The brain can provide simple tricks to facilitate learning/extraction of critical features.
Acknowledgments • Thanks to Andrea Li, Carson Wong, Keith Purpura, and Jonathan Victor for doing the work, • To the National Eye Institute (EY13312 & EY07556) for funding it, • To Ken Knoblauch for inviting me, • And to all of you for listening patiently to an Anglo-Saxon language.