Cosc 6326/Psych6750X. Vision and Visual Displays. Some theories of perception Empiricism vs Nativism Gestalt Ecological Constructionist Active perception. http://www.nipissingu.ca/stange/courses/p2255/2255-images/gestalt.jpg. Is vision inverse graphics?
Cosc 6326/Psych6750X Vision and Visual Displays
Some theories of perception • Empiricism vs Nativism • Gestalt • Ecological • Constructionist • Active perception
http://www.nipissingu.ca/stange/courses/p2255/2255-images/gestalt.jpg
Is vision inverse graphics? • Techniques to study vision include: • Philosophical, Introspective Analysis • Psychophysics • Clinical Deficits • Electrophysiology • Anatomical studies • Imaging • Computational Analysis
Levels of Processing • Early vision • low level processing related to extracting image and surface properties such as luminance changes, depth and motion • Intermediate-level vision • Grouping operations, smoothing, cue integration, computation of shape, colour, contour
High-level vision • task based: object recognition, relations between objects, selection for manipulation/navigation
Marr 1982 claimed that a visual system can be described at three levels of abstraction • computational • algorithmic • implementation
Structure of the Eye • About 70% of refractive power in cornea • Lens fine tunes focus depending on distance and pupil size (accommodation) • Pupil size adjusted to trade amount of light with depth of field
Structure of the Eye • Retina contains sensory receptors • In humans, receptor density is non-uniform • Highest density in centre of foveal pit; decreases with eccentricity • Visual axis joins point of fixation with fovea • offset from optic axis
Functional Organisation of the Retina • Two types of photoreceptors • Rods: • high sensitivity • exclusively peripheral distribution • broad spectral tuning • retina contains >10^8 rods
Functional Organisation of the Retina • Cones: • 3 wavelength selective types • Long ( 40%, 565nm peak) • Medium ( 40%, 535nm peak) • Short ( 10%, 450nm peak) • 5 Million, highest concentration in fovea (>6000 in central 1°)
Limits on Vision • Spectral sensitivity • 400-700 nm range • 3 cone types • Precise discrimination but no resolution • Sensitivity to intensity • Sensitivity to spatial variation • Sensitivity to temporal variation • Field of view • …
Limits on sensitivity to intensity • Photopic range • day time vision, colour sensitivity, high resolution, cone-mediated • Scotopic • low light vision, achromatic, rod-mediated, more sensitive but low spatial resolution • Mesopic • transition range (e.g. moonlight)
Dynamic Range • Dynamic range of eye is about three log units • Range of visible stimuli is several log units; to cover it we rely on • adaptation • pupil size changes • rod and cone sensitivity ranges
Dynamic Range • Dynamic range of most displays is limited to a small fraction of natural range (100:1 versus up to 10,000:1 in a natural scene) • Issues are both display capability and image generation • Tone mapping is one technique that uses non-linear luminance mapping to distort image & increase range of visibility • Some true HDR displays in development (including one at York)
Tone Mapping Larson, Siggraph 1997
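As a rough illustration of non-linear luminance mapping, here is a minimal sketch of a global logarithmic tone-mapping curve. This is a swapped-in textbook operator, not the method from the Larson paper; the function name `tone_map`, the `display_max` parameter and the sample luminances are assumptions made for the example.

```python
import math

def tone_map(luminances, display_max=100.0):
    """Compress scene luminances into [0, display_max] with a log curve.

    Assumption: a simple global operator, log(1 + L) / log(1 + L_max).
    It is monotonic, so the ordering of luminances is preserved.
    """
    l_max = max(luminances)
    return [display_max * math.log1p(l) / math.log1p(l_max) for l in luminances]

# Hypothetical scene spanning a 10,000:1 luminance range
scene = [1.0, 10.0, 100.0, 1000.0, 10000.0]
mapped = tone_map(scene)

# Ratio between brightest and darkest value after mapping
mapped_ratio = mapped[-1] / mapped[0]
print(mapped, mapped_ratio)
```

The mapped ratio falls well below the 100:1 range quoted for typical displays, at the cost of distorting relative luminances.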
Retinal Microcircuits • Receptive field of a neuron refers to the region of the retina that contributes to (excites or inhibits) its firing rate • Ganglion cell receptive fields have a centre-surround organisation • on-centre: excited by light in centre, inhibited by light in surround • off-centre: excited by light in surround, inhibited by light in centre
firing rate cannot be negative – leads to rectification of response • off-channels complement response of on-channels • centre-surround organisation gives sensitivity to change in luminance
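The centre-surround and rectification ideas can be sketched in one dimension with a difference-of-Gaussians filter. This is an illustrative model only: the difference-of-Gaussians form, the sigma values, the kernel radius and the step stimulus are all assumptions, not values from the slides.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalised 1-D Gaussian weights over [-radius, radius]."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def dog_response(signal, sigma_centre=1.0, sigma_surround=3.0, radius=9):
    """Linear centre-minus-surround response at each position."""
    centre = gaussian_kernel(sigma_centre, radius)
    surround = gaussian_kernel(sigma_surround, radius)
    kernel = [c - s for c, s in zip(centre, surround)]  # weights sum to ~0
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def on_off_channels(signal):
    """Half-wave rectify: firing rates cannot be negative."""
    linear = dog_response(signal)
    on = [max(r, 0.0) for r in linear]    # on-centre channel
    off = [max(-r, 0.0) for r in linear]  # off-centre channel
    return on, off

# Hypothetical luminance step: dark region followed by bright region
step = [10.0] * 20 + [50.0] * 20
on, off = on_off_channels(step)
print(on[20], off[19], on[5])
```

Both channels stay near zero in the uniform regions and respond only around the luminance change, with the on channel active on the bright side of the edge and the off channel on the dark side.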
Contrast dependent processing occurs at higher levels of the brain as well
Visual Pathways • principally lateral geniculate nucleus (LGN) then cortex • also important sub-cortical pathways • Accessory optic tract • Pulvinar, superior colliculus • massive feedback as well as feedforward pathways (from Howard & Rogers 2001)
neurons usually sensitive to more than one feature • primary visual cortex • spatio-topically mapped • has columnar organization (hyper-columns) for: • orientation • direction of motion • eye dominance columns …
Function of oriented receptive fields in V1? • edge and bar detectors (black inhibitory, white excitatory)? • tuned to a variety of different orientations Receptive Field Stimulus eliciting best response
Fourier analysis of image (Blakemore and Campbell)? Need to combine responses of receptors with similar receptive fields distributed over visual field • spatial-temporal filtering
Visual areas believed to be specialised for features, e.g. • V1: oriented edges, colour • MST, MT: motion, optic flow • V4: colour, form • IT: objects • We’ll have time to look at only a small set of the functions vision supports
2 pathways hypothesis (seminar last Wednesday) • parasol ganglion cells project to magnocellular layers of LGN • high contrast sensitivity, larger receptive fields, insensitive to colour, short latency, sensitive to luminance changes • presumed to form input to a motion and depth pathway (to MT, MST …)
midget ganglion cells project to parvocellular layers of LGN • colour opponent cells, less contrast sensitivity, colour sensitive, high spatial resolution, low temporal resolution • specialised pathways for form and colour (V4, IT …) • Livingstone and Hubel (1988) have suggested there may be 4 physiologically distinct pathways: colour, binocular vision, motion and form
Spatial Resolution • Fovea • Optics of eye blur image • This blur acts as a spatial frequency filter and prevents aliasing • Optics are diffraction limited for small pupil sizes • Retina is spherical, thus resolution is expressed in degrees of visual angle
best acuity in central fovea • receptor spacing closely matched to optics • grating: 60 cycles per degree • line separation 30 seconds of arc • Periphery • resolution limited by sampling density of receptors • best resolution for scotopic (rod) vision is extrafoveal where rod density increases
Contrast Sensitivity Function • important caveat - acuity is for high contrast patterns under ideal conditions • contrast required for an object to be visible depends on the pattern • sensitivity falls at high and low spatial frequencies
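The band-pass shape of the contrast sensitivity function can be illustrated with an analytic approximation. The sketch below uses the Mannos and Sakrison (1974) formula, which is a swapped-in model and not something given in the slides; the function name `csf` and the sample frequencies are assumptions.

```python
import math

def csf(f_cpd):
    """Relative contrast sensitivity at spatial frequency f (cycles/degree).

    Assumption: Mannos & Sakrison (1974) approximation,
    A(f) = 2.6 * (0.0192 + 0.114 f) * exp(-(0.114 f) ** 1.1).
    """
    return 2.6 * (0.0192 + 0.114 * f_cpd) * math.exp(-((0.114 * f_cpd) ** 1.1))

# Hypothetical sample frequencies: low, middle and high
low, mid, high = csf(1.0), csf(8.0), csf(40.0)
print(low, mid, high)
```

Under this model sensitivity is highest at middle spatial frequencies and drops on both sides, which is the behaviour described in the last bullet.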
Display Resolution • resolution of displays usually measured in pixels/display or pixels/cm • e.g. a 30 cm wide, 1024x768 pixel display (30 pixels per cm) would need to be located far enough away for the pixel pitch to subtend 30 seconds of arc • effective resolution typically lower than the pixel pitch suggests
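The viewing distance at which one pixel subtends 30 seconds of arc follows from simple trigonometry. The sketch below is an illustration with assumed names (`viewing_distance_cm`, `pitch_from_width`, `pitch_from_density`); it computes the distance for both ways of reading the example, since 1024 pixels across 30 cm is slightly finer than 30 pixels per cm.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi

def viewing_distance_cm(pixel_pitch_cm, target_arcsec=30.0):
    """Distance at which one pixel pitch subtends target_arcsec."""
    angle_rad = target_arcsec / ARCSEC_PER_RADIAN
    return pixel_pitch_cm / (2 * math.tan(angle_rad / 2))

pitch_from_width = 30.0 / 1024     # cm, from the 30 cm wide, 1024-pixel example
pitch_from_density = 1.0 / 30      # cm, from the quoted 30 pixels per cm

d_width = viewing_distance_cm(pitch_from_width)
d_density = viewing_distance_cm(pitch_from_density)
print(d_width, d_density)
```

Either reading puts the display roughly 2 m or more from the eye, far beyond a normal desktop viewing distance.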
For full foveal acuity in a six walled CAVE how many pixels do we need? • simplify to spherical surface (4π steradians): need about 5.9 x 10^8 pixels (Hopper, 2000) per eye • typical 6 sided CAVE has 1280x1024x6 = 7.9 x 10^6 pixels, a difference of about two orders of magnitude • this is the key to the appeal of foveated displays
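The pixel counts above can be checked with a short calculation. This sketch assumes uniform square pixels of 30 arcsec tiling the whole sphere, which is an approximation (pixels cannot tile a sphere uniformly); the names `sphere_pixels`, `needed`, `cave` and `ratio` are made up for the example.

```python
import math

def sphere_pixels(pixel_arcsec):
    """Approximate pixel count to cover 4*pi steradians at a given pixel size."""
    sphere_sq_deg = 4 * math.pi * (180 / math.pi) ** 2  # about 41,253 square degrees
    pixels_per_deg = 3600 / pixel_arcsec
    return sphere_sq_deg * pixels_per_deg ** 2

needed = sphere_pixels(30)   # foveal acuity everywhere
cave = 1280 * 1024 * 6       # six walls at 1280x1024
ratio = needed / cave
print(needed, cave, ratio)
```

The ratio comes out a little under 100, which is the "two orders of magnitude" gap in round terms.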
Field of View • each eye has a visual field extending approx 56° nasally and 95° temporally – 190° total visual field • extended to 290° with eye movements and nearly 360° with head movement • portion seen by both eyes is up to 114° horizontally and 125° vertically
Field of view, resolution trade-offs for displays • Instantaneous field of view (FOV) of a display is angle subtended by the image at the eye • In HMDs FOV is restricted by size of the display and optical aberrations/distortions, which increase with eyepiece FOV • typical values 20-100° horizontal FOV
inherent trade-off between FOV and resolution for a fixed number of pixels • many HMD systems have very poor resolution. For example video see-through HMD (640x480) • 50° horizontal FOV, 12 pixels per degree • 20° horizontal FOV, 32 pixels per degree
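The trade-off is just a division of a fixed pixel budget by the field of view. A minimal sketch, assuming pixels are spread evenly across the horizontal FOV (real HMD optics distort this); the function names are hypothetical.

```python
def pixels_per_degree(horizontal_pixels, horizontal_fov_deg):
    """Average angular resolution for a fixed pixel count."""
    return horizontal_pixels / horizontal_fov_deg

def pixel_size_arcmin(horizontal_pixels, horizontal_fov_deg):
    """Average angle subtended by one pixel, in minutes of arc."""
    return 60.0 / pixels_per_degree(horizontal_pixels, horizontal_fov_deg)

wide = pixels_per_degree(640, 50)    # 640x480 display at 50 degree FOV
narrow = pixels_per_degree(640, 20)  # same display at 20 degree FOV
print(wide, narrow, pixel_size_arcmin(640, 50), pixel_size_arcmin(640, 20))
```

Even at the narrow 20° setting each pixel subtends nearly 2 minutes of arc, several times coarser than the 30 seconds of arc foveal limit quoted earlier.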
large FOV easier in optical see-through HMDs (at least for see-through portion) • tiling of multiple sub-displays has been proposed for large FOV HMDs • resolution/FOV tradeoff also limits large format projection displays and tiling has also been used
Hyperacuities • Hyperacuity (Westheimer & McKee paper) • ability to discriminate relations between features that are finer than the resolving power of the visual system or even photoreceptor spacing • e.g. vernier acuity. Which line is higher? Foveal threshold is on the order of 1-5 seconds of arc • implies interpolation (& population coding)
other hyperacuities • stereopsis • curvature • bisection • orientation • even colour discrimination (three wavelengths detected but can discriminate millions of colours)
closely related to anti-aliasing in graphics • but need to trade off effective resolution for sub-pixel positioning • displays that provide hyperacuity-level positioning without sub-pixel techniques are usually not feasible, e.g. 5 arcsec pixels: • 21600x16200 pixels for a 30 cm wide display at 57 cm • about 21 billion pixels in a spherical CAVE
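The 5 arcsec figures can be reproduced with the same kind of arithmetic. This sketch assumes the small-angle rule that 1 cm at about 57 cm subtends about 1°, a 4:3 aspect ratio, and uniform square pixels on the sphere; all function and variable names are made up for the example.

```python
import math

def flat_display_pixels(width_cm, distance_cm, pixel_arcsec, aspect=(4, 3)):
    """Pixels needed across and down a flat display (small-angle approximation)."""
    width_deg = math.degrees(width_cm / distance_cm)  # angle ~ width / distance
    horizontal = width_deg * 3600 / pixel_arcsec
    vertical = horizontal * aspect[1] / aspect[0]
    return horizontal, vertical

def sphere_pixels(pixel_arcsec):
    """Approximate pixel count to cover the full sphere."""
    sphere_sq_deg = 4 * math.pi * (180 / math.pi) ** 2
    return sphere_sq_deg * (3600 / pixel_arcsec) ** 2

h, v = flat_display_pixels(30.0, 57.0, 5.0)
total_sphere = sphere_pixels(5.0)
print(round(h), round(v), total_sphere)
```

The flat-display result lands within about 1% of 21600x16200; the small difference comes from rounding 57.3 cm to 57 cm.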