Explore perception processes including visual illusions, pattern recognition, object recognition, speech perception, and categorical perception. Topics cover template-matching, feature analysis, Biederman's theory, phoneme perception, and categorical perception.
Cognitive Processes PSY 334 Chapter 2 – Perception July 3, 2003
Law of Pragnanz • Of all the possible interpretations, we will select the one that yields the simplest or most stable form. • Simple, symmetrical forms are seen more easily. • In compound letters, the larger figure dominates the smaller ones.
Visual Illusions • Depend on experience. • Influenced by culture. • Illustrate normal perceptual processes. • These are not errors but rather failures of perception in unusual situations.
Visual Pattern Recognition • Bottom-up approaches: • Template-matching • Feature analysis • Recognition by components
Template-Matching • A retinal image of an object is compared directly to stored patterns (templates). • The object is recognized as the template that gives the best match. • Used by computers to recognize patterns. • Evidence shows human recognition is more flexible than template-matching: • Items that vary in size, place, orientation, or shape, or that are blurred or broken (ambiguous or degraded), are easily recognized by people.
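A minimal sketch of template-matching, assuming a 3x3 grid as a stand-in for the retinal image. The two templates and the names match_score and recognize are invented for illustration; they are not from the lecture.

```python
def match_score(image, template):
    """Count the grid cells where the image and the template agree."""
    return sum(
        a == b
        for img_row, tpl_row in zip(image, template)
        for a, b in zip(img_row, tpl_row)
    )


def recognize(image, templates):
    """Return the name of the stored template with the best match."""
    return max(templates, key=lambda name: match_score(image, templates[name]))


# Hypothetical stored templates ('#' = filled cell, '.' = empty cell).
TEMPLATES = {
    "T": ["###", ".#.", ".#."],
    "L": ["#..", "#..", "###"],
}

upright_t = ["###", ".#.", ".#."]
inverted_t = [".#.", ".#.", "###"]  # the same shape, turned upside down

print(recognize(upright_t, TEMPLATES))
print(match_score(inverted_t, TEMPLATES["T"]),
      match_score(inverted_t, TEMPLATES["L"]))
```

The upright T matches its template in all nine cells. The inverted T scores the same against both templates, which is the orientation problem the slide describes: a strict cell-by-cell comparison has no way to tell that the shape is still a T.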
Feature Analysis • Stimuli are combinations of elemental features. • Features are recognized and combined. • Features are like output of edge detectors. • Features are simpler, so problems of orientation, size, etc., can be solved. • Relationships among features are specified to define the pattern.
Evidence for Feature Analysis • Confusions – people make more errors when letters presented at brief intervals contain similar features: • G misclassified: as C (21), as O (6), as B (1), as 9 (1) • When a retinal image is held constant, the parts of the object disappear: • Whole features disappear. • The remaining parts form new patterns.
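The idea that letters sharing features are confused more often can be sketched as a feature-overlap score. The feature sets below are hypothetical, chosen only to illustrate the idea; they are not the feature inventory used in the confusion study.

```python
# Hypothetical feature sets, invented for illustration only.
FEATURES = {
    "G": {"curve", "gap", "horizontal_bar"},
    "C": {"curve", "gap"},
    "O": {"curve"},
    "X": {"left_diagonal", "right_diagonal"},
}


def overlap(a, b):
    """Jaccard overlap: shared features divided by all features of both letters."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)


def likely_confusions(target):
    """Rank the other letters from most to least feature overlap with target."""
    others = [letter for letter in FEATURES if letter != target]
    return sorted(others, key=lambda letter: overlap(target, letter), reverse=True)


print(likely_confusions("G"))
```

Under these assumed features, C shares the most with G and X shares nothing, so a feature account predicts G is mistaken for C far more often than for a letter built from different features.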
Object Recognition • Biederman’s recognition-by-components: • Parts of the larger object are recognized as subobjects. • Subobjects are categorized into types of geons – geometric ions. • The larger object is recognized as a pattern formed by combining geons. • Only edges are needed to recognize geons.
Tests of Biederman’s Theory • Object recognition should be mediated by recognition of object components. • Two types of degraded figures presented for brief intervals: • Components (geons) missing • Line segments missing • At fast intervals (65-100 ms) subjects could not recognize components when segments were missing.
Speech Recognition • The physical speech signal is not broken up into parts that correspond to recognizable units of speech. • Undiminished sound energy at word boundaries – gaps are illusory. • Cessation of speech energy in the middle of words. • Word boundaries cannot be heard in an unfamiliar language.
Phoneme Perception • No one-to-one letter-to-sound correspondence. • Speech is continuous – phonemes are not discrete (separate) but run together. • Speakers vary in how they produce the same phoneme. • Coarticulation – phonemes overlap. • The sound produced depends on the sound immediately preceding it.
Feature Analysis of Speech • Features of phonemes appear to be: • Consonantal feature (consonant vs vowel). • Voicing – do vocal cords vibrate or not. • Place of articulation – where the vocal tract is constricted (where is tongue placed). • The phoneme heard by listeners changes as you vary these features. • Sounds with similar features are confused.
Categorical Perception • For speech, perception does not change continuously but abruptly at a category boundary. • Categorical perception – failure to perceive gradations among stimuli within a category. • Pairs of [b]’s or [p]’s sound alike despite differing in voice-onset times.
Two Views of Categorical Perception • Weak view – stimuli are grouped into recognizable categories. • Strong view – we cannot discriminate among items within such a category. • Massaro – people can discriminate within a category but have a bias to say that items are the same despite differences. • Category boundaries can be shifted by fatiguing the feature detectors.
Top Down Processing • General knowledge (context, high-level thinking) combines with interpretation of low-level perceptual units (features). • Context limits the possibilities so fewer features must be processed: • Word superiority effect – D or K vs WORD or WORK – words do 10% better. • To xllxstxatx, I cxn rxplxce xvexy txirx lextex of x sextexce xitx an x, anx yox stxll xan xanxge xo rxad xt wixh sxme xifxicxltx.
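The degraded sentence on this slide follows a simple rule: count letters only (spaces and punctuation are skipped) and replace every third one. A short sketch that reproduces it; the function name is an assumption for illustration.

```python
def mask_every_third_letter(sentence, mask="x"):
    """Replace every third alphabetic character with `mask`.

    Spaces and punctuation are copied through and do not count.
    """
    out = []
    letters_seen = 0
    for ch in sentence:
        if ch.isalpha():
            letters_seen += 1
            if letters_seen % 3 == 0:
                out.append(mask)
                continue
        out.append(ch)
    return "".join(out)


print(mask_every_third_letter("To illustrate, I can replace every third letter"))
# prints: To xllxstxatx, I cxn rxplxce xvexy txirx lextex
```

A third of the letter information is gone, yet the text stays readable because word and sentence context narrows down what each masked letter can be.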
Context and Speech • Phoneme restoration effect: • It was found that the *eel was on the axle. • It was found that the *eel was on the shoe. • It was found that the *eel was on the orange. • It was found that the *eel was on the table. • The identification of the missing word depends on what happens after it.
Faces and Scenes • When parts are presented in isolation, more feature information is needed to recognize them. • Face parts are recognized with less detail when in the context of a face. • Subjects are better able to identify objects when they are part of coherent novel scenes rather than jumbled scenes.
Models of Object Perception • Two competing models explain how context and feature information are combined: • Massaro’s FLMP (fuzzy logic model of perception) -- Context and detail are two independent sources of information. • McClelland & Rumelhart’s PDP model – connectionist model in which both sources of information interact.
Testing the FLMP Model • Four kinds of stimuli: • Only an e can make a real word. • Only a c can make a real word. • Both letters can make a word. • Neither letter can make a word. • Within each group, stimuli go from e to c. • Subjects saw each stimulus word briefly and had to identify the letter, e or c.
FLMP Results • Observed frequencies for naming a letter e increase as it has more e features, but also as the context demands an e. • Bayes’ theorem gives a formula for combining the independent contributions of two sources of information. • Massaro’s results conform to predictions of Bayes’ theorem, suggesting that the information sources must be independent of each other.
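A sketch of how two independent sources can be combined for the two-alternative (e vs c) case. It assumes each source gives a degree of support for "e" strictly between 0 and 1, with the support for "c" taken as the complement; the function name and the example values are illustrative, not Massaro's data.

```python
def flmp_probability(feature_support, context_support):
    """Probability of reporting 'e' from two independent sources.

    Each argument is the support for 'e' (between 0 and 1, exclusive);
    support for 'c' is assumed to be one minus that value. The two
    supports are multiplied and then normalized over both alternatives.
    """
    for_e = feature_support * context_support
    for_c = (1 - feature_support) * (1 - context_support)
    return for_e / (for_e + for_c)


# Ambiguous features (0.5): the context alone decides the response.
print(flmp_probability(0.5, 0.8))
# Features and context agree: the combined support is stronger than either.
print(flmp_probability(0.8, 0.8))
```

Because the supports are multiplied, context has its largest effect when the features are ambiguous and very little effect when the features are already clear.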
Testing the PDP Model • Activation spreads from features to excite letters and from letters to excite words (bottom up processing). • Activation also spreads from words to the component letters (top-down processing). • The more activation, the more likely the correct letter will be identified: • TRAP vs TRIP
Comparing the Two Models • Subjects heard a phoneme that varied from r to an l in two contexts: • A syllable beginning with t – tr or tl. • A syllable beginning with s – sl or sr. • Both the FLMP and PDP models were compared to actual subject data. • FLMP was close to what subjects did. • PDP was too strongly affected by context.
PDP Model Describes More • The PDP model suggests that information is not separately processed but each letter affects each other letter. • Recognition of “a” in MAVE is almost as good as recognizing it in MADE. • This occurs because MAVE is similar to many other words with an A in that position. • We do not have a context but four letters that each influence the others.
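The MAVE example can be illustrated by counting the real words that partly match the stimulus and agree with it on the letter being identified. This is only a crude count of partial matches, not the interactive activation model itself, and the small lexicon below is a hypothetical sample.

```python
# Hypothetical mini-lexicon of four-letter words, for illustration only.
LEXICON = ["MADE", "MAKE", "MALE", "MATE", "MAZE",
           "CAVE", "GAVE", "HAVE", "SAVE", "WAVE", "MOVE"]


def shared_positions(a, b):
    """Number of positions at which two strings have the same letter."""
    return sum(x == y for x, y in zip(a, b))


def supporting_words(stimulus, position, lexicon, min_shared=3):
    """Words that match the stimulus in at least `min_shared` positions
    and contain the same letter as the stimulus at `position`."""
    letter = stimulus[position]
    return [
        word for word in lexicon
        if len(word) == len(stimulus)
        and shared_positions(stimulus, word) >= min_shared
        and word[position] == letter
    ]


# Words that would lend support to the A in the nonword MAVE.
print(supporting_words("MAVE", 1, LEXICON))
```

In this sample, ten of the eleven words back the A in MAVE even though MAVE is not a word; only MOVE partly matches the stimulus while disagreeing on that letter.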
Marr • Depth cues (texture gradient, stereopsis) – where are edges in space? • How are visual cues combined to form an image with depth? • Primal sketch – extracts features. • 2-1/2 D sketch – identifies where visual features are in relation to observer (depth). • 3-D model – refers to the representation of the objects in a scene, combines context.
Putting it All Together • The output of these stages (see Fig 2.31) is a representation of an object and its location. • This output is used as input to higher-level cognitive processes. • Conscious awareness (a higher-level process) involves the recognition stage, but lots of processing occurs first.