PS: Introduction to Psycholinguistics Winter Term 2005/06 Instructor: Daniel Wiechmann Office hours: Mon 2-3 pm Email: daniel.wiechmann@uni-jena.de Phone: 03641-944534 Web: www.daniel-wiechmann.net
Session 4: Understanding speech • Problems with the recognition of speech • Segmentation problem (how to separate the sounds in continuous speech) • Possible remedies: • Possible-word constraint (see the sketch below) • Metrical segmentation strategy • Stress-based segmentation • Syllable-based segmentation
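To make the possible-word constraint concrete, here is a minimal Python sketch. It uses a crude stand-in for "possible word" (a stranded residue must contain a vowel); the vowel check, toy lexicon, and function names are illustrative inventions, not the actual Shortlist/PWC implementation of Norris et al. (1997).

```python
# Toy sketch of the possible-word constraint (PWC): a segmentation is
# disfavoured if it strands a residue that could never be a word,
# crudely approximated here as "contains no vowel".

VOWELS = set("aeiou")

def impossible_residue(residue: str) -> bool:
    """A non-empty, vowel-less residue (e.g. the 'f' in 'fapple')
    cannot be an English word."""
    return bool(residue) and not any(ch in VOWELS for ch in residue)

def spot_words(utterance: str, lexicon: set[str]):
    """Find embedded lexicon words and flag PWC violations."""
    for i in range(len(utterance)):
        for j in range(i + 1, len(utterance) + 1):
            word = utterance[i:j]
            if word in lexicon:
                left, right = utterance[:i], utterance[j:]
                blocked = impossible_residue(left) or impossible_residue(right)
                yield word, left, right, blocked

lexicon = {"apple"}
for utt in ("fapple", "vuffapple"):
    for word, left, right, blocked in spot_words(utt, lexicon):
        status = "blocked by PWC" if blocked else "allowed"
        print(f"{utt}: {left!r}+{word!r}+{right!r} -> {status}")
```

The sketch mirrors the classic finding: "apple" is easy to spot in "vuffapple" (the residue "vuff" could be a word) but hard in "fapple" (the residue "f" could not).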
Session 4: Understanding speech • Categorical perception • Experiment: Liberman et al. (1957) • A speech synthesizer creates a continuum of artificial syllables that differ in the place of articulation of one phoneme • Subjects placed the syllables into three categories (/b/, /d/, /g/)
Session 4: Understanding speech • Categorical perception • Voice onset time (VOT) • Voiced and unvoiced consonants (e.g. /b/, /d/ vs /p/, /t/) differ with respect to VOT (difference ~ 60 ms) • Experimenters varied VOT along a scale (e.g. 30 ms) • Subjects make 'either-or' distinctions (see the sketch below)
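A minimal sketch of what 'either-or' identification looks like: a steep logistic identification function along the VOT continuum. The boundary value (~25 ms) and slope are illustrative assumptions, not Liberman et al.'s actual data.

```python
# Categorical perception along a VOT continuum: with a steep
# identification function, percepts flip abruptly from /b/ to /p/
# near the boundary instead of changing gradually with VOT.
import math

def prob_p(vot_ms: float, boundary_ms: float = 25.0, slope: float = 1.5) -> float:
    """Probability of reporting /p/ (voiceless) at a given VOT."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary_ms)))

for vot in range(0, 61, 10):
    p = prob_p(vot)
    label = "/p/" if p > 0.5 else "/b/"
    print(f"VOT {vot:2d} ms -> P(/p/) = {p:.3f}  heard as {label}")
```

Running this prints identification probabilities that jump from near 0 to near 1 between 20 and 30 ms; shifting `boundary_ms` toward the /p/ end would reproduce the selective-adaptation pattern on the next slide.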
Session 4: Understanding speech • Categorical perception • Selective adaptation • Repeated presentation of /ba/ makes people less sensitive to the voicing feature (the feature detector fatigues) • The cut-off point for the /b/-/p/ distinction shifts toward the /p/ end of the continuum
Session 4: Understanding speech • Prelexical (phonetic) vs postlexical (phonemic) code • The prelexical code is computed directly from perceptual analysis (bottom-up) • The postlexical code is computed from higher-level units such as words (top-down) • Foss and Blank (1980): phoneme-monitoring task • But cf. Foss and Gernsbacher (1983) and Marslen-Wilson and Warren (1994)
Session 4: Understanding speech • In summary: • There is a controversy about whether or not we identify phonemes before we recognize higher-level units (e.g. syllables or words)
Session 4: Understanding speech • The role of context in identifying sounds: the phonemic restoration effect (cf. Warren and Warren 1970)
Session 4: Understanding speech • It was found that the *eel was on the orange • It was found that the *eel was on the axle • It was found that the *eel was on the shoe • It was found that the *eel was on the table
Session 4: Understanding speech • It was found that the peel was on the orange • It was found that the wheel was on the axle • It was found that the heel was on the shoe • It was found that the meal was on the table
Understanding speech • Phonemic restoration effect: 2 explanations • 1. Context interacts directly with bottom-up processes (sensitivity effect) • 2. Context may simply provide an additional source of information (response bias effect)
Understanding speech: Samuel (1981, 1990) • Method: • Subjects listened to sentences; meaningless noise was presented during each sentence • On some trials, the noise was superimposed on one of the phonemes of a word • On other trials, the phoneme was deleted and only the noise was presented • Finally, the phoneme was sometimes predictable from context • Task • Decide whether or not the crucial phoneme had been presented
Understanding speech: Samuel (1981, 1990) • Phonemic restoration effect: 2 explanations • Hypotheses • 1. If context improves sensitivity, then the ability to discriminate between phoneme-plus-noise and noise alone should be improved by predictable context • 2. If context affects response bias, then participants should simply be more likely to decide that the phoneme was presented when the word occurred in a predictable context (see the signal detection sketch below)
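The two hypotheses map onto the two standard quantities of signal detection theory: sensitivity (d′) and criterion (c). A minimal sketch, with hit and false-alarm rates invented to mimic the bias-only pattern Samuel reported:

```python
# Signal detection sketch for Samuel's logic: d' measures how well
# listeners discriminate phoneme+noise from noise-alone; c measures
# bias toward responding "phoneme present". The rates below are
# illustrative, not Samuel's data.
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse of the standard normal CDF

def d_prime(hit_rate: float, fa_rate: float) -> float:
    return z(hit_rate) - z(fa_rate)

def criterion(hit_rate: float, fa_rate: float) -> float:
    return -0.5 * (z(hit_rate) + z(fa_rate))

# Predictable context raises BOTH hits and false alarms
# (a bias shift), leaving sensitivity essentially unchanged.
conditions = {"neutral": (0.75, 0.25), "predictable": (0.85, 0.40)}
for label, (h, fa) in conditions.items():
    print(f"{label:11s}: d' = {d_prime(h, fa):.2f}, c = {criterion(h, fa):.2f}")
```

Here d′ stays roughly constant (~1.3) while c moves toward "present" responses under predictable context, i.e. a pure response-bias effect.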
Understanding speech: Samuel (1981, 1990) • Results: • Context affected response bias but not sensitivity • Contextual information does not have a direct effect on bottom-up processing
Understanding speech: Models of speech recognition • Most influential models • Motor theory (Liberman et al. 1967) • Listeners mimic the articulatory movements of the speaker • Cohort theory (Marslen-Wilson and Tyler 1980) • TRACE model (McClelland and Elman 1986)
Understanding speech: Models of speech recognition: neuron (schematic) • Synapse: the junction across which a nerve impulse passes from an axon terminal to a neuron
Understanding speech: Models of speech recognition: neuronal networks • The brain is composed of some 10-100 billion nerve cells, or neurons, which communicate with one another through specialized contacts called synapses. Typically, a single neuron receives 2000-5000 synapses from other neurons; these synapses are located almost exclusively on the neuron's dendrites, long projections that radiate out from the neuron's cell body. In turn, the neuron's axon, a long thin process that grows out from the cell body, makes synaptic connections with about 1000 other neurons. In this way, neuronal signals pass from neuron to neuron to form extensive and elaborate neural circuits.
Understanding speech: Models of speech recognition: the number of neurons in the human brain
Understanding speech: Models of speech recognition: introducing connectionist models
Understanding speech: Models of speech recognition: introducing connectionist models • Two central assumptions of artificial neural networks (ANNs): 1) processing occurs through the action of many simple, interconnected processing units (neurons) 2) activation spreads around the network in a way determined by the strength of the links, i.e. the connections between units
Understanding speech: Models of speech recognition: introducing connectionist models • Some models learn • back-propagation (see the sketch below) • Some don't • The interactive activation (IAC) model of McClelland and Rumelhart (1981) does not learn • The TRACE model (McClelland and Elman 1986) is an IAC model
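For contrast with non-learning IAC networks, here is a minimal back-propagation sketch: a tiny two-layer sigmoid network learns logical AND by gradient descent. The architecture, learning rate, and epoch count are illustrative choices; this is generic backprop, not part of TRACE.

```python
# Minimal back-propagation: errors at the output are pushed back
# through the hidden layer to adjust all connection weights.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [0], [0], [1]], dtype=float)  # targets for logical AND

W1 = rng.normal(0.0, 0.5, (2, 2)); b1 = np.zeros((1, 2))  # input -> hidden
W2 = rng.normal(0.0, 0.5, (2, 1)); b2 = np.zeros((1, 1))  # hidden -> output
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
lr = 1.0

for epoch in range(3001):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: propagate the error signal through both layers
    d_out = (out - y) * out * (1.0 - out)   # dMSE/dnet at the output
    d_h = (d_out @ W2.T) * h * (1.0 - h)    # error pushed back to hidden
    # gradient-descent weight updates
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0, keepdims=True)
    if epoch % 1000 == 0:
        print(f"epoch {epoch:4d}  mse = {np.mean((out - y) ** 2):.4f}")

print("learned AND:", sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).round(2).ravel())
```

An IAC network like TRACE skips all of this: its weights are set by hand and stay fixed.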
Understanding speech: Models of speech recognition: from neural networks to connectionist models • Connections can be inhibitory or excitatory (facilitatory) • Threshold: the total amount of activation needed to make the node fire • Connections (or links) have different weights
Understanding speech: Models of speech recognition: from neural networks to connectionist models • Worked example 1: a node receives inputs of +0.6 (excitatory), -0.5 (inhibitory) and +0.7 (excitatory); the net input of 0.8 falls short of the threshold of 1.0, ergo no firing (activations range from -1 to +1)
Understanding speech: Models of speech recognition: from neural networks to connectionist models • Worked example 2: a node receives several weighted inputs (e.g. +0.9 and +0.4 excitatory, -0.2 inhibitory); here the summed input exceeds the threshold of 1.0, ergo firing (activations range from -1 to +1)
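The firing rule in these two worked examples is easy to state in code. A minimal sketch of a single threshold node; the function name and the second example's values are illustrative (the first uses the slide's own numbers):

```python
# A single connectionist node: it fires when the weighted sum of its
# inputs reaches the threshold (cf. the two worked examples above).

def node_fires(weights, activations, threshold=1.0):
    """Sum each sender activation times its link weight (excitatory
    weights positive, inhibitory weights negative) and compare the
    net input against the firing threshold."""
    net = sum(w * a for w, a in zip(weights, activations))
    print(f"net input = {net:+.2f} vs threshold {threshold}")
    return net >= threshold

# Example 1 (slide values, senders fully active): 0.6 - 0.5 + 0.7 = 0.8 -> no firing
print(node_fires([+0.6, -0.5, +0.7], [1.0, 1.0, 1.0]))

# Example 2 (illustrative values): 0.9 - 0.2 + 0.4 + 0.5 = 1.6 -> firing
print(node_fires([+0.9, -0.2, +0.4, +0.5], [1.0, 1.0, 1.0, 1.0]))
```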
Understanding speech: Models of speech recognition: from neural networks to connectionist models • Interactive activation network (McClelland and Rumelhart 1981)
Understanding speech: Models of speech recognition: TRACE • TRACE model (McClelland and Elman 1986) • There are individual processing units, or nodes, at three different levels: • FEATURES (place & manner of articulation, voicing) • PHONEMES • WORDS
Understanding speech: Models of speech recognition: TRACE • TRACE model (McClelland and Elman 1986) • Feature nodes are connected to phoneme nodes • Phoneme nodes are connected to word nodes • Connections between levels operate in both directions and are only facilitatory (i.e. no inhibition between levels)
Understanding speech: Models of speech recognition: TRACE • TRACE model (McClelland and Elman 1986) • There are also connections among units, or nodes, at the same level • These within-level connections are inhibitory
Understanding speech: Models of speech recognition: TRACE • TRACE model (McClelland and Elman 1986) • Nodes influence each other in proportion to their activation levels and the strength of their interconnections • As excitation and inhibition spread among nodes, a pattern of activation, or TRACE, develops
Understanding speech: Models of speech recognition: TRACE • TRACE model (McClelland and Elman 1986) • The word that is recognized is determined by the activation level of the possible candidate words (a toy simulation follows below)
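A toy, TRACE-flavoured simulation of these dynamics: phoneme and word nodes, bidirectional facilitatory links between levels, and inhibitory links within the word level. The lexicon, parameter values, and update rule are invented for illustration and are far simpler than McClelland and Elman's actual implementation (e.g. there is no time-aligned feature level here).

```python
# Toy interactive-activation dynamics in the spirit of TRACE:
# bottom-up input excites phoneme nodes, phonemes excite the words
# that contain them, active words feed excitation back to their
# phonemes, and words inhibit one another (within-level competition).

WORDS = {"cat": ["k", "a", "t"], "cap": ["k", "a", "p"], "mat": ["m", "a", "t"]}
PHONEMES = sorted({p for ps in WORDS.values() for p in ps})

EXCITE, INHIBIT, DECAY = 0.10, 0.20, 0.05      # invented parameters
phon_act = {p: 0.0 for p in PHONEMES}
word_act = {w: 0.0 for w in WORDS}
bottom_up = {"k": 1.0, "a": 1.0, "t": 0.8}     # noisy input resembling "cat"

def clip(x):                                    # keep activations in [0, 1]
    return max(0.0, min(1.0, x))

for cycle in range(20):
    # phoneme level: bottom-up input plus top-down feedback from words
    new_phon = {
        p: clip((1 - DECAY) * phon_act[p]
                + EXCITE * bottom_up.get(p, 0.0)
                + sum(EXCITE * word_act[w] for w, ps in WORDS.items() if p in ps))
        for p in PHONEMES
    }
    # word level: support from constituent phonemes minus rival inhibition
    new_word = {
        w: clip((1 - DECAY) * word_act[w]
                + sum(EXCITE * phon_act[p] for p in ps)
                - sum(INHIBIT * word_act[v] for v in WORDS if v != w))
        for w, ps in WORDS.items()
    }
    phon_act, word_act = new_phon, new_word     # synchronous update

print({w: round(a, 3) for w, a in word_act.items()})
print("recognized:", max(word_act, key=word_act.get))   # -> 'cat'
```

Within-level inhibition makes the rich get richer: once "cat" pulls ahead, it suppresses "cap" and "mat", while its top-down feedback strengthens the very phonemes that support it — the developing pattern of activation is the TRACE.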