Visual speech speeds up the neural processing of auditory speech
van Wassenhove, V., Grant, K. W., & Poeppel, D. (2005). Proceedings of the National Academy of Sciences, 102(4), 1181-1186.
Jaimie Gilbert • Psychology 593 • October 6, 2005
Audio-Visual Integration • Information from one modality (e.g., visual) can influence the perception of information presented in a different modality (e.g., auditory) • Examples: speech in noise, the McGurk effect
Demonstrations of the McGurk Effect • Audiovisual Speech Web-Lab: http://www.faculty.ucr.edu/~rosenblu/lab-index.html • Arnt Maasø, University of Oslo: http://www.media.uio.no/personer/arntm/McGurk_english.html
Unresolved questions about AV integration • Behavioral evidence exists for vision altering the perception of speech, but… • When does it occur in processing? • How does it occur?
ERPs can help answer the “when” question • EEG/MEG studies have demonstrated AV integration effects using oddball/mismatch paradigms • These effects occur around 150-250 ms • An ERP study using non-speech, ecologically invalid stimuli demonstrated earlier interaction effects at 40-95 ms (Giard & Peronnet, 1999) • Does AV integration for speech occur earlier than 150-250 ms?
There’s a debate about the “how” question… • Enhancement: audio-visual integration generates activity at multisensory integration sites, with information possibly fed back to sensory cortices • vs. • Suppression: two corresponding sensory stimuli reduce stimulus uncertainty, which reduces the amount of processing required
The Experiments • 3 experiments were conducted • Each had behavioral and EEG measures • Behavioral: forced-choice identification task • EEG: auditory P1/N1/P2 • 26 participants • Experiment 1: 16 • Experiment 2: 10 • Experiment 3: 10 (of the 16 who participated in Experiment 1)
The Stimuli • Audio /pa/, /ta/, /ka/ • Visual /pa/, /ta/, /ka/ • Congruent AV /pa/, /ta/, /ka/ • Incongruent AV: audio /pa/ + visual /ka/ • One female face and voice for all stimuli • In Exp. 1 & 2, each stimulus was presented 100 times, for a total of 1,000 trials
Experiment 1 • Stimuli presented blocked by modality: audio-only, visual-only, or AV (congruent and incongruent) • Participants knew before each block which stimulus types would be presented
Experiment 2 • Stimuli presented in randomized blocks containing all stimulus types (A, V, congruent AV, incongruent AV) to reduce expectancy • Task for both experiments: identify which stimulus was presented; for AV trials, report what was heard while looking at the face
Experiment 3 • Presented 200 incongruent AV stimuli • Task: report the syllable you saw, ignoring what you heard • In all experiments, the correct response to incongruent AV was /ta/, the classic McGurk fusion percept
Waveform Analysis • Retained 75-80% of recordings after artifact rejection and ocular artifact reduction • Only correct responses were analyzed • 6 electrodes used in analysis: FC3, FC4, FCz, CPz, P7, P8 • Reference electrodes: linked mastoids • (a sketch of this trial-selection pipeline follows)
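The trial-selection and averaging step above can be sketched in a few lines. A minimal illustration in Python/NumPy, assuming hypothetical single-channel epoched data and a simple peak-to-peak rejection threshold; the paper's actual artifact rejection and ocular artifact reduction procedures are not reproduced here:

```python
import numpy as np

def average_erp(epochs, correct, reject_uv=100.0):
    """Average ERP over artifact-free, correct-response trials.

    epochs    : (n_trials, n_samples) single-trial EEG in microvolts
    correct   : (n_trials,) boolean mask of correct behavioral responses
    reject_uv : peak-to-peak rejection threshold (assumed value; the
                paper does not report its exact criterion)
    """
    peak_to_peak = epochs.max(axis=1) - epochs.min(axis=1)
    keep = correct & (peak_to_peak < reject_uv)
    print(f"retained {keep.mean():.0%} of trials")  # paper kept ~75-80%
    return epochs[keep].mean(axis=0)
```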
Results • This study’s answer to “how”: the suppression/deactivation hypothesis • AV N1 and P2 amplitudes were significantly reduced compared to auditory-alone peaks • A separate analysis tested whether summing the responses to the unimodal stimuli would reproduce the amplitude reduction seen in the data; it did not • The AV waveform is therefore not a superposition of the two sensory waveforms, but reflects genuine multisensory interaction (a toy version of this superposition test follows)
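The superposition test can be illustrated with synthetic waveforms: compare the N1 of the measured AV response against the N1 of the point-wise sum of the unimodal A and V responses. A toy sketch only; the waveforms, search window, and amplitudes are invented for illustration, and the paper's statistics were computed on real peak measurements:

```python
import numpy as np

times = np.linspace(-0.1, 0.4, 256)  # epoch time axis in seconds

def n1_peak(erp, times, window=(0.075, 0.125)):
    """Most negative value within an assumed N1 search window."""
    mask = (times >= window[0]) & (times <= window[1])
    return erp[mask].min()

# Toy waveforms standing in for the averaged A, V, and AV ERPs:
# the AV response is smaller and earlier than the A-alone response.
rng = np.random.default_rng(0)
noise = lambda: rng.normal(0.0, 0.1, times.size)
erp_a = -5.0 * np.exp(-((times - 0.10) / 0.02) ** 2) + noise()
erp_v = -1.0 * np.exp(-((times - 0.10) / 0.02) ** 2) + noise()
erp_av = -4.0 * np.exp(-((times - 0.09) / 0.02) ** 2) + noise()

# If AV were mere superposition, these two numbers would match.
print("N1 (AV):   ", n1_peak(erp_av, times))
print("N1 (A + V):", n1_peak(erp_a + erp_v, times))
# The paper found AV amplitude < A alone and AV != A + V, implying
# genuine multisensory interaction rather than linear summation.
```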
Results: Experiment 1 • N1/P2 amplitude: AV < A (p < .0001) • N1/P2 latency: AV < A (significant, but confounded by an interaction) • Modality × Stimulus Identity: latencies ordered /pa/ < /ta/ < /ka/ (p < .0001) • The latency effect is more pronounced at P2, but can occur as early as N1
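Latency facilitation such as “AV < A” is typically quantified by locating the N1/P2 peaks per condition and differencing the latencies. A minimal sketch; the window boundaries are assumptions, not the paper's values:

```python
import numpy as np

def peak_latency(erp, times, window, polarity):
    """Latency (s) of the extremum within a search window.

    polarity: -1 for negative-going peaks (N1), +1 for positive (P2).
    """
    mask = (times >= window[0]) & (times <= window[1])
    return times[mask][np.argmax(polarity * erp[mask])]

# Using erp_a, erp_av, and times from the previous sketch:
# facilitation = A-alone latency minus AV latency; positive values mean
# the AV peak occurs earlier, as reported for N1 and especially P2.
# n1_facilitation = (peak_latency(erp_a, times, (0.05, 0.15), -1)
#                    - peak_latency(erp_av, times, (0.05, 0.15), -1))
```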
Results: Experiment 2 • N1/P2 amplitude: AV < A (p < .0001) • N1/P2 latency: AV < A (p < .0001) • Modality × Stimulus Identity interaction only marginal (p < .06)
Results: Comparison of Exp. 1 & 2 • Similar results in both experiments • Temporal facilitation varied by stimulus identity, but amplitude reduction did not • No evidence of an attention effect (i.e., of expectancy affecting waveform morphology)
Temporal facilitation depends on visual saliency/signal redundancy • More temporal facilitation is expected when: • the audio and visual signals are redundant • the visual cue (which naturally precedes the auditory signal) is more salient • (see Figure 3 of the paper)
Results: Experiment 3/Incongruent AV Stimuli • Incongruent AV stimuli in Exp. 1 & 2: no temporal facilitation, but amplitude reduction was present and equivalent to the reduction seen for congruent AV stimuli • Experiment 3 (attend-visual task): both temporal facilitation and amplitude reduction occurred
Visual speech effects on auditory speech • Perceptual ambiguity/salience of visual speech affects processing time of auditory speech • Incorporating visual speech with auditory speech reduces the amplitude of N1/P2 “independent of AV congruency, participant’s expectancy, and attended modality” (p. 1184)
Ecologically valid stimuli • The authors suggest that AV speech processing differs from general multisensory integration because of the ecological validity of speech
Possible explanation for amplitude reduction • Visemes provide information about place of articulation • If this information is salient and/or redundant with auditory place-of-articulation cues (e.g., the 2nd and 3rd formants), the auditory cortex does not need to analyze those frequency regions, so fewer neurons fire
Analysis-by-Synthesis Model of AV Speech Perception • Visual speech activates an internal representation/prediction • This representation/prediction is updated as more visual information is received over time • The representation/prediction is compared to the incoming auditory signal • Residual errors in this matching process are reflected in the temporal facilitation and amplitude reduction effects • The attended modality can influence temporal facilitation • (a toy sketch of this loop follows)
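One way to make the model concrete is a toy predictive loop: each visual frame sharpens a probability distribution over syllable candidates, and the residual between that prediction and the incoming auditory evidence stands in for how much auditory processing remains. A deliberately simplified sketch, not the authors' formal model; all numbers are invented:

```python
import numpy as np

SYLLABLES = ["pa", "ta", "ka"]

def update_prediction(prior, visual_likelihood):
    """Bayesian update of the syllable prediction from one visual frame."""
    posterior = prior * visual_likelihood
    return posterior / posterior.sum()

prediction = np.full(3, 1.0 / 3.0)  # flat prior over /pa/, /ta/, /ka/

# Hypothetical per-frame visual likelihoods: /pa/ is visually salient,
# so the prediction sharpens quickly before any audio arrives.
for frame in ([0.70, 0.20, 0.10], [0.80, 0.15, 0.05]):
    prediction = update_prediction(prediction, np.array(frame))

audio_evidence = np.array([1.0, 0.0, 0.0])  # auditory /pa/ arrives
residual = np.abs(audio_evidence - prediction).sum()

# A small residual corresponds to more temporal facilitation and less
# remaining auditory processing (amplitude reduction) in the model.
print(dict(zip(SYLLABLES, prediction.round(3))), residual.round(3))
```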
The authors suggest 2 time scales for AV integration • 1: feature stage • 25 ms • latency facilitation • (sub-)segmental analysis • 2: perceptual-unit stage • 200 ms • amplitude reduction • syllable-level analysis • independent of feature content and attended modality
Summary • AV speech interaction occurs by the time N1 is elicited (50-100 ms) • Processing time of auditory speech varies by the saliency/ambiguity of visual speech • Amplitude of AV ERP reduced when compared to amplitude of A-alone ERP
Questions • How do dynamic visual stimuli interact with ocular artifact (and its reduction)? • If effects of AV integration are influenced by the attended modality, would modality dominance also influence these effects? • Are incongruent AV/McGurk stimuli ecologically valid?