MULTIMODAL EMOTION PERCEPTION: ANALOGOUS TO SPEECH PROCESSES. Dominic W. Massaro. Old Fashioned View of Spoken Language and Communication. Anecdotal Evidence for Functional Value of Visible Speech. Persons with Hearing Loss Benjamin Franklin in France Hal in 2001: A Space Odyssey
MULTIMODAL EMOTION PERCEPTION: ANALOGOUS TO SPEECH PROCESSES Dominic W. Massaro
Anecdotal Evidence for Functional Value of Visible Speech • Persons with Hearing Loss • Benjamin Franklin in France • Hal in 2001: A Space Odyssey • “Hear TV Better with Glasses On” • Poorly Dubbed Foreign Films
Value of Talking Heads • Enhance Realism and Naturalness • Convey Intention and Emotion • Enhance Intelligibility
Research Strategy to Develop and Evaluate the Effectiveness of Talking Heads • Auditory Synthetic speech • Computer Animated Talking Head • Development and Evaluation
PSLTalk (Baldi) • Successive Approximations to Realism • Real Time on PC Platforms • Controlled by text-to-speech synthesizer
Rotation of points • movement around axis, e.g., jaw rotation
Translation • movement of points, e.g., raise upper lip
Interpolation • Between two different subsections of wireframes--e.g., neck size
Scaling • constant multiplier, e.g., head width
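The four wireframe operations named on the preceding slides (rotation, translation, interpolation, scaling) can be sketched on plain 3-D points. This is a minimal illustration and not Baldi's actual implementation: the function names, the choice of the x-axis as the jaw hinge, and the sample coordinates are all assumptions.

```python
import math

def rotate_about_x(points, pivot, angle_deg):
    """Rotate points about an axis parallel to x that passes through pivot
    (assumed here to stand in for a jaw hinge)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    rotated = []
    for x, y, z in points:
        dy, dz = y - pivot[1], z - pivot[2]
        rotated.append((x, pivot[1] + c * dy - s * dz, pivot[2] + s * dy + c * dz))
    return rotated

def translate(points, offset):
    """Move every point by the same offset, e.g., to raise the upper lip."""
    return [(x + offset[0], y + offset[1], z + offset[2]) for x, y, z in points]

def interpolate(shape_a, shape_b, t):
    """Blend two wireframe subsections point by point; t=0 gives shape_a, t=1 gives shape_b."""
    return [
        tuple(a + t * (b - a) for a, b in zip(pa, pb))
        for pa, pb in zip(shape_a, shape_b)
    ]

def scale(points, factors):
    """Multiply each coordinate by a constant, e.g., factors=(1.2, 1, 1) widens the head."""
    return [(x * factors[0], y * factors[1], z * factors[2]) for x, y, z in points]
```

For example, `scale(head_points, (1.2, 1.0, 1.0))` would widen a hypothetical `head_points` list by 20% along x while leaving height and depth unchanged.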
Features • Driven by Text to Speech Engine • Target Values for each phoneme • Coarticulation
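The slide lists per-phoneme target values and coarticulation but does not say how they are combined. One common scheme, assumed here purely for illustration, is a weighted average of neighboring targets in which each phoneme's weight decays with temporal distance from its center. The target values, the decay rate, and all names below are hypothetical.

```python
import math

# Hypothetical targets for a single control parameter (e.g., lip rounding).
TARGETS = {"b": 0.1, "u": 0.9, "t": 0.2}

def dominance(t, center, rate=10.0):
    """Weight of a phoneme at time t: 1 at its center, decaying exponentially with distance."""
    return math.exp(-rate * abs(t - center))

def parameter_track(segments, times):
    """Return the parameter value at each time.

    segments: list of (phoneme, center_time_in_seconds) pairs.
    Each value is the dominance-weighted average of the phoneme targets.
    """
    track = []
    for t in times:
        weights = [dominance(t, center) for _, center in segments]
        weighted = sum(w * TARGETS[p] for w, (p, _) in zip(weights, segments))
        track.append(weighted / sum(weights))
    return track
```

Under this assumption the parameter never jumps between targets: at the center of /b/ it is already pulled slightly toward the upcoming /u/, which is the qualitative effect coarticulation is meant to capture.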
Additional Features • Coarticulation • Paralinguistic Properties • Nonspeech Segments • Texture Mapping
Paralinguistic Synthesis • Nonspeech Segments • Breath Noise, Cough, Clear Throat, Laugh, Lip Smack, Sneeze, Tongue Click, Burp • Baldi’s nonspeech
PSLTalk (Baldi) • Aligned with natural auditory speech
Evaluating Intelligibility of Baldi, our Talking Head • Speechreading Syllables and Words • Understanding Sentences in Noise • Compare Baldi to Humans
Additional Features • Tongue and Palate • Can Hide Noncritical Components • Can Reveal Normally-Hidden Parts • Can Highlight Interaction of Articulators
Language Training exercise lay, ray, they Top View Side View
Language Training exercise wah, rah Back View
Synthesis of Emotion • Voice is Informative • Face is More Critical • Basic Universal Emotions • Happiness, Anger, Surprise, Fear, Disgust, Sadness, and Neutral • Can specify the degree to which Baldi expresses each of these emotions, or some combination of them
Can specify the degree to which Baldi expresses each of these emotions, or some combination of them • Over 80% correct in a 6-alternative task
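Specifying a degree for each emotion, and combining emotions, could be sketched as adding scaled offsets to a neutral face. The slides do not describe Baldi's actual parameterization, so the parameter names, the offset values, and the additive rule below are assumptions.

```python
# Hypothetical control parameters for a neutral face.
NEUTRAL = {"brow_raise": 0.0, "mouth_corner": 0.0, "eye_open": 0.5}

# Hypothetical offsets from neutral at full intensity (degree = 1.0).
EMOTION_OFFSETS = {
    "happiness": {"mouth_corner": 0.8, "brow_raise": 0.1},
    "anger": {"brow_raise": -0.7, "mouth_corner": -0.4},
    "surprise": {"brow_raise": 0.9, "eye_open": 0.5},
}

def blend(degrees):
    """Return face parameters for a mix of emotions.

    degrees maps an emotion name to a degree in [0, 1]; emotions not
    listed contribute nothing, so an empty dict yields the neutral face.
    """
    face = dict(NEUTRAL)
    for emotion, degree in degrees.items():
        if not 0.0 <= degree <= 1.0:
            raise ValueError(f"degree for {emotion} must be in [0, 1]")
        for param, offset in EMOTION_OFFSETS[emotion].items():
            face[param] += degree * offset
    return face
```

A call such as `blend({"happiness": 1.0, "surprise": 0.5})` would then give a fully happy mouth with a half-strength surprised brow.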
Emotion Training exercise surprise Front View
Experimental Strategy to Study How Emotion is Processed • Manipulate auditory and visual speech • present unimodal stimuli • present factorial bimodal stimuli • no feedback in task • test models of perception
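The combination of unimodal and factorial bimodal stimuli can be enumerated directly. The sketch below assumes two sources with five levels each; the level labels and the use of `None` for an absent source are illustrative choices, not taken from the experiment itself.

```python
from itertools import product

def expanded_factorial(levels_a, levels_b):
    """List every stimulus condition as a (source_a, source_b) pair.

    None marks a source that is not presented, so unimodal conditions
    appear as (a, None) or (None, b). The (None, None) cell is excluded.
    """
    unimodal_a = [(a, None) for a in levels_a]
    unimodal_b = [(None, b) for b in levels_b]
    bimodal = list(product(levels_a, levels_b))
    return unimodal_a + unimodal_b + bimodal

# Assumed five-step continuum between two emotion endpoints.
LEVELS = ["happy", "2", "3", "4", "angry"]
conditions = expanded_factorial(LEVELS, LEVELS)
print(len(conditions))  # 5 + 5 + 25 = 35
```

Including the unimodal cells is what makes the design "expanded": responses to each source alone constrain how a model may combine the two sources in the bimodal cells.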
Expanded Factorial Design (Baldi) • Play Continuum-- voice fastest moving • Play Continuum-- face fastest moving
[Figure: expanded factorial design. Axes: Brow and Mouth, each with levels Happy, 2, 3, 4, Angry, and none]
Expanded Factorial Design (Baldi) • Play Continuum-- brow fastest moving • Play Continuum-- mouth fastest moving
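One kind of model that data from such a design can test is a multiplicative integration rule, in the style of the fuzzy logical model of perception. The slides in this section do not name a model, so treating this rule as the candidate is an assumption, and the support values in the example are made up for illustration.

```python
def integrate(brow_support, mouth_support):
    """Predicted probability of a 'happy' response for a two-alternative task.

    Each argument is the degree (0 to 1) to which that facial feature
    supports 'happy'; 1 - support is taken as its support for 'angry'.
    """
    happy = brow_support * mouth_support
    angry = (1.0 - brow_support) * (1.0 - mouth_support)
    if happy + angry == 0.0:
        # Fully contradictory sources (one 0, the other 1): the rule is undefined.
        raise ValueError("sources are completely contradictory")
    return happy / (happy + angry)

def unimodal(support):
    """With a single source presented, the prediction is just its support."""
    return support
```

Under this rule an ambiguous source (support 0.5) leaves the other source's prediction unchanged, while two sources that agree yield a more extreme prediction than either alone.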