MULTIMODAL EMOTION PERCEPTION: ANALOGOUS TO SPEECH PROCESSES

Dominic W. Massaro

Presentation Transcript


  1. MULTIMODAL EMOTION PERCEPTION: ANALOGOUS TO SPEECH PROCESSES Dominic W. Massaro

  2. Old-Fashioned View of Spoken Language and Communication

  3. Anecdotal Evidence for Functional Value of Visible Speech • Persons with Hearing Loss • Benjamin Franklin in France • Hal in 2001: A Space Odyssey • “Hear TV Better with Glasses On” • Poorly Dubbed Foreign Films

  4. Visual Communication in Love

  5. Value of Talking Heads • Enhance Realism and Naturalness • Convey Intention and Emotion • Enhance Intelligibility

  6. Research Strategy to Develop and Evaluate the Effectiveness of Talking Heads • Auditory Synthetic speech • Computer Animated Talking Head • Development and Evaluation

  7. PSLTalk (Baldi) • Successive Approximations to Realism • Real Time on PC Platforms • Controlled by text-to-speech synthesizer

  8. Baldi Describes Himself

  9. Rotation of points • movement around axis, e.g., jaw rotation

  10. Translation • movement of points, e.g., raise upper lip

  11. Interpolation • between two different subsections of wireframes, e.g., neck size

  12. Scaling • constant multiplier, e.g., head width
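The four operations on slides 9 to 12 can be sketched on a list of 3D wireframe points. This is only an illustrative sketch in plain Python, not Baldi's actual implementation: the function names, the choice of the x axis as the jaw hinge, and the per-axis scale factors are all assumptions made for the example.

```python
import math

def rotate_about_x(points, angle_rad, pivot=(0.0, 0.0, 0.0)):
    """Rotation: turn points around an axis parallel to x through `pivot`
    (assumed stand-in for a jaw hinge)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    _, py, pz = pivot
    rotated = []
    for x, y, z in points:
        dy, dz = y - py, z - pz
        rotated.append((x, py + c * dy - s * dz, pz + s * dy + c * dz))
    return rotated

def translate(points, offset):
    """Translation: move every point by the same offset (e.g., raise upper lip)."""
    ox, oy, oz = offset
    return [(x + ox, y + oy, z + oz) for x, y, z in points]

def interpolate(shape_a, shape_b, t):
    """Interpolation: blend two versions of the same wireframe subsection.
    t = 0 gives shape_a, t = 1 gives shape_b."""
    return [
        tuple(a + t * (b - a) for a, b in zip(pa, pb))
        for pa, pb in zip(shape_a, shape_b)
    ]

def scale(points, factors):
    """Scaling: multiply each coordinate by a constant (e.g., head width on x)."""
    fx, fy, fz = factors
    return [(x * fx, y * fy, z * fz) for x, y, z in points]
```

For instance, `scale(head_points, (1.2, 1.0, 1.0))` would widen a hypothetical `head_points` list by 20% along x while leaving the other axes unchanged.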

  13. Features • Driven by Text to Speech Engine • Target Values for each phoneme • Coarticulation
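One way to picture per-phoneme target values combined with coarticulation is as a weighted blend of neighboring targets over time. The sketch below is an assumption-laden illustration, not the slides' method: the target table, the exponential weighting function `dominance`, and its `rate` parameter are all hypothetical choices for the example.

```python
import math

# Hypothetical targets for one control parameter (lip opening, 0 = closed, 1 = open).
TARGETS = {"b": 0.0, "a": 1.0, "m": 0.0}

def dominance(t, center, rate=10.0):
    """Assumed weighting: a segment's influence decays with distance from its center time."""
    return math.exp(-rate * abs(t - center))

def parameter_at(t, segments, targets=TARGETS):
    """Blend the targets of all segments at time t (seconds).

    segments: list of (phoneme, center_time) pairs.
    Returns the dominance-weighted average of the segment targets.
    """
    weights = [dominance(t, center) for _, center in segments]
    total = sum(weights)
    return sum(w * targets[p] for w, (p, _) in zip(weights, segments)) / total
```

With `segments = [("b", 0.0), ("a", 0.2)]`, the lip opening at the center of "b" is pulled slightly toward the following vowel's target rather than sitting exactly at the consonant's own target, which is the coarticulation effect the sketch is meant to show.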

  14. Additional Features • Coarticulation • Paralinguistic Properties • Nonspeech Segments • Texture Mapping

  15. Paralinguistic Synthesis • Nonspeech Segments • Breath Noise, Cough, Clear Throat, Laugh, Lip Smack, Sneeze, Tongue Click, Burp • Baldi’s nonspeech

  16. Texture Mapping

  17. PSLTalk (Baldi) • Aligned with natural auditory speech

  18. Evaluating Intelligibility of Baldi, our Talking Head • Speechreading Syllables and Words • Understanding Sentences in Noise • Compare Baldi to Humans

  19. Additional Features • Tongue and Palate • Can Hide Noncritical Components • Can Reveal Normally-Hidden Parts • Can Highlight Interaction of Articulators

  20. Training with Baldi’s New Tongue and Palate

  21. Language Training Exercise: lay, ray, they • Top View • Side View

  22. Language Training Exercise: wah, rah • Back View

  23. Synthesis of Emotion • Voice is Informative • Face is More Critical • Basic Universal Emotions • Happiness, Anger, Surprise, Fear, Disgust, Sadness, and Neutral • Can specify the degree to which Baldi expresses each of these emotions, or a combination of them

  24. Can specify the degree to which Baldi expresses each of these emotions, or a combination of them • Over 80% correct in a 6-alternative task
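Specifying a degree for each emotion and combining several of them can be sketched as adding scaled offsets to a neutral face. The parameter names, the offset values, and the additive rule below are hypothetical and chosen only for illustration; the slides do not say how Baldi combines emotions internally.

```python
# Hypothetical control parameters for a neutral face.
NEUTRAL = {"brow_raise": 0.0, "mouth_corner": 0.0, "jaw_open": 0.1}

# Hypothetical full-strength offsets from neutral for a few emotions.
EMOTION_OFFSETS = {
    "happiness": {"mouth_corner": 0.8, "brow_raise": 0.1},
    "anger": {"brow_raise": -0.7, "mouth_corner": -0.3},
    "surprise": {"brow_raise": 0.9, "jaw_open": 0.6},
}

def blend(degrees):
    """Return face parameters for a mix of emotions.

    degrees: mapping from emotion name to a degree in [0, 1].
    Each emotion's offsets are scaled by its degree and added to neutral.
    """
    face = dict(NEUTRAL)
    for emotion, degree in degrees.items():
        if not 0.0 <= degree <= 1.0:
            raise ValueError(f"degree for {emotion!r} must be in [0, 1]")
        for param, offset in EMOTION_OFFSETS[emotion].items():
            face[param] += degree * offset
    return face
```

A call such as `blend({"happiness": 0.5, "surprise": 0.3})` would describe a face that is moderately happy and mildly surprised at the same time.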

  25. Emotion Training Exercise: surprise • Front View

  26. Emotion Hallucinations

  27. Speech Hallucinations

  28. Experimental Strategy to Study How Emotion is Processed • Manipulate auditory and visual speech • present unimodal stimuli • present factorial bimodal stimuli • no feedback in task • test models of perception
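The slide does not name the models being tested. As one example of the kind of model such a design can test, the sketch below uses a multiplicative integration rule of the type found in fuzzy-logical accounts of perception: each modality supplies a degree of support between 0 and 1 for one of two alternatives, and the supports are multiplied and normalized. Treating this as the model in question is an assumption, and the function name and the fallback for fully contradictory inputs are illustrative.

```python
def integrate(face_support, voice_support):
    """Combine two sources of support for alternative A (e.g., "happy").

    Each argument is the degree (0..1) to which that modality supports A;
    1 minus that value is taken as its support for the other alternative.
    Returns the predicted probability of responding A.
    """
    match = face_support * voice_support
    mismatch = (1.0 - face_support) * (1.0 - voice_support)
    if match + mismatch == 0.0:
        # The sources fully contradict each other; the rule is undefined here,
        # so this sketch falls back to chance.
        return 0.5
    return match / (match + mismatch)
```

Under this rule an uninformative source (support 0.5) leaves the other source's value unchanged, while two sources that agree yield a more extreme prediction than either one alone.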

  29. Expanded Factorial Design (Baldi) • Play Continuum: voice fastest moving • Play Continuum: face fastest moving
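An expanded factorial design presents every bimodal combination of the two continua plus each modality on its own. The sketch below enumerates those conditions under assumed details: a five-step happy-to-angry continuum per modality, and `None` marking an absent modality. The level labels and the function name are illustrative only.

```python
from itertools import product

# Assumed five-step continuum; the endpoint and step labels are illustrative.
LEVELS = ["happy", "2", "3", "4", "angry"]

def expanded_factorial(levels=LEVELS):
    """List (face, voice) conditions: all bimodal pairs, then unimodal ones.

    product() varies its last argument fastest, so within the bimodal block
    the voice level changes on every step and the face level changes slowest.
    None marks a modality that is not presented.
    """
    bimodal = list(product(levels, levels))
    face_only = [(face, None) for face in levels]
    voice_only = [(None, voice) for voice in levels]
    return bimodal + face_only + voice_only
```

With five levels per modality this gives 25 bimodal plus 10 unimodal conditions; swapping the tuple order in the bimodal block would make the face the fastest-moving factor instead.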

  30. Two Sources Better than either One Alone

  31. Unique Response from Two Sources

  32. [Figure: factorial design crossing Brow and Mouth; each axis has the levels Happy, 2, 3, 4, Angry, and none]

  33. Expanded Factorial Design (Baldi) • Play Continuum: brow fastest moving • Play Continuum: mouth fastest moving

  34. Identification Judgments

  35. Rating Judgments
