
Perceiving Talking Faces: A Paradigm for Multimodal Communication

Perceiving Talking Faces: A Paradigm for Multimodal Communication. Perceptual Science Laboratory University of California Santa Cruz, CA 95064 mambo.ucsc.edu. Perceptual Science Laboratory (PSL). Dom Massaro Michael Cohen Christopher Campbell Rashid Clark Jonas Beskow


Presentation Transcript


  1. Perceiving Talking Faces: A Paradigm for Multimodal Communication Perceptual Science Laboratory University of California Santa Cruz, CA 95064 mambo.ucsc.edu

  2. Perceptual Science Laboratory (PSL) • Dom Massaro • Michael Cohen • Christopher Campbell • Rashid Clark • Jonas Beskow • Kartik Venkataraman • Tony Rodriguez • Nathan Sanders

  3. Interdisciplinary Endeavor • Cognitive Science • Psychology • Philosophy • Linguistics • Computer Sciences • Anthropology

  4. Visual Communication in Love

  5. Anecdotal Evidence • Persons with Hearing Loss • Benjamin Franklin in France • Hal in 2001: A Space Odyssey • “Hear TV Better with Glasses On” • Poorly Dubbed Foreign Films

  6. Summary of Research • Psychological and Psycholinguistic Inquiry • How we make sense of the world • Value of multiple sources of information • Expert parallel processors • Developing and evaluating a talking head • Application in language tutoring • Implications for education • Other application possibilities

  7. Theories of Speech Perception • Psychoacoustic Theories • Motor Theory • Direct Perception Theory • Pattern Recognition Theory

  8. Theories of Speech Perception • Psychoacoustic Theories • Influence is Solely Auditory

  9. Theories of Speech Perception • Motor Theory • Speech Production Mediates Perception

  10. Theories of Speech Perception • Direct Perception Theory • Articulatory Gestures Enforce Direct Perception

  11. Theories of Speech Perception • Pattern Recognition Theory • Speech is Prototypical Pattern Recognition

  12. Theories of Speech Perception • Psychoacoustic Theories • Influence of Visible Speech • No Obvious Mechanism

  13. Theories of Speech Perception • Motor Theory • Influence of Context

  14. Theories of Speech Perception • Direct Perception Theory • Influence of Non-Articulatory Sources

  15. Theories of Speech Perception • Pattern Recognition Theory • Best Description of All Relevant Results

  16. Research Strategy to Develop and Evaluate the Effectiveness of Visible Speech • Control Presentation • Auditory Synthetic speech • Computer Animated Talking Head • Development and Evaluation

  17. About Baldi

  18. Hallucinations

  19. Experimental Strategy • Manipulate auditory and visual speech • present unimodal stimuli • present factorial bimodal stimuli • test models of perception

  20. [Figure: factorial design grid. Auditory levels: BA, VA, THA, DA, none. Visual levels: BA, VA, THA, DA, none.]
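The factorial design described on the two slides above can be enumerated as a short sketch. It assumes the four syllables shown on the grid (BA, VA, THA, DA) in each modality, with "none" marking an absent modality; the names `SYLLABLES` and `conditions` are illustrative and not taken from the presentation.

```python
from itertools import product

# Syllables shown on the design grid; None stands for "none" (modality absent).
SYLLABLES = ["BA", "VA", "THA", "DA"]
LEVELS = SYLLABLES + [None]

# Every auditory/visual pairing except the empty one (no auditory, no visual).
conditions = [
    (aud, vis)
    for aud, vis in product(LEVELS, LEVELS)
    if not (aud is None and vis is None)
]

# Bimodal trials have both modalities; unimodal trials have exactly one.
bimodal = [c for c in conditions if None not in c]
unimodal = [c for c in conditions if None in c]

print(len(bimodal), "bimodal conditions")
print(len(unimodal), "unimodal conditions")
```

Under these assumptions the design has 16 bimodal and 8 unimodal conditions, which is what allows a model to be tested against both kinds of trial.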

  21. [Figure: stimulus input (visual /da/, auditory /ba/) evaluated against speech alternatives. Match labels: "a lot like /da/", "mostly nothing like /ba/", "not at all like /da/", "somewhat like /va/", "a lot like /ba/", "a little like /tha/".]

  22. [Figure: expanded factorial design grid. Auditory (A) levels: BA, 2, 3, 4, DA, none. Visual (V) levels: BA, 2, 3, 4, DA, none.]

  23. Hallucinations

  24. Pattern Recognition • Central to Cognition • Multiple Continuous Sources of Information • Bimodal Speech Perception • Other Domains • Reading, visual perception, skill learning • Universal Principle

  25. Fuzzy Logical Model of Perception (FLMP) • Continuous Information (Fuzzy Logic) • Independence of Sources • Multiplicative Integration of Sources • Optimal Integration Rule

  26. Fuzzy Logic • Truth of Proposition x: t(x) • Truth of Proposition y: t(y) • 0 < t(x) < 1, 0 < t(y) < 1 • Negation of x: t(~x) = 1 - t(x) • Conjunction: t(x and y) = t(x) t(y) • Disjunction: DeMorgan’s Law • t(x or y) = t(x) + t(y) - t(x) t(y)
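The fuzzy-logic operations on the slide above can be written out as a minimal sketch. The function names `f_not`, `f_and`, and `f_or` are illustrative. Disjunction is built from negation and conjunction via De Morgan's law, which expands to the slide's formula t(x) + t(y) - t(x) t(y).

```python
def f_not(tx: float) -> float:
    """Negation: t(~x) = 1 - t(x)."""
    return 1.0 - tx


def f_and(tx: float, ty: float) -> float:
    """Conjunction: t(x and y) = t(x) t(y)."""
    return tx * ty


def f_or(tx: float, ty: float) -> float:
    """Disjunction via De Morgan: x or y == not (not x and not y)."""
    return f_not(f_and(f_not(tx), f_not(ty)))


# Example truth values (arbitrary, chosen only for illustration).
tx, ty = 0.7, 0.4
print("and:", f_and(tx, ty))
print("or: ", f_or(tx, ty))
print("closed form:", tx + ty - tx * ty)
```

The last two printed values should agree up to floating-point rounding, since 1 - (1 - t(x))(1 - t(y)) = t(x) + t(y) - t(x) t(y).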

  27. FLMP • Evaluation: /ba/ - Rising F2-F3 and Closed Lips • /da/ - Level F2-F3 and Open Lips • Integration: s(/ba/) = (1 - a)(1 - v) • s(/da/) = av • Decision: P(/da/) = av / [av + (1 - a)(1 - v)]
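The integration and decision steps on the FLMP slide above can be sketched as a small function for the two-alternative case. Here `a` and `v` are the degrees to which the auditory and visual sources support /da/, each strictly between 0 and 1 as on the Fuzzy Logic slide. The function name `flmp_p_da` and the example values are illustrative.

```python
def flmp_p_da(a: float, v: float) -> float:
    """Probability of a /da/ response given auditory support a and visual support v."""
    # Integration: multiply the support from the two independent sources.
    s_da = a * v
    s_ba = (1.0 - a) * (1.0 - v)
    # Decision: support for /da/ relative to the total support.
    # With 0 < a < 1 and 0 < v < 1 the denominator is never zero.
    return s_da / (s_da + s_ba)


# A neutral visual source (v = 0.5) leaves the auditory judgment unchanged.
print(flmp_p_da(0.6, 0.5))
# Two sources that both favor /da/ give a response stronger than either alone.
print(flmp_p_da(0.7, 0.8))
```

In the second example the output is about 0.90, higher than either 0.7 or 0.8, which illustrates how multiplicative integration lets consistent sources reinforce each other.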

  28. Confusion Matrix • Auditory /va/ & visual /da/ ---> /tha/ • Auditory /ba/ & visual /da/ ---> /tha/

  29. [Figure: stimulus input (visual /da/, auditory /ba/) evaluated against speech alternatives. Match labels: "a lot like /da/", "mostly nothing like /ba/", "not at all like /da/", "somewhat like /va/", "a lot like /ba/", "a little like /tha/".]

  30. [Figure: expanded factorial design grid. Auditory (A) levels: BA, 2, 3, 4, DA, none. Visual (V) levels: BA, 2, 3, 4, DA, none.]

  31. Play Auditory continuum

  32. Pattern Recognition • Central to Cognition • Multiple Continuous Sources of Information • Bimodal Speech Perception • Other Domains • Reading, visual perception, skill learning • Universal Principle

  33. [Figure: FLMP processing stages. A_i, V_j → Evaluation → a_i, v_j → Integration → s_k → Decision → R_k]

  34. FLMP • Evaluation: /da/ - Level F2-F3 and Open Lips • /ba/ - Rising F2-F3 and Closed Lips • Integration: s(/da/ | A) = a • s(/ba/ | A) = (1 - a) • Decision: P(/da/ | A) = a / [a + (1 - a)] = a

  35. FLMP • Evaluation: /da/ - Level F2-F3 and Open Lips • /ba/ - Rising F2-F3 and Closed Lips • Integration: s(/da/ | A) = .6 • s(/ba/ | A) = (1 - .6) • Decision: P(/da/ | A) = .6 / [.6 + (1 - .6)] = .6
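The unimodal case on the two slides above is the same computation with only one source present. A sketch that handles any number of available sources and response alternatives follows. The slides show only two alternatives, so extending the decision step to more of them (each alternative's support divided by the summed support) is an assumption here, and the name `flmp_probabilities` is illustrative.

```python
from math import prod


def flmp_probabilities(sources: list[dict[str, float]]) -> dict[str, float]:
    """Response probabilities from one or more sources of support.

    Each source maps every response alternative to a truth value in (0, 1).
    """
    alternatives = list(sources[0])
    # Integration: multiply the support each available source gives an alternative.
    support = {k: prod(src[k] for src in sources) for k in alternatives}
    # Decision: each alternative's support relative to the total.
    total = sum(support.values())
    return {k: support[k] / total for k in alternatives}


# Auditory-only trial with a = .6, as on the slide.
auditory = {"da": 0.6, "ba": 0.4}
print(flmp_probabilities([auditory]))

# Adding a visual source that favors /da/ raises P(/da/) above .6.
visual = {"da": 0.8, "ba": 0.2}
print(flmp_probabilities([auditory, visual]))
```

With the auditory source alone, P(/da/) comes out at .6, matching the slide. Passing both sources reproduces the bimodal formula av / [av + (1 - a)(1 - v)].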
