Perceiving Talking Faces: A Paradigm for Multimodal Communication Perceptual Science Laboratory University of California Santa Cruz, CA 95064 mambo.ucsc.edu
Perceptual Science Laboratory (PSL) • Dom Massaro • Michael Cohen • Christopher Campbell • Rashid Clark • Jonas Beskow • Kartik Venkataraman • Tony Rodriguez • Nathan Sanders
Interdisciplinary Endeavor • Cognitive Science • Psychology • Philosophy • Linguistics • Computer Sciences • Anthropology
Anecdotal Evidence • Persons with Hearing Loss • Benjamin Franklin in France • HAL in 2001: A Space Odyssey • “Hear TV Better with Glasses On” • Poorly Dubbed Foreign Films
Summary of Research • Psychological and Psycholinguistic Inquiry • How we make sense of the world • Value of multiple sources of information • Expert parallel processors • Developing and evaluating a talking head • Application in language tutoring • Implications for education • Other application possibilities
Theories of Speech Perception • Psychoacoustic Theories • Motor Theory • Direct Perception Theory • Pattern Recognition Theory
Theories of Speech Perception • Psychoacoustic Theories • Influence is Solely Auditory
Theories of Speech Perception • Motor Theory • Speech Production Mediates Perception
Theories of Speech Perception • Direct Perception Theory • Articulatory Gestures Enforce Direct Perception
Theories of Speech Perception • Pattern Recognition Theory • Speech is Prototypical Pattern Recognition
Theories of Speech Perception • Psychoacoustic Theories • Influence of Visible Speech • No Obvious Mechanism
Theories of Speech Perception • Motor Theory • Influence of Context
Theories of Speech Perception • Direct Perception Theory • Influence of Non-Articulatory Sources
Theories of Speech Perception • Pattern Recognition Theory • Best Description of All Relevant Results
Research Strategy to Develop and Evaluate the Effectiveness of Visible Speech • Control Presentation • Auditory Synthetic speech • Computer Animated Talking Head • Development and Evaluation
Experimental Strategy • Manipulate auditory and visual speech • Present unimodal stimuli • Present factorial bimodal stimuli • Test models of perception
[Figure: expanded factorial design grid. Auditory conditions: BA, VA, THA, DA, none. Visual conditions: BA, VA, THA, DA, none.]
[Figure: stimulus inputs (visual /da/ and auditory /ba/) evaluated against the speech alternatives. Graded labels shown in the figure: "a lot like /da/", "mostly nothing like /ba/", "not at all like /da/", "somewhat like /va/", "a lot like /ba/", "a little like /tha/".]
[Figure: expanded factorial design grid. Auditory (A) levels: BA, 2, 3, 4, DA, none. Visual (V) levels: BA, 2, 3, 4, DA, none.]
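The design grid can be enumerated in a few lines. This is a minimal sketch, assuming a five-step continuum from BA to DA in each modality as the grid labels suggest, with `None` standing in for the "none" level; the names `levels` and `conditions` are illustrative and not from the original slides.

```python
from itertools import product

# Five-step continuum from /ba/ to /da/ (labels taken from the design grid).
levels = ["BA", "2", "3", "4", "DA"]

# Bimodal conditions: every auditory level crossed with every visual level.
bimodal = list(product(levels, levels))

# Unimodal conditions: one modality presented, the other absent (None).
auditory_only = [(a, None) for a in levels]
visual_only = [(None, v) for v in levels]

# Each condition is an (auditory, visual) pair.
conditions = bimodal + auditory_only + visual_only
print(len(bimodal), len(auditory_only), len(visual_only), len(conditions))
```

Under these assumptions the design has 25 bimodal cells plus 5 auditory-alone and 5 visual-alone conditions. The empty cell (no auditory and no visual stimulus) is left out of the sketch.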
Pattern Recognition • Central to Cognition • Multiple Continuous Sources of Information • Bimodal Speech Perception • Other Domains • Reading, visual perception, skill learning • Universal Principle
Fuzzy Logical Model of Perception (FLMP) • Continuous Information (Fuzzy Logic) • Independence of Sources • Multiplicative Integration of Sources • Optimal Integration Rule
Fuzzy Logic • Truth of Proposition x: t(x) • Truth of Proposition y: t(y) • 0 < t(x) < 1, 0 < t(y) < 1 • Negation of x: t(~x) = 1 - t(x) • Conjunction: t(x and y) = t(x) t(y) • Disjunction: DeMorgan’s Law • t(x or y) = t(x) + t(y) - t(x) t(y)
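The three operations above can be checked numerically. The sketch below is illustrative only: the function names `f_not`, `f_and`, and `f_or` and the sample truth values are assumptions, not part of the original slides. Disjunction is built from negation and conjunction via De Morgan's law and then compared with the closed form t(x) + t(y) - t(x) t(y).

```python
def f_not(tx: float) -> float:
    """Negation: t(~x) = 1 - t(x)."""
    return 1.0 - tx


def f_and(tx: float, ty: float) -> float:
    """Conjunction as a product: t(x and y) = t(x) t(y)."""
    return tx * ty


def f_or(tx: float, ty: float) -> float:
    """Disjunction via De Morgan: x or y == not(not x and not y)."""
    return f_not(f_and(f_not(tx), f_not(ty)))


# Hypothetical truth values, chosen only for illustration.
tx, ty = 0.8, 0.5
closed_form = tx + ty - tx * ty

print(f_and(tx, ty), f_or(tx, ty), closed_form)
```

Expanding 1 - (1 - t(x))(1 - t(y)) gives t(x) + t(y) - t(x) t(y), so the two disjunction values agree up to floating-point rounding.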
FLMP • Evaluation: /ba/ - Rising F2-F3 and Closed Lips • /da/ - Level F2-F3 and Open Lips • Integration: s(/ba/) = (1 - a)(1 - v) • s(/da/) = av • Decision: $P(\text{/da/}) = \frac{av}{av + (1 - a)(1 - v)}$
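The integration and decision steps for the two-alternative case translate directly into code. This is a minimal sketch: the function name `flmp_p_da` and the sample values are hypothetical. The arguments `a` and `v` are the auditory and visual support for /da/, and the strict bounds follow the 0 < t < 1 constraint from the fuzzy logic slide, which also keeps the denominator nonzero.

```python
def flmp_p_da(a: float, v: float) -> float:
    """Predicted P(/da/) for the two-alternative /ba/ vs. /da/ case."""
    for name, value in (("a", a), ("v", v)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    s_da = a * v                    # integration: support for /da/
    s_ba = (1.0 - a) * (1.0 - v)    # integration: support for /ba/
    return s_da / (s_da + s_ba)     # decision: relative goodness of /da/


# Hypothetical support values, for illustration only.
p = flmp_p_da(a=0.6, v=0.7)
print(round(p, 3))
```

With these sample values, two sources that each mildly favor /da/ combine into a prediction stronger than either alone (about .78), which is the characteristic effect of multiplicative integration.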
Confusion Matrix • Auditory /va/ & visual /da/ ---> /tha/ • Auditory /ba/ & visual /da/ ---> /tha/
[Figure: the three stages of the FLMP. Inputs A_i and V_j → Evaluation → a_i, v_j → Integration → s_k → Decision → R_k]
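The three stages can be sketched for more than two response alternatives. The code assumes the usual FLMP form: support for alternative k is the product of the auditory and visual support for k, and the decision stage divides each s_k by the sum over all alternatives. The function name `flmp` and every number below are hypothetical, chosen only to illustrate the mechanics; they are not measured data.

```python
def flmp(auditory: dict[str, float], visual: dict[str, float]) -> dict[str, float]:
    """Return predicted response probabilities P(R_k) for each alternative k."""
    # Integration: multiply the two sources of support for each alternative.
    s = {k: auditory[k] * visual[k] for k in auditory}
    total = sum(s.values())
    if total == 0:
        raise ValueError("at least one alternative needs nonzero support")
    # Decision: relative goodness of each alternative.
    return {k: s_k / total for k, s_k in s.items()}


# Evaluation outputs (hypothetical): support each input gives each alternative.
auditory_ba = {"ba": 0.80, "va": 0.30, "tha": 0.20, "da": 0.05}
visual_da = {"ba": 0.05, "va": 0.20, "tha": 0.60, "da": 0.80}

probs = flmp(auditory_ba, visual_da)
best = max(probs, key=probs.get)
print(best, {k: round(p, 3) for k, p in probs.items()})
```

With these made-up values, an alternative that is only moderately supported by each source (/tha/) can win over the alternatives each source favors on its own, because a product penalizes any alternative that one source strongly rejects.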
FLMP • Evaluation: /da/ - Level F2-F3 and Open Lips • /ba/ - Rising F2-F3 and Closed Lips • Integration: s(/da/ | A) = a • s(/ba/ | A) = (1 - a) • Decision: $P(\text{/da/} \mid A) = \frac{a}{a + (1 - a)} = a$
FLMP • Evaluation: /da/ - Level F2-F3 and Open Lips • /ba/ - Rising F2-F3 and Closed Lips • Integration: s(/da/ | A) = .6 • s(/ba/ | A) = (1 - .6) • Decision: $P(\text{/da/} \mid A) = \frac{.6}{.6 + (1 - .6)} = .6$
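One way to relate this auditory-alone case to the bimodal formula is to note what happens when the visual source is completely ambiguous. The short derivation below is an algebraic observation only; treating an absent modality as v = .5 is an assumption made here for illustration, not a statement from the slides.

```latex
% Bimodal FLMP decision rule with a neutral visual source, v = 1/2
\begin{align*}
P(\text{/da/})
  &= \frac{a v}{a v + (1 - a)(1 - v)} \\
  &= \frac{\tfrac{1}{2} a}{\tfrac{1}{2} a + \tfrac{1}{2} (1 - a)}
     && \text{(substituting } v = \tfrac{1}{2}\text{)} \\
  &= \frac{a}{a + (1 - a)} = a .
\end{align*}
```

With a = .6 this reproduces the value .6 from the worked example above.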