Modeling facial expressions for Finnish talking head. Michael Frydrych, LCE, 11.6.2004. Finnish talking head. Computer animated model of a talking person Synchronized A/V speech Model of emotional facial expressions. User interface of “old” talking head. Talking Head.
Modeling facial expressions for Finnish talking head Michael Frydrych, LCE, 11.6.2004
Finnish talking head • Computer animated model of a talking person • Synchronized A/V speech • Model of emotional facial expressions Michael Frydrych, 11.6.2004
User interface of “old” talking head
Talking Head • What has been done with it? • Studies in audiovisual speech perception • Kiosk interface at the University of Tampere • Cultural activities • Major role in the play Kyberias at Kellariteatteri (2001)
Content • Talking heads – why? • Animation methods • Controlling animation • Making them speak • Practicals • -------------------------------------------------- • Making the head smile • Emotions – why? • Practicals
Why talking heads? • Entertainment • Information services • Ananova, information kiosks • Education services • Learning foreign languages, … • Agents in spoken dialogue systems • nonverbal signals, comfort
Tampere museums
Aids in communication • Speech is both heard and seen • Improve intelligibility in noisy environments • Aid for hearing-impaired people • Synface
Synface (telephone -> animated face) Figure by KTH Stockholm
… applications • Language training • speech training for the profoundly deaf • Diagnostics and therapy • EU: VEPSY, VREPAR (assess and treat anxiety disorders and specific phobias)
Audiovisual speech integration • = combining auditory and visual percepts into a single speech percept • Strength of integration is demonstrated by the McGurk effect: when the sound /pa/ is combined with a face “telling” /ka/, the speech percept is often /ta/ (McGurk & MacDonald, 1976, Nature) • Figure labels: /pa/, /ta/, /ka/
A study in audio-visual speech perception Result: Computer animated talking face improves intelligibility of auditory speech
… application in research • Psychophysical and psychophysiological experiments • Audiovisual speech perception • Emotion research … • Benefits • Natural stimuli may contain unwanted features • Full controllability • Quick creation of stimuli
Building on realism Realism: • Objective • topography, animation, texture, synchronization, ... • Subjective (communication) • Audio-visual speech • Facial expressions, nonverbal behavior (prosody, eye movements) Evaluation: • Objective • Subjective
Making the head speak Issues: • Voice - speech synthesizer • Animation – parameterization • Synchronization
Acoustic Speech Generation • Based on the Festival platform • Developed at the Centre for Speech Technology Research, University of Edinburgh, Scotland • Scheme programming language, which allows behaviour to be programmed • Finnish voice, prosody, expansion (numerals, etc.) • Department of Phonetics, University of Helsinki • Issues: production of articulatory parameters, synchronization
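The synchronization issue can be sketched as follows, assuming the synthesizer can report each phoneme together with its duration. The function name, the `Keyframe` type, the midpoint placement rule, and the sample durations are all hypothetical illustrations, not the method used in the talking head.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    time_ms: float  # time at which the mouth shape should be fully reached
    phoneme: str    # phoneme label reported by the synthesizer
    frame: int      # nearest animation frame index


def phonemes_to_keyframes(segments, fps=25.0):
    """Place one keyframe at the temporal midpoint of each phoneme segment.

    segments: list of (phoneme, duration_ms) pairs in utterance order.
    """
    keyframes = []
    start_ms = 0.0
    for phoneme, duration_ms in segments:
        mid_ms = start_ms + duration_ms / 2.0
        frame = round(mid_ms * fps / 1000.0)
        keyframes.append(Keyframe(mid_ms, phoneme, frame))
        start_ms += duration_ms
    return keyframes


# Made-up durations, for illustration only.
segments = [("t", 80.0), ("a", 120.0), ("l", 60.0), ("o", 140.0)]
for kf in phonemes_to_keyframes(segments):
    print(kf)
```

Because the audio and the animation share one timeline derived from the same durations, the two streams stay aligned as long as both are played from the same start time.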
Animation methods - representation • Polygonal • Keyframing • libraries of postures, interpolation • Parametric deformations • deformations are grouped under parameters meaningful to the animator • Muscle-Based deformations • Interactive deformations • numerous control points, deformation propagation • Free Form deformations • deformation associated with a deformation box
Splines • Implicit surfaces • Physics-based models • Physical models of the skin • Volume preservation • Deformations by inducing forces
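Of the methods listed, keyframing is the simplest to show in code. The sketch below assumes two postures from a library that share the same vertex order and blends them linearly; the function name and the two-vertex "mesh" are made up for illustration, and real systems often use smoother (non-linear) interpolation.

```python
def lerp_posture(posture_a, posture_b, t):
    """Linearly interpolate two postures given as lists of (x, y, z) vertices.

    t = 0.0 returns posture_a, t = 1.0 returns posture_b.
    """
    if len(posture_a) != len(posture_b):
        raise ValueError("postures must share the same vertex count")
    return [
        tuple(a + t * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(posture_a, posture_b)
    ]


# Hypothetical two-vertex postures: a neutral face and an open jaw.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -2.0, 0.0), (1.0, -1.0, 0.0)]

# In-between frames are generated by sweeping t from 0 to 1.
for step in range(5):
    print(lerp_posture(neutral, jaw_open, step / 4.0))
```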
Hooks to data • Need the geometry of faces • Rendering properties • Deformation of facial expression or speech • How? 2D and 3D techniques
3D Input • A 3D digitizer is the most direct way, fairly automatic (Optotrak) • 3D trackers – digitizing of a projected/marked mesh, rather manual • CT (Computed Tomography) and MRI (Magnetic Resonance Imaging) • and … 3D modeling programs
2D Input • Photogrammetry • Two images of an object are taken from different viewpoints and corresponding points are found • The 3D shape of a face can be determined from a single 2D image after projecting a regular pattern onto it • A generic facial model is prepared and transformed to “match” a photograph • The 3rd dimension can be approximated by acquiring a face model (to set priors) and applying Bayesian inference
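The two-view photogrammetry idea can be sketched for the simplest possible setup: two identical, parallel (rectified) cameras separated by a known baseline, so that a pair of corresponding points differs only in its horizontal image coordinate. This geometry, the function name, and the numbers are assumptions for illustration; a real setup needs camera calibration and general triangulation.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline, cx=0.0, cy=0.0):
    """Recover a 3D point from one pair of corresponding image points.

    Assumes rectified, parallel cameras with the same focal length (in pixels)
    and principal point (cx, cy). The result is expressed in the left camera's
    coordinate frame, in the same unit as `baseline`.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("corresponding points must have positive disparity")
    z = focal_px * baseline / disparity  # depth from similar triangles
    x = (x_left - cx) * z / focal_px     # back-project the left image point
    y3d = (y - cy) * z / focal_px
    return (x, y3d, z)


# Hypothetical marker seen at x=40 px (left) and x=0 px (right), y=20 px.
print(triangulate_rectified(40.0, 0.0, 20.0, focal_px=800.0, baseline=0.5))
```

The smaller the disparity, the farther away the point and the more sensitive the depth estimate becomes to errors in finding the correspondence.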
Texture mapping
break
Data for articulation and expressions • Keyframing -> expression libraries • Real-time/performance data • Parameterization • Articulatory parameters – jaw opening, lip rounding, lip protrusion, … • Facial expressions – FACS • Statistical models from expression libraries or real-time data
Statistical parameterization Parameterized model learned from 3D performance data (Reveret) Figure by ISCP Grenoble
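The slide does not say how the model is learned. One common way to obtain control parameters from performance data is principal component analysis, and the sketch below assumes that approach: it extracts the first principal direction of a set of recorded frames by power iteration, then synthesizes a new frame from a single parameter value. All names and data are hypothetical.

```python
import math


def first_principal_component(frames, iterations=100):
    """Return (mean, direction) for frames given as equal-length coordinate lists.

    Each frame is the flattened marker coordinates of one recorded instant.
    """
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(dim)]
    centered = [[f[i] - mean[i] for i in range(dim)] for f in frames]

    # Start from the frame that deviates most from the mean. This is a
    # heuristic: it usually has a non-zero component along the main direction.
    start = max(centered, key=lambda c: sum(x * x for x in c))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        return mean, [0.0] * dim  # all frames identical, nothing to learn
    v = [x / norm for x in start]

    # Power iteration on the scatter matrix, applied implicitly as X^T (X v).
    for _ in range(iterations):
        scores = [sum(ci * vi for ci, vi in zip(c, v)) for c in centered]
        w = [sum(s * c[i] for s, c in zip(scores, centered)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def synthesize(mean, direction, value):
    """Build a frame from one control parameter value."""
    return [m + value * d for m, d in zip(mean, direction)]


# Made-up frames: two coordinates that move together, as in a jaw movement.
frames = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, direction = first_principal_component(frames)
print(synthesize(mean, direction, 1.0))
```

Further parameters could be obtained the same way after subtracting the already explained component from each frame. The sign of the returned direction is arbitrary.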
… three control parameters Figure by ISCP Grenoble
… and the results • Panels: Jaw Opening, Rounding, Raising • Figure by ISCP Grenoble
Video by ISCP Grenoble
Finnish talking head • Audiovisual database • Using MaxReflex 3D optical tracker (at Linköping Univ.) • Multiple IR cameras, reflective markers, reconstruction from stereo • Coarticulation, lips, visual prosody
Point-light positions
Demo – live recording at Linköping
How to create “visemes”?
Demo – reconstructed motion • 10 fps • 40 fps
Figure by ISCP Grenoble
End of 1st part