Human Language Technologies (HLT)
Data Collections & Studies
WP4 - Emotions: Faces
Collection and annotation of audio-visual databases
• Extensive data collection, both at KTH and ISTC/IRST
• Using opto-electronic systems
• Reflective markers placed on the subject’s face
• Capturing the dynamics of emotional facial expressions with very high precision
• Eliciting technique: using movies to elicit facial expressions denoting emotions in the subjects watching them (attempted; not promising)
• Extraction technique: extracting expressive behaviour directly from movies and television talk shows (attempted; not promising)
KTH - first year
DATABASE 1
• 15 emotions and attitudes were recorded (acted): anger, fear, surprise, sadness, disgust, happiness, worry, satisfaction, insecurity, confidence, questioning, encouragement, doubt, confirmation and neutral
• Semantically neutral utterances, 9 utterances per expression
DATABASE 2
• 6 emotional states: confident, confirming, questioning, insecure, happy, neutral
• VCV & VCCV nonsense words
• CVC nonsense words
• Short sentences
• Common ITA-SWE set (abba, adda, alla, anna, avva)
DATABASE 3
• Spontaneous dialogue
Collection of audio-visual databases: interactive dialogues (KTH)
Eliciting technique: information-seeking scenario
Focus on the speaker who has the role of information giver
The speaker whose facial and head motion is to be recorded sits facing 4 infrared cameras, a digital video camera, a microphone and his/her interlocutor.
ISTC & IRST - first year
• 6 emotional states (Ekman’s set) + Neutral
  • Anger, Disgust, Fear, Happiness, Sadness, Surprise
• 3 intensities (Low, Medium, High)
• “Isolated” emotional expressions
  • VCV nonsense words (aba, ada, aLA, adZa, ala, ana, ava)
  • Good phonetic coverage of Italian
  • Long sentence (“Il fabbro lavora con forza usando il martello e la tenaglia” – “the smith works with strength using the hammer and the pincers”)
  • Common ITA-SWE set (VCCV nonsense words: abba, adda, alla, anna, avva)
• “Concatenated” emotional expressions
  • VCV nonsense words, in pairs, with different emotions
  • e.g. (aba)Neutral – (aba)Happy
Results
• ISTC/IRST: 1573 recordings
  • 798 single emotional expressions (7 emotional states, 3 intensities: L, M, H)
  • 672 concatenated emotional expressions (in pairs; 3 emotional states: Anger, Happy, Neutral; medium intensity)
  • 57 long sentences (7 emotional states, 3 intensities)
  • 46 instances of the common ITA-SWE set (3 emotional states, medium intensity)
• KTH: 1749 recordings (database 2)
  • 828 VCV words (138 x 6 emotional states)
  • 246 CVC words (41 x 6 emotional states)
  • 645 sentences (270 neutral + 75 x 5 emotional states)
  • 30 instances of the common ITA-SWE set
Total: 3322 recordings
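The subtotals on the Results slide can be cross-checked with a few lines of arithmetic. The sketch below only restates the figures given on the slide; the dictionary keys and variable names are illustrative, not part of the original material.

```python
# Recording counts exactly as reported on the Results slide.
istc_irst = {
    "single emotional expressions": 798,
    "concatenated emotional expressions": 672,
    "long sentences": 57,
    "common ITA-SWE set": 46,
}
kth = {
    "VCV words": 138 * 6,          # 138 items x 6 emotional states
    "CVC words": 41 * 6,           # 41 items x 6 emotional states
    "sentences": 270 + 75 * 5,     # 270 neutral + 75 x 5 emotional states
    "common ITA-SWE set": 30,
}

istc_irst_total = sum(istc_irst.values())
kth_total = sum(kth.values())
grand_total = istc_irst_total + kth_total

print(istc_irst_total, kth_total, grand_total)  # 1573 1749 3322
```

The per-site sums and the grand total agree with the figures stated on the slide.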
Qualisys recordings: Swedish db - 2nd year
• 75 sentences with Ekman’s 6 basic emotions + neutral
• Dialogues to analyze communicative facial expressions:
• 10 short dialogues in a travel agency scenario
• 15 sentences uttered with a focussed word, with the 6 expressions used in corpus 2 + anger
Example: Båten seglade förbi / Båten seglade förbi / Båten seglade förbi (“The boat sailed past”)
See Poster!!!
Audio-Visual Italian Database - 2nd year (IRST): A database of human facial expressions (1)
• Short videos containing acted kinetic facial expressions (video length: 4-27 s)
• 8 professional actors (4 male and 4 female)
• Each actor played Ekman’s set of six emotions (happy, sad, angry, surprised, disgusted, afraid) + neutral
• Actors were asked to play each emotion at three intensity levels (Low, Medium, High)
Total: 1008 short videos (about 2 h 50 min)
See Poster!!!
A database of human facial expressions (2)
• Facial expressions recorded in two conditions:
  • “Utterance” condition: actors played emotions while uttering a phonetically rich and visemically balanced sentence.
  • “Non-utterance” condition: actors played emotions without pronouncing any sentence.
• Both video and audio signals were recorded.
• After collecting the corpus: data selection
  • Validation of the emotions played
  • Video selection based on the agreement among judges
“In quella piccola stanza vuota c’era però soltanto una sveglia” (“In that small empty room there was, however, only an alarm clock”)
<FEAR>, <HIGH>
<DISGUST>, <HIGH>
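The slide says only that videos were selected according to the agreement among judges; it does not give the measure or the cut-off. As a rough sketch of how such a filter might look, the code below assumes that agreement is the fraction of judges who chose the intended emotion and that a clip is kept at 0.75 or above. Both the measure and the threshold are assumptions, and the clip identifiers and judgments are invented for illustration.

```python
from collections import Counter

def agreement(intended, judgments):
    """Fraction of judges whose label matches the intended emotion."""
    if not judgments:
        return 0.0
    return Counter(judgments)[intended] / len(judgments)

def select_videos(videos, threshold=0.75):
    """Keep the ids of videos whose agreement reaches the assumed threshold."""
    return [
        v["id"]
        for v in videos
        if agreement(v["intended"], v["judgments"]) >= threshold
    ]

# Hypothetical clips and judgments, for illustration only.
videos = [
    {"id": "clip_a", "intended": "fear",
     "judgments": ["fear", "fear", "surprise", "fear"]},
    {"id": "clip_b", "intended": "disgust",
     "judgments": ["anger", "disgust", "sadness", "anger"]},
]

selected = select_videos(videos)
print(selected)  # ['clip_a']
```

A chance-corrected statistic could replace the simple proportion if the number of response options varies between clips; the slide does not say which approach was used.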
Annotation of audio-visual databases: interactive dialogues
ANVIL: tool for the analysis of digitized audio-visual data
• Orthographic transcription of the dialogue
• Annotation of the facial expressions related to emotions and of the communicative gestures (turn-taking, feedback and so on)
• The annotation is performed on a freely definable multi-layered annotation scheme, created ad hoc for the purposes of the study.
• The levels range from a less detailed to a more detailed analysis.
• Annotation is performed on several main tracks, which are displayed on the screen in alignment with the video and audio data.
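To make the idea of several time-aligned annotation tracks concrete, here is a minimal data-structure sketch. It is not ANVIL's file format or API; the class names, track names, time values and labels are all hypothetical and chosen only to mirror the layers listed above (transcription, facial expression, communicative gesture).

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    start: float  # seconds from the beginning of the recording
    end: float
    label: str

@dataclass
class Track:
    name: str
    annotations: list = field(default_factory=list)

    def add(self, start, end, label):
        """Append a labelled interval; reject empty or reversed intervals."""
        if end <= start:
            raise ValueError("annotation must have positive duration")
        self.annotations.append(Annotation(start, end, label))

    def at(self, t):
        """Labels active on this track at time t (start inclusive, end exclusive)."""
        return [a.label for a in self.annotations if a.start <= t < a.end]

def snapshot(tracks, t):
    """What every track shows at time t, keyed by track name."""
    return {track.name: track.at(t) for track in tracks}

# Hypothetical tracks and labels, for illustration only.
transcription = Track("transcription")
transcription.add(0.0, 1.8, "(utterance text)")
expression = Track("facial expression")
expression.add(0.4, 1.2, "glad")
gesture = Track("communicative gesture")
gesture.add(1.5, 2.0, "feedback")

state = snapshot([transcription, expression, gesture], 1.0)
print(state)
```

Keeping each layer as its own track, with intervals on a shared time axis, is what allows the layers to be displayed in alignment with the video and audio and queried together at any time point.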
Annotation (cont’d)
[Figure: annotation example; visible label: “glad”]
Evaluation Studies (IRST)
• Experiment 1: Comparison of emotion recognition rates from natural (actor) videos with different types of synthetic (synthetic face) videos, in different animation conditions [reference person: Fabio Pianesi – pianesi@itc.it]
• Experiment 2: Cross-cultural comparison of emotion recognition rates from Italian and Swedish natural and synthetic videos [reference person: Fabio Pianesi – pianesi@itc.it]
• Experiment 3: as for Experiment 1, but using
  • three regions of the face
  • only one animation condition (script-based)
  [reference person: Michela Prete – prete@itc.it]
Papers on Evaluation Studies
• J. Beskow, L. Cerrato, P. Cosi, E. Costantini, M. Nordstrand, F. Pianesi, M. Prete, G. Svanfeldt, "Preliminary Cross-cultural Evaluation of Expressiveness in Synthetic Faces". In E. André, L. Dybkjær, W. Minker, P. Heisterkamp (eds.), "Affective Dialogue Systems", ADS '04. Springer Verlag, Berlin, 2004.
• E. Costantini, F. Pianesi, P. Cosi, "Evaluation of Synthetic Faces: Human Recognition of Emotional Facial Displays". In E. André, L. Dybkjær, W. Minker, P. Heisterkamp (eds.), "Affective Dialogue Systems". Springer Verlag, Berlin, 2004.
• E. Costantini, F. Pianesi, M. Prete, "Recognising Emotions in Human and Synthetic Faces: The Role of the Upper and Lower Parts of the Face". To appear in Proceedings of IUI 2005: International Conference on Intelligent User Interfaces. San Diego, California, 2005.