19 February 2008 Facial Expression for Human-Robot Interaction – A prototype http://robotics.ece.auckland.ac.nz Matthias Wimmer Technische Universität München Bruce MacDonald, Dinuka Jayamuni, and Arpit Yadav Department of Electrical and Computer Engineering, Auckland
Outline • Motivation • Background • Facial expression recognition method • Results on a data set • Results with a robot (the paper's contribution) • Conclusions
Motivation: Goal Our Robotics group goals: • To create mobile robotic assistants for humans • To make robots easier to customize and program by end users • To enhance interactions between robots and humans • Applications: healthcare, eg aged care • Applications: agriculture (eg Ian's previous presentation) • (Lab visit this afternoon) • Robot face
Motivation: robots in human spaces • Increasingly, robots live in human spaces and interact closely with people • Example: the InTouch remote doctor
Motivation: close interactions RI-MAN http://www.bmc.riken.jp/~RI-MAN/index_us.html
Motivation: different types of robot Robots have many forms; how do people react? Pyxis HelpMate SP Robotic Courier System Delta Regional Medical Centre, Greenville, Mississippi
Motivation: different robot behaviour AIBO (Sony) Paro the therapeutic baby seal robot companion http://www.aist.go.jp/aist_e/latest_research/2004/20041208_2/20041208_2.html
Motivation: supporting the emotion dimension • Robots must provide support along psychological dimensions: home and hospital help, therapy, companionship • We must understand/design the psychology of the exchange • Emotions play a significant role: robots must respond to and display emotions • Emotions support cognition: robots must have emotional intelligence • Eg during robot-assisted learning • Eg security screening robots • Humans' anxiety can be reduced if a robot responds well [Rani et al, 2006]
Motivation: functionality of emotion response Not just to be “nice”; the emotion dimension is essential to effective robot functionality [Breazeal]
Motivation: robots must distinguish human emotional state • However, recognition of human emotions is not straightforward • Outward expression versus internal mood states • People smile when they are happy AND when they are interacting with other people • Olympic medalists don't smile until the presenter appears (eg the 1948 football team) • Ten-pin bowlers smile when they turn back to their friends
Motivation: deciphering human emotions • Self-reports are more accurate than observer ratings • Current research attempts to decipher human emotions • facial expressions • speech expression • heart rate, skin temperature, skin conductivity www.cortechsolutions.com
Motivation: Our focus is on facial expressions Despite the limitations, we focus on facial expression interpretation from visual information. • Portable, contactless • Needs no special or additional sensors • Similar to humans' interpretation of emotions (which is by vision and speech) • No interference with normal HRI Asimo www.euron.org
Background • Six universal facial expressions (Ekman et al.) • Happiness, surprised, afraid, disgusted, sad, angry • Cohn-Kanade facial expression database (488 sequences, 97 people) • Expressions in the database are performed (posed) and exaggerated • Expressions are determined by shape and muscle motion
Background: Why are they difficult to estimate? • Different faces look different • Hair, beard, skin colour, … • Different facial poses • Only slight muscle activity
Background • Typical FER process: (1) face detection, (2) facial feature extraction, (3) expression classification [Pantic & Rothkrantz, 2000]
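As a rough illustration of this three-stage pipeline, here is a minimal sketch assuming OpenCV's bundled frontal-face Haar cascade; the feature extractor and classifier interfaces are placeholders for illustration, not the method described in this talk.

```python
# Illustrative three-stage FER pipeline: detection -> features -> classification.
# The feature extractor and classifier below are placeholders, not the authors' method.
import cv2
import numpy as np

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face(gray):
    """Stage 1: return the largest detected face rectangle (x, y, w, h), or None."""
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    return max(faces, key=lambda r: r[2] * r[3])

def extract_features(gray, face_rect):
    """Stage 2: placeholder features -- a downscaled face patch as a vector."""
    x, y, w, h = face_rect
    patch = cv2.resize(gray[y:y + h, x:x + w], (32, 32))
    return patch.flatten().astype(np.float32) / 255.0

def classify(features, classifier):
    """Stage 3: any trained classifier exposing predict()."""
    return classifier.predict(features.reshape(1, -1))[0]
```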
Background: Challenges Challenges for 1. face detection and 2. feature extraction: • Varying shape, colour, texture, feature location, hair • Spectacles, hats • Lighting conditions, including shadows Challenges for 3. facial expression classification: • Machine learning
Background: related work • Cohen et al: 3D wireframe face model with 16 surface patches • Bezier volume parameters for the patches • Bayesian network classifiers • HMMs model muscle activity over time • Bartlett et al: Gabor filters with AdaBoost feature selection and Support Vector Machine classification • 93% accuracy on the Cohn-Kanade DB • but is tuned to that DB
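For intuition only, a rough sketch of the Gabor-filter-plus-SVM idea in the spirit of Bartlett et al, using OpenCV's getGaborKernel and scikit-learn's SVC; the filter-bank parameters, pooling, and variable names are illustrative assumptions, not the original work's settings.

```python
# Rough sketch: Gabor filter-bank features fed to an SVM (Bartlett et al style).
# Kernel parameters and response pooling are illustrative choices only.
import cv2
import numpy as np
from sklearn.svm import SVC

def gabor_features(gray_face, ksize=9, sigmas=(2.0, 4.0), thetas=4):
    """Convolve a face patch with a small Gabor filter bank and pool the responses."""
    feats = []
    for sigma in sigmas:
        for k in range(thetas):
            theta = k * np.pi / thetas
            kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta,
                                        lambd=10.0, gamma=0.5, psi=0.0)
            response = cv2.filter2D(gray_face.astype(np.float32), -1, kernel)
            feats.extend([response.mean(), response.std()])
    return np.array(feats)

# X: stacked gabor_features() vectors, y: expression labels (hypothetical data)
# clf = SVC(kernel="linear").fit(X, y)
```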
Background: challenges for robots • Less constrained face pose and distance from the camera • Human may not be facing the robot • Human may be moving • More difficulty in controlling lighting • Robots move away! • Real-time results are needed (since the robot moves)
Facial expression recognition (FER) method: Matt's model-based approach
FER method • Cootes et al statistics-based deformable model (134 points) • Translation, scaling, rotation • Vector b of 17 face configuration parameters • Eg b1 rotates the head, b3 opens the mouth, b10 changes gaze direction
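A minimal sketch of how such a statistics-based shape model generates a face shape: the 134 points are a mean shape deformed by learned modes weighted by the parameter vector b, then placed in the image by a translation/scale/rotation transform. The array names, shapes, and function name here are assumptions for illustration.

```python
# Sketch of a Cootes-style point distribution model (illustrative names and shapes).
import numpy as np

N_POINTS = 134   # model points (from the slide)
N_PARAMS = 17    # face configuration parameters b (from the slide)

def synthesize_shape(mean_shape, modes, b, tx=0.0, ty=0.0, scale=1.0, angle=0.0):
    """Return 134 (x, y) points: mean shape deformed by b, then rigidly transformed.

    mean_shape: (134, 2) mean point coordinates
    modes:      (268, 17) learned deformation modes (eg from PCA on training shapes)
    b:          (17,) configuration parameters (the slide's b1, b3, b10, ...)
    """
    shape = mean_shape.reshape(-1) + modes @ b            # linear deformation
    shape = shape.reshape(N_POINTS, 2)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return scale * shape @ rot.T + np.array([tx, ty])     # translation/scale/rotation
```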
FER method: Model-based image interpretation • The model: contains a parameter vector that represents the model's configuration. • The objective function: calculates a value that indicates how accurately a parameterized model matches an image. • The fitting algorithm: searches for the model parameters that describe the image best, i.e. it minimizes the objective function.
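As a sketch of the objective-function role only: the hand-crafted cost below (negative mean gradient magnitude under the model points, so points on strong edges score well) is a stand-in for illustration, not the learned objective function described on the next slides.

```python
# Illustrative objective function for a parameterized face model: lower = better fit.
# This hand-crafted edge-attraction cost is a stand-in, not the authors' learned objective.
import cv2
import numpy as np

def objective(image_gray, points):
    """Cost of placing the model points in the image."""
    gx = cv2.Sobel(image_gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(image_gray, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)                     # edge strength per pixel
    h, w = mag.shape
    xs = np.clip(points[:, 0].astype(int), 0, w - 1)
    ys = np.clip(points[:, 1].astype(int), 0, h - 1)
    return -float(mag[ys, xs].mean())               # reward points lying on edges
```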
FER method • Two-step process for skin colour: see [Wimmer et al, 2006] • The Viola & Jones technique detects a rectangle around the face • Affine transformation parameters of the face model are derived from this rectangle • The b parameters are then estimated • The Viola & Jones technique is repeated for the facial features • Features are learned to localize face features • The objective function compares an image to a model • The fitting algorithm searches for a good model
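A minimal sketch of the initialization step, assuming a face rectangle from a Viola & Jones detector (eg the detect_face sketch earlier): rough translation and scale for the face model are derived from the rectangle, with rotation assumed zero. The function and variable names are assumptions for illustration.

```python
# Sketch: derive rough affine (translation + scale) parameters for the face model
# from a Viola & Jones face rectangle; rotation is assumed zero at initialization.
import numpy as np

def init_affine_from_rect(face_rect, mean_shape):
    """face_rect: (x, y, w, h) from the detector; mean_shape: (134, 2) model points."""
    x, y, w, h = face_rect
    model_w = mean_shape[:, 0].max() - mean_shape[:, 0].min()
    model_h = mean_shape[:, 1].max() - mean_shape[:, 1].min()
    scale = min(w / model_w, h / model_h)      # fit the mean shape inside the box
    tx = x + w / 2.0 - scale * mean_shape[:, 0].mean()
    ty = y + h / 2.0 - scale * mean_shape[:, 1].mean()
    return tx, ty, scale, 0.0                  # (tx, ty, scale, angle)
```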
FER method: learned objective function • Reduce manual processing requirements by learning the objective function [Wimmer et al, 2007a & 2007b] • Fitting method: hill-climbing
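A minimal sketch of a hill-climbing fit over the configuration parameters, reusing the hypothetical synthesize_shape() and objective() from the earlier sketches; in the actual method the objective function is learned from annotated images rather than hand-crafted.

```python
# Illustrative hill-climbing over the face configuration parameters b.
# Relies on the hypothetical synthesize_shape() and objective() sketched earlier.
import numpy as np

def hill_climb(image_gray, mean_shape, modes, b0, step=0.1, iters=100):
    """Greedy local search: perturb one parameter at a time, keep improvements."""
    b = b0.copy()
    best = objective(image_gray, synthesize_shape(mean_shape, modes, b))
    for _ in range(iters):
        improved = False
        for i in range(len(b)):
            for delta in (+step, -step):
                cand = b.copy()
                cand[i] += delta
                cost = objective(image_gray, synthesize_shape(mean_shape, modes, cand))
                if cost < best:
                    b, best, improved = cand, cost, True
        if not improved:
            step *= 0.5        # refine the step size once no neighbour improves
    return b
```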
FER method Facial feature extraction: • Structural (configuration b) and temporal features (over 2 seconds) Expression classification: • A binary decision tree classifier is trained on 2/3 of the data set
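A minimal sketch of this classification step with scikit-learn, assuming feature vectors X that combine the structural parameters b with temporal statistics over a two-second window, and labels y; the 2/3 training split mirrors the slide, while the feature layout and variable names are assumptions.

```python
# Sketch: decision tree trained on 2/3 of the data, evaluated on the remaining 1/3.
# X (feature vectors) and y (expression labels) are hypothetical inputs here.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

def train_expression_classifier(X, y, seed=0):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=2 / 3, random_state=seed, stratify=y)
    clf = DecisionTreeClassifier(random_state=seed).fit(X_train, y_train)
    pred = clf.predict(X_test)
    print("accuracy:", accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))      # eg happiness vs fear confusions
    return clf
```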
Results on a dataset Happiness and fear have similar muscle activity around the mouth, hence the confusion between them.
Results on a robot • B21r robot • Some controlled lighting • Human about 1m away • 120 readings of three facial expressions • 12 frames a second possible • Tests at 1 frame per second
Conclusions • Robots must respond to human emotional states • Model-based FER technique (Wimmer) • 70% accuracy on the Cohn-Kanade data set (6 expressions) • 67% accuracy on a B21r robot (3 expressions) • Future work: better FER is needed • Improved techniques • Better integration with robot software • Improve accuracy by fusing vital-signs measurements