Multimodal Emotion Recognition
Colin Grubb
Advisor: Nick Webb
Previous Research
- Multimodal fusion
- Research looking at audio, visual, and gesture information
- Feature-level vs. decision-level fusion
Research Question
- To what extent can we improve emotion recognition by using classification methods on audio and visual data?
Decision-Level Analysis
- Set of rules vs. training a classifier
- A rule set is too basic
- Will use a classifier to learn from the outputs of the unimodal systems
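The classifier-over-unimodal-outputs idea can be sketched as a feature-construction step. The function name and encoding below are illustrative assumptions, not the project's actual code.

```python
# Sketch of decision-level fusion: each unimodal system's output becomes
# a feature for a second-stage classifier. The emotion set and one-hot
# encoding are assumptions for illustration.

EMOTIONS = ["happy", "angry", "neutral", "sad"]

def fuse_outputs(audio_probs, visual_label):
    """Combine EmoVoice's probability vector with the visual system's
    predicted label (one-hot encoded) into one feature vector that a
    decision-level classifier such as J48 can be trained on."""
    visual_onehot = [1.0 if visual_label == e else 0.0 for e in EMOTIONS]
    return list(audio_probs) + visual_onehot
```

Fusing at the decision level keeps the two systems independent: each can be retrained or swapped without touching the other's features.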
Audio System
- EmoVoice (EMV): real-time audio analysis
- Five emotional states with probabilities
- Published accuracy: 47.67%
- https://www.informatik.uni-augsburg.de/en/chairs/hcm/projects/emovoice/
EmoVoice Confidence Levels
- Negative Active (NA): Angry
- Negative Passive (NP): Sad
- Neutral (NE): Neutral
- Positive Active (PA): Happy
- Positive Passive (PP): Content
- Example output: negativeActive <0.40, 0.20, 0.10, 0.15, 0.15>
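The example output above can be turned into usable data with a few lines. This is a minimal sketch assuming EmoVoice emits one label plus a bracketed probability vector per line, as shown on the slide; the real tool's output format may differ.

```python
def parse_emovoice(line):
    """Parse an EmoVoice-style line such as
    'negativeActive <0.40, 0.20, 0.10, 0.15, 0.15>' into a
    (label, probabilities) pair. The single-line format is an
    assumption based on the slide's example."""
    label, vector = line.split(" ", 1)
    probs = [float(x) for x in vector.strip().strip("<>").split(",")]
    return label, probs
```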
Visual System
- Software created by Prof. Shane Cotter
- Uses still images
- Published accuracy: 93.4%
System Layout
- [Diagram: video feeds images to the visual software and audio to EmoVoice; each produces an emotion label (e.g., Happy), and the classifier fuses them into the final output (e.g., Happy: "I'm in a good mood!")]
Data Gathering
- 8 subjects: five male, three female
- Audio data: read sample sentences
- Visual data: gather facial expressions at regular and long distance (6 ft.)
Experiments
- Weka Data Mining Software: http://www.cs.waikato.ac.nz/ml/weka/
- Used the J48 classifier (C4.5 algorithm), a decision tree
- Each branch represents a decision made at that node
- [Diagram: example decision tree with leaves labeled Output 1 through Output 4]
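Weka's J48 typically consumes data in ARFF format. The helper below sketches how the fused dataset might be exported; the attribute names and file layout are assumptions for illustration, not the project's actual files.

```python
# Sketch: exporting fused features to an ARFF file that Weka's J48 can
# load. Attribute names (na, np, ne, pa, pp for the five EmoVoice
# probabilities, v_* for the visual one-hot outputs) are assumptions.

def write_arff(path, rows):
    """rows: iterable of (features, label) pairs, where features are
    the five EmoVoice probabilities followed by four one-hot visual
    outputs, and label is the gold emotion class."""
    names = ["na", "np", "ne", "pa", "pp",
             "v_happy", "v_angry", "v_neutral", "v_sad"]
    lines = ["@relation fusion"]
    lines += [f"@attribute {n} numeric" for n in names]
    lines.append("@attribute emotion {happy,angry,neutral,sad}")
    lines.append("@data")
    for feats, label in rows:
        lines.append(",".join(str(v) for v in feats) + "," + label)
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```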
Emotion Classes
- Final dataset classifies between: Happy, Angry, Neutral, Sad
- Audio performance: 38.43%
- Visual performance: 77.43%
Initial Performance
- Ran the combined dataset against the J48 classifier
- Multimodal data initially ineffective
- Needed a way to improve the dataset
Improving Accuracy
- How can we use the two individual systems to complement each other?
- Two pieces of information:
  - What does the visual system do poorly on?
  - What kind of biases does EmoVoice have?
Manual Bias
- Visual system: performs poorly on Neutral; some inaccuracy for all emotions tested
- EmoVoice: bias towards negative voice; very strong bias towards active voice
EmoVoice – Modification Rules
- Happy: for all happy training instances, if PP + PA > NA, NE, and NP, change the EMV class to Happy
- Sad: if NP is second to NA and within 0.05 of it, change the EMV class to Sad
- Neutral: if NE is tied with another confidence level, change the EMV class to Neutral; if all probabilities are within 0.05 of each other, change the EMV class to Neutral
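The rules above can be sketched as a post-processing function over an EmoVoice probability vector. The dictionary keys follow the slide's abbreviations and the 0.05 threshold comes from the slide; the exact tie tolerance and rule ordering are assumptions about how the rules were applied.

```python
# Sketch of the manual bias rules applied to an EmoVoice probability
# dict keyed by NA, NP, NE, PA, PP. Rule ordering and the exact tie
# check are assumptions; the 0.05 threshold is from the slide.

def apply_bias_rules(p, true_label=None):
    """Return a corrected EMV class, or None if no rule fires."""
    # Neutral: NE tied with another confidence level, or all five
    # probabilities within 0.05 of each other.
    if any(abs(p["NE"] - p[k]) < 1e-9 for k in ("NA", "NP", "PA", "PP")):
        return "Neutral"
    if max(p.values()) - min(p.values()) <= 0.05:
        return "Neutral"
    # Sad: NP is second only to NA and within 0.05 of it.
    ranked = sorted(p, key=p.get, reverse=True)
    if ranked[0] == "NA" and ranked[1] == "NP" and p["NA"] - p["NP"] <= 0.05:
        return "Sad"
    # Happy: on happy training instances, the combined positive mass
    # beats every negative/neutral confidence level.
    if true_label == "Happy" and all(
            p["PP"] + p["PA"] > p[k] for k in ("NA", "NE", "NP")):
        return "Happy"
    return None
```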
Post-Manual-Bias Results
- [Results chart not preserved in the extracted slides]
Future Work
- Spring Practicum
- Refine rules
- Automation
- Online classifier
- Mount on robot; cause apocalypse
Questions? Comments? Thank you for listening.