Using Word-level Features to Better Predict Student Emotions during Spoken Tutoring Dialogues
Mihai Rotaru, Diane J. Litman
Graduate Research Competition
Breaking down the (long) title
• using WORD-LEVEL FEATURES
• TO BETTER PREDICT
• STUDENT EMOTIONS
• during SPOKEN TUTORING DIALOGUES
Tutoring dialogues
• Education
  • Classroom setting
  • Human (one-on-one) tutoring
  • Computer tutoring (ITS – Intelligent Tutoring Systems)
• Addressing the learning gap between human and computer tutoring
  • Improve the language understanding module of ITS
  • Incorporate affective reasoning
    • Adding human-provided emotional scaffolding to a reading tutor increases student persistence (Aist et al., 2002)
Human-Computer Excerpt
Tutor26: Let me ask my question in a more general way. What happens to the velocity of a body when there is no force acting on it?
Student27: dammit (ASR: it is)
Tutor28: Could you please repeat that?
Student29: same (ASR: i same)
Tutor30: Right. And what happens to the HORIZONTAL velocity of a body when there is no HORIZONTAL force acting on it?
Student31: zero (ASR: the zero)
Tutor32: Well... OK, let's consider how forces in the vertical direction affect the motion of a body in a horizontal direction in a different scenario <…omitted…>
Student33: oh gravity you already said this (ASR: beats gravity you're exert less)
Tutor34: Fine. Are there any other forces acting on the apple as it falls?
Student35: no why are you doing this again (ASR: no y and to it yes)
Tutor36: Correct. The only force acting on the apple as it falls to the earth is the earth's gravitational force. In which direction does gravitational force act?
Student37: downward you computer (ASR: downward you computer)
Affective reasoning
• Prerequisites
  • Dialogue-based ITS: Why2
  • Interaction via speech: ITSPOKE (Intelligent Tutoring SPOKEn dialogue system)
• Affective reasoning
  • Detect student emotions
  • Handle student emotions
Student emotions
• Emotion annotation
  • 3 main emotion classes
    • Negative - e.g. uncertain, bored, irritated, confused, sad
    • Positive - e.g. confident, enthusiastic
    • Neutral - no strong expression of negative or positive emotion
• Predict EnE (Emotional, Non-Emotional)
  • Negatives and positives are conflated as Emotional; neutrals are Non-Emotional
  • Useful for triggering system adaptation (HH corpus analysis)
• Corpora
  • Human-Human (453 student turns from 10 dialogues)
  • Human-Computer (333 student turns from 15 dialogues)
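The conflation of the three annotated classes into the binary EnE task can be written as a simple lookup. This is a minimal sketch: the names `ENE_LABEL` and `to_ene` and the lowercase label strings are assumptions made for illustration, not part of the original annotation scheme.

```python
# Hypothetical label strings; the slides only name the classes.
ENE_LABEL = {
    "negative": "Emotional",
    "positive": "Emotional",
    "neutral": "Non-Emotional",
}


def to_ene(emotion: str) -> str:
    """Conflate a three-way emotion label into Emotional / Non-Emotional."""
    return ENE_LABEL[emotion.lower()]


# The three student turns of a short exchange, one label per turn.
turn_labels = ["NEUTRAL", "POSITIVE", "NEGATIVE"]
ene_labels = [to_ene(label) for label in turn_labels]
print(ene_labels)  # ['Non-Emotional', 'Emotional', 'Emotional']
```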
Annotation example
Tutor: Uh let us talk of one car first.
Student: ok. (EMOTION = NEUTRAL; Non-Emotional)
Tutor: If there is a car, what is it that exerts force on the car such that it accelerates forward?
Student: The engine. (EMOTION = POSITIVE; Emotional)
Tutor: Uh well engine is part of the car, so how can it exert force on itself?
Student: um… (EMOTION = NEGATIVE; Emotional)
Predicting student emotion
• Lexical (word choice)
• Prosodic
• Dialogue context
• Others
[Figure: the tutor-student exchange from the annotation example, with student turns converted into feature vectors (placeholder values on the slide)]
• Previous work
  • Mostly turn-level
  • Very few word-level
  • No comparison between turn-level and word-level
Why word-level features?
• Emotion might not be expressed over the entire turn
  • “This is great”
• Non-emotional words might wash out the effect of emotional words
[Figure labels: Dispirited, Happy]
Why word-level features? (2)
• Can approximate pitch contour better at sub-turn levels
  • Especially for longer turns
[Figure: “This is great”]
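One way to see the point about sub-turn units is to compare a single pitch summary for the whole turn with one summary per word. The sketch below is illustrative only: the slides do not say which pitch features were extracted, so the min/max/mean summary, the function names, and the F0 values (in Hz) are all assumptions made up for the example.

```python
from statistics import mean


def pitch_summary(f0_values):
    """Summarize a run of F0 samples (assumed features: min, max, mean)."""
    return {"min": min(f0_values), "max": max(f0_values), "mean": mean(f0_values)}


def turn_level_summary(f0_by_word):
    """One summary over every sample in the turn."""
    all_values = [v for _, values in f0_by_word for v in values]
    return pitch_summary(all_values)


def word_level_summaries(f0_by_word):
    """One summary per word, giving a piecewise view of the contour."""
    return [(word, pitch_summary(values)) for word, values in f0_by_word]


# Made-up F0 samples for the slide's example utterance.
turn = [("this", [180, 182]), ("is", [185, 190]), ("great", [240, 260, 230])]

print(turn_level_summary(turn))
for word, summary in word_level_summaries(turn):
    print(word, summary)
```

In this toy input the turn-level mean blends the flat opening words with the raised pitch on the last word, while the per-word summaries keep the two apart.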
But wait…
[Diagram: Turn-level: student turn → features → machine learning → turn emotional class. Word-level: Word 1 … Word n → per-word features → ? → turn emotional class. Reference on slide: Sönmez et al., 1998]
Word-level emotion model
[Diagram: Turn-level: student turn → features → machine learning → turn emotional class. Word-level: Word 1 … Word n → per-word features → word-level emotion for each word → turn emotional class]
Word-level emotion model
• Training phase
  • Each word labeled with turn class
  • Extra features to identify the position of the word in the turn (distance in words from the beginning and end of the turn)
  • Learn emotion model at the word level
• Test phase
  • Predict each word class based on the learned model
  • Use majority/weighted voting to label the turn based on its word classes
  • Ties are broken randomly
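The training and test phases above can be sketched in a few lines of Python. This is a rough illustration, not the authors' implementation: the function names, the feature dictionary layout, and the optional `weights` argument are assumptions, and the slides do not say how the weights for weighted voting were chosen.

```python
import random
from collections import Counter


def expand_turn(words, turn_label=None):
    """Turn one student turn into word-level instances.

    Every word gets its distance (in words) from the beginning and the end
    of the turn; in training it also inherits the class of the whole turn.
    """
    n = len(words)
    instances = []
    for i, word in enumerate(words):
        features = {"word": word, "dist_from_start": i, "dist_from_end": n - 1 - i}
        instances.append((features, turn_label))
    return instances


def vote_turn_label(word_predictions, weights=None, rng=random):
    """Label a turn from the predicted classes of its words.

    With weights=None this is plain majority voting; otherwise each word's
    vote counts with its weight. Ties are broken randomly.
    """
    if not word_predictions:
        raise ValueError("a turn needs at least one word prediction")
    if weights is None:
        weights = [1.0] * len(word_predictions)
    totals = Counter()
    for label, weight in zip(word_predictions, weights):
        totals[label] += weight
    best = max(totals.values())
    tied = sorted(label for label, total in totals.items() if total == best)
    return rng.choice(tied)


# Training: both words of "The engine." inherit the turn's class.
train_instances = expand_turn(["the", "engine"], "Emotional")

# Test: the word-level predictions would come from a learned model.
print(vote_turn_label(["Emotional", "Non-Emotional", "Emotional"]))  # Emotional
```

Passing a seeded `random.Random` instance as `rng` makes the tie-breaking reproducible across runs.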
Questions to answer
• Will word-level features work better than turn-level features for emotion prediction?
  • Yes
• If yes, where does the advantage come from?
  • Better prediction of longer turns
• Is there a feature set that offers robust performance?
  • Yes: the combination of pitch and lexical features at word level
Experiment setup
• Two contrasting corpora
• Two contrasting learners (WEKA)
  • IB1 – nearest neighbor classifier
  • ADA – boosted decision trees
Feature sets
• Only pitch and lexical features
• 6 sets of features
  • Turn level:
    • Lex-Turn – only lexical
    • Pitch-Turn – only pitch
    • PitchLex-Turn – lexical and prosodic
  • Word level:
    • Lex-Word – only lexical + positional
    • Pitch-Word – only pitch + positional
    • PitchLex-Word – lexical and prosodic + positional
• Baseline: majority class
• 10 x 10 cross validation
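The evaluation protocol can be illustrated with a small standard-library sketch. It reads "10 x 10 cross validation" as 10 repetitions of 10-fold cross-validation, which is an assumption about the slide's shorthand; the folds here are not stratified, and the function names and toy labels are made up for the example (the experiments themselves used WEKA).

```python
import random
from collections import Counter


def majority_class(labels):
    """Return the most frequent label, the baseline's only prediction."""
    return Counter(labels).most_common(1)[0][0]


def repeated_kfold_indices(n_items, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) for `repeats` runs of k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)
        folds = [order[i::k] for i in range(k)]
        for held_out in range(k):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
            yield train_idx, test_idx


def baseline_accuracy(labels, k=10, repeats=10, seed=0):
    """Mean accuracy of the majority-class baseline over repeats x k folds."""
    scores = []
    for train_idx, test_idx in repeated_kfold_indices(len(labels), k, repeats, seed):
        guess = majority_class([labels[i] for i in train_idx])
        correct = sum(labels[i] == guess for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Toy labels only; these are not the corpus distributions.
toy_labels = ["Emotional"] * 30 + ["Non-Emotional"] * 10
print(baseline_accuracy(toy_labels))
```

Any learner can be dropped into the same loop in place of `majority_class`, so that all feature sets are scored on identical splits.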
Results – IB1 on HH
• Word-level features significantly outperform turn-level features
• Word-level better than turn-level on longer turns
• Best performers: Lex-Word, PitchLex-Word
Discussion
• Lexical features at turn and word-level are similar
• Performance dependent on corpus and learner
• Pitch features differ significantly
• Word-level better than turn-level (4/6)
• PitchLex-Word a consistent best performer
• Our best accuracies comparable with previous work
Conclusions & Future work
• Word-level better than turn-level for emotion prediction
  • Even under a very simple word-level emotion model
• Word-level better at predicting longer turns
• PitchLex-Word a consistent best performer
• Future work:
  • More refined word-level emotion models
    • HMMs
    • Co-training
  • Filter irrelevant words
  • Use the prosodic information left out
  • See if our conclusions generalize to detecting student uncertainty
  • Experiment with other sub-turn units (breath groups)
Acknowledgements
• ITSPOKE group
  • Diane Litman, Hua AI, Kate Forbes-Riley, Beatriz Maeireizo-Tokeshi, Amruta Purandare, Scott Silliman, Joel Tetreault, Art Ward
• NLP group
• People who heard my presentation so many times