On the road to the creation of situation-adaptive dialogue managers

Ajay Juneja (akj@andrew.cmu.edu), 11-716 Dialog Seminar

Papers – a diverse selection
• Design of the VICO Spoken Dialogue System: Evaluation of User Expectations by Wizard of Oz Experiments (Petra Geutner, Frank Steffens, and Dietrich Manstetten, LREC 2002)
• An Automobile-Integrated System for Assessing and Reacting to Driver Cognitive Load (Pompei, Sharon, Buckley, and Kemp, 2002)
• We are not amused – but how do you know? User States in a Multi-Modal Dialogue System (Batliner, Zeißler, Frank, Adelhardt, Shi, and Nöth, Eurospeech 2003)

VICO overview
• Goal: to evaluate the use of a natural-language dialogue system in an automotive driving simulation
• How do drivers interact with the NLP system?
• What are the users’ reactions to such a system?
• What are the distraction effects on driving behavior?
• VICO operates in a similar manner to Ariadne. Petra Geutner used to be part of the Interactive Systems Lab.

VICO – Driver Interactions
• Some interactions were initiated by the dialogue manager, some by the user.
• Dialogues ranged from 20 to 200 seconds in duration.
• Were the pre-defined system prompts enough? According to the results, users varied much less in what they said than the authors expected, and the authors therefore report very promising results.
• This is contrary to what General Motors expects to be the case in NLP dialogue system design.

VICO – User Reactions
• 9/10 people had an overall pleasant reaction to VICO.
• Some had problems understanding the speech output from VICO (speech synthesis).
• VICO overloaded some drivers with too much information at once; one person felt VICO talked too fast.

VICO – Distraction Effects
• The most common side effect was that people tended to slow down when using VICO.
• Drifting between lanes was not common.
• No accidents happened.

Pompei & Sharon: Reaction to Driver Cognitive Load
• Utilized the following sensors:
  • Cameras
  • GPS
  • Accelerometers
  • Grip sensors
  • Foot-position sensors
  • Ultrasonic sensors on the bumpers
  • Microphones
  • Seat sensors
  • Cup holder sensors

Pompei & Sharon: Reaction to Driver Cognitive Load
• Also used the Blue Eyes gaze tracking system (from IBM Almaden Research Center).
• They did NOT use a telematics or navigation system in the car, because they wanted the complexity of a typically owned vehicle as a baseline.

Pompei & Sharon: Reaction to Driver Cognitive Load
• Goals:
  • Monitor driver stress levels. Anger is associated with crashes.
  • Track where the driver’s gaze and attention are, in order to determine whether the driver is looking at the road.
  • Direct the driver’s attention to a particular device with the use of LEDs.
  • Improve the driver’s driving habits.
  • Limit the audibility of cell phone or telematics system messages to just the driver.
  • Warn the driver when appropriate.
  • Provide a “Busy” button that lets the user tell the system that they do not want to be disturbed by the cell phone, telematics system, etc.
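The gaze-tracking and warning goals above can be illustrated with a small sketch. This is not the authors' implementation: the class name `OffRoadGazeTimer`, the `update` interface, and the 2-second threshold are all hypothetical choices made for illustration. It assumes some gaze tracker already supplies a timestamped on-road/off-road flag.

```python
from typing import Optional


class OffRoadGazeTimer:
    """Hypothetical helper: flags a warning once the driver's gaze has been
    off the road continuously for at least `max_off_road_s` seconds."""

    def __init__(self, max_off_road_s: float = 2.0) -> None:
        # The 2.0 s default is an assumed value, not taken from the paper.
        self.max_off_road_s = max_off_road_s
        self._off_since: Optional[float] = None

    def update(self, t: float, on_road: bool) -> bool:
        """Feed one gaze sample taken at time `t` (seconds).

        Returns True when a warning should be raised."""
        if on_road:
            # Gaze returned to the road: reset the off-road timer.
            self._off_since = None
            return False
        if self._off_since is None:
            # First off-road sample of a new glance away.
            self._off_since = t
        return (t - self._off_since) >= self.max_off_road_s
```

A caller would feed each new sample into `update` and trigger an LED or audio warning whenever it returns True.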

Pompei & Sharon: Reaction to Driver Cognitive Load
• They have not yet run user studies or examined all of the data.

We are not amused – but how do you know?
• Goal: to examine emotional states in the context of dialogue systems.
• Used the SmartKom dialogue system with gesture and facial expression recognition.
• What prosodic features are relevant to classifying a user’s emotional state?

We are not amused – but how do you know?
• Checked for word boundaries by using fixed alignment.
• Studied both holistic user states (speech, gestures, facial expression) and facial expressions alone.
• Marked significant deviations from neutral.

We are not amused – but how do you know?
• Results:
  • Prosodic classification does not work very well, and part-of-speech information does not help much.
  • There is much confusion between the angry and helpless user states.
  • They have not classified the facial data yet.

We are not amused – but how do you know?
• Characterizations of user states (audio only):
  • Joyful is characterized by a lower energy level and less duration/F0 variation.
  • Helpless has more pauses and longer durations.
  • Angry has a higher energy level and less energy variation.
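To make the audio-only characterizations above concrete, here is a toy rule-based sketch. It is not the classifier used by Batliner et al., and the results slide already notes that prosodic classification does not work very well, so treat it purely as an illustration. The feature names, the idea of expressing each feature as a signed deviation from the speaker's neutral baseline, and the 0.5 threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ProsodyDelta:
    """Hypothetical features: signed deviation from a neutral baseline
    (positive = more than neutral, negative = less than neutral)."""
    energy: float = 0.0
    energy_variation: float = 0.0
    duration_f0_variation: float = 0.0
    pause_rate: float = 0.0
    duration: float = 0.0


def guess_user_state(d: ProsodyDelta, threshold: float = 0.5) -> str:
    """Score each state by its weakest supporting cue; fall back to neutral."""
    scores = {
        # Joyful: lower energy and less duration/F0 variation.
        "joyful": min(-d.energy, -d.duration_f0_variation),
        # Helpless: more pauses and longer durations.
        "helpless": min(d.pause_rate, d.duration),
        # Angry: higher energy and less energy variation.
        "angry": min(d.energy, -d.energy_variation),
    }
    best = max(scores, key=scores.get)
    # The threshold is an assumed cutoff for a "significant" deviation.
    return best if scores[best] > threshold else "neutral"
```

Taking the minimum of the two cues means a state is only proposed when both of its listed characteristics are present, which is one simple way to reduce confusion between states that share a single cue.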

Diverse topics: where do we go from here?
• VICO shows that NLP has a place within the car as an alternative to a command-and-control dialogue. In the authors’ opinion, the distraction caused by an NLP dialogue system appears to be minor.
• Pompei and Sharon show us a very interesting control system set up within a car to monitor a driver’s behavior, and they provide a strong framework for distraction studies.
• Batliner et al. show us that it is extremely hard to gather information on a user’s state from a dialogue manager alone.

Where do we go from here?
• Within an automotive setting, utilize control systems akin to what Pompei and Sharon have built, and integrate them into a dialogue manager.
• Have the dialogue manager adapt to different user states as monitored by outside data, not just emotional state as determined by tonal characteristics of one’s speech.
• In another paper, Toyota suggested that throttle control is the best measure of a driver’s distraction level.
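One way to picture the proposal above is a dialogue manager that consults an externally supplied situation estimate before it speaks. The following is a minimal sketch under stated assumptions: the `Situation` fields, the 0-to-1 load scale, the thresholds, and the `plan_prompt` function are all hypothetical, and none of them comes from the papers discussed. The adaptations it chooses (deferring, splitting information into smaller pieces, slowing the speech rate) are motivated by the VICO user reactions about information overload and fast speech.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Situation:
    load: float                  # assumed 0..1 driver-load estimate from vehicle sensors
    user_state: str = "neutral"  # e.g. "neutral", "helpless", "angry"
    busy: bool = False           # driver pressed a "Busy" button


@dataclass
class PromptPlan:
    deliver_now: bool
    chunks: List[str]   # utterances to speak, in order
    speech_rate: float  # 1.0 = the synthesizer's default rate


def plan_prompt(items: List[str], situation: Situation,
                high_load: float = 0.7, medium_load: float = 0.4) -> PromptPlan:
    """Decide whether and how to present `items` given the current situation."""
    # Defer entirely when the driver asked not to be disturbed or load is high.
    if situation.busy or situation.load >= high_load:
        return PromptPlan(False, [], 1.0)
    cautious = (situation.load >= medium_load
                or situation.user_state in ("helpless", "angry"))
    if cautious:
        # One item per utterance, spoken more slowly.
        return PromptPlan(True, list(items), 0.85)
    # Relaxed situation: deliver everything in a single utterance.
    return PromptPlan(True, [", ".join(items)] if items else [], 1.0)
```

For example, `plan_prompt(["turn left in 200 meters", "fuel is low"], Situation(load=0.5))` yields two separate, slower utterances, while the same call with `load=0.1` yields a single combined one.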
