User Responses to Prosodic Variation in Fragmentary Grounding Utterances in Dialog
Gabriel Skantze, David House & Jens Edlund

Setting

Errors in dialog
• Dialog is not always error-free
• Errors are often detected by grounding the user utterance with explicit or implicit verification:
User: […] on the right I see a red building.
System (low conf.): Did you say "A red building"?
System (high conf.): A red building… ok, take left […]?

Grounding in dialog
• Traditional dialog system grounding
  • Constructed as full propositions
  • Often perceived as tedious
  • Verifies entire user utterances
• Fragmentary grounding
  • Fast
  • Focuses on problem words/concepts
  • Often used in human-human dialog
User: […] on the right I see a red building.
System: red? / red.

The problem
• Fragmentary grounding utterances are potentially ambiguous
  • Little syntax and structure
  • Prosody more critical
• How do prosodic features affect the interpretation of such utterances?
• How do fragmentary grounding utterances and their prosody affect the subsequent user behavior?

Interpretations
User: […] on the right I see a red building.
System: red(?)
Allwood et al. (1992), Clark (1996)

Experiment I
• Perception study to find out how prosodic features affect the interpretation of fragmentary grounding
• 36 stimuli
• Parameters: color word, peak position, peak height, vowel duration
• LUKAS diphone MBROLA synthesis
• 8 subjects
• Task: Listen to each stimulus in dialog context and select an appropriate paraphrase

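A stimulus set like the one above can be enumerated as a full factorial design. The sketch below is only an illustration: the slides name the four parameters and the total of 36 stimuli but not the levels of each parameter, so the 3 × 3 × 2 × 2 split and every level name here are assumptions chosen to match that total.

```python
from itertools import product

# Hypothetical factor levels: the slides list the parameters but not their
# levels. This 3 x 3 x 2 x 2 split is an assumption that happens to give 36.
COLOR_WORDS = ["red", "yellow", "blue"]
PEAK_POSITIONS = ["early", "mid", "late"]
PEAK_HEIGHTS = ["low", "high"]
VOWEL_DURATIONS = ["normal", "long"]


def build_stimuli():
    """Return one dict per combination of factor levels (full factorial)."""
    return [
        {
            "word": word,
            "peak_position": position,
            "peak_height": height,
            "vowel_duration": duration,
        }
        for word, position, height, duration in product(
            COLOR_WORDS, PEAK_POSITIONS, PEAK_HEIGHTS, VOWEL_DURATIONS
        )
    ]


stimuli = build_stimuli()
print(len(stimuli))
```

Each dict would then drive one synthesis run; how the parameters map onto the synthesizer input is not described in the slides.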
Experiment I: results
• Interpretations:
  • OK, yellow
  • Do you really mean yellow?
  • Did you say yellow?

Experiment II
• Wizard of Oz experiment to find out how fragmentary grounding affects user behavior
• 8(+2) subjects
• Task: to help the computer model color perception by answering questions about color similarities
• The three prototypes from Experiment I were used to ground the user utterances

Results
• Subjects gave responses ("yes", "mm") to grounding utterances in 243 of 294 cases
• Responses were similar regardless of grounding type
• 2 judges categorized the responses by listening to them together with paraphrases of the grounding utterances
• Judges agreed in 50% of the cases

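Raw agreement such as the 50% above does not account for agreement expected by chance. A common chance-corrected measure is Cohen's kappa; the slides do not say whether the authors computed it, so the following is only a sketch of how both figures could be obtained. The judge labels are made up for illustration and are not the study's data.

```python
from collections import Counter


def raw_agreement(labels_a, labels_b):
    """Fraction of items on which the two judges chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two judges."""
    n = len(labels_a)
    p_observed = raw_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each judge's marginal label proportions.
    p_expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # degenerate case: both judges used one identical label
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels (NOT the study's data), using the three categories.
judge_a = ["Accept", "Accept", "ClarifyUnd", "ClarifyPerc", "ClarifyUnd", "ClarifyPerc"]
judge_b = ["Accept", "ClarifyUnd", "ClarifyUnd", "Accept", "ClarifyPerc", "ClarifyPerc"]

print(raw_agreement(judge_a, judge_b))
print(cohens_kappa(judge_a, judge_b))
```

With three equally frequent categories, chance agreement is about one third, so a raw agreement of 50% corresponds to a kappa well below 0.5 in this toy example.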
Results
• The categories chosen by the judges corresponded significantly (chi-square) with the type of grounding utterance actually preceding the response.
[Figure: bar chart. Y-axis: percentage of stimuli (0% to 100%). X-axis: annotators' selected paraphrase (Accept, ClarifyUnd, ClarifyPerc). Legend: Accept, ClarifyUnd, ClarifyPerc. Annotation: significant correspondence.]

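The chi-square result above refers to a test of independence between the grounding type that preceded a response and the category the judges assigned to it. The sketch below computes the test statistic by hand for a 3 × 3 contingency table. The counts are invented for illustration (the slides do not give the table), and the critical value 9.488 is the standard 5% threshold for 4 degrees of freedom.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT the study's data).
# Rows: grounding type played (Accept, ClarifyUnd, ClarifyPerc).
# Columns: category chosen by the judges, in the same order.
observed = [
    [20, 5, 5],
    [5, 20, 5],
    [5, 5, 20],
]

statistic, dof = chi_square_independence(observed)
CRITICAL_5_PERCENT_DF4 = 9.488  # chi-square critical value, alpha = 0.05, dof = 4
print(statistic, dof, statistic > CRITICAL_5_PERCENT_DF4)
```

A statistic above the critical value means the judged category is unlikely to be independent of the grounding type, which is the kind of correspondence the slide reports.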
Results
• The silences between the end of the grounding utterances and the following user response were measured with /nailon/, a software package for speech analysis.
• Cognitive load hypothesis: responses to
  • acceptance: fast
  • perception clarification request: slower
  • understanding clarification request: slowest
• The results support the hypothesis (ANOVA)

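The ANOVA mentioned above compares mean response latency across the three grounding types. As a rough sketch of the computation, the function below derives the one-way ANOVA F statistic from between-group and within-group variance. The latencies are invented for illustration, and the slides do not specify the exact ANOVA design the authors used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(group) for group in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical response latencies in milliseconds (NOT the study's data).
latencies_ms = {
    "Accept": [400, 500, 600],
    "ClarifyPerc": [700, 800, 900],
    "ClarifyUnd": [1000, 1100, 1200],
}

f_stat, df_b, df_w = one_way_anova_f(list(latencies_ms.values()))
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with the returned degrees of freedom indicates that the group means differ by more than within-group noise would explain. The ordering predicted by the hypothesis (acceptance fastest, understanding clarification slowest) still has to be checked on the group means themselves.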
Relation to the field in general and the other contributions in particular
• Important issues not addressed here:
  • Timing
  • Other modalities, e.g. facial gestures
  • Language and socio-cultural differences

Where we want to be in 5-10 years
• Goals:
  • More human-like error handling behavior in spoken dialog systems
  • Ability to generate appropriate grounding prosody for all types of utterances
  • Models for choosing prosody to achieve the desired pragmatic effect
  • Integration with fast and appropriate turn-taking