
Error Detection in Human-Machine Interaction

Explore challenges in speech recognition technology, typical error rates, and solutions for detecting and recovering from errors in human-machine interaction. Examines spoken-language interfaces and error prevention strategies.



Presentation Transcript


  1. Error Detection in Human-Machine Interaction Dan Bohus DoD Group, Oct 2002

  2. Errors in Spoken-Language Interfaces • Speech Recognition is problematic: • Input signal quality • Accents, Non-native speakers • Spoken language disfluencies: stutters, false-starts, /mm/, /um/ • Typical Word Error Rates in SDS: 10-30% • Systems today lack the ability to gracefully recover from error
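Word error rate, the metric behind the 10-30% figure above, is the word-level edit distance (substitutions + insertions + deletions) between the reference transcript and the recognizer hypothesis, normalized by reference length. A minimal sketch, not from the talk itself:

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: Levenshtein distance over word tokens / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])  # match or substitute
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)
```

On the "Hamilton, Ontario" → "HILTON ONTARIO" misrecognition from the example dialog, this gives 1 substitution over 2 reference words, i.e. a WER of 0.5.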

  3. An example
  S: Are you a registered user ?
  U: No I'm not. No [NO I'M NOT NO]
  S: What is your full name ?
  U: Adam Schumacher [ATHENS IN AKRON]
  S: There is an Athens in Georgia and in Greece. Which destination did you want ?
  U: Neither [NEITHER]
  S: I'm sorry, I didn't catch that. There is an Athens in Georgia and in Greece. Which destination did you want ?
  U: Georgia [GEORGIA]
  S: A flight from Athens... Where do you want to go ?
  U: Start over [START OVER]
  S: Do you really want to start over ?
  U: Yes [YES]
  S: What city are you leaving from ?
  U: Hamilton, Ontario [HILTON ONTARIO]
  S: Sorry, I'm not sure I understood what you said. Where are you leaving from ?
  U: Hamilton [HILTON]
  S: I'm still having trouble understanding you. To go on, I need you to answer the following question. What city are you leaving from ?
  U: Toronto [TORONTO]

  4. Pathway to a solution • Make systems aware of unreliability in their inputs • Confidence scores • Develop a model which learns to optimally choose between several prevention/repair strategies • Identify strategies • Express them in a computable manner • Develop the model
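One simple computable form of such a model is thresholding the recognizer's confidence score to choose among prevention/repair strategies. The thresholds and strategy names below are hypothetical placeholders for illustration, not values or strategies proposed in the talk:

```python
def choose_strategy(confidence: float, high: float = 0.8, low: float = 0.4) -> str:
    """Map a recognition confidence score in [0, 1] to a dialog strategy.

    Thresholds are illustrative; a learned model would tune them
    (or replace the rule entirely) from dialog outcomes.
    """
    if confidence >= high:
        return "accept"            # trust the hypothesis and move on
    if confidence >= low:
        return "implicit_confirm"  # embed the value in the next prompt
    return "explicit_confirm"      # ask a yes/no verification question
```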

  5. Papers • Error Detection in Spoken Human-Machine Interaction [E. Krahmer, M. Swerts, M. Theune, M. Weegels] • Problem Spotting in Human-Machine Interaction [E. Krahmer, M. Swerts, M. Theune, M. Weegels] • The Dual of Denial: Disconfirmations in Dialogue and Their Prosodic Correlates [E. Krahmer, M. Swerts, M. Theune, M. Weegels]

  6. Goals • [Let’s look at the dialog on page 2] • (1) Analysis of positive and negative cues used in response to implicit and explicit verification questions • (2) Explore the possibilities of spotting errors on-line

  7. Explicit vs. Implicit • Explicit • Presumably easier for the system to verify • But there’s evidence that it’s not as easy … • Leads to more turns, less efficiency, frustration • Implicit • Efficiency • But induces a higher cognitive burden which can result in more confusion • ~ Systems don’t deal very well with it…

  8. Clark & Schaefer • Grounding model • Presentation phase • Acceptance phase • Various indicators • Go ON / YES • Go BACK / NO • Can we detect them reliably (when following implicit and explicit verification questions) ?

  9. Positive and Negative Cues

  10. Experimental Setup / Data • 120 dialogs : Dutch SDS providing train timetable information • 487 utterances • 44 (~10%) not used • Users accepting a wrong result • Barge-in • Users starting their own contribution • Left 443 resulting adjacent S/U utterances

  11. Results – Nr of words

  12. Results – Empty turns (%)

  13. Results – Marked word order %

  14. Results – Yes/No

  15. Results – Repeated/Corrected/New

  16. First conclusion • People use more negative cues when there are problems • And even more so for implicit confirmations (vs. explicit ones)

  17. How well can you classify ? • Using individual features • Look at precision/recall • Explicit: absence of confirmation • Implicit: non-zero number of corrections • Multiple features • Used memory-based learning • 97% accuracy (majority baseline: 68%) • Confirm + Correct wins, although each is individually less good • This is overall, right ? How about for explicit vs. implicit ?
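Memory-based learning of the kind used here (TiMBL-style instance-based classification) stores all training turns and classifies a new turn by its nearest stored neighbor under a feature-overlap metric. A toy 1-nearest-neighbor sketch; the feature names and instances are invented for illustration, not the paper's actual feature set or data:

```python
def overlap(a: tuple, b: tuple) -> int:
    """IB1-style overlap similarity: number of matching feature values."""
    return sum(x == y for x, y in zip(a, b))

def classify(train: list, query: tuple) -> str:
    """1-nearest-neighbor over stored instances (memory-based learning)."""
    best = max(train, key=lambda ex: overlap(ex[0], query))
    return best[1]

# Hypothetical per-turn features: (answered_yes, n_corrections, empty_turn)
train = [
    ((1, 0, 0), "ok"),       # plain "yes" after a verification question
    ((0, 2, 0), "problem"),  # disconfirmation with corrections
    ((0, 0, 1), "problem"),  # empty turn
    ((1, 1, 0), "problem"),  # "yes, but ..." with a correction
]
```

The appeal for this task is that no feature weighting is required up front: a turn with a confirmation plus a correction lands near stored "problem" instances even when each feature alone is a weak predictor, which matches the slide's observation that Confirm + Correct wins in combination.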

  18. BUT !!! • How many of these features are available on-line?

  19. What else can we throw at it ? • Prosody (next paper) • Lexical information • Acoustic confidence scores • Maybe also of previous utterances • Repetitions/Corrections/New info on transcript ? • … • …

  20. Papers • Error Detection in Spoken Human-Machine Interaction [E. Krahmer, M. Swerts, M. Theune, M. Weegels] • Problem Spotting in Human-Machine Interaction [E. Krahmer, M. Swerts, M. Theune, M. Weegels] • The Dual of Denial: Disconfirmations in Dialogue and Their Prosodic Correlates [E. Krahmer, M. Swerts, M. Theune, M. Weegels]

  21. Goals • Investigate the prosodic correlates of disconfirmations • Is this slightly different than before ? (i.e. now looking at any corrections? Answer: No) • Looked at prosody on “NO” as a go_on vs a go_back: • Do you want to fly from Pittsburgh ? • Shall I summarize your trip ?

  22. Human-human • Higher pitch range, longer duration • Preceded by a longer delay • High H% boundary tone • Expected to see same behavior for disconfirmation in human-machine
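The cues listed above (wider pitch range, longer duration, longer preceding delay) suggest a crude rule-based go_on/go_back detector over prosodic features. All thresholds below are invented for illustration; the paper reports prosodic correlates, not this detector:

```python
def disconfirm_cues(f0_contour: list, duration: float, delay: float) -> str:
    """Label a "no" as go_on or go_back by counting prosodic cues.

    f0_contour: frame-level F0 values in Hz (0 = unvoiced frame);
    duration: length of the word in seconds; delay: pause before the
    turn in seconds. All thresholds are hypothetical.
    """
    voiced = [f for f in f0_contour if f > 0]
    pitch_range = max(voiced) - min(voiced)
    cues = 0
    cues += pitch_range > 80   # wide pitch excursion (Hz)
    cues += duration > 0.45    # lengthened "no" (s)
    cues += delay > 0.5        # long preceding pause (s)
    return "go_back" if cues >= 2 else "go_on"
```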

  23. Prosodic correlates • Yes, the correlations are there as expected

  24. Perceptual analysis • Took 40 “No” from No+stuff, 20 go_on and 20 go_back (note that some features are lost this way…) • Forced choice randomized task, w/ no feedback; 25 native speakers of Dutch • Results • 17 go_on correctly identified above chance • 15 go_back correctly identified above chance; but also 1 incorrectly identified above chance.

  25. Discussion • Q1: Blurred relationships … • Confidence annotation • Go_on / Go_back signal • Is that the same as corrections ? • Is that the most general case for responses to implicit/explicit verifications, or should we have a separate detector ? • Q2: What other features could we throw at these problems ? What are the “most juicy” ones ?

  26. Discussion • Q3: For implicit confirms, are these different in terms of induced response behavior ? • When do you want to leave Pittsburgh ? • Travelling from Pittsburgh … when do you want to leave ? • When do you want to leave from Pittsburgh to Boston ?
