This paper explores the intersections between dialog systems and information retrieval in the domains of call routing and question answering. It discusses current work, classification approaches, dialog management, and the potential for voice-only information access.
Dialogue and Information Retrieval Dialogs on Dialogs March all the way through April 2003
Intersections between Dialog Systems and IR • Current work • Call Routing • Question Answering • Why so little? • What else? Let’s brainstorm!
Call Routing • Task: given a NL expression of a problem, classify (route) it into one of several categories • Examples • AT&T: How May I Help You • British Telecom • Jennifer Chu-Carroll
Call Routing (2) • It’s a classification problem! • Salience (co-occurrence) based approaches (AT&T) • IR-like approaches (J. Chu-Carroll) • Treat user requests as “documents” • Use VSM and cosine similarity to classify
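The IR-like idea on this slide can be sketched as follows. This is a minimal illustration, not the system described in the talk: the training requests, the category names, and the helper names (`vectorize`, `cosine`, `build_category_vectors`, `route`) are all invented here, and raw term counts stand in for whatever term weighting a real router would use.

```python
import math
from collections import Counter

# Hypothetical training requests, invented for illustration only.
TRAINING = {
    "billing": ["question about my bill", "charge on my bill is wrong"],
    "repair": ["my phone line is dead", "no dial tone on my line"],
}

def vectorize(text):
    """Bag-of-words vector with raw term counts (an assumed weighting)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_category_vectors(training):
    """Treat all requests of a category as one 'document' per category."""
    vectors = {}
    for category, requests in training.items():
        total = Counter()
        for request in requests:
            total.update(vectorize(request))
        vectors[category] = total
    return vectors

def route(request, category_vectors):
    """Return the best-matching category and all similarity scores."""
    query = vectorize(request)
    scores = {c: cosine(query, v) for c, v in category_vectors.items()}
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    vectors = build_category_vectors(TRAINING)
    print(route("there is a wrong charge on my bill", vectors))
```

The caller's request plays the role of the query, and each category plays the role of a document, so routing reduces to ranking categories by similarity.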
The IR in Call Routing • Regard the problem as text classification • Do standard IR work: • LSA • LDA • Centroid vs. KNN approaches • Results? Classification performance?
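To make the "centroid vs. KNN" contrast concrete, here is a small sketch of both classifiers over the same bag-of-words vectors. The labeled requests, the function names, and the default `k=3` are assumptions made for this example; they do not come from the talk.

```python
import math
from collections import Counter

# Hypothetical labeled requests, invented for illustration only.
LABELED = [
    ("question about my bill", "billing"),
    ("charge on my bill is wrong", "billing"),
    ("my phone line is dead", "repair"),
    ("no dial tone on my line", "repair"),
]

def vectorize(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid_classify(request, labeled):
    """Compare the request with one averaged vector per category."""
    sums, counts = {}, Counter()
    for text, label in labeled:
        sums.setdefault(label, Counter()).update(vectorize(text))
        counts[label] += 1
    centroids = {
        label: {term: n / counts[label] for term, n in vec.items()}
        for label, vec in sums.items()
    }
    query = vectorize(request)
    return max(centroids, key=lambda label: cosine(query, centroids[label]))

def knn_classify(request, labeled, k=3):
    """Majority vote among the k most similar training requests."""
    query = vectorize(request)
    ranked = sorted(
        labeled,
        key=lambda pair: cosine(query, vectorize(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

The centroid variant stores one vector per category, while KNN keeps every training request and defers the decision to query time; which one classifies better is the open question the slide raises.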
The Dialog in Call Routing • Disambiguation • Easy to do based on the VSM IR approach • Follow-up dialog • HMIHY: frame-based follow-up dialogs • Q: Is Call Routing dialog management? • Q: Or is it more like understanding? • Q: Why do typical understanding/DM approaches fail in HMIHY-type domains?
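One way to see why disambiguation falls out of the VSM approach: the similarity scores already say how confident the router is. The sketch below is an assumed decision rule, not the one used in any of the systems named on the slides; `next_action` and both thresholds are hypothetical, and it expects scores for at least two categories.

```python
def next_action(scores, min_score=0.2, margin=0.1):
    """Pick a dialog move from per-category similarity scores.

    min_score and margin are hypothetical thresholds chosen for the example.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    if best_score < min_score:
        # Nothing matches well enough: ask the caller to rephrase.
        return ("reprompt", [])
    if best_score - second_score < margin:
        # Two categories are too close: ask a disambiguation question.
        return ("disambiguate", [best, second])
    # One clear winner: route the call.
    return ("route", [best])

if __name__ == "__main__":
    print(next_action({"billing": 0.8, "repair": 0.3}))
    print(next_action({"billing": 0.55, "repair": 0.5}))
```

The "disambiguate" branch is where a follow-up dialog would start, for example by asking the caller to choose between the two candidate destinations.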
Question Answering • Task: answer a natural-language question from a database of natural-language documents • Examples: • http://www.ai.mit.edu/projects/infolab/ • http://www.ask.com
IR in Question Answering • Everywhere: • Document indexing • Retrieval • … • What is different from traditional IR? • Some parsing/understanding of questions and documents • Some language generation (?)
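The indexing and retrieval steps listed above can be sketched with a tiny inverted index. This is a bare-bones illustration under stated assumptions: the two documents, the tokenizer, and the term-overlap ranking are invented for the example, and the question parsing and answer generation that set QA apart from traditional IR are left out.

```python
import re
from collections import Counter, defaultdict

# Hypothetical document collection, invented for illustration only.
DOCS = {
    "d1": "The Eiffel Tower is in Paris",
    "d2": "The Colosseum is in Rome",
}

def tokenize(text):
    """Lowercase and keep only alphanumeric runs (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Document indexing: map each term to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def retrieve(question, index):
    """Retrieval: rank documents by how many question terms they contain."""
    overlap = Counter()
    for term in set(tokenize(question)):
        for doc_id in index.get(term, ()):
            overlap[doc_id] += 1
    return [doc_id for doc_id, _ in overlap.most_common()]

if __name__ == "__main__":
    index = build_index(DOCS)
    print(retrieve("Where is the Eiffel Tower?", index))
```

A QA system would then look inside the top-ranked documents for the answer itself, which is where the parsing and understanding mentioned on the slide come in.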
Dialog in QA • Refining the question: • Clarification dialogue • Decide which question to ask • Only for very restricted domains; uses fixed frames (Rutgers: HITIQA)
Why so little? • Different issues: • IR = lots of unstructured data, no NLP • Dialog = structured data, lots of NLP • Main problems: • Structural mismatch • NLU mismatch
But HUGE potential! • Voice-only random access to large amounts of information (“Voice IR”): • technical manuals of “in-the-field” devices (e.g. NASA) • tutorial systems • phone-based Google (e.g. legal information…) • GUI+ for IR • Learn dialog stuff from data (LM, NLG, parsing…)