This talk reviews a MITRE workshop that brought together researchers working on dialog management. Participants adapted their dialog managers to a medical diagnosis domain, using a provided backend that came in a Java version and a web-based interface version.
MITRE Dialog Management Workshop – a review
Dan Bohus
Dialogs on Dialogs reading group
CMU, November 2003
The Workshop
• MITRE Dialog Workshop
  • @ MITRE, Bedford/Boston
  • October 27-28, 2003
• Idea
  • Bring together researchers working on dialog management
  • Give them homework
    • Adapt your dialog manager to a medical diagnosis domain (details in a sec)
  • Discuss, compare, learn
MITRE Dialog Management Workshop
workshop : godis : ravenclaw : collagen : themes
The Homework
• Implement a dialog system for the medical diagnosis domain
  • Task left open-ended (diagnosis, tutoring, etc.)
  • No speech, just text in and out
• Backend provided (backend.doc)
  • Java version and web-based interface version
  • 3 diseases: malaria, coccidioidomycosis, another one
  • List of symptoms: headache, nausea, muscle pain, etc.
  • Decision tree involving symptoms and tests (fever, blood tests, travel patterns, etc.)
• Small enough to presumably not be lots of work, but large enough to allow illustration of functionalities and provide some skeleton to the discussions…
Participants
• MITRE (Carl Burke et al.): MiDiKi
• Gothenburg (Staffan Larsson): GoDiS (TRINDIKit)
• USC ICT (David Traum): ICT Dialogue Manager
• NTT/CMU (Matthias Denecke): Ariadne
• CMU (Dan, Alex): RavenClaw
• Ames (Beth-Ann Hockey): NASA Dialogue Manager
• DFKI (Norbert Reithinger): DFKI Dialogue Manager
• MERL (Candy Sidner, Charles Rich): COLLAGEN
… and others invited but not present
GoDiS
GoDiS
• TRINDIKit: information state update dialogue management toolkit
  • Information state
    • Private: dialog plan, beliefs, agenda (short-term goals)
    • Shared: established facts, QUD (questions under discussion), last utterance information
  • Dialog moves
  • Update rules
• GoDiS: dialog management system implemented in TRINDIKit, handling:
  • information-oriented dialogue
  • action-oriented dialogue
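The information state above can be pictured as a record with a private and a shared part, plus update rules that fire when their precondition holds. What follows is a minimal Python sketch of that idea, not TRINDIKit or GoDiS code: the field names, the `integrate_answer` rule, and the `apply_rules` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Private:
    plan: list = field(default_factory=list)      # dialog plan
    beliefs: set = field(default_factory=set)
    agenda: list = field(default_factory=list)    # short-term goals


@dataclass
class Shared:
    facts: set = field(default_factory=set)       # established facts
    qud: list = field(default_factory=list)       # questions under discussion, top is last
    last_utterance: dict = field(default_factory=dict)


@dataclass
class InfoState:
    private: Private = field(default_factory=Private)
    shared: Shared = field(default_factory=Shared)


def integrate_answer(state):
    """Hypothetical update rule: if the last move answers the topmost QUD
    question, pop that question and record the answer as a shared fact."""
    move = state.shared.last_utterance
    if not state.shared.qud or move.get("type") != "answer":
        return False
    question = state.shared.qud[-1]
    if move.get("question") != question:
        return False
    state.shared.qud.pop()
    state.shared.facts.add((question, move["value"]))
    return True


def apply_rules(state, rules):
    """Apply the first rule whose precondition holds; return its name or None."""
    for rule in rules:
        if rule(state):
            return rule.__name__
    return None


state = InfoState()
state.shared.qud.append("have_fever")
state.shared.last_utterance = {"type": "answer", "question": "have_fever", "value": "yes"}
fired = apply_rules(state, [integrate_answer])
```

Keeping the rules as plain functions over one state object mirrors the "information state update" idea: each rule checks a condition on the state and, if it holds, changes the state.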
TRINDIKit / GoDiS architecture
[Figure: architecture diagram. Labels: control; DME; modules input, interpret, update, select, generate, output; TIS. Resources: DEVICES (backend interface: connection to Java backend), LEXICON (lexicon), DOMAIN (domain knowledge: dialog plans, ontology).]
GoDiS: Task Representation
• Plans; propositional logic
• Dialogue plans for dealing with diagnosis (issues opened at dialogue start)
  • ?x.disease(x): “Which disease is diagnosed?”
  • ?confirmed_by_interview: “Is the diagnosis confirmed by additional information?”
  • ?confirmed_by_tests: “Is the diagnosis confirmed by medical tests?”
• Additional plans
  • ?x.info(x): “What information is there about a given disease?”
  • ?x.treatment(x): “What treatment is there for a given disease?”
GoDiS: Alternate Tasks
• User-driven dialogue (implemented)
  • Issues are not loaded when resetting; user has to raise all issues
  • User can ask system to
    • Provide a diagnosis
    • Confirm whether user has a given disease
• Decision trees as dialogue plans
  • Move backend knowledge into dialogue plans
  • Information conversion could be done automatically
• Separate genre: expert system dialogue
  • Add special-purpose update rules
  • Dynamic dialogue planning by expert
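The "decision trees as dialogue plans" idea can be sketched as a small conversion: walk the backend decision tree and emit one `findout` step per question it may ask. The Python below is only an illustration under assumptions. The tree is a made-up toy, not the workshop backend, and the names `tree_to_plan`, `diagnose`, `TOY_TREE`, and the `("findout", ...)` step format are hypothetical.

```python
def tree_to_plan(node):
    """Collect every question the tree can ask, depth-first, as findout steps."""
    if "diagnosis" in node:              # leaf: nothing left to ask
        return []
    steps = [("findout", node["ask"])]
    for branch in ("yes", "no"):
        for step in tree_to_plan(node[branch]):
            if step not in steps:        # ask each question at most once
                steps.append(step)
    return steps


def diagnose(node, answers):
    """Follow the tree using already collected yes/no answers."""
    while "diagnosis" not in node:
        node = node["yes"] if answers[node["ask"]] else node["no"]
    return node["diagnosis"]


# Toy tree with placeholder outcomes (illustrative only).
TOY_TREE = {
    "ask": "have_fever",
    "yes": {
        "ask": "recent_travel",
        "yes": {"diagnosis": "disease_A"},
        "no": {"diagnosis": "no_diagnosis"},
    },
    "no": {"diagnosis": "no_diagnosis"},
}

plan = tree_to_plan(TOY_TREE)
result = diagnose(TOY_TREE, {"have_fever": True, "recent_travel": True})
```

A flat plan like this asks every question regardless of earlier answers; the "dynamic dialogue planning by expert" alternative would instead let the backend pick the next question at each step.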
GoDiS: Highlights / Lowlights
• Highlights
  • Reuse; you get for free:
    • Grounding
    • Accommodation / plan recognition
    • Multiple simultaneous issues & info sharing
  • High-level abstraction for dialog plans
  • Rapid prototyping
• Lowlights
  • Not used in this type of domain so far, so not entirely straightforward (update rule changes)
  • Dynamic dialog plans (backend decides)
RavenClaw
RavenClaw
• Dialog Task (Specification)
  • Captures all domain-specific dialog (task) logic with a hierarchical description
  • The authoring effort is focused entirely here
• Domain-independent Dialog Engine
  • Manages dialog by executing the dialog task specification
  • Provides domain-independent conversational strategies
RavenClaw Architecture
[Figure, built up over several animation steps: dialog task tree, concepts, dialog stack, and expectation agenda for the Madeleine system.]
• Dialog task tree nodes: Madeleine, I:Welcome, E:LoadSymptoms, GeneralFeel, Diagnose, R:HowAreYou?, I:Glad, I:Sorry, Fever, Travel, R:AskFever, E:MeasureTemp, I:InformFever; from the LoadSymptoms step onward, R:Headache and further R: nodes appear
• Concepts: general_feeling, have_fever, diagnostic, chart; headache appears from the LoadSymptoms step onward
• Dialog stack over the steps (top first): Madeleine; Welcome, Madeleine; LoadSymptoms, Madeleine; GeneralFeel, Madeleine; HowAreYou, GeneralFeel, Madeleine
• Expectation agenda in the last step, one level per line:
  • general_feeling: [good], [bad], [soso]
  • general_feeling: [good], [bad], [soso]
  • general_feeling: [good], [bad], [soso]; have_fever: [fever], ![yes], ![no]; headache: [headache], ![yes], ![no]; cough: [cough], ![yes], ![no]; …
• Example exchange (S = system, U = user):
  S: Hi, this is Madeleine, the automated…
  S: How are you feeling today?
  U: Not so good, I think I have a fever
  Parsed input: [soso](not so good) [fever](I think I have a fever)
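The interplay of dialog stack and expectation agenda in the figure can be sketched roughly as follows: the agenda has one level per agent on the stack, and each parsed grammar slot is bound to the concept of the first level (closest to the focus) that expects it. This is a simplified Python illustration, not RavenClaw code. The `EXPECTATIONS` table and the functions `build_agenda` and `bind` are hypothetical, and the `!`-prefixed expectations from the slide are left out.

```python
# Hypothetical expectations per agent: grammar slot -> concept it updates.
FEELING = {"good": "general_feeling", "bad": "general_feeling", "soso": "general_feeling"}
EXPECTATIONS = {
    "HowAreYou": dict(FEELING),
    "GeneralFeel": dict(FEELING),
    "Madeleine": {**FEELING, "fever": "have_fever", "headache": "headache", "cough": "cough"},
}


def build_agenda(stack):
    """One agenda level per agent on the stack, top of the stack first."""
    return [dict(EXPECTATIONS.get(agent, {})) for agent in stack]


def bind(parsed_slots, agenda):
    """Bind each parsed slot to the concept of the first level that expects it."""
    bindings = {}
    for slot, _text in parsed_slots:
        for level in agenda:
            if slot in level:
                bindings[level[slot]] = slot
                break
    return bindings


stack = ["HowAreYou", "GeneralFeel", "Madeleine"]        # top of the stack first
parsed = [("soso", "not so good"), ("fever", "I think I have a fever")]
bindings = bind(parsed, build_agenda(stack))
```

In this sketch the answer to the question in focus is bound at the top level, while the volunteered fever information is picked up by the root level, which is the mixed-initiative behavior the example exchange illustrates.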
Illustrated Features
• Dynamic generation of dialog task structure
  • Symptoms loaded from backend, appropriate structures to “talk about them” created on-the-fly
  • New symptoms: no DM changes
• Dynamic dialog control policy
  • The order in which symptoms are addressed is controlled by the backend
• Conversational skills
Dynamic Dialog Control
[Figure: the RavenClaw architecture diagram (dialog task tree, dialog stack, expectation agenda) shown next to the Backend Decision Tree; the dialog stack holds Diagnose on top of Madeleine.]
Example exchange (S = system, U = user):
S: Hi, this is Madeleine, the automated…
S: How are you today?
U: Not so good, I think I have a headache
S: Sorry to hear you’re not feeling so good. Tell me more about your symptoms…
S: Do you have abdominal pain?
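The two dynamic features illustrated so far (task structure generated from the backend's symptom list, and question order decided by the backend) can be sketched together. The Python below is a toy under stated assumptions: `ToyBackend`, `make_request_agent`, and `run_diagnose` are hypothetical names, the symptom list is illustrative, and the toy backend simply walks its list, whereas the workshop backend used a decision tree.

```python
class ToyBackend:
    """Hypothetical stand-in for the backend (illustrative symptoms only)."""

    def __init__(self, symptoms):
        self._symptoms = list(symptoms)

    def load_symptoms(self):
        return list(self._symptoms)

    def next_symptom(self, known):
        """The backend, not the dialog manager, picks what to ask next."""
        for symptom in self._symptoms:
            if symptom not in known:
                return symptom
        return None


def make_request_agent(symptom):
    """Create a request node for a symptom on the fly."""
    return {
        "name": f"R:{symptom}",
        "concept": symptom,
        "prompt": f"Do you have {symptom.replace('_', ' ')}?",
    }


def run_diagnose(backend, answer_fn):
    agents = {s: make_request_agent(s) for s in backend.load_symptoms()}
    known, asked = {}, []
    while True:
        symptom = backend.next_symptom(known)
        if symptom is None:
            break
        agent = agents[symptom]
        asked.append(agent["prompt"])
        known[symptom] = answer_fn(agent["prompt"])
    return asked, known


backend = ToyBackend(["abdominal_pain", "muscle_pain"])
asked, known = run_diagnose(backend, lambda prompt: "yes")
```

Adding a symptom to the backend list changes the dialog without touching `run_diagnose`, which is the "new symptoms, no DM changes" point in miniature.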
Conversational Skills
• Corresponding agencies added automatically to the dialog task tree
  • Help
  • What Can I Say?
  • Repeat
  • Suspend / Resume
  • Start Over
  • Timeout handling (not illustrated)
• Still need all the language generation prompts and grammar, but some of those are develop-once, too
RavenClaw Conclusion
• Highlights
  • Set task posed no challenges to the framework
  • Easy to implement
  • Dynamic dialog structure and control
  • Automatic use of domain-independent conversational skills
• Lowlights?
  • Toolkit perspective: how easy would it be for someone else to build it?
  • Asynchronous behaviors? (timing)
  • Couple of bugs / fixes (or is that a highlight?)
Collagen
COLLAGEN: Collaborative Interface Agent
[Figure: collaborative interface agent diagram. Arrow labels: communicate, observe (twice), interact (twice). Collagen is shown with a focus stack and a plan tree.]
COLLAGEN Systems
• air travel planning
• email reading and responding (w. IBM/Lotus)
• GUI design tool operation
• car navigation system operation
• airport landing path planning (w. MITRE)
• gas turbine operator training (w. USC/ISI)
• personal video recorder operation
• programmable thermostat operation (w. Delft U.)
• multi-modal web-based form-filling
Collagen: Theory and Implementation
SharedPlan Discourse Theory (Grosz, Sidner, Kraus, Lochbaum 1974-1998) and its Java Implementation:
• Intentional: purposes, contributes; implemented as the purpose tree
• Attentional: focus spaces, focus stack; implemented as the focus stack
• Linguistic: segments, lexical items
Collagen: Discourse Segments and Purposes
(fixing an air compressor, E = expert, A = apprentice)
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done.
E: Now remove the pump.
…
E: First you have to remove the flywheel.
…
E: Now take the pump off the base plate.
A: Already did.
Segment purposes: replace pump and belt, containing replace belt and replace pump
(Grosz, 1974)
Discourse state representation
• Focus stack: replace belt (current focus space) on top of replace pump and belt
• Purpose tree: replace pump and belt, with children replace belt and replace pump
Discourse so far:
E: Replace the pump and belt please.
A: Ok, I found a belt in the back.
A: Is that where it should be?
A: [removes belt]
A: It’s done
(Grosz & Sidner, 1986)
Discourse interpretation algorithm
[Figure: focus stack and purpose tree]
The current (communication or manipulation) act either:
• starts a new segment/focus space (push)
• ends the current segment/focus space (pop)
• continues (contributes to) the current segment/... (add)
An act contributes to the purpose of a segment if it:
• directly achieves the purpose
• is a step in the plan for the purpose *
• identifies the recipe used to achieve the purpose
• identifies who should perform the purpose or a step in the plan
• identifies a parameter of the purpose or a step in the plan
* does not include recursive plan recognition (see later topic)
(Lochbaum, 1998)
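The push / pop / add classification can be sketched with a focus stack and a small recipe table. The Python below is a deliberately reduced illustration, not Collagen code: it only covers two of the five "contributes" cases (directly achieving the purpose, and being a step in its plan), and the `RECIPES` table, `contributes`, and `interpret` are hypothetical names with made-up recipe steps based on the air compressor example.

```python
# Hypothetical recipes: purpose -> steps in its plan.
RECIPES = {
    "replace pump and belt": ["replace belt", "replace pump"],
    "replace belt": ["remove belt"],
    "replace pump": ["remove flywheel", "remove pump"],
}


def contributes(act, purpose):
    """Reduced test: the act is the purpose itself or a step in its plan."""
    return act == purpose or act in RECIPES.get(purpose, [])


def interpret(act, stack, log):
    """Classify an act against the focus stack (a list; the top is the last item)."""
    # Pop segments the act does not contribute to.
    while stack and not contributes(act, stack[-1]):
        log.append(("pop", stack.pop()))
    if not stack:
        stack.append(act)                      # nothing open: start a new segment
        log.append(("push", act))
    elif act in RECIPES and act != stack[-1]:
        stack.append(act)                      # non-primitive step: open a sub-segment
        log.append(("push", act))
    else:
        log.append(("add", act, stack[-1]))    # continue the current segment


stack, log = [], []
for act in ["replace pump and belt", "replace belt", "remove belt", "replace pump"]:
    interpret(act, stack, log)
```

After the loop, the focus stack holds replace pump on top of replace pump and belt, and the log shows the replace belt segment being pushed, continued, and popped along the way.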
COLLAGEN … my take
• Separation of task from dialog/discourse engine
• Recipes / Domain plans / Task tree
  • Full-blown HTN
    • Hierarchical
    • Preconditions (constraints)
    • Effects
    • Completion / failure
    • Live nodes
• Stack to keep track of focus and discourse structure
• Tree explicitly contains agent and user nodes
• Formalized / descriptive recipe specs (actually Java underneath), with procedure overwrites…
Themes …
Themes: Task Representation
• Task representation
  • Separation of task representation from dialog engine
  • High-level representations of task
  • Descriptive rather than procedural
    • Procedural will be unavoidable for complex tasks
  • Expressive power
• GoDiS, RavenClaw, Collagen: plan-based representations of task
Themes: Task/Domain/Genre
• The notion of dialog genre
  • Tutoring
  • Diagnosis
  • Information Access
• Where to fold it in a dialog manager?
  • GoDiS: update/select rules
  • Ariadne: plugins
  • RavenClaw: collapsed with task
• How clear is that separation: task vs. genre?
Themes: Development time
• Systems took on the order of 3-5 days to develop
• Significant effort in the backend connection
  • Some sites shortcut it
• Significant effort in grammar/language generation development
  • Some sites shortcut it
• Everyone that had an implementation: “fixed a couple of bugs, but no major changes required”
Themes: Development tools
• Regression testing (GoDiS)
  • Systems are complex. If you change something in a dialog management framework, can you prove that it did not break things that used to work?
  • System-wise, very intractable
  • Component-wise, maybe: i.e. DM with DM inputs/outputs
• System diagnosis / log visualization tools (Collagen)
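Component-wise regression testing of a dialog manager can be sketched as replaying recorded DM inputs and comparing each DM output with the recorded one. The Python below is a minimal illustration, assuming the DM can be driven as a function from (state, input) to (state, output). The names `toy_dm`, `replay`, and the transcript format are hypothetical and not taken from any of the systems discussed.

```python
def toy_dm(state, user_input):
    """Hypothetical dialog manager: asks about fever until it gets yes/no."""
    state = dict(state)
    if user_input in ("yes", "no"):
        state["have_fever"] = user_input
    if "have_fever" in state:
        return state, "inform(done)"
    return state, "request(have_fever)"


def replay(dm, transcript):
    """Replay recorded inputs; return (turn, expected, actual) for each mismatch."""
    state, failures = {}, []
    for turn, (user_input, expected) in enumerate(transcript):
        state, actual = dm(state, user_input)
        if actual != expected:
            failures.append((turn, expected, actual))
    return failures


# Recorded session: (DM input, DM output) pairs.
TRANSCRIPT = [("", "request(have_fever)"), ("yes", "inform(done)")]

failures = replay(toy_dm, TRANSCRIPT)


def broken_dm(state, user_input):
    """A regressed variant that never stops asking."""
    return state, "request(have_fever)"


broken_failures = replay(broken_dm, TRANSCRIPT)
```

Working at the level of DM inputs and outputs keeps recognition, parsing, and generation out of the comparison, which is what makes the component-wise check tractable where a whole-system check is not.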
Themes: Timing
• (Micro)timing
  • unaddressed
• Turn-taking models
  • in general, very rudimentary
• Asynchronous behaviors
  • Could be accomplished, but no one seemed to have it
• Multi-party conversation
  • unaddressed
Themes: the important problems
• Different people have different views of what those are:
  • Plan / Intention recognition
  • Reference resolution
  • Backup in complex systems
  • Tense problems
  • Negations
  • Grounding; error prevention / recovery
Themes: Reasoning
• Dialog Managers vs Backends
  • Where to draw the line?
  • Who does the reasoning?
  • Can we avoid duplicating it?
  • How rich is the interaction between them?
• Dialog systems use language to act in a domain, so dialog manager and backend are generally strongly tied
• Basic set of conversational skills can be identified
• Drawing that line is still an “art”; no general agreement or solutions exist
Themes: Science of Dialog?
• How much science do we have?
• Theory vs. experiment
  • Interesting Collagen / RavenClaw similarities
• Representation or not?
• GUI analogy
  • Do we have the checkboxes and radio-buttons?