270 likes | 414 Views
COLLAGEN: Middleware for Building Mixed-Initiative Problem Solving Assistants. Charles Rich Candace L. Sidner Mitsubishi Electric Research Laboratories Cambridge, MA.
E N D
COLLAGEN: Middleware for Building Mixed-Initiative Problem Solving Assistants Charles Rich Candace L. Sidner Mitsubishi Electric Research Laboratories Cambridge, MA ( Neal Lesh, Andy Garland, Chris Lee, David McDonald, Egon Pasztor, Chris Maloof, Luke Zettlemoyer, Jim Davies, Myrosia Dzikovska, Steve Wolfman, Jacob Eisenstein, Allison Bruce ) MERL
Outline of the Talk • Introduction • Demo:DiamondHelp • Some Theory • Some Architecture • Some More Technical Details • Related Work MERL
* implies Collaboration Mixed-Initiative i.e., all collaborative systems are mixed-initiative strongly suggests Collaboration Mixed-Initiative i.e., most interesting mixed-initiative systems are collaborative Mixed-Initiative and Collaboration Mixed Initiative: …efficient, natural interleaving of contributions by users and automated services… [Horvitz] Collaboration: A process in which two or more participants coordinate their actions toward achieving shared goals. [Grosz & Sidner] MERL
Collaboration • usually involves some form of communication (discourse) between the participants, e.g., in natural language. • covers a wide spectrum of interactions depending, among other factors, on: • - the relative knowledge of the participants • - which participant predominantly has the initiative • - the primary goal of the collaboration • e.g., tutoring versus assistance MERL
Demo MERL
Outline of the Talk • Introduction • Demo:DiamondHelp • Some Theory • Some Architecture • Some More Technical Details • Related Work MERL
SharedPlan Collaborative Discourse Theory Intentional goals, recipes, plans focus spaces, focus stack segments, lexical items Attentional Linguistic (Grosz, Sidner, Kraus, Lochbaum 1974-1998) MERL
replace belt replace pump and belt replace pump Discourse Segments and Purposes (fixing an air compressor, E = expert, A = apprentice) E: Replace the pump and belt please. A: Ok, I found a belt in the back. A: Is that where it should be? A: [removes belt] A: It’s done. E: Now remove the pump. … E: First you have to remove the flywheel. … E: Now take the pump off the base plate. A: Already did. (Grosz, 1974) MERL
SharedPlan Discourse State Model Focus Stack Plan Tree replace pump and belt current focus space replacebelt replace pump and belt replace pump replace belt E: Replace the pump and belt please. A: Ok, I found a belt in the back. A: Is that where it should be? A: [removes belt] A: It’s done replace pump and belt replace belt (Grosz & Sidner, 1986) MERL
live 4#Propose.Should [ agent, g[user] ] 1#e [ user ] 2#d [ user ] 3#f [ agent ] SharedPlan Discourse Interpretation Algorithm Updating the discourse state in response to new discourse events (communications or manipulations) A Plan Tree: live Focus Stack: g B live C live B A live live live d [ user ] e [ user ] f [ agent ] g [ user ] 1. User performs e. 2. User performs d. 3. Agent performs f. (Lochbaum, 1998) 4. Agent says “Please perform g.” MERL
User says "What next?" Agent says "What do you want to do?" [Choosing the fabric and stain.] User says "Choose the fabric and stain." [Done choosing the fabric.] [Done successfully navigating.] [Done user successfully popping up the fabric load selection display.] Agent says "Please press the Fabric Load picture to pop up the fabric choices." Agent points to where you press the Fabric Load picture to pop up the fabric choices. User pops up the fabric load selection display. User closes the current pop-up window (by pressing OK in the window corner). User says "What next?" [Choosing the stain.] [Done successfully navigating.] [Done user successfully popping up the stain selection display.] Agent says "Please press the Stain picture to pop up the stain choices." Agent points to where you press the Stain picture to pop up the stain choices. User pops up the stain selection display. [Next expecting optionally to select a stain.] [Next expecting to close the current pop-up window (by pressing OK in the window corner).] [Expecting optionally to adjust detailed settings.] [Expecting optionally to run the selected cycle.] MERL
discourse theory problem solving theory Discourse Theory vs. Problem-Solving Theory • Even though it includes an intentional (plan tree) component, SharedPlan discourse theory is not a complete problem-solving theory: • For example, it does not tell you how to build new recipes (for that, you might use, e..g., first-principles planning or case-based reasoning) • If a problem solver does not collaborate, then it does not need a discourse model! • However, a mixed-initiative problem solving assistant needs both a discourse model and a problem-solving model (e.g., BDI). MERL
Discourse Theory vs. Problem-Solving Theory • The discourse model constrainsthe problem solving model: • For example, the discourse model constrains which subproblem to work on next based on the focus of attention in the collaboration. • This modularity is possible because SharedPlan discourse theory captures structure that is independent of thedomain and the problem solving model, i.e., structure that is fundamentally about the collaboration process itself. • The discourse model also provides structure needed for linguistic processing, such as reference resolution (via focus spaces). problem solving model discourse model desires intentions discourse interpretation first-principles planning plan recognition beliefs MERL
Outline of the Talk • Introduction • Demo:DiamondHelp • Some Theory • Some Architecture • Some More Technical Details • Related Work MERL
The COLLAGEN Project Theoretical Orientation: Applying SharedPlan collaborative discourse theory to improve human-computer interaction. Practical Goal: Building collaborative agents (mixed-initiative problem solving assistants) for a wide range of applications with a maximum degree of software reuse. MERL: Charles Rich Candace Sidner USC/ISI: Jeff Rickel MITRE: Abigail Gertner TU Delft: David Keyson, Elyon Dekoven MIT Media Lab: Justine Cassell, Tim Bickmore MERL
Collaborative Agent Task-Oriented Human Collaboration COLLAGEN Task-Oriented Human Collaboration focus stack plan tree communicate observe observe interact interact MERL
Software Reuse: Prototypes Built with Collagen LOTUS/IBM MERL MERL/MELCO MERL USC/ISI MERL/MELCO MITRE MERL/MELCO MERL MERL MERL MERL
Implementation of SharedPlan Discourse Theory Collagen Architecture Task Model (Recipes) Discourse State Interpret * user event Respond ** Weak Problem-Solving Model Choose Generate agent event agenda * Lesh, Rich, Sidner (1999-2001) -- plan recognition Grosz, Sidner, Kraus, Lochbaum (1974-1998) -- discourse interpretation ** Rich, Lesh, Rickel, Garland (2002) -- plugins MERL
Outline of the Talk • Introduction • Demo:DiamondHelp • Some Theory • Some Architecture • Some More Technical Details • Related Work MERL
Non-primitive act Primitive act Recipe step Ordering constraint Other constraints (not shown): * - act pre/postconditions - recipe applicability conditions - equality between parameters - other logical predicates Fragment of Gas Turbine Task Model EngageEngine StartGenerator OpenBleedValve Engage Purge OpenFuelValve TurnOn ( HierarchicalTask Network ) *Truth maintenance system MERL
Y Z W U Plan Recognition Reduces Explicit Communication X X plan recognition U (Lesh, Rich & Sidner, 1999) MERL
without plan recognition with plan recognition USER: "Let's work on email." USER: "Let's work on email." AGENT: "You have 5 unread messages.” AGENT: "You have 5 unread messages." USER: "Let's read a message." USER: [select message from Chuck] USER: [select message from Chuck] USER: "Let's read message in its own window." USER: [open message] USER: [open message] USER: "Let's do something using message." USER: [visit http://foo.bar.com] USER: [visit http://foo.bar.com] ...long lunch... ...long lunch... USER: "What was I doing?" USER: "What was I doing?" AGENT: "Working on email. AGENT: "Working on email. Reading message from Chuck. Reading message from Chuck. Reading message in its own window. Reading message in its own window. Done something using message. Done something using message. [visit http://foo.bar.com] [visit http://foo.bar.com] Expecting to react to message (opt)." Expecting to react to message (opt)." USER: "Let's react to message." USER: "Let's send email." USER: [start new message] USER: [start new message] USER: "Let's fill in message." USER: [address message to Dan] USER: [address message to Dan] AGENT: "Why are you sending email to Dan? (a) reacting to message from Chuck (b) as a new goal " USER: "I am reacting to message from Chuck." ...additional editing actions... ...additional editing actions... USER: "Done filling in message." USER: "Done filling in message." AGENT: "Let's send, cancel or save message." AGENT: "Let's send, cancel or save message." USER: "Please save message." USER: "Please save message." AGENT: [save message] AGENT: [save message] MERL
SLU SLG Natural Language Processing Task Model (Recipes) Discourse State Interpret user event Respond Choose Generate agent event agenda MERL
Artificial Discourse Language (1) Formal semantics in terms of beliefs and intentions: speaker: PROPOSE(b) Believe(speaker, b) Intend(speaker, Achieve(speaker, Believe(hearer, b) hearer: ACCEPT(b) Believe(speaker, b) Believe(hearer, b) Believe(speaker, Believe(hearer, b)) Believe(hearer, Believe(speaker, b)) Believe(speaker, Believe(hearer, Believe(speaker, b))) ... mutual belief (Sidner, 1994) MERL
text to speech utterance menu “Let's work on email.” template substitution * PROPOSE(SHOULD(DoEmail(...))) * also using SPUD (Stone, 2003) Devault, Rich, Sidner 2004 Artificial Discourse Language (2) Translation to and from natural languages: speech recognition “Let’s work on email.” natural language understanding PROPOSE(SHOULD(DoEmail(...))) MERL
Related Work (vs. Collagen) • multiple participant collaboration (vs. two participants) • e.g., Tambe et al. • other theoretical models of collaboration (vs. SharedPlan) • e.g., Levesque & Cohen, Carberry • application-specific collaborative dialogue systems (vs. middleware) • e.g., MERIT, MIRACLE, DenK, TRIPS • other interface agents (without discourse model) • e.g., Maes, and many others • other agent-related middleware (without discourse model) • e.g., PRS, and other BDI interpreters * * Recently evolving into CPS middleware MERL
COLLAGEN discourse theory problem solving theory Conclusions (Free research licenses available) Questions? MERL