RavenClaw: Dialog Management Using Hierarchical Task Decomposition and an Expectation Agenda
Dan Bohus, Alex Rudnicky
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213
Abstract

RavenClaw is a new dialog management framework developed as the successor to the Agenda architecture used in the CMU Communicator. RavenClaw introduces a clear separation between the specification of task and discourse behaviors, and allows rapid development of dialog management components for spoken dialog systems operating in complex, task-oriented domains. The new system development effort is focused entirely on the specification of the dialog task, while a rich set of domain-independent conversational behaviors are transparently generated by the dialog engine. To date, RavenClaw has been applied to five different domains, allowing us to draw some preliminary conclusions as to the generality of the approach. We briefly describe our experience in developing these systems.

1. Goals

• RavenClaw = framework aimed at the rapid development of dialog managers for complex, task-oriented dialog domains
• Handle a variety of complex domains
• Easy to develop and maintain systems
  • Developer focuses only on specifying the dialog task
  • Dialog engine handles the rest automatically
• Architecture supports:
  • Learning (both task and discourse levels)
  • Dynamic generation of dialog tasks
  • Grounding mechanisms

Overall design

• RavenClaw is a 2-tier architecture (see below)

Dialog Task Specification Layer
• Captures all the domain-specific dialog (task) logic
• The system development effort is entirely focused here

Domain-independent Dialog Engine
• Manages dialog by executing the Dialog Task Specification
• Provides domain-independent conversational strategies

Key architectural details

Dialog Task Specification (sample)

    DEFINE_AGENCY(CLogin,
      IS_MAIN_TOPIC()
      DEFINE_SUBAGENTS(
        SUBAGENT(Welcome, CWelcome)
        SUBAGENT(AskRegistered, CAskRegistered)
        SUBAGENT(AskName, CAskName)
        SUBAGENT(GreetUser, CGreetUser)
      )
      DEFINE_CONCEPTS(
        STRING_USER_CONCEPT(user_name)
        BOOL_USER_CONCEPT(registered)
      )
      SUCCEEDS_WHEN(COMPLETED(GreetUser))
      PROMPT_ESTABLISH_CONTEXT("establish_context login")
    )

    DEFINE_INFORM_AGENT(CWelcome,
      PROMPT(":non-interruptable inform welcome")
    )

    DEFINE_REQUEST_AGENT(CAskRegistered,
      REQUEST_CONCEPT(registered)
      GRAMMAR_MAPPING("[Yes]>true, [No]>false")
    )

    DEFINE_REQUEST_AGENT(CAskName,
      PRECONDITION(IS_TRUE(registered))
      REQUEST_CONCEPT(user_name)
      MAX_ATTEMPTS(2)
      GRAMMAR_MAPPING("[UserName]")
    )
    ...

[Figure: RoomLine dialog task tree. Agent labels: RoomLine, Suspend, Login, GetQuery, GetResults, DiscussResults, DateTime, Location, Properties, Network, Projector, Whiteboard, Welcome, AskRegistered, AskName, GreetUser. Concept labels: user_name, registered, query, results.]

Rich concept representation
• Set of confidence / value pairs
• History of previous values
• Flags indicating grounding, availability, conveyance status, etc.
Example values: John Doe / 0.46, Joe Down / 0.33

[Figure: the Dialog Task Specification feeds the Dialog Engine, which holds the Dialog Stack (agents execution) and the Expectation Agenda and receives the User Input.]

Dialog Stack / Agents Execution (top of stack first):
  Step 1: RoomLine
  Step 2: Login, RoomLine
  Step 3: Welcome, Login, RoomLine
  Step 4: Login, RoomLine
  Step 5: AskRegistered, Login, RoomLine

User Input:
  System: Are you a registered user?
  User: Yes, this is John Doe
  Parse: [Yes](yes / 0.87) [UserName](john doe / 0.46)

Expectation Agenda (three levels, as listed in the figure):
  Level 1: registered: [No] → false, [Yes] → true
  Level 2: registered: [No] → false, [Yes] → true; user_name: [UserName]
  Level 3: registered: [No] → false, [Yes] → true; user_name: [UserName]; query.date_time: [DateTime]; query.location: [Location]; query.network: [Network]; query.projector: [Projector]; query.whiteboard: [Whiteboard]
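The binding step shown in the figure (parsed slots matched against the Expectation Agenda and written into concepts as confidence / value pairs) can be sketched roughly as follows. This is a minimal illustration, not RavenClaw's actual code: the types `Hypothesis`, `ParsedSlot`, and `Expectation` and the functions `bind_input` and `login_example` are hypothetical names, the rule that the first agenda level expecting a slot wins is an assumption, and only one of the `query.*` expectations is included for brevity.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not RavenClaw's actual classes.

// One hypothesis held by a concept: a value with its confidence score.
struct Hypothesis {
    std::string value;
    double confidence;
};

// A grammar slot found in the parse, e.g. [Yes](yes / 0.87).
struct ParsedSlot {
    std::string slot;
    std::string text;
    double confidence;
};

// One expectation: a grammar slot the system listens for and the concept it fills.
// A non-empty mapped_value replaces the parsed text (e.g. [Yes] -> "true").
struct Expectation {
    std::string concept_name;
    std::string slot;
    std::string mapped_value;
};

using AgendaLevel = std::vector<Expectation>;
using Concepts = std::map<std::string, std::vector<Hypothesis>>;

// Binds each parsed slot through the first agenda level that expects it.
// Assumption: levels are ordered from the agent in focus outward.
Concepts bind_input(const std::vector<AgendaLevel>& agenda,
                    const std::vector<ParsedSlot>& parse) {
    Concepts concepts;
    for (const ParsedSlot& p : parse) {
        bool bound = false;
        for (const AgendaLevel& level : agenda) {
            for (const Expectation& e : level) {
                if (e.slot != p.slot) continue;
                const std::string& value =
                    e.mapped_value.empty() ? p.text : e.mapped_value;
                concepts[e.concept_name].push_back(Hypothesis{value, p.confidence});
                bound = true;
                break;
            }
            if (bound) break;  // a slot binds at most once
        }
    }
    return concepts;
}

// Rebuilds the "Are you a registered user?" exchange from the figure.
Concepts login_example() {
    const Expectation yes{"registered", "Yes", "true"};
    const Expectation no{"registered", "No", "false"};
    const Expectation name{"user_name", "UserName", ""};
    const Expectation date{"query.date_time", "DateTime", ""};
    const std::vector<AgendaLevel> agenda = {
        {no, yes},              // level 1
        {no, yes, name},        // level 2
        {no, yes, name, date},  // level 3 (other query.* slots omitted)
    };
    const std::vector<ParsedSlot> parse = {
        {"Yes", "yes", 0.87},
        {"UserName", "john doe", 0.46},
    };
    assert(!agenda.empty());
    return bind_input(agenda, parse);
}
```

In this sketch, calling `login_example()` leaves `registered` holding the mapped value "true" from level 1 and `user_name` holding "john doe" from level 2, so a single user turn fills both the requested concept and one the system had not yet asked for.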
2. The Dialog Task Specification

Generics
The Dialog Task Specification = tree of dialog agents, with each agent handling the corresponding part of the dialog task

Advantages of hierarchical representation:
• Dialog task structure naturally lends itself to hierarchical description
• Ease of maintenance and design; good scalability
• Implicitly captures context in dialog

Dialog Task Agents
• Fundamental Dialog Agents (on leaves)
  • Inform: sends an output
  • Request: requests and listens for information
  • Expect: expects (listens for) information
  • DomainOperation: performs domain operations (e.g. back-end calls)
• Dialog Agencies (non-terminal nodes)
  • Control the execution of the subsumed agents

Agent properties / functionalities:
• Execute routine
• Preconditions and triggers
• Completion criteria (successful / unsuccessful)
• Effects
• Hold concepts

3. The Dialog Engine

• Domain-independent component that executes the Dialog Task Specification
• Dialog flow is generated by alternating Execution Phases and Input Phases

Execution Phase
• The dialog agents in the task tree are executed and generate the system's behavior
• Dialog engine uses a stack structure to execute the agents in the task tree:
  • Repeatedly execute the agent on top of the stack
  • When agencies execute, they plan one of their subsumed agents for execution (according to preconditions and policies)
  • Completed agents are removed from the stack
• Request-type fundamental agents can interrupt an Execution Phase and solicit an Input Phase

(3-Stage) Input Phase
• Assemble an Expectation Agenda
  • The Expectation Agenda models the system's input expectations at that point in time
• Bind values from input to concepts
  • Inputs are matched to system expectations
• Analyze focus shifts
  • Establish whether the focus of the conversation should be shifted in light of the recent input
• … then, continue with another Execution Phase.

Conversational behaviors
The Dialog Engine automatically provides a basic set of domain-independent conversational behaviors
• Generic dialog mechanisms
  • Help, Repeat, Suspend, Start over, etc.
• Turn-taking behavior
• Grounding behaviors
  • Explicit and implicit verifications, disambiguations, context reestablishment, etc.

4. RavenClaw-based systems

• LARRI [Symphony Project, CMU]
  A multi-modal conversational agent that provides support for F/A-18 aircraft mechanics performing maintenance tasks:
  • Guidance & information browsing domain
  • Tree-based decomposition is very well suited to this domain; portions of the dialog task tree are generated dynamically based on the task to be performed
• Intelligent Procedure Assistant [NASA Ames]
  Multi-modal system that provides assistance to astronauts on the International Space Station in the execution of procedural tasks and checklists:
  • Guidance & information browsing domain
  • RavenClaw interfaced in Open Agent Architecture (with Gemini inputs / output)
• BusLine [Let’s Go! Project, CMU]
  Information search interface to Pittsburgh bus schedules:
  • Information exploration domain
  • Static dialog task tree
• RoomLine [CMU]
  Assistance for conference room reservation and scheduling within the School of Computer Science at CMU:
  • Information management domain
  • Static dialog task tree
• TeamTalk [11-741, CMU]
  Spoken command and control for a team of robots:
  • Command and control domain
  • Challenges: multi-way conversations, (complex) asynchronous behaviors
  • Static dialog task tree

5. Conclusions

• RavenClaw = Dialog Management framework which focuses system development effort on creating a description of the underlying dialog task
• Dialog Engine drives the dialog towards its goals, and uses generic conversational strategies to maintain dialog flow and coherence
• 5 systems built to date, spanning various domains and task complexities
• RavenClaw adapted easily, indicating high versatility and good scalability properties

School of Computer Science, Carnegie Mellon University, 2003, Pittsburgh, PA, 15213.
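The stack-based Execution Phase of the Dialog Engine can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the engine's real implementation: `Agent`, `execution_phase`, `DemoResult`, and `run_login_demo` are hypothetical names, agencies are assumed to plan their first uncompleted child from left to right, and preconditions, triggers, and failure criteria are left out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified agent model; not RavenClaw's actual classes.
struct Agent {
    std::string name;
    bool is_request;               // Request agents solicit an Input Phase
    bool completed;
    std::vector<Agent*> children;  // non-empty for agencies

    explicit Agent(std::string n, bool request = false)
        : name(std::move(n)), is_request(request), completed(false) {}
};

// Runs one Execution Phase on the dialog stack (back of the vector = top).
// Returns the name of the Request agent that interrupted the phase,
// or an empty string if the stack emptied.
// `trace` records the order in which fundamental agents executed.
std::string execution_phase(std::vector<Agent*>& stack,
                            std::vector<std::string>& trace) {
    while (!stack.empty()) {
        Agent* top = stack.back();
        if (top->completed) {          // completed agents leave the stack
            stack.pop_back();
            continue;
        }
        if (!top->children.empty()) {  // agency: plan a subsumed agent
            Agent* next = nullptr;
            for (Agent* child : top->children) {
                if (!child->completed) { next = child; break; }
            }
            if (next == nullptr) {     // assumed: done when all children are done
                top->completed = true;
            } else {
                stack.push_back(next);
            }
            continue;
        }
        trace.push_back(top->name);    // fundamental agent executes
        if (top->is_request) return top->name;  // stays on stack, awaits input
        top->completed = true;
    }
    return "";
}

struct DemoResult {
    std::string waiting_on;
    std::vector<std::string> trace;
    std::vector<std::string> stack_bottom_to_top;
};

// Part of the RoomLine tree: RoomLine -> Login -> {Welcome, AskRegistered}.
DemoResult run_login_demo() {
    Agent welcome("Welcome");
    Agent ask_registered("AskRegistered", true);
    Agent login("Login");
    login.children = {&welcome, &ask_registered};
    Agent room_line("RoomLine");
    room_line.children = {&login};

    std::vector<Agent*> stack = {&room_line};
    DemoResult result;
    result.waiting_on = execution_phase(stack, result.trace);
    assert(!stack.empty());  // a Request agent is still waiting on top
    for (const Agent* a : stack) result.stack_bottom_to_top.push_back(a->name);
    return result;
}
```

In this sketch, `run_login_demo()` executes Welcome, removes it from the stack, and then stops at AskRegistered with the stack holding RoomLine, Login, and AskRegistered. An Input Phase would follow at that point before the next Execution Phase resumes.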