Learning to Talk Through Listening. Alexander I. Rudnicky with Ananlada Chotimongkol and Dan Bohus Carnegie Mellon University CATALOG 2004 – Barcelona July 21, 2004. Outline. Empirical approaches to understanding dialogue and building dialogue systems A task-based approach to dialogue
Learning to Talk Through Listening Alexander I. Rudnicky with Ananlada Chotimongkol and Dan Bohus Carnegie Mellon University CATALOG 2004 – Barcelona July 21, 2004
Outline • Empirical approaches to understanding dialogue and building dialogue systems • A task-based approach to dialogue • Fundamental representations and observable events • Learning through observation
Why Build Dialogue Systems? • The devil is in the details • Better understand the actual complexities of human-computer interaction • Create specific artifacts that embody theories of dialogue and interaction (and thereby allow us to test them directly)
Domains, Tasks and Applications [Diagram labels: Domain, Task (three instances), Application]
Task representation/specification alternatives • Code (unspecialized representations, procedural) • Difficult to manage • Forms (properly, F→A sets) • Works for the simplest tasks, which can be easily cast as such • Many examples • Forms + graph-based dialogue structure • Graph-based part essentially = code, same problems • Examples: VXML, SALT • Hierarchical, plan-based • Task specified as a hierarchical plan (recipe) for the domain • Examples: RavenClaw, Collagen
CMU dialogue approaches and systems • Procedural • Command and control [OM, etc.] • Information access [MovieLine, etc.] • Script-based and graph-based • Travel planning; maintenance [SpeechWear] • AGENDA-based • Communicator: travel planning • LARRI: task guidance [m-modal] • Roomline, etc.: information access and transactions • Madeleine: medical diagnosis • TeamTalk: multi-participant dialogue • Valerie: interviews
Graph-based systems • Welcome to Bank ABC! Please say one of the following: Balance, Hours, Loan, ... • What type of loan are you interested in? Please say one of the following: Mortgage, Car, Personal, ... [Diagram: prompts as graph nodes, with transitions labeled Loan and Car]
Frame-based systems • I would like to fly to Boston • When would you like to fly? • Friday [Frame after these turns: Destination_City: Boston, Departure_Date: 20030822, Departure_Time: ______, Preferred_Airline: ______, ...]
Frame-based systems • I’d like to go to Boston on Friday, … • What time would you like to leave? [Frame after the first turn: Destination_City: Boston, Departure_Date: 20030822, Departure_Time: ______, Preferred_Airline: ______, ...]
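The two exchanges above can be sketched as slot filling over a single frame. This is a minimal illustration and not the implementation of any system named in these slides: the Frame class, the PROMPTS table, and the prompts for Destination_City and Preferred_Airline are assumptions. The slot names and the other two prompts are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Slot names as shown on the slide, in the order the system asks for them.
SLOTS = ["Destination_City", "Departure_Date", "Departure_Time", "Preferred_Airline"]

# Two prompts come from the slides; the other two are hypothetical.
PROMPTS = {
    "Destination_City": "Where would you like to go?",        # assumed wording
    "Departure_Date": "When would you like to fly?",          # from the slide
    "Departure_Time": "What time would you like to leave?",   # from the slide
    "Preferred_Airline": "Do you have a preferred airline?",  # assumed wording
}


@dataclass
class Frame:
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {name: None for name in SLOTS}
    )

    def fill(self, values: Dict[str, str]) -> None:
        """Store every recognized slot value; one utterance may fill several slots."""
        for name, value in values.items():
            if name in self.slots:
                self.slots[name] = value

    def next_prompt(self) -> Optional[str]:
        """Return the prompt for the first empty slot, or None when the frame is full."""
        for name in SLOTS:
            if self.slots[name] is None:
                return PROMPTS[name]
        return None


# First exchange: the user gives only the destination.
frame = Frame()
frame.fill({"Destination_City": "Boston"})
first_prompt = frame.next_prompt()

# Second exchange: the user gives destination and date in one utterance.
frame2 = Frame()
frame2.fill({"Destination_City": "Boston", "Departure_Date": "20030822"})
second_prompt = frame2.next_prompt()
```

The point of the sketch is that the user, not a fixed graph, decides how many slots one utterance fills; the system only asks for whatever is still empty.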
Frame-based systems • Transition on keyword or phrase [Diagram: several frames with placeholder slot names (Zxfgdh_dxab, askjs, dhe, aa_hgjs_aa), connected by transitions]
Outline • Empirical approaches to understanding dialogue and building dialogue systems • A task-based approach to dialogue • Learning through observation • Fundamental representations and observable events
Task-oriented Interaction • Implicit system goal is to create products • Data structures that specify information for action • Sessions can generate multiple products • Immediate products, e.g., information requests • Products that are built up incrementally over the course of a session, e.g., a plan such as an itinerary • An Agenda to order (and re-order) topics for discussion
Products and Actions • Products and Actions are domain-specific • e.g., itineraries → bookings, queries → information display • Products are represented as an ordered tree • nodes in the tree correspond to schemas (handlers, agents, etc.) and are slots or forms • Slot-specific computation is encapsulated in schemas (handler objects) • Agenda is generated from the current product tree • defines the sequence of topics to take up with the user
Agenda Structure • Ordered list of conversational topics • current goal: focussed topic • pending goals: schemas yet to be filled • persistent goals: handlers that are always active • constructors • generic help • garble [Diagram: agenda list divided into Current focus, Pending goals, Persistent goals]
Simple and Compound Schema [Diagram labels. Schema parts: receptor, transform, value, report, prompt, focus hook. Schema actions: invalidate value, self-promote, reorder tree. Compound schema: Value_1, Value_2, Value_3, receptors + transform, value (e.g. SQL query). Domain Agent appears with both schema types.]
Agendas from product tree traversal • Default traversal of current product tree • left-to-right, depth-first • all nodes in the current product tree are always on the agenda • Persistent goals sort to the bottom of the list [Diagram: product tree with nodes root, profile, Leg_1, Leg_2, Flight_1, Hotel_1, Car_1, Dest_1, Date_1, Time_1, numbered 1 to 10]
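The default traversal can be sketched as a pre-order walk of the product tree with the persistent goals appended at the end. This is an illustration under assumptions: the Node class and build_agenda function are hypothetical names, and the tree shape (which node is whose child) is inferred from the node names, since the slide figure is not recoverable. The persistent goal names are the ones listed on the Agenda Structure slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def build_agenda(root: Node, persistent: List[str]) -> List[str]:
    """Left-to-right, depth-first traversal; persistent goals go to the bottom."""
    agenda: List[str] = []

    def visit(node: Node) -> None:
        agenda.append(node.name)      # every node in the tree is on the agenda
        for child in node.children:   # children are visited left to right
            visit(child)

    visit(root)
    return agenda + list(persistent)


# Assumed tree shape, using the node names from the slide.
tree = Node("root", [
    Node("profile"),
    Node("Leg_1", [
        Node("Flight_1", [Node("Dest_1"), Node("Date_1"), Node("Time_1")]),
        Node("Hotel_1"),
        Node("Car_1"),
    ]),
    Node("Leg_2"),
])

agenda = build_agenda(tree, ["constructors", "generic help", "garble"])
```

Because the agenda is recomputed from the tree, any change to the tree (new nodes, reordered siblings) shows up in the agenda without separate bookkeeping.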
Shifting focus • Agenda has linear structure • Derived from product tree • Focus capture implies reordering sibling nodes • Reordering propagates to root • enclosing topic contexts get promoted • Focus node is promoted to top of the agenda [Diagram: a product tree with nodes a to i, shown before and after node i gets focus]
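One possible reading of the focus-shift rules is sketched below: every node on the path from the root to the focused node is moved to the front of its sibling list, and the focused node is then placed at the top of the agenda. The tree shape, the function names, and the exact ordering of the remaining agenda entries are assumptions; the slide only states the rules, not an algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison, so list.remove() targets the exact node
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)


def find_path(node: Node, target: str) -> Optional[List[Node]]:
    """Return the list of nodes from `node` down to the node named `target`."""
    if node.name == target:
        return [node]
    for child in node.children:
        path = find_path(child, target)
        if path is not None:
            return [node] + path
    return None


def traverse(node: Node) -> List[str]:
    """Left-to-right, depth-first listing of node names."""
    names = [node.name]
    for child in node.children:
        names.extend(traverse(child))
    return names


def shift_focus(root: Node, target: str) -> List[str]:
    """Reorder siblings along the path to `target` and return the new agenda."""
    path = find_path(root, target)
    if path is None:
        raise KeyError(target)
    # Promote each node on the path among its siblings, all the way up to the root.
    for parent, child in zip(path, path[1:]):
        parent.children.remove(child)
        parent.children.insert(0, child)
    agenda = traverse(root)
    agenda.remove(target)
    return [target] + agenda  # the focused node goes to the top of the agenda


# Hypothetical tree using the node letters from the slide.
tree = Node("a", [
    Node("b", [Node("c"), Node("d")]),
    Node("e", [Node("f"), Node("g", [Node("h"), Node("i")])]),
])

agenda = shift_focus(tree, "i")
```

In this sketch the enclosing contexts of node i (g, then e) end up ahead of the unrelated subtree under b, which is the promotion effect the slide describes.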
Constructors • Products are not fixed data structures but may expand through the course of a session • Users can modify the product • “I’d like to go on to Syracuse” • [system adds a new leg sub-tree to the product]
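A constructor of this kind might look like the following sketch, which appends a new leg sub-tree when the user extends the trip. The Node class, make_leg, and add_leg are hypothetical names, the shape of a leg sub-tree is assumed from the node names on the traversal slide, and "Boston" as the first destination is only an illustrative value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def make_leg(index: int, destination: Optional[str] = None) -> Node:
    """Build one leg sub-tree: a flight (destination, date, time), a hotel, a car."""
    flight = Node(f"Flight_{index}", children=[
        Node(f"Dest_{index}", value=destination),
        Node(f"Date_{index}"),
        Node(f"Time_{index}"),
    ])
    return Node(f"Leg_{index}", children=[
        flight, Node(f"Hotel_{index}"), Node(f"Car_{index}"),
    ])


def add_leg(root: Node, destination: Optional[str] = None) -> Node:
    """Constructor: expand the product with a new leg numbered after existing ones."""
    existing = sum(1 for child in root.children if child.name.startswith("Leg_"))
    leg = make_leg(existing + 1, destination)
    root.children.append(leg)
    return leg


# A product with one leg; the user then says "I'd like to go on to Syracuse".
product = Node("root", children=[Node("profile"), make_leg(1, "Boston")])
new_leg = add_leg(product, "Syracuse")
```

The new leg arrives with empty Date, Time, Hotel and Car nodes, so regenerating the agenda from the tree would queue those topics for discussion.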
Hierarchical Plan-based Representation • Dialog control: • Task constraints (Declarative): define the boundaries of the space of possible dialogs • Execution policy (Procedural/Workflow): actively defines dialogue control [Diagram: Login node with GOAL: (registered = false) || AVAILABLE(profile) and an execution policy; sub-nodes AskRegistered, GetProfile, AskName, GreetUser, GreetGuest; preconditions shown: PRE: AVAILABLE(name) (twice), PRE: registered=false]
Hierarchical Plan-based Representation [Diagram: Communicator task tree with nodes Welcome, Login, Travel, Locals, Bye, AskRegistered, GreetUser, GetProfile, Leg1, AskName, GetQuery, ExecuteQuery, DiscussLeg1; MAIN TOPIC and FOCUS markers] • S: Are you a registered user? • U: Yes, this is Alex [yes] [user_name] • AskRegistered: Registered: [yes] • Login: Registered: [yes] Name: [user_name] • Communicator: Registered: [yes] Name: [user_name] Departure: [City] Arrival: [City] …
Hierarchical Plan-based Representation • Common task skills [Diagram: Leg1 with sub-nodes GetQuery, ExecuteQuery, DiscussLeg1] • GetQuery: FORM • DepartureLocation: TCity • ArrivalLocation: TCity • DepartureDate: TDate • DepartureTime: TTime
Dialog Engine • Controls the dialog by executing the hierarchical plan-based task specification • In the process, automatically exhibits appropriate generic (task and domain-independent) conversational skills: • Global dialogue mechanisms • repeat, suspend, start-over, help, where are we? • Grounding • Implicit and explicit confirmations, disambiguations, various non-understanding handling strategies • Timing and turn-taking
Issues that remain • Parallel activities and asynchronous events • Understanding the scope of “dialogue” • Knowledge engineering dialogue systems • Building the interface between the dialogue engine and the world (“pragmatics”) • Capturing human speech and language behavior within tasks and domains • Reasoning about the world within applications • Communicating meaningfully and efficiently with the user about the state of the world
Outline • Empirical approaches to understanding dialogue and building dialogue systems • A task-based approach to dialogue • Learning through observation • Fundamental representations and observable events
Learning by observation • Many automatic systems are meant to substitute for current human-based operations (e.g., a travel agency or a call center) • Can we use such existing working human systems to infer the structure of a corresponding automatic system? • If so, what might be the requisite representations and learning heuristics?
Learning to dialogue • Goal-directed conversation is regular • Both participants can agree on the same goal and both participants want to achieve this goal • Correct transmission of information is at a premium • Can we exploit the regularity to extract the (currently human engineered) structure of the dialogue?
Learning structure from dialogue • Concept identification • Form (topic) segmentation • Task graphs • Multiple data streams • Lightly supervised learning
Travel agent and client [Figure: a travel agent and client conversation with segment labels greeting, leg, return, hotel, car, confirm, payment / close out]
Outline • Empirical approaches to understanding dialogue and building dialogue systems • A task-based approach to dialogue • Learning through observation • Fundamental representations and observable events
Properties of a dialogue representation • Sufficiency • Captures sufficient information for the creation of a dialogue system • Describes the important (i.e., operative) phenomena in conversations • Generality • Covers conversations in dissimilar domains • Learnability • Can be populated through observation (e.g., from a corpus of human-human conversations)
Task-centric dialogue representation • Components of task structure • Procedures for completing task goal(s) • Steps in the task and their dependencies (i.e., the workflow) • Domain language • Concepts and idioms that humans use to communicate about the task • Domain reasoning • The relationships between language and task, and the domain of the application
Dialogue primitives Levels of representation • Task: a subset of conversational sequences that achieves a particular (human/system) goal • Sub-task: a step in a task that contributes toward the fulfillment of the task goal • The smallest unit of a dialogue that contains information sufficient to execute a specific domain action • Concept: key domain entities (perhaps organized into a type-hierarchy or ontology) Mechanisms • Task-oriented: form-filling and result negotiation • Discourse-oriented: grounding, etc.
Task Structure Representation • Task = collection of forms • Sub-task = a form • Concept = a slot in a form F: Query_Departure_Time Depart_Location: carnegie_mellon Arrive_Location: the airport Arrive_Time: Hour:four Minute: thirty Bus_Number: 28X
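The task / sub-task / concept mapping can be written down directly as data. The sketch below encodes the Query_Departure_Time form from the slide; the Form and Task classes and the task name "bus_schedule_enquiry" are hypothetical, while the slot names and values are copied from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

# A slot (concept) holds a simple value, a structured value, or nothing yet.
SlotValue = Optional[Union[str, Dict[str, str]]]


@dataclass
class Form:
    """A sub-task: a named set of slots (concepts)."""
    name: str
    slots: Dict[str, SlotValue]

    def is_complete(self) -> bool:
        return all(value is not None for value in self.slots.values())


@dataclass
class Task:
    """A task: a collection of forms."""
    name: str
    forms: List[Form]


form = Form("Query_Departure_Time", {
    "Depart_Location": "carnegie_mellon",
    "Arrive_Location": "the airport",
    "Arrive_Time": {"Hour": "four", "Minute": "thirty"},
    "Bus_Number": "28X",
})

task = Task("bus_schedule_enquiry", [form])  # task name is an assumption
```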
Example: Air travel planning • Task: create itinerary • Sub-tasks: • Flight reservation • Hotel reservation • Car rental reservation • Concepts: • Airline = { Continental, Iberia, … } • Hotel = { Novotel, Hilton, … }
Example: Bus schedule enquiry • Task (multiple tasks): • Find bus numbers that run between two locations • Find a departure time given a bus number and stop location • Sub-tasks: • No further decomposition needed • Concepts: • Bus Number = { 61C, 28X, … } • Location = { CMU, airport, … }
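Concepts given as value sets suggest a simple lookup as a first approximation of concept identification. The sketch below tags tokens of an utterance using only the members listed on the slide; the lists there are open-ended, so this lexicon is deliberately incomplete. The function name, the tokenization, and the example utterance are assumptions, and a real system would need more than exact matching.

```python
import re
from typing import Dict, List, Set, Tuple

# Only the members shown on the slide; the original lists continue beyond these.
CONCEPTS: Dict[str, Set[str]] = {
    "Bus Number": {"61C", "28X"},
    "Location": {"CMU", "airport"},
}


def tag_concepts(utterance: str) -> List[Tuple[str, str]]:
    """Return (concept, token) pairs for tokens that match a known concept member."""
    lookup = {
        member.lower(): concept
        for concept, members in CONCEPTS.items()
        for member in members
    }
    tagged = []
    for token in re.findall(r"[A-Za-z0-9]+", utterance):
        concept = lookup.get(token.lower())  # case-insensitive exact match
        if concept is not None:
            tagged.append((concept, token))
    return tagged


tags = tag_concepts("When does the 28X leave CMU for the airport?")
```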
Dialogue mechanisms • Operations invoked by participants: • Correspond to an utterance or a part of an utterance • Have a unique consequence on the state of the conversation • e.g., init_form causes a system to create a new form • The behavior of an operation is the same regardless of the domain (only the parameters differ)
Dialogue mechanisms (2) • Dialogue procedure • Requires more than one utterance to complete • A confirmation mechanism = 2 operations (confirmation_request + respond) • Non-verbal operation • Activated by a state of the representation rather than a verbal expression • access_database is activated by the completion of the query form
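The distinction between verbal operations and state-triggered (non-verbal) operations can be sketched as follows. The operation names init_form, fill_form_info and access_database come from the slides; the DialogueState class, the log, the form template "Query_Bus_Number", and the stubbed database access are assumptions made for illustration.

```python
from typing import Dict, List, Optional


class DialogueState:
    """Holds the forms of a conversation and applies domain-independent operations."""

    def __init__(self, templates: Dict[str, List[str]]) -> None:
        self.templates = templates  # domain-specific parameters: form name -> slots
        self.forms: Dict[str, Dict[str, Optional[str]]] = {}
        self.log: List[str] = []

    def init_form(self, name: str) -> None:
        """Verbal operation: create a new, empty form."""
        self.forms[name] = {slot: None for slot in self.templates[name]}
        self.log.append(f"init_form({name})")

    def fill_form_info(self, name: str, slot: str, value: str) -> None:
        """Verbal operation: fill one slot, then check state-triggered operations."""
        self.forms[name][slot] = value
        self.log.append(f"fill_form_info({name}, {slot})")
        self._run_nonverbal(name)

    def _run_nonverbal(self, name: str) -> None:
        # Non-verbal operation: activated by the state (a completed form),
        # not by anything a participant says.
        if all(value is not None for value in self.forms[name].values()):
            self.access_database(name)

    def access_database(self, name: str) -> None:
        """Stub: a real system would run the query described by the form."""
        self.log.append(f"access_database({name})")


# Hypothetical form template for the bus schedule domain.
state = DialogueState({"Query_Bus_Number": ["Depart_Location", "Arrive_Location"]})
state.init_form("Query_Bus_Number")
state.fill_form_info("Query_Bus_Number", "Depart_Location", "CMU")
state.fill_form_info("Query_Bus_Number", "Arrive_Location", "airport")
```

Only the templates passed to the constructor are domain-specific; the operations themselves stay the same across domains, which is the property the slides emphasize.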
An example from the Map Task • Forms • Action forms ( →draw_line) • Entity forms ( landmark ) • Operations ( various ) • Resolving a misunderstanding through grounding [session q8nc7]
[Figure: Giver’s Map and Follower’s Map]
Episode 11-1 Operation: GIVER87: ask_landmark: have you got a TarLM:[golden beach((left))]? FOLLOWER88: respond: yes uh-huh. add_landmark:(golden beach (right)) (Misunderstanding: the follower grounds the left one while the giver asks about the right one) Giver’s Landmark: golden beach (left) Giver Map: yes Follower Map: Location: Giver’s Landmark: golden beach (left) Giver Map: yes Follower Map: yes Location: implicitly grounded Follower’s Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: implicitly grounded
Episode 11-1 (2) Operation: GIVER87: ask_landmark: have you got a TarLM:[golden beach((left))]? FOLLOWER88: respond: yes uh-huh. add_landmark:(golden beach (right)) (Misunderstanding: the follower grounds the left one while the giver asks about the right one) Grounding Form Landmark: golden beach (left) Giver Map: yes Follower Map: yes Location: implicitly grounded
Episode 11-2 Operation: GIVER89: fill_form_info: well go Dir:[straight up] ... from Ori:[Loc:[the top of the white mountain]] 'til you're just Dest:[Loc:[beside the golden beach]] toDest:[Loc:[the right of it (white mountain)]] FOLLOWER90: acknowledge: right, Action form (before): Origin: Orientation: Distance: Path: Destination: Action form (after): Origin: Ori:[Loc:[the top of the white mountain]] Orientation: Dir:[straight up] Distance: Path: Destination: Dest:[Loc:[beside the golden beach]] toDest:[Loc:[the right of it (white mountain)]] Grounding Form Landmark: golden beach (left) Giver Map: yes Follower Map: yes Location: implicitly grounded
Episode 11-3 Operation: ask_fill_form_info: you want me to go dilect-- ... Dir:[directly right]? GIVER91: respond:no, fill_form_info:Dir:[directly up]. Origin: Ori:[Loc:[the top of the white mountain]] Orientation: Dir:[straight up ] Distance: Path: Destination Dest:[Loc:[beside the golden beach]] toDest:[Loc:[ the right of it (white mountain)]] Grounding Form Landmark: golden beach (left) Giver Map: yes Follower Map: yes Location: implicitly grounded
Episode 11-4 Operation: FOLLOWER92: fill_form_info: but golden beach((right)) is away in Loc:[the far right]. (The follower explicitly fills the location of the golden beach (right).) GIVER93: acknowledge: ah right. (Agrees with the location of the golden beach (right)) Giver’s Landmark: golden beach (left) Giver Map: yes Follower Map: yes Location: implicitly grounded Giver’s Landmark: golden beach (left) Giver Map: yes Follower Map: Location: Giver’s Landmark: golden beach (right) Giver Map: yes Follower Map: Location: Follower’s Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: implicitly grounded Follower’s Landmark: golden beach (right) Giver Map: Follower Map: yes Location: the far right
Episode 11-5 Operation: FOLLOWER94: ask_landmark: have you got TarLM:[your (golden beach (right))]? GIVER95: inform_other_info: i've got two golden beaches. FOLLOWER96: acknowledge: ah. add_landmark:(golden beach (right)) Landmark: golden beach (right) Giver Map: Follower Map: yes Location: the far right Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: the far right Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: the far right
Episode 11-5 (2) Operation: FOLLOWER94: ask_landmark: have you got TarLM:[your (golden beach (right))]? GIVER95: inform_other_info: i've got two golden beaches. FOLLOWER96: acknowledge: ah. add_landmark:(golden beach (right)) Grounding Form Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: the far right
Episode 11-6 Operation: GIVER97: fill_form_info: sorry ... so there's TarLM:[the one(golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[ the left of it (white mountain)] for me. FOLLOWER98: fill_form_info: is there, yeah there's nothing nothing there. add_landmark:golden beach (left) GIVER99: acknowledge: right okay, Landmark: golden beach (left) Giver Map: yes Follower Map: Location: Landmark: golden beach (left) Giver Map: yes Follower Map: no Location: above the ... white mountain, the left of it (white mountain) Landmark: golden beach (left) Giver Map: yes Follower Map: no Location: above the ... white mountain, the left of it (white mountain) Landmark: golden beach (left) Giver Map: yes Follower Map: Location: above the ... white mountain, the left of it (white mountain) Grounding Form Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: the far right
Episode 11-6 (2) Operation: GIVER97: fill_form_info: sorry ... so there's TarLM:[the one(golden beach (left))] Loc:[above the ... white mountain] as well to Loc:[ the left of it (white mountain)] for me. FOLLOWER98: fill_form_info: is there, yeah there's nothing nothing there. add_landmark:golden beach (left) GIVER99: acknowledge: right okay, Grounding Form Landmark: golden beach (left) Giver Map: yes Follower Map: Location: above the ... white mountain, the left of it (white mountain) Grounding Form Landmark: golden beach (right) Giver Map: yes Follower Map: yes Location: the far right
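The grounding forms in these episodes can be modeled as small records that are updated as each participant confirms a landmark. The sketch below replays the golden beach (right) form through Episodes 11-4 and 11-5. The field names follow the slides; the GroundingForm class, the helper functions, and the whose_map parameter are assumptions, and the corpus operations are simplified (here add_landmark records whose map the landmark is confirmed on).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class GroundingForm:
    landmark: str
    giver_map: Optional[str] = None     # "yes" once confirmed on the giver's map
    follower_map: Optional[str] = None  # "yes" once confirmed on the follower's map
    location: Optional[str] = None

    def is_shared(self) -> bool:
        """True when both participants have confirmed the landmark."""
        return self.giver_map == "yes" and self.follower_map == "yes"


def add_landmark(forms: Dict[str, GroundingForm], landmark: str,
                 whose_map: str) -> GroundingForm:
    """Record that `landmark` is present on the giver's or the follower's map."""
    form = forms.setdefault(landmark, GroundingForm(landmark))
    if whose_map == "giver":
        form.giver_map = "yes"
    elif whose_map == "follower":
        form.follower_map = "yes"
    else:
        raise ValueError(f"unknown map owner: {whose_map}")
    return form


def fill_location(forms: Dict[str, GroundingForm], landmark: str,
                  location: str) -> GroundingForm:
    """Record an explicitly stated location for `landmark`."""
    form = forms.setdefault(landmark, GroundingForm(landmark))
    form.location = location
    return form


forms: Dict[str, GroundingForm] = {}

# Episode 11-4: the follower states where their golden beach is.
add_landmark(forms, "golden beach (right)", "follower")
fill_location(forms, "golden beach (right)", "the far right")
shared_after_11_4 = forms["golden beach (right)"].is_shared()

# Episode 11-5: the giver confirms having two golden beaches.
add_landmark(forms, "golden beach (right)", "giver")
shared_after_11_5 = forms["golden beach (right)"].is_shared()
```

Keeping one form per landmark, rather than one per landmark name, is what lets the two golden beaches be told apart once the misunderstanding surfaces.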
Applying the representation • Four different task-oriented domains • Air travel planning • Professional travel agent and volunteer clients (re)booking former trips • HCRC map-reading task • Hired subjects communicating path information • Bus schedule information • Professional agents helping customers • UAV operation • Trainees flying an unmanned aircraft in a simulation