Dialog Structure Design and Annotation Ananlada Chotimongkol Language Technologies Institute School of Computer Science Carnegie Mellon University
Outline • Existing Annotation Schemes • Linguistic Oriented • Engineering Oriented • HCRC dialog structure • Conversation Acts • DAMSL • Comparison • Form-based dialog structure
Structure of a dialog • Explains how the conversation is organized • To create a theory of dialog in order to understand the meaning of the dialog • Linguistic-Oriented • To develop a procedure that supports a computer agent in a dialog system • Engineering-Oriented
Linguistic-Oriented • Some are extended from discourse structure (which focuses on monologue text) • Provide the basic theory for the engineering-oriented schemes • Speech Act Theory: captures the speaker’s intention • Rhetorical Structure Theory: explains the coherence between parts of a text • Dialog Grammar: captures regular patterns in the dialog
Engineering-Oriented • HCRC structure (Edinburgh) • Conversation Acts (Rochester) • DAMSL (Multiparty Discourse Group)
HCRC Dialog Structure Carletta, J., Isard, A., Isard, S., Kowtko, J., Doherty-Sneddon, G., Anderson, A., HCRC dialogue structure coding manual, 1996 http://www.ltg.ed.ac.uk/~amyi/maptask/demo.html • Domain = map description • Focuses on describing the phenomena that occur in the Map Task corpus • But claims to be task-independent • Focuses on high-level structure • Can be used in conjunction with other coding schemes
3-level structure • Transaction: a sub-dialog that accomplishes a major goal of the task • In Map Task = 1 segment of the route • Game (interaction, exchange): a set of utterances composed of an initiation and a sequence of responses that fulfills the initiation’s purpose • Move (dialog act): an utterance or part of an utterance that serves a particular purpose e.g. as an initiation or a response
Move Coding Scheme • Tradeoff between semantic distinction and coding consistency • 12 moves from 3 categories • Initiating moves: set up an expectation at the beginning of the game • Instruct, Explain, Check, Align, Query-YN and Query-W • Response moves: follow the initiation and fulfill the expectation • Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify • Ready: occurs in the transition between games
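The move inventory above can be written down as a small lookup table. This is only a sketch: the move names come from the slide, while `MoveCategory`, `MOVES_BY_CATEGORY` and `category_of` are hypothetical names introduced here for illustration.

```python
from enum import Enum


class MoveCategory(Enum):
    INITIATING = "initiating"
    RESPONSE = "response"
    READY = "ready"


# Move names follow the slide; grouping them in a dict is an illustrative choice.
MOVES_BY_CATEGORY = {
    MoveCategory.INITIATING: ["Instruct", "Explain", "Check", "Align", "Query-YN", "Query-W"],
    MoveCategory.RESPONSE: ["Acknowledge", "Reply-Y", "Reply-N", "Reply-W", "Clarify"],
    MoveCategory.READY: ["Ready"],
}


def category_of(move: str) -> MoveCategory:
    """Return the category a move label belongs to, or raise ValueError."""
    for category, moves in MOVES_BY_CATEGORY.items():
        if move in moves:
            return category
    raise ValueError(f"unknown move label: {move!r}")
```

Counting the entries gives the 12 moves named on the slide (6 initiating, 5 response, 1 ready).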
Game Coding Scheme • Game’s purpose = the name of the game’s initiating move • All games begin with an initiating move but not all initiating moves begin games • Games can be nested e.g. contain a clarification sub-dialog
Transaction Coding Scheme • Divide the dialog into transactions • Differs between the giver’s and the follower’s perspectives • For a giver, how he divides a route into sub-tasks • 4 types of transactions: normal, review, overview and irrelevant • Each transaction (except irrelevant) is associated with a route segment on the map • For a follower, how he perceives a segment and performs some actions • 2 types of actions: drawing a line and crossing out a line • Transactions are not nested (too large)
Discussion • No real dialog application. Used as data for analyzing phenomena in dialog • Emphasizes how the information is conveyed e.g. as a question or a response, rather than what information is conveyed (concept) • Annotates the purpose of the utterance in general e.g. instruct, explain, question, rather than the purpose that each utterance serves according to the task e.g. describe the movement or describe the landmark
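A minimal sketch of how nested games might be represented, assuming a game is stored as an ordered list of moves and sub-games. The class names, the `depth` helper and the example utterances are invented for illustration and are not part of the HCRC coding manual.

```python
from dataclasses import dataclass, field
from typing import List, Union

INITIATING_MOVES = {"Instruct", "Explain", "Check", "Align", "Query-YN", "Query-W"}


@dataclass
class Move:
    speaker: str
    label: str
    text: str


@dataclass
class Game:
    # Moves and nested games, in dialog order.
    items: List[Union[Move, "Game"]] = field(default_factory=list)

    @property
    def purpose(self) -> str:
        """A game's purpose is the label of its initiating (first) move."""
        if not self.items:
            raise ValueError("an empty game has no purpose")
        first = self.items[0]
        if not isinstance(first, Move) or first.label not in INITIATING_MOVES:
            raise ValueError("a game must begin with an initiating move")
        return first.label

    def depth(self) -> int:
        """Nesting depth: 1 for a flat game, more when sub-games are embedded."""
        nested = [item.depth() for item in self.items if isinstance(item, Game)]
        return 1 + max(nested, default=0)


# Invented utterances: an Instruct game with an embedded clarification (Check) game.
clarification = Game([
    Move("follower", "Check", "to the left of the mill?"),
    Move("giver", "Reply-Y", "yes"),
])
game = Game([
    Move("giver", "Instruct", "go round the mill"),
    clarification,
    Move("follower", "Acknowledge", "okay"),
])
```

Here `game.purpose` is `Instruct` and the embedded game's purpose is `Check`, which mirrors the rule that a game is named after its initiating move.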
Conversation Acts • David R. Traum and Elizabeth A. Hinkelman, "Conversation Acts in Task-Oriented Spoken Dialogue", In Computational Intelligence, 8(3):575--599, 1992. Also appears as TR 425, Computer Science Dept. • Emphasizes • Mutual understanding between participants • Dialog mechanisms that serve the coordination and maintenance of the dialog itself rather than the direct task.
Dialog units • Utterance unit (UU) • Continuous speech by the same speaker • Each speaker turn can contain more than one UU • Discourse Unit (DU) • A sequence of an initial presentation and subsequent utterances by each party that are needed to make a unit grounded
Classes of Conversation acts • 4 classes • Turn-taking acts (sub-UU acts) • Grounding acts (UU acts) • Core speech acts (DU acts?) • Argumentation acts (multiple DUs) • More general than speech act theory
Turn-taking Act • Can have more than one turn-taking act in an utterance (sub-UU act) • Coordinate the control of the speaking channel • Types of turn-taking acts • take-turn, keep-turn, release-turn, assign-turn and pass-up-turn • Turn-taking acts occur all the time • Should we annotate all of them? • Which one is important?
Grounding Act • Corresponds to one utterance unit (UU act) • Coordinates mutual understanding • Types of grounding acts • Initiate (an initial component of a DU) • Continue • Acknowledge • Repair • ReqRepair • ReqAck • Cancel (close off the current DU as ungrounded)
Core Speech Act • Similar to a traditional speech act • Coordinates the local flow of changes in beliefs, intentions and obligations • Types of core speech acts: • Inform, WHQ, YNQ, Accept, Request, Reject, Suggest, Eval, ReqPerm, Offer, Promise • Doesn’t correspond to any of the dialog units?
Argumentation Act • Composed of combinations of core speech acts (multiple-DU act) • Coordinates the discourse purpose • Is at the same level as Rhetorical Relations and Adjacency Pairs • Types of argumentation acts: Elaborate, Summarize, Clarify, Q&A, Convince, Find-Plan • Builds up a hierarchy within the same class • The higher-level acts correspond to steps in the task structure (task-dependent?) • The lower-level acts include Q&A
DAMSL (Dialog Act Markup in Several Layers) • Coding Dialogs with the DAMSL Annotation Scheme. Mark Core, James Allen. AAAI Fall Symposium on Communicative Action in Humans and Machines, 1997. • J. Allen and M. Core. “Draft of DAMSL: Dialog Act Markup in Several Layers”, 1997.
DAMSL Tag Set • Developed by the Multiparty Discourse Group • Contains primitive communicative actions that manipulate the common ground directly • Allows multiple labels in multiple layers • Eliminates the restriction in Speech Act Theory • Designed to be domain-independent • But domain-relevant acts can be added • The annotation can be used to • Interpret utterances in a dialog • Design an appropriate dialog strategy
DAMSL Annotation Scheme • 3 layers of annotation for each utterance • Forward Communicative Functions • Backward Communicative Functions • Utterance Features • These 3 layers are orthogonal • But some utterances may not have a label for every layer • Can have more than one label in each layer • Utterance segmentation is based on the intentions of the speaker • An utterance can have several clauses or just an initial word
Forward Communicative Function • Indicates how the current utterance constrains the future beliefs and actions • Similar to actions in speech act theory • Types of Forward Communicative Functions • Statement • Influencing Addressee Future Action • Committing Speaker Future Action • Performative (make a fact true by saying it) • Other Forward Function
Backward Communicative Function • Indicates how the current utterance relates to the previous dialog • Types of Backward Communicative Functions • Agreement (accept/reject) • Understanding • Answer (associated with an info-request act) • Information Relation (how this utterance relates to the previous one) • Similar to Rhetorical Relations
Utterance Feature • Capture content and form of utterance • The features are • Information Level: task, task management, communication management • Communicative Status: abandoned, uninterpretable • Syntactic Features: conventional form, exclamatory form
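The three DAMSL layers described above can be sketched as one record per utterance, where each layer holds zero or more labels. This is an assumption-laden illustration: the hyphenated label spellings, the feature key names and the `DamslAnnotation` class are hypothetical, and the example labelling is invented rather than taken from a coded corpus.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

FORWARD_LABELS = {
    "Statement",
    "Influencing-Addressee-Future-Action",
    "Committing-Speaker-Future-Action",
    "Performative",
    "Other-Forward-Function",
}
BACKWARD_LABELS = {"Agreement", "Understanding", "Answer", "Information-Relation"}
FEATURE_NAMES = {"information_level", "communicative_status", "syntactic_features"}


@dataclass
class DamslAnnotation:
    utterance: str
    forward: Set[str] = field(default_factory=set)       # may be empty
    backward: Set[str] = field(default_factory=set)      # may hold several labels
    features: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> Set[str]:
        """Return every label or feature name that is not in the known sets."""
        return (
            (self.forward - FORWARD_LABELS)
            | (self.backward - BACKWARD_LABELS)
            | (set(self.features) - FEATURE_NAMES)
        )


# Invented example: two backward labels, no forward label.
ack = DamslAnnotation(
    utterance="okay",
    backward={"Agreement", "Understanding"},
    features={"information_level": "task"},
)
```

Keeping each layer as a separate set makes it easy to leave a layer empty or to attach several labels to it, which is the property the slides attribute to DAMSL.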
Discussion • Focuses on the primitive purpose of the utterance • Needs a more detailed representation to get the key information in the utterance • Also needs higher-level representations such as plans and discourse structures • Are these 3 layers orthogonal? • Are there too many tags for each utterance?
Comparison: Levels of Annotation • HCRC • Transaction • Game • Move • Conversation Acts • Argumentation acts • Core speech acts • Grounding • Turn-taking • DAMSL • Forward • Backward • Utterance Features
Comparison: Levels of Annotation • HCRC • Transaction • Game • Move (the same level as all DAMSL tags) • Conversation Acts • Argumentation acts (Dialog Unit) • Core speech acts • Grounding • Turn-taking
Comparison: Tags for Utterance Level • HCRC • Initiation: Instruct, Explain, Check, Align, Query-YN and Query-W • Response: Acknowledge, Reply-Y, Reply-N, Reply-W and Clarify • Conversation Acts • Inform, Suggest, Offer, Promise, Request, ReqPerm, WHQ, YNQ, Accept, Reject, Eval • DAMSL • Forward: Statement, Influencing-Addressee-Future-Action, Committing-Speaker-Future-Action, Performative • Backward: Agreement (accept/reject), Understanding, Answer, Information Relation
Form-based dialog structure • Why do we need a new structure? • The existing structures are too general • Want to capture domain information e.g. task structure, key concepts • Want to create a dialog system from a structure • Choose to work on a form-based dialog system • Represent the structure of a dialog in terms of forms and slots
Three-level organization • Task (dialog): a subset of the conversation that serves a particular goal of the dialog • Episode (sub-task): a set of utterances that corresponds to a smaller step in a task • Concept: an important piece of domain information that the participants would like to communicate in the dialog
Form representation • A form is a repository of related pieces of information (concepts) • A sub-task is equivalent to a form • A sub-task is the smallest practical unit • A task = collection of forms (sub-tasks)
How can a task be accomplished using a form? • The sub-task is accomplished by manipulating the form: • *Fill in the slots • *Execute the form • Discuss the result • Operations
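As a rough sketch of the representation above, a form can be a named set of slots and a task a list of forms. The `Form` and `Task` classes, their method names and the slot values below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Form:
    name: str
    slots: Dict[str, Optional[str]]  # concept name -> value, None while unfilled

    def missing(self) -> List[str]:
        """Names of the slots that still have no value."""
        return [slot for slot, value in self.slots.items() if value is None]


@dataclass
class Task:
    goal: str
    forms: List[Form] = field(default_factory=list)  # one form per sub-task

    def open_subtasks(self) -> List[str]:
        """Names of the forms that still have unfilled slots."""
        return [form.name for form in self.forms if form.missing()]


# Hypothetical slot names and values.
flight = Form("flight", {"city": "HOUSTON", "date": None})
hotel = Form("hotel", {"city": "HOUSTON"})
trip = Task("reserve a trip", [flight, hotel])
```

With these values, `flight.missing()` lists `date`, so `trip.open_subtasks()` reports only the flight sub-task as unfinished.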
Operations • An operation is an utterance or a part of an utterance (turn) that causes a unique consequence in the conversation • U:fill_form_info: I'D LIKE TO FLY TO ArLoc:[HOUSTON ]ArLoc:[TEXAS ] • S: access_DB: inform_result:I HAVE A NON-STOP ON CONTINENTAL
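Annotated turns in the notation shown above (speaker, operation labels, then text with `Concept:[value]` tags) can be split mechanically. The parser below is a sketch that assumes this surface format holds for every turn; `parse_turn`, `AnnotatedTurn` and the regular expressions are illustrative and not part of the original annotation tooling.

```python
import re
from typing import List, NamedTuple, Tuple

# Operation labels that appear in the slides.
OPERATIONS = {
    "init_form", "fill_form_info", "change_form_info", "access_DB",
    "inform_result", "navigate_results", "respond",
    "ask_init_form", "ask_fill_form_info",
}
CONCEPT = re.compile(r"(\w+):\[([^\]]*)\]")   # e.g. ArLoc:[HOUSTON ]
LABEL = re.compile(r"\s*(\w+)\s*:")           # a leading "label:" prefix


class AnnotatedTurn(NamedTuple):
    speaker: str
    operations: List[str]
    text: str
    concepts: List[Tuple[str, str]]


def parse_turn(line: str) -> AnnotatedTurn:
    speaker, rest = line.split(":", 1)
    operations = []
    # Peel off leading operation labels; a turn may carry more than one.
    while True:
        match = LABEL.match(rest)
        if not match or match.group(1) not in OPERATIONS:
            break
        operations.append(match.group(1))
        rest = rest[match.end():]
    concepts = [(name, value.strip()) for name, value in CONCEPT.findall(rest)]
    # Drop the tag markup but keep the tagged words in the plain text.
    text = " ".join(CONCEPT.sub(lambda m: m.group(2), rest).split())
    return AnnotatedTurn(speaker.strip(), operations, text, concepts)


user_turn = parse_turn("U:fill_form_info: I'D LIKE TO FLY TO ArLoc:[HOUSTON ]ArLoc:[TEXAS ]")
system_turn = parse_turn("S: access_DB: inform_result:I HAVE A NON-STOP ON CONTINENTAL")
```

The system turn carries two operations (`access_DB` followed by `inform_result`), which is why the labels are collected in a list rather than a single field.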
Question & Answer pair • Q&A are separated into 2 operations by a turn boundary • The consequence of the answer depends on the question, especially for a yes/no answer • Dialog1: U: init_form :I NEED A HOTEL IN HOUSTON • Dialog2: S: ask_init_form:AND WOULD YOU NEED A HOTEL WHILE YOU'RE IN HOUSTON U: respond:YES
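One way to read the two dialogs above is that a `respond` turn takes its effect from the pending question: answering YES to `ask_init_form` has the same consequence as a direct `init_form`. The sketch below encodes that reading under the assumption that every question operation is named `ask_` plus the operation it asks about; the function `resolve_response` is hypothetical.

```python
from typing import Optional


def resolve_response(question_op: str, answer: str) -> Optional[str]:
    """Return the operation a yes/no answer amounts to, or None for 'no'.

    Assumes question operations are named 'ask_<operation>'.
    """
    if not question_op.startswith("ask_"):
        raise ValueError(f"not a question operation: {question_op!r}")
    if answer.strip().upper() == "YES":
        return question_op[len("ask_"):]
    return None  # a negative answer triggers no form operation in this sketch
```

Under this reading, `resolve_response("ask_init_form", "YES")` yields `init_form`, so Dialog2 ends in the same form state as Dialog1.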
Let’s Go • Goal: request information about the bus schedule • Tasks: (multiple system functions) • Ask bus number • Ask departure time • Ask stop • Etc. • One form for each task (a simple task) • Concepts: bus_number, hour, minute, departure_location
List of Operations • Form-filling operations • init_form • fill_form_info • change_form_info • Form execution operations • access_DB (task-specific) • Discuss-result operations • inform_result • navigate_results
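The operations listed above can be read as methods on a single form. The sketch below shows one possible arrangement; the `FormSession` class, the `toy_backend` function and all slot values are invented for illustration, and a real system would replace the backend with its own database access.

```python
from typing import Callable, Dict, List, Optional

Slots = Dict[str, Optional[str]]


class FormSession:
    """Tracks one form through filling, execution and result discussion."""

    def __init__(self, slot_names: List[str], backend: Callable[[Slots], List[str]]):
        self.slot_names = slot_names
        self.backend = backend  # hypothetical, task-specific database lookup
        self.slots: Slots = {}
        self.results: List[str] = []
        self.cursor = 0

    # Form-filling operations
    def init_form(self, **values: str) -> None:
        self.slots = {name: None for name in self.slot_names}
        self.fill_form_info(**values)

    def fill_form_info(self, **values: str) -> None:
        for name, value in values.items():
            if name not in self.slots:
                raise KeyError(f"no slot named {name!r}")
            self.slots[name] = value

    def change_form_info(self, **values: str) -> None:
        for name in values:
            if self.slots.get(name) is None:
                raise ValueError(f"slot {name!r} has no value to change")
        self.fill_form_info(**values)

    # Form execution operation
    def access_DB(self) -> None:
        self.results = self.backend(self.slots)
        self.cursor = 0

    # Discuss-result operations
    def inform_result(self) -> Optional[str]:
        return self.results[self.cursor] if self.results else None

    def navigate_results(self, step: int = 1) -> Optional[str]:
        if self.results:
            self.cursor = max(0, min(len(self.results) - 1, self.cursor + step))
        return self.inform_result()


def toy_backend(slots: Slots) -> List[str]:
    # Invented results, for illustration only.
    return [f"bus {slots['bus_number']} first departure",
            f"bus {slots['bus_number']} next departure"]


session = FormSession(["bus_number", "departure_location", "hour", "minute"], toy_backend)
session.init_form(bus_number="12")
session.fill_form_info(hour="9", minute="30")
session.change_form_info(hour="10")
session.access_DB()
first = session.inform_result()
second = session.navigate_results()
```

Separating `fill_form_info` from `change_form_info` keeps the distinction the slide makes between supplying a new value and revising an existing one.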
Air Travel Domain • Goal: Reserve a flight with optional hotel and car • Tasks: • Reserve a flight • Reserve a car • Reserve a hotel • But car and hotel are always part of a flight reservation. • So it is better to think of them as sub-tasks • One form for each sub-task • Concepts: airline, city, date, time
Flight Reservation • There are 3 form executions (DB access) in the flight reservation episode • Retrieve departure flight • Retrieve arrival flight • Retrieve fare • The fare depends on the flights • Embedded forms: Trip (fare) contains Departure Leg (flight info) and Arrival Leg (flight info)
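The dependency described above (the fare can only be retrieved once both legs have a flight) can be sketched with embedded forms. The class names, the `fare_lookup` parameter and the example values are hypothetical; the actual database calls are not specified in the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LegForm:
    flight: Optional[str] = None  # set by the leg's own DB access


@dataclass
class TripForm:
    departure_leg: LegForm = field(default_factory=LegForm)
    arrival_leg: LegForm = field(default_factory=LegForm)
    fare: Optional[float] = None

    def can_retrieve_fare(self) -> bool:
        """The fare form execution needs both embedded leg forms to be done."""
        return self.departure_leg.flight is not None and self.arrival_leg.flight is not None

    def retrieve_fare(self, fare_lookup: Callable[[str, str], float]) -> float:
        if not self.can_retrieve_fare():
            raise RuntimeError("both legs must have a flight before the fare lookup")
        self.fare = fare_lookup(self.departure_leg.flight, self.arrival_leg.flight)
        return self.fare


# Invented flight identifiers and fare, for illustration only.
trip = TripForm()
trip.departure_leg.flight = "OUT-1"
ready_before = trip.can_retrieve_fare()
trip.arrival_leg.flight = "BACK-2"
fare = trip.retrieve_fare(lambda out, back: 100.0)
```

Embedding the leg forms inside the trip form makes the execution order explicit: the two flight lookups come first, and the fare lookup is the third form execution.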
Map Task: description • Conversation between 2 participants • Giver: has a map with a route on it • Follower: has a map without a route • Task: the giver tells the follower how to draw the route on the follower’s map • The maps are not exactly the same
Map Task: Characteristics • More casual conversation • Disfluency • Repetition • Anaphora • No well-defined form • No constraint from the backend • There are many ways to describe a segment • Needs a lot of grounding processes
Map Task: Structure • Goal: draw a map from a description • Task: draw a line (a route) • Sub-task • draw a segment of a line • Locate a new landmark (can be embedded)
Grounding Process • Create mutual understanding between participants • Check understanding, correctness of communication • Confirmation and clarification • Define a new term • Discuss the attributes of the object e.g. check landmark and create landmark
Grounding process in form-based structure • Confirmation • If ‘yes’, increase the confidence in the slot value • If ‘no’, cross out the value from the slot • Clarification • S:ask_fill_form_info:INTO ArLoc:[INTERCONTINENTAL ]AIRPORT OR ArLoc:[HOBBY ] • U: fill_form_info:AT THE /UH/ ArLoc:[INTERCONTINENTAL ]
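The confirmation rule above can be sketched as a slot that carries a confidence score. The numeric values (initial confidence 0.5, boost 0.25) and the `Slot` class are assumptions made for this example; the slides do not say how confidence is represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    value: Optional[str] = None
    confidence: float = 0.0

    def fill(self, value: str, confidence: float = 0.5) -> None:
        self.value = value
        self.confidence = confidence

    def confirm(self, answer_is_yes: bool, boost: float = 0.25) -> None:
        """Apply a confirmation answer to the current slot value."""
        if self.value is None:
            raise ValueError("nothing to confirm: the slot is empty")
        if answer_is_yes:
            # 'yes' raises confidence in the value, capped at 1.0
            self.confidence = min(1.0, self.confidence + boost)
        else:
            # 'no' crosses the value out of the slot
            self.value = None
            self.confidence = 0.0


arrival = Slot()
arrival.fill("INTERCONTINENTAL")
arrival.confirm(answer_is_yes=True)

rejected = Slot()
rejected.fill("HOBBY")
rejected.confirm(answer_is_yes=False)
```

After these calls `arrival` keeps its value with a higher confidence, while `rejected` is empty again and can be refilled by a later `fill_form_info`.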
Grounding process in form-based structure (2) • Define a new term • A form is a collection of object attributes • FOLLOWER: fill_form_info: but golden beach is away in Loc:[the far right]. • Resulting form: Landmark: golden beach; Location: the far right
Plane simulation task • 3 participants work on the plane simulation • Task = take pictures of a list of targets • Each participant has a different role: flying the plane, navigating the route, taking a picture • There are some restrictions on controlling a plane such as speed, altitude and radius from a destination
Dialog Structure • Task: Take pictures of a given list of targets • Sub-tasks: Take a picture of one target • Concept: • target • waypoint • distance • speed • altitude
Task Characteristics • 3-party conversation • Command & Control style • The physical actions have a time constraint • Can’t execute the form right away after all the slots get filled • The list of the sub-tasks (targets) is not fixed and not known in advance
Sub-task • Main sub-task = take a picture of the target • Also have to control the plane • Set destination, altitude and speed (have restrictions) • Report the result in terms of the plane status: altitude, speed, destination and the distance from the destination • Grounding process • Define a landmark as a target or a waypoint
Forms • target form (take a picture) • target name • required distance from target • control form: contains only a single slot (fly a plane) • Altitude • Speed • Destination (may have radius) • grounding form (grounding process) • object name • attributes e.g. type of landmark