A Glimpse on Some Dialogue Systems Arthur Chan
Introduction • Questions to ponder: • What is a dialogue? • What is a dialogue system? • What are the issues in building a dialogue system? • How do current dialogue systems address these issues?
The term “dialogue” • From Merriam-Webster • a: a conversation between two or more persons; also : a similar exchange between a person and something else (as a computer) • The term “conversation”: a (1) : oral exchange of sentiments, observations, opinions, or ideas • b: an exchange of ideas and opinions • c: a discussion between representatives of parties to a conflict that is aimed at resolution
A Dialogue could be…… • Two parties or more (Let’s call them John and Mary) • John and Mary talk about politics • Goal: an exchange of ideas about something • John and Mary talk about how to solve a problem. • Goal: try to solve a problem • John and Mary talk about nothing. “A random chitchat” • (Actually very common in human dialogue) • Goal? : an exchange of sentiment?, make themselves feel better by talking? or even …… there is no goal?
Dialogue Systems • Systems that • Allow exchange between human and computer • Simplistic point of view • 3 components • Input <- Could be spoken input, keyboard typing, gesture, facial expression, sign language, etc. • Control Unit <- Processes the input and generates an output. • Output <- Could be spoken output, rendered animation
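The three-component view can be sketched in a few lines of Python. This is only an illustration of the input / control unit / output split: the function names (`read_input`, `control_unit`, `write_output`, `dialogue_turn`) and the placeholder logic are assumptions, not taken from any of the papers.

```python
def read_input(raw: str) -> str:
    """Input component: plain typed text here, with whitespace normalized."""
    return " ".join(raw.split())


def control_unit(utterance: str) -> str:
    """Control unit: decide what to say back (trivial placeholder logic)."""
    if not utterance:
        return "Sorry, I did not catch that."
    return f"You said: {utterance}"


def write_output(reply: str) -> str:
    """Output component: text here; a spoken system would synthesize speech."""
    return f"S: {reply}"


def dialogue_turn(raw: str) -> str:
    """One exchange: input -> control unit -> output."""
    return write_output(control_unit(read_input(raw)))
```

Each component could be swapped independently, e.g. a speech recognizer in place of `read_input`, without touching the other two.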
Outline of this talk: • Focus on Spoken Dialogue Systems (SDS) • Presenting 3 papers: • Paper 1: (13 pages) • “Steps toward graceful interaction in spoken and written man-machine communication” Philip J. Hayes and D. Raj Reddy • Written in 1983 • Discusses in detail the issues in dialogue system research • Also outlines many interesting open issues in the field
Outline of this talk (cont.): • 2 systems are selected • Both are representative • Both are trying to solve real problems • Both are quite recent (written in 1998 and 2001) • Paper 2: TRIPS project (previously TRAINS) by CISD • “Toward Conversational Human-Computer Interaction” by James Allen et al. (7 pages) • Paper 3: CMU Communicator • “Creating Natural Dialogs in the Carnegie Mellon Communicator System” by A. I. Rudnicky et al. (5 pages)
Some Perspectives • Did recent systems solve the issues Hayes and Reddy raised? • Have new issues emerged in recent years? • The system architectures of the two systems are different; does it matter?
Paper 1: “Steps toward graceful interaction in spoken and written man-machine communication”
About Paper I • Written in 1983 • Most authors of the systems it refers to later became professors • Computation was limited at the time • Mostly discussion • “U” will mean user, “S” will mean system
Graceful Interaction • Graceful interaction • “……involve dealing appropriately with anything a user happens to say……” • Proposed components for graceful interaction: • Robust Communication • Flexible Parsing • Domain Knowledge • Explanation Facilities • Focus Mechanisms • Identification from Descriptions • Generation of Descriptions
Robust Communication • Sometimes even humans misunderstand others. • U: “Hello! Are you there?” • Implicit Confirmation, e.g. • The speaker assumes that the information was received correctly unless the listener states otherwise, e.g. S: <nothing> • Implicit Acknowledgement, e.g. • S: “Yes! Can you hear me?” • Explicit Indication of Incomprehension • S: “What did you say?” • Echo: S: “Aha.” • Fragmentary Recognition: • S: (If “Hello” is recognized), “Hi.” • S: (If “Are you there?” is recognized), “Yes.”
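The fragmentary-recognition and incomprehension cases above can be sketched as a small response selector. This is a minimal sketch, assuming the recognizer hands over a list of the fragments it managed to pick out; the `respond` function and its reply table are hypothetical.

```python
def respond(recognized: list[str]) -> str:
    """Pick a reply from whatever fragments were recognized.

    `recognized` is assumed to hold lowercased fragments from a recognizer.
    """
    replies = {"hello": "Hi.", "are you there?": "Yes."}
    answers = [replies[fragment] for fragment in recognized if fragment in replies]
    if not answers:
        # Nothing usable was recognized: explicit indication of incomprehension.
        return "What did you say?"
    # Answer every fragment that was understood, in the order it was heard.
    return " ".join(answers)
```

For example, `respond(["hello"])` yields the reply to the greeting alone, while an empty list falls back to asking the user to repeat.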
Flexible Parsing • Human conversation • Usage of Idioms • “Phrase whose interpretation cannot be obtained by using the components of the phrase in the usual way” • Fragmentary utterance, e.g. • Works when “give me” and “the number for Joe Smith” are recognized out of • “Would you be so kind as to give me your listing of the number for Joe Smith?” • Fails on “I asked you to give me the number for Joe Smith, but I meant Fred.”
Flexible Parsing (cont.) • Omissions, repetitions and noise phrases, e.g. • “What is er could er you g…give me the number er the extension for Joe Smith?” • Grammatical Errors • E.g. Just listen to Arthur Chan • Ellipsis • Omission of words from a sentence that can still be obviously understood. • E.g. • U: What is the number for Mr. Smith? • S: Do you mean Joe Smith or Fred Smith? • U: Joe. (Instead of “I mean Joe Smith”)
Flexible Parsing (cont.) • Standard parsers • Fail easily on repetitions and omissions • A pattern matcher could handle idioms • No easy solution for ellipsis
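To make the fragment-spotting idea concrete, here is a rough pattern-matching sketch using Python's `re` module. The filler list, the request pattern and the `spot_request` name are assumptions chosen for the directory-assistance examples; this is not the parser Hayes and Reddy describe.

```python
import re
from typing import Optional

# Noise words to discard before matching (an assumed, incomplete list).
FILLERS = re.compile(r"\b(er|um|uh)\b", re.IGNORECASE)

# Spot "give me ... the number/extension for <Name>" anywhere in the utterance.
REQUEST = re.compile(
    r"give me\b.*\bthe (?:number|extension) for "
    r"(?P<name>[A-Z][a-z]+(?: [A-Z][a-z]+)?)"
)


def spot_request(utterance: str) -> Optional[str]:
    """Return the person asked about, or None if no request fragment is found."""
    cleaned = " ".join(FILLERS.sub(" ", utterance).split())
    match = REQUEST.search(cleaned)
    return match.group("name") if match else None
```

The sketch copes with politeness padding and filler words, but on “I asked you to give me the number for Joe Smith, but I meant Fred.” it still returns Joe Smith, which is exactly the failure case noted on the earlier slide.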
Domain Knowledge • “Simple Service” • The customer or client identifies certain entities • The entities could be regarded as parameters. • Frame-based system (Minsky 75) • Frames: • A method of knowledge representation • A frame is a representation of one entity in terms of the entities that make up its parts. • “……Frames have already been used successfully in systems ……”
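A frame in this sense can be sketched as a named set of slots whose values may themselves be frames. The `Frame` class, its methods and the slot names below are illustrative assumptions, not the representation used in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One entity, described by the entities (slots) that make up its parts."""
    name: str
    slots: dict[str, object] = field(default_factory=dict)

    def fill(self, slot: str, value: object) -> None:
        """Set a slot value; unknown slot names are rejected."""
        if slot not in self.slots:
            raise KeyError(slot)
        self.slots[slot] = value

    def missing(self) -> list[str]:
        """Slots still unfilled, i.e. what the system would need to ask about."""
        return [slot for slot, value in self.slots.items() if value is None]


# A "simple service" request whose parameters are the frame's slots.
person = Frame("person", {"first_name": None, "last_name": None})
request = Frame("number_request", {"person": person, "number_type": None})
person.fill("last_name", "Smith")
```

After the last line, `person.missing()` reports only `first_name`, which is the parameter the system would still have to obtain from the user.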
Explanation Facilities • Questions about ability: indirect speech acts • “Can you swim?” (Interpret literally) • “Can you open the window?” (Request of an action) • Questions about ability in a restricted domain • “Can you tell me the number of Joe Smith?” • Event Question • “What did you just say?” • “Did you just ask for my name?” • Hypothetical Question • “If …… , what will happen?”
Goals and Focus • Human conversation is goal-oriented • (?) • Goals in spoken dialogue systems could be simplified • 1, “it has no independent goals of its own, its only goal is to help the user fulfill his goals.” • 2a, “the user’s goals are either to avail himself of the system’s highly limited services, • or 2b, fall into an undistinguished class for which the system is unable to help the user.”
Goals and Focus (cont.) • Goals could be divided into subgoals • Focus is • “An extension of focus by equating it with the currently active subgoal”
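One way to read “focus is the currently active subgoal” is as a walk down a goal tree to the first unfinished leaf. The sketch below assumes subgoals are pursued in order, depth first; the `Goal` class and `focus` function are hypothetical names for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Goal:
    name: str
    subgoals: list[Goal] = field(default_factory=list)
    done: bool = False


def focus(goal: Goal) -> Optional[Goal]:
    """Return the currently active (sub)goal, or None if the goal is finished."""
    if goal.done:
        return None
    for sub in goal.subgoals:
        active = focus(sub)
        if active is not None:
            return active  # first unfinished subgoal, searched depth first
    return goal  # no unfinished subgoals left: the goal itself is in focus
```

With a goal such as “get number” split into “identify person” and “read out number”, the focus starts on the first subgoal and moves on once that subgoal is marked done.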
Identification from Descriptions • The identification capability of • “a listener to use a speaker’s description of a previously memorized entity to identify an object.” • Systems need to deal with • Ambiguous Descriptions • U: “What is the number for Smith?” • The user may change the description when asked to disambiguate, e.g. “What is the number of Smythe?” • Unsatisfiable Descriptions, possible response: • S: “There is no listing for Smith.” • Description and Faulty Comprehension • U: “What is the number for <GARBLE> Smith?” • S: “Did you say Jim Smith or Joe Smith?”
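The ambiguous and unsatisfiable cases can be sketched as a lookup that counts how many entries fit the description. The directory contents, the extension numbers and the `identify` function are made up for illustration.

```python
# Hypothetical directory; names and extensions are invented for this sketch.
DIRECTORY = {"Joe Smith": "x1234", "Fred Smith": "x5678", "Ann Jones": "x2468"}


def identify(description: str, directory: dict[str, str] = DIRECTORY) -> str:
    """Resolve a description to one entry, or produce a fitting response."""
    words = description.lower().split()
    matches = [
        name for name in directory
        if all(word in name.lower().split() for word in words)
    ]
    if not matches:
        # Unsatisfiable description: nothing in the directory fits.
        return f"There is no listing for {description}."
    if len(matches) > 1:
        # Ambiguous description: ask the user to choose.
        return "Do you mean " + " or ".join(matches) + "?"
    return f"The number for {matches[0]} is {directory[matches[0]]}."
```

A faulty-comprehension case (the garbled first name) would reduce to the ambiguous case here, since only “Smith” would reach `identify`.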
Language Generation • Different ways of saying the same thing could carry different meanings: • E.g. S: “Do you mean Jim Smith or Fred Smith?” • U: “Jim Smith.” or “I mean Jim Smith.” • E.g. A restaurant system was unsure about its recognition of “seven” in • U: “I’d like a reservation for seven people” • S: “What time would your party of seven like to eat?” • (This provides implicit confirmation.)
Language Generation (cont.) • Sometimes the system’s response could change the user’s response • U: “I would like the extension of Mr. Smith” • E.g. if the system doesn’t understand “extension” • (Appropriate) “Do you mean Joe Smith or Jim Smith?” • (Less appropriate) “What is the meaning of ‘extension’?” • Knowledge of standard transformation and conformations plays a role • Systems should understand different ways of saying the same thing • S: “Would you prefer 7 o’clock or 8 o’clock?” • U: (Acceptable) “I prefer 7 o’clock” • U: (Should also be acceptable(?)) “7 o’clock is my preference”
Summary of Paper I • Graceful interaction requires systems to behave more intelligently than a simple input/output system • 7 components are discussed. • Further reading: • “Natural Language Understanding” by James Allen.
Paper II: “Toward Conversational Human-Computer Interaction”
About Paper II • Mainly about conversational human-computer interaction. • The Rochester Interactive Planning System (TRIPS) • Described as a “practical dialogue system” • We are now back in modern times……
How the authors see SDS • About the System • Not to “… engage you in a dialogue” • But to “……enhances the richness of dialogue” • About Spoken User Interfaces • Could be as effective as a GUI • If viewed as mixed-initiative dialogue, • man-machine interaction can be modeled after human collaborative problem solving
Dialogue Task Complexity • Finite-state Script (Least complicated) • Example: Long Distance Dialing • Dialogue Phenomenon handled: • User answers questions • Frame-based • Example: Getting train arrival and departure information • Dialogue Phenomenon handled: • User asks questions, simple clarification by system
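The least complicated level, a finite-state script in which the user only answers questions, can be sketched as a fixed table of prompts and transitions. The prompts, state names and `run_script` function are assumptions loosely modeled on the long-distance dialing example.

```python
# Each state maps to (prompt, next state); the order of questions is fixed.
SCRIPT = {
    "ask_area_code": ("What area code?", "ask_number"),
    "ask_number": ("What number?", "confirm"),
    "confirm": ("Shall I dial?", "done"),
}


def run_script(answers: list[str]) -> list[tuple[str, str]]:
    """Walk the script, pairing each system prompt with the user's answer."""
    remaining = iter(answers)
    state = "ask_area_code"
    transcript = []
    while state != "done":
        prompt, next_state = SCRIPT[state]
        transcript.append((prompt, next(remaining)))
        state = next_state
    return transcript
```

The user cannot ask questions or change topic in such a script, which is what separates it from the frame-based and later levels.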
Dialogue Task Complexity (cont.) • Sets of Contexts • Example: Travel Booking Agent • Dialogue Phenomenon handled: • Shift between predetermined topics • Plan-based Models • Example: Kitchen design consultant • Dialogue Phenomenon handled: • Dynamically generated topic structures, collaborative negotiation subdialogues
Dialogue Task Complexity (cont.) • Agent-based Task • Example: Disaster Relief Task • Dialogue Phenomenon handled: • A dynamically changing world • Different modalities involved • TRIPS focused on • “…… primarily interested in design of the last two-levels of dialogue systems ……”
Hypothesis of Dialogue Systems • The Practical Dialogue Hypothesis • “The conversational competence required for practical dialogues, while still complex, is significantly simpler to achieve than general human conversational competence.” • The Domain-Independence Hypothesis • “Within the genre of practical dialogue, the bulk of the complexity in the language interpretation and dialogue management is independent of the task being performed.”
Four challenges mentioned • Parsing Language in Practical Dialogues • Integrating Dialogue and Task Performance • Intention Recognition • Mixed-Initiative Dialogue
Summary of Paper II • Presents a more detailed point of view on identifying dialogue complexity. • New challenges • System architecture becomes important when agents need to work with each other. • Intention Recognition
Further Reading • CISD web page: • http://www.cs.rochester.edu/research/cisd/ • Further Technical Detail of TRIPS • “An Architecture for a Generic Dialogue Shell” • TRAINS • “The Design and Implementation of the TRAINS-96 System: A Prototype Mixed-Initiative Planning Assistant”
Paper III: “Creating Natural Dialogs in the Carnegie Mellon Communicator System”
About Paper III: CMU Communicator • DARPA Communicator • Travel Booking application • Participants: MITRE, CSLU, BBN, CMU, SRI (not exhaustive) • Several open systems were created. • MITRE GalaxyCommunicator • CMU Communicator. • From Paper II, it is a “set of contexts” application.
How the authors see the task • The travel-planning domain is interesting because • “……the sequence of interactions …. is not easily reduced to a fixed sequence of steps ….” • “simple form-based approaches (e.g., [6]) are difficult to adapt to this domain” because • “… structure of form could be unpredictable.” • The user’s goal could easily change.
Task-based Dialog Management • Task • Successful completion of a task: • Two parties agree on a particular result (e.g. itinerary) • Some understanding of how to complete a task • A representation for the domain-specific information • And a representation that captures the structure of the activity
Products and Schema • A product • Holds the result of the interaction • A schema • Describes how the elements of the product can be discussed in the interaction
An itinerary • An itinerary • A hierarchical structure • Essentially a “dynamically constructed form” • The tree structure allows inheritance of information. • Creating an itinerary • Composition of the structure • Population of the structure with trip-specific information
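The idea of a tree-shaped itinerary whose nodes inherit information from their ancestors can be sketched as follows. The `Node` class, its fields and the sample trip are hypothetical; the actual CMU Communicator data structures are not described at this level on the slides.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One element of the itinerary tree (trip, leg, flight, ...)."""
    label: str
    info: dict[str, str] = field(default_factory=dict)
    parent: Optional[Node] = field(default=None, repr=False, compare=False)
    children: list[Node] = field(default_factory=list)

    def add(self, label: str, **info: str) -> Node:
        """Composition: grow the structure by attaching a child node."""
        child = Node(label, dict(info), parent=self)
        self.children.append(child)
        return child

    def lookup(self, key: str) -> Optional[str]:
        """Inheritance: search this node, then its ancestors, for a value."""
        node: Optional[Node] = self
        while node is not None:
            if key in node.info:
                return node.info[key]
            node = node.parent
        return None


# Population: fill the structure with trip-specific information.
trip = Node("itinerary", {"traveler": "J. Doe"})
leg = trip.add("leg 1", destination="Boston")
flight = leg.add("flight", airline="any")
```

Here `flight.lookup("traveler")` finds the value stored at the root, so information given once at the trip level does not have to be repeated for every leg.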
System Architecture • (Figure not reproduced; note: from MITRE Communicator)
Summary of Paper III • More stress on how the system could be implemented • Further info • MITRE Communicator • http://communicator.sourceforge.net/ • CMU Communicator • http://www.speech.cs.cmu.edu/Communicator/ • CU Communicator
Conclusion • Generally about SDS • There are still a lot of challenges in dialogue systems • Current practical systems work on limited domains • Practical systems require higher complexity • System architecture becomes important because different agents will need to work with each other • Hayes paper: • Some issues could be greatly simplified in practice • Robust parsing, handling of ellipsis • Some issues may not be appreciated as much • Dialogue management.