Conversational Agents in Multi-Party Situations
Rohit Kumar, Language Technologies Institute, Carnegie Mellon University
Wednesday, August 19, 2009: Speaking Requirement Talk
Multi-Party Situations
• More than one human user
  • Who am I interacting with?
• The task situates the interaction
  • What are they interacting about?
  • May not always be an “open-world context”
  • However, unlike dialog systems for information access, multi-party situations make the scope of conversation during tasks quite broad
• The environment dictates the modalities
  • How are we going to interact? (Voice, Text, Virtual Worlds, …)
• Examples (role of agents):
  • Collaborative Learning (Tutor, Peer, …)
  • Multi-User Interactive Games (NPC: Peer, Prop; e.g. Coach in Sims)
  • Computer Supported Communication (Assistant, Translator, …)
• Related terms: Situated Interaction, Open-World Dialog, Mixed-Initiative, …
Outline
• Multi-Party Situations
• Two Assumptions in Dialog Systems
• Generalized Process of Conversation
• Basilica: Architecture
• Agents built using Basilica
  • CycleTalk, Wrench Lab, PsychTalk, 911
• Specific Challenges & Directions
Two Assumptions in Dialog Systems
(1) Participation
• Each participant participates (almost) evenly
• Some recent work addresses local failures caused by this assumption (Raux 2008)
• In multi-party situations, this assumption does not hold
[Figure: turns taken by Users and System]
Two Assumptions in Dialog Systems
(2) Direction of Turns
• All user turns are directed at the Agent
• This assumption fails, but rarely
  • Example: a user talking to a friend while calling Let’s Go
• In multi-party situations, this assumption usually does not hold
[Figure: example exchange between Users and System]
  System: What can I do for you?
  User: I want to go to the Airport.
  System: To the Airport. Where are you leaving from?
  User: yo, where are we at?
  System: I am sorry. I didn’t get that. Where are you leaving from?
  Background: Uh.. I donno man, I guess it’s Squirrel Hill …
Generalized Process of Conversation
• Model the Agent as a composition of behaviors
  • Similar to architectures like Jaspis, Rime
• Transfer function
  • Modeled as a network of behavioral components
[Figure: Environment ($E$) and Agent ($A$). Legend: $c_i$ = Components, $b_{ij}$ = Behavior, $s_i$ = Selection fn, $e_j$ = Event, $\{(e_j, t)\}$ = Set of Events, $T$ = Transfer Function]
Generalized Process of Conversation
• Agent: $A = (\{c_i\}, T, \pi)$
• Component: $c_i = (\{b_{ij}\}, s_i)$
• Simplified: $T(c) \rightarrow \{c'\}$
  • Network that connects components
  • $T(c, type_e) \rightarrow \{c'\}$ …
• Selection function: $s_i(type_e) \rightarrow b_{ij}$
  • (Simplified) Maps events to behaviors
• Event: $e = (data, type, c_{sender}, time)$
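The definitions above can be sketched in Java. This is a minimal illustration under stated assumptions, not Basilica's actual code: every name here (`ModelSketch`, `Event`, `Component`, `Agent`, `connect`, `deliver`, `demoAgent`) is hypothetical, the selection function is reduced to a lookup from event type to behavior, and T is an adjacency map between component names. The third element of the agent tuple, π, is not modeled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of the formal model above; not the Basilica API.
class ModelSketch {

    // Event (e) = (data, type, sender, time)
    static final class Event {
        final String data, type, sender;
        final long time;
        Event(String data, String type, String sender, long time) {
            this.data = data; this.type = type; this.sender = sender; this.time = time;
        }
    }

    // Component c_i = ({b_ij}, s_i); s_i is simplified to a type -> behavior lookup.
    static final class Component {
        final String name;
        final Map<String, Function<Event, List<Event>>> behaviors = new HashMap<>();
        Component(String name) { this.name = name; }

        // Event types with no registered behavior produce no output.
        List<Event> process(Event e) {
            Function<Event, List<Event>> b = behaviors.get(e.type);
            return b == null ? Collections.<Event>emptyList() : b.apply(e);
        }
    }

    // Agent A = ({c_i}, T) with the simplified transfer function T(c) -> {c'}.
    static final class Agent {
        final Map<String, Component> components = new HashMap<>();
        final Map<String, List<String>> transfer = new HashMap<>();
        final List<Event> output = new ArrayList<>();

        void add(Component c) { components.put(c.name, c); }
        void connect(String from, String to) {
            transfer.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        }

        // Deliver an event to one component, then propagate what it emits along T.
        // Assumption: events emitted by a component with no outgoing connection
        // count as the agent's output to the environment.
        void deliver(String target, Event e) {
            Component c = components.get(target);
            List<String> next = transfer.getOrDefault(target, Collections.<String>emptyList());
            for (Event out : c.process(e)) {
                if (next.isEmpty()) output.add(out);
                for (String n : next) deliver(n, out);
            }
        }
    }

    // Listener -> detector -> actor chain, used only for illustration.
    static Agent demoAgent() {
        Component listener = new Component("listener");
        listener.behaviors.put("message", e -> Collections.singletonList(e));
        Component detector = new Component("detector");
        detector.behaviors.put("message", e -> e.data.toLowerCase().contains("help")
                ? Collections.singletonList(new Event(e.data, "help_request", "detector", e.time))
                : Collections.<Event>emptyList());
        Component actor = new Component("actor");
        actor.behaviors.put("help_request", e ->
                Collections.singletonList(new Event("Let's work on that.", "tutor_turn", "actor", e.time)));

        Agent agent = new Agent();
        agent.add(listener); agent.add(detector); agent.add(actor);
        agent.connect("listener", "detector");
        agent.connect("detector", "actor");
        return agent;
    }

    public static void main(String[] args) {
        Agent agent = demoAgent();
        agent.deliver("listener", new Event("help with cycles", "message", "env", 0L));
        agent.deliver("listener", new Event("nice weather", "message", "env", 1L));
        System.out.println(agent.output.size() + " event(s) sent to the environment");
    }
}
```

In this sketch the second message reaches the detector but selects no output, so it never reaches the actor. A single incoming event can therefore yield zero, one or several outgoing events depending on how the network is wired.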
Generalized Process of Conversation
• Does not make the two assumptions discussed earlier
  • Uneven participation by the Agent: a message from the environment, after propagating through the component network, may generate zero, one or many messages
  • Messages not directed at the Agent can be made to dissipate quickly
  • A collection of components can act as filters on signals from the Environment
Generalized Process of Conversation
• Note: This doesn’t automatically solve the problem of “Is it directed at me?” or “Should I participate now?”
  • It lets us design agents flexibly enough that we do not have to make those assumptions
• Basilica: software architecture that helps us build agents based on this theoretical construct
Basilica: Architecture
• Event-driven architecture for building Conversational Agents
• Provides
  • { operation, agent, component, connection } management
  • event propagation mechanism (sync & async modes)
  • generic components (memory)
  • growing set of re-usable components
  • observing / debugging interfaces
  • utilities: logging, timers
• Major component classes: Filters & Actors
• Analogy: Java Swing for GUI development
  • Cannot automatically
    • Build GUIs
    • Figure out what to do when an event happens
  • The designer/developer still needs to do that, as it is largely dependent on the task
Basilica: Architecture
• Agents built on this architecture must programmatically specify / define
  • Components, i.e. $\{ (\{b_{ij}\}, s_i) \}$
  • Network ($T$) (can be specified in XML)
  • Events, i.e. (Data, Type)
• Illustrated ahead using agents built on this architecture
Basilica: Component Specification
• Example Selection Function $s_i(type_e) \rightarrow b_{ij}$

    // Selection function: route each event to the behavior for its type.
    protected void processEvent(Event e) {
        if (e instanceof MessageEvent) {
            handleMessageEvent((MessageEvent) e);
        }
    }

• Example Behavior $b_{ij}$: adds semantic parse information to a message

    private void handleMessageEvent(MessageEvent me) {
        Message911 m = me.getMessage();
        myParser = new Parser();
        // Pre-process the message body, parse it, and attach the parse to the message.
        String parse = myParser.parse(myParser.preProcess(m.getBody()));
        m.setParse(parse);
        // Wrap the annotated message in a new event and broadcast it.
        MessageEvent newme = new MessageEvent(this);
        newme.setMessage(m);
        this.broadcast(newme);
    }
Basilica: Network Specification
• Example Network Specification:

    <connections>
        <connection from="myXMPPListener" to="myPresenceFilter"/>
        <connection from="myXMPPListener" to="myMessage911Filter"/>
        <connection from="myMessage911Filter" to="myNLU"/>
        <connection from="myMessage911Filter" to="myRequestTypeFilter"/>
        <connection from="myNLU" to="myDAClassifier"/>
        <connection from="myNLU" to="myParser"/>
        <connection from="myDAClassifier" to="myNLU"/>
        <connection from="myParser" to="myNLU"/>
        <connection from="myNLU" to="myInformDispatcherActor"/>
        <connection from="myGreetingActor" to="myPromptCallerActor"/>
        <connection from="myInformDispatcherActor" to="myXMPPActor"/>
        <connection from="myPromptCallerActor" to="myXMPPActor"/>
        …
    </connections>
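One way to turn a specification like this into the transfer function T is to read the `connection` elements into an adjacency map. The sketch below uses only the JDK's standard DOM parser. The class name `NetworkSpecSketch` and the method `load` are hypothetical, and how Basilica itself reads the file is not shown in the slides.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical loader; Basilica's own XML handling is not shown in the slides.
class NetworkSpecSketch {

    // Reads <connection from="..." to="..."/> elements into T(c) -> {c'}.
    static Map<String, List<String>> load(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalArgumentException("Malformed network specification", ex);
        }
        NodeList nodes = doc.getElementsByTagName("connection");
        // LinkedHashMap keeps components in the order they first appear as a source.
        Map<String, List<String>> transfer = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element c = (Element) nodes.item(i);
            transfer.computeIfAbsent(c.getAttribute("from"), k -> new ArrayList<>())
                    .add(c.getAttribute("to"));
        }
        return transfer;
    }

    public static void main(String[] args) {
        String xml = "<connections>"
                + "<connection from=\"myXMPPListener\" to=\"myPresenceFilter\"/>"
                + "<connection from=\"myXMPPListener\" to=\"myMessage911Filter\"/>"
                + "<connection from=\"myMessage911Filter\" to=\"myNLU\"/>"
                + "</connections>";
        System.out.println(load(xml));
    }
}
```

Keeping the network in a declarative file means the wiring can change without recompiling components, which fits the slide's point that T can be specified in XML.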
Basilica: Event Definition
• Example Event Definition
  • Data is encapsulated in events

    public class MessageEvent extends Event {

        // Payload carried by this event.
        Message911 myMessage;

        public MessageEvent(Component s) {
            super(s);
        }

        public Message911 getMessage() {
            return myMessage;
        }

        public void setMessage(Message911 m) {
            myMessage = m;
        }
    }
CycleTalk Agent
• Situation: The agent helps sophomore Mechanical Engineering students learn about Rankine cycles while they work together on designing an engine based on that cycle
• Environment: Concert Chat
  • A multi-party chat room environment with collaboration tools like whiteboards, screenshot sharing, wiki, etc.
  • Used by Virtual Math Teams (of MathForum), a service that reaches a million students in the US
• Agent Role: Tutor
• Basilica Specs: 13 components, 14 event types
• Observable Behaviors: Hinting, Tutoring, Attention Grabbing
CycleTalk Agent: Graphical Spec.
[Figure: component network of Filters and Actors. Components shown: Concert Chat Listener, Presence Filter, Message Filter, Request Detector, Tutoring Filter, Attention Grabbing Filter, Hinting Filter, Turn Taking Coordinator, Tutoring Actor, Prompting Actor, Attention Grabbing Actor, Hinting Actor, Concert Chat Actor. Annotations: Signal from the Environment; “Help with…” detected; Starts Tutoring; Tutor turn sent to Environment]
CycleTalk Agent
• Our first agent based on this architecture
  • The architecture partly evolved based on needs from this project
  • Many versions of this agent have been built
• Lessons Learnt: Filter-type components have sub-categories
  • Listeners (Perceptors): listen to the environment / modalities
  • Annotators: as events pass through these, they get annotated with additional information
  • Detectors: events get translated into other types of events based on certain characteristics
  • Coordinators: coordinate component(s) by timing / absorbing / synchronizing events
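The filter sub-categories listed above differ mainly in what they do to an event on its way through. The following is a rough Java sketch of three of them. All names (`FilterKindsSketch`, `annotate`, `detectHelpRequest`, `TurnTakingCoordinator`, `requestFloor`) are hypothetical, and the floor-holding policy of the coordinator is an assumption, since the slides only name the component.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of three filter sub-categories; names are illustrative only.
class FilterKindsSketch {

    static final class Event {
        final String type;
        final String text;
        final Map<String, String> annotations = new HashMap<>();
        Event(String type, String text) { this.type = type; this.text = text; }
    }

    // Annotator: passes the event on with extra information attached.
    static Event annotate(Event e) {
        int tokens = e.text.trim().split("\\s+").length;
        e.annotations.put("tokens", String.valueOf(tokens));
        return e;
    }

    // Detector: translates a matching event into another event type,
    // and emits nothing when the event does not match.
    static List<Event> detectHelpRequest(Event e) {
        List<Event> out = new ArrayList<>();
        if (e.text.toLowerCase().startsWith("help with")) {
            out.add(new Event("help_request", e.text));
        }
        return out;
    }

    // Coordinator: absorbs requests that arrive while another actor holds the floor.
    // Assumption: one actor speaks at a time until it releases the floor.
    static final class TurnTakingCoordinator {
        private String floorHolder; // null when no actor is speaking

        boolean requestFloor(String actor) {
            if (floorHolder != null && !floorHolder.equals(actor)) {
                return false; // request absorbed
            }
            floorHolder = actor;
            return true;
        }

        void releaseFloor(String actor) {
            if (actor.equals(floorHolder)) floorHolder = null;
        }
    }

    public static void main(String[] args) {
        Event e = annotate(new Event("message", "Help with entropy"));
        System.out.println(e.annotations + " " + detectHelpRequest(e).size());
        TurnTakingCoordinator c = new TurnTakingCoordinator();
        System.out.println(c.requestFloor("tutoring") + " " + c.requestFloor("hinting"));
    }
}
```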
CycleTalk Agent in Second Life
• Experimental prototype
• Same Situation/Task as the CycleTalk agent
• Environment: Second Life (Multi-User Virtual Environment with Embodied Users)
• Lessons Learnt:
  • Most components were re-used
  • Integration with different environments (with similar modalities) is possible by replacing/adding Environment Listener and Actor components
  • Integration with Second Life uses a generic HTTP middleware, which is now also used to integrate Basilica agents with other prototype web-based applications
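The lesson about replacing only the Environment Listener and Actor can be pictured as programming the rest of the agent against two small interfaces. The sketch below is an assumption about how such a boundary might look, not Basilica's design: `EnvironmentListener`, `EnvironmentActor`, `wireTutor` and `FakeEnvironment` are all hypothetical names, and `FakeEnvironment` stands in for a chat room or an HTTP middleware.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; interface and class names are not taken from Basilica.
class EnvironmentSwapSketch {

    // The two environment-facing roles.
    interface EnvironmentListener { void onMessage(Consumer<String> handler); }
    interface EnvironmentActor { void send(String text); }

    // Environment-independent part of the agent, wired to whichever pair is supplied.
    static void wireTutor(EnvironmentListener in, EnvironmentActor out) {
        in.onMessage(text -> {
            if (text.toLowerCase().startsWith("help with")) {
                String topic = text.substring("help with".length()).trim();
                out.send("Let's talk about " + topic + ".");
            }
        });
    }

    // Stand-in for a chat room or an HTTP middleware; it only records what is sent.
    static final class FakeEnvironment implements EnvironmentListener, EnvironmentActor {
        final String channel;
        final List<String> sent = new ArrayList<>();
        private Consumer<String> handler = text -> { };

        FakeEnvironment(String channel) { this.channel = channel; }

        @Override public void onMessage(Consumer<String> handler) { this.handler = handler; }
        @Override public void send(String text) { sent.add("[" + channel + "] " + text); }

        // Simulates a message arriving from the environment.
        void receive(String text) { handler.accept(text); }
    }

    public static void main(String[] args) {
        FakeEnvironment chat = new FakeEnvironment("chat");
        FakeEnvironment middleware = new FakeEnvironment("http");
        wireTutor(chat, chat);             // same agent logic ...
        wireTutor(middleware, middleware); // ... attached to a different environment
        chat.receive("Help with reheat");
        middleware.receive("Help with reheat");
        System.out.println(chat.sent + " " + middleware.sent);
    }
}
```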
CycleTalk Agent in SL: Graphical Spec.
[Figure: component network. New components: Middle Ware Listener, Middle Ware Actor. Unchanged components: Presence Filter, Message Filter, Request Detector, Tutoring Filter, Attention Grabbing Filter, Hinting Filter, Turn Taking Coordinator, Tutoring Actor, Prompting Actor, Attention Grabbing Actor, Hinting Actor]
Wrench Lab Agent
• Situation: Teams (3+) of freshman mechanical engineering students designing a wrench
• Environment: Concert Chat (same as CycleTalk)
• Agent Role: Tutor
• Basilica Spec: 9 components, 8 events
• Observable Behaviors: Tutoring, Task Instruction
Wrench Lab Agent
• The first prototype was built in Spring 2009 to collect data from teams of three or more students performing a learning task while a tutor occasionally tries to engage them in instructive dialog
• Lessons Learnt
  • Revisiting the Basilica architecture for this agent after my internship made me realize that the architecture had many software issues
    • Too many dependencies on some global classes
    • Components were not autonomous units; instead, they were tightly coupled with each other
    • Event logging was poor
Wrench Lab Agent: Graphical Spec.
[Figure: component network of Filters and Actors. Components shown: CC Listener, CC Text Filter, Launch Filter, Prompting Filter, Tutoring Filter, Turn Taking Coordinator, Prompting Actor, Tutoring Actor, CC Actor]
PsychTalk Agent
• Situation: College students learning Psychology vocabulary by playing a vocabulary game (Taboo)
  • Students can play with each other and with agents
  • To be delivered through the portal of the most widely used Psychology textbook in the US
• Environment: A specialized web portal developed for this project, which provides a chat facility between participants and the ability to play the game
• Agent Role: Peer / Player
• Basilica Spec: 10 components, 12 event types
• Observable Behaviors: Hinting, Guessing
PsychTalk Agent: Graphical Spec.
[Figure: component network of Filters and Actors. Components shown: Middle Ware Listener, Role Filter, Status Filter, Card Memory, Score Memory, Greeting Actor, Role Actor, Hinting Actor, Guessing Actor, Middle Ware Actor]
PsychTalk Agent
• Lessons Learnt
  • Memory Component
    • Need for a component type that can serve as memory accessible to other components
    • Memory components store histories of a data type, e.g. a Taboo Card (a game prop)
    • A faster (synchronous) commit/retrieval mechanism was implemented for Memory components, instead of the usual mechanism of sending an event and waiting for a response event
  • Synchronous mode of Agent operation
    • In web environments we cannot afford to run a highly multi-threaded agent; doing so would require running the agent operation as a separate server
    • Synchronous mode allows agents to run on one thread
    • In this mode, components are not autonomous and events get processed synchronously
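The two lessons above can be illustrated together. In the sketch below, a memory component exposes direct commit and retrieve calls, and a single-threaded dispatcher processes queued work in order. The names `Memory`, `SyncDispatcher`, `post`, `drain` and `playStep` are hypothetical, and the queue-based dispatcher is an assumption about what a synchronous mode could look like rather than a description of Basilica's implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch; class and method names are not taken from Basilica.
class MemorySketch {

    // Memory component: keeps the history of one data type and lets other
    // components commit and retrieve directly, without an event round trip.
    static final class Memory<T> {
        private final List<T> history = new ArrayList<>();
        void commit(T item) { history.add(item); }
        T latest() { return history.isEmpty() ? null : history.get(history.size() - 1); }
        List<T> history() { return Collections.unmodifiableList(history); }
    }

    // Synchronous mode: pending work is queued and drained on the calling thread,
    // so the whole agent runs on a single thread.
    static final class SyncDispatcher {
        private final Queue<Runnable> pending = new ArrayDeque<>();
        void post(Runnable work) { pending.add(work); }

        // Runs queued work in order and returns how many items were processed.
        int drain() {
            int processed = 0;
            Runnable next;
            while ((next = pending.poll()) != null) {
                next.run();
                processed++;
            }
            return processed;
        }
    }

    // One illustrative game step: a card is stored, then a hint is built from memory.
    static List<String> playStep(String card) {
        Memory<String> cardMemory = new Memory<>();
        SyncDispatcher dispatcher = new SyncDispatcher();
        List<String> sent = new ArrayList<>();
        dispatcher.post(() -> {
            cardMemory.commit(card);
            // Work posted during processing is handled by the same drain loop.
            dispatcher.post(() -> sent.add("Hint for card: " + cardMemory.latest()));
        });
        dispatcher.drain();
        return sent;
    }

    public static void main(String[] args) {
        System.out.println(playStep("neuron"));
    }
}
```

The trade-off matches the slide: with one thread there is no need for locking or a separate server process, but components no longer run autonomously.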
911 Agent
• Situation: Translation agent for an English-speaking 911 dispatch operator responding to a Spanish-speaking caller reporting an emergency
• Environment: Specialized environment built for the dispatcher
• Agent Role: Interpreter
• Basilica Spec: 22 components, 9 event types
• Observable Behaviors: Prompting the caller, confirming information, extracting reports, requests and parameters (like description, location), …
911 Agent: Graphical Spec.
[Figure: component network of Filters and Actors. Components shown: XMPP Listener, Presence Filter, Message Filter, Request Type Filter, NLU, DA Classifier, Parser, Caller Turn Memory, Report Detector, Request Detector, Value Slot Detector, Generics Detector, Meta Act Filter, IQA Act Filter, Confirm Slot Filter, Meta Act Performer, IQA Act Performer, Slot Confirmer, Greeting Actor, Prompt Caller Actor, Inform Dispatcher Actor, XMPP Actor]
911 Agent
• Lessons Learnt:
  • Required a visual representation of the agent
  • Agent Observation Interface
    • To create a debugging interface
    • To create other UIs that allow online human intervention with behaviors
  • Agent Operations (help in creating Agent farms)
    • To be able to run many agents simultaneously and easily
Summary
• Basilica, an event-driven architecture, provides the flexibility to build Conversational Agents that can participate in multi-party interactive situations
• Decomposing an agent into many lightweight components allows incremental development and re-use
• Progressively fewer changes to the architecture and a less frequent need for new types of components suggest stability
• Similar design patterns observed across various agents
• Similarity to other pipeline architectures
• Having the ability to build these agents lets us focus on new, interesting problems in multi-party situations
  • Awareness, Social Presence, Long-term Interaction, Novel Behavior, …
Done: Thanks for tuning in. Most questions are good questions. So, please ask.
Publications
• Rohit Kumar, Carolyn P. Rose, "Building Conversational Agents with Basilica", NAACL 2009 (Demo)
• Sourish Chaudhuri, Rohit Kumar, Iris Howley, Carolyn P. Rose, "Engaging Collaborative Learners with Helping Agents", AIED 2009
• Rohit Kumar, Sourish Chaudhuri, Iris Howley, Carolyn Penstein Rosé, "VMT-Basilica: An Environment for Rapid Prototyping of Collaborative Learning Environments with Dynamic Support", CSCL 2009
  • Nominated for Best Technology Demonstration
• Yue Cui, Rohit Kumar, Sourish Chaudhuri, Gahgene Gweon, Carolyn Rose, "Helping Agents in VMT", in "Studying Virtual Math Teams", Stahl, G., ed. (2009), New York, NY: Springer
• Baba Kofi A. Weusijana, Rohit Kumar, Carolyn P. Rose, "MultiTalker: Building Conversational Agents in Second Life using Basilica", Second Life Education Community Convention, Purple Strand: Educational Tools and Products, 2008, Tampa, FL
• Sourish Chaudhuri, Rohit Kumar, Carolyn P. Rose, "It’s not easy being green - Supporting Collaborative Green Design Learning", ITS 2008, Montreal, Canada