The Role and Identification of Dialog Acts in Online Chat
AAAI-11 Workshop on Analyzing Microtext, August 8, 2011
Tamitha Carpenter, Emi Fujioka
Stottler Henke Associates Inc.
1107 NE 45th St., Suite 310, Seattle, WA 98105
206-545-1478, FAX: 206-545-7227
tamitha@stottlerhenke.com
http://www.stottlerhenke.com
Overview
• Problem: Analyze task-supporting chat to enable situation awareness processing
• Domain: Software development
• Corpus
  • 1111 messages, collected from an IRC chat room over a 6-week period
• Approach
  • Chat-IE – a context-aware, event-driven collection of experts (see the sketch below)
  • Includes a tokenizer, POS tagger, dialog act type identifiers, and a dialog pattern matcher
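A minimal sketch of how an event-driven "collection of experts" pipeline like the one described above could be organized. The class and function names are illustrative stand-ins, not the actual Chat-IE API.

```python
# Illustrative sketch of a Chat-IE-style pipeline; names are assumptions,
# not the actual Chat-IE classes.

class ChatMessage:
    def __init__(self, speaker, text):
        self.speaker = speaker
        self.text = text
        self.annotations = {}   # each expert records its results here

def run_pipeline(message, experts):
    """Run each expert over the message in order; later experts
    (e.g., the dialog act identifiers) can read annotations added
    by earlier ones (e.g., tokens and POS tags)."""
    for expert in experts:
        expert(message)
    return message

# experts = [tokenize, pos_tag, identify_dialog_act, match_dialog_patterns]
# run_pipeline(ChatMessage("speaker1", "what defect is it?"), experts)
```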
[Figure: Chat-IE processing chat from a software development team. Context sources: Source Code, Bug Tracking, Wiki Pages. Processing stages: Context, Dialog Act Splitting/Merging, Historical Phrase Matching, Shallow Parsing, Domain Term Recognition, Fragment Tagger. Example messages and assigned dialog acts: "I've finished one task (in review now) and one review" – Action; "what defect is it?" – Wh-question; "so how do you know how to read the value if the file hasn't changed?" – Wh-question; "meeting tomorrow at noon to discuss ideas on how to do this." – Directive.]
Dialog Act Types, most common first:
statement non-opinion, statement opinion, action description (describes ongoing and completed activities), yes-no question, action directive, commit, agree/accept, other, wh-question, thanking, affirmative answer, completion (most commonly self-completion), declarative y/n question, hmm, response acknowledge, apology, appreciation, negative answer, offer, correction (e.g., Speaker1: "I'm working on defect 567" / Speaker1: "I meant 568"), hedge, maybe/accept-part, open question, reject, hold before agreement, other answer, summarize/restate, rhetorical question, conventional closing, quotation, downplayer, option or clause, self talk, abandoned, ack backchannel ("mm hmm"), attention (for messages directed at a specific person), backchannel question, conventional opening, declarative wh-question, repeat phrase, signal non-understanding, tag question
Uses
• Triage – Identify critical events mid-conversation
• Threading – Use dialog patterns to disentangle multiple interleaved conversations
• Filtering – Direct topically relevant conversations to interested users
• Extraction – Use sequences of dialog act types to structure IE rules
Dialog Act Identification (1)
• Historical Phrase Matching
  • Identify dialog act types based on past messages
    • Raw text
    • Text tagged with parts of speech
  • Uses a variation of a String B-tree for fast matching over a large corpus
  • Obtained about 60% accuracy on common dialog act types
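A small sketch of historical phrase matching, assuming a corpus of (message, dialog act) pairs is available; a plain dictionary lookup stands in for the String B-tree variant used for fast matching over a large corpus.

```python
# Sketch of historical phrase matching over a labeled corpus of past messages.
# A dict keyed by normalized text stands in for the String B-tree structure.

from collections import defaultdict, Counter

def build_index(labeled_messages):
    """labeled_messages: iterable of (text, dialog_act_type) pairs."""
    index = defaultdict(Counter)
    for text, act in labeled_messages:
        index[text.lower().strip()][act] += 1
    return index

def match(index, new_message):
    """Return the most frequent dialog act seen for this exact phrase,
    or None if the phrase was never observed before."""
    acts = index.get(new_message.lower().strip())
    return acts.most_common(1)[0][0] if acts else None

# index = build_index([("thanks", "thanking"), ("what defect is it?", "wh-question")])
# match(index, "Thanks")   -> "thanking"
```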
Dialog Act Identification (2)
• Boosted performance to nearly 90% accuracy with rules such as the following
• Example rules:
  • Wh-questions – Messages starting with wh-words (what, which, why, etc.)
  • Statement-opinion – Messages containing one of: "might", "maybe", "should", "seems", "i think", "looks like", "look like", "probably", or "i'm sure"
  • Action-directive – Messages starting with infinitive verbs
  • Action-description – Messages starting with "i", "i just", "i have", "i'm", etc., followed by a past-tense or "-ing" verb
  • Commit – Messages starting with "i will", "i'll", "i'm going to", or "i am going to". Also, messages starting with "will" followed by an infinitive verb
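The example rules above translate naturally into simple lexical checks. The sketch below is an illustrative re-creation, not the actual Chat-IE rule engine: the real rules also use POS tags (e.g., for the infinitive-verb tests), which this version approximates with suffix checks or omits, and the rule precedence is simplified.

```python
# Rough re-creation of the example rules above; verb checks are crude
# lexical stand-ins for the POS-based tests, and the action-directive
# rule (infinitive-verb start) is omitted.

WH_WORDS = ("what", "which", "why", "who", "when", "where", "how")
OPINION_CUES = ("might", "maybe", "should", "seems", "i think",
                "looks like", "look like", "probably", "i'm sure")
COMMIT_STARTS = ("i will", "i'll", "i'm going to", "i am going to")

def classify(message):
    text = message.lower().strip()
    if text.startswith(WH_WORDS):
        return "wh-question"
    if any(cue in text for cue in OPINION_CUES):
        return "statement-opinion"
    if text.startswith(COMMIT_STARTS):
        return "commit"
    if text.startswith("i") and any(tok.endswith(("ed", "ing"))
                                    for tok in text.split()):
        return "action-description"
    return "statement-non-opinion"   # fall back to the most common type

# classify("I'll check the build tonight")  -> "commit"
# classify("what defect is it?")            -> "wh-question"
```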
Dialog Patterns
• Status updates – An action-directive or wh-question, followed by any number of action-descriptions.
• Directed request with acknowledge – An attention, followed by any number of utterances, followed by a response-acknowledge from the person mentioned in the first utterance.
• Confirmed expertise (1) – An action-description followed by a thanking or a response-acknowledge (preferably mentioning the initial speaker). (First speaker demonstrated expertise.)
• Confirmed expertise (2) – A yes-no-question or wh-question followed by a describe-other. (Second speaker demonstrated expertise.)
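One way to realize such dialog patterns is to match a regular expression over the sequence of dialog act labels in a conversation. The encoding below is an illustrative choice rather than the actual Chat-IE pattern matcher, shown for the "status updates" pattern.

```python
# Sketch: encode a conversation as a space-separated string of dialog act
# labels and match dialog patterns with regular expressions.

import re

def acts_to_string(acts):
    """Join a conversation's dialog act labels into one matchable string."""
    return " ".join(acts) + " "

# "Status updates": an action-directive or wh-question,
# followed by any number of action-descriptions.
STATUS_UPDATE = re.compile(r"(action-directive|wh-question) (action-description )*")

acts = ["action-directive", "action-description", "action-description"]
print(bool(STATUS_UPDATE.fullmatch(acts_to_string(acts))))   # True
```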
Lessons Learned
• Users have very specific needs for chat analysis.
  • Filter chat dialogs and e-mail messages/threads into topics or "bins".
  • Monitor chat rooms for triggering events.
• Everything hinges on the tokenizer.
  • Users combine characters in novel ways (e.g., ?!?!, <---->, :-), etc.)
  • Domains may have special tokens (e.g., "/usr/bin/chatLogs", "65.4N").
• Partial dialogs may need to be retired without being "finished".
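As an illustration of the tokenizer point, the sketch below uses regular-expression rules that keep emoticon runs, file paths, and coordinate-like strings intact as single tokens. The patterns are examples under that assumption, not the Chat-IE tokenizer's actual rule set.

```python
# Illustrative tokenizer: special-token patterns are tried before ordinary
# words so they survive as single tokens. Example patterns only.

import re

TOKEN_PATTERN = re.compile(r"""
    /[\w./-]+            |   # paths such as /usr/bin/chatLogs
    \d+(?:\.\d+)?[NSEW]  |   # coordinates such as 65.4N
    [?!]{2,}             |   # punctuation runs like ?!?!
    <-+>                 |   # arrows like <---->
    [:;]-?[)(DPp]        |   # common emoticons such as :-)
    \w+(?:'\w+)?         |   # ordinary words and contractions
    \S                       # any other single symbol
""", re.VERBOSE)

def tokenize(text):
    return TOKEN_PATTERN.findall(text)

# tokenize("check /usr/bin/chatLogs ?!?! it's at 65.4N :-)")
# -> ['check', '/usr/bin/chatLogs', '?!?!', "it's", 'at', '65.4N', ':-)']
```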
References
Cohen & Levesque, 1990. Rational interaction as the basis for communication. In Intentions in Communication.
Creswick, Fujioka, & Goan, 2008. Pedigree tracking in the face of ancillary content. In Proceedings of the Second Workshop on Uncovering Plagiarism, Authorship, and Software Misuse (PAN).
Cunningham, Maynard, Bontcheva, & Tablan, 2002. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL'02).
Grice, 1975. Logic and conversation. In Syntax and Semantics 3: Speech Acts.
Hepple, 2000. Independence and Commitment: Assumptions for Rapid Training and Execution of Rule-based Part-of-Speech Taggers. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000).
Stolcke, Ries, Coccaro, Shriberg, Bates, Jurafsky, Taylor, Martin, Van Ess-Dykema, & Meteer, 2000. Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech. Computational Linguistics 26(3).