Learning to Classify Email into “Speech Acts” William W. Cohen, Vitor R. Carvalho and Tom M. Mitchell Presented by Vitor R. Carvalho IR Discussion Series - August 12th 2004 - CMU
Imagine a hypothetical email assistant that can detect "speech acts"…

1. "Do you have any data with xml-tagged names? I need it ASAP!"
   An urgent Request is detected. The assistant may take action; the request is marked pending.
2. "Sure. I'll put it together by Sunday."
   A Commitment is detected. "Should I send Vitor a reminder on Sunday?" "Should I add this Commitment to your to-do list?"
3. "Here's the tar ball on afs: ~vitor/names.tar.gz"
   A Delivery of data is detected. The Delivery is sent, the to-do list is updated, and the pending request is cancelled.
Outline
• Setting the base
  • "Email speech act" taxonomy
  • Data
  • Inter-annotator agreement
• Results
  • Learnability of "email acts"
  • Different learning algorithms, "acts", etc.
  • Different representations
• Improvements
  • Collective/relational/iterative classification
Related Work
• Email classification for
  • topic/folder identification
  • spam/non-spam
• Speech-act classification in conversational speech
  • email is a new domain: multiple acts per message
• Winograd's Coordinator (1987): users manually annotated email with intent
  • extra work for (lazy) users
• Murakoshi et al. (1999): hand-coded rules for identifying speech-act-like labels in Japanese emails
"Email Acts" Taxonomy

From: Benjamin Han
To: Vitor Carvalho
Subject: LTI Student Research Symposium

Hey Vitor
When exactly is the LTI SRS submission deadline?  [Request - Information]
Also, don't forget to ask Eric about the SRS webpage.  [Reminder - action/task]
See you
Ben

• A single email message may contain multiple acts
• An act is described as a verb-noun pair (e.g., propose meeting, request information); not all pairs make sense
• The taxonomy tries to describe commonly observed behaviors, rather than all possible speech acts in English
• It also includes non-linguistic usage of email (e.g., delivery of files)
A Taxonomy of "Email Acts": Verbs
• Negotiate
  • Initiate: Request, Propose, Amend
  • Conclude: Commit, Deliver, Refuse
• Other: Greet, Remind
A Taxonomy of "Email Acts": Nouns
• Information
  • Data: Meeting Logistics Data, Other Data
  • Opinion
• Activity
  • Ongoing Activity: Committee, Other
  • Single Event: Meeting, Short Term Task

An email act is a <Verb><Noun> pair.
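The verb-noun pairing can be sketched in code. This is only an illustration of the data-structure idea: the `VALID_PAIRS` whitelist below is a made-up subset for demonstration, not the actual set of sensible pairs from the paper.

```python
# Hypothetical sketch: an "email act" as a verb-noun pair.
# Only some pairs make sense, so a whitelist guards construction.
VERBS = {"request", "propose", "amend", "commit", "deliver", "refuse",
         "greet", "remind", "conclude"}
NOUNS = {"information", "opinion", "data", "meeting", "ongoing activity",
         "single event", "short term task"}

# Illustrative subset of meaningful pairs (not from the paper).
VALID_PAIRS = {("request", "information"), ("propose", "meeting"),
               ("deliver", "data"), ("commit", "short term task")}

def make_act(verb, noun):
    """Build a <Verb><Noun> act, rejecting pairs that make no sense."""
    if verb not in VERBS or noun not in NOUNS:
        raise ValueError(f"unknown verb/noun: {verb}, {noun}")
    if (verb, noun) not in VALID_PAIRS:
        raise ValueError(f"pair does not make sense: {verb} {noun}")
    return (verb, noun)
```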
Corpora
• Few large, natural email corpora are available
• CSPACE corpus (Kraut & Fussell)
  • Email associated with a semester-long project for GSIA MBA students in 1997
  • 15,000 messages from 277 students in 50 teams (4 to 6 students per team)
  • Rich in task negotiation
  • N02F2, N01F3, N03F2: all messages from students in three teams (341, 351, and 443 messages)
• SRI's "Project World" CALO corpus
  • 6 people in an artificial task scenario over four days
  • 222 messages (publicly available)
• These datasets were double-labeled (annotated by two people)
Inter-Annotator Agreement
• Kappa statistic: Kappa = (A - R) / (1 - R)
  • A = probability of agreement in a category
  • R = probability of agreement for 2 annotators labeling at random
  • Kappa range: -1…+1
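The kappa statistic above can be computed directly from two annotators' label sequences; a minimal sketch (function name and example labels are illustrative):

```python
from collections import Counter

def kappa(labels_a, labels_b):
    """Kappa = (A - R) / (1 - R) for two annotators over the same items.

    A is the observed agreement; R is the agreement expected if both
    annotators labeled at random according to their own label frequencies.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement A: fraction of items with identical labels.
    a = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement R from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    r = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    return (a - r) / (1 - r)
```

Perfect agreement yields kappa = 1; agreement no better than chance yields kappa near 0.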
Inter-Annotator Agreement for Messages with a Single "Verb"
Learnability of Email Acts
• Features: un-weighted word frequency counts (BOW)
• 5-fold cross-validation
• (Directive = Req or Prop or Amd)
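The experimental setup above — unweighted bag-of-words counts plus 5-fold cross-validation — can be sketched as follows (function names and the fixed-vocabulary interface are assumptions for illustration):

```python
def bow_features(message, vocabulary):
    """Un-weighted word-frequency counts over a fixed vocabulary."""
    counts = {}
    for token in message.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return [counts.get(word, 0) for word in vocabulary]

def five_fold_splits(n_messages, k=5):
    """Yield (train_indices, test_indices) for k-fold cross-validation;
    each fold is held out exactly once."""
    folds = [list(range(i, n_messages, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx
```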
Using Different Learners (Directive Act = Req or Prop or Amd)
Learning Commissives (Commissive Act = Delivery or Commitment)
Most Informative Features (are common words)
• Request+Amend+Propose
• Commit
• Deliver
Learning: Document Representation
• Variants explored:
  • TFIDF -> TF weighting (don't downweight common words)
  • Bigrams
    • For Commitment: "i will", "i agree" in top 5 features
    • For Directive: "do you", "could you", "can you", "please advise" in top 25
  • Count of time expressions
  • Words near a time expression
  • Words near a proper noun or pronoun
  • POS counts
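The TF-weighting and bigram variants above can be sketched together (function name illustrative). Keeping raw term frequencies, rather than TF-IDF, preserves the weight of common words like "will" and "you", which the slide's top features suggest are informative:

```python
def tf_bigram_features(message):
    """Raw term-frequency counts for unigrams and bigrams (no IDF
    downweighting of common words)."""
    tokens = message.lower().split()
    feats = {}
    # Unigram term frequencies.
    for tok in tokens:
        feats[tok] = feats.get(tok, 0) + 1
    # Bigram term frequencies, e.g. "i will", "do you".
    for a, b in zip(tokens, tokens[1:]):
        bigram = f"{a} {b}"
        feats[bigram] = feats.get(bigram, 0) + 1
    return feats
```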
Collective Classification
• BOW classifier outputs used as features (7 binary features: req, dlv, amd, prop, etc.)
• MaxEnt learner; training set = N03f2, test set = N01f3
• Features: current msg + parent msg + child msg (1st child only)
• "Related" msgs = messages with a parent and/or child message; the approach is most useful for these "related" messages
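The relational feature vector described above can be sketched as the concatenation of per-act classifier outputs for a message, its parent, and its first child. The slide names only four of the seven acts (req, dlv, amd, prop); the remaining keys below are placeholders, not the paper's actual labels:

```python
# Hypothetical act keys: only the first four are named on the slide.
ACTS = ["req", "dlv", "amd", "prop", "cmt", "rfs", "grt"]

def relational_features(msg_scores, parent_scores=None, child_scores=None):
    """Concatenate binary classifier outputs for a message with those of
    its parent and first child; absent relatives contribute zeros."""
    zeros = {a: 0 for a in ACTS}
    parts = [msg_scores, parent_scores or zeros, child_scores or zeros]
    return [p.get(a, 0) for p in parts for a in ACTS]
```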
Collective/Iterative Classification
• Start with baseline (BOW) predictions
• How to make updates?
  • In chronological order
  • Using "family heuristics" (child first, parent first, etc.)
  • Using posterior probability from the Maximum Entropy learner (threshold, ranking, etc.)
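The update loop above can be sketched as follows. The predictor interfaces are assumptions for illustration, standing in for the slide's BOW baseline and MaxEnt relational classifier; messages are visited in list order, one possible update schedule:

```python
def iterative_classify(messages, parents, base_predict, relational_predict,
                       max_iters=5):
    """Iterative collective classification sketch.

    messages: list of per-message feature dicts.
    parents: parents[i] is the index of message i's parent, or None.
    base_predict(feats) -> label (content-only baseline).
    relational_predict(feats, parent_label) -> label (uses a neighbor).
    """
    # Start from the baseline (content-only) predictions.
    labels = [base_predict(m) for m in messages]
    for _ in range(max_iters):
        changed = False
        # Re-label each message using its parent's current label.
        for i, m in enumerate(messages):
            p_label = labels[parents[i]] if parents[i] is not None else None
            new = relational_predict(m, p_label)
            if new != labels[i]:
                labels[i] = new
                changed = True
        if not changed:  # labels have stabilized
            break
    return labels
```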
Conclusions/Summary
• Negotiating/managing shared tasks is a central use of email
• Proposed a taxonomy for "email acts"; it could be useful for tracking commitments, delegations, and pending answers, and for integrating to-do lists and calendars with email
• Inter-annotator agreement reaches kappa values of roughly 0.70-0.80
• Learned classifiers can do this with reasonable accuracy (90% precision at 50-60% recall for the top level of the taxonomy)
• Fancy tricks with IE, bigrams, and POS offer modest improvement over baseline TF-weighted systems
Conclusions/Future Work
• Teamwork (collective/iterative classification) seems to help a lot!
• Future work:
  • Integrate all features + best learners + tricks; tune the system
  • Social network analysis