
Learning to Classify Email into “Speech Acts”


Presentation Transcript


  1. Learning to Classify Email into “Speech Acts” William W. Cohen, Vitor R. Carvalho and Tom M. Mitchell Presented by Vitor R. Carvalho IR Discussion Series - August 12th 2004 - CMU

  2. Imagine a hypothetical email assistant that can detect “speech acts”…
  (1) “Do you have any data with xml-tagged names? I need it ASAP!” An urgent Request is detected: the assistant may take action, and the request is marked pending.
  (2) “Sure. I’ll put it together by Sunday.” A Commitment is detected: the assistant can ask “Should I send Vitor a reminder on Sunday?” or “Should I add this Commitment to your to-do list?”
  (3) “Here’s the tar ball on afs: ~vitor/names.tar.gz” A Delivery of data is detected: the delivery is sent, the to-do list is updated, and the pending request is cancelled.

  3. Outline
  • Setting the base
    • “Email speech act” taxonomy
    • Data
    • Inter-annotator agreement
  • Results
    • Learnability of “email acts”
    • Different learning algorithms, “acts”, etc.
    • Different representations
  • Improvements
    • Collective/relational/iterative classification

  4. Related Work
  • Email classification for topic/folder identification and for spam/non-spam
  • Speech-act classification in conversational speech; email is a new domain, with multiple acts per message
  • Winograd’s Coordinator (1987): users manually annotated email with intent, which meant extra work for (lazy) users
  • Murakoshi et al. (1999): hand-coded rules for identifying speech-act-like labels in Japanese emails

  5. “Email Acts” Taxonomy
  Example message:
    From: Benjamin Han
    To: Vitor Carvalho
    Subject: LTI Student Research Symposium
    Hey Vitor
    When exactly is the LTI SRS submission deadline?  [Request - Information]
    Also, don’t forget to ask Eric about the SRS webpage.  [Reminder - action/task]
    See you
    Ben
  • A single email message may contain multiple acts
  • An act is described as a verb-noun pair (e.g., propose meeting, request information); not all pairs make sense
  • The taxonomy tries to describe commonly observed behaviors rather than all possible speech acts in English
  • It also includes non-linguistic usage of email (e.g., delivery of files)
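
To make the verb-noun representation concrete, here is a minimal Python sketch; it is not from the paper, and the EmailAct class and label strings are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EmailAct:
    """An email act as a verb-noun pair (illustrative, not the paper's code)."""
    verb: str  # e.g. "Request", "Deliver", "Commit", "Remind"
    noun: str  # e.g. "Information", "Data", "Meeting"

# A single message may carry multiple acts, as in the example message above:
ben_msg_acts = [
    EmailAct("Request", "Information"),  # "When exactly is the ... deadline?"
    EmailAct("Remind", "Activity"),      # "don't forget to ask Eric ..."
]
```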

  6. A Taxonomy of “Email Acts”: Verbs
  • Negotiate
    • Initiate: Request, Propose, Amend
    • Conclude: Commit, Refuse, Deliver
  • Other: Remind, Greet

  7. A Taxonomy of “Email Acts”: Nouns
  • Information
    • Data: Meeting Logistics Data, Other Data
    • Opinion
  • Activity
    • Ongoing Activity: Committee, Other
    • Single Event: Short Term Task, Meeting
  An email act is a <Verb><Noun> pair.

  8. Corpora
  • Few large, natural email corpora are available
  • CSPACE corpus (Kraut & Fussell)
    • Email associated with a semester-long project for GSIA MBA students in 1997
    • 15,000 messages from 277 students in 50 teams (4 to 6 per team)
    • Rich in task negotiation
    • N02F2, N01F3, N03F2: all messages from students in three teams (341, 351, and 443 messages)
  • SRI’s “Project World” CALO corpus
    • 6 people in an artificial task scenario over four days
    • 222 messages (publicly available)
  • These corpora were double-labeled (each message annotated by two people)

  9. Inter-Annotator Agreement
  • Kappa statistic: Kappa = (A - R) / (1 - R)
  • A = probability of agreement in a category
  • R = probability of agreement for 2 annotators labeling at random
  • Kappa range: -1…+1
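
The bullets above amount to the standard Cohen's kappa computation; here it is as a small self-contained Python sketch (the kappa helper and toy labels are ours, not the paper's):

```python
from collections import Counter

def kappa(labels_a, labels_b):
    # Cohen's kappa: (A - R) / (1 - R), with A the observed agreement and
    # R the agreement expected if both annotators labeled at random.
    n = len(labels_a)
    A = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    R = sum((ca[c] / n) * (cb[c] / n) for c in set(ca) | set(cb))
    return (A - R) / (1 - R)

# Toy check: the annotators agree on 3 of 4 messages.
print(kappa(["Req", "Dlv", "Cmt", "Req"],
            ["Req", "Dlv", "Dlv", "Req"]))  # -> 0.6
```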

  10. Inter-Annotator Agreement for messages with a single “verb”

  11. Learnability of Email Acts. Features: un-weighted word frequency counts (BOW); 5-fold cross-validation. (Directive = Request, Propose, or Amend)
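
A rough modern re-creation of this setup (scikit-learn postdates the paper; LogisticRegression is only a stand-in for the learners compared on the next slide, and the four messages are placeholder data):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Placeholder messages; the experiments used the CSPACE/CALO corpora.
messages = [
    "could you send me the xml-tagged names",
    "i will put the numbers together by sunday",
    "here is the tar ball with the data",
    "please advise on the meeting logistics",
]
is_directive = [1, 0, 0, 1]  # Directive = Request, Propose, or Amend

# Un-weighted word frequency counts (BOW) feeding a linear classifier.
bow = make_pipeline(CountVectorizer(), LogisticRegression())
# The paper uses 5-fold cross-validation; cv=2 here only because this
# toy dataset has 4 examples.
print(cross_val_score(bow, messages, is_directive, cv=2))
```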

  12. Using Different Learners (Directive Act = Request, Propose, or Amend)

  13. Learning Requests only

  14. Learning Commissives (Commissive Act = Delivery or Commitment)

  15. Learning Deliveries only

  16. Learning to recognize Commitments

  17. Most Informative Features (are common words), listed per act group: Request+Amend+Propose, Commit, and Deliver

  18. Learning: document representation
  • Variants explored:
    • TFIDF -> TF weighting (don’t downweight common words)
    • Bigrams
      • For Commitment: “i will”, “i agree” in the top 5 features
      • For Directive: “do you”, “could you”, “can you”, “please advise” in the top 25
    • Count of time expressions
    • Words near a time expression
    • Words near a proper noun or pronoun
    • POS counts
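
A sketch of two of these variants, assuming scikit-learn for the TF-weighted bigram features and a hand-rolled regex (not the paper's time-expression extractor):

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer

# TF weighting (no IDF, so common words are not downweighted), over word
# unigrams and bigrams:
tf_bigrams = TfidfVectorizer(ngram_range=(1, 2), use_idf=False)
X = tf_bigrams.fit_transform(["i will put it together by sunday",
                              "could you send the data"])
print(tf_bigrams.get_feature_names_out()[:5])

# A crude, assumed pattern for counting time expressions:
TIME_RE = re.compile(
    r"\b(today|tomorrow|monday|sunday|\d{1,2}(:\d{2})?\s*(am|pm))\b", re.I)
print(len(TIME_RE.findall("let's meet sunday at 5pm or tomorrow")))  # -> 3
```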

  19. Baseline classifier: linear-kernel SVM with TFIDF weighting
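
In scikit-learn terms, this baseline could be re-created as follows (an assumed sketch, not the original implementation):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Linear-kernel SVM over TFIDF-weighted features, mirroring the baseline.
baseline = make_pipeline(TfidfVectorizer(), LinearSVC())
# baseline.fit(train_messages, train_labels)
# predictions = baseline.predict(test_messages)
```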

  20. Collective Classification (relational)

  21. Collective Classification
  • BOW classifier outputs used as features (7 binary features: req, dlv, amd, prop, etc.)
  • MaxEnt learner; training set = N03f2, test set = N01f3
  • Features come from the current msg + parent msg + child msg (1st child only)
  • “Related” msgs = messages with a parent and/or child message; the collective features are useful for these “related” messages
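
A structural sketch of these relational features, with an assumed message format (dicts with parent/child links) and a toy stand-in for the trained BOW classifier; only the four acts named above are spelled out, since the slide's "etc." covers the rest:

```python
def relational_features(msg, bow_predict, acts=("req", "dlv", "amd", "prop")):
    # The BOW classifier's binary act predictions for the current message,
    # its parent, and its first child become the input features of a
    # second-stage (MaxEnt-style) learner.
    feats = {}
    for role, m in (("cur", msg), ("parent", msg.get("parent")),
                    ("child", msg.get("child"))):
        if m is None:
            continue
        pred = bow_predict(m["text"])
        for act in acts:
            feats[f"{role}_{act}"] = int(pred.get(act, False))
    return feats

# Toy stand-in for the BOW classifier's per-act outputs:
def bow_predict(text):
    return {"req": "?" in text, "dlv": "attached" in text,
            "amd": "instead" in text, "prop": "propose" in text}

child = {"text": "attached is the file", "parent": None, "child": None}
msg = {"text": "can you send the file?", "parent": None, "child": child}
print(relational_features(msg, bow_predict))
```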

  22. Collective/Iterative Classification
  [Figure: messages on a timeline, each with its posterior probability: 0.53, 0.65, 0.85, 0.85, 0.95, 0.93]
  • Start with the baseline (BOW) predictions
  • How to make updates?
    • In chronological order
    • Using “family heuristics” (child first, parent first, etc.)
    • Using posterior probability from the Maximum Entropy learner (threshold, ranking, etc.)
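
One way to read the update scheme as code (an illustrative sketch, not the authors' procedure), here using the posterior-confidence ordering:

```python
def iterative_classify(messages, base_predict, relational_predict, rounds=3):
    # Start from the baseline BOW predictions, then repeatedly re-classify
    # each message using relational features computed from the current
    # predictions of its parent/child, updating the most confident
    # messages first (one of the orderings listed above).
    preds = {m["id"]: base_predict(m) for m in messages}
    for _ in range(rounds):
        order = sorted(messages,
                       key=lambda m: -preds[m["id"]]["confidence"])
        for m in order:
            preds[m["id"]] = relational_predict(m, preds)
    return preds
```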

  23. Iterative Classification: Commitment

  24. Iterative Classification: Request

  25. Iterative Classification: Dlv+Cmt

  26. Conclusions/Summary
  • Negotiating/managing shared tasks is a central use of email
  • Proposed a taxonomy of “email acts”; it could be useful for tracking commitments, delegations, and pending answers, and for integrating to-do lists and calendars with email
  • Inter-annotator agreement reaches kappa values in the 0.70-0.80s
  • Learned classifiers can recognize these acts with reasonable accuracy (90% precision at 50-60% recall for the top level of the taxonomy)
  • Fancy tricks with IE, bigrams, and POS offer modest improvement over baseline TF-weighted systems

  27. Conclusions/Future Work
  • Teamwork (collective/iterative classification) seems to help a lot!
  • Future work:
    • Integrate all features + best learners + tricks… tune the system
    • Social network analysis
