ACM email corpus annotation analysis

  1. ACM email corpus annotation analysis Andrew Rosenberg 2/26/2004

  2. Overview • Motivation • Corpus Description • Kappa Shortcomings • Kappa Augmentation • Classification of messages • Corpus annotation analysis • Next step: Sharpening method • Summary

  3. Motivation • The ACM email corpus annotation raises two problems. • Because annotators may assign a message one or two labels, there is no clear way to calculate an agreement statistic. • An augmentation to the kappa statistic is proposed. • Interannotator reliability is low (κ < 0.3). • Annotator reeducation and/or a redesign of the annotation materials is most likely necessary. • Hypothetically, the available annotated data can be used to improve category assignment.

  4. Corpus Description • 312 email messages exchanged among members of the Columbia chapter of the ACM. • Annotated by two annotators with one or two of the following 10 labels: • question, answer, broadcast, attachment transmission, planning, planning scheduling, planning-meeting scheduling, action item, technical discussion, social chat

  5. Kappa Shortcomings • Before running ML procedures, we need confidence in the labels assigned to the messages. • To compute kappa, κ = (P(A) − P(E)) / (1 − P(E)), we need to count the number of agreements. • How do you determine agreement with an optional secondary label? • Ignore the secondary label?

  6. Kappa Shortcomings (ctd.) • Ignoring the secondary label isn’t acceptable for two reasons. • It is inconsistent with the annotation guidelines. • It ignores partial agreements (each pair below lists annotator 1’s label(s), then annotator 2’s; in a two-letter annotation, the first letter is the primary label): • {a,ba} - singleton matches secondary • {ab,ca} - primary matches secondary • {ab,cb} - secondary matches secondary • {ab,ba} - secondary matches primary, and vice versa • Note: the purpose is not to inflate the kappa value, but to assess the data accurately.

  7. Kappa Augmentation • When a labeler employs a secondary label, treat it as a single annotation divided between two categories. • Select a value of p, where 0.5 ≤ p ≤ 1.0, based on how heavily to weight the secondary label. • Singleton annotations are assigned a score of 1.0 • Primary: p • Secondary: 1 − p
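
The weighting scheme above is easy to make concrete. Below is a minimal Python sketch, assuming each annotation arrives as a one- or two-element tuple of label strings; the function name and data representation are illustrative, not from the slides.

    # Category inventory from slide 4.
    LABELS = [
        "question", "answer", "broadcast", "attachment transmission",
        "planning", "planning scheduling", "planning-meeting scheduling",
        "action item", "technical discussion", "social chat",
    ]

    def annotation_vector(labels, p=0.6):
        """Turn one annotator's label(s) for a message into a weight
        vector over categories: a singleton label gets weight 1.0; a
        (primary, secondary) pair gets weights p and 1 - p."""
        vec = {label: 0.0 for label in LABELS}
        if len(labels) == 1:
            vec[labels[0]] = 1.0
        else:
            primary, secondary = labels
            vec[primary] = p
            vec[secondary] = 1.0 - p
        return vec

With p = 0.6, annotation_vector(("planning", "question")) puts 0.6 on planning and 0.4 on question; at p = 1.0 the secondary label is ignored entirely, and at p = 0.5 both labels count equally.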

  8. Kappa Augmentation example • [Table: the annotators’ labels and the corresponding annotation matrices with p = 0.6]

  9. Kappa Augmentation example (ctd.) • [Tables: the annotation matrices and the resulting agreement matrix]

  10. Kappa Augmentation example (ctd.) • To calculate P(E), use the relative frequencies of each annotator’s label usage. • Kappa is then computed as originally: κ′ = (P(A) − P(E)) / (1 − P(E))
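
Putting slides 7 through 10 together, here is a hedged sketch of the corpus-level computation, reusing annotation_vector from the sketch above. The per-message agreement score is assumed to be the overlap of the two annotators’ weighted vectors, since the transcript does not reproduce the agreement matrix from slide 9; P(E) follows the relative-frequency construction named on slide 10.

    def overlap(labels_a, labels_b, p):
        """Per-message agreement: overlap of the two annotators'
        weighted category vectors (an assumed realization of the
        agreement matrix, which is not shown in the transcript)."""
        va = annotation_vector(labels_a, p)
        vb = annotation_vector(labels_b, p)
        return sum(min(va[c], vb[c]) for c in LABELS)

    def kappa_prime(annotations_a, annotations_b, p=0.6):
        """Augmented kappa: P(A) is the mean per-message overlap;
        P(E) comes from each annotator's relative label-usage
        frequencies, per slide 10."""
        n = len(annotations_a)
        p_agree = sum(overlap(la, lb, p)
                      for la, lb in zip(annotations_a, annotations_b)) / n
        vecs_a = [annotation_vector(lbls, p) for lbls in annotations_a]
        vecs_b = [annotation_vector(lbls, p) for lbls in annotations_b]
        freq_a = {c: sum(v[c] for v in vecs_a) / n for c in LABELS}
        freq_b = {c: sum(v[c] for v in vecs_b) / n for c in LABELS}
        p_chance = sum(freq_a[c] * freq_b[c] for c in LABELS)
        return (p_agree - p_chance) / (1.0 - p_chance)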

  11. Classification of messages • This augmentation allows us to classify messages based on their individual κ′ values at different values of p. • Class 1: high κ′ at all values of p. • Class 2: low κ′ at all values of p. • Class 3: high κ′ only at p = 1.0. • Class 4: high κ′ only at p = 0.5. • Note: mathematically, κ′ needn’t be monotonic w.r.t. p, but with 2 annotators it is.
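
A sketch of this classification, using the raw per-message overlap as a stand-in for the per-message κ′ the slide refers to (per-message chance correction is not spelled out in the transcript), with an illustrative threshold of 0.5:

    def classify_message(labels_a, labels_b, threshold=0.5):
        """Assign a message to one of the four classes on slide 11 by
        comparing its agreement at p = 1.0 (secondary label ignored)
        and p = 0.5 (both labels weighted equally)."""
        high_at_1 = overlap(labels_a, labels_b, p=1.0) >= threshold
        high_at_half = overlap(labels_a, labels_b, p=0.5) >= threshold
        if high_at_1 and high_at_half:
            return 1  # consistently high
        if not high_at_1 and not high_at_half:
            return 2  # consistently low
        return 3 if high_at_1 else 4  # 3: high only at p=1.0; 4: high only at p=0.5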

  12. Corpus Annotation Analysis • Agreement is low at all values of p • κ′(p=1.0) = 0.299 • κ′(p=0.5) = 0.281 • Other views of the data will provide some insight into how to revise the annotation scheme. • Category distribution • Category co-occurrence • Category confusion • Class distribution • Category by class distribution

  13. Corpus Annotation Analysis: Category Distribution • [chart]

  14. Corpus Annotation Analysis: Category Co-occurrence • [chart]

  15. Corpus Annotation Analysis: Category Confusion • [chart]

  16. Corpus Annotation Analysis: Class Distribution • [chart]

  17. Corpus Annotation Analysis: Category by Class Distribution (1/2) • [Charts: Class 1 (consistently high), Class 2 (consistently low)]

  18. Corpus Annotation Analysis: Category by Class Distribution (2/2) • [Charts: Class 3 (low to high), Class 4 (high to low)]

  19. Next step: Sharpening method • In determining interannotator agreement with kappa and similar statistics, two available pieces of information are overlooked: • Some annotators are “better” than others • Some messages are “easier to label” than others • By limiting the contribution of known poor annotators and difficult messages, we gain confidence in the final category assignment of each message. • How do we rank annotators? Messages?

  20. Sharpening Method (ctd.) • Ranking annotators • Calculate kappa between each annotator and the rest of the group. • “Better” annotators have higher agreement with the group. • Ranking messages • Variance (or, equivalently, the negative entropy −Σ p·log p) of the label vector summed over annotators. • Messages with high variance (a peaked label vector, i.e. low entropy) are more consistently annotated.
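
A hedged sketch of both rankings, building on the helpers above: group agreement is computed as each annotator’s mean κ′ against every other annotator, and message consistency as the entropy of the summed label vector (low entropy, i.e. a peaked, high-variance vector, means a consistently annotated message). The exact formulas are not spelled out in the transcript.

    import math

    def rank_annotators(corpus, p=0.6):
        """corpus: {annotator_name: [label tuple per message]}.
        Score each annotator by mean kappa' against the others;
        better annotators agree more with the group."""
        scores = {a: sum(kappa_prime(corpus[a], corpus[b], p)
                         for b in corpus if b != a) / (len(corpus) - 1)
                  for a in corpus}
        return sorted(scores, key=scores.get, reverse=True), scores

    def message_entropy(labels_per_annotator, p=0.6):
        """Entropy of the label vector summed over annotators; lower
        entropy means the annotators concentrated their weight on the
        same categories, i.e. the message is easier to label."""
        total = {c: 0.0 for c in LABELS}
        for labels in labels_per_annotator:
            for c, w in annotation_vector(labels, p).items():
                total[c] += w
        z = sum(total.values())
        return -sum((w / z) * math.log(w / z)
                    for w in total.values() if w > 0)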

  21. Sharpening Method (ctd.) • How do we use these ranks? • Weight the annotators based on their rank. • Recompute the message matrix with weighted annotator contributions. • Weight the messages based on their rank. • Recompute the kappa values with weighted message contributions. • Repeat these steps until the weight changes fall below a threshold.
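
One way the loop might look, again as a sketch: the update rules below (exp(−entropy) for message weights, message-weighted mean overlap for annotator weights) are illustrative choices, since the slides name the steps but not the formulas.

    def sharpen(corpus, p=0.6, tol=1e-4, max_iter=50):
        """Alternate between message weights and annotator weights
        until neither changes by more than tol."""
        annotators = list(corpus)
        n_msgs = len(next(iter(corpus.values())))
        ann_w = {a: 1.0 for a in annotators}
        msg_w = [1.0] * n_msgs

        for _ in range(max_iter):
            # Message weights: exp(-entropy) of the annotator-weighted
            # label vector, so consistent messages count more.
            new_msg_w = []
            for i in range(n_msgs):
                total = {c: 0.0 for c in LABELS}
                for a in annotators:
                    for c, w in annotation_vector(corpus[a][i], p).items():
                        total[c] += ann_w[a] * w
                z = sum(total.values())
                h = -sum((w / z) * math.log(w / z)
                         for w in total.values() if w > 0)
                new_msg_w.append(math.exp(-h))

            # Annotator weights: message-weighted mean overlap with the
            # rest of the group (kept positive so the weighted label
            # vectors never collapse to zero).
            z_msg = sum(new_msg_w)
            new_ann_w = {}
            for a in annotators:
                others = [b for b in annotators if b != a]
                score = sum(
                    sum(w * overlap(corpus[a][i], corpus[b][i], p)
                        for i, w in enumerate(new_msg_w)) / z_msg
                    for b in others) / len(others)
                new_ann_w[a] = max(score, 1e-6)

            delta = max(
                max(abs(new_ann_w[a] - ann_w[a]) for a in annotators),
                max(abs(x - y) for x, y in zip(new_msg_w, msg_w)))
            ann_w, msg_w = new_ann_w, new_msg_w
            if delta < tol:
                break
        return ann_w, msg_w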

  22. Summary • The ACM email corpus annotation raises two problems. • Because annotators may assign a message one or two labels, there is no clear way to calculate an agreement statistic. • An augmentation to the kappa statistic is proposed. • Interannotator reliability is low (κ < 0.3). • Annotator reeducation and/or a redesign of the annotation materials is most likely necessary. • Hypothetically, the available annotated data can be used to improve category assignment.
