Exploiting Subjective Annotations Dennis Reidsma and Rieks op den Akker Human Media Interaction University of Twente http://hmi.ewi.utwente.nl
Types of content • Annotation as a task of subjective judgments? • Manifest content • Pattern latent content • Projective latent content (cf. Potter and Levine-Donnerstein 1999)
Projective latent content • Why annotate data as projective latent content? • Because it cannot be defined exhaustively, whereas annotators have good ‘mental schemas’ for it • Because the data should be annotated in terms that fit the understanding of ‘naïve users’
Inter-annotator agreement and projective content • Disagreements may be caused by: • Errors by annotators • Invalid scheme (no true label exists) • Different annotators having different ‘truths’ in interpretation of behavior (subjectivity)
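A chance-corrected agreement score reduces all three causes of disagreement to a single number, which is why it cannot tell them apart. The sketch below computes Cohen's kappa for two annotators. The slides do not name a specific metric, so the choice of kappa and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labelled independently
    # according to their own label distributions.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels, not taken from the AMI Corpus.
annotator_a = ["yes", "yes", "no", "no"]
annotator_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(annotator_a, annotator_b)
print(kappa)
```

A low value here says only that the annotators diverge; it does not say whether the cause is error, an invalid scheme, or subjectivity.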
Subjective annotation • People communicate in different ways and therefore, as observers, may also judge the behavior of others differently • Projective content may be especially vulnerable to this problem • How to work with subjectively annotated data?
Subjective annotation • How to work with subjectively annotated data? Unfortunately, subjectivity leads to low levels of agreement, so such data is usually avoided as ‘unproductive material’
I. Predicting agreement • One way to work with subjective data is to find out in which contexts annotators would agree, and focus on those situations. • Result: a classifier that does not label every instance, but is more accurate on the instances it does label
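The idea can be sketched as a wrapper that abstains outside high-agreement contexts. This is a minimal illustration, not the authors' implementation: the context names, the 0.8 threshold, and the helpers `agreement_by_context` and `cautious_classify` are all hypothetical.

```python
from collections import defaultdict


def agreement_by_context(items):
    """Estimate the agreement rate per context.

    items: iterable of (context, label_a, label_b) tuples taken from a
    subset of the data that two annotators both labelled.
    """
    agree = defaultdict(int)
    total = defaultdict(int)
    for context, label_a, label_b in items:
        total[context] += 1
        agree[context] += int(label_a == label_b)
    return {context: agree[context] / total[context] for context in total}


def cautious_classify(context, features, base_classifier, rates, threshold=0.8):
    """Return a label only in contexts with high observed agreement.

    Returns None (abstains) for low-agreement or unseen contexts.
    """
    if rates.get(context, 0.0) < threshold:
        return None
    return base_classifier(features)


# Hypothetical doubly annotated subset with two made-up contexts.
overlap = [
    ("focused", "A", "A"),
    ("focused", "B", "B"),
    ("focused", "A", "A"),
    ("unfocused", "A", "B"),
    ("unfocused", "B", "B"),
]
rates = agreement_by_context(overlap)


def always_a(features):
    """Stand-in for any trained classifier."""
    return "A"


print(cautious_classify("focused", {}, always_a, rates))    # label
print(cautious_classify("unfocused", {}, always_a, rates))  # abstains
```

Only the small overlap subset needs two annotators; the rest of the data can be singly annotated.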
II. Explicitly modeling intersubjectivity • A second way: model different annotators separately, then find the cases where the models agree, and assume that those are the cases where the annotators would have agreed, too. • Result: a classifier that tells you for which instances other annotators would most probably agree with its classification
Advantages • Both solutions lead to ‘cautious classifiers’ that only render a judgment in those cases where annotators would be expected to agree • This may carry over to users, too… • Neither solution requires all data to be annotated by multiple annotators
Pressing questions so far? (The remainder of the talk will give two case studies.)
Case studies • I. Predicting agreement from information in other (easier) modalities: The case of contextual addressing • II. Explicitly modeling intersubjectivity in dialog markup: The case of Voting Classifiers
Data used: The AMI Corpus • 100 hours of recorded meetings, annotated with dialog acts, focus of attention, gestures, addressing, decision points, and other layers
I. Contextual addressing • Addressing, and focus of attention. • Agreement is highest for certain FOA contexts. • In those contexts, the classifier also performed better. • … more in paper
II. Modeling intersubjectivity • Modeling single annotators, for ‘yeah’ utterances • Data annotated by 3 annotators, without overlap
II. Modeling intersubjectivity • Cross annotator training and testing
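Cross-annotator training and testing can be pictured as a matrix: train one model per annotator, then score each model on every annotator's test data. The sketch below assumes a toy majority-label learner and made-up features and labels; the slides do not specify the learner or features actually used.

```python
from collections import Counter, defaultdict


def train_majority(examples):
    """Toy learner: map each feature value to its most frequent label."""
    counts = defaultdict(Counter)
    for feature, label in examples:
        counts[feature][label] += 1
    # Fallback label for feature values never seen in training.
    overall = Counter(label for _, label in examples).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in counts.items()}
    return lambda feature: table.get(feature, overall)


def accuracy(classifier, examples):
    """Fraction of examples the classifier labels like the annotator did."""
    return sum(classifier(f) == y for f, y in examples) / len(examples)


def cross_annotator_matrix(train_sets, test_sets):
    """Score every annotator-specific model on every annotator's test set."""
    models = {name: train_majority(data) for name, data in train_sets.items()}
    return {
        (trained_on, tested_on): accuracy(models[trained_on], test_sets[tested_on])
        for trained_on in models
        for tested_on in test_sets
    }


# Hypothetical data: feature is utterance length, labels are made up.
train_sets = {
    "A": [("short", "backchannel"), ("short", "backchannel"), ("long", "assess")],
    "B": [("short", "assess"), ("short", "assess"), ("long", "assess")],
}
test_sets = {
    "A": [("short", "backchannel"), ("long", "assess")],
    "B": [("short", "assess"), ("long", "assess")],
}
matrix = cross_annotator_matrix(train_sets, test_sets)
for pair, score in sorted(matrix.items()):
    print(pair, score)
```

Lower off-diagonal scores suggest that a model has picked up habits specific to the annotator it was trained on.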
II. Modeling intersubjectivity • Building a voting classifier: Only classify an instance when all three annotator-specific expert classifiers agree
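The unanimity rule is small enough to state in code. In this sketch the three annotator-specific classifiers are stand-in threshold rules on a hypothetical `duration` feature, and the label names are illustrative only; real expert classifiers would be trained on each annotator's data.

```python
def unanimous_vote(classifiers, instance):
    """Return the shared label if all classifiers agree, else None."""
    labels = {classify(instance) for classify in classifiers}
    return labels.pop() if len(labels) == 1 else None


def make_duration_rule(cutoff):
    """Hypothetical stand-in for a classifier trained on one annotator."""
    def classify(instance):
        return "backchannel" if instance["duration"] < cutoff else "assess"
    return classify


# One toy classifier per annotator, each with a slightly different cutoff.
experts = [make_duration_rule(0.4), make_duration_rule(0.5), make_duration_rule(0.6)]

print(unanimous_vote(experts, {"duration": 0.30}))  # all three agree
print(unanimous_vote(experts, {"duration": 0.45}))  # experts split, abstain
print(unanimous_vote(experts, {"duration": 0.70}))  # all three agree
```

Returning None rather than a majority label is what makes the classifier cautious: a split vote is treated as a case where annotators would probably have disagreed.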
II. Modeling intersubjectivity • In the unanimous voting context, performance is higher due to increased precision (6% on average)
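A classifier that abstains is best reported with two numbers: precision on the instances it labelled, and coverage, the share of instances it labelled at all. The helper below is a sketch with made-up predictions; it does not reproduce the figures from the talk.

```python
def precision_and_coverage(predictions, gold):
    """Score a classifier that may abstain by returning None.

    precision: fraction of labelled instances that match the gold label.
    coverage: fraction of all instances that received a label.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    labelled = [(p, g) for p, g in zip(predictions, gold) if p is not None]
    coverage = len(labelled) / len(gold) if gold else 0.0
    precision = (
        sum(p == g for p, g in labelled) / len(labelled) if labelled else 0.0
    )
    return precision, coverage


# Hypothetical output of a unanimous-voting classifier on four instances.
predictions = ["x", None, "y", "y"]
gold = ["x", "y", "y", "x"]
precision, coverage = precision_and_coverage(predictions, gold)
print(precision, coverage)
```

Reporting coverage alongside precision makes the trade-off visible: higher precision is bought by leaving some instances unlabelled.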
Conclusions • Possible subjective aspects of annotation should be taken into account • Agreement metrics are not designed to handle this • We proposed two methods designed to cope with subjective data
Thank you! • Questions?