Analyzing Disagreements

Presentation Transcript


  1. Analyzing Disagreements • Beata Beigman Klebanov, Eyal Beigman, Daniel Diermeier • Kellogg School of Management, Northwestern University

  2. Metaphor Detection Task • For a given metaphor type, mark all paragraphs that contain it • Metaphor types: • Vehicle • Love • Build • Authority • A paragraph can contain none, one, or more metaphor types

  3. Materials • British public discourse about European integration, 1990-2000 (Musolff, 2000) • 151 articles • 2364 paragraphs • 9 annotators

  4. Why not classification? Phenomenon of interest has very low incidence

  5. Implicit classification • Default class: no-metaphor-of-this-type • κ between 0.39 and 0.66 • Hypothesis: the implicit classification paradigm is conducive to attention slips • Based on previous work in this paradigm

  6. Lexical Anchoring • For each new word in a text, mark previous words that are semantically / associatively related to it (= anchors) (Beigman Klebanov and Shamir 2006; henceforth BKS) • κ = 0.45 • 22 annotators • Is there a reliably annotated subset of items? • What is the nature of disagreements in this subset?

  7. BKS: Finding Reliably Deliberate Annotations • Suppose 20 pseudo-annotators were each flipping their own coin for every item • Each coin's heads probability is induced from an actual annotator • What is the level of agreement for which this scenario is sufficiently improbable? • The random anchoring hypothesis is rejected with 99% confidence for 13 coinciding markups • Conclusion: For items marked by 13 people or more, at least some annotations are deliberate
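The coin-flipping scenario on this slide can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the pseudo-annotators flip independently, that each one has their own heads probability, and that the test is applied per item. The function names and the rates in the usage example are hypothetical.

```python
from typing import List, Optional, Sequence


def count_distribution(heads_probs: Sequence[float]) -> List[float]:
    """Return dist where dist[k] = P(exactly k of the coins land heads).

    Each coin may have a different heads probability, so this is a
    Poisson-binomial distribution, built up one coin at a time.
    """
    dist = [1.0]  # with zero coins, zero heads is certain
    for p in heads_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this coin lands tails
            nxt[k + 1] += mass * p      # this coin lands heads
        dist = nxt
    return dist


def tail_probability(heads_probs: Sequence[float], k: int) -> float:
    """P(at least k pseudo-annotators mark the same item by chance)."""
    return sum(count_distribution(heads_probs)[k:])


def min_deliberate_threshold(heads_probs: Sequence[float],
                             alpha: float) -> Optional[int]:
    """Smallest k whose chance tail probability is below alpha, else None."""
    dist = count_distribution(heads_probs)
    for k in range(1, len(heads_probs) + 1):
        if sum(dist[k:]) < alpha:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical marking rates for 20 pseudo-annotators (not from the slides).
    rates = [0.10 + 0.02 * i for i in range(20)]
    print(min_deliberate_threshold(rates, alpha=0.01))
```

The threshold this returns depends entirely on the rates and the significance level supplied, so it will match the slide's figure of 13 only if the real annotator rates and the authors' exact test are used.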

  8. Metaphors: Finding Reliably Deliberate Annotations • Induce 9 pseudo-annotators • Using statistics of actual annotations for VEHICLE (~ 4%) • Random markup hypothesis can be rejected with sufficient confidence for 4 or more coinciding markups; at least some of the people should have acted deliberately • Reliably deliberate subset: 33% of all marked items
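For the metaphor data, the chance of coinciding random markups can be approximated with a plain binomial tail. The sketch below assumes all nine pseudo-annotators share the same 4% marking rate and act independently; the slides only give the rate as approximate and do not state the significance level, so the constants and function name here are illustrative.

```python
from math import comb

N_ANNOTATORS = 9      # from the slides
MARK_RATE = 0.04      # approximate VEHICLE marking rate from the slides
N_PARAGRAPHS = 2364   # from the Materials slide


def binomial_tail(n: int, p: float, k: int) -> float:
    """P(at least k of n independent annotators mark an item, each with rate p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    # For each agreement level k: per-item chance probability, and the number
    # of paragraphs expected to reach that level purely by chance.
    for k in range(1, N_ANNOTATORS + 1):
        tail = binomial_tail(N_ANNOTATORS, MARK_RATE, k)
        print(f"k>={k}: p={tail:.2e}, expected chance items={N_PARAGRAPHS * tail:.2f}")
```

Under these simplifying assumptions, the per-item probability of four or more coinciding random markups is roughly 0.0003, against roughly 0.004 for three or more. That is in line with a threshold of 4, although the authors' exact computation may differ.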

  9. BKS: Validation Experiment • Subjects are presented with everything marked by at least one human and some random markups, and asked to cross out things they disagree with. • Every subject has a yes or no vote per item. • Random markups: 15% yes • All human markups: 62% yes • Reliably deliberate markups: 94% yes

  10. Metaphors: Validation Experiment • With 7 out of 9 annotators • Random : 6% yes • Human : 62% yes • Reliably deliberate : 95% yes

  11. Validation of Deliberate Annotations • When some humans produced deliberate annotations, the results tend to be uncontroversial, as they are acceptable to other people, even if they did not produce the annotation themselves.

  12. Confounder: Self-affirmation • The more people marked an item, the higher the chance that a validator is confirming her own annotation rather than accepting something she did not produce herself • The probability of accepting a reliably deliberate annotation given that the person did not produce it: 91%

  13. Self-affirmation as a sign of annotator confidence • If a person’s judgment on an item is settled, he is expected to make the same decision when asked again • The probability of self-affirmation after 4-8 weeks: • Reliably deliberate annotations: 96% • Unreliable annotations: 77%
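Separating self-affirmation from acceptance by non-producers amounts to conditioning each validation vote on two flags. The sketch below shows one way to tabulate this; the `Vote` record layout and the toy votes are hypothetical and are not the study's data.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Vote(NamedTuple):
    item: str        # identifier of the annotated item
    reliable: bool   # item reached the agreement threshold
    produced: bool   # the validator marked this item in the original annotation
    accepted: bool   # the validator voted "yes" in the validation experiment


def acceptance_rates(votes: Iterable[Vote]) -> Dict[Tuple[bool, bool], float]:
    """Return P(accepted) for each (reliable, produced) combination seen."""
    counts = defaultdict(lambda: [0, 0])  # key -> [accepted votes, all votes]
    for v in votes:
        key = (v.reliable, v.produced)
        counts[key][0] += int(v.accepted)
        counts[key][1] += 1
    return {key: acc / total for key, (acc, total) in counts.items()}


if __name__ == "__main__":
    # Toy votes for illustration only.
    toy = [
        Vote("p1", True, True, True),
        Vote("p1", True, False, True),
        Vote("p2", False, True, False),
        Vote("p2", False, False, False),
        Vote("p2", False, False, True),
    ]
    for (reliable, produced), rate in sorted(acceptance_rates(toy).items()):
        print(f"reliable={reliable}, produced={produced}: {rate:.0%} yes")
```

In this layout, the `(reliable, produced=True)` cells correspond to self-affirmation and the `produced=False` cells to acceptance of someone else's annotation.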

  14. Rejecting Own Annotation • "Ironically, Britain, which is coy about joining EMU, has much stronger technical credentials than most of the euro-enthusiasts . . . " (Guardian, 14 October 1997) • 1 annotator marked as LOVE metaphor • Validation: • 1 person accepted, although she did not produce it • 6 people rejected, including the producer

  15. Summary and Conclusion • In the metaphor detection task, the statistically determined agreement threshold for deliberate annotation split the data into two parts with a marked disparity in annotation stability and agreeability. • Deliberate annotations are stable (96%) and agreeable (91%); the others are less stable (77%) and much less agreeable (41%).

  16. Further Issues • The validation experiment was done to support the statistical analysis. Can we put more trust in the validation step? • "In European policy everything now depends on the Franco-German couple," he said. "The currency union is critical and has now entered its most sensitive preparation stage." (Guardian, 3 June 1997) • 2/9 annotators marked LOVE; 7/7 accepted

  17. Split Judgments • "Under President Chirac this triangle will remain in force. But it would be wrong to see it as implying an equidistance between France's relations with London and those it has with Bonn. Like Germany, France continues to move towards a more integrated Europe in a way that Britain, regrettably, does not." (Guardian, 18 May 1995) • 1 annotator marked LOVE metaphor • 4/7 accepted; 3/7 rejected

  18. Split Judgments (Cont.) • The preceding paragraph: "The clearest exposition of France's view of the European triangle came from Alain Juppe earlier this year. 'The future image of Europe will be a synthesis of three great visions: Germany's will to federalism […], the statist tradition of France […], and Britain's special world-view …'" • The title: "ETERNAL TRIANGLE; Jonathan Steele asks if Chirac's France will stay loyal to its marriage with Germany, or does Britain get a chance to flirt?"

  19. Split Judgments • "Under President Chirac this triangle will remain in force. But it would be wrong to see it as implying an equidistance between France's relations with London and those it has with Bonn. Like Germany, France continues to move towards a more integrated Europe in a way that Britain, regrettably, does not." (Guardian, 18 May 1995) • 1 annotator marked LOVE metaphor • 4/7 accepted; 3/7 rejected
