Annotation Scheme and Gold Standard for Dutch sentiment-bearing Adjectives
Isa Maks and Piek Vossen
Faculty of Arts, VU University Amsterdam
Overview of Presentation
• Annotation scheme for subjectivity and sentiment annotation of (Dutch) Adjectives
• Composition of the Gold Standard
• Results of Human Annotation Task
• Conclusions and Future Work
Sentiment Lexicon
• Dutch Reference Lexicon (lexical units)
• Dutch Wordnet (synsets)
• Sentiment Lexicon
• morphology, morpho-syntax, semantics, usage, etc.
• sentiment and subjectivity information
• systems for rich automatic sentiment analysis and opinion mining
• tools for automatic lexicon building
• evaluation
• Guidelines for sentiment and subjectivity annotation
• Gold Standard to evaluate automatically built sentiment lexicons
Existing Annotation Schemata
• ‘prior’ polarity: positive (good), negative (ugly), neutral (direct), posneg (curious)
• ‘prior’ subjectivity: subjective or objective, e.g. alarm [emotion] vs. alarm [device]
• annotation at word sense (or synset) level
Wiebe et al. (2004, 2006), Su et al. (2008)
• new: Attitude Holder
Attitude Holder (examples)
• ... there are reports from inside Gaza (AE) that criticize (NEG) Hamas (TOPIC)
• ... the dominant media (AE) vilify (NEG) Hamas (TOPIC) and ... (SW)
• Bush (AE) is angry (NEG) about Obama’s behaviour (TOPIC) ...
• Bush is bad (NEG) for the economy ... (SW)
SW = Speaker or Writer
AE = Agent or Experiencer
Values for Attitude Holder Annotation
• SW: speaker’s or writer’s attitude (bad, ugly, beautiful)
• AE: agent’s or experiencer’s attitude (angry, bent on)
• No specific attitude holder (waterproof, rainy, biological)
Attitude Holder, Subjectivity, Polarity, Lexical Unit, Semantic Category, Illustration
Polarity, Subjectivity and Attitude Holder
• water-resistant watches
• deaf man
• W.B. is happy with the choice for .. civil marriage
• Bush is angry over Obama's leaking of private conversation .....
• Bush is bad for the economy ....
• They drive around in beautiful cars
• a cautious estimate
Subj = subjectivity, Obj = objectivity, AE = Agent/Experiencer, SW = Speaker/Writer, No-AH = no attitude holder
Gold Standard Annotation Schema
Summarizing:
• Annotation at word sense level (instead of word level), because words may be ambiguous in subjectivity or polarity
• Annotation of subjectivity (objective vs. subjective), polarity (positive, negative, posneg, neutral) and attitude holder (whose opinion: speaker/writer or agent/experiencer)
Question: How reliable is human annotation with a complex schema for subjectivity annotation?
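The three annotation layers summarized above can be sketched as a small record type. This is only an illustrative sketch: the class names, the field names and the `sense_id` values are hypothetical and do not come from the gold standard files; the example values for "bad" and "angry" follow the attitude holder examples in the slides.

```python
from dataclasses import dataclass
from enum import Enum


class Subjectivity(Enum):
    SUBJECTIVE = "subjective"
    OBJECTIVE = "objective"


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    POSNEG = "posneg"
    NEUTRAL = "neutral"


class AttitudeHolder(Enum):
    SW = "speaker/writer"
    AE = "agent/experiencer"
    NO_AH = "no attitude holder"


@dataclass(frozen=True)
class SenseAnnotation:
    """One annotated word sense (lexical unit), not a whole word."""
    lemma: str
    sense_id: str  # hypothetical identifier, e.g. "bad-1"
    subjectivity: Subjectivity
    polarity: Polarity
    holder: AttitudeHolder


# "Bush is bad for the economy": the speaker/writer holds the negative attitude.
BAD = SenseAnnotation("bad", "bad-1", Subjectivity.SUBJECTIVE,
                      Polarity.NEGATIVE, AttitudeHolder.SW)

# "Bush is angry about ...": the agent/experiencer holds the negative attitude.
ANGRY = SenseAnnotation("angry", "angry-1", Subjectivity.SUBJECTIVE,
                        Polarity.NEGATIVE, AttitudeHolder.AE)
```

Keeping one record per sense rather than per word lets an ambiguous word carry different labels for each of its senses.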
Data Set Gold Standard
Requirements:
• Representative of the whole lexicon
• Relevant to automatic annotation of subjectivity
• Inclusion of subjective and objective lexical items
• Equal distribution of items across the lexicon with regard to frequency, polysemy and synset size
English General Inquirer (Stone, 1966), Hatzivassiloglou et al. (1997), Riloff and Wiebe (2005), Jijkoun et al. (2008)
Micro-WNOp (Cerini et al., 2007), Su et al. (2008)
Composition Gold Standard
ADJECTIVES
3 variants:
• 609 lexical units
• 512 synsets
• 390 words
Inter-annotator results
• Overall agreement for 2 annotators: attitude holder, both polarity and attitude holder, polarity
• Single-category kappa computation
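The slides do not say which kappa variant was used. As a minimal sketch, assuming Cohen's kappa for two annotators, overall and single-category agreement could be computed as below. The function names and the toy labels are illustrative only and are not the gold standard data; the single-category variant collapses all labels to "this category" versus "any other" before computing kappa.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


def single_category_kappa(labels_a, labels_b, category):
    """Kappa for one category against all other categories taken together."""
    a = [label == category for label in labels_a]
    b = [label == category for label in labels_b]
    return cohen_kappa(a, b)


# Invented toy annotations, for illustration only.
annotator_1 = ["SW-neg", "SW-neg", "OBJ-neg", "AE-pos", "SW-pos", "OBJ-neg"]
annotator_2 = ["SW-neg", "OBJ-neg", "OBJ-neg", "AE-pos", "SW-pos", "SW-neg"]

print(round(cohen_kappa(annotator_1, annotator_2), 3))
print(round(single_category_kappa(annotator_1, annotator_2, "OBJ-neg"), 3))
```

A low single-category value points at the specific label on which the annotators diverge, which the overall figure alone would hide.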
Analysis of Disagreements
• OBJ-neg (0.34) vs. SW-neg and OBJ-pos (0.23) vs. SW-pos: kaalhoofdig (bald-headed), oud (old: having lived for a long time), doofstom (deaf-mute), droog (dry), langzaam (slow), zuiver (pure), etc.
• AE-pos vs. SW-neg: belust (bent on), e.g. hij is belust op geld (he is bent on money)
Human annotations across various lexicon dimensions
• Agreement decreases when polysemy increases (65%)
• Agreement decreases when word frequency increases
• Agreement increases when item is a member of a large synset
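One way to inspect trends like those listed above is to compute raw agreement separately for each band of a lexicon dimension. The sketch below assumes each item is a tuple of a group key and the two annotators' labels; the function name, the band names and the toy items are invented for illustration and do not reproduce the reported figures.

```python
from collections import defaultdict


def agreement_by_group(items):
    """Share of identically labelled items per group.

    items: iterable of (group_key, label_annotator_1, label_annotator_2).
    """
    totals = defaultdict(int)
    agreed = defaultdict(int)
    for key, label_a, label_b in items:
        totals[key] += 1
        if label_a == label_b:
            agreed[key] += 1
    return {key: agreed[key] / totals[key] for key in totals}


# Invented toy items grouped by a hypothetical polysemy band.
toy_items = [
    ("monosemous", "SW-neg", "SW-neg"),
    ("monosemous", "OBJ-neg", "OBJ-neg"),
    ("polysemous", "SW-neg", "OBJ-neg"),
    ("polysemous", "AE-pos", "AE-pos"),
]

print(agreement_by_group(toy_items))
```

The same function works for frequency bands or synset-size bands by changing only the group key.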
Conclusions
• We designed a new annotation scheme for polarity, subjectivity and attitude holder annotation and showed that all substantial categories can be reliably annotated by human annotators. We assume that this holds for automatic annotation as well.
• We aimed at an equal distribution of test items across 3 lexicon dimensions (word frequency, large synset membership and polysemy) relevant to subjectivity and polarity identification; we measured correlations between polarity annotation and each of these lexicon dimensions.
Future Work
• Development of similar annotation schemes and gold standards for nouns and verbs
• Use of the gold standard to test methods and techniques to build a sentiment lexicon for Dutch
Acknowledgements
• The research is part of the project From Text To Political Positions (http://www2.let.vu.nl/oz/cltl/t2pp)
• Funded by the Interfaculty Research Institute CAMeRA, VU University Amsterdam
• Gold standard data available at http://www2.let.vu.nl/oz/cltl/t2pp
Attitude Holder, Semantic Category, Subjectivity, Polarity, Lexical Unit, Illustration
Attitude holder: CDA-lijsttrekker (CDA party leader)
Topic: linkse coalitie (left-wing coalition)
Polarity: negative
What is an Opinion or Attitude (Kim and Hovy, 2006)
(1) Bush is bad for the economy: judgment
(2) Bush is angry about Obama’s behaviour: emotion -> judgment
Gold standard distribution: polarity and attitude holder
(charts: polarity; attitude holder)