Detection of Implicit Citations for Sentiment Detection. Awais Athar & Simone Teufel. Problem: Find ‘All’ Citations. Context-Enhanced Citation Sentiment. Task 1: Find zones of influence of the citation: O'Connor (1982) (manual, partially implemented); Kaplan et al. (2009), for MDS
Detection of Implicit Citations for Sentiment Detection
Awais Athar & Simone Teufel
Context-Enhanced Citation Sentiment
• Task 1: Find zones of influence of the citation
  • O'Connor (1982) (manual, partially implemented)
  • Kaplan et al. (2009), for MDS
  • Related to implicit citation detection (Qazvinian & Radev, 2010)
• Task 2: Citation Classification
  • Many manual annotation schemes in Content Citation Analysis
  • Nanba and Okumura (1999)
  • Athar (2011)
Corpus Construction
• Starting point: Athar's 2011 citation sentence corpus
• Select top 20 papers; treat all incoming citations to these
• 1,741 citations (from >850 papers)
• 4-class scheme
  • objective/neutral
  • positive
  • negative
  • excluded
Task 1: Features for Classification
• S(i) or S(i-1) contains full formal citation (2 features)
• S(i) contains author name
• S(i) contains acronym associated with citation
  • METEOR, BLEU etc.
• S(i) contains a determiner followed by a “work noun”
  • This approach, These techniques
Task 1: Features (cont.)
• S(i) contains a “lexical hook”
  • The Xerox tagger (Cutting et al. 1992) …
• S(i) starts with a third person pronoun
• S(i) starts with a connector
• S(i), S(i+1) or S(i-1) starts with a subsection heading (3 features)
• S(i) contains citations other than the one under review
• n-grams of length 1-3 (also acts as baseline)
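Several of the sentence-level features listed above can be sketched as simple lexical tests. This is a minimal illustration, not the authors' implementation: the word lists (`WORK_NOUNS`, `DETERMINERS`, `THIRD_PERSON_PRONOUNS`, `CONNECTORS`), the tokenizer, and the function name `sentence_features` are all assumptions, since the slides do not give the actual lists or code.

```python
import re

# Hypothetical word lists for illustration only; the slides do not specify them.
WORK_NOUNS = {"approach", "approaches", "technique", "techniques",
              "method", "methods", "work", "paper", "system", "algorithm"}
DETERMINERS = {"this", "that", "these", "those", "such"}
THIRD_PERSON_PRONOUNS = {"he", "she", "it", "they"}
CONNECTORS = {"however", "moreover", "furthermore", "also",
              "although", "but", "therefore", "thus"}


def tokenize(sentence):
    """Split a sentence into alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[A-Za-z][A-Za-z'-]*", sentence)


def sentence_features(sentence, author_names, acronyms, lexical_hooks):
    """Return a dict of binary features for one candidate sentence S(i)."""
    tokens = tokenize(sentence)
    lower = [t.lower() for t in tokens]
    first = lower[0] if lower else ""
    # Determiner immediately followed by a "work noun", e.g. "This approach".
    det_work_noun = any(a in DETERMINERS and b in WORK_NOUNS
                        for a, b in zip(lower, lower[1:]))
    return {
        "has_author_name": any(name in tokens for name in author_names),
        "has_acronym": any(acr in tokens for acr in acronyms),
        "has_det_work_noun": det_work_noun,
        "has_lexical_hook": any(hook.lower() in sentence.lower()
                                for hook in lexical_hooks),
        "starts_with_pronoun": first in THIRD_PERSON_PRONOUNS,
        "starts_with_connector": first in CONNECTORS,
    }


feats = sentence_features(
    "This approach outperforms BLEU on most language pairs.",
    author_names=["Papineni"],
    acronyms=["BLEU", "METEOR"],
    lexical_hooks=["Xerox tagger"],
)
```

The citation-presence, subsection-heading and other-citation features would need document structure (sentence order, headings, reference markers) that this sketch does not model.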
Task 1: Methods and Results
• SVM
• 10-fold cross-validation
• F-score
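The 10-fold cross-validation named above can be illustrated with a plain index split. The slides do not say how the folds were built (for example, whether they were stratified or grouped by paper), so the shuffled, unstratified split below and the helper name `k_fold_indices` are assumptions.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumption: a simple shuffled split without stratification.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Each item appears in exactly one test fold, so every sentence is scored once by a model that did not see it during training.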
Task 2: Features for Classification
• n-grams of length 1 to 3
• Dependency triplets (Athar, 2011)
  • det_results_The
  • nsubj_good_results
  • cop_good_were
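The two feature types above can be sketched as string features. The example sentence "The results were good" is inferred from the triplets on the slide, and the dependency parse is written by hand rather than produced by a parser; the function names and the (relation, governor, dependent) tuple layout are assumptions for illustration.

```python
def ngrams(tokens, min_n=1, max_n=3):
    """Return all word n-grams with lengths from min_n to max_n."""
    feats = []
    for n in range(min_n, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


def dependency_triplets(dependencies):
    """Turn (relation, governor, dependent) tuples into relation_governor_dependent strings."""
    return ["_".join(dep) for dep in dependencies]


# Assumed example sentence, inferred from the triplets shown on the slide.
tokens = ["The", "results", "were", "good"]
# Hand-written parse for illustration; a real system would call a dependency parser.
deps = [("det", "results", "The"),
        ("nsubj", "good", "results"),
        ("cop", "good", "were")]

ngram_feats = ngrams(tokens)
triplet_feats = dependency_triplets(deps)
```

The triplet `nsubj_good_results` links "good" directly to "results", a relation that the n-grams of this sentence capture only through the trigram "results were good".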
Annotation Unit is the Citation
• Problem
  • There may be more than one sentiment per citation
• Annotation unit = citation. Projection needed:
  • For Gold Standard: assume the last sentiment is what is really meant
  • For Automatic Treatment: merge the citation context into one single sentence
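The two projections described above might look like the following sketch. It rests on one interpretation: "last sentiment" is read here as the last non-objective label in the context, falling back to objective when no sentiment occurs. The label strings and the function names `gold_label` and `merge_context` are hypothetical.

```python
def gold_label(sentence_labels, neutral="objective"):
    """Project per-sentence labels onto one citation label.

    Assumption: the last non-neutral label in reading order wins;
    if every sentence is neutral, the citation is neutral.
    """
    for label in reversed(sentence_labels):
        if label != neutral:
            return label
    return neutral


def merge_context(sentences):
    """Join the sentences of a citation context into a single string for the classifier."""
    return " ".join(s.strip() for s in sentences if s.strip())


label = gold_label(["positive", "objective", "negative", "objective"])
merged = merge_context([" Their tagger is fast. ", "", "However, it fails on rare words."])
```

In this example the context praises the cited work first and criticizes it later, so the projected gold label is negative.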
Task 2: Methods and Results
• SVM
• 10-fold cross-validation
• F-score
Conclusion
• Detection of citation sentiment in context, not just citation sentence.
• New, large, context-aware citation corpus
• This gives us a new truth:
  • More sentiment recovered
  • Harder to determine
• Subtask of finding citation context: MicroF=0.992; MacroF=0.75
• Overall result: MicroF=0.8; MacroF=0.68
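The gap between the micro- and macro-averaged F-scores reported above is typical of skewed class distributions. The slides do not state the formulas used, so the sketch below assumes the standard definitions: micro-F pools true positives, false positives and false negatives over all classes, while macro-F averages the per-class F-scores. The toy labels are invented for illustration and are not from the corpus.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, predicted):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    labels = sorted(set(gold) | set(predicted))
    if not labels:
        return 0.0, 0.0
    counts = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, predicted))
        fp = sum(g != label and p == label for g, p in zip(gold, predicted))
        fn = sum(g == label and p != label for g, p in zip(gold, predicted))
        counts.append((tp, fp, fn))
    macro = sum(f1(*c) for c in counts) / len(labels)
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    return micro, macro


# Invented toy data: a classifier that always predicts the majority class.
gold = ["objective"] * 8 + ["negative"] * 2
predicted = ["objective"] * 10
micro, macro = micro_macro_f1(gold, predicted)
```

In the single-label setting micro-F equals accuracy, so a frequent class can dominate it. Macro-F weights every class equally and therefore drops when rare classes are classified poorly.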
Thank you! Questions?