Creating Subjective and Objective Sentence Classifiers from Unannotated Texts
Ellen Riloff, University of Utah
(Joint work with Janyce Wiebe at the University of Pittsburgh)
What is Subjectivity? • Subjective language includes opinions, rants, allegations, accusations, suspicions, and speculation. • Distinguishing factual information from subjective information could benefit many applications, including: • information extraction • question answering • summarization • spam filtering
Previous Work on Subjectivity Classification • Document-level subjectivity classification (e.g., [Turney 2002; Pang et al. 2002; Spertus 1997]) But most documents contain subjective and objective sentences. [Wiebe et al. 01] reported that 44% of sentences in their news corpus were subjective! • Sentence-level subjectivity classification [Dave et al. 2003; Yu et al. 2003; Riloff, Wiebe, & Wilson 2003]
Goals of our research
• Create classifiers that label sentences as subjective or objective.
• Learn subjectivity and objectivity clues from unannotated corpora.
• Use information extraction techniques to learn subjective nouns.
• Use information extraction techniques to learn subjective and objective patterns.
Outline of Talk
• Learning subjective nouns with extraction patterns
• Automatically generating training data with high-precision classifiers
• Learning subjective and objective extraction patterns
• Naïve Bayes classification and self-training
Information Extraction
• Information extraction (IE) systems identify facts related to a domain of interest.
• Extraction patterns are lexico-syntactic expressions that identify the role of an object. For example:
  <subject> was killed
  assassinated <dobj>
  murder of <np>
Learning Subjective Nouns
• Goal: learn subjective nouns from unannotated texts.
• Method: apply IE-based bootstrapping algorithms that were designed to learn semantic categories.
• Hypothesis: extraction patterns can identify subjective contexts that co-occur with subjective nouns.
• Example: “expressed <dobj>” → concern, hope, support
Extraction Examples
• expressed <dobj> → condolences, hope, grief, views, worries
• indicative of <np> → compromise, desire, thinking
• inject <dobj> → vitality, hatred
• reaffirmed <dobj> → resolve, position, commitment
• voiced <dobj> → outrage, support, skepticism, opposition, gratitude, indignation
• show of <np> → support, strength, goodwill, solidarity
• <subj> was shared → anxiety, view, niceties, feeling
Meta-Bootstrapping [Riloff & Jones 99]
[Diagram of the bootstrapping loop: unannotated texts → best extraction pattern (e.g., expressed <dobj>) → extractions (nouns) (e.g., happiness, relief, condolences) → added to the semantic lexicon (e.g., hope, grief, joy, concern, worries) → repeat]
Basilisk [Thelen & Riloff 02]
[Diagram of the bootstrapping loop: the corpus yields extraction patterns and their extractions; seed words initialize the semantic lexicon; the best patterns enter the Pattern Pool; their extractions form the Candidate Word Pool; the 5 best candidate words are added to the semantic lexicon, and the cycle repeats]
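To make the loop concrete, here is a minimal Python sketch of a Basilisk-style bootstrapping cycle, assuming the patterns and their extracted nouns have already been collected into a dictionary. The scoring functions follow the RlogF and AvgLog formulas spelled out on the later Basilisk slides; the data structures and fixed parameter values are illustrative assumptions, not the original implementation (which, for example, grows the pattern pool across cycles).

```python
# Sketch of a Basilisk-style bootstrapping loop (illustrative assumptions, not the original code).
import math

def rlogf(pattern, lexicon, extractions):
    """RlogF = (F / N) * log2(F): F = category members extracted, N = total nouns extracted."""
    nouns = extractions[pattern]
    f = sum(1 for n in nouns if n in lexicon)
    return (f / len(nouns)) * math.log2(f) if f > 0 else 0.0

def avglog(word, pool, lexicon, extractions):
    """Average of log2(F_j + 1) over the pooled patterns that extracted the word."""
    hits = [p for p in pool if word in extractions[p]]
    if not hits:
        return 0.0
    return sum(math.log2(sum(1 for n in extractions[p] if n in lexicon) + 1)
               for p in hits) / len(hits)

def basilisk(seed_words, extractions, cycles=400, pool_size=20, words_per_cycle=5):
    """extractions: dict mapping each extraction pattern to the list of nouns it extracted."""
    lexicon = set(seed_words)
    for _ in range(cycles):
        # 1. Score every pattern and put the best ones into the Pattern Pool.
        pool = sorted(extractions, key=lambda p: rlogf(p, lexicon, extractions),
                      reverse=True)[:pool_size]
        # 2. The nouns extracted by pooled patterns form the Candidate Word Pool.
        candidates = {w for p in pool for w in extractions[p]} - lexicon
        # 3. Add the 5 best candidate words to the semantic lexicon and repeat.
        best = sorted(candidates, key=lambda w: avglog(w, pool, lexicon, extractions),
                      reverse=True)[:words_per_cycle]
        lexicon.update(best)
    return lexicon
```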
Subjective Seed Words
cowardice, crap, delight, disdain, dismay, embarrassment, fool, gloom, grievance, happiness, hatred, hell, hypocrisy, love, nonsense, outrage, sigh, slander, twit, virtue
Subjective Noun Results
• Bootstrapping corpus: 950 unannotated FBIS documents (English-language foreign news).
• We ran each bootstrapping algorithm for 400 cycles, generating ~2000 words.
• We manually reviewed the words and labeled them as strongly subjective or weakly subjective.
• Together, the two algorithms learned 1052 subjective nouns (454 strong, 598 weak).
Examples of Strong Subjective Nouns
anguish, antagonism, apologist, atrocities, barbarian, belligerence, bully, condemnation, denunciation, devil, diatribe, evil, exaggeration, exploitation, fallacies, genius, goodwill, humiliation, ill-treatment, injustice, innuendo, insinuation, liar, mockery, pariah, repudiation, revenge, rogue, sanctimonious, scum, smokescreen, sympathy, tyranny, venom
Examples of Weak Subjective Nouns
aberration, allusion, apprehensions, assault, beneficiary, benefit, blood, controversy, credence, distortion, drama, eternity, eyebrows, failures, inclination, intrigue, liability, likelihood, peaceful, persistent, plague, pressure, promise, rejection, resistant, risk, sincerity, slump, spirit, success, tolerance, trick, trust, unity
Outline of Talk
• Learning subjective nouns with extraction patterns
• Automatically generating training data with high-precision classifiers
• Learning subjective and objective extraction patterns
• Naïve Bayes classification and self-training
Initial Training Data Creation
[Diagram: unlabeled texts and subjective clues feed a rule-based subjective sentence classifier and a rule-based objective sentence classifier, which produce labeled subjective & objective sentences]
Subjective Clues
• entries from manually developed resources [Levin 93; Ballmer & Brennenstuhl 81]
• FrameNet lemmas with frame element Experiencer [Baker et al. 98]
• adjectives manually annotated for polarity [Hatzivassiloglou & McKeown 97]
• n-grams learned from corpora [Dave et al. 03; Wiebe et al. 01]
• words distributionally similar to subjective seed words [Wiebe 00]
• subjective nouns learned from extraction pattern bootstrapping [Riloff et al. 03]
Creating High-Precision Rule-Based Classifiers
GOAL: use subjectivity clues from previous research to build a high-precision (low-recall) rule-based classifier.
• A sentence is subjective if it contains ≥ 2 strong subjective clues.
• A sentence is objective if:
  – it contains no strong subjective clues,
  – the previous and next sentences combined contain ≤ 1 strong subjective clue, and
  – the current, previous, and next sentences together contain ≤ 2 weak subjective clues.
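A minimal sketch of these rules, assuming the strong and weak clue sets have been loaded from the resources on the previous slide and that clue matching is simplified to single-word lookup (the real clues also include n-grams and annotated lemmas):

```python
# Sketch of the high-precision rule-based labeling; sentences that satisfy
# neither rule are left unlabeled (None) and excluded from the training data.

def label_sentence(i, sentences, strong_clues, weak_clues):
    """Return 'subjective', 'objective', or None for sentence i (a list of tokens)."""
    def count(clues, *indices):
        return sum(1 for j in indices if 0 <= j < len(sentences)
                   for w in sentences[j] if w in clues)

    if count(strong_clues, i) >= 2:
        return "subjective"
    if (count(strong_clues, i) == 0
            and count(strong_clues, i - 1, i + 1) <= 1
            and count(weak_clues, i - 1, i, i + 1) <= 2):
        return "objective"
    return None
```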
Data Set
• The MPQA Corpus contains 535 FBIS texts that have been manually annotated for subjectivity.
• Our test set consisted of 9,289 sentences from the MPQA corpus.
• We consider a sentence to be subjective if it has at least one private state of strength medium or higher.
• 54.9% of the sentences in our test set are subjective.
Accuracy of Rule-Based Classifiers

            SubjRec   SubjPrec   SubjF
Subj RBC      34.2      90.4     46.6

            ObjRec    ObjPrec    ObjF
Obj RBC       30.7      82.4     44.7
Generated Data
• We applied the rule-based classifiers to 298,809 sentences from (unannotated) FBIS documents.
• 52,918 were labeled subjective.
• 47,528 were labeled objective.
• Result: a training set of over 100,000 labeled sentences!
Outline of Talk
• Learning subjective nouns with extraction patterns
• Automatically generating training data with high-precision classifiers
• Learning subjective and objective extraction patterns
• Naïve Bayes classification and self-training
Representing Subjective Expressions with Extraction Patterns
Extraction patterns can represent linguistic expressions that are not fixed word sequences.
• drove [NP] up the wall
  – drove him up the wall
  – drove George Bush up the wall
  – drove George Herbert Walker Bush up the wall
• step on [modifiers] toes
  – step on her toes
  – step on the mayor’s toes
  – step on the newly elected mayor’s toes
• gave [NP] a [modifiers] look
  – gave his annoying sister a really really mean look
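As a rough illustration of why a single pattern covers all of these variants, the snippet below approximates the first pattern with a regular expression over raw text. This is only a simplification for illustration; the actual extraction patterns match against shallow-parser output rather than surface strings.

```python
# Regex approximation of "drove [NP] up the wall", just to show how one
# pattern covers noun phrases of any length (not how the real system matches).
import re

pattern = re.compile(r"\bdrove\s+(?P<np>(?:\w+\s+){1,6}?)up the wall")

for s in ["He drove him up the wall.",
          "It drove George Herbert Walker Bush up the wall."]:
    m = pattern.search(s)
    if m:
        print(m.group("np").strip())   # him / George Herbert Walker Bush
```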
The Extraction Pattern Learner
• We used AutoSlog-TS [Riloff 96] to learn extraction patterns.
• AutoSlog-TS needs relevant and irrelevant texts as input; the subjective sentences were called relevant, and the objective sentences were called irrelevant.
• Statistics are generated measuring each pattern’s association with the relevant texts.
Syntactic templates and example extraction patterns:
• active-vp <dobj> → endorsed <dobj>
• infinitive <dobj> → to condemn <dobj>
• active-vp infinitive <dobj> → get to know <dobj>
• passive-vp infinitive <dobj> → was meant to show <dobj>
• subject auxiliary <dobj> → fact is <dobj>
• active-vp prep <np> → agrees with <np>
• passive-vp prep <np> → was worried about <np>
• infinitive prep <np> → to resort to <np>
• noun prep <np> → opinion on <np>
• <subject> passive-vp → <subj> was satisfied
• <subject> active-vp → <subj> complained
• <subject> active-vp dobj → <subj> dealt blow
• <subject> active-vp infinitive → <subj> appears to be
• <subject> passive-vp infinitive → <subj> was thought to be
• <subject> auxiliary dobj → <subj> has position
AutoSlog-TS (Step 1)
[Diagram: relevant and irrelevant texts are parsed and the syntactic templates are applied to every noun phrase. Example sentence: “[The World Trade Center], [an icon] of [New York City], was intentionally attacked very early on [September 11, 2001].” Extraction patterns generated: <subj> was attacked, icon of <np>, was attacked on <np>]
AutoSlog-TS (Step 2)
[Diagram: the candidate extraction patterns are applied to the relevant and irrelevant texts, and frequency statistics are computed for each pattern.]

Extraction Pattern        Freq   Prob
<subj> was attacked        100    .90
icon of <np>                 5    .20
was attacked on <np>        80    .79
Identifying Subjective and Objective Patterns
AutoSlog-TS generates 2 statistics for each pattern:
  F = pattern frequency
  P = relevant frequency / pattern frequency
• We call a pattern subjective if F ≥ 5 and P ≥ .95 (6364 subjective patterns were learned).
• We call a pattern objective if F ≥ 5 and P ≤ .15 (832 objective patterns were learned).
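A small sketch of this selection step, assuming the AutoSlog-TS statistics are available as (pattern, frequency, relevant-frequency) tuples; the tuple format is an assumption made for illustration.

```python
# Split learned patterns into subjective and objective sets using the
# thresholds on the slide: F >= 5 and P >= .95 (subjective), P <= .15 (objective).

def split_patterns(stats):
    subjective, objective = [], []
    for pattern, freq, rel_freq in stats:
        if freq < 5:
            continue
        p = rel_freq / freq          # P = relevant frequency / pattern frequency
        if p >= 0.95:
            subjective.append(pattern)
        elif p <= 0.15:
            objective.append(pattern)
    return subjective, objective
```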
Examples of Learned Extraction Patterns
• Objective Patterns: <subj> increased production, <subj> took effect, delegation from <np>, occurred on <np>, plans to produce <dobj>
• Subjective Patterns: <subj> believes, <subj> was convinced, aggression against <np>, to express <dobj>, support for <np>
Patterns with Interesting Behavior

PATTERN                   FREQ   P(Subj | Pattern)
<subj> asked               128     .63
<subj> was asked            11    1.0
<subj> was expected         45     .42
was expected from <np>       5    1.0
<subj> put                 187     .67
<subj> put end              10     .90
<subj> talk                 28     .71
talk of <np>                10     .90
<subj> is talk               5    1.0
<subj> is fact              38    1.0
fact is <dobj>              12    1.0
Augmenting the Rule-Based Classifiers with Extraction Patterns

                       SubjRec   SubjPrec   SubjF
Subj RBC                 34.2      90.4     46.6
Subj RBC w/Patterns      58.6      80.9     68.0

                       ObjRec    ObjPrec    ObjF
Obj RBC                  30.7      82.4     44.7
Obj RBC w/Patterns       33.5      82.1     47.6
Outline of Talk
• Learning subjective nouns with extraction patterns
• Automatically generating training data with high-precision classifiers
• Learning subjective and objective extraction patterns
• Naïve Bayes classification and self-training
Naïve Bayes Classifier
We created an NB classifier using the initial training set and several set-valued features:
• strong & weak subjective clues from the RBCs
• subjective & objective extraction patterns
• POS tags (pronouns, modals, adjectives, cardinal numbers, adverbs)
• separate features for each of the current, previous, and next sentences
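An illustrative sketch of how such features might be assembled, using scikit-learn as a stand-in for the original classifier. The feature names, the substring-based pattern matching, and the exact POS tag set are assumptions made for the example, not the original implementation.

```python
# Counts of clues, learned patterns, and selected POS tags, with separate
# features for the previous, current, and next sentences.
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB

POS_OF_INTEREST = ("PRP", "MD", "JJ", "CD", "RB")  # pronoun, modal, adjective, cardinal, adverb

def features(i, sentences, strong_clues, weak_clues, subj_pats, obj_pats):
    """sentences[j] is a (tokens, pos_tags) pair for sentence j."""
    feats = {}
    for offset, name in [(-1, "prev"), (0, "cur"), (1, "next")]:
        j = i + offset
        if not 0 <= j < len(sentences):
            continue
        tokens, tags = sentences[j]
        text = " ".join(tokens)
        feats[f"{name}_strong"] = sum(w in strong_clues for w in tokens)
        feats[f"{name}_weak"] = sum(w in weak_clues for w in tokens)
        feats[f"{name}_subjpat"] = sum(p in text for p in subj_pats)  # simplified matching
        feats[f"{name}_objpat"] = sum(p in text for p in obj_pats)
        for t in POS_OF_INTEREST:
            feats[f"{name}_{t}"] = tags.count(t)
    return feats

# Typical use: vectorize the feature dicts and train on the rule-labeled sentences.
# X = DictVectorizer().fit_transform(all_feature_dicts)
# model = MultinomialNB().fit(X, labels)
```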
Naïve Bayes Training
[Diagram: the training set feeds the extraction pattern learner, which produces subjective and objective patterns; these patterns, the subjective clues, and POS features are then used for Naïve Bayes training]
Naïve Bayes Results

                      SubjRec   SubjPrec   SubjF
Naïve Bayes             70.6      79.4     74.7
RWW03 (supervised)      77        81       79

                      ObjRec    ObjPrec    ObjF
Naïve Bayes             77.6      68.4     72.7
RWW03 (supervised)      74        70       72
Self-Training Process
[Diagram: the training set feeds the extraction pattern learner and Naïve Bayes training; the resulting Naïve Bayes classifier (using subjective clues, subjective & objective patterns, and POS features) labels the unlabeled sentences; the best N sentences are added back to the training set and the process repeats]
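A minimal sketch of the self-training loop in the diagram, again with scikit-learn's Naïve Bayes as a stand-in and dense feature matrices assumed for simplicity; the step where extraction patterns are relearned from the enlarged training set is omitted here.

```python
# One or more self-training iterations: train, label the most confident
# unlabeled sentences, add them to the training set, and retrain.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def self_train(X_train, y_train, X_unlabeled, n_best=1000, iterations=1):
    for _ in range(iterations):
        model = MultinomialNB().fit(X_train, y_train)
        proba = model.predict_proba(X_unlabeled)
        pred = model.classes_[proba.argmax(axis=1)]   # predicted labels
        conf = proba.max(axis=1)                      # classifier confidence
        best = np.argsort(-conf)[:n_best]             # N most confident sentences
        # Move the newly labeled sentences into the training set and retrain.
        X_train = np.vstack([X_train, X_unlabeled[best]])
        y_train = np.concatenate([y_train, pred[best]])
        X_unlabeled = np.delete(X_unlabeled, best, axis=0)
    return MultinomialNB().fit(X_train, y_train)
```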
Self-Training Results

                      SubjRec   SubjPrec   SubjF
Subj RBC w/Patts 1      58.6      80.9     68.0
Subj RBC w/Patts 2      62.4      80.4     70.3
Naïve Bayes 1           70.6      79.4     74.7
Naïve Bayes 2           86.3      71.3     78.1
RWW03 (supervised)      77        81       79

                      ObjRec    ObjPrec    ObjF
Obj RBC w/Patts 1       33.5      82.1     47.6
Obj RBC w/Patts 2       34.8      82.6     49.0
Naïve Bayes 1           77.6      68.4     72.7
Naïve Bayes 2           57.6      77.5     66.1
RWW03 (supervised)      74        70       72
Conclusions
• We can build effective subjective sentence classifiers using only unannotated texts.
• Extraction pattern bootstrapping can learn subjective nouns.
• Extraction patterns can represent richer subjective expressions.
• Learning methods can discover subtle distinctions between very similar expressions.
Related Work • Genre classification (e.g., [Karlgren and Cutting 1994; Kessler et al. 1997; Wiebe et al. 2001]) • Learning adjectives, adj. phrases, verbs, and N-grams [Turney 2002; Hatzivassiloglou & McKeown 1997; Wiebe et al. 2001] • Semantic lexicon learning [Hearst 1992; Riloff & Shepherd 1997; Roark & Charniak 1998; Caraballo 1999] • Meta-Bootstrapping [Riloff & Jones 99] • Basilisk [Thelen & Riloff 02]
What is Information Extraction?
Extracting facts relevant to a specific topic from narrative text.
Example domains:
• Terrorism: perpetrator, victim, target, date, location
• Management succession: person fired, successor, position, organization, date
• Infectious disease outbreaks: disease, organism, victim, symptoms, location, date
Information Extraction from Narrative Text
Role relationships define the information of interest … keywords and named entities are not sufficient.
• “Troops were vaccinated against anthrax, cholera, …”
• “Researchers have discovered how anthrax toxin destroys cells and rapidly causes death …”
Ranking and Manual Review
The patterns are ranked using the metric:

    RlogF(pattern_i) = (F_i / N_i) * log2(F_i)

where F_i is the number of instances of pattern_i in relevant texts, and N_i is the number of instances of pattern_i in all texts.

A domain expert reviews the top-ranked patterns and assigns thematic roles to the good ones.
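To illustrate the metric with the pattern statistics from the AutoSlog-TS example slide: <subj> was attacked (90 of its 100 instances in relevant texts) scores RlogF = (90/100) * log2(90) ≈ 0.9 * 6.49 ≈ 5.8, whereas icon of <np> (1 relevant instance out of 5) scores (1/5) * log2(1) = 0, so frequent, highly relevant patterns rise to the top of the ranking.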
Semantic Lexicons
A semantic lexicon assigns categories to words, e.g., politician → human, truck → vehicle, grenade → weapon.
• Semantic dictionaries are hard to come by, especially for specialized domains.
• WordNet [Miller 90] is popular but is not always sufficient. [Roark & Charniak 98] found that 3 of every 5 words learned by their system were not present in WordNet.
The Bootstrapping Era
[Diagram: Unannotated Texts + … = KNOWLEDGE!]
Meta-Bootstrapping
[Diagram of the bootstrapping loop: unannotated texts → best extraction pattern (e.g., outbreak of <np>) → extractions (nouns) (e.g., smallpox, tularemia, botulism) → added to the semantic lexicon (e.g., anthrax, ebola, cholera, flu, plague) → repeat]
Semantic Lexicon (NP) Results

Iter   Company (Web)   Location (Web)   Title (Web)    Location (Terror)   Weapon (Terror)
 1     5/5 (1.0)       5/5 (1.0)        0/1 (0)        5/5 (1.0)           4/4 (1.0)
10     25/32 (.78)     46/50 (.92)      22/31 (.71)    32/50 (.64)         31/44 (.70)
20     52/65 (.80)     88/100 (.88)     63/81 (.78)    66/100 (.66)        68/94 (.72)
30     72/113 (.64)    129/150 (.86)    86/131 (.66)   100/150 (.67)       85/144 (.59)
Basilisk
[Diagram of the bootstrapping loop: the corpus yields extraction patterns and their extractions; seed words initialize the semantic lexicon; the best patterns enter the Pattern Pool; their extractions form the Candidate Word Pool; the 5 best candidate words are added to the semantic lexicon, and the cycle repeats]
The Pattern Pool
Every extraction pattern is scored and the best patterns are put into a Pattern Pool. The scoring function is:

    RlogF(pattern_i) = (F_i / N_i) * log2(F_i)

where F_i is the number of category members extracted by pattern_i, and N_i is the total number of nouns extracted by pattern_i.
Scoring Candidate Words
Each candidate word is scored by:
1. collecting all patterns that extracted it
2. computing the average number of category members extracted by those patterns:

    AvgLog(word_i) = ( Σ_{j=1..N_i} log2(F_j + 1) ) / N_i

where N_i is the number of patterns that extracted word_i, and F_j is the number of category members extracted by pattern_j.
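As a hypothetical illustration with made-up numbers: if a candidate word was extracted by N_i = 3 pooled patterns that extracted 7, 3, and 1 known category members respectively, then AvgLog = (log2(8) + log2(4) + log2(2)) / 3 = (3 + 2 + 1) / 3 = 2.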