Learning Subjective Nouns using Extraction Pattern Bootstrapping
Ellen Riloff, School of Computing, University of Utah
Janyce Wiebe, Theresa Wilson, Computing Science, University of Pittsburgh
CoNLL-03
Introduction (1/2)
• Many Natural Language Processing applications can benefit from being able to distinguish between factual and subjective information.
• Subjective remarks come in a variety of forms, including opinions, rants, allegations, accusations, and speculation.
• Question answering systems should distinguish between factual and speculative answers.
• Multi-document summarization systems need to summarize different opinions and perspectives.
• Spam filtering systems must recognize rants and emotional tirades, among other things.
Introduction (2/2)
• In this paper, we use the Meta-Bootstrapping (Riloff and Jones 1999) and Basilisk (Thelen and Riloff 2002) algorithms to learn lists of subjective nouns.
• Both bootstrapping algorithms automatically generate extraction patterns to identify words belonging to a semantic category.
• We hypothesize that extraction patterns can also identify subjective words.
• The pattern "expressed <direct_object>" often extracts subjective nouns, such as "concern", "hope", and "support".
• Both bootstrapping algorithms require only a handful of seed words and unannotated texts for training; no annotated data is needed at all.
Annotation Scheme
• The goal of the annotation scheme is to identify and characterize expressions of private states in a sentence.
• Private state is a general covering term for opinions, evaluations, emotions, and speculations.
• "The time has come, gentlemen, for Sharon, the assassin, to realize that injustice cannot last long" -> the writer expresses a negative evaluation.
• Annotators are also asked to judge the strength of each private state. A private state can have low, medium, high, or extreme strength.
Corpus, Agreement Results
• Our data consist of English-language versions of foreign news documents from FBIS.
• The annotated corpus used to train and test our subjective classifiers (the experiment corpus) consists of 109 documents with a total of 2197 sentences.
• We use a separate, annotated tuning corpus to establish experiment parameters.
Extraction Patterns
• In the last few years, two bootstrapping algorithms have been developed to create semantic dictionaries by exploiting extraction patterns.
• Extraction patterns represent lexico-syntactic expressions that typically rely on shallow parsing and syntactic role assignment, e.g., "<subject> was hired".
• A bootstrapping process looks for words that appear in the same extraction patterns as the seeds and hypothesizes that those words belong to the same semantic category.
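To make the notion of an extraction pattern concrete, here is a minimal sketch (not the implementation used in the paper) of how a pattern such as "<subject> was hired" or "expressed <direct_object>" can be matched against shallow-parsed text; the tuple representation, function names, and toy corpus are assumptions made purely for illustration.

```python
# A minimal sketch of extraction-pattern matching. It assumes sentences have
# already been shallow-parsed into (noun_phrase_head, syntactic_role, verb)
# tuples; a pattern such as "<subject> was hired" is represented as the pair
# (role, verb), and the NP head filling that role is the extracted word.
from collections import defaultdict

def extract(parsed_tuples, patterns):
    """Map each pattern to the set of noun heads it extracts from the corpus."""
    extractions = defaultdict(set)
    for np_head, role, verb in parsed_tuples:
        if (role, verb) in patterns:
            extractions[(role, verb)].add(np_head)
    return extractions

# Hypothetical shallow-parsed corpus: (NP head, syntactic role, governing verb).
corpus = [
    ("manager", "subject_passive", "hired"),
    ("concern", "direct_object", "expressed"),
    ("hope", "direct_object", "expressed"),
]
patterns = {("subject_passive", "hired"), ("direct_object", "expressed")}

print(dict(extract(corpus, patterns)))
# {('subject_passive', 'hired'): {'manager'},
#  ('direct_object', 'expressed'): {'concern', 'hope'}}
```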
Meta-Bootstrapping (1/2)
• The Meta-Bootstrapping process begins with a small set of seed words that represent a targeted semantic category (e.g., "seashore" is a location) and an unannotated corpus.
• Step 1: MetaBoot automatically creates a set of extraction patterns for the corpus by applying syntactic templates.
• Step 2: MetaBoot computes a score for each pattern based on the number of seed words among its extractions.
• The best pattern is saved, and all of its extracted noun phrases are automatically labeled with the targeted semantic category.
Meta-Bootstrapping (2/2)
• MetaBoot then re-scores the extraction patterns, using the original seed words plus the newly labeled words, and the process repeats (mutual bootstrapping).
• When the mutual bootstrapping process is finished, all nouns that were put into the semantic dictionary are re-evaluated.
• Each noun is assigned a score based on how many different patterns extracted it.
• Only the five best nouns are allowed to remain in the dictionary.
• The mutual bootstrapping process then starts over again using the revised semantic dictionary.
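The two-level structure described above can be sketched roughly as follows; this is a simplified illustration rather than the original implementation: the pattern score is reduced to a raw seed-hit count instead of the RlogF-style score of Riloff and Jones (1999), and the extraction table is assumed to map each pattern to the set of nouns it extracts (as in the previous sketch).

```python
# Simplified sketch of Meta-Bootstrapping. `extractions` maps each pattern to
# the set of nouns it extracts; pattern scoring is reduced to a seed-hit count.

def mutual_bootstrap(extractions, seeds, iterations=10):
    """Inner (mutual bootstrapping) loop: repeatedly pick the best pattern and
    label everything it extracts with the targeted semantic category."""
    lexicon = set(seeds)
    used = set()
    for _ in range(iterations):
        scored = [(len(words & lexicon), p)
                  for p, words in extractions.items() if p not in used]
        if not scored:
            break
        best_score, best_pattern = max(scored)
        if best_score == 0:
            break
        used.add(best_pattern)
        lexicon |= extractions[best_pattern]   # label all of its extractions
    return lexicon

def retain_best_nouns(extractions, lexicon, seeds, n=5):
    """Outer (meta) step: keep only the n nouns extracted by the most distinct
    patterns, then restart mutual bootstrapping with the revised dictionary."""
    counts = {w: sum(w in words for words in extractions.values())
              for w in lexicon - set(seeds)}
    best = sorted(counts, key=counts.get, reverse=True)[:n]
    return set(seeds) | set(best)
```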
Basilisk (1/2)
• Step 1: Basilisk automatically creates a set of extraction patterns for the corpus and scores each pattern based on the number of seed words among its extractions. Basilisk puts the best patterns into a pattern pool.
• Step 2: All nouns extracted by a pattern in the pattern pool are put into a candidate word pool. Basilisk scores each noun based on the set of patterns that extracted it and their collective association with the seed words.
• Step 3: The top 10 nouns are labeled with the targeted semantic class and are added to the dictionary.
Basilisk (2/2)
• The bootstrapping process then repeats, using the original seeds plus the newly labeled words.
• The major difference between Basilisk and Meta-Bootstrapping:
• Basilisk scores each noun based on collective information gathered from all patterns that extracted it.
• Meta-Bootstrapping identifies a single best pattern and assumes that everything it extracts belongs to the same semantic category.
• In comparative experiments, Basilisk outperformed Meta-Bootstrapping.
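A rough sketch of one Basilisk iteration is shown below; the AvgLog-style candidate score is only an approximation of the scoring function in Thelen and Riloff (2002), and the pool sizes and names are illustrative.

```python
# Rough sketch of one Basilisk iteration. `extractions` maps each pattern to
# the set of nouns it extracts; candidates are scored by their collective
# association with the seeds across all pooled patterns that extracted them.
import math

def basilisk_iteration(extractions, lexicon, pool_size=20, words_per_iter=10):
    # Step 1: score patterns by seed hits and keep the best in a pattern pool.
    pattern_scores = {p: len(words & lexicon) for p, words in extractions.items()}
    pool = sorted(pattern_scores, key=pattern_scores.get, reverse=True)[:pool_size]

    # Step 2: every noun extracted by a pooled pattern enters the candidate pool.
    candidates = set().union(*(extractions[p] for p in pool)) - lexicon

    # Step 3: score each candidate by the average log of seed hits over the
    # pooled patterns that extracted it (AvgLog-style), and label the top N.
    def avglog(word):
        hits = [math.log2(len(extractions[p] & lexicon) + 1)
                for p in pool if word in extractions[p]]
        return sum(hits) / len(hits)

    best = sorted(candidates, key=avglog, reverse=True)[:words_per_iter]
    return lexicon | set(best)
```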
Experimental Results (1/2)
• We create the bootstrapping corpus by gathering 950 new texts from FBIS, and we manually select 20 high-frequency words as seed words.
• We run each bootstrapping algorithm for 400 iterations, generating 5 words per iteration. Basilisk generates 2000 nouns and Meta-Bootstrapping generates 1996 nouns.
Experimental Results (2/2)
• Next, we manually review the 3996 words proposed by the algorithms and classify each word as StrongSubjective, WeakSubjective, or Objective.
• (Figure: X-axis = the number of words generated; Y-axis = the percentage of those words that were manually classified as subjective.)
Subjective Classifier (1/3)
• To evaluate the subjective nouns, we train a Naïve Bayes classifier using the nouns as features. We also incorporate previously established subjectivity clues and add some new discourse features.
• Subjective Noun Features:
• We define four features, BA-Strong, BA-Weak, MB-Strong, and MB-Weak, to represent the sets of subjective nouns produced by the bootstrapping algorithms.
• For each set, we create a three-valued feature based on the presence of 0, 1, or >=2 words from that set.
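As a concrete illustration of the three-valued features, the small sketch below bins a sentence by whether it contains 0, 1, or 2+ words from a given noun set; the set contents and variable names are hypothetical.

```python
# Sketch of a three-valued subjective-noun feature: the value records whether
# a sentence contains 0, 1, or >=2 words from a given noun set (e.g. BA-Strong).

def noun_set_feature(sentence_tokens, noun_set):
    count = sum(1 for tok in sentence_tokens if tok.lower() in noun_set)
    return 0 if count == 0 else (1 if count == 1 else 2)

ba_strong = {"concern", "outrage", "hope"}            # hypothetical contents
sentence = ["they", "expressed", "concern", "and", "hope"]
print(noun_set_feature(sentence, ba_strong))          # -> 2
```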
Subjective Classifier (2/3)
• WBO Features:
• Features from Wiebe, Bruce, and O'Hara (1999), a machine learning system for classifying subjective sentences.
• Manual Features:
• Entries from Levin (1993) and Ballmer and Brennenstuhl (1981).
• FrameNet lemmas with frame element "experiencer" (Baker et al. 1998).
• Adjectives manually annotated for polarity (Hatzivassiloglou and McKeown 1997).
• Subjective clues listed in Wiebe (1990).
Subjective Classifier (3/3)
• Discourse Features:
• We use discourse features to capture the density of clues in the text surrounding a sentence.
• First, we compute the average number of subjective clues and objective clues per sentence.
• Next, we characterize the number of subjective and objective clues in the previous and next sentence as higher than expected (high), lower than expected (low), or as expected (medium).
• We also define a feature for sentence length.
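One way the previous/next-sentence clue features could be computed is sketched below; the binning thresholds relative to the per-sentence average are an assumption, since the exact cut-offs are not given here.

```python
# Sketch of the discourse features: clue counts in the previous and next
# sentence are binned as low / medium / high relative to the document's
# per-sentence average. The tolerance threshold is an assumed parameter.

def bin_count(count, average, tolerance=0.5):
    if count > average + tolerance:
        return "high"
    if count < average - tolerance:
        return "low"
    return "medium"

def discourse_features(clue_counts, i):
    """clue_counts[i] = number of subjective clues found in sentence i."""
    avg = sum(clue_counts) / len(clue_counts)
    prev = clue_counts[i - 1] if i > 0 else 0
    nxt = clue_counts[i + 1] if i + 1 < len(clue_counts) else 0
    return {"prev": bin_count(prev, avg), "next": bin_count(nxt, avg)}

print(discourse_features([0, 3, 1, 4], i=2))   # -> {'prev': 'high', 'next': 'high'}
```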
Classification Results (1/3)
• We evaluate each classifier using 25-fold cross-validation on the experiment corpus and use a paired t-test to measure significance at the 95% confidence level.
• We compute Accuracy (Acc) as the percentage of sentences that match the gold standard, and Precision (Prec) and Recall (Rec) with respect to subjective sentences.
• Gold standard: a sentence is subjective if it contains at least one private-state expression of medium or higher strength.
• The objective class consists of everything else.
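For reference, these evaluation measures reduce to the standard definitions below, with precision and recall computed with respect to the subjective class; the label strings are illustrative.

```python
# Standard evaluation measures: accuracy over all sentences, and precision /
# recall computed with respect to the subjective class.

def evaluate(gold, predicted):
    assert len(gold) == len(predicted)
    correct = sum(g == p for g, p in zip(gold, predicted))
    tp = sum(g == "subj" and p == "subj" for g, p in zip(gold, predicted))
    pred_subj = sum(p == "subj" for p in predicted)
    gold_subj = sum(g == "subj" for g in gold)
    return {
        "accuracy": correct / len(gold),
        "precision": tp / pred_subj if pred_subj else 0.0,
        "recall": tp / gold_subj if gold_subj else 0.0,
    }

print(evaluate(["subj", "obj", "subj"], ["subj", "subj", "obj"]))
```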
Classification Results (2/3)
• We train a Naive Bayes classifier using only the SubjNoun features. This classifier achieves good precision (77%) but only moderate recall (64%).
• We find that the subjective nouns are good indicators when they appear, but not every subjective sentence contains a subjective noun.
Classification Results (3/3)
• There is a synergy between these feature sets: using both types of features achieves better performance than either one alone.
• In Table 8, Row 1, we use the WBO + SubjNoun + manual + discourse features. This classifier achieves 81.3% precision, 77.4% recall, and 76.1% accuracy.
Conclusion
• We demonstrate that weakly supervised bootstrapping techniques can learn subjective terms from unannotated texts.
• Bootstrapping algorithms can learn not only general semantic categories, but any category for which words appear in similar linguistic phrases.
• The experiments suggest that reliable subjectivity classification requires a broad array of features.