Extraction and Summarization of Opinions. Source Attitude Target. Negative Emotion. Intensity: High. Opinion Frame Source: Angolans Polarity: negative Attitude: emotion Intensity: high Target: Marburg virus.
Source Attitude Target Negative Emotion Intensity: High Opinion Frame Source: Angolans Polarity: negative Attitude: emotion Intensity: high Target: Marburg virus Subjectivity: opinions, emotions, motivations, speculations, sentiments • Information Extraction of • NL expressions • Components • Properties Example: Angolans are terrified of the Marburg virus
Fine-grained Opinions Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game. In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given. [Stoyanov & Cardie, 2006]
Opinion Frame Source: Australian Press Polarity: negative Attitude: sentiment Intensity: high Target: Italy Fine-grained Opinion Extraction “The Australian Press launched a bitter attack on Italy”
Opinion Summary [Figure: summary graph with nodes Australian Press, Socceroos, penalty, Italy, Marcello Lippi]
Opinion Frame Source: Polarity: Intensity: Direct Subjective Source: Polarity: Intensity: Direct Subjective Source: Polarity: Intensity: Summary Representation Disease Outbreak Victim: Location: Disease: Date: … Summarization of Opinions + Events
Why Opinions? • Provide technology that can aid analysts in • extracting socio-behavioral information from text • monitoring public health awareness, knowledge and speculations about disease outbreaks, … • Enrich Information Extraction, Question Answering, and Visualization tools
Opinion Frame Source: Polarity: negative Attitude: Intensity: high Target: E.g., are people extremely afraid or angry?
Opinion Frame Source: Polarity: Attitude: Intensity: Target: The industry is scared and so, even if they do find an ornamental carp with KHV, they will keep it secret Recognize motivations Predict actions
Opinion Frame Source: Polarity: Attitude: Intensity: Target: Ban on British beef Brugere-Picoux backs the decision to ban British Beef Search for opinions about particular named targets
Opinion Frame Source: Brugere-Picoux Polarity: Attitude: Intensity: Target: Brugere-Picoux backs the decision to ban British Beef Search for opinions held by particular named sources
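The two searches above (by named target and by named source) can be sketched over a small collection of opinion frames. This is only an illustration: the `OpinionFrame` class and the `search` helper are hypothetical names, not part of any system described in these slides, and slots left empty on the slides are left unset here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OpinionFrame:
    """One opinion frame; field names mirror the slots shown on the slides."""
    source: str
    target: str
    polarity: Optional[str] = None
    attitude: Optional[str] = None
    intensity: Optional[str] = None


def search(frames: List[OpinionFrame],
           source: Optional[str] = None,
           target: Optional[str] = None) -> List[OpinionFrame]:
    """Return frames matching the given source and/or target.

    Matching is case-insensitive exact match; None acts as a wildcard.
    """
    def matches(value: str, wanted: Optional[str]) -> bool:
        return wanted is None or value.lower() == wanted.lower()

    return [f for f in frames
            if matches(f.source, source) and matches(f.target, target)]


# Frames taken from the examples in the slides.
frames = [
    OpinionFrame("Angolans", "Marburg virus", "negative", "emotion", "high"),
    OpinionFrame("Australian Press", "Italy", "negative", "sentiment", "high"),
    OpinionFrame("Brugere-Picoux", "ban on British beef"),
]

held_by = search(frames, source="Brugere-Picoux")  # opinions held by a source
about = search(frames, target="Italy")             # opinions about a target
```

A real system would need source coreference and fuzzier target matching; exact string match is used here only to keep the sketch short.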
Motivation for the Summaries • Quickly determine the opinions of a person, organization, community, region, etc. • Quickly determine the opinions toward a person, organization, issue, event, … • Across an entire document • Across multiple documents • Over time • Reveal relationships and identify cliques and communities of interest • Complement work in social network analysis
Outline • Motivations for opinion extraction • Extracting opinion frames and components • Lexicon of subjective expressions • Contextual disambiguation • Enriched tasks • Opinion summarization
Lexicon • Explore different uses of words, to zero in on the subjective ones • Example: benefit
Lexicon • Example: benefit • Very often objective, as a Verb: Children with ADHD benefited from a 15-course of fish oil
Lexicon • Noun uses look more promising: The innovative economic program has shown benefits to humanity
Lexicon • However, there are objective noun uses too: …tax benefits. …employee benefits. …tax benefits to provide a stable economy. …health benefits to cut costs.
Lexicon • Pattern: benefits as the head of a noun phrase containing a prepositional phrase • Matches this: The innovative economic program has shown proven benefits to humanity • But none of these: …tax benefits. …employee benefits. …tax benefits to provide a stable economy. …health benefits to cut costs.
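A rough approximation of this pattern, assuming the text has already been part-of-speech tagged: instead of a full parse, the sketch checks that the noun "benefits" is followed by a preposition that is not the start of an infinitive. The function name and the hand-assigned Penn Treebank tags are assumptions made for illustration; the slides do not say how the pattern is actually implemented.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (word, Penn Treebank POS tag), tagged by hand below


def subjective_benefits(tokens: List[Tagged]) -> bool:
    """True if noun 'benefits' is followed by a preposition that opens a PP.

    Stand-in for 'head of an NP containing a PP': the token after
    'benefits' must be tagged IN or TO, and the token after that must not
    be a base-form verb (which would make 'to ...' an infinitive).
    """
    for i, (word, tag) in enumerate(tokens):
        if word.lower() != "benefits" or not tag.startswith("NN"):
            continue
        if i + 2 >= len(tokens):
            continue  # nothing that could form a PP follows
        next_tag = tokens[i + 1][1]
        after_tag = tokens[i + 2][1]
        if next_tag in ("IN", "TO") and after_tag != "VB":
            return True
    return False


match = [("shown", "VBN"), ("proven", "JJ"), ("benefits", "NNS"),
         ("to", "TO"), ("humanity", "NN")]
no_pp = [("tax", "NN"), ("benefits", "NNS"), (".", ".")]
infinitive = [("health", "NN"), ("benefits", "NNS"), ("to", "TO"),
              ("cut", "VB"), ("costs", "NNS")]

results = [subjective_benefits(t) for t in (match, no_pp, infinitive)]
```

With these hand-tagged inputs the first example matches and the other two do not, mirroring the slide's positive and negative cases.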
Lexicon: Longer Constructions • be soft on crime
<item index="1">
  <itemMorphoSyntax> <lemma>be</lemma></itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern"> <distance>2</distance> <landmark>2</landmark></itemRelation></item>
<item index="2">
  <itemMorphoSyntax> <word>soft</word> <majorClass>J</majorClass></itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern"> <distance>1</distance> <landmark>3</landmark></itemRelation></item>
<item index="3">
  <itemMorphoSyntax> <word>on</word></itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern"> <distance>1</distance> <landmark>4</landmark></itemRelation></item>
<item index="4">
  <itemMorphoSyntax> <word>crime</word> <majorClass>N</majorClass> </itemMorphoSyntax>
The entry contains a pattern for finding instances of the construction • Matches variations: • When I look into his past I see a man who is very soft on crime. • The data could also weaken her authority to criticize Patrick for being soft on crime.
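One way to read the entry is as a chain of token tests, each with a maximum distance to the next landmark. The sketch below follows that reading. The interpretation of `<distance>` as "the next item must occur within this many tokens" is an assumption, as are the names `BE_FORMS`, `SOFT_ON_CRIME`, and `find_pattern`; the part-of-speech constraints (`majorClass`) are ignored to keep the example free of a tagger.

```python
from typing import Callable, List, Tuple

# Surface forms standing in for lemma="be" (no lemmatizer is used here).
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

# Each item: (test on a lowercased token, max distance to the next item).
# The distance of the last item is unused.
Pattern = List[Tuple[Callable[[str], bool], int]]

SOFT_ON_CRIME: Pattern = [
    (lambda t: t in BE_FORMS, 2),
    (lambda t: t == "soft", 1),
    (lambda t: t == "on", 1),
    (lambda t: t == "crime", 0),
]


def find_pattern(tokens: List[str], pattern: Pattern) -> bool:
    """True if the pattern items occur in order within their distances."""
    toks = [t.lower().strip(".,;:!?") for t in tokens]

    def match_from(pos: int, item: int) -> bool:
        if not pattern[item][0](toks[pos]):
            return False
        if item == len(pattern) - 1:
            return True
        max_dist = pattern[item][1]
        last = min(pos + max_dist, len(toks) - 1)
        return any(match_from(nxt, item + 1)
                   for nxt in range(pos + 1, last + 1))

    return any(match_from(start, 0) for start in range(len(toks)))


s1 = "When I look into his past I see a man who is very soft on crime."
s2 = ("The data could also weaken her authority to criticize Patrick "
      "for being soft on crime.")
s3 = "The pillow is soft on one side."

hits = [find_pattern(s.split(), SOFT_ON_CRIME) for s in (s1, s2, s3)]
```

The distance of 2 on the first item is what lets "is very soft on crime" match while still rejecting longer gaps.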
Attributive information <entryAttributes origin="j"> <name>be soft on crime</name> <subjective>true</subjective> <reliability>h</reliability> <confidence>h</confidence> <subType>sen</subType> <example>The Obama campaign rejected the notion that the senator might be vulnerable to accusations that he is soft on crime.</example> <morphosyn>vp</morphosyn> <target>s</sp_target> <polarity>n</polarity> <intensity>m</intensity> <confidence>h</confidence> <regex>1:[morph:[lemma="be"] order:[distance="2" landmark="2"]] 2:[morph:[word="soft" majorClass="J"] order:[distance="1" landmark="3"]] 3:[morph:[word="on"] order:[distance="1" landmark="4"]] 4:[morph:[word="crime" majorClass="N"]]</regex> <patterntype>ngramPattern</patterntype>
Lexicon: Summary • Uniform representation for different types of subjectivity clues • Word stem: benefit • Word: benefits • Word/POS: benefits/nouns • Fixed n-grams: benefits to • Syntactic patterns • Combinations of the above • Learn subjective uses from corpora (bodies of texts) • Capture longer subjective constructions • Add relevant knowledge about expressions • Riloff, Wiebe, Wilson 2003; Riloff & Wiebe 2003; Wiebe & Riloff 2005; Riloff, Patwardhan, Wiebe 2006; Ruppenhofer, Akkaya, Wiebe in preparation
Outline • Motivations for opinion extraction • Extracting opinion frames and components • Lexicon of subjective expressions • Contextual disambiguation • Enriched tasks • Opinion summarization
Polarity • Contextual polarity • There is no reason at all to believe that he’s the right choice • Interacts with opinion topics • Example: argument for one type of design is simultaneously an argument against an alternative design
Polarity • Recognizing contextual polarity using rich feature sets and machine learning • Modeling and recognizing discourse relations among opinions and their targets in a text Wilson, Wiebe, Hoffmann EMNLP05 Wilson, Wiebe, Hoffmann, submitted Somasundaran, Wiebe, Ruppenhofer, submitted
Opinion Frame Extraction via CRFs and ILP • Joint extraction of entities and relations CRFs [Lafferty et al., 2001] [Roth & Yih, 2004] [Choi et al., EMNLP 2006]
Opinion-Frame Extraction • Joint extraction of entities and relations for opinion recognition (previous slide) • Choi, Breck, Cardie EMNLP 2006 • Linking sources referring to the same entity • Stoyanov and Cardie ACL 2006 Workshop on Sentiment and Subjectivity in Text • Identifying expressions of opinions in context • Breck, Choi, Cardie IJCAI 2007
Outline • Motivations for opinion extraction • Extracting opinion frames and components • Lexicon of subjective expressions • Contextual disambiguation • Enriched tasks • Opinion summarization
Targets and Attitude Types (Wilson PhD Dissertation 2008) I think people are happy because Chavez has fallen. direct subjective span: think source: <writer, I> attitude: direct subjective span: are happy source: <writer, I, People> attitude: inferred attitude span: are happy because Chavez has fallen type: neg sentiment intensity: medium target: attitude span: think type: positive arguing intensity: medium target: attitude span: are happy type: pos sentiment intensity: medium target: target span: people are happy because Chavez has fallen target span: Chavez has fallen target span: Chavez
Current Work: Topics • Topic annotations added to the MPQA corpus • Annotations indicate the closest phrase to the opinion expression that adequately describes the topic of the opinion • Include topic “coreference” chains to link all phrases that describe the same topic concept • IAG results Stoyanov and Cardie LREC 2008
Current Work: Topics • Topic coreference resolution • Treat as an NP coreference resolution task • Modify our existing NP coref approach • Initial results look promising • Using topic spans from gold standard • B3 = .709 • MUC = .917 • Topic span = opinion sentence • B3 = .573 • MUC = .914 • Topic span identified automatically • B3 = .574 • MUC = .924 • Best baseline system • B3 = .554 • MUC = .793
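For readers unfamiliar with the B3 (B-cubed) coreference metric reported above, a minimal sketch of its standard per-mention definition follows. It assumes gold and predicted chains cover the same mentions, and it returns precision, recall, and F1 separately; the slides report a single B3 number and do not state which of these it is. The function name `b_cubed` and the toy chains are illustrative only.

```python
from typing import Dict, List, Set


def b_cubed(gold: List[Set[str]], predicted: List[Set[str]]) -> Dict[str, float]:
    """B-cubed precision, recall and F1 over mentions present in both inputs."""
    gold_of = {m: chain for chain in gold for m in chain}
    pred_of = {m: chain for chain in predicted for m in chain}
    mentions = [m for m in gold_of if m in pred_of]
    if not mentions:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    # Per-mention overlap between its gold chain and its predicted chain.
    precision = sum(len(gold_of[m] & pred_of[m]) / len(pred_of[m])
                    for m in mentions) / len(mentions)
    recall = sum(len(gold_of[m] & pred_of[m]) / len(gold_of[m])
                 for m in mentions) / len(mentions)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example: four topic mentions, two gold chains, two predicted chains.
gold_chains = [{"a", "b", "c"}, {"d"}]
pred_chains = [{"a", "b"}, {"c", "d"}]
scores = b_cubed(gold_chains, pred_chains)
```

Because B3 averages over mentions rather than links, it penalizes the misplaced mention "c" in both precision and recall, which is one reason it can diverge from the link-based MUC score.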
Subjectivity Types • Arguing and sentiment in the news and conversations • Manually annotating • Automatically detecting • Exploiting results of automatic detection to improve question answering Somasundaran, Wiebe, Hoffmann, Litman, ACL workshop 2006 Somasundaran, Wilson, Wiebe, Stoyanov ICWSM 2007 Somasundaran, Ruppenhofer, Wiebe SIGdial 2007 Ruppenhofer, Somasundaran, Wiebe LREC 2008
CERATOPS Text Extraction and Data Visualization for Animal Health Surveillance • Collaborative project between CERATOPS, PURVAC, and the Veterinary Information Network (VIN), with funding from LLNL. • Goal: Study of subjectivity in health surveillance texts
Method • Manual Annotation Study • Identify relevant types of topic, source, and subjectivity • Annotate 16 texts from the ProMED (Program for Monitoring Emerging Diseases) mailing list
Hypothesis A fine-grained study of subjectivity will show that • health-surveillance texts contain significant amounts of subjectivity • recognizing this subjectivity can enhance information extraction and question answering applications
Example Sentence-level annotation Whilst the present tragedy in the UK is extremely distressing to farmers … , so far the number of animals culled is only a miniscule portion of the national herd.
Source types • the writer • medical experts • media • (non-media) organizations, including governments and agencies • individuals affected by an outbreak • members of the general public • other explicitly mentioned entities • implicit entities
Source type example It has become clear that the UK has been importing significant animal products from areas where FMD is known to be endemic.
Topic types • Occurrence of a disease outbreak • Danger/severity of an outbreak • Cause of a disease • Symptoms • Treatment • Prevention • Diagnosis • Attitudes of others • Development/progression of outbreak • Other
Topic type example The crisis has been caused by the koi herpes virus, commonly referred to as KHV, a disease harmless to other animals, but invariably fatal to carp.
Subjectivity types (1) • Sentiment • Belief, distinguishing two sub-types • Beliefs about what is the case • Beliefs about what should or should not be done • Knowledge & Awareness of facts • Uncertainty & Speculation
Subjectivity types (2) • Agreement & Disagreement between various sources in the text • Confirmation & Denial of contested statements • Intention & Purpose • Policies & Actions reflecting the above attitudes, for example, restrictions on the use, manufacture, distribution of substances
Subjectivity type example Professor Jeanne Brugere-Picoux ... said although France has officially registered 75 cases of BSE in the past 10 years, she believed the real figure to be “far higher than that”.
Subjectivity type example Nor did the FSA consider that there would be any need to label meat products derived from animals that have been vaccinated with the FMD vaccine.
Querying the annotations 1 I am afraid people don’t know enough about this disease.