Multi-Perspective Question Answering Using the OpQA Corpus. Claire Cardie, Janyce Wiebe, Veselin Stoyanov. Cornell University and University of Pittsburgh
Multi-Perspective Question Answering • Fact-based question answering (QA): • When is the first day of spring? • Do Lipton employees take coffee breaks? • vs. multi-perspective question answering (MPQA): • How does the US regard the terrorist attacks in Iraq? • Is Derek Jeter a bum? HLT/EMNLP 2005.
Talk Outline • Properties of Opinion vs. Fact answers • OpQA corpus • Traditional fact-based QA systems • Different properties of opinion questions • Using fine-grained opinion information for MPQA • Annotation framework and automatic classifiers • QA experiments HLT/EMNLP 2005.
Opinion Question & Answer (OpQA) Corpus [Stoyanov, Cardie, Litman, and Wiebe 2004] • 98 documents manually tagged for opinions (from the NRRC MPQA corpus [Wilson and Wiebe 2003]) • 30 questions • 15 fact • 15 opinion HLT/EMNLP 2005.
OpQA corpus: Answer Annotations • Two annotators • Include every text segment contributing to an answer • Partial answers: • When was the Kyoto protocol ratified? • … before May 2003 … • Are the Japanese unanimous in their support of Koizumi? • … most Japanese support their prime minister … • Minimum spans HLT/EMNLP 2005.
Traditional Fact-based QA Systems • Pipeline diagram: Questions and Documents (document fragments) feed an IR subsystem, followed by linguistic, syntactic, and semantic filters, producing a ranked list of guesses (e.g., 1. Frag 324, 2. Frag 213, …). HLT/EMNLP 2005.
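Read as pseudocode, the architecture above is an IR step followed by a cascade of filters over candidate fragments. Below is a minimal, self-contained Python sketch of that control flow; the retrieval step and the filter are toy stand-ins invented for illustration, not components of any actual system described in the talk.

```python
# Toy sketch of a traditional fact-based QA pipeline:
# retrieve candidate fragments, then apply successive filters.

def retrieve_fragments(question, documents):
    """Toy IR step: rank sentences by word overlap with the question."""
    q_words = set(question.lower().split())
    fragments = [s for doc in documents for s in doc.split(". ")]
    return sorted(fragments,
                  key=lambda s: len(q_words & set(s.lower().split())),
                  reverse=True)

def answer_question(question, documents, filters, top_k=5):
    """Apply linguistic/syntactic/semantic filters, return ranked guesses."""
    candidates = retrieve_fragments(question, documents)
    for keep in filters:
        candidates = [c for c in candidates if keep(question, c)]
    return candidates[:top_k]

if __name__ == "__main__":
    docs = ["The first day of spring is March 20. Lipton employees take breaks."]
    # Toy "semantic" filter: a 'When' question requires a date-like token.
    has_date = lambda q, s: not q.startswith("When") or any(ch.isdigit() for ch in s)
    print(answer_question("When is the first day of spring?", docs, [has_date]))
```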
Characteristics of Opinion vs. Fact Answers • Answer length • Syntactic and semantic class • Additional processing difficulties • Partial answers • Answer generator HLT/EMNLP 2005.
Fine-grained Opinion Information for MPQA • Recent interest in the area of automatic opinion information extraction. • E.g. [Bethard, Yu, Thornton, Hatzivassiloglou, and Jurafsky 2004], [Pang and Lee 2004], [Riloff and Wiebe 2003], [Wiebe and Riloff 2005], [Wilson, Wiebe, and Hwa 2004], [Yu and Hatzivassiloglou 2003] • In our evaluation: • Opinion annotation framework • Sentence-level automatic opinion classifiers • Subjectivity filters • Source filter HLT/EMNLP 2005.
Opinion Annotation Framework • Described in [Wiebe, Wilson, and Cardie 2002] • Accounts for both: • Explicitly stated opinions • Joe believes that Sue dislikes the Red Sox. • Indirectly expressed opinions • The aim of the report is to tarnish China’s image. • Attributes include strength and source. • Manual sentence-level classification • sentence subjective if it contains one or more opinions of strength >= medium HLT/EMNLP 2005.
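To make the sentence-level rule concrete, here is a small Python sketch. The Opinion record is a hypothetical stand-in for an MPQA opinion annotation; only the "strength >= medium" rule itself comes from the slide.

```python
from dataclasses import dataclass

# Illustrative strength scale (MPQA annotations use labels such as
# low / medium / high / extreme).
STRENGTH = {"low": 1, "medium": 2, "high": 3, "extreme": 4}

@dataclass
class Opinion:          # hypothetical stand-in for an opinion annotation
    source: str
    strength: str

def sentence_is_subjective(opinions, threshold="medium"):
    """A sentence counts as subjective if it contains at least one
    opinion of strength >= medium (the manual classification rule)."""
    return any(STRENGTH[o.strength] >= STRENGTH[threshold] for o in opinions)

print(sentence_is_subjective([Opinion("Joe", "high")]))    # True
print(sentence_is_subjective([Opinion("writer", "low")]))  # False
```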
Automatic Opinion Classifiers • Two sentence-level opinion classifiers from Wiebe and Riloff [2005] used for evaluation • Both classifiers use unannotated data • Rulebased: Extraction patterns bootstrapped using word lists • NaiveBayes: Trained on data obtained from Rulebased HLT/EMNLP 2005.
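As an illustration of the general self-labeling idea (training NaiveBayes on sentences labeled by Rulebased), here is a toy sketch using scikit-learn. The clue-word rule and the tiny corpus are invented for the example; they are not the bootstrapped extraction patterns of Wiebe and Riloff [2005].

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up clue words standing in for bootstrapped extraction patterns.
SUBJ_CLUES = {"believes", "regard", "bum", "tarnish", "approve"}

def rulebased_label(sentence):
    """Toy rule-based classifier: subjective (1) if a clue word occurs."""
    return int(any(w in SUBJ_CLUES for w in sentence.lower().split()))

unannotated = [
    "Joe believes that Sue dislikes the Red Sox.",
    "The Kyoto protocol was ratified in 2002.",
    "The aim of the report is to tarnish China's image.",
    "The meeting takes place on Tuesday.",
]

# NaiveBayes is trained on the (noisy) labels produced by Rulebased.
labels = [rulebased_label(s) for s in unannotated]
vec = CountVectorizer()
X = vec.fit_transform(unannotated)
clf = MultinomialNB().fit(X, labels)

# Toy prediction on an unseen sentence (1 = subjective, 0 = objective).
print(clf.predict(vec.transform(["The report aims to tarnish their image."])))
```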
Subjectivity Filters • Pipeline diagram: Opinion Questions and Document Sentences feed an IR subsystem producing baseline guesses (e.g., 1. Sent 324, 2. Sent 213, …), followed by one of three subjectivity filters: Manual, Rulebased, or NaiveBayes. HLT/EMNLP 2005.
Subjectivity Filters Cont’d • Example ranked guesses: 1. Sent 324, 2. Sent 213, 3. Sent 007 (ans), 4. Sent 212, 5. Sent 211 (ans), … • Look for the rank of the first guess containing an answer • Compute: • Mean Reciprocal Rank (MRR) across the top 5 answers: MRR = mean over all questions of (1 / rank of first answer) • Mean Rank of the First Answer: MRFA = mean over all questions of (rank of first answer) HLT/EMNLP 2005.
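In code, the two metrics reduce to the following sketch. The 1-based ranks, the top-5 cutoff for MRR, and the handling of questions whose answer is never retrieved are assumptions made for illustration.

```python
def first_answer_rank(ranked_guesses, is_answer):
    """1-based rank of the first guess containing an answer, or None."""
    for rank, guess in enumerate(ranked_guesses, start=1):
        if is_answer(guess):
            return rank
    return None

def mrr(all_ranks, cutoff=5):
    """Mean Reciprocal Rank over the top `cutoff` guesses (0 if no answer)."""
    return sum(1.0 / r if r is not None and r <= cutoff else 0.0
               for r in all_ranks) / len(all_ranks)

def mrfa(all_ranks):
    """Mean Rank of the First Answer, over questions with an answer found."""
    found = [r for r in all_ranks if r is not None]
    return sum(found) / len(found)

# Example: answers found at ranks 3 and 5 (cf. Sent 007 and Sent 211 above).
print(mrr([3, 5]), mrfa([3, 5]))
```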
Subjectivity Filters Results • (Results table of MRR/MRFA scores for the three filters.) HLT/EMNLP 2005.
Source Filter • Manually identify the sources in the opinion questions • Does France approve of the war in Iraq? • Retain only sentences that contain opinions whose sources match a source in the question • France has voiced some concerns with the situation. HLT/EMNLP 2005.
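A minimal sketch of the retention rule, with simple string matching standing in for the actual source-matching procedure; the data layout (sentence paired with its opinion sources) is assumed for illustration.

```python
def source_filter(sentences, question_sources):
    """Keep only sentences containing an opinion whose source matches
    a source mentioned in the question (toy substring matching)."""
    kept = []
    for sentence, opinion_sources in sentences:
        if any(qs.lower() in osrc.lower() or osrc.lower() in qs.lower()
               for qs in question_sources for osrc in opinion_sources):
            kept.append(sentence)
    return kept

# Question: "Does France approve of the war in Iraq?" -> source "France"
sents = [("France has voiced some concerns with the situation.", ["France"]),
         ("The US welcomed the resolution.", ["US"])]
print(source_filter(sents, ["France"]))
```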
Source Filter Results • Performs well on the hardest questions in the corpus • All questions answered within the first 25 sentences with one exception. HLT/EMNLP 2005.
Summary • Properties of opinion vs. fact answers • Traditional architectures unlikely to be effective • Use of fine-grained opinion information for MPQA • MPQA can benefit from fine-grained perspective information HLT/EMNLP 2005.
Future Work • Create summaries of all opinions in a document using fine-grained opinion information • Methods used will be directly applicable to MPQA HLT/EMNLP 2005.
Thank you. Questions? HLT/EMNLP 2005.
• Did something surprising happen when Chavez regained power in Venezuela after he was removed by a coup? • What did South Africa want Mugabe to do after the 2002 elections? • What’s Mugabe’s opinion about the West’s attitude and actions towards the 2002 Zimbabwe election? HLT/EMNLP 2005.
Characteristics of Fact vs. Opinion Answers Cont’d • Syntactic Constituent of the answers HLT/EMNLP 2005.
All improvements are significant using the Wilcoxon Matched-Pairs Signed-Ranks Test (p<=0.05) except for the source filter (p=0.81). HLT/EMNLP 2005.