Summarizing Contrastive Viewpoints in Opinionated Text. Michael J. Paul, ChengXiang Zhai, Roxana Girju. EMNLP’10. Speaker: Hsin-Lan, Wang. Date: 2010/12/07.
Outline • Introduction • Modeling Viewpoints • Topic-Aspect Model • Features • Multi-Viewpoint Summarization • Comparative LexRank • Summary Generation • Experiments and Evaluation • Conclusion
Introduction • The amount of opinionated text available online has been growing rapidly. • In this paper, we study how to summarize opinionated text in a way that highlights the contrast between multiple viewpoints.
Introduction • Generate two types of multi-view summaries: • macro multi-view summary • Contains multiple sets of sentences, each representing a different viewpoint. • micro multi-view summary • Contains a set of pairs of contrastive sentences.
Modeling Viewpoints • Challenge: to model and extract viewpoints that are hidden in text. • Solution: Topic-Aspect Model (TAM)
Modeling Viewpoints • TAM
Modeling Viewpoints • Features • Words • baseline approach • do not do any stop word removal • stemming • Dependency Relations • use Stanford parser • full-tuple: rel(a,b) • split-tuple: rel(a,*), rel(*,b)
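The full-tuple and split-tuple encodings can be illustrated with a minimal sketch. It assumes the parser output is already available as (relation, governor, dependent) triples; the function name `tuple_features` and the sample triples are hypothetical and not taken from the slides.

```python
from typing import List, Tuple

Dep = Tuple[str, str, str]  # (relation, governor, dependent)


def tuple_features(deps: List[Dep], split: bool = True) -> List[str]:
    """Turn dependency triples into string features.

    split=False yields one full-tuple feature rel(a,b) per triple;
    split=True yields the two split-tuple features rel(a,*) and rel(*,b).
    """
    features = []
    for rel, a, b in deps:
        if split:
            features.append(f"{rel}({a},*)")
            features.append(f"{rel}(*,{b})")
        else:
            features.append(f"{rel}({a},{b})")
    return features


# Hand-written triples standing in for parser output (hypothetical input).
deps = [("amod", "idea", "good"), ("nsubj", "works", "plan")]
full_features = tuple_features(deps, split=False)
split_features = tuple_features(deps, split=True)
```

Splitting a tuple trades specificity for coverage: rel(a,*) and rel(*,b) each occur more often than rel(a,b), which may help when the data is sparse.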
Modeling Viewpoints • Features • Negation • For rel(wi, wj), if either wi or wj is negated, simply rewrite it as ¬rel(wi, wj). • Polarity • use the Subjectivity Clues lexicon • amod(idea, good) → amod(idea,+) and amod(*,good) • ¬rel(a,+) → rel(a,-).
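A rough sketch of how the negation and polarity rewrites might be combined is given below. Several pieces are assumptions made for illustration: the four-word `POLARITY` dictionary stands in for the real subjectivity lexicon, the `NOT_` prefix is an assumed encoding of a negated relation, and only the dependent word is checked for polarity.

```python
from typing import Dict, List, Set, Tuple

Dep = Tuple[str, str, str]  # (relation, governor, dependent)

# Tiny stand-in for a subjectivity lexicon (hypothetical entries).
POLARITY: Dict[str, str] = {"good": "+", "great": "+", "bad": "-", "terrible": "-"}
FLIP = {"+": "-", "-": "+"}


def polarity_features(deps: List[Dep], negated: Set[str]) -> List[str]:
    """Rewrite triples using polarity signs and negation marks.

    `negated` holds the words that the parse marks as negated.
    """
    features = []
    for rel, a, b in deps:
        is_negated = a in negated or b in negated
        if b in POLARITY:
            sign = POLARITY[b]
            if is_negated:
                sign = FLIP[sign]  # negation flips the polarity sign
            features.append(f"{rel}({a},{sign})")
            features.append(f"{rel}(*,{b})")
        else:
            prefix = "NOT_" if is_negated else ""  # assumed negation encoding
            features.append(f"{prefix}{rel}({a},{b})")
    return features


plain = polarity_features([("amod", "idea", "good")], negated=set())
flipped = polarity_features([("amod", "idea", "good")], negated={"good"})
marked = polarity_features([("nsubj", "works", "plan")], negated={"works"})
```

With this encoding, "good idea" and "not a bad idea" both produce amod(idea,+), so the two phrasings count as evidence for the same feature.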
Modeling Viewpoints • Features • Generalized Relations • use Stanford dependencies • Rewrite rel(a,b) as R_rel(a,b).
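One way to picture the generalization step is as a back-off from a specific relation to a broader one. The sketch below assumes the rewritten relation is the parent type in the Stanford typed-dependency hierarchy; the `PARENT` map lists only a few hand-copied entries, and the function name `generalize` is hypothetical.

```python
from typing import Dict

# Partial parent map (assumption: only a handful of entries from the
# Stanford typed-dependency hierarchy are listed here).
PARENT: Dict[str, str] = {
    "nsubj": "subj",
    "csubj": "subj",
    "dobj": "obj",
    "iobj": "obj",
    "pobj": "obj",
}


def generalize(rel: str, a: str, b: str) -> str:
    """Back a relation off to its parent type when one is listed."""
    return f"{PARENT.get(rel, rel)}({a},{b})"


g1 = generalize("dobj", "support", "bill")
g2 = generalize("iobj", "give", "people")
g3 = generalize("amod", "idea", "good")  # no parent listed, left unchanged
```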
Multi-Viewpoint Summarization • Comparative LexRank • Modify the LexRank random walk so that it favors jumping to a good representative excerpt x of any viewpoint v. • Also make it favor jumping between two excerpts that can potentially form a good contrastive pair.
Multi-Viewpoint Summarization • Comparative LexRank
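The scoring formula from this slide is not preserved in the transcript, so the following is only a sketch of the general idea, not the authors' exact method. It assumes hard viewpoint labels, an edge weight of λ·sim within a viewpoint and (1−λ)·contrast across viewpoints, and a PageRank-style damping factor. The weighting was chosen so that λ=1 restricts the walk to excerpts of the same viewpoint, as stated in the baseline discussion later in the talk. All names and the toy matrices are hypothetical.

```python
from typing import List


def comparative_lexrank(sim: List[List[float]], contrast: List[List[float]],
                        view: List[int], lam: float = 0.5,
                        damping: float = 0.85, iters: int = 100) -> List[float]:
    """Stationary scores of a viewpoint-aware random walk (illustrative only)."""
    n = len(view)
    # Edge weight: similarity inside a viewpoint, contrast across viewpoints.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if view[i] == view[j]:
                w[i][j] = lam * sim[i][j]
            else:
                w[i][j] = (1.0 - lam) * contrast[i][j]
    # Row-normalise into transition probabilities; empty rows jump uniformly.
    p = []
    for row in w:
        total = sum(row)
        p.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    # Power iteration with a uniform teleport term, as in PageRank.
    r = [1.0 / n] * n
    for _ in range(iters):
        r = [(1.0 - damping) / n
             + damping * sum(r[i] * p[i][j] for i in range(n))
             for j in range(n)]
    return r


# Four toy excerpts: the first two share viewpoint 0, the last two viewpoint 1.
view = [0, 0, 1, 1]
sim = [[0.0, 0.8, 0.1, 0.1],
       [0.8, 0.0, 0.2, 0.1],
       [0.1, 0.2, 0.0, 0.7],
       [0.1, 0.1, 0.7, 0.0]]
contrast = [[0.0, 0.0, 0.9, 0.1],
            [0.0, 0.0, 0.2, 0.1],
            [0.9, 0.2, 0.0, 0.0],
            [0.1, 0.1, 0.0, 0.0]]
scores = comparative_lexrank(sim, contrast, view, lam=0.5)
within_only = comparative_lexrank(sim, contrast, view, lam=1.0)
```

The scores form a probability distribution over excerpts, so they can be used directly to rank candidates for a summary.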
Multi-Viewpoint Summarization • Summary Generation • Macro contrastive summarization • Use the stationary distribution of the random walk over all of the data to rank the excerpts. • Separate the top-ranked excerpts into two disjoint sets. • Remove redundancy and produce the summary. • Micro contrastive summarization • Each summary item is a pair (xi, xj) with a pairwise relevance score. • Rank these pairs and remove redundancy.
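The two generation procedures can be sketched as greedy selection loops. The slides do not say how redundancy is measured, so this sketch assumes a word-overlap (Jaccard) threshold; the function names, the cut-off `max_overlap`, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used here as a stand-in redundancy check."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def macro_summary(excerpts: List[str], scores: List[float], view: List[int],
                  k: int = 2, max_overlap: float = 0.5) -> Dict[int, List[str]]:
    """Up to k non-redundant excerpts per viewpoint, best score first."""
    summary: Dict[int, List[str]] = {v: [] for v in set(view)}
    for i in sorted(range(len(excerpts)), key=lambda idx: -scores[idx]):
        chosen = summary[view[i]]
        if len(chosen) < k and all(jaccard(excerpts[i], c) <= max_overlap
                                   for c in chosen):
            chosen.append(excerpts[i])
    return summary


def micro_summary(excerpts: List[str],
                  pair_scores: Dict[Tuple[int, int], float],
                  k: int = 2, max_overlap: float = 0.5) -> List[Tuple[str, str]]:
    """Up to k contrastive pairs, skipping pairs that repeat chosen text."""
    chosen: List[Tuple[str, str]] = []
    for (i, j), _ in sorted(pair_scores.items(), key=lambda kv: -kv[1]):
        used = [s for pair in chosen for s in pair]
        redundant = any(jaccard(excerpts[i], s) > max_overlap
                        or jaccard(excerpts[j], s) > max_overlap for s in used)
        if len(chosen) < k and not redundant:
            chosen.append((excerpts[i], excerpts[j]))
    return chosen


excerpts = ["the bill will lower costs",
            "the bill will lower costs for families",
            "the bill will raise costs",
            "government should stay out of healthcare"]
macro = macro_summary(excerpts, scores=[0.4, 0.3, 0.2, 0.1], view=[0, 0, 1, 1])
micro = micro_summary(excerpts, {(0, 2): 0.9, (1, 2): 0.8, (0, 3): 0.3})
```

In the toy run, the second excerpt is dropped from the macro summary because it nearly repeats the first, and the lower-ranked pairs are dropped from the micro summary because each reuses an excerpt that is already covered.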
Experiments and Evaluation • Experimental Setup • First dataset: 948 verbatim responses to a Gallup phone survey about the 2010 U.S. healthcare bill. • Second dataset: the Bitterlemons corpus, a collection of 594 editorials about the Israel-Palestine conflict.
Experiments and Evaluation • Stage One: Modeling Viewpoints
Experiments and Evaluation • Stage Two: Summarizing Viewpoints • Gold Standard Summaries • Gallup healthcare poll
Experiments and Evaluation • Stage Two: Summarizing Viewpoints • Baseline Approaches • Graph-based algorithms • When λ=1, the random walk model only transitions to sentences within the same viewpoint. • The modified algorithm produces the same ranking as the unmodified LexRank. • Model-based algorithms • Compare against the approach of Lerman and McDonald.
Experiments and Evaluation • Stage Two: Summarizing Viewpoints • Metrics • using the standard ROUGE evaluation metric • For evaluating the macro-level summaries: • For evaluating the micro-level summaries:
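The macro-level and micro-level formulas from this slide are not preserved in the transcript, so no attempt is made to reproduce them. As background only, the basic unigram form of ROUGE recall can be sketched as follows; this is a simplified illustration with whitespace tokenization, not the official ROUGE toolkit, and the function name `rouge_1_recall` is hypothetical.

```python
from collections import Counter


def rouge_1_recall(candidate: str, reference: str) -> float:
    """Clipped unigram overlap divided by the reference length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Each reference word is matched at most as often as it occurs
    # in the candidate summary.
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())


score = rouge_1_recall("the bill lowers costs", "the bill will lower costs")
```

Here three of the five reference words ("the", "bill", "costs") are matched, giving a recall of 0.6; "lowers" does not match "lower" because no stemming is applied.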
Experiments and Evaluation • Stage Two: Summarizing Viewpoints • Evaluation Results
Experiments and Evaluation • Unsupervised Summarization • Bitterlemons corpus (without a gold set) • Asked 8 people to guess if each viewpoint’s summary was written by Israeli or Palestinian authors.
Experiments and Evaluation • Unsupervised Summarization • Macro-level summaries: • Correctly labeled 78% of the summary sets. • Micro-level summaries: • Many of the sentences are mislabeled, and the ones that are correctly labeled are not representative of the collection.
Conclusion • Present steps toward a two-stage system that can automatically extract and summarize viewpoints in opinionated text. • First: the accuracy of clustering documents by viewpoint can be enhanced by using rich dependency features.
Conclusion • Second: use Comparative LexRank to generate contrastive summaries both at the macro and micro level.