1 / 37

Statistical NLP Spring 2010

Learn about the mid-‘90s technique Maximum Marginal Relevance using Graph algorithms for text summarization. Understand the concept of node centrality, word distributions, regression models, topic models, and optimal search methodologies. Explore how to achieve globally optimal search using Integer Linear Programming and maintain consistency between selected sentences and the overall summary length limit.

pstanley
Download Presentation

Statistical NLP Spring 2010

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Statistical NLPSpring 2010 Lecture 22: Summarization Dan Klein – UC Berkeley Includes slides from Aria Haghighi, Dan Gillick

  2. s2 Maximize similarity to the query Minimize redundancy Greedy search over sentences s1 Q s4 s3 Selection mid-‘90s • Maximum Marginal Relevance [Carbonell and Goldstein, 1998] present

  3. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms [Mihalcea 05++] present

  4. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms s1 s2 present s3 s4 Nodes are sentences

  5. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms s1 s2 present s3 s4 Nodes are sentences Edges are similarities

  6. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms Stationary distribution represents node centrality s2 s1 present s4 s3 Nodes are sentences Edges are similarities

  7. Input document distribution Summary distribution Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms • Word distribution models present ~

  8. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms • Word distribution models SumBasic [Nenkova and Vanderwende, 2005] Value(wi) = PD(wi) Value(si) = sum of its word values Choose si with largest value Adjust PD(w) Repeat until length constraint present

  9. s1 s1 F(x) s2 s2 s3 s3 Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms • Word distribution models • Regression models present frequency is just one of many features

  10. Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms • Word distribution models • Regression models • Topic model-based • [Haghighi and Vanderwende, 2009] present

  11. PYTHY H & V 09

  12. s2 Optimal search using MMR s1 Q s4 s3 Selection mid-‘90s • Maximum Marginal Relevance • Graph algorithms • Word distribution models • Regression models • Topic models Integer Linear Program • Globally optimal search present [McDonald, 2007]

  13. Selection [Gillick and Favre, 2008] The health care bill is a major test for the Obama administration. s1 s2 Universal health care is a divisive issue. s3 President Obama remained calm. s4 Obama addressed the House on Tuesday.

  14. Selection [Gillick and Favre, 2008] The health care bill is a major test for the Obama administration. s1 s2 Universal health care is a divisive issue. s3 President Obama remained calm. s4 Obama addressed the House on Tuesday.

  15. Selection [Gillick and Favre, 2008] The health care bill is a major test for the Obama administration. s1 s2 Universal health care is a divisive issue. s3 President Obama remained calm. s4 Obama addressed the House on Tuesday.

  16. Selection [Gillick and Favre, 2008] The health care bill is a major test for the Obama administration. s1 s2 Universal health care is a divisive issue. s3 President Obama remained calm. s4 Obama addressed the House on Tuesday.

  17. Selection [Gillick and Favre, 2008] The health care bill is a major test for the Obama administration. s1 s2 Universal health care is a divisive issue. s3 President Obama remained calm. s4 Obama addressed the House on Tuesday. Length limit: 18 words greedy optimal

  18. maintain consistency between selected sentences and concepts summary length limit total concept value Selection Integer Linear Program for the maximum coverage model [Gillick, Riedhammer, Favre, Hakkani-Tur, 2008]

  19. Selection [Gillick and Favre, 2009] This ILP is tractable for reasonable problems

  20. First sentences are unique Selection How to include sentence position?

  21. Selection Only allow first sentences in the summary surprisingly strong baseline Up-weight concepts appearing in first sentences included in TAC 2009 system Identify more sentences that look like first sentences first sentence classifier is not reliable enough yet How to include sentence position?

  22. Selection Some interesting work on sentence ordering [Barzilay et. al., 1997; 2002] But choosing independent sentences is easier • First sentences usually stand alone well • Sentences without unresolved pronouns • Classifier trained on OntoNotes: <10% error rate Baseline ordering module (chronological) is not obviously worse than anything fancier

  23. Results [G & F, 2009] • 52 submissions • 27 teams • 44 topics • 10 input docs • 100 word summaries • Gillick & Favre • Rating scale: 1-10 • Humans in [8.3, 9.3] • Rating scale: 0-1 • Humans in [0.62, 0.77] • Rating scale: 1-10 • Humans in [8.5, 9.3] • Rating scale: 0-1 • Humans in [0.11, 0.15]

  24. Error Breakdown? [Gillick and Favre, 2008]

  25. Beyond Extraction? in 25 words? Sentence extraction is limiting ... and boring! But abstractive summaries are much harder to generate…

  26. http://www.rinkworks.com/bookaminute/

More Related