
The Pythy Summarization System: Microsoft Research at DUC 2007

This presentation describes the Pythy summarization system developed by Microsoft Research for DUC 2007. It covers the system overview, training methods, feature inventory, ranking model, and dynamic scoring.



Presentation Transcript


  1. The Pythy Summarization System: Microsoft Research at DUC 2007
     Kristina Toutanova, Chris Brockett, Michael Gamon, Jagadeesh Jagarlamudi, Hisami Suzuki, and Lucy Vanderwende
     Microsoft Research
     April 26, 2007

  2. DUC Main Task Results
     • Automatic Evaluations (30 participants)
     • Human Evaluations
     • Did pretty well on both measures

  3. Overview of Pythy
     • Linear sentence ranking model
     • Learns to rank sentences based on:
       • ROUGE scores against model summaries
       • Semantic Content Unit (SCU) weights of sentences selected by past peers
     • Considers simplified sentences alongside original sentences

  4. PYTHY Training
     [Diagram. Components: Docs, Sentences, Simplified Sentences, Feature inventory, Targets (ROUGE Oracle, Pyramid/SCU, ROUGE X 2), Ranking Training, Model]

  5. PYTHY Testing
     [Diagram. Components: Docs, Sentences, Simplified Sentences, Feature inventory, Model, Search, Dynamic Scoring, Summary]

  6. Sentence Simplification
     • Extension of simplification method for DUC06
     • Provides sentence alternatives, rather than deterministically simplifying a sentence
     • Uses syntax-based heuristic rules
     • Simplified sentences evaluated alongside originals
     • In DUC 2007:
       • Average new candidates generated: 1.38 per sentence
       • Simplified sentences generated: 61% of all sentences
       • Simplified sentences in final output: 60%

  7. Sentence-Level Features
     • SumFocus features: SumBasic (Nenkova et al 2006) + Task focus
       • cluster frequency and topic frequency
       • only these used in MSR DUC06
     • Other content word unigrams: headline frequency
     • Sentence length features (binary features)
     • Sentence position features (real-valued and binary)
     • N-grams (bigrams, skip bigrams, multiword phrases)
     • All tokens (topic and cluster frequency)
     • Simplified Sentences (binary and ratio of relative length)
     • Inverse document frequency (idf)
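A few of the features named on this slide can be illustrated with a minimal sketch. The slides do not give exact definitions, so everything here is an assumption for illustration: the toy stopword list, the whitespace tokenizer, the averaging of word frequencies, the 5-token length threshold, and the names `content_words` and `sentence_features`.

```python
from collections import Counter

# Toy stopword list; an assumption, not the system's actual list.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was"}

def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords (a crude tokenizer)."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def sentence_features(sentence, position, cluster_sentences, topic):
    """Return a small feature dictionary for one sentence.

    cluster_freq / topic_freq: average relative frequency of the sentence's
    content words in the document cluster / in the topic description.
    is_first, is_short: binary position and length indicators.
    """
    cluster_counts = Counter(w for s in cluster_sentences for w in content_words(s))
    cluster_total = max(sum(cluster_counts.values()), 1)
    topic_counts = Counter(content_words(topic))
    topic_total = max(sum(topic_counts.values()), 1)
    words = content_words(sentence)
    n = max(len(words), 1)
    return {
        "cluster_freq": sum(cluster_counts[w] / cluster_total for w in words) / n,
        "topic_freq": sum(topic_counts[w] / topic_total for w in words) / n,
        "is_first": 1.0 if position == 0 else 0.0,
        "is_short": 1.0 if len(sentence.split()) < 5 else 0.0,
    }
```

A linear ranking model would then score a sentence as a weighted sum of such feature values.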

  8. Pairwise Ranking
     • Define preferences for sentence pairs
     • Defined using human summaries and SCU weights
     • Log-linear ranking objective used in training
     • Maximize the probability of choosing the better sentence from each pair of comparable sentences
     • [Ofer et al. 03], [Burges et al. 05]
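The pairwise objective on this slide can be sketched as follows. This is a generic log-linear (logistic) pairwise ranking loss trained with plain gradient descent, not necessarily the exact formulation or optimizer used in Pythy. The function names, learning rate, and epoch count are assumptions.

```python
import math

def score(weights, features):
    """Linear sentence score: dot product of weights and feature values."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, pairs):
    """Negative log-probability of choosing the better sentence in each pair.

    Each pair is (better_features, worse_features). The probability of
    choosing the better sentence is modelled as sigmoid(score difference).
    """
    loss = 0.0
    for better, worse in pairs:
        margin = score(weights, better) - score(weights, worse)
        loss += math.log1p(math.exp(-margin))
    return loss

def train(pairs, dim, lr=0.1, epochs=200):
    """Minimize pairwise_loss with batch gradient descent from zero weights."""
    weights = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for better, worse in pairs:
            margin = score(weights, better) - score(weights, worse)
            p_wrong = 1.0 / (1.0 + math.exp(margin))  # probability of the wrong choice
            for k in range(dim):
                grad[k] -= p_wrong * (better[k] - worse[k])
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights
```

Because the loss depends only on score differences within a pair, only comparable sentences need to be paired, which matches the slide's restriction to "comparable sentences".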

  9. ROUGE Oracle Metric
     • Find an oracle extractive summary
       • the summary with the highest average ROUGE-2 and ROUGE-SU4 scores
     • All sentences in the oracle are considered “better” than any sentence not in the oracle
     • Approximate greedy search used for finding the oracle summary
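A greedy oracle search of this kind might look like the sketch below. It is simplified: it uses clipped bigram recall as a stand-in for the average of ROUGE-2 and ROUGE-SU4, and the stopping rule, the `max_words` budget, and the function names are assumptions rather than details taken from the slides.

```python
from collections import Counter

def bigrams(words):
    """Multiset of adjacent word pairs in a token list."""
    return Counter(zip(words, words[1:]))

def rouge2_recall(summary, references):
    """Average clipped bigram recall of a summary against each reference.

    summary: list of token lists (one per sentence).
    references: list of token lists (one per model summary).
    """
    summ = Counter()
    for sent in summary:
        summ += bigrams(sent)
    scores = []
    for ref in references:
        ref_bg = bigrams(ref)
        overlap = sum(min(c, summ[bg]) for bg, c in ref_bg.items())
        scores.append(overlap / max(sum(ref_bg.values()), 1))
    return sum(scores) / len(scores)

def greedy_oracle(sentences, references, max_words=250):
    """Add the sentence with the largest score gain until nothing improves."""
    chosen, best = [], 0.0
    remaining = list(range(len(sentences)))
    while remaining:
        used = sum(len(sentences[i]) for i in chosen)
        candidates = [i for i in remaining if used + len(sentences[i]) <= max_words]
        if not candidates:
            break
        scored = [(rouge2_recall([sentences[j] for j in chosen + [i]], references), i)
                  for i in candidates]
        top_score, top = max(scored, key=lambda t: t[0])
        if top_score <= best:  # no candidate improves the summary
            break
        chosen.append(top)
        remaining.remove(top)
        best = top_score
    return chosen, best
```

Sentences whose indices are returned in `chosen` would be labelled "better" than all other sentences when building training pairs.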

  10. Pyramid-Derived Metric
     • University of Ottawa SCU-annotated corpus (Copeck et al 06)
     • Some sentences in 05 & 06 document collections are:
       • known to contain certain SCUs
       • known not to contain any SCUs
     • Sentence score is sum of weights of all SCUs
       • for un-annotated sentences, the score is undefined
     • A sentence pair is constructed for training: s1 > s2 iff w(s1) > w(s2)
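The pair-construction rule on this slide can be written down directly. In this minimal sketch, `None` marks an un-annotated sentence with an undefined score. The input format and the name `scu_pairs` are assumptions.

```python
def scu_pairs(scores):
    """Build (better, worse) training pairs from SCU-derived sentence scores.

    scores maps a sentence id to the sum of its SCU weights, or to None when
    the sentence is un-annotated. Un-annotated sentences yield no pairs, and
    ties yield no pairs because the rule requires w(s1) > w(s2).
    """
    annotated = {s: w for s, w in scores.items() if w is not None}
    return [(a, b) for a in annotated for b in annotated
            if annotated[a] > annotated[b]]
```

A sentence known to contain no SCUs has score 0, so it can still serve as the "worse" member of a pair, while a sentence with an undefined score never appears.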

  11. Model Frequency Metrics
     • Based on unigram and skip bigram frequency
     • Computed for content words only
     • Sentence si is “better” than sj if [condition not preserved in the transcript]

  12. Combining Multiple Metrics
     • From ROUGE oracle: all sentences in oracle summary better than other sentences
     • From SCU annotations: sentences with higher avg SCU weights better
     • From model frequency: sentences with words occurring in models better
     • Combined loss: adding the losses according to all metrics
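One way to write the combined loss is sketched below. It assumes that each metric m contributes its own set of preference pairs P_m, that the per-pair loss is the log-linear form from the Pairwise Ranking slide, and that the sum is unweighted. The symbols w (weight vector), f (feature function), and P_m are notation introduced here, not taken from the slides.

```latex
\mathcal{L}(\mathbf{w}) \;=\;
\sum_{m \,\in\, \{\text{ROUGE oracle},\ \text{SCU},\ \text{model freq.}\}}
\;\sum_{(s_i,\, s_j) \,\in\, P_m}
\log\!\Bigl(1 + \exp\bigl(-\,\mathbf{w}^{\top}\bigl(f(s_i) - f(s_j)\bigr)\bigr)\Bigr)
```

Here each pair (s_i, s_j) in P_m means that metric m judges s_i to be better than s_j.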

  13. PYTHY Testing
     [Diagram. Components: Docs, Sentences, Simplified Sentences, Feature inventory, Model, Search, Dynamic Scoring, Summary]

  14. Dynamic Sentence Scoring
     • Eliminate redundancy by re-weighting
     • Similar to SumBasic (Nenkova et al 2006), re-weighting given previously selected sentences
     • Discounts for features that decompose into word frequency estimates
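The re-weighting idea can be illustrated with a SumBasic-style sketch: once a sentence is selected, the probability of each of its words is squared, so later sentences that repeat those words score lower. This shows only the SumBasic-like discount, not Pythy's full feature-based scorer, and the function name is an assumption.

```python
from collections import Counter

def select_with_reweighting(sentences, num_sentences):
    """Greedily pick sentences, discounting words already covered.

    sentences: list of token lists. A sentence's score is the average
    probability of its words; after each pick, the probabilities of the
    picked sentence's words are squared.
    """
    counts = Counter(w for s in sentences for w in s)
    total = sum(counts.values())
    prob = {w: c / total for w, c in counts.items()}
    chosen = []
    while len(chosen) < min(num_sentences, len(sentences)):
        remaining = [i for i in range(len(sentences)) if i not in chosen]
        best = max(remaining,
                   key=lambda i: sum(prob[w] for w in sentences[i]) / max(len(sentences[i]), 1))
        chosen.append(best)
        for w in set(sentences[best]):
            prob[w] = prob[w] ** 2  # discount words that are now covered
    return chosen
```

With static scores, two near-duplicate high-frequency sentences would both be chosen. With the discount, the second pick shifts to a sentence that covers new words.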

  15. Search
     • The search constructs partial summaries and scores them:
       • The score of a summary does not decompose into an independent sum of sentence scores
       • Global dependencies make exact search hard
     • Used multiple beams for each length of partial summaries
     • [McDonald 2007]
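A length-indexed beam search of this kind might be sketched as follows. The slides do not give the beam layout or the pruning rule, so this version assumes one beam per total word count, each holding the top `beam_size` partial summaries under a caller-supplied `summary_score` function. All names and parameters are assumptions.

```python
def beam_search(sentences, summary_score, max_words, beam_size=3):
    """Search over partial summaries with one beam per total word count.

    sentences: list of token lists.
    summary_score: function mapping a tuple of sentence indices to a score;
        it may depend on the whole summary, not just individual sentences.
    Returns the best-scoring summary found and its score.
    """
    beams = {0: [()]}  # word count -> partial summaries (tuples of indices)
    for length in range(max_words + 1):
        for partial in beams.get(length, []):
            for i, sent in enumerate(sentences):
                new_len = length + len(sent)
                if i in partial or len(sent) == 0 or new_len > max_words:
                    continue
                bucket = beams.setdefault(new_len, [])
                bucket.append(partial + (i,))
                bucket.sort(key=summary_score, reverse=True)
                del bucket[beam_size:]  # keep only the top candidates
    candidates = [p for bucket in beams.values() for p in bucket]
    best = max(candidates, key=summary_score)
    return best, summary_score(best)
```

Every extension adds at least one word, so the beam for a given length is complete by the time the outer loop reaches it. Comparing partial summaries only against others of the same length avoids penalizing short candidates that have not yet accumulated score.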

  16. Impact of Sentence Simplification
     • Trained on 05 data, tested on 06 data

  17. Impact of Sentence Simplification
     • Trained on 05 data, tested on 06 data

  18. Impact of Sentence Simplification
     • Trained on 05 data, tested on 06 data

  19. Evaluating the Metrics
     • Trained on 05 data, tested on 06 data
     • Includes simplified sentences

  20. Evaluating the Metrics
     • Trained on 05 data, tested on 06 data
     • Includes simplified sentences

  21. Update Summarization Pilot
     • SVM novelty classifier trained on TREC 02 & 03 novelty track

  22. Summary and Future Work
     • Summary
       • Combination of different target metrics for training
       • Many sentence features
       • Pair-wise ranking function
       • Dynamic scoring
     • Future work
       • Boost robustness
         • Sensitive to cluster properties (e.g., size)
       • Improve grammatical quality of simplified sentences
       • Reconcile novelty and (ir)relevance
       • Learn features over whole summaries rather than individual sentences

  23. Thank You
