Automatic summarization Dragomir R. Radev University of Michigan radev@umich.edu
Outline • What is summarization • Genres of summarization (Single-doc, Multi-doc, Query-based, etc.) • Extractive vs. non-extractive summarization • Evaluation metrics • Current systems • Marcu/Knight • MEAD/Lemur • NewsInEssence/NewsBlaster • What is possible and what is not
Goal of summarization • Preserve the “most important information” in a document • Make use of redundancy in text • Maximize information density • Compression Ratio = |S| / |D| • Retention Ratio = i(S) / i(D) • Goal: i(S) / i(D) > |S| / |D| (S = summary, D = full document, i(·) = information content)
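The two ratios above can be made concrete with word counts standing in for |·| and any information measure for i(·). In this sketch, the stopword list and the distinct-content-word proxy for i(·) are illustrative choices, not part of the original definition:

```python
def compression_ratio(summary, document):
    """Compression ratio |S| / |D|, using word counts as the length measure."""
    return len(summary.split()) / len(document.split())

def retention_ratio(summary, document, info):
    """Retention ratio i(S) / i(D) for a caller-supplied information measure i."""
    return info(summary) / info(document)

# Crude proxy for i(.): number of distinct content words (assumed stopword list)
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "on", "at"}

def distinct_content_words(text):
    return len({w.strip(".,").lower() for w in text.split()
                if w.strip(".,").lower() not in STOPWORDS})

doc = ("The cat sat on the mat. The cat slept. "
       "The dog barked at the cat.")
summ = "The cat sat on the mat."

print(compression_ratio(summ, doc))                         # 0.4
print(retention_ratio(summ, doc, distinct_content_words))   # 0.5
```

Here the summary meets the stated goal: it retains 50% of the (proxied) information at 40% of the length, i.e. i(S)/i(D) > |S|/|D|.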
Sentence-extraction based (SE) summarization • Classification problem (decide for each sentence whether it belongs in the summary) • Approximation (an extract approximates a human-written abstract)
Typical approaches to SE summarization • Manually-selected features: position, overlap with query, cue words, structure information, overlap with centroid • Reranking: maximal marginal relevance [Carbonell/Goldstein98]
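The maximal marginal relevance reranking idea can be sketched as a greedy loop that trades query relevance against redundancy with already-selected sentences. The `mmr_rerank` helper, the similarity tables, and λ below are toy illustrations, not code or values from the cited paper:

```python
def mmr_rerank(candidates, query_sim, pair_sim, lam=0.7, k=3):
    """Greedy MMR selection [Carbonell/Goldstein98]:
    pick the sentence maximizing  lam * sim(s, query)
                                - (1 - lam) * max_t sim(s, t)
    over sentences t already selected.
    query_sim: sentence id -> similarity to the query;
    pair_sim:  frozenset of two ids -> inter-sentence similarity."""
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def mmr(s):
            redundancy = max((pair_sim.get(frozenset((s, t)), 0.0)
                              for t in selected), default=0.0)
            return lam * query_sim[s] - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy similarities: s1 and s2 are near-duplicates, s3 is off-topic but novel
query_sim = {"s1": 0.9, "s2": 0.85, "s3": 0.3}
pair_sim = {frozenset(("s1", "s2")): 0.95,
            frozenset(("s1", "s3")): 0.1,
            frozenset(("s2", "s3")): 0.2}

print(mmr_rerank(["s1", "s2", "s3"], query_sim, pair_sim, lam=0.5, k=2))
# ['s1', 's3'] -- s2 is penalized as redundant with s1, so s3 wins
```

Note how the redundancy term overrides raw relevance: s2 has a higher query similarity than s3, but its near-duplication of s1 pushes its MMR score below s3's.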
Non-SE summarization • Discourse-based [Marcu97] • Lexical chains [Barzilay&Elhadad97] • Template-based [Radev&McKeown98]
Evaluation metrics • Intrinsic measures • Precision, recall • Kappa • Relative utility [Radev&al.00] • Similarity measures (cosine, overlap, BLEU) • Extrinsic measures • Classification accuracy • Informativeness for question answering • Relevance correlation
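Two of the intrinsic measures on this slide are easy to make concrete: cosine similarity over bag-of-words vectors, and precision/recall over sets of extracted sentence ids. The texts and sentence ids below are invented examples:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two texts as bag-of-words count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = sqrt(sum(c * c for c in va.values()))
    nb = sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def sentence_pr(system_ids, reference_ids):
    """Precision and recall of an extract against a reference extract,
    compared as sets of sentence ids."""
    tp = len(set(system_ids) & set(reference_ids))
    return tp / len(system_ids), tp / len(reference_ids)

print(cosine("the cat sat", "the cat slept"))   # 2/3: two of three words shared
p, r = sentence_pr([1, 2, 5], [1, 2, 3, 4])
print(p, r)                                     # precision 2/3, recall 1/2
```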
Web resources • http://www.summarization.com • http://duc.nist.gov • http://www.newsinessence.com • http://www.clsp.jhu.edu/ws2001/groups/asmd/ • http://www.cs.columbia.edu/~jing/summarization.html • http://www.dcs.shef.ac.uk/~gael/alphalist.html • http://www.csi.uottawa.ca/tanka/ts.html • http://www.ics.mq.edu.au/~swan/summarization/
Generative probabilistic models for summarization Wessel Kraaij TNO TPD
Summarization architecture • What do human summarizers do? • A: Start from scratch: analyze, transform, synthesize (top down) • B: Select material and revise: “cut and paste summarization” (Jing & McKeown, 1999) • Automatic systems: • Extraction: selection of material • Revision: reduction, combination, syntactic transformation, paraphrasing, generalization, sentence reordering • (Diagram: extracts → abstracts, with increasing complexity)
Examples of generative models in summarization systems • Sentence selection • Sentence / document reduction • Headline generation
Ex. 1: Sentence selection • Conroy et al (DUC 2001): • HMM on the sentence level; each state has an associated feature vector (position, length, number of content terms) • Compute the probability of being a summary sentence • Kraaij et al (DUC 2001): • Rank sentences according to posterior probability given a mixture model • Grammaticality is OK • Lacks aggregation, generalization, MDS (multi-document summarization)
Knight & Marcu (AAAI 2000) • Compression: delete substrings in an informed way (based on the parse tree) • Required: PCFG parser, tree-aligned training corpus • Channel model: probabilistic model for expansion of a parse tree • Results: much better than NP baseline • Tight control on grammaticality • Mimics revision operations by humans
Daumé & Marcu (ACL 2002) • Document compression, noisy channel • Based on syntactic structure and discourse structure (extension of the Knight & Marcu model) • Required: discourse & syntactic parsers, plus a training corpus where EDUs (Elementary Discourse Units) in summaries are aligned with the documents • Cannot handle interesting document lengths (due to complexity)
Berger & Mittal (SIGIR 2000) • Input: web pages (often not running text) • Trigram language model • IBM Model 1-like channel model: • Choose a length, draw a word from the source model and replace it with a similar word; independence assumption • Trained on the Open Directory • Non-extractive • Grammaticality and coherence are disappointing: summaries are indicative rather than fluent
Zajic, Dorr & Schwartz (DUC 2002) • Headline generation from a full story: maximize P(S|H)P(H) • Channel model based on an HMM consisting of a bigram model of headline words and a unigram model of story words; bigram language model • Decoding parameters are crucial to produce good results (length, position, strings) • Good results in fluency and accuracy
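The noisy-channel scoring here amounts to picking the candidate headline H that maximizes log P(H) under a bigram headline model plus log P(S|H) under a unigram story model, with decoding penalties such as a per-word length cost. This is a minimal sketch of that scoring step; the probability tables, candidates, floor value, and penalty below are made-up toy values, not the trained HMM or Viterbi decoder from the paper:

```python
import math

def headline_score(headline, bigram_p, story_unigram_p, len_penalty=0.5):
    """log P(H) (bigram headline LM) + a unigram story-model term standing in
    for log P(S|H), minus a per-word length penalty favoring short headlines.
    Unseen events are floored at 1e-6 (an assumed smoothing choice)."""
    lp, prev = 0.0, "<s>"
    for w in headline:
        lp += math.log(bigram_p.get((prev, w), 1e-6))   # headline bigram LM
        lp += math.log(story_unigram_p.get(w, 1e-6))    # channel: story words
        lp -= len_penalty                               # decoding length penalty
        prev = w
    return lp

# Toy models (invented probabilities, not estimated from data)
bigram_p = {("<s>", "stocks"): 0.4, ("stocks", "fall"): 0.5,
            ("<s>", "the"): 0.3, ("the", "market"): 0.2,
            ("market", "fell"): 0.1}
story_unigram_p = {"stocks": 0.05, "fall": 0.04, "the": 0.1,
                   "market": 0.05, "fell": 0.03}

candidates = [["stocks", "fall"], ["the", "market", "fell"]]
best = max(candidates, key=lambda h: headline_score(h, bigram_p, story_unigram_p))
print(best)   # ['stocks', 'fall'] -- shorter and better scored by both models
```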
Conclusions • Fluent headlines within reach of simple generative models • High quality summaries (coverage, grammaticality, coherence) require higher level symbolic representations • Cut & paste metaphor divides the work into manageable sub-problems • Noisy channel method effective, but not always efficient
Open issues • Audience (user model) • Types of source documents • Dealing with redundancy • Information ordering (e.g., temporal) • Coherent text • Cross-lingual summarization (Norbert Fuhr) • Use summaries to improve IR (or CLIR) - relevance correlation • LM for text generation • Possibly not well-defined problem (low interjudge agreement) • Develop models with more linguistic structure • Develop integrated models, e.g. by using priors (Rosenfeld) • Build efficient implementations • Evaluation: Define a manageable task