Automatic Summarization: A Tutorial Presented at RANLP’2003 Inderjeet Mani Georgetown University Tuesday, September 9, 2003 2-5:30 pm @georgetown.edu complingone.georgetown.edu/~linguist/inderjeet.html
AGENDA • 14:10 I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture) • 14:40 II. Extraction (Shallow Features, Revision, Corpus-Based Methods) • 15:30 Break • 16:00 III. Abstraction (Template and Concept-Based) • 16:30 IV. Evaluation • 17:00 V. Research Areas (Multi-document, Multimedia, Multilingual Summarization) • 17:30 Conclusion
Human Summarization is all around us • Headlines: newspapers, Headline News • Table of contents of a book, magazine, etc. • Preview of a movie • Digest: TV or cinema guide • Highlights: meeting dialogue, email traffic • Abstract: summary of a scientific paper • Bulletin: weather forecast, stock market, ... • Biography: resume, obituary, tombstone • Abridgment: Shakespeare for kids • Review of a book, a CD, play, etc. • Scale-downs: maps, thumbnails • Sound bite/video clip: from speech, conversation, trial
Current Applications • Multimedia news summaries: watch the news and tell me what happened while I was away • Physicians' aids: summarize and compare the recommended treatments for this patient • Meeting summarization: find out what happened at that teleconference I missed • Search engine hits: summarize the information in hit lists retrieved by search engines • Intelligence gathering: create a 500-word biography of Osama bin Laden • Hand-held devices: create a screen-sized summary of a book • Aids for the Handicapped: compact the text and read it out for a blind person
Example BIOGEN Biographies Vernon Jordan is a presidential friend and a Clinton adviser. He is 63 years old. He helped Ms. Lewinsky find a job. He testified that Ms. Monica Lewinsky said that she had conversations with the president, that she talked to the president. He has numerous acquaintances, including Susan Collins, Betty Currie, Pete Domenici, Bob Graham, James Jeffords and Linda Tripp. Henry Hyde is a Republican chairman of House Judiciary Committee and a prosecutor in Senate impeachment trial. He will lead the Judiciary Committee's impeachment review. Hyde urged his colleagues to heed their consciences, “the voice that whispers in our ear, ‘duty, duty, duty.’” Victor Polay is the Tupac Amaru rebels' top leader, founder and the organization's commander-and-chief. He was arrested again in 1992 and is serving a life sentence. His associates include Alberto Fujimori, Tupac Amaru Revolutionary, and Nestor Cerpa.
Columbia University’s Newsblaster www.cs.columbia.edu/nlp/newsblaster/summaries/11_03_02_5.html
Terms and Definitions • Text Summarization • The process of distilling the most important information from a source (or sources) to produce an abridged version for a particular user (or users) and task (or tasks). • Extract vs. Abstract • An extract is a summary consisting entirely of material copied from the input • An abstract is a summary at least some of whose material is not present in the input, e.g., subject categories, paraphrase of content, etc.
Illustration of Extracts and Abstracts 25 Percent Extract of Gettysburg Address (sents 1, 2, 6) • Four score and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. The brave men, living and dead, who struggled here, have consecrated it far above our poor power to add or detract. 10 Percent Extract (sent 2) • Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. 15 Percent Abstract • This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for.
Illustration of the power of human abstracts • Mrs. Coolidge: What did the preacher discuss in his sermon? • President Coolidge: Sin. • Mrs. Coolidge: What did he say? • President Coolidge: He said he was against it. • - Bartlett’s Quotations (via Graeme Hirst) President Calvin Coolidge, Grace Coolidge, and dog, Rob Roy, c.1925. Plymouth Notch, Vermont.
Summary Function: Indicative, Informative, Evaluative • Indicative summaries • An indicative abstract provides a reference function for selecting documents for more in-depth reading. • Informative summaries • An informative abstract covers all the salient information in the source at some level of detail. • Evaluative summaries • A critical abstract evaluates the subject matter of the source, expressing the abstractor's views on the quality of the author's work. • The indicative/informative distinction is a prescriptive one, intended to guide professional abstractors (e.g., ANSI 1996).
User-Oriented Summary Types • Generic summaries • aimed at a particular - usually broad - readership community • Tailored summaries (aka user-focused, topic-focused, query-focused summaries) • tailored to the requirements of a particular user or group of users. • User’s interests: • full-blown user models • profiles recording subject area terms • a specific query. • A user-focused summary needs, of course, to take into account the influence of the user as well as the content of the document. • A user-focused summarizer usually includes a parameter to influence this weighting.
Summarization Architecture (diagram) • Processing stages: Analysis, Transformation, Synthesis, producing extract or abstract summaries • Summary characteristics and parameters: compression, audience, function, coherence, type (extract vs. abstract), span, source genre, media, language
Characteristics of Summaries • Reduction of information content • Compression Rate, also known as condensation rate or reduction rate • Measured as summary length / source length, usually expressed as a percentage (0 < c < 100) • Target Length • Informativeness • Fidelity to Source • Relevance to User's Interests • Well-formedness/Coherence • Syntactic and discourse-level • Extracts: need to avoid gaps, dangling anaphors, ravaged tables, lists, etc. • Abstracts: need to produce grammatical, plausible output
One Text, Many Summaries (Evaluation preview) 25 Percent Leading Text Extract (first 3 sentences) - seems OK, too! Four score and seven years ago our fathers brought forth upon this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met here on a great battlefield of that war. 15 Percent Synopsis by human (critical summary) - seems even better! This speech by Abraham Lincoln commemorates soldiers who laid down their lives in the Battle of Gettysburg. It offers an eloquent reminder to the troops that it is the future of freedom in America that they are fighting for. 11 Percent Extract (by human, out of context) - is bad! (sents 5, 8) It is altogether fitting and proper that we should do this. The world will little note, nor long remember, what we say here, but it can never forget what they did here. We can usually tell when a summary is incoherent, but how do we evaluate summaries in general?
Studies of human summaries • Cremmins (1996) prescribed that abstractors • use surface features: headings, key phrases, position • use discourse features: overall text structure • revise and edit abstracts • Liddy (1991) • studied 276 abstracts structured in terms of background, purpose, methodology, results and conclusions • Endres-Niggemeyer et al. (1995, 1998) found abstractors • use top-down strategy exploiting discourse structure • build topic sentences, use beginning/ends as relevant, prefer top level segments, examine passages/paragraphs before individual sentences, exploit outlines, formatting ...
Endres-Niggemeyer et al. (1995, 1998) • Abstractors never attempt to read the document from start to finish. • Instead, they use the structural organization of the document, including formatting and layout (the scheme) to skim the document for relevant passages, which are fitted together into a discourse-level representation (the theme). • This representation uses discourse-level rhetorical relations to link relevant text elements capturing what the document is about. • They use a top-down strategy, exploiting document structure, and examining paragraphs and passages before individual sentences. • The skimming for relevant passages exploits specific shallow features such as: • cue phrases (especially in-text summaries) • location of information in particular structural positions (beginning of the document, beginning and end of paragraphs) • information from the title and headings.
Stages of Abstracting: Cremmins (1996) Cremmins recommends 12-20 mins to abstract an average scientific paper - much less time than it takes to really understand one.
Abstractors’ Editing Operations: Local Revision • Cremmins (1996) described two kinds of editing operations that abstractors carry out • Local Revision - revises content within a sentence, e.g., dropping vague or redundant terms, wording prescriptions, contextual lexical choice, reference adjustment • Global Revision - revises content across sentences
AGENDA • 14:10 I. Fundamentals (Definitions, Human Abstracting, Abstract Architecture) • 14:40 II. Extraction (Shallow Features, Revision, Corpus-Based Methods) • 15:30 Break • 16:00 III. Abstraction (Template and Concept-Based) • 16:30 IV. Evaluation • 17:00 V. Research Areas (Multi-document, Multimedia, Multilingual Summarization) • 17:30 Conclusion
Summarization Approaches • Shallower approaches • result in sentence extraction • sentences may/will be extracted out of context • synthesis here involves smoothing: include a window of previous sentences, adjust references • can be trained using a corpus • Deeper approaches • result in abstracts • synthesis involves NL generation • can be partly trained using a corpus • requires some coding for a domain
Some Features used in Sentence Extraction Summaries • Location: position of term in document, position in paragraph/section, section depth, particular sections (e.g., title, introduction, conclusion) • Thematic: presence of statistically salient terms (tf.idf) • these are document-specific • Fixed phrases: in-text summary cue phrases (“in summary”, “our investigation shows”, “the purpose of this article is”,..), emphasizers (“important”, “in particular”,...) • these are genre-specific • Cohesion: connectivity of text units based on proximity, repetition and synonymy, coreference, vocabulary overlap • Discourse Structure: rhetorical structure, topic structure, document format
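To make the thematic feature concrete, here is a minimal Python sketch of document-specific tf.idf weighting and a sentence score built on it; the helper names, the toy background collection, and the top-k cutoff are illustrative assumptions, not the tutorial's definitions:

    import math
    from collections import Counter

    def tfidf_weights(doc_tokens, background_docs):
        # term frequency in this document times inverse document frequency
        # over a background collection (add-one smoothing avoids log(0))
        tf = Counter(doc_tokens)
        df = Counter()
        for d in background_docs:
            df.update(set(d))
        n = len(background_docs)
        return {t: tf[t] * math.log((n + 1) / (df[t] + 1)) for t in tf}

    def thematic_score(sentence_tokens, weights, top_k=10):
        # a sentence is thematic if it contains many of the document's
        # top-k tf.idf terms; the score sums their weights
        salient = set(sorted(weights, key=weights.get, reverse=True)[:top_k])
        return sum(weights[t] for t in sentence_tokens if t in salient)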
Putting it Together: Linear Feature Combination Weight(U) = α·LocationWeight(U) + β·FixedPhraseWeight(U) + γ·ThematicTermWeight(U) + δ·AddTermWeight(U), where U is a text unit such as a sentence and the Greek letters denote tuning parameters • LocationWeight: assigned to a text unit based on whether it occurs in initial, medial, or final position in a paragraph or the entire document, or whether it occurs in prominent sections such as the document’s intro or conclusion • FixedPhraseWeight: assigned to a text unit in case fixed-phrase summary cues occur • ThematicTermWeight: assigned to a text unit due to the presence of thematic terms (e.g., tf.idf terms) in that unit • AddTermWeight: assigned to a text unit for terms in it that are also present in the title, headline, initial para, or the user’s profile or query
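A minimal sketch of the linear combination itself; the particular feature functions, cue-phrase list, and default values of the α, β, γ, δ parameters below are illustrative placeholders rather than the tutorial's exact formulation:

    CUE_PHRASES = ("in summary", "in conclusion", "the purpose of this article")

    def location_weight(idx):
        # favor sentences that open the document or a paragraph (lead bias)
        return 1.0 / (1 + idx)

    def fixed_phrase_weight(sentence):
        s = sentence.lower()
        return 1.0 if any(cue in s for cue in CUE_PHRASES) else 0.0

    def sentence_weight(idx, sentence, thematic, add_term,
                        alpha=1.0, beta=1.0, gamma=1.0, delta=1.0):
        # Weight(U) = a*Location(U) + b*FixedPhrase(U) + g*ThematicTerm(U) + d*AddTerm(U)
        return (alpha * location_weight(idx)
                + beta * fixed_phrase_weight(sentence)
                + gamma * thematic
                + delta * add_term)

The top-scoring sentences under this weighting are then selected up to the compression target, as in the pipeline on the next slide.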
Shallow Approaches (pipeline diagram) • Analysis: feature extractors applied to the source(s) • Transformation (Selection): feature combiner (αF1 + βF2 + γF3) feeding a sentence selector • Synthesis (Smoothing): sentence revisor producing the summary
Revision as Repair • structured environments (tables, etc.) • recognize and exclude • **recognize and summarize • anaphors • exclude sentences (which begin) with anaphors • include a window of previous sentences • **reference adjustment • gaps • include low-ranked sentences immediately between two selected sentences • add first sentence of para if second or third selected • **model rhetorical structure of source
A Simple Text Revision Algorithm • Construct initial “sentence-extraction” draft from source by picking highest weighted sentences in source until compression target is reached • Revise draft • Use syntactic trees (using a statistical parser) augmented with coreference classes
    Procedure Revise(draft, non-draft, rules, target-compression):
        for each rule in rules:
            while ((compression(draft) - target-compression) < d):
                while (<x, y> := next-candidates(draft, non-draft)):   # e.g., binary rule
                    result := apply-rule(rule, x, y)                   # returns first result which succeeds
                    draft := draft ∪ result
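A hedged, runnable rendering of this draft-then-revise loop in Python; compression is measured in words, and apply_rule / next_candidates are assumed callables standing in for the elimination and aggregation rules described earlier:

    def build_draft(sentences, weights, target_compression):
        # pick highest-weighted sentences until the word budget is reached
        budget = target_compression * sum(len(s.split()) for s in sentences)
        chosen, used = [], 0
        for i in sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True):
            cost = len(sentences[i].split())
            if used + cost > budget:
                break
            chosen.append(i)
            used += cost
        return sorted(chosen)  # restore source order

    def revise(draft, non_draft, rules, target_compression, compression,
               next_candidates, apply_rule, delta=0.05):
        # keep applying each revision rule while the draft stays within
        # delta of the target compression (mirrors the Revise procedure above)
        for rule in rules:
            while (compression(draft) - target_compression) < delta:
                pair = next_candidates(draft, non_draft)
                if pair is None:
                    break
                result = apply_rule(rule, *pair)   # first result that succeeds, or None
                if result is None:
                    break
                draft = draft + [result]
        return draft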
Example of Sentence Revision (figure: a revised draft with deleted, salient, and aggregated material highlighted)
Informativeness vs. Coherence in Sentence Revision Mani, Gates, and Bloedorn (ACL’99): 630 summaries from 7 systems (of 90 documents) were revised and evaluated using a vocabulary overlap measure against TIPSTER answer keys. A: Aggregation, E: Elimination, I: initial draft • Informativeness (higher is better): A > I, A+E > I, A >* E, A+E >* E • Sentence Complexity (lower is better): A+E <* I, A >* I
The Need for Corpus-Based Sentence Extraction • Importance of particular features can vary with the genre of text • e.g., location features: • newspaper stories: leading text • scientific text: conclusion • TV news: previews • So, there is a need for summarization techniques that are adaptive, that can be trained for different genres of text
Learning Sentence Extraction Rules Few corpora available; labeling can be non-trivial, requiring aligning each document unit (e.g., sentence) with abstract. Learns to extract just individual sentences (though feature vectors can include contextual features).
Example 1: Kupiec et al. (1995) • Input • Uses a corpus of 188 full-text/abstract pairs drawn from 21 different scientific collections • Professionally written abstracts, 3 sentences long on average • The algorithm takes each sentence and computes a probability that it should be included in a summary, based on how similar it is to the abstract • Uses a Bayesian classifier • Result • About 87% (498) of all abstract sentences (568) could be matched to sentences in the source (79% direct matches, 3% direct joins, 5% incomplete joins) • Location was the best feature, at 163/498 = 33% • Para + fixed-phrase + sentence-length cutoff gave the best sentence recall performance: 217/498 = 44% • At a compression rate of 25% (20 sentences), performance peaked at 84% sentence recall
Example 2: Mani & Bloedorn (1998) • cmp-lg corpus (xxx.lanl.gov/cmp-lg) of scientific texts, prepared in SGML form by Simone Teufel at U. Edinburgh • 198 pairs of full-text sources and author-supplied abstracts • Full-text sources vary in size from 4 to 10 pages, dating from 1994-6 • SGML tags include: paragraph, title, category, summary, headings and heading depth (figures, captions and tables have been removed) • Abstract length averages about 5% (avg. 4.7 sentences) of source length • Processing • Each sentence in full-text source converted to feature vector • 27,803 feature-vectors (reduces to 903 unique vectors) • Generated generic and user focused summaries
Comparison of Learning Algorithms 20% compression, 10 fold cv Generic User-focused
Example Rules • Generic summary rule, generated by C4.5Rules (20% compression): If the sentence is in the conclusion and it is a high tf.idf sentence, Then it is a summary sentence • User-focused rules, generated by AQ (20% compression): If the sentence contains 15..20 keywords*, Then it is a summary sentence (163 total, 130 unique); If the sentence is in the middle third of the paragraph and the paragraph is in the first third of the section, Then it is a summary sentence (110 total, 27 unique) • *keywords: terms occurring in sentences ranked as highly relevant to the query (abstract)
Issues in Learning Sentence Extraction Rules • Choice of corpus • size of corpus • availability of abstracts/extracts/judgments • quality of abstracts/extracts/judgments • compression, representativeness, coherence, language, etc. • Choice of labeler to label a sentence as summary-worthy or not, based on a comparison between the source document sentence and the document's summary: • Label a source sentence (number) as summary-worthy if it is found in the extract • Compare summary sentence content with source sentence content (labeling by content similarity – L/CS) • Create an extract from an abstract (e.g., by alignment – L/A->E) • Feature Representation, Learning Algorithm, Scoring
L/CS in KPC • To determine if s ∈ E, they use a content-based match (since the summaries don’t always lift sentences from the full-text). • They match the source sentence to each sentence in the abstract. Two varieties of matches: • Direct sentence match: the summary sentence and source text sentence are identical or can be considered to have the same content. (79% of matches) • Direct join: two or more sentences from the source text (called joins) appear to have the same content as a single summary sentence. (3% of matches)
L/CS in MB98: Generic Summaries • For each source text • Represent abstract (list of sentences) • Match source text sentences against the abstract, giving a ranking for source sentences (i.e., abstract as “query”) • combined-match: compare source sentence against the entire abstract (similarity based on content-word overlap + weight) • individual-match: compare source sentence against each sentence of the abstract (similarity based on longest string match to any abstract sentence) • Label top C% of the matched source sentences’ vectors as positive • C (Compression) = 5, 10, 15, 20, 25 • e.g., C=10 => for a 100-sentence source text, 10 sentences will be labeled positive
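A minimal sketch of the combined-match labeling step (content-word overlap between each source sentence and the abstract as a whole); the tokenization, stopword list, and tie-breaking below are simplifications:

    STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "that", "for"}

    def content_words(sentence):
        return {w.lower().strip(".,;:") for w in sentence.split()} - STOPWORDS

    def label_generic(source_sents, abstract_sents, compression=0.10):
        # rank source sentences by overlap with the whole abstract ("combined-match"),
        # then label the top C% of them as positive training examples
        abstract_vocab = set()
        for s in abstract_sents:
            abstract_vocab |= content_words(s)
        overlap = [len(content_words(s) & abstract_vocab) for s in source_sents]
        k = max(1, round(compression * len(source_sents)))
        top = set(sorted(range(len(source_sents)),
                         key=lambda i: overlap[i], reverse=True)[:k])
        return [i in top for i in range(len(source_sents))]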
L/A->E in Jing et al. 98 (figure: abstract words w1, w2 aligned to source positions f1, f2) • Find the alignment f which maximizes P(f(w1) … f(wn)) • i.e., using the Markov assumption, P(f(w1) … f(wn)) ≈ ∏ i=1..n P(f(wi) | f(wi-1))
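A small sketch of the Markov-chain scoring behind this alignment; the transition model below is a toy “prefer nearby source positions” stand-in for trained probabilities, and the brute-force search only illustrates the objective (a real implementation would use Viterbi-style dynamic programming):

    import itertools
    import math

    def toy_transition_logprob(prev_pos, pos):
        # stand-in for P(f(wi) | f(wi-1)): nearby source positions score higher
        return -math.log(1 + abs(pos - prev_pos))

    def alignment_logprob(positions, trans=toy_transition_logprob):
        # log P(f(w1)...f(wn)) = sum over i of log P(f(wi) | f(wi-1))
        return sum(trans(positions[i - 1], positions[i]) for i in range(1, len(positions)))

    def best_alignment(candidate_positions):
        # candidate_positions[i] lists the source positions where abstract word i occurs
        return max(itertools.product(*candidate_positions), key=alignment_logprob)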
Sentence Extraction as Bayesian Classification P(s ∈ E | F1,…, Fn) = ∏ j=1..n P(Fj | s ∈ E) · P(s ∈ E) / ∏ j=1..n P(Fj) • P(s ∈ E): prior probability that a sentence is included in the extract, i.e., the compression rate c • P(s ∈ E | F1,…, Fn): probability that sentence s is included in extract E, given the sentence’s feature-value pairs • P(Fj): probability of a feature-value pair occurring in a source sentence • P(Fj | s ∈ E): probability of a feature-value pair occurring in a source sentence which is also in the extract • The features are discretized into Boolean features, to simplify matters
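A minimal Python sketch of this classifier over Boolean feature vectors; the add-one smoothing and the use of the score purely for ranking are simplifications on top of the formula above:

    def train_bayes(vectors, labels):
        # estimate P(Fj=1 | s in E), P(Fj=1), and the prior P(s in E)
        n, n_pos = len(labels), sum(labels)
        n_feats = len(vectors[0])
        p_f_given_e = [(1 + sum(v[j] for v, y in zip(vectors, labels) if y)) / (2 + n_pos)
                       for j in range(n_feats)]
        p_f = [(1 + sum(v[j] for v in vectors)) / (2 + n) for j in range(n_feats)]
        return p_f_given_e, p_f, n_pos / n

    def extract_score(vector, p_f_given_e, p_f, prior):
        # P(s in E | F1..Fn) under the naive independence assumption;
        # sentences are ranked by this score and the top ones extracted
        num, den = prior, 1.0
        for fj, pe, pf in zip(vector, p_f_given_e, p_f):
            num *= pe if fj else (1 - pe)
            den *= pf if fj else (1 - pf)
        return num / den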
Cohesion • There are links in text, called ties, which express semantic relationships • Two classes of relationships: • Grammatical cohesion • anaphora • ellipsis • conjunction • Lexical cohesion • synonymy • hypernymy • repetition
Martian Weather with Grammatical and Lexical Cohesion Relations With its distant orbit 50 percent farther from the sun than Earth and slim atmospheric blanket, Mars experiences frigid weather conditions. Surface temperatures typically average about –60 degrees Celsius (–76 degrees Fahrenheit) at the equator and […] can dip to –123 degrees C near the poles. Only the midday sun at tropical latitudes is warm enough to thaw ice on occasion, but any liquid water formed in this way would evaporate almost instantly because of the low atmospheric pressure. Although the atmosphere holds a small amount of water, and water-ice clouds sometimes develop, most Martian weather involves blowing dust or carbon dioxide. Each winter, for example, a blizzard of frozen carbon dioxide rages over one pole, and a few meters of this dry-ice snow accumulate as previously frozen carbon dioxide evaporates from the opposite polar cap. Yet even on the summer pole, where the sun remains in the sky all day long, temperatures never warm enough to melt frozen water.
Text Graphs based on Cohesion • Represent a text as a graph • Nodes: words (or sentences) • Links: Cohesion links between nodes • Graph Connectivity Assumption: • More highly connected nodes are likely to carry salient information.
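A minimal sketch of the graph connectivity assumption in Python: sentences become nodes, repeated words become links, and nodes are ranked by how connected they are (a simplification of the centrality and spreading-activation methods surveyed on the next slide):

    def build_cohesion_graph(sentences, min_overlap=2):
        # link two sentences if they share at least min_overlap words
        # (stopword filtering and synonymy/coreference links omitted for brevity)
        words = [{w.lower().strip(".,;:") for w in s.split()} for s in sentences]
        edges = {i: set() for i in range(len(sentences))}
        for i in range(len(sentences)):
            for j in range(i + 1, len(sentences)):
                if len(words[i] & words[j]) >= min_overlap:
                    edges[i].add(j)
                    edges[j].add(i)
        return edges

    def rank_by_connectivity(edges):
        # more highly connected nodes are assumed to carry salient information
        return sorted(edges, key=lambda i: len(edges[i]), reverse=True)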
Cohesion-based Graphs (figure: three example text graphs, with chain, ring, monolith, and piecewise topologies; links between nodes more than 5 apart ignored; "Best 30p links at density 2.00, seg_csim 0.26"; paragraph clusters labeled "Facts about an issue" and "Legality of an issue") • Skorochodhko 1972: Node = Sentence, Link = Related, Method = node centrality and topology • Salton et al. 1994: Node = Paragraph, Link = Cosine Similarity, Method = local segmentation then node centrality • Mani & Bloedorn 1997: Node = Words/Phrases, Link = Lexical/Grammatical Cohesion, Method = node centrality discovered by spreading activation (see also clustering using lexical chains)
Coherence • Coherence is the modeling of discourse relations using different sources of evidence, e.g., • Document format • layout in terms of sections, chapters, etc. • page layout • Topic structure • TextTiling (Hearst) • Rhetorical structure • RST (Mann & Mathiessen) • Text Grammars (vanDijk, Longacre) • Genre-specific rhetorical structures (Methodology, Results, Evaluation, etc.) (Liddy , Swales, Teufel & Moens, Saggion & Lapalme, etc.) • Narrative structure
Using a Coherence-based Discourse Model in Summarization • Choose a theory of discourse structure • Parse text into a labeled tree of discourse segments, whose leaves are sentences or clauses • Leaves typically need not have associated semantics • Weight nodes in tree, based on node promotion and clause prominence • Select leaves based on weight • Print out selected leaves for summary synthesis
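A minimal Python sketch of the node-promotion idea used in this kind of weighting: nuclei inherit their parent's level, satellites drop a level, and leaves promoted closest to the root are selected first. The tree representation below is an assumption for illustration; this is not Marcu's exact algorithm.

    class DiscourseNode:
        # internal nodes have children and a designated nucleus; leaves carry a unit id
        def __init__(self, children=None, nucleus=0, leaf_id=None):
            self.children = children or []
            self.nucleus = nucleus
            self.leaf_id = leaf_id

    def promotion_levels(node, level=0, out=None):
        # record the shallowest level each leaf is promoted to (lower = more salient)
        if out is None:
            out = {}
        if node.leaf_id is not None:
            out[node.leaf_id] = min(level, out.get(node.leaf_id, level))
            return out
        for k, child in enumerate(node.children):
            promotion_levels(child, level if k == node.nucleus else level + 1, out)
        return out

    def select_units(tree, n):
        # pick the n most promoted leaves and return them in text order
        levels = promotion_levels(tree)
        return sorted(sorted(levels, key=levels.get)[:n])

Applied to a suitable discourse tree for the Martian weather passage, this style of promotion is what yields orderings like the 2 > 8 > {3, 10} ranking on the next slide.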
Martian Weather Summarized Using Marcu’s Algorithm (target length = 4 sentences) [With its distant orbit {– 50 percent farther from the sun than Earth –} and slim atmospheric blanket,1] [Mars experiences frigid weather conditions.2] [Surface temperatures typically average about –60 degrees Celsius (–76 degrees Fahrenheit) at the equator and can dip to –123 degrees C near the poles.3] [Only the midday sun at tropical latitudes is warm enough to thaw ice on occasion,4] [but any liquid water formed that way would evaporate almost instantly5] [because of the low atmospheric pressure.6] [Although the atmosphere holds a small amount of water, and water-ice clouds sometimes develop,7] [most Martian weather involves blowing dust or carbon dioxide.8] [Each winter, for example, a blizzard of frozen carbon dioxide rages over one pole, and a few meters of this dry-ice snow accumulate as previously frozen carbon dioxide evaporates from the opposite polar cap.9] [Yet even on the summer pole, {where the sun remains in the sky all day long,} temperatures never warm enough to melt frozen water.10] 2 > 8 > {3, 10} > {1, 4, 5, 7, 9}