This paper presents the ICSI Summarization System, which aims to maximize the content value and linguistic quality of summaries by using concepts, word bigrams, and linear programming techniques. The system is evaluated using TAC 2008 data and achieves excellent performance in terms of content value, although there is room for improvement in linguistic quality.
The ICSI Summarization System
Dan Gillick, Benoit Favre, and Dilek Hakkani-Tür
{dgillick, favre, dilek}@icsi.berkeley.edu
International Computer Science Institute, Berkeley, CA
Who Are We?
• Dan Gillick: Graduate student at UC Berkeley
• Benoit Favre: Postdoc at ICSI, PhD from Avignon
• Dilek Hakkani-Tür: Senior Researcher at ICSI
ICSI at TAC 2008
Summarization Assumptions
• Information is conveyed by discrete, independent concepts.
• The content value of a summary can be measured by the total value of the unique concepts it contains.
• Linguistic quality is enforced primarily by units of selection (e.g. sentences).
What are Concepts?
Original sentence: Christians make up just 3 percent of Iraq's population of about 25 million.
Pyramid concepts:
(1) Christians make up 3 percent of Iraq’s population
(2) The population of Iraq is 25 million
Word bigram concepts:
(1) Christians make
(2) 3 percent
(3) Iraq’s population
(4) 25 million
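A minimal sketch of word bigram extraction for the sentence above. The stop-word list here is a tiny hypothetical stand-in, stemming is skipped, and the function name `word_bigrams` is illustrative; none of these details come from the slides. The four bigrams listed on the slide appear among the pairs this produces.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this example).
STOPWORDS = {"of", "the", "a", "an", "is", "just", "about", "up"}

def word_bigrams(sentence):
    """Return the set of adjacent word pairs left after stop-word removal."""
    # Lowercase tokens; keep a trailing possessive such as "iraq's" in one piece.
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sentence.lower())
    content = [t for t in tokens if t not in STOPWORDS]
    return set(zip(content, content[1:]))

sentence = "Christians make up just 3 percent of Iraq's population of about 25 million."
concepts = word_bigrams(sentence)
for pair in sorted(concepts):
    print(" ".join(pair))
```

Because concepts are stored as a set, a bigram that occurs twice in a summary is only counted once, which matches the "unique concepts" assumption.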
ILP Formulation
Maximize a single linear objective function, where:
i: concept index
ci: indicator for concept i in summary
wi: weight (value) of concept i
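A reconstruction from the symbol definitions above, not copied from the slide: if the value of a summary is the total weight of the unique concepts it contains, the objective can be written as

```latex
\max \; \sum_{i} w_i \, c_i, \qquad c_i \in \{0, 1\}
```

Each concept contributes its weight at most once, however many selected sentences contain it.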
ILP Formulation
Maximize a single linear objective function, subject to linear constraints, where:
i: concept index
j: sentence index
ci: indicator for concept i in summary
sj: indicator for sentence j in summary
wi: weight (value) of concept i
lj: length of sj
oij: indicator for ci in sj
L: maximum summary length
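One set of linear constraints consistent with these definitions is sketched below. It is an assumption based on the listed symbols, not a transcription of the slide: a length budget, plus two consistency conditions tying concept indicators to sentence indicators.

```latex
\begin{align}
\sum_{j} l_j \, s_j &\le L \\
s_j \, o_{ij} &\le c_i \quad \forall i, j \\
\sum_{j} s_j \, o_{ij} &\ge c_i \quad \forall i \\
c_i, \, s_j &\in \{0, 1\} \quad \forall i, j
\end{align}
```

The first line keeps the summary within the length limit. The second says that selecting a sentence selects every concept it contains, and the third says a concept can only be counted if at least one selected sentence contains it.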
Building Systems (1)
ICSI-1
• Concepts: word bigrams
• Mapping Function: document frequency
• only include sentences with some query overlap
• prune concepts appearing in fewer than 3 documents
• Units of Selection: sentences
ICSI-2
• Units of Selection: compressed sentence candidates
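The ICSI-1 recipe can be sketched end to end on toy data. This is an illustration under stated assumptions, not the submitted system: the documents and query are invented, there is no stemming or stop-word removal, length is counted in whitespace-separated words, and an exhaustive search over sentence subsets stands in for an ILP solver (fine for a handful of sentences, not for real inputs). All function names are hypothetical.

```python
import re
from itertools import combinations

def bigrams(text):
    """Set of adjacent lowercase word pairs (no stemming or stop-word removal here)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return set(zip(tokens, tokens[1:]))

def concept_weights(docs, min_df=3):
    """Document frequency per bigram, pruning bigrams seen in fewer than min_df documents."""
    df = {}
    for doc in docs:  # each doc is a list of sentence strings
        for concept in set().union(*(bigrams(s) for s in doc)):
            df[concept] = df.get(concept, 0) + 1
    return {c: n for c, n in df.items() if n >= min_df}

def overlaps_query(sentence, query):
    """True when the sentence shares at least one word with the query."""
    words = lambda t: set(re.findall(r"[a-z0-9']+", t.lower()))
    return bool(words(sentence) & words(query))

def select(sentences, weights, max_words):
    """Exhaustive search for the subset whose unique concepts have the highest total weight."""
    best, best_value = [], 0
    for k in range(1, len(sentences) + 1):
        for subset in combinations(sentences, k):
            if sum(len(s.split()) for s in subset) > max_words:
                continue  # violates the length limit
            covered = set().union(*(bigrams(s) for s in subset))
            value = sum(weights.get(c, 0) for c in covered)
            if value > best_value:
                best, best_value = list(subset), value
    return best, best_value

# Invented toy input: three documents of two sentences each.
docs = [
    ["Four churches were attacked in Baghdad.", "The weather was mild."],
    ["Officials said four churches were attacked.", "Many families left Baghdad."],
    ["Four churches were attacked on Sunday.", "Markets stayed open."],
]
query = "attacks on churches in Baghdad"

weights = concept_weights(docs)
candidates = [s for doc in docs for s in doc if overlaps_query(s, query)]
summary, value = select(candidates, weights, max_words=12)
print(summary, value)
```

With a 12-word limit two of the near-duplicate sentences would fit, but the second adds no new concepts, so the search keeps a single sentence. This is how counting each concept once discourages redundancy without a separate redundancy penalty.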
Building Systems (2)
MRO (Maximum ROUGE Oracle)
• Concepts: word bigrams
• Mapping Function: document frequency in human “gold” summaries
• Units of Selection: sentences
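A small sketch of the oracle's mapping function, assuming "document frequency in human gold summaries" means the number of reference summaries that contain each bigram. The reference texts below are invented toy data and `oracle_weights` is a hypothetical name.

```python
import re
from collections import Counter

def bigrams(text):
    """Set of adjacent lowercase word pairs."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return set(zip(tokens, tokens[1:]))

def oracle_weights(reference_summaries):
    """Weight each bigram by the number of reference summaries that contain it."""
    counts = Counter()
    for summary in reference_summaries:
        counts.update(bigrams(summary))  # a set, so each summary counts once per bigram
    return dict(counts)

# Invented toy references standing in for human summaries.
references = [
    "Christians are fleeing Baghdad.",
    "Many Christians are fleeing Iraq.",
]
weights = oracle_weights(references)
print(weights)
```

Selecting sentences to maximize these weights favors bigrams shared with many references, which is roughly what a bigram-overlap metric such as ROUGE-2 rewards.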
Pre/post-processing
• Sentence segmentation, tokenization, stop-words, Porter stemming – NLTK
• Simple rules for removing newswire headers and formatting markup
• ICSI-1, MRO: ordering first by source date, then by sentence number
• ICSI-2: dendrogram ordering (not clear this is better)
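The ICSI-1 and MRO ordering rule amounts to a two-key sort. A minimal sketch, assuming each selected sentence carries its source document date and its position within that document; the field names `doc_date` and `position` and the sample records are hypothetical.

```python
from datetime import date

# Hypothetical records for three selected sentences.
selected = [
    {"text": "C", "doc_date": date(2005, 1, 18), "position": 0},
    {"text": "B", "doc_date": date(2005, 1, 17), "position": 5},
    {"text": "A", "doc_date": date(2005, 1, 17), "position": 2},
]

# Sort by source date first, then by sentence number within the document.
ordered = sorted(selected, key=lambda s: (s["doc_date"], s["position"]))
print([s["text"] for s in ordered])
```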
Only the Most Related Work
• Assigning value to words based on frequency (Nenkova and Vanderwende, 2005)
• Global optimization with learned word values using a beam search (Yih et al., 2007)
• Set cover formalism for summarization (Filatova and Hatzivassiloglou, 2004)
• ILP for summarization (McDonald, 2007)
• Approximate ROUGE-1 oracle results (Conroy et al., 2006)
TAC Results (1)
• Excellent performance on non-update problems: a t-test shows no significant difference between ICSI-1 and the best system in every category
• No specific update task processing
TAC Results (2)
• Overall best ROUGE scores
• Relatively poor linguistic quality
Linguistic Quality Analysis
Among summaries receiving linguistic quality scores of 1 or 2, we counted how many contained each type of error:
• ICSI-1 could be drastically improved by better sentence segmentation and rules for removing a few words.
• ICSI-2 is too aggressive with sentence compression.
• Co-reference resolution is a major problem.
An Oracle Experiment (1)
Data: DUC 2007 update task set A (10 topics)
Note: The “content responsiveness” evaluation used here does not take linguistic quality into account, unlike responsiveness in TAC 2008.
Systems Evaluated:
B1: Returns all leading sentences up to the length limit from the most recent document
B2: NIST’s “high performance generic summarizer” (Conroy et al., 2004)
ICSI-1: Our submitted system
MRO: The oracle system
H: Each of 4 human summaries written by NIST’s IR experts
An Oracle Experiment (2)
• MRO gets better content scores than ICSI-1, but worse than humans
• All differences are significant at the 95% confidence level, using Tukey’s “Honestly Significant Difference” test
• Suggests there is room for improvement in sentence extraction
Not Randomly Selected Example
Summaries for D0808-A: “Describe the events related to Christian minorities in Iraq and their current status.”

MRO: Iraq's Christians, increasingly targeted by insurgents, are fleeing Baghdad for the safety of the Kurdish north or neighboring Syria and Jordan. But the exodus is temporary, insist many, because they are not selling their homes and property. Christians make up just 3 percent of Iraq's population of about 25 million. Officials estimate that as many as 15,000 of Iraq's nearly one million Christians have left the country since August, when four churches in Baghdad and one in Mosul were attacked in a coordinated series of car bombings. Insurgents abducted Syrian Catholic Archbishop Basile Casmoussa apparently to frighten Iraqi Christians.
Responsiveness: ? Linguistic Quality: ? Pyramid: ? ROUGE-2: 0.186

ICSI-1: In an interview, Yonadem Kana, the leader of the Assyrian Democratic Movement in Iraq and a member of the Iraqi National Council, said the fighters have been deployed in Baghdida near the northern city of Mosul. Christians make up just 3 percent of Iraq's population of about 25 million. Officials estimate that as many as 15,000 of Iraq's nearly one million Christians have left the country since August, when four churches in Baghdad and one in Mosul were attacked in a coordinated series of car bombings. Most of Christians in Iraq are in Baghdad and northern cities.
Responsiveness: 3 Linguistic Quality: 3 Pyramid: 0.489 ROUGE-2: 0.118

ICSI-2: Officials estimate that as many as 15000 of Iraq's nearly one million Christians have left the country since August, when four churches in Baghdad and one in Mosul were attacked in a coordinated series of car bombings. Most of Christians in Iraq are in Baghdad and northern cities. Christians make up just 3 percent of Iraq's population of about 25 million. Armed men kidnapped a Catholic archbishop in Iraq's main northern city of Mosul Monday. In an interview, Yonadem Kana, the leader of the Assyrian Democratic Movement in Iraq and a member, said the fighters have been deployed in Baghdida.
Responsiveness: 4 Linguistic Quality: 4 Pyramid: 0.517 ROUGE-2: 0.119
Conclusion
• ICSI system is simple, fast, and performs well.
• Linguistic quality needs work but a set of rules for cleaning sentences will help a lot.
• Oracle system suggests:
• room for improvement in sentence selection
• more is likely needed to match human performance
• Source code available soon (www.dgillick.com/summarize.html)