Towards Automated Related Work Summarization (ReWoS)

Towards Automated Related Work Summarization (ReWoS)
HOANG Cong Duy Vu
03/12/2010

Outline
• Recall
• A Motivating Example
• The Proposed Approach
• General Content Summarization (GCSum)
• Specific Content Summarization (SCSum)
• Generation
• Experiments & Results
• Future Work
• Conclusion

Recall
RW: related work
[Diagram: the user gives the RW Summarizer a set of articles, a desired length and a topic hierarchy tree (an assumption of this work); the RW Summarizer returns an RW summary.]

A Motivating Example
A related work section extracted from “Bilingual Topic Aspect Classification with a Few Training Examples” (Wu et al., 2008)

The Proposed Approach
[Figure: the ReWoS architecture, with separate paths for internal nodes and for leaf nodes. Decision edges are labeled as (T)rue, (F)alse or (R)elevant.]

The Proposed Approach
• Pre-processing
  - Based on heuristic rules over sentence length and lexical clues
  - Removes sentences whose token-based length is too short (<7) or too long (>80)
  - Removes sentences written in the future tense
  - Removes sentences containing obviously redundant clues such as “in the section ...”, “figure XXX shows ...”, “for instance” …
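The pre-processing heuristics above can be sketched roughly as follows. This is only an illustration: the length thresholds (7 and 80 tokens) come from the slide, while the whitespace tokenizer, the regular expressions and the name `keep_sentence` are assumptions, and matching "will" or "going to" is a crude stand-in for real future-tense detection.

```python
import re

MIN_TOKENS = 7    # from the slide: shorter sentences are dropped
MAX_TOKENS = 80   # from the slide: longer sentences are dropped

# Assumed patterns approximating the clues named on the slide.
REDUNDANT_CLUES = [r"\bin the section\b", r"\bfigure \w+ shows\b", r"\bfor instance\b"]
# Assumption: a crude proxy for "sentences referring to future tenses".
FUTURE_CLUES = [r"\bwill\b", r"\bgoing to\b"]


def keep_sentence(sentence: str) -> bool:
    """Return True if the sentence survives the heuristic filters."""
    tokens = sentence.split()  # assumption: whitespace tokenization
    if len(tokens) < MIN_TOKENS or len(tokens) > MAX_TOKENS:
        return False
    lowered = sentence.lower()
    return not any(re.search(p, lowered) for p in FUTURE_CLUES + REDUNDANT_CLUES)
```

A real system would likely use a proper tokenizer and part-of-speech tags rather than these string patterns.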

The Proposed Approach
• Agent-based rule
  - Attempts to distinguish whether or not a sentence describes an author’s own work.
  - Based on the presence of tokens that signal work done by the author, such as “we”, “our”, “us”, “this approach”, and “this method” …
  - If a sentence does not satisfy this rule, it is routed to GCSum; otherwise it is routed to SCSum.
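A minimal sketch of this routing decision, assuming plain case-insensitive matching on word boundaries. Only the five clues named on the slide are included (the slide's list ends with "…", so the full list is longer), and the function name `route` is hypothetical.

```python
import re

# Clues named on the slide; the original list is longer.
AGENT_CLUES = ("we", "our", "us", "this approach", "this method")

_AGENT_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(clue) for clue in AGENT_CLUES) + r")\b",
    re.IGNORECASE,
)


def route(sentence: str) -> str:
    """Return "SCSum" for author-as-agent sentences, "GCSum" otherwise."""
    return "SCSum" if _AGENT_PATTERN.search(sentence) else "GCSum"
```

The word-boundary anchors keep words such as "previous" from matching "us"; case-insensitive matching would still confuse the country abbreviation "US" with the pronoun.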

General Content Summarization (GCSum)
• The objective of GCSum is to extract sentences containing useful background information on the topics of the internal node in focus.

General Content Summarization (GCSum)
General content: informative vs. indicative sentences
• Informative
  - Text classification is a task that assigns a certain number of pre-defined labels for a given text.
  - Statistical machine translation (SMT) seeks to develop mathematical models of the translation process whose parameters can be automatically estimated from a parallel corpus.
• Indicative
  - Many previous studies have approached the problem of mono-lingual text classification.
  - This paper refers to the problem of sentiment analysis.

General Content Summarization (GCSum)
• Informative sentences
  - Give detail on a specific aspect of the problem, e.g. definitions, purpose or application of the topic
• Indicative sentences
  - Simpler; inserted to make the topic transition explicit and rhetorically sound
• Summarization issue
  - Given a topic:
  - For indicative sentences, use pre-defined templates
  - For informative sentences, extract from the input articles

General Content Summarization (GCSum)
• Subject-based rule: GCSum first checks the subject of each candidate sentence, filtering out those whose subjects do not contain at least one topic keyword.
• Or, verb-based rule: GCSum checks whether stock verb phrases (i.e., “based on”, “make use of” and 23 other patterns) are used as the main verb.
• Or, citation-based rule: GCSum checks for the presence of at least one citation; general sentences may list a set of citations as examples.
Important: if no informative sentence can be found in the input articles, indicative sentences are generated instead.
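One way to read the three rules above is as alternatives, where a sentence counts as informative if any rule fires. The sketch below takes that reading, which is an assumption. It also assumes that a parser has already supplied each sentence's subject and main verb (the `Candidate` fields are hypothetical), lists only the two stock verb phrases named on the slide, uses a rough regular expression for author-year citations, and borrows one of the indicative example sentences as a made-up template.

```python
import re
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    subject: str    # assumed to come from a syntactic parser
    main_verb: str  # assumed to come from a syntactic parser


# Only the two patterns named on the slide; the other 23 are not listed there.
STOCK_VERBS = {"based on", "make use of"}
# Assumption: citations look like "(Wu et al., 2008)".
CITATION = re.compile(r"\([A-Z][^()]*\d{4}[a-z]?\)")
# Hypothetical template for the indicative fallback.
INDICATIVE_TEMPLATE = "Many previous studies have approached the problem of {topic}."


def is_informative(candidate: Candidate, topic_keywords: list[str]) -> bool:
    """Apply the subject-, verb- and citation-based rules as alternatives."""
    subject = candidate.subject.lower()
    subject_rule = any(k.lower() in subject for k in topic_keywords)
    verb_rule = candidate.main_verb.lower() in STOCK_VERBS
    citation_rule = CITATION.search(candidate.text) is not None
    return subject_rule or verb_rule or citation_rule


def general_sentences(candidates, topic, topic_keywords):
    """Return informative sentences, or one templated indicative sentence."""
    informative = [c.text for c in candidates if is_informative(c, topic_keywords)]
    return informative or [INDICATIVE_TEMPLATE.format(topic=topic)]
```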

General Content Summarization (GCSum)
• Topic relevance computation (GCSum)
  - Ranks sentences based on keyword content
  - States that the topic of an internal node is affected by its surrounding nodes: ancestors, descendants and others
  - $score_S = score_S^{QA} + score_S^{Q} - score_S^{QR}$
  - $score_S$ is the final relevance score
  - $score_S^{QA}$, $score_S^{Q}$ and $score_S^{QR}$ are the component relevance scores of the sentence S with respect to the ancestor, current and other remaining nodes, respectively.

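General Content Summarization (GCSum)
• Topic relevance computation (GCSum)
  - The maximum number of sentences for each intermediate node is 2 to 3.
  - [Figure: example topic tree with nodes 1 to 7. For the internal node in focus (node 4, "itself"), node 1 is marked as its ancestor and the remaining nodes as "others".]
  - The linear combination: S'(4) = S(ancestors) + S(itself) - S(others) = S(1) + S(4) - S(others)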
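The linear combination can be sketched as below, assuming the three component scores of each sentence have already been computed. The function names and the dictionary layout are illustrative only; the cap of 3 sentences per node follows the "2 to 3" stated on the slide.

```python
def gcsum_score(ancestors: float, itself: float, others: float) -> float:
    """Reward relevance to the ancestors and the node itself, penalize other nodes."""
    return ancestors + itself - others


def top_sentences(components: dict[str, tuple[float, float, float]],
                  max_sentences: int = 3) -> list[str]:
    """Rank sentences by combined score and keep at most max_sentences.

    components maps a sentence to its (ancestors, itself, others) scores.
    """
    ranked = sorted(components, key=lambda s: gcsum_score(*components[s]), reverse=True)
    return ranked[:max_sentences]
```

Subtracting the "others" component pushes down sentences that are mostly about different parts of the topic tree.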

General Content Summarization (GCSum)
• To obtain each component relevance score, we employ TF×ISF (term frequency × inverse sentence frequency) relevance computation.
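As a rough illustration of TF×ISF scoring, the sketch below uses one common variant: the keyword's frequency in the sentence multiplied by the log of the inverse sentence frequency. The slides do not give the exact formula used in ReWoS, so the weighting, the whitespace tokenization and the name `tf_isf_scores` are all assumptions.

```python
import math
from collections import Counter


def tf_isf_scores(sentences: list[str], keywords: list[str]) -> list[float]:
    """Score each sentence against single-token keywords with a TF x ISF variant."""
    tokenized = [s.lower().split() for s in sentences]  # assumption: whitespace tokens
    n = len(tokenized)

    # Sentence frequency: in how many sentences each token occurs.
    sentence_freq = Counter()
    for tokens in tokenized:
        sentence_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for keyword in keywords:
            k = keyword.lower()
            if sentence_freq[k]:
                # Keywords that are rare across sentences get a larger weight.
                score += tf[k] * math.log(n / sentence_freq[k])
        scores.append(score)
    return scores
```

Under this variant a keyword that appears in every sentence contributes nothing, because its inverse sentence frequency is log(1) = 0.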

Specific Content Summarization (SCSum)
• Sentences that are marked with author-as-agent are input to the Specific Content Summarization (SCSum) module.
• SCSum aims to extract sentences that contain detailed information about a specific author’s work that is relevant to the input leaf nodes’ topic.

Specific Content Summarization (SCSum)
• Topic relevance computation (SCSum)
  - Initially, each leaf node is assigned an equal number of sentences.
  - The relevance score is computed with a formula similar to the one presented earlier for GCSum.
  - [Figure: example topic tree with nodes 1 to 7. For the leaf node in focus (node 2, "itself"), node 3 is its sibling and nodes 1 and 4 are its ancestors.]
  - The linear combination: S'(2) = S(ancestors) + S(itself) - S(siblings) = S(1 + 4) + S(2) - S(3)

Specific Content Summarization (SCSum)
• Context modeling
  - Motivation: single sentences occasionally do not contain enough context to clearly express the idea mentioned in the original articles
  - Use the surrounding context to increase confidence in the topic relevance of agent-based sentences
  - final_score(sentence) = score(sentence) + score(contexts)

SCSum - Context modeling
Example extracted from (Bannard and Callison-Burch 2005)
*** We evaluated the accuracy of each of the paraphrases that was extracted from the manually aligned data, as well as the top ranked paraphrases from the experimental conditions detailed below in Section 3.3.
*** Because the accuracy of paraphrases can vary depending on context, we substituted each set of candidate paraphrases into between 2-10 sentences which contained the original phrase.
*** Figure 4 shows the paraphrases for under control substituted into one of the sentences in which it occurred.
*** We created a total of 289 such evaluation sets, with a total of 1366 unique sentences created through substitution.
*** We had two native English speakers produce judgments as to whether the new sentences preserved the meaning of the original phrase and as to whether they remained grammatical.
*** Paraphrases that were judged to preserve both meaning and grammaticality were considered to be correct, and examples which failed on either judgment were considered to be incorrect.
[Labels in the figure: agent-based sentence, adjacent sentences, summary sentence]
*** (Bannard and Callison-Burch 2005) replaced phrases with paraphrases in a number of sentences and asked judges whether the substitutions “preserved meaning and remained grammatical.”

Specific Content Summarization (SCSum)
• Context modeling
  - Choose nearby sentences within a contextual window (size 5) after the agent-based sentence, so that the score better represents how the passage relates to the given topic.
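Context modeling can be sketched as adding the scores of the following sentences to the score of the agent-based sentence. The window size of 5 is from the slide; summing the individual context scores is an assumption (the slides only state that a context score is added), and `final_score` is a hypothetical helper name.

```python
def final_score(scores: list[float], index: int, window: int = 5) -> float:
    """Add the relevance of up to `window` following sentences to sentence `index`.

    scores holds the topic relevance score of every sentence in article order.
    """
    context = scores[index + 1 : index + 1 + window]  # shorter near the article end
    return scores[index] + sum(context)
```

With this scheme an agent-based sentence that is weak on its own can still rank well if the sentences following it are strongly on topic.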

Specific Content Summarization (SCSum)
• Weighting
  - Observation: whether a sentence contains keywords of the current, ancestor and sibling nodes may affect the final score from the computation
  - Add a weighting coefficient to the score computed by the topic relevance computation (SCSum); the coefficient takes on differing values based on the presence of keywords in the sentence:
  - If the sentence contains no sibling keywords:
    + Keywords in both ancestors and itself → 1
    + Keywords in itself only → 0.5
    + Keywords in ancestors only → 0.25
  - If the sentence contains sibling keywords → 0.1 (penalty)
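The coefficient table can be written as a small lookup, sketched below. The four values come from the slide. Two points are assumptions: the coefficient is applied multiplicatively to the combined SCSum score, and a sentence with no keywords at all gets 0.0 (the slide does not cover that case).

```python
def weighting_coefficient(tokens: set[str], itself: set[str],
                          ancestors: set[str], siblings: set[str]) -> float:
    """Pick the coefficient from which nodes' keywords occur in the sentence."""
    if tokens & siblings:
        return 0.1  # penalty: the sentence mentions a sibling topic
    in_itself = bool(tokens & itself)
    in_ancestors = bool(tokens & ancestors)
    if in_itself and in_ancestors:
        return 1.0
    if in_itself:
        return 0.5
    if in_ancestors:
        return 0.25
    return 0.0  # assumption: case not specified on the slide


def scsum_score(coefficient: float, ancestors: float,
                itself: float, siblings: float) -> float:
    """Assumed multiplicative use of the coefficient on the linear combination."""
    return coefficient * (ancestors + itself - siblings)
```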

Specific Content Summarization (SCSum)
• Ranking & Re-ranking
  - Sentences are ranked in descending order of their relevance scores
  - Then, a simplified MMR (SimRank) is performed:
  - A sentence X is removed if its cosine similarity with any sentence Y already chosen at previous steps of SimRank exceeds a pre-defined threshold (0.75).
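A minimal sketch of this re-ranking step, assuming bag-of-words term-frequency vectors for the cosine similarity (the slides do not say which sentence representation is used). The threshold of 0.75 is from the slide; the function names are illustrative.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[token] for token, count in va.items())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def sim_rank(ranked: list[str], threshold: float = 0.75) -> list[str]:
    """Walk the ranked list and skip sentences too similar to one already chosen."""
    chosen: list[str] = []
    for sentence in ranked:
        if all(cosine(sentence, kept) <= threshold for kept in chosen):
            chosen.append(sentence)
    return chosen
```

Because the input is already sorted by relevance, the higher-scoring sentence of any near-duplicate pair is the one that survives.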

Post-Processing
• Two steps:
  - First, replace agentive forms (e.g., “we”, “our”, “this study”, ...) with a citation to the article
  - Second, resolve abbreviations found in the extracted sentences
  - E.g. SMT → Statistical Machine Translation
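The two post-processing steps could look roughly like this. The sketch covers only the three agentive forms named on the slide, assumes the abbreviation table is supplied by the caller, and does not repair grammar after substitution (for example, replacing "our" yields a bare citation before the noun). `post_process` is a hypothetical name.

```python
import re

# The three agentive forms named on the slide; the original list is longer.
AGENTIVE = re.compile(r"\b(?:we|our|this study)\b", re.IGNORECASE)


def post_process(sentence: str, citation: str, abbreviations: dict[str, str]) -> str:
    """Expand known abbreviations, then replace agentive forms with a citation."""
    for short, full in abbreviations.items():
        # A callable replacement avoids backslash-escape handling in the expansion.
        sentence = re.sub(r"\b" + re.escape(short) + r"\b",
                          lambda match, full=full: full, sentence)
    return AGENTIVE.sub(lambda match: citation, sentence)
```

For example, `post_process("We use SMT.", "(Wu et al., 2008)", {"SMT": "Statistical Machine Translation"})` rewrites both the pronoun and the abbreviation.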

Generation
• In this work, we generate the related work summaries simply by using a depth-first traversal to order the topic nodes of the topic tree.
• Node ordering: 1 - 4 - 2 - 3 - 5 - 6 - 7
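The node ordering above is what a pre-order depth-first traversal produces on a tree where node 1 has children 4 and 5, node 4 has children 2 and 3, and node 5 has children 6 and 7. That tree shape is inferred from the ordering and the earlier diagrams, so treat it as an assumption.

```python
def depth_first_order(tree: dict[int, list[int]], root: int) -> list[int]:
    """Pre-order traversal: visit a node, then each child subtree left to right."""
    order = [root]
    for child in tree.get(root, []):
        order.extend(depth_first_order(tree, child))
    return order


# Assumed shape of the example topic tree.
EXAMPLE_TREE = {1: [4, 5], 4: [2, 3], 5: [6, 7]}
print(depth_first_order(EXAMPLE_TREE, 1))  # [1, 4, 2, 3, 5, 6, 7]
```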

Experiments & Results
• Dataset
  - Uses RWSData, described before, which includes 20 sets
  - 10 of the 20 sets were evaluated both automatically and manually.
• Baselines
  - LEAD (title + abstract-based RW)
  - MEAD (centroid + cosine similarity)
• Proposed systems
  - ReWoS-WCM (ReWoS without context modeling)
  - ReWoS-CM (ReWoS with context modeling)

Experiments & Results
• Automatic evaluation
  - Uses ROUGE variants (ROUGE-1, ROUGE-2, ROUGE-S4, ROUGE-SU4)
• Manual evaluation, measured on a 5-point scale from 1 (very poor) to 5 (very good)
  - Correctness: Is the summary content actually relevant to the hierarchical topics given?
  - Novelty: Does the summary introduce novel information that is significant in comparison with the human-created summary?
  - Fluency: Does the summary’s exposition flow well, in terms of syntax as well as discourse?
  - Usefulness: Is the summary acceptable in terms of its usefulness in supporting researchers to quickly grasp the related works relevant to the hierarchical topics given?
• Summary length: 1% of the original relevant articles, measured in sentences
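For readers unfamiliar with ROUGE, the sketch below computes a simplified ROUGE-1 recall: the fraction of reference unigrams that also occur in the candidate summary, with clipped counts. It is only an illustration and not the official ROUGE toolkit; stemming, stopword handling and multiple references are left out, and the tokenization is plain whitespace splitting.

```python
from collections import Counter


def rouge_1_recall(candidate: str, reference: str) -> float:
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Each reference token is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())
```

Because recall only grows as more reference words are covered, a longer summary can score higher without being better.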

Experiments & Results
• ROUGE evaluation seems to give unreliable scores for verbose summaries, which MEAD often produces.
• Related work summaries are multi-topic summaries of multi-article references. This may cause miscalculation from overlapping n-grams that occur across multiple topics or references.

Experiments & Results
• The table shows that both ReWoS-WCM and ReWoS-CM perform significantly better than the baselines in terms of correctness, novelty, and usefulness.
• The comparison with LEAD shows that necessary information is located not only in titles or abstracts, but also in relevant portions of the research article body.
• ReWoS-CM (with context modeling) performed on par with ReWoS-WCM (without it) in terms of correctness and usefulness.
• For novelty, ReWoS-CM is better than ReWoS-WCM. This suggests that the proposed context modeling component is useful in providing new information.

Future Work
• Overcome the assumption about the topic hierarchy tree
• Investigate better generation
• Focus on local coherence and topic transition

Conclusion
• To the best of our knowledge, automated related work summarization has not been studied before.
• This work took initial steps towards solving this problem, by dividing the task into general and specific summarization processes.
• Initial results showed an improvement over generic multi-document baselines in both automatic and human evaluation.

Thank you! • Questions???
