Towards Automated Related Work Summarization (ReWoS). HOANG Cong Duy Vu 03/12/2010. Outline. Recall A Motivating Example The Proposed Approach General Content Summarization (GCSum) Specific Content Summarization (SCSum) Generation Experiments & Results Future Work Conclusion.
Towards Automated Related Work Summarization (ReWoS) HOANG Cong Duy Vu 03/12/2010
Outline • Recall • A Motivating Example • The Proposed Approach • General Content Summarization (GCSum) • Specific Content Summarization (SCSum) • Generation • Experiments & Results • Future Work • Conclusion
Recall • RW: related work • [Figure: the user gives a set of articles and a desired length to the RW Summarizer, which returns a RW summary.] • Topic hierarchy tree assumption
A Motivating Example A related work section extracted from “Bilingual Topic Aspect Classification with A few Training Examples” (Wu et al., 2008)
The Proposed Approach [Figure: the ReWoS architecture, with one branch for leaf nodes and one for internal nodes. Decision edges are labeled as (T)rue, (F)alse or (R)elevant.]
The Proposed Approach • Pre-Processing • Filters out sentences using heuristic rules on sentence length and lexical clues: • Sentences whose token-based length is too short (<7) or too long (>80) • Sentences written in the future tense • Sentences containing obviously redundant clues such as “in the section ...”, “figure XXX shows ...”, “for instance” …
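The pre-processing heuristics above could be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the whitespace tokenizer, the future-tense pattern, and the exact clue patterns are assumptions, since the slide only names the clue types and the length limits.

```python
import re

MIN_TOKENS, MAX_TOKENS = 7, 80  # length limits stated on the slide

# Assumed patterns: the slide gives examples of clues, not an exact list.
FUTURE_PATTERN = re.compile(r"\b(will|shall|going to)\b", re.IGNORECASE)
REDUNDANT_PATTERN = re.compile(
    r"\b(in the section|figure \S+ shows|for instance)\b", re.IGNORECASE
)


def keep_sentence(sentence: str) -> bool:
    """Return True if the sentence survives all pre-processing filters."""
    tokens = sentence.split()  # whitespace tokenization is an assumption
    if len(tokens) < MIN_TOKENS or len(tokens) > MAX_TOKENS:
        return False  # too short or too long
    if FUTURE_PATTERN.search(sentence):
        return False  # refers to the future
    if REDUNDANT_PATTERN.search(sentence):
        return False  # contains a redundant clue
    return True
```

A real system would likely need a proper tokenizer and a more careful tense detector; the regular expressions here only approximate the idea.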
The Proposed Approach • Agent-based rule • Attempts to distinguish whether or not a sentence describes the author’s own work. • Based on the presence of tokens that signal work done by the author, such as “we”, “our”, “us”, “this approach”, and “this method” … • A sentence that does not satisfy this rule is routed to GCSum; otherwise it is routed to SCSum
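A minimal sketch of the agent-based routing, assuming a simple case-insensitive token match over the five cue phrases listed on the slide (the real cue list is longer, as the trailing "…" indicates, and the function name `route` is hypothetical):

```python
import re

# Only the cues quoted on the slide; the full list is not given there.
AGENT_PATTERN = re.compile(
    r"\b(we|our|us|this approach|this method)\b", re.IGNORECASE
)


def route(sentence: str) -> str:
    """Send author-as-agent sentences to SCSum and all others to GCSum."""
    return "SCSum" if AGENT_PATTERN.search(sentence) else "GCSum"
```

Because matching is case-insensitive, a token such as "US" would also trigger the rule; a production version would probably need to handle such cases separately.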
General Content Summarization (GCSum) • The objective of GCSum is to extract sentences containing useful background information on the topics of the internal node in focus.
General Content Summarization (GCSum) • General content is either informative or indicative. • Informative: Text classification is a task that assigns a certain number of pre-defined labels to a given text. • Informative: Statistical machine translation (SMT) seeks to develop mathematical models of the translation process whose parameters can be automatically estimated from a parallel corpus. • Indicative: Many previous studies have approached the problem of mono-lingual text classification. • Indicative: This paper refers to the problem of sentiment analysis.
General Content Summarization (GCSum) • Informative sentences • Give detail on a specific aspect of the problem, e.g. definitions, purpose or application of the topic • Indicative sentences • Simpler; inserted to make the topic transition explicit and rhetorically sound • Summarization issue • Given a topic: • Indicative sentences are generated from pre-defined templates • Informative sentences are extracted from the input articles
General Content Summarization (GCSum) • Subject-based rule: GCSum first checks the subject of each candidate sentence, filtering out those whose subjects do not contain at least one topic keyword. • Verb-based rule: or GCSum checks whether stock verb phrases (such as “based on”, “make use of” and 23 other patterns) are used as the main verb. • Citation-based rule: or GCSum checks for the presence of at least one citation, since general sentences may list a set of citations as examples. • Importantly, if no informative sentences can be found in the input articles, indicative sentences are generated instead.
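One possible reading of the three rules is that a sentence stays a GCSum candidate if any of them fires. The sketch below follows that reading, which is an assumption. It also assumes that the subject and main verb have already been extracted by a parser (not shown), that topic keywords are single tokens, and that citations look like "(Author, 2008)"; the function name and the citation pattern are illustrative.

```python
import re

# Only the two phrases quoted on the slide; the other 23 are not listed there.
STOCK_VERB_PHRASES = ("based on", "make use of")

# Assumed citation shape: parentheses ending in a four-digit year.
CITATION_PATTERN = re.compile(r"\([^()]*\b(19|20)\d{2}[a-z]?\)")


def passes_gcsum_rules(sentence, subject, main_verb, topic_keywords):
    """Return True if the subject, verb, or citation rule accepts the sentence."""
    subject_tokens = set(subject.lower().split())
    if any(k.lower() in subject_tokens for k in topic_keywords):
        return True  # subject-based rule
    if main_verb.lower() in STOCK_VERB_PHRASES:
        return True  # verb-based rule
    return bool(CITATION_PATTERN.search(sentence))  # citation-based rule
```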
General Content Summarization (GCSum) • Topic relevance computation (GCSum) • Ranks sentences based on keyword content • Assumes that the topic of an internal node is affected by its surrounding nodes: ancestors, descendants and others • score_S = score_S^QA + score_S^Q - score_S^QR • score_S is the final relevance score • score_S^QA, score_S^Q and score_S^QR are the component relevance scores of the sentence S with respect to the ancestor, current and other remaining nodes, respectively.
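General Content Summarization (GCSum) • Topic relevance computation (GCSum) • The maximum number of sentences for each intermediate node is 2-3. • [Figure: topic tree with root node 1, internal nodes 4 and 5, and leaf nodes 2, 3, 6 and 7. For node 4, node 1 is marked as its ancestor, node 4 as itself, and the remaining nodes as others.] • The linear combination for node 4: S'(4) = S(ancestors) + S(itself) - S(others), where ancestors = {1} and itself = {4}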
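The grouping of nodes and the linear combination can be illustrated with a small sketch. The tree below is read off the slide's figure and node ordering (1 as root, 4 and 5 as its children, 2 and 3 under 4, 6 and 7 under 5), which is an assumption about the figure; the function names are hypothetical.

```python
# child -> parent; node 1 is the root (tree shape assumed from the figure)
PARENT = {4: 1, 5: 1, 2: 4, 3: 4, 6: 5, 7: 5}


def ancestors_of(node):
    """Walk up the tree and collect every ancestor of the node."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def node_groups(node):
    """Split all nodes into (ancestors, itself, others) for the given node."""
    all_nodes = set(PARENT) | set(PARENT.values())
    anc = ancestors_of(node)
    others = sorted(all_nodes - set(anc) - {node})
    return anc, [node], others


def combined_score(score_ancestors, score_itself, score_others):
    """Linear combination: reward ancestor and own topics, penalize the rest."""
    return score_ancestors + score_itself - score_others
```

For node 4, `node_groups(4)` puts node 1 in the ancestor group and every remaining node in the "others" group, matching the labels in the figure.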
General Content Summarization (GCSum) • To obtain each component relevance score, we employ TF×ISF relevance computation
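TF×ISF (term frequency times inverse sentence frequency) can be sketched as below. The slide does not give the exact formula, so the smoothed ISF and the plain sum over query keywords are assumptions chosen for illustration; sentences are represented as lists of tokens.

```python
import math
from collections import Counter


def isf(word, sentences):
    """Inverse sentence frequency: words found in fewer sentences weigh more."""
    containing = sum(1 for s in sentences if word in s)
    # Add-one smoothing (an assumption) avoids division by zero.
    return math.log((1 + len(sentences)) / (1 + containing)) + 1.0


def tf_isf_relevance(sentence, query_keywords, sentences):
    """Score a tokenized sentence against the keywords of one group of nodes."""
    tf = Counter(sentence)  # term frequency inside this sentence
    return sum(tf[w] * isf(w, sentences) for w in set(query_keywords))
```

Calling this once with the ancestor keywords, once with the node's own keywords, and once with the remaining keywords would yield the three component scores.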
Specific Content Summarization (SCSum) • Sentences that are marked with author-as-agent are input to the Specific Content Summarization (SCSum) module. • SCSum aims to extract sentences that contain detailed information about a specific author’s work that is relevant to the input leaf nodes’ topic.
Specific Content Summarization (SCSum) • Topic relevance computation (SCSum) • Initially, each leaf node is assigned an equal number of sentences. • The relevance score is computed using a formula similar to the GCSum one presented earlier. • [Figure: topic tree with root node 1, internal nodes 4 and 5, and leaf nodes 2, 3, 6 and 7. For leaf node 2, node 2 is marked as itself, node 3 as its sibling, and nodes 4 and 1 as its ancestors.] • The linear combination for node 2: S'(2) = S(4 + 1) + S(2) - S(3), i.e. ancestors + itself - siblings
Specific Content Summarization (SCSum) • Context modeling • Motivation: a single sentence sometimes does not contain enough context to clearly express the idea from the original article • Use the surrounding context to increase confidence in the topic relevance of agent-based sentences • final_score(sentence) = score(sentence) + score(contexts)
SCSum - Context modeling
Example extracted from (Bannard and Callison-Burch 2005). In the original slide, highlighting distinguishes the agent-based sentence from its adjacent sentences.
*** We evaluated the accuracy of each of the paraphrases that was extracted from the manually aligned data, as well as the top ranked paraphrases from the experimental conditions detailed below in Section 3.3.
*** Because the accuracy of paraphrases can vary depending on context, we substituted each set of candidate paraphrases into between 2-10 sentences which contained the original phrase.
*** Figure 4 shows the paraphrases for under control substituted into one of the sentences in which it occurred.
*** We created a total of 289 such evaluation sets, with a total of 1366 unique sentences created through substitution.
*** We had two native English speakers produce judgments as to whether the new sentences preserved the meaning of the original phrase and as to whether they remained grammatical.
*** Paraphrases that were judged to preserve both meaning and grammaticality were considered to be correct, and examples which failed on either judgment were considered to be incorrect.
Summary sentence:
*** (Bannard and Callison-Burch 2005) replaced phrases with paraphrases in a number of sentences and asked judges whether the substitutions “preserved meaning and remained grammatical.”
Specific Content Summarization (SCSum) • Context modeling • Choose nearby sentences within a contextual window (size 5) after the agent-based sentence, so that the given topic is represented more fully.
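A minimal sketch of context modeling, assuming the per-sentence topic scores are already available as a list in document order. How the context scores are aggregated is not stated on the slides; summing them is an assumption made here for illustration, and the function name is hypothetical.

```python
WINDOW_SIZE = 5  # contextual window size given on the slide


def final_score(index, scores, window=WINDOW_SIZE):
    """Add the scores of the sentences that follow an agent-based sentence."""
    # Take up to `window` sentences after position `index`.
    context = scores[index + 1 : index + 1 + window]
    # Summation is an assumed way of combining the context scores.
    return scores[index] + sum(context)
```

Near the end of an article the window simply shrinks, so the last sentence keeps its own score unchanged.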
Specific Content Summarization (SCSum) • Weighting • Observation: the presence of keywords from one or more of the current, ancestor and sibling nodes may affect the final score • Add a weighting coefficient to the score from the topic relevance computation (SCSum); its value depends on which keywords are present in the sentence: • If the sentence contains no sibling keywords: • Keywords in both ancestors and itself: 1 • Keywords in itself only: 0.5 • Keywords in ancestors only: 0.25 • If the sentence contains sibling keywords: 0.1 (penalty)
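The coefficient table above maps directly onto a small function. Two points are assumptions rather than something the slide states: a sentence with no keywords at all gets 0.0, and keyword presence is tested by exact token overlap. The function and parameter names are hypothetical.

```python
def weighting_coefficient(tokens, itself_kw, ancestor_kw, sibling_kw):
    """Return the weighting coefficient for one tokenized sentence."""
    tokens = set(tokens)
    if tokens & set(sibling_kw):
        return 0.1  # penalty: the sentence mentions a sibling topic
    in_itself = bool(tokens & set(itself_kw))
    in_ancestors = bool(tokens & set(ancestor_kw))
    if in_itself and in_ancestors:
        return 1.0
    if in_itself:
        return 0.5
    if in_ancestors:
        return 0.25
    return 0.0  # assumption: this case is not covered by the slide
```

Presumably the coefficient is then multiplied with the relevance score, although the slide only says that it is added to the computation.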
Specific Content Summarization (SCSum) • Ranking & Re-ranking • Sentences are ranked in descending order of their relevance scores • Then, a simplified MMR (SimRank) is performed: • A sentence X is removed if its maximum cosine similarity with any sentence Y already chosen at previous steps of SimRank exceeds a pre-defined threshold (0.75).
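The ranking and re-ranking step can be sketched as a greedy filter. The bag-of-words cosine similarity over raw token counts is an assumption (the slides do not say how sentence vectors are built), and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two tokenized sentences (raw counts)."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rerank(scored_sentences, threshold=0.75):
    """Keep sentences in score order, dropping near-duplicates of chosen ones."""
    chosen = []
    ranked = sorted(scored_sentences, key=lambda pair: pair[1], reverse=True)
    for tokens, _score in ranked:
        # Drop the sentence if it is too similar to any already chosen one.
        if all(cosine(tokens, kept) <= threshold for kept in chosen):
            chosen.append(tokens)
    return chosen
```

Each input item is a `(tokens, score)` pair; the output keeps only the token lists, in ranked order.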
Post-Processing • Two steps: • First, replace agentive forms (e.g., “we”, “our”, “this study”, ...) with a citation to the article • Second, resolve abbreviations found in the extracted sentences • E.g., SMT → Statistical Machine Translation
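Both post-processing steps can be approximated with regular expressions. This is a naive sketch: the agentive list contains only the forms quoted on the slide, the abbreviation table is assumed to be supplied by the caller, and plain substitution will not always yield grammatical output (for example with possessives such as "our").

```python
import re

# Only the agentive forms quoted on the slide; the full list is not given.
AGENTIVE = re.compile(r"\b(we|our|this study)\b", re.IGNORECASE)


def replace_agentive(sentence, citation):
    """Replace agentive forms with a citation string."""
    return AGENTIVE.sub(lambda match: citation, sentence)


def expand_abbreviations(sentence, abbreviations):
    """Replace each known abbreviation with its full form."""
    for short, full in abbreviations.items():
        pattern = rf"\b{re.escape(short)}\b"
        sentence = re.sub(pattern, lambda match, full=full: full, sentence)
    return sentence
```

For instance, `replace_agentive("We evaluated the accuracy.", "(Bannard and Callison-Burch 2005)")` turns the author's own statement into a third-person one.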
Generation • In this work, related work summaries are generated simply by ordering the topic nodes with a depth-first traversal of the topic tree • Node ordering: 1 - 4 - 2 - 3 - 5 - 6 - 7
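The node ordering is what a pre-order depth-first traversal produces. The sketch below assumes the tree shape implied by that ordering and by the earlier figures (node 1 with children 4 and 5, node 4 with children 2 and 3, node 5 with children 6 and 7); the names are hypothetical.

```python
# parent -> ordered children; leaves have no entry (tree shape is assumed)
CHILDREN = {1: [4, 5], 4: [2, 3], 5: [6, 7]}


def depth_first_order(node, children=CHILDREN):
    """Visit a node first, then each of its subtrees from left to right."""
    order = [node]
    for child in children.get(node, []):
        order.extend(depth_first_order(child, children))
    return order


print(depth_first_order(1))  # [1, 4, 2, 3, 5, 6, 7]
```

The summary text for each node would then be emitted in this order.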
Experiments & Results • Dataset • Uses RWSData (described before), which includes 20 sets • 10 out of 20 sets were evaluated automatically and manually. • Baselines • LEAD (RW based on title + abstract) • MEAD (centroid + cosine similarity) • Proposed systems • ReWoS-WCM (ReWoS without context modeling) • ReWoS-CM (ReWoS with context modeling)
Experiments & Results • Automatic evaluation • Uses ROUGE variants (ROUGE-1, ROUGE-2, ROUGE-S4, ROUGE-SU4) • Manual evaluation (measured on a 5-point scale from 1 (very poor) to 5 (very good)) • Correctness: Is the summary content actually relevant to the hierarchical topics given? • Novelty: Does the summary introduce novel information that is significant in comparison with the human created summary? • Fluency: Does the summary’s exposition flow well, in terms of syntax as well as discourse? • Usefulness: Is the summary acceptable in terms of its usefulness in supporting the researchers to quickly grasp the related works relevant to hierarchical topics given? • Summary length: 1% of the original relevant articles, measured in sentences
Experiments & Results • ROUGE evaluation seems to behave unreliably on verbose summaries, which MEAD often produces. • Related work summaries are multi-topic summaries of multi-article references. This may cause miscalculation from overlapping n-grams that occur across multiple topics or references.
Experiments & Results • The table shows that both ReWoS-WCM and ReWoS-CM perform significantly better than the baselines in terms of correctness, novelty, and usefulness. • The comparison with LEAD shows that necessary information is located not only in titles or abstracts, but also in relevant portions of the research article body. • ReWoS-CM (with context modeling) performed on par with ReWoS-WCM (without it) in terms of correctness and usefulness. • For novelty, ReWoS-CM is better than ReWoS-WCM. This suggests that the proposed context modeling component is useful in providing new information.
Future work • Overcome the assumption about topic hierarchy tree • Investigate better generation • Focus on local coherence and topic transition
Conclusion • To the best of our knowledge, automated related work summarization has not been studied before. • This work took initial steps towards solving this problem, by dividing the task into general and specific summarization processes. • Initial results showed an improvement over generic multi-document baselines in both automatic and human evaluation.
Thank you! • Questions???