An evolutionary approach for improving the quality of automatic summaries. Constantin Orasan, Research Group in Computational Linguistics, School of Humanities, Languages and Social Sciences, University of Wolverhampton, C.Orasan@wlv.ac.uk.
An evolutionary approach for improving the quality of automatic summaries • Constantin Orasan • Research Group in Computational Linguistics, School of Humanities, Languages and Social Sciences, University of Wolverhampton • C.Orasan@wlv.ac.uk • Proceedings of the ACL 2003 Workshop on Multilingual Summarization and Question Answering
Introduction • There are two main approaches to producing automatic summaries: • Extract and rearrange • Understand and generate • Because “understanding” a text is usually domain-specific, extraction methods are preferred when robustness is needed. • Here we present a novel approach that improves the quality of summaries by improving their local cohesion.
Continuity Principle • We use the continuity principle defined in Centering Theory (Grosz et al., 1995) to improve summary quality. • The principle requires that two consecutive utterances have at least one entity in common. • Utterances are generally clauses or sentences; here we treat sentences as utterances. • We try to produce summaries that do not violate the continuity principle. • The result is sequences of sentences that refer to the same entity and are therefore more coherent.
Corpus Investigation • We consider two utterances to have an entity in common if the same noun phrase head appears in both. • The FDG tagger is used to determine the heads of noun phrases. • We investigated 146 human-produced abstracts from the Journal of Artificial Intelligence Research; almost 75% satisfy the principle.
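The check described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the NP heads of each sentence have already been extracted (for example by a tagger) into sets of strings, and it measures the fraction of adjacent sentence pairs that share a head, which is only one possible reading of how the 75% figure was computed. The names `satisfies_continuity` and `continuity_ratio` are hypothetical.

```python
from typing import List, Set


def satisfies_continuity(heads_a: Set[str], heads_b: Set[str]) -> bool:
    """Return True if two utterances share at least one NP head."""
    return bool(heads_a & heads_b)


def continuity_ratio(sentence_heads: List[Set[str]]) -> float:
    """Fraction of adjacent sentence pairs that share an NP head.

    Assumption: a text with fewer than two sentences trivially satisfies
    the principle, so 1.0 is returned in that case.
    """
    pairs = list(zip(sentence_heads, sentence_heads[1:]))
    if not pairs:
        return 1.0
    return sum(satisfies_continuity(a, b) for a, b in pairs) / len(pairs)


# Toy abstract: NP heads per sentence (made-up data for illustration only).
abstract = [
    {"algorithm", "summary"},
    {"algorithm", "score"},
    {"corpus"},
]
print(continuity_ratio(abstract))  # 0.5: the first pair shares "algorithm", the second shares nothing
```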
Use CP in Summarization and Text Generation • To produce a summary that violates the continuity principle (CP) as little as possible, we score each sentence using both content and context information. • Karamanis and Manurung (2002) used the CP in text generation; summarization is harder because it must first identify the important information in the document. • Another difference is that we do not change the order of the extracted sentences, because preliminary experiments with reordering did not give promising results.
Content-based scoring • The existing heuristics are: • Keyword method: each word gets a TFIDF score, and the score of a sentence is the sum of the scores of its words. • Indicator phrase method: meta-discourse markers such as “in this paper”, “we present”, “in conclusion”, … • Location method: sentences in the first and last 13 paragraphs have their scores boosted. • Title and headers method: sentences containing words from the title and headers have their scores boosted. • Special formatting rules: sentences that contain equations are excluded. • The score of a sentence is a weighted function of these parameters, with the weights established through experiments. • The indicator phrase method proved to be one of the most important heuristics.
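One way to picture the weighted function is a linear combination of the heuristics. The sketch below assumes that form; the slide only says "weighted function". The weight values, the `Sentence` fields, and the use of negative infinity to exclude equation sentences are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Sentence:
    words: List[str]
    has_indicator_phrase: bool = False  # e.g. "in this paper", "we present"
    in_boosted_location: bool = False   # sentence lies in a boosted paragraph
    title_overlap: int = 0              # number of words shared with title/headers
    has_equation: bool = False


# Hypothetical weights; the original ones were established experimentally
# and are not given on the slides.
WEIGHTS = {"keyword": 1.0, "indicator": 5.0, "location": 2.0, "title": 1.5}


def content_score(sentence: Sentence, tfidf: Dict[str, float]) -> float:
    """Linear combination of the content heuristics (assumed form)."""
    if sentence.has_equation:
        return float("-inf")  # special formatting rule: never extract
    keyword = sum(tfidf.get(word, 0.0) for word in sentence.words)
    return (
        WEIGHTS["keyword"] * keyword
        + WEIGHTS["indicator"] * sentence.has_indicator_phrase
        + WEIGHTS["location"] * sentence.in_boosted_location
        + WEIGHTS["title"] * sentence.title_overlap
    )


tfidf = {"summary": 2.0, "method": 1.0}
s = Sentence(words=["we", "present", "summary", "method"], has_indicator_phrase=True)
print(content_score(s, tfidf))  # 8.0 = 1.0 * (2.0 + 1.0) + 5.0
```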
Context-based scoring • Depending on the context in which a sentence appears in a summary, its score can be boosted or penalized. • If the sentence satisfies the continuity principle with either the sentence that precedes it or the one that follows it, its score is boosted; otherwise it is penalized. • After experiments, we decided to boost the sentence’s score by the TFIDF scores of the common NP heads and to penalize it by the highest TFIDF score in the document.
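The boost and penalty rule can be sketched as a single adjustment function. The function name, the use of `None` for a missing neighbour, and the choice to return 0.0 when a sentence has no neighbours at all are assumptions; the slide does not say how a lone sentence is treated.

```python
from typing import Dict, Optional, Set


def context_adjustment(
    heads: Set[str],
    prev_heads: Optional[Set[str]],
    next_heads: Optional[Set[str]],
    tfidf: Dict[str, float],
) -> float:
    """Boost or penalty for a sentence given its neighbours in the summary.

    prev_heads / next_heads are the NP heads of the neighbouring summary
    sentences, or None when there is no neighbour on that side.
    """
    neighbours = [n for n in (prev_heads, next_heads) if n is not None]
    if not neighbours:
        return 0.0  # assumption: a sentence on its own is neither boosted nor penalized
    shared: Set[str] = set()
    for neighbour in neighbours:
        shared |= heads & neighbour
    if shared:
        # Continuity holds: boost by the TFIDF scores of the common heads.
        return sum(tfidf.get(head, 0.0) for head in shared)
    # Continuity violated: penalize by the highest TFIDF score in the document.
    return -max(tfidf.values(), default=0.0)


tfidf = {"parser": 3.0, "corpus": 1.5, "tree": 0.5}
print(context_adjustment({"parser", "tree"}, {"parser"}, {"corpus"}, tfidf))  # 3.0
print(context_adjustment({"tree"}, {"parser"}, None, tfidf))                  # -3.0
```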
The Greedy Algorithm • At each step, extract the highest-scoring sentence from those not yet extracted. • Scores are computed as described above. • The original order of the sentences is maintained, and the algorithm in Figure 1 is applied.
The Greedy Algorithm • At the score-computing stage, each sentence’s score is computed as if the sentence were included in the extract. • The sentence with the highest score is extracted, and the process repeats until the required length is reached. • The first extracted sentence is always the one with the highest content-based score. • It is possible to extract S1 and S2 and then, in a later iteration, extract a sentence S3 between them that violates the continuity principle with S2.
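The greedy procedure can be sketched as below. This is an illustrative reconstruction, not the algorithm from Figure 1: it assumes content scores and NP heads are precomputed per sentence, and it measures the required length in number of sentences. The function name `greedy_extract` and the toy data are hypothetical.

```python
import bisect
from typing import Dict, List, Set


def greedy_extract(
    content: List[float],
    heads: List[Set[str]],
    tfidf: Dict[str, float],
    length: int,
) -> List[int]:
    """Return the document positions of the extracted sentences, in order."""
    chosen: List[int] = []  # kept sorted, so document order is preserved
    penalty = max(tfidf.values(), default=0.0)

    def score(i: int) -> float:
        # Score sentence i as if it were already part of the extract.
        if not chosen:
            return content[i]  # first pick: content score only
        pos = bisect.bisect_left(chosen, i)
        neighbours = chosen[max(pos - 1, 0):pos] + chosen[pos:pos + 1]
        shared: Set[str] = set()
        for n in neighbours:
            shared |= heads[i] & heads[n]
        if shared:
            return content[i] + sum(tfidf.get(h, 0.0) for h in shared)
        return content[i] - penalty

    while len(chosen) < min(length, len(content)):
        candidates = [i for i in range(len(content)) if i not in chosen]
        bisect.insort(chosen, max(candidates, key=score))
    return chosen


content = [5.0, 1.0, 4.0, 3.5]
heads = [{"parser"}, {"parser", "tree"}, {"corpus"}, {"tree"}]
tfidf = {"parser": 3.0, "tree": 2.0, "corpus": 1.0}
print(greedy_extract(content, heads, tfidf, 3))  # [0, 1, 3]
```

In the toy run, sentence 2 has a higher content score than sentences 1 and 3, but it shares no head with anything already extracted, so the context penalty keeps it out.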
The Evolutionary Algorithm • In the greedy method, whether a sentence is included depends on the sentences already in the summary. • Genetic algorithms are a specific type of evolutionary algorithm; they encode the problem as a series of genes, called a chromosome. • Our genes take integer values representing the position of a sentence in the document.
The Evolutionary Algorithm • Genetic algorithms use a fitness function to assess how good a chromosome is; in our case the function is the sum of the scores of the sentences. • Genetic algorithms use genetic operators to evolve a population of chromosomes; in our case, weighted roulette wheel selection is used to select chromosomes. • Once several chromosomes have been selected, they are evolved using crossover and mutation.
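A sketch of the fitness function and of roulette wheel selection follows. Several details are assumptions not stated on the slides: duplicate genes are collapsed, the sentence scores include the context boost or penalty, and fitness values are shifted so that the selection weights stay positive. `fitness` and `roulette_select` are hypothetical names.

```python
import random
from typing import Dict, List, Set


def fitness(
    chromosome: List[int],
    content: List[float],
    heads: List[Set[str]],
    tfidf: Dict[str, float],
) -> float:
    """Sum of sentence scores (content plus context) for the encoded summary."""
    order = sorted(set(chromosome))  # assumption: duplicate genes count once
    penalty = max(tfidf.values(), default=0.0)
    total = 0.0
    for k, i in enumerate(order):
        total += content[i]
        neighbours = order[max(k - 1, 0):k] + order[k + 1:k + 2]
        shared: Set[str] = set()
        for n in neighbours:
            shared |= heads[i] & heads[n]
        if shared:
            total += sum(tfidf.get(h, 0.0) for h in shared)
        elif neighbours:
            total -= penalty
    return total


def roulette_select(
    population: List[List[int]], scores: List[float], k: int, rng: random.Random
) -> List[List[int]]:
    """Pick k chromosomes with probability proportional to shifted fitness."""
    low = min(scores)
    weights = [s - low + 1e-9 for s in scores]  # shift keeps every weight positive
    return rng.choices(population, weights=weights, k=k)


content = [5.0, 1.0, 4.0, 3.5]
heads = [{"parser"}, {"parser", "tree"}, {"corpus"}, {"tree"}]
tfidf = {"parser": 3.0, "tree": 2.0, "corpus": 1.0}
population = [[0, 1, 3], [0, 2, 3]]
scores = [fitness(c, content, heads, tfidf) for c in population]
print(scores)  # [19.5, 3.5]
parents = roulette_select(population, scores, k=4, rng=random.Random(0))
```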
The Evolutionary Algorithm • We use the single-point crossover operator and two mutation operators. • The first mutation operator replaces the value of a gene with a randomly generated integer (to try to include random sentences in the summary). • The second replaces the value of a gene with the value of the preceding gene incremented by one (to introduce consecutive sentences in the summary). • We start with a population of randomly generated chromosomes, which is then evolved using these operators; each operator has a certain probability of being applied.
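The three operators can be sketched as below. The sketch assumes chromosomes are equal-length lists of at least two genes and clamps the "preceding gene plus one" value to the last sentence of the document; the clamping and the function names are illustrative choices, not details from the slides.

```python
import random
from typing import List, Tuple


def single_point_crossover(
    a: List[int], b: List[int], rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate_random(chrom: List[int], n_sentences: int, rng: random.Random) -> List[int]:
    """Replace one gene with a random sentence position."""
    child = list(chrom)
    child[rng.randrange(len(child))] = rng.randrange(n_sentences)
    return child


def mutate_consecutive(chrom: List[int], n_sentences: int, rng: random.Random) -> List[int]:
    """Replace one gene with the preceding gene's value plus one."""
    child = list(chrom)
    g = rng.randrange(1, len(child))
    # Assumption: clamp so the gene never points past the last sentence.
    child[g] = min(child[g - 1] + 1, n_sentences - 1)
    return child


rng = random.Random(0)
a, b = [0, 3, 5, 8], [1, 2, 6, 9]
print(single_point_crossover(a, b, rng))
print(mutate_random(a, 10, rng))
print(mutate_consecutive(a, 10, rng))
```

All three functions return new lists and leave the parents untouched, which keeps selection and evolution independent of each other.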
The Evolutionary Algorithm • The best chromosome (the one with the highest fitness score) found across all generations is the solution to the problem. • In our case we evolved a population of 500 chromosomes for 100 generations.
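The overall loop, including keeping the best chromosome seen in any generation, might look like the sketch below. The defaults of 500 chromosomes and 100 generations come from the slide; the operator probabilities, the fitness shift in selection, the even population size, and the toy fitness function are assumptions for illustration only.

```python
import random
from typing import Callable, List

Chromosome = List[int]


def evolve(
    fitness: Callable[[Chromosome], float],
    n_sentences: int,
    summary_len: int,
    pop_size: int = 500,       # from the slide; assumed even here
    generations: int = 100,    # from the slide
    p_crossover: float = 0.8,  # hypothetical probability
    p_mutation: float = 0.1,   # hypothetical probability
    seed: int = 0,
) -> Chromosome:
    """Return the fittest chromosome seen in any generation."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(n_sentences) for _ in range(summary_len)]
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(c) for c in population]
        low = min(scores)
        weights = [s - low + 1e-9 for s in scores]  # roulette wheel weights
        parents = rng.choices(population, weights=weights, k=pop_size)
        children: List[Chromosome] = []
        for a, b in zip(parents[::2], parents[1::2]):
            a, b = list(a), list(b)  # copy so parents are never modified
            if summary_len > 1 and rng.random() < p_crossover:
                cut = rng.randrange(1, summary_len)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            children += [a, b]
        for child in children:
            if rng.random() < p_mutation:  # random-sentence mutation
                child[rng.randrange(summary_len)] = rng.randrange(n_sentences)
            if summary_len > 1 and rng.random() < p_mutation:  # consecutive-sentence mutation
                g = rng.randrange(1, summary_len)
                child[g] = min(child[g - 1] + 1, n_sentences - 1)
        population = children
        best = max(population + [best], key=fitness)
    return best


# Toy fitness: sum of made-up per-sentence scores, duplicates counted once.
toy_scores = [0.5, 2.0, 1.0, 3.0, 0.2, 1.5]


def toy_fitness(chromosome: Chromosome) -> float:
    return sum(toy_scores[i] for i in set(chromosome))


best = evolve(toy_fitness, n_sentences=6, summary_len=3, pop_size=50, generations=20)
print(sorted(best), toy_fitness(best))
```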
Evaluation and Discussion • We evaluated the methods on 10 scientific papers from the Journal of Artificial Intelligence Research, a total of 90,000 words. • Because eight different summaries were produced from each text and each had to be assessed by humans, the evaluation was very time-consuming. • The quality of a summary can be measured in terms of coherence, cohesion and informativeness. • Cohesion is indicated by the number of dangling anaphoric expressions. • Coherence is indicated by the number of ruptures in the discourse. • For informativeness we compute the similarity between the summary and the document.
Evaluation and Discussion • In the evaluation, TFIDF is the baseline that extracts the sentences with the highest TFIDF scores, Basic is the method that uses content-based scoring only, and Greedy and Evolutionary are the two algorithms that additionally use the continuity principle. • Having noticed only a slight improvement in the 3% summaries, we increased the summary length to 5% (values shown in brackets).
We consider that a discourse rupture occurs when a sentence seems completely isolated from the rest of the text. • This usually happens because of isolated discourse markers such as “firstly”, “however”, “on the other hand”, … • For the 3% summaries, context information has little influence, because the indicator phrases have a greater influence on coherence than the continuity principle. • For longer summaries, the evolutionary algorithm is better than the basic method in all cases, but the greedy algorithm is not. • We believe that the improvement is due to the discourse information used by the methods.
Even though anaphora is not directly addressed here, improving local cohesion should, as a side effect, decrease the number of dangling references. • As with discourse ruptures (DR), the greedy algorithm does not perform significantly better than the basic method. • The most frequent dangling references were references to tables, figures, definitions and theorems (e.g. “As we showed in Table 3…”).
We use a content-based evaluation metric (Donaway et al., 2000), which computes the similarity between the summary and the document. • The evolutionary algorithm does not lead to a major loss of information, and for several texts it obtains the highest score. • In contrast, the greedy method seems to exclude useful information: for several texts it performs worse than the basic method and the baseline.
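A content-based similarity between a summary and its source document can be illustrated with cosine similarity over term-frequency vectors. This is a stand-in chosen for illustration: the slides do not say which similarity measure or term weighting was used, and the function name and whitespace tokenization are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def cosine_similarity(summary_tokens: Iterable[str], document_tokens: Iterable[str]) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    s, d = Counter(summary_tokens), Counter(document_tokens)
    dot = sum(count * d[term] for term, count in s.items())  # Counter yields 0 for unseen terms
    norm = math.sqrt(sum(v * v for v in s.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


document = "the parser builds a tree the parser".split()
summary = "the parser".split()
print(round(cosine_similarity(summary, document), 3))  # 0.853
```

A score close to 1 means the summary uses terms in roughly the same proportions as the document; a score of 0 means they share no terms.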
Conclusion and Future Work • We presented two algorithms that combine content and context information. • Experiments show that the evolutionary method performs better in terms of coherence and cohesion and does not degrade the information content. • One could argue that a 5% summary is too long, but such summaries can be shortened using aggregation rules, in which two sentences referring to the same entity are merged into one. • We intend to extend the experiments, to test combinations of Centering Theory’s principles, and to evaluate on other types of texts.