Event-Centric Summary Generation Lucy Vanderwende, Michele Banko and Arul Menezes One Microsoft Way, WA, USA DUC 2004
Abstract • Our primary interest is twofold: • To explore an event-centric approach to summarization • To explore a generation approach to summary realization
Introduction • Identifying important events, as opposed to entities • Generation component • Human-authored summaries rely less on sentence extraction • Graph-scoring algorithm • Identifies the highest-weighted nodes to guide content selection
System Description • MSR-NLP • Analysis component • Rule-based syntactic analysis component • Produces a logical form • Syntactic variations, word labels • Generation component • Syntactic realization component • Produces a syntactic tree
Creating document representations • Cluster sentences • Analyze each sentence to obtain its logical form
Creating document representations • Produce triples from the logical form • (LFNode_i, rel, LFNode_j)
Forming Document Graph • Take those triples and join nodes by way of their semantic relation using a bidirectional link structure • Keep track of how many times we observe the relationship • Stop words are not included in the graph construction
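The graph construction described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the function name `build_graph`, and the sample triples are all hypothetical, and only the `Tobj` relation label comes from the slides.

```python
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "be"}

def build_graph(triples, stop_words=STOP_WORDS):
    """Join (node_i, rel, node_j) triples into a bidirectional graph."""
    # counts[(head, rel, tail)] = how many times the relationship was observed
    counts = defaultdict(int)
    # neighbors[node] = nodes linked to it, in either direction
    neighbors = defaultdict(set)
    for head, rel, tail in triples:
        # Stop words are left out of the graph entirely.
        if head in stop_words or tail in stop_words:
            continue
        counts[(head, rel, tail)] += 1
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return counts, neighbors

# Made-up sample triples; the second occurrence raises the count to 2.
triples = [
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "government"),
    ("leave", "Tobj", "London_Bridge_Hospital"),
    ("leave", "Tobj", "the"),
]
counts, neighbors = build_graph(triples)
```

Storing each link in both directions means a node can be reached from either end of a relation, which is what the bidirectional link structure asks for.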
Node Scoring Using PageRank • Uses the PageRank algorithm • Designed for hyperlinked collections such as the WWW • A link from one node to another counts as a vote for the linked node
Node Scoring Using PageRank • PageRank framework • “Pages” correspond to base forms of words in the documents • “Hyperlinks” correspond to semantic relationships • Verbs identify events • Nouns identify entities • Events are used to identify summary content • Typically, the algorithm converges in around 40 iterations
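To make the voting idea concrete, here is a minimal sketch of textbook PageRank run over a small word graph. It assumes unweighted links and a damping factor of 0.85 (a conventional default, not a value given in the slides); the fixed count of 40 iterations simply echoes the convergence figure above. The graph contents and the function name `pagerank` are illustrative.

```python
def pagerank(neighbors, damping=0.85, iterations=40):
    """Plain power-iteration PageRank over a bidirectional graph.

    neighbors maps each node to the set of nodes it links to. Every node
    is assumed to have at least one link, which holds when links are
    stored in both directions.
    """
    nodes = list(neighbors)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_scores = {}
        for node in nodes:
            # Each neighbor splits its current score evenly among its links.
            incoming = sum(scores[m] / len(neighbors[m]) for m in neighbors[node])
            new_scores[node] = (1 - damping) / n + damping * incoming
        scores = new_scores
    return scores

# Made-up graph: verbs ("leave", "arrest") and nouns as nodes.
neighbors = {
    "leave": {"hospital", "general"},
    "hospital": {"leave"},
    "general": {"leave", "arrest"},
    "arrest": {"general"},
}
scores = pagerank(neighbors)
```

In this toy graph the better-connected nodes end up with higher scores than the nodes with a single link, and the scores always sum to 1.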
Graph Scoring • Use PageRank scores to assess the link weight (LW(i->n))
Summary Generation • Generated by extracting and merging portions of logical form • Identify important triples • Defined as a highly weighted node together with its most highly weighted link • (leave, Tobj, London_Bridge_Hospital) • Not (leave, Tobj, government) • Extracted fragments are divided into “event” and “entity” fragments • Event fragments are used to generate the summary • Entity fragments are used to expand references to the same entity within the selected event fragments
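The triple selection step might look something like the sketch below. The link-weight formula itself is not reproduced in these slides, so the weights are taken as given input; all numbers here are made up for illustration, and the function name `important_triples` is hypothetical.

```python
def important_triples(node_scores, link_weights, top_k=1):
    """For each of the top_k highest-scoring nodes, keep its heaviest link."""
    top_nodes = sorted(node_scores, key=node_scores.get, reverse=True)[:top_k]
    selected = []
    for node in top_nodes:
        # Candidate triples are those headed by this node.
        candidates = [t for t in link_weights if t[0] == node]
        if candidates:
            selected.append(max(candidates, key=link_weights.get))
    return selected

# Made-up node scores and link weights, for illustration only.
node_scores = {"leave": 0.4, "London_Bridge_Hospital": 0.3, "government": 0.2}
link_weights = {
    ("leave", "Tobj", "London_Bridge_Hospital"): 0.6,
    ("leave", "Tobj", "government"): 0.1,
}
important = important_triples(node_scores, link_weights)
```

With these weights the hospital triple is kept and the government triple is not, mirroring the example on the slide.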
Summary Generation • Event fragment ordering • Cluster event fragments by the event they refer to • For each event, choose the fragment with the greatest number of argument nodes • Order the selected event fragments • Group sentences referring to the same entity together • Order sentences that exhibit event coreference
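The selection and grouping steps above can be sketched as follows. This is an assumption-laden illustration: a fragment is modeled as a hypothetical `(event, arguments)` pair, the first argument is treated as the fragment's main entity, and the ordering by event coreference is not attempted.

```python
from collections import defaultdict

def select_event_fragments(fragments):
    """Cluster fragments by event and keep the one with the most arguments."""
    by_event = defaultdict(list)
    for event, args in fragments:
        by_event[event].append((event, args))
    # Ties go to the fragment seen first.
    return [max(group, key=lambda frag: len(frag[1])) for group in by_event.values()]

def order_by_entity(selected):
    """Place fragments that share a main entity next to each other."""
    # Assumption: the first argument stands in for the main entity.
    first_seen = {}
    for _, args in selected:
        first_seen.setdefault(args[0], len(first_seen))
    # sorted() is stable, so fragments keep their relative order within a group.
    return sorted(selected, key=lambda frag: first_seen[frag[1][0]])

# Made-up fragments, for illustration only.
fragments = [
    ("leave", ("general", "hospital")),
    ("leave", ("general",)),
    ("arrest", ("police", "general")),
    ("return", ("general", "home")),
]
selected = select_event_fragments(fragments)
ordered = order_by_entity(selected)
```

Here the two-argument "leave" fragment wins over the one-argument one, and the two fragments about "general" end up adjacent in the final order.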
Experiments and Evaluation (Rule-based pronoun resolution method, 75% accuracy)
Experiments and Evaluation Reason: the potential to introduce disfluent text
Directions and Future Work • Produce more human-like generated summaries • Further study the impact of anaphora resolution • Study new page-ranking algorithms • Although the ordering step groups event fragments that mention the same entity, we have not yet implemented a system to combine them into larger logical form constructions