An IR Approach to Multi-document Summarization. Joint work with Prasad Pingali, Jagadeesh J, Chandan Kumar and Praveen B. Vasudeva Varma, IIIT Hyderabad. Agenda. Multi-document Summarization (MDS) - Motivation Three Flavors of MDS Query focused Summarization Update Summarization
An IR Approach to Multi-document Summarization • Vasudeva Varma, IIIT Hyderabad • Joint work with Prasad Pingali, Jagadeesh J, Chandan Kumar and Praveen B
Agenda • Multi-document Summarization (MDS) - Motivation • Three Flavors of MDS • Query focused Summarization • Update Summarization • Personalized summaries • Abstract Summarization – A demo
DUC Task Description • Task is to create, from the document set, a response which answers the information need expressed • Input: Information need, Cluster of relevant documents (assumed to contain answer) • Output: Answer of required length
Extract vs. Abstract Summarization • We conducted a study (2005) • Generated best possible extracts • Calculated the scores for these extracts • Evaluation with respect to the reference summaries
Sentence Extraction Approaches • MEAD: sentence-level and inter-sentence features, cluster centroids, position, TF*IDF • CLASSY: learned HMM to distinguish summary from non-summary sentences, QR algorithm, linguistic component • Mani & Erkon: graph-connectivity model • Lin & Hovy: sentence position, term frequency, topic signature and term clustering • LexPageRank, Hardy, Harabagiu and Lacatusu, etc. • Supervised approaches: sentence classifiers trained using human-generated summaries as training examples for feature extraction and parameter estimation
Our approach • Documents should be ranked in order of probability of relevance to the request or information need, as calculated from whatever evidence is available to the system • Central idea: • Query Dependent Ranking (DUC 2005) • Language Models (HAL, RBLM) • Query Independent Ranking (DUC 2006) • Sentence Prior
Relevance Based Language Models (RBLM) • An IR approach • Query and documents are treated as samples from an unknown relevance model R • Overcomes the sparseness of the document language model • Conditional sampling computes the conditional probabilities, decomposed from joint probabilities
Hyperspace Analogue to Language (HAL): Co-occurrence strengths • A concept can be understood from its context: HAL constructs the dependencies of a term w on other terms based on their occurrence in its context in the corpus • Input: corpus of documents with a vocabulary of size |T| • Output: a |T| x |T| matrix whose rows are the vectors corresponding to the terms in the vocabulary • Process: sliding window of K words • Words within the window are treated as co-occurring, with strengths inversely proportional to their distance • Vectors of semantically related words are similar • Frequency counts are converted into probabilities: P(w1/w2)
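The sliding-window process described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the choice of 1/distance as the strength, counting only preceding words, and the row-wise normalization used to turn counts into probabilities are all assumptions.

```python
from collections import defaultdict


def build_hal(tokens, window=5):
    """Accumulate co-occurrence strengths between each word and the words
    that precede it within `window` positions.

    Assumption: strength = 1 / distance, one reading of "inversely
    proportional to their distance".
    """
    hal = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break  # ran off the start of the text
            hal[word][tokens[j]] += 1.0 / distance
    return hal


def to_probabilities(hal):
    """Normalize each row to sum to 1 (assumed way of converting
    strengths into conditional probabilities)."""
    probs = {}
    for word, row in hal.items():
        total = sum(row.values())
        probs[word] = {ctx: strength / total for ctx, strength in row.items()}
    return probs


if __name__ == "__main__":
    tokens = "the summary answers the information need".split()
    probs = to_probabilities(build_hal(tokens, window=3))
    for ctx, p in sorted(probs["need"].items()):
        print(ctx, round(p, 3))
```

Each row of the result is the context vector of one term, so two terms can be compared by comparing their rows.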
Putting It All Together • HAL Feature:
Modified HAL - Addressing Phrases • Phrases are identified using a chunker • Co-occurrence strength of a word is computed on chunks • Each word/phrase is weighted in proportion to its TF-IDF value
Query-Independent Features: Sentence Prior • Captures the importance of a sentence explicitly using pseudo-relevant documents (Web, Wikipedia, DUC document sets) • Based on domain knowledge, background information, centrality • Log-linear relevance • Information measure in a sentence • Entropy is a measure of the information contained in a message
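The slide does not give the formula for the information measure, so the following is only one plausible reading: word probabilities are estimated from a background collection (standing in for the pseudo-relevant documents), and a sentence is scored by summing the entropy contributions of its distinct words. The function names and the scoring rule are assumptions made for illustration.

```python
import math
from collections import Counter


def unigram_model(background_tokens):
    """Maximum-likelihood unigram probabilities from a background collection."""
    counts = Counter(background_tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def sentence_entropy(sentence_tokens, model):
    """Sum of -p(w) * log2 p(w) over the distinct sentence words known to
    the model (an assumed scoring rule, not taken from the slides)."""
    return sum(
        -model[w] * math.log2(model[w])
        for w in set(sentence_tokens)
        if w in model
    )


if __name__ == "__main__":
    background = "summaries answer the information need of the user".split()
    model = unigram_model(background)
    print(sentence_entropy("the user need".split(), model))
```

Because the score uses no query terms, it can be computed once per sentence and reused across topics.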
DUC 2006: Official Results • Total of 38 systems participated • Significant difference between the first two systems • 5th rank on linguistic quality
Progressive Summarization • Emerging area of research in summarization • Summarization with a sense of prior knowledge • Introduced as “Update Summarization” at DUC 2007, TAC 2008, TAC 2009 • Generate a short summary of a set of newswire articles, under the assumption that the user has already read a given set of earlier articles. • To keep track of news stories, reviews of products
Key challenge • To detect information that is not only relevant but also new, given the prior knowledge of the reader • Relevant and new vs. • Non-relevant and new vs. • Relevant and redundant
Novelty Detection • Identifying sentences that contain new information (novelty detection) in a cluster of documents is the key to progressive summarization • Shares similarity with the Novelty track at TREC, 2002 – 2004 • Task 1: Extract relevant sentences from a set of documents for a topic • Task 2: Eliminate redundant sentences from the relevant sentences • Progressive summarization differs in that it produces a summary from the novel sentences (requires scoring and ranking)
Three-level approach to Novelty Detection • Sentence Scoring • Developing new features that capture the novelty of a sentence along with its relevance • Ranking • Sentences are re-ranked based on the amount of novelty they contain • Summary Generation • Summary of the recent cluster should have minimum overlap with previous summaries • A pool of sentences that contain novel facts is selected; all remaining sentences are filtered out
Novelty Detection Features: NF (Novelty Factor) and New Words • Consider a stream of articles published on a topic over a time period T • All articles published from time 0 to t are considered to have been read previously (prior knowledge) • Articles published from t to T are new and contain new information • Let td represent the chronological time stamp of document d
Ranking Features • Ranked set is re-ordered based on the redundancy score of each sentence • Re-ranked set is used for summary generation • Cosine Similarity (CoSim) • Each sentence in the new cluster is compared against all sentences in previous clusters • Average cosine similarity value is taken as the redundancy score • Rank = relweight * rank - redweight * redundancy_score • relweight = 0.9, redweight = 1 - relweight
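The re-ranking rule on this slide can be sketched in a few lines. The weights relweight = 0.9 and redweight = 1 - relweight come from the slide; the bag-of-words term vectors, the function names, and the input layout (token lists paired with relevance scores) are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a_tokens, b_tokens):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def redundancy_score(sentence, previous_sentences):
    """Average cosine similarity of a new sentence against all earlier ones."""
    if not previous_sentences:
        return 0.0
    total = sum(cosine(sentence, prev) for prev in previous_sentences)
    return total / len(previous_sentences)


def rerank(scored_sentences, previous_sentences, relweight=0.9):
    """Apply relweight * rank - redweight * redundancy_score and sort
    from highest to lowest combined score."""
    redweight = 1 - relweight
    rescored = [
        (relweight * score - redweight * redundancy_score(tokens, previous_sentences),
         tokens)
        for tokens, score in scored_sentences
    ]
    return sorted(rescored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    previous = [["flood", "hits", "city"]]
    new = [(["flood", "hits", "city"], 0.5), (["aid", "arrives", "today"], 0.5)]
    for score, tokens in rerank(new, previous):
        print(round(score, 3), " ".join(tokens))
```

With equal relevance scores, the sentence that repeats earlier content drops below the one that does not, which is the intended effect of the redundancy penalty.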
Evaluation and Results • TAC 2008 Update Summarization data for training: 48 topics • Each topic is divided into clusters A and B, with 10 documents each • Summary for cluster A is a normal summary and summary for cluster B is an update summary • TAC 2009 Update Summarization data for testing: 44 topics • Baseline summarizer generates a summary by picking the first 100 words of the last document • Run1 – DFS + SL1 • Run2 – PHAL + KL
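The baseline described on this slide is simple enough to write down directly. The sketch below assumes that documents are plain strings ordered chronologically, so that the "last document" is the final list element, and that words are split on whitespace; the function name is hypothetical.

```python
def baseline_summary(documents, max_words=100):
    """Return the first `max_words` whitespace-separated words of the
    last document in the list (assumed to be the most recent one)."""
    if not documents:
        return ""
    words = documents[-1].split()
    return " ".join(words[:max_words])


if __name__ == "__main__":
    docs = ["An earlier article on the topic.", "The most recent article. " * 40]
    summary = baseline_summary(docs)
    print(len(summary.split()), "words")
```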
Personalized Summarization - Motivation • When different humans summarize the same text • 71% overlap [Marcu-1997] • 25% overlap [Rath-1961] • 46% overlap [Salton-1997] • They include different content from each other, reflecting their personal interests and background knowledge • Each user has a different perspective on the same text • Need to incorporate the user in the automatic summarization process • Summarization is not only a function of the input text but also of its reader
Term interest differs for each person • Incorporating a user model P(w/Mu) to smooth the original document distribution
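One common way to smooth a document distribution with a second model is linear interpolation, sketched below. The slides do not state which smoothing method was used, so the interpolation form, the weight `lam`, and the function names are assumptions made for illustration.

```python
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def smooth_with_user_model(doc_model, user_model, lam=0.3):
    """Interpolate: (1 - lam) * P(w/D) + lam * P(w/Mu).

    `lam` is a hypothetical mixing weight; larger values pull the
    distribution further toward the user's interests.
    """
    vocab = doc_model.keys() | user_model.keys()
    return {
        w: (1 - lam) * doc_model.get(w, 0.0) + lam * user_model.get(w, 0.0)
        for w in vocab
    }


if __name__ == "__main__":
    doc = unigram("storm damage reported across the coast".split())
    user = unigram("retrieval models for storm reports".split())
    mixed = smooth_with_user_model(doc, user, lam=0.3)
    print(round(mixed["storm"], 3), round(mixed["retrieval"], 3))
```

Terms that appear only in the user profile receive non-zero probability after smoothing, so sentences containing them can score higher for that user.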
Experiments and Evaluation • Users: research scholars from different fields of computer science • Web-based profile creation: personal information available on the web, such as a conference page, a project page, an online paper, or even a weblog • Submit the person's full name to a search engine ("Vasudeva Varma") and retrieve the top 'n' documents to build the user profile • Estimate the model P(w/Mu) to incorporate the user in the sentence extraction process • 5 users, 25 document clusters • Two versions of the summary generated for each user • Generic summary • Personalized summary • Each user was asked to give a relevance score to each summary on a 5-point scale
Evaluation • Average scores for different users • Scores for different topics for a user
Abstract Summarization – A Demo • http://compare.setooz.com • Alias http://comparison.to
Thank you • Questions?