Less is More: Probabilistic Models for Retrieving Fewer Relevant Documents • Harr Chen and David R. Karger • MIT CSAIL • SIGIR 2006
Abstract • Probability Ranking Principle (PRP) • Rank documents in decreasing order of probability of relevance. • Propose a greedy algorithm that approximately optimizes the following objectives • %no metric: the percentage of queries for which no relevant documents are retrieved. • The diversity of results.
Introduction • Probability Ranking Principle • Rule of thumb: treated as “optimal”. • TREC robust track • %no metric • Question answering and finding a homepage. • Diversity • For example, “Trojan horse” • A PRP-based method may commit to a single “most likely” interpretation. • Greedy algorithm • Fill each position in the ranking by assuming that all previous documents in the ranking are not relevant.
Introduction (Cont.) • Other measures • Search length (SL) • Reciprocal rank (RR) • Instance recall: the number of different subtopics in a given result set. • Retrieving for Diversity • Diversity arises automatically as a consequence of the objective function.
Related Work • Algorithm • Zhai and Lafferty: a risk minimization framework • Bookstein: a sequential learning retrieval system • Diversity • Zhai et al.: novelty and redundancy • Clustering is an approach to quickly cover a diverse range of query interpretations.
Evaluation Metrics • MSL (mean search length) • MRR (mean reciprocal rank) • %no • k-call at n: 1 if at least k of the top n docs returned by the system for the given query are deemed relevant; otherwise 0. • Mean 1-call: one minus the %no metric • n-call at n: perfect precision • Instance recall at rank n
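A minimal Python sketch of the per-query metrics listed on this slide, averaged over a set of queries. The function names, the toy judgments, and the conventions for a query with no relevant result (reciprocal rank 0, search length equal to the list length) are assumptions made for illustration, not definitions taken from the talk.

```python
from typing import List, Sequence


def k_call_at_n(relevant: Sequence[bool], k: int, n: int) -> int:
    """Return 1 if at least k of the top n results are relevant, else 0."""
    return int(sum(relevant[:n]) >= k)


def reciprocal_rank(relevant: Sequence[bool]) -> float:
    """1 / rank of the first relevant result (assumed 0.0 if there is none)."""
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def search_length(relevant: Sequence[bool]) -> int:
    """Non-relevant results seen before the first relevant one.

    Assumption: if nothing is relevant, the full list length is returned.
    """
    for position, is_relevant in enumerate(relevant):
        if is_relevant:
            return position
    return len(relevant)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


# Hypothetical relevance judgments for two queries, in rank order.
runs: List[List[bool]] = [[False, True, False], [False, False, False]]
mean_1_call = mean([k_call_at_n(r, k=1, n=3) for r in runs])
pct_no = 1.0 - mean_1_call  # fraction of queries with no relevant result
mrr = mean([reciprocal_rank(r) for r in runs])
msl = mean([search_length(r) for r in runs])
```

The `pct_no` line mirrors the slide's statement that mean 1-call is one minus %no, here expressed as a fraction rather than a percentage.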
Bayesian Retrieval • Standard Bayesian Information Retrieval • The documents in a corpus should be ranked by Pr[r|d] • Equivalently, by any monotonic transformation of Pr[r|d] • The focus is on the objective function, so use a Naïve Bayes framework with multinomial models (θi) as the family of distributions. • Determine the parameters (training) • Dirichlet prior: prior probability distribution over the parameters (θi). • Estimate the parameters of the relevant distribution (i.e., Pr[d|r]).
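One standard way to see the monotonic transformation, sketched with Bayes' rule. The symbol \(\neg r\) for "not relevant" is notation assumed here; the slide does not say which transformation the talk used.

```latex
\Pr[r \mid d]
  = \frac{\Pr[d \mid r]\,\Pr[r]}{\Pr[d \mid r]\,\Pr[r] + \Pr[d \mid \neg r]\,\Pr[\neg r]}
  = \left( 1 + \frac{\Pr[\neg r]}{\Pr[r]} \cdot \frac{\Pr[d \mid \neg r]}{\Pr[d \mid r]} \right)^{-1}
```

Since \(\Pr[r]\) does not depend on the document, ranking by \(\Pr[r \mid d]\) gives the same order as ranking by the likelihood ratio \(\Pr[d \mid r] / \Pr[d \mid \neg r]\), which is why only the two document distributions need to be estimated.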
Objective Function • Consider optimizing for the k-call at n metric. • k=1: the probability that at least one of the first n relevance variables is true • For arbitrary k: the probability that at least k docs are relevant
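Written out, the two objectives on this slide might look as follows. The indexing \(d_0, \dots, d_{n-1}\) with relevance events \(r_0, \dots, r_{n-1}\) follows the later slides; the indicator notation \(\mathbf{1}[\cdot]\) and \(\neg r_i\) are assumptions.

```latex
% k = 1: at least one of the top n documents is relevant
\Pr[r_0 \cup \dots \cup r_{n-1} \mid d_0, \dots, d_{n-1}]
  = 1 - \Pr[\neg r_0 \cap \dots \cap \neg r_{n-1} \mid d_0, \dots, d_{n-1}]

% arbitrary k: at least k of the top n documents are relevant
\Pr\!\left[ \sum_{i=0}^{n-1} \mathbf{1}[r_i] \ge k \;\middle|\; d_0, \dots, d_{n-1} \right]
```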
Optimization Methods • NP-hard Problem • Perfectly optimizing the k-call objective over all sets of n docs drawn from a corpus of m docs is NP-hard. • Greedy algorithm (approximately optimizes it) • Successively select each result of the result set. • Select the first result by applying the conventional PRP. • For the ith result, hold results 1 through i-1 fixed at their already selected values, and consider every remaining corpus document as a candidate for document i. • Pick the document with the highest k-call score as the ith result.
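The greedy loop described on this slide can be sketched in a few lines of Python. Everything concrete here is hypothetical: `greedy_rank`, the `score` callback signature, the toy corpus, and especially `toy_score`, which is a made-up stand-in that discounts documents sharing a topic with earlier picks and is not the Bayesian model from the talk.

```python
from typing import Callable, Dict, List, Tuple


def greedy_rank(
    corpus: List[str],
    n: int,
    score: Callable[[List[str], str], float],
) -> List[str]:
    """Fill positions one at a time, keeping earlier choices fixed.

    score(selected, candidate) should return the objective value obtained
    by appending candidate to the already selected prefix.
    """
    selected: List[str] = []
    remaining = list(corpus)
    for _ in range(min(n, len(remaining))):
        best = max(remaining, key=lambda doc: score(selected, doc))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical corpus: doc id -> (topic, stand-alone relevance estimate).
docs: Dict[str, Tuple[str, float]] = {
    "a1": ("virus", 0.9),
    "a2": ("virus", 0.8),
    "b1": ("myth", 0.5),
}


def toy_score(selected: List[str], candidate: str) -> float:
    """Stand-in score: discount candidates whose topic was already picked."""
    topic, p = docs[candidate]
    same_topic = sum(1 for s in selected if docs[s][0] == topic)
    return p * (0.25 ** same_topic)


ranking = greedy_rank(list(docs), n=3, score=toy_score)
```

With an empty prefix the score reduces to the stand-alone estimate, so the first pick matches a PRP-style choice; later picks depend on what was already selected, which is the part the PRP ignores.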
Applying the Greedy Approach • k=1 • First, choose the doc d0 maximizing Pr[r0|d0]. • Wish to choose d1 maximizing the below quantity: • Choose d2 by maximizing • In general, select the optimal di that maximizes
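A sketch of the quantities for the k=1 case, reconstructed from the 1-call objective and the rule of treating all earlier documents as not relevant. The notation \(\neg r_j\), and the step \(\Pr[r_0 \mid d_0, d_1] = \Pr[r_0 \mid d_0]\), are assumptions made for this reconstruction.

```latex
% Second position: d_0 is already fixed
\Pr[r_0 \cup r_1 \mid d_0, d_1]
  = \Pr[r_0 \mid d_0] + \Pr[\neg r_0 \mid d_0]\,\Pr[r_1 \mid \neg r_0, d_0, d_1]
% Only the last factor depends on d_1, so d_1 maximizes
\Pr[r_1 \mid \neg r_0, d_0, d_1]

% Third position
\Pr[r_2 \mid \neg r_0, \neg r_1, d_0, d_1, d_2]

% General position i
d_i = \arg\max_{d} \Pr[r_i \mid \neg r_0, \dots, \neg r_{i-1}, d_0, \dots, d_{i-1}, d]
```

Conditioning on earlier results being not relevant acts like negative feedback: a candidate that resembles the documents already chosen becomes less likely to be picked next.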
Applying the Greedy Approach (Cont.) • k=n (perfect precision) • Select the ith document according to: • 1<k<n • The objective is to maximize the probability of having at least k relevant docs in the top n. • Focus on k=1 and k=n cases in this paper.
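For k=n, the chain rule suggests a selection rule that mirrors the k=1 case with the conditioning flipped. This is a reconstruction under the same notation as the other slides, not a formula recovered from the talk.

```latex
\Pr[r_0 \cap \dots \cap r_i \mid d_0, \dots, d_i]
  = \Pr[r_0 \cap \dots \cap r_{i-1} \mid d_0, \dots, d_{i-1}]
    \cdot \Pr[r_i \mid r_0, \dots, r_{i-1}, d_0, \dots, d_i]

d_i = \arg\max_{d} \Pr[r_i \mid r_0, \dots, r_{i-1}, d_0, \dots, d_{i-1}, d]
```

With the first factor fixed by earlier choices, each step picks the document most likely to be relevant given that all previously selected documents are relevant.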
Optimizing for Other Metrics • Optimizing 1-call • Choose greedily, conditioned on no previous document being relevant. • Equivalent to greedily minimizing expected search length and maximizing expected reciprocal rank. • Also optimizes the instance recall metric, which measures the number of distinct subtopics retrieved. • If a query has t subtopics, then instance recall is
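A small sketch of instance recall at rank n, assuming it is normalized by the t subtopics of the query: the fraction of subtopics covered by at least one of the top n documents. The function name and the subtopic labels are hypothetical.

```python
from typing import Sequence, Set


def instance_recall_at_n(subtopics_by_rank: Sequence[Set[str]], t: int, n: int) -> float:
    """Fraction of the query's t subtopics covered by the top n results.

    subtopics_by_rank[i] holds the subtopics that the result at rank i covers
    (an empty set for a non-relevant result).
    """
    covered: Set[str] = set().union(*subtopics_by_rank[:n])
    return len(covered) / t


# Hypothetical query with t = 4 subtopics and four ranked results.
ranked = [{"a"}, {"a", "b"}, set(), {"c"}]
recall_at_2 = instance_recall_at_n(ranked, t=4, n=2)
recall_at_4 = instance_recall_at_n(ranked, t=4, n=4)
```

Each subtopic contributes 1 exactly when some top-n document covers it, so under this normalization instance recall is the average of t per-subtopic 1-call values, which connects it to the 1-call objective.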
Google Examples • Two ambiguous queries: “Trojan horse” and “virus” • Used the titles, summaries, and snippets of Google’s results to form a corpus of 1000 docs for each query.
Experiments • Methods • 1-greedy, 10-greedy, and conventional PRP • Datasets • Ad hoc topics from TREC-1, TREC-2, and TREC-3, used to set the model's weight parameters • TREC 2004 robust track • TREC-6, 7, and 8 interactive tracks • TREC-4 and TREC-6 ad hoc tracks
Tuning the Weights • Key weights • For the proposed model, the key weights are the strengths of the relevant-distribution and irrelevant-distribution priors relative to the strength of the docs. • TRECs 1, 2, and 3 • Consist of about 724,000 docs and 150 topics (topics 51-200) • Used for tuning the weights
Robust Track Experiments • TREC 2004 robust track • 249 topics in total, about 528,000 docs • 50 topics were selected by TREC as being “difficult” queries.
Instance Retrieval Experiments • TREC-6, 7, and 8 interactive tracks • Test how well the method diversifies results • 20 topics in total, with between 7 and 56 aspects each, and about 210,000 docs. • Zhai et al.'s LM approach is better for aspect retrieval.
Multiple Annotator Experiments • TREC-4 and TREC-6 • Multiple independent annotators are asked to make relevance judgments for the same topics over the same corpus. • TREC-4 had three annotators, TREC-6 had two.
Query Analysis • A specific topic: topic 100 • The description is:
Conclusions and Future Work • Conclusions • Identify that the PRP is not optimal, and give an approach to directly optimize other desired objectives. • The approach is algorithmically feasible. • Future work • Other objective functions • More sophisticated techniques, such as local search algorithms • The likelihood of relevance of collections of docs • Two-Poisson model • Language model