Diversifying Search Results Rakesh Agrawal (rakesha@microsoft.com), Sreenivas Gollapudi (sreenig@microsoft.com), Alan Halverson (alanhal@microsoft.com), Samuel Ieong (saieong@microsoft.com) Search Labs, Microsoft Research WSDM ’09
Outline • Introduction • Problem Formulation • A Greedy Algorithm for DIVERSIFY(k) • Performance Metrics • Evaluation • Conclusions
Introduction • Minimize the risk of dissatisfaction of the average user • Assume that there exist • a taxonomy • a model of user intents • Consider both the relevance of the documents and the diversity of the search results • Trade off relevance against diversity
Problem Formulation • Allocating the number of results to show for each category in proportion to the percentage of users interested in that category may perform poorly • Example: query "Flash" • Technology: 0.6
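The slide's "Flash" example is truncated in this extraction, so the sketch below uses hypothetical numbers of our own (the 0.6/0.4 intent split on two generic intents, the satisfaction probabilities V, and the expected-satisfaction objective are our illustration, not taken from the slides) to show why proportional slot allocation can perform poorly:

```python
# Hypothetical illustration (numbers are ours, not from the slides):
# why allocating result slots in proportion to intent probabilities
# can be suboptimal.  V gives the probability a document satisfies
# an intent; satisfaction across shown documents is independent.

def expected_satisfaction(p_intents, picked):
    """sum_c p(c) * (1 - prod_{d in picked} (1 - V(d|c)))."""
    total = 0.0
    for c, p in p_intents.items():
        miss = 1.0
        for v_by_cat in picked:
            miss *= 1.0 - v_by_cat.get(c, 0.0)
        total += p * (1.0 - miss)
    return total

p = {"c1": 0.6, "c2": 0.4}
d_c1 = {"c1": 1.0}   # one perfectly relevant result for c1
d_c2 = {"c2": 0.5}   # weakly relevant results for c2

# k = 5 slots.  Proportional: 3 slots to c1 (0.6 * 5), 2 to c2.
proportional = expected_satisfaction(p, [d_c1] * 3 + [d_c2] * 2)
# Diversified: 1 slot to c1 (it already satisfies c1 fully), 4 to c2.
diversified = expected_satisfaction(p, [d_c1] + [d_c2] * 4)

print(proportional)  # 0.6 * 1 + 0.4 * 0.75   = 0.90
print(diversified)   # 0.6 * 1 + 0.4 * 0.9375 = 0.975
```

The extra c1 slots are wasted because one perfect result already satisfies that intent; reallocating them to the weaker intent raises expected satisfaction.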
Problem Formulation • The objective is defined over an unordered set of results • Our algorithm is also designed to generate an ordering of results rather than just a set of results
Problem Formulation • DIVERSIFY(k) is NP-hard • A solution optimal for DIVERSIFY(k-1) need not be a subset of a solution optimal for DIVERSIFY(k) • Example: p(c1|q) = p(c2|q) = 0.5 DIVERSIFY(1): d1, d2, d3 DIVERSIFY(2): d2, d3, d1
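A minimal sketch of the greedy idea behind IA-Select as we understand it (our reconstruction, not the authors' code): repeatedly pick the document with the largest marginal utility, then discount the weight of the intents that document already covers, which also yields an ordering rather than just a set. The document names, V values, and the tiny instance are our own illustration.

```python
# Greedy selection for DIVERSIFY(k), sketched from the paper's idea.
# V[d][c] = estimated probability that document d satisfies intent c.

def ia_select(p_intents, docs, V, k):
    """Returns an ordered list of up to k documents."""
    U = dict(p_intents)        # residual intent weights U(c | q, S)
    selected = []
    candidates = list(docs)
    while candidates and len(selected) < k:
        # Marginal utility of d: sum_c U(c) * V(d | c).
        best = max(candidates,
                   key=lambda d: sum(U[c] * V[d].get(c, 0.0) for c in U))
        selected.append(best)
        candidates.remove(best)
        # Intents covered by `best` become less valuable.
        for c in U:
            U[c] *= 1.0 - V[best].get(c, 0.0)
    return selected

# Tiny instance echoing the slide's setting: p(c1|q) = p(c2|q) = 0.5.
p = {"c1": 0.5, "c2": 0.5}
V = {"d1": {"c1": 0.6, "c2": 0.6},   # decent for both intents
     "d2": {"c1": 0.9},              # excellent for c1 only
     "d3": {"c2": 0.8}}              # good for c2 only

print(ia_select(p, ["d1", "d2", "d3"], V, 1))  # top-1 pick
print(ia_select(p, ["d1", "d2", "d3"], V, 3))  # full ordering
```

The discounting step makes the objective submodular, which is what gives the greedy its approximation guarantee mentioned in the conclusions.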
Performance Metrics • NDCG, MRR, and MAP do not take into account the value of diversification • Intent-aware measures • Example: p(c2|q) >> p(c1|q) d1 is Excellent for c1 (but unrelated to c2) d2 is Good for c2 (but unrelated to c1) Classical IR metrics rank: d1, d2 Intent-aware measures rank: d2, d1
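The intent-aware idea can be sketched as a p(c|q)-weighted average of a classical metric computed separately per intent. Below is a minimal NDCG-IA-style sketch of our own; the intent probabilities and relevance grades are hypothetical numbers chosen to mirror the slide's example, not data from the paper.

```python
# Sketch of an intent-aware metric: compute NDCG@k per intent, then
# average weighted by p(c|q).  grades[c][d] is a graded relevance
# judgment of document d under intent c (hypothetical values).
import math

def dcg(gains):
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranking, grade, k):
    gains = [grade.get(d, 0) for d in ranking[:k]]
    ideal = sorted(grade.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) > 0 else 0.0

def ndcg_ia(ranking, p_intents, grades, k):
    return sum(p * ndcg_at_k(ranking, grades[c], k)
               for c, p in p_intents.items())

# Mirrors the slide: p(c2|q) >> p(c1|q); d1 excellent for c1 only,
# d2 good for c2 only (grades 0-3 are our own illustration).
p = {"c1": 0.1, "c2": 0.9}
grades = {"c1": {"d1": 3}, "c2": {"d2": 2}}

print(ndcg_ia(["d1", "d2"], p, grades, 2))
print(ndcg_ia(["d2", "d1"], p, grades, 2))  # d2 first scores higher
```

Because the dominant intent c2 is only served by d2, the intent-aware score prefers the ordering d2, d1, whereas a single global grade favoring d1 would not.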
Evaluation • Evaluate our approach against three commercial search engines • Conduct three sets of experiments • The experiments differ in how the distributions of intents and the relevance of the documents are obtained
Experiment 1 • Obtain the distributions of intents for both queries and documents via standard classifiers • Obtain the relevance of documents from a proprietary repository of human judgments to which we have been granted access • Dataset: 10,000 random queries with their top 50 documents • Many documents in the top 10 for each query are assigned human judgments
Experiment 1 • Sample about 900 queries such that • each query spans at least two categories • a significant fraction of the associated documents have human judgments
Experiment 2 • Obtain the distributions of intents for queries and the document relevance using the Amazon Mechanical Turk platform • Sample 200 queries from the dataset, each spanning at least three categories • Submit these queries, along with the three most likely categories as estimated by the classifier and the top five results produced by IA-Select, to the Mechanical Turk workers
Experiment 3 • IA-Select: p(c|q) obtained from the Amazon Mechanical Turk platform • Metrics: p(c|q) and the document relevance are the same as used in Experiment 1
Conclusions • Provide a greedy algorithm with good approximation guarantees • To evaluate the effectiveness of our approach, we proposed generalizations of well-studied metrics that take the intentions of the users into account • Our approach outperforms the results produced by commercial search engines on all of the metrics