
Personalized Web Search Uncommon Responses to Common Queries

A study of personalizing web search to improve result quality, covering a user study of personal relevance, personalization algorithms, results, and future work, including the Seesaw search engine.


Presentation Transcript


  1. Personalized Web Search: Uncommon Responses to Common Queries Jaime Teevan, MIT, with Susan T. Dumais and Eric Horvitz, MSR

  2. Personalizing Web Search • Motivation • Algorithms • Results • Future Work

  3. Personalizing Web Search • Motivation • Algorithms • Results • Future Work

  4. Study of Personal Relevancy • 15 participants • Microsoft employees • Managers, support staff, programmers, … • Evaluate 50 results for a query • Highly relevant • Relevant • Irrelevant • ~10 queries per person

  5. Study of Personal Relevancy • Query selection • Chose from 10 pre-selected queries (cancer, Microsoft, traffic, …, Las Vegas, rice, McDonalds, …, bison frise, Red Sox, airlines, …) or a previously issued query (e.g., from Mary's or Joe's own history) • Total: 137 queries; 53 pre-selected (2-9 per query)

  6. Relevant Results Have Low Rank • [Chart: rank distribution of results rated Highly Relevant, Relevant, and Irrelevant]

  7. Relevant Results Have Low Rank • [Chart: rank distribution of Highly Relevant, Relevant, and Irrelevant results for Rater 1 and Rater 2]

  8. Same Results Rated Differently • Average inter-rater reliability: 56% • Different from previous research • Belkin: 94% IRR in TREC • Eastman: 85% IRR on the Web • Asked for personal relevance judgments • Some queries more correlated than others
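
A minimal sketch of one way an agreement figure like the 56% above could be computed, assuming inter-rater reliability here means pairwise percent agreement over jointly rated results (the exact measure is not stated on the slide); the ratings data below are hypothetical.

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """ratings: dict mapping rater -> {result_url: label}.
    Returns the fraction of jointly rated results on which two raters
    assign the same label, averaged over all rater pairs."""
    scores = []
    for a, b in combinations(ratings, 2):
        shared = set(ratings[a]) & set(ratings[b])
        if not shared:
            continue
        agree = sum(ratings[a][u] == ratings[b][u] for u in shared)
        scores.append(agree / len(shared))
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical example: three raters judging results for the query "Microsoft"
ratings = {
    "rater1": {"www.microsoft.com": "highly relevant", "msdn.com": "relevant"},
    "rater2": {"www.microsoft.com": "highly relevant", "msdn.com": "irrelevant"},
    "rater3": {"www.microsoft.com": "highly relevant"},
}
print(pairwise_agreement(ratings))  # 0.833... for this toy data
```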

  9. Same Query, Different Intent • Different meanings • “Information about the astronomical/astrological sign of cancer” • “information about cancer treatments” • Different intents • “is there any new tests for cancer?” • “information about cancer treatments”

  10. Same Intent, Different Evaluation • Query: Microsoft • “information about microsoft, the company” • “Things related to the Microsoft corporation” • “Information on Microsoft Corp” • 31/50 results rated as not irrelevant • More than one rater agrees on only 6 of the 31 • All three agree only for www.microsoft.com • Inter-rater reliability: 56%

  11. Search Engines are for the Masses • [Figure: the Web ranking compared with the personal rankings of two example users, Joe and Mary]

  12. Much Room for Improvement • Group ranking • Best improves on Web by 38% • More people → less improvement

  13. Much Room for Improvement • Group ranking • Best improves on Web by 38% • More people → less improvement • Personal ranking • Best improves on Web by 55% • Remains constant

  14. Personalizing Web Search • Motivation • Algorithms – the Seesaw Search Engine • Results • Future Work

  15. Personalization Algorithms • Related to relevance feedback • Query expansion • [Diagram: standard IR – the client sends the user's query to the server, which returns documents]

  16. Personalization Algorithms • Related to relevance feedback • Query expansion • [Diagram: standard IR – client-side query, server-side documents] • v. Result re-ranking

  17. Result Re-Ranking • Ensures privacy • Good evaluation framework • Can look at rich user profile • Look at lightweight user models • Collected on server side • Sent as query expansion
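
A minimal sketch of client-side result re-ranking as described above, assuming a generic score(result, user_model) function (a concrete scoring scheme appears in the BM25 slides that follow); the result fields and function names are hypothetical.

```python
from typing import Callable, Dict, List

def rerank(results: List[Dict], user_model: Dict,
           score: Callable[[Dict, Dict], float]) -> List[Dict]:
    """Re-rank server-returned results on the client.
    Each result is assumed to be a dict with 'url', 'title', and 'snippet'.
    The user model never leaves the client, which preserves privacy."""
    return sorted(results, key=lambda r: score(r, user_model), reverse=True)
```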

  18. BM25 with Relevance Feedback • Score = Σ_i tf_i * w_i • w_i = log( … ) • N = documents in the corpus; n_i = corpus documents containing term i • R = relevant (feedback) documents; r_i = relevant documents containing term i

  19. BM25 with Relevance Feedback • Score = Σ_i tf_i * w_i • w_i = log [ (r_i + 0.5)(N - n_i - R + r_i + 0.5) / ((n_i - r_i + 0.5)(R - r_i + 0.5)) ]

  20. User Model as Relevance Feedback • Treat the user model as the relevance-feedback set: N' = N + R, n_i' = n_i + r_i • Score = Σ_i tf_i * w_i • w_i = log [ (r_i + 0.5)(N - n_i - R + r_i + 0.5) / ((n_i - r_i + 0.5)(R - r_i + 0.5)) ]

  21. User Model as Relevance Feedback • N' = N + R, n_i' = n_i + r_i • Score = Σ_i tf_i * w_i • w_i = log [ (r_i + 0.5)(N' - n_i' - R + r_i + 0.5) / ((n_i' - r_i + 0.5)(R - r_i + 0.5)) ]
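
A minimal sketch of the term weight and score from slides 18-21, assuming the user-model counts (R, r_i) come from the client-side index and the world counts (N, n_i) from the chosen corpus representation; the function and parameter names are hypothetical.

```python
import math
from collections import Counter

def term_weight(N: int, n_i: int, R: int, r_i: int) -> float:
    """BM25-style relevance-feedback weight with the user model as the
    feedback set: N' = N + R and n_i' = n_i + r_i (slide 21)."""
    N_p = N + R
    n_p = n_i + r_i
    return math.log((r_i + 0.5) * (N_p - n_p - R + r_i + 0.5) /
                    ((n_p - r_i + 0.5) * (R - r_i + 0.5)))

def score(doc_terms, world_df, user_df, N, R):
    """Score = sum over terms of tf_i * w_i, where tf_i is the term's
    frequency in the document representation (title/snippet or full text)."""
    tf = Counter(doc_terms)
    return sum(tf[t] * term_weight(N, world_df.get(t, 0), R, user_df.get(t, 0))
               for t in tf)
```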

  22. User Model as Relevance Feedback • Score = Σ_i tf_i * w_i • [Diagram: the world supplies N and n_i; the user model supplies R and r_i]

  23. User Model as Relevance Feedback • Score = Σ_i tf_i * w_i • [Diagram: N and n_i drawn from the part of the world related to the query]

  24. User Model as Relevance Feedback • Score = Σ_i tf_i * w_i • Query focused matching: N, n_i from the world related to the query; R, r_i from the user's documents related to the query

  25. User Model as Relevance Feedback • Score = Σ_i tf_i * w_i • [Diagrams contrasting query focused matching and world focused matching, i.e., which subsets of the world (Web or query-related documents) and of the user model supply N, n_i, R, and r_i]
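
A minimal sketch of how the two matching strategies on slides 24-25 could change which collections supply the counts fed to term_weight above; the subset-selection helper and the reading of world-focused matching as using the unrestricted collections are assumptions, not the authors' implementation.

```python
def counts_for_matching(term, world, user, query, mode="query_focused"):
    """Return (N, n_i, R, r_i) for one term.
    world/user are collections of documents (each a set of terms).
    Assumption: query-focused matching restricts both sides to documents
    related to the query; world-focused matching uses the full collections."""
    def related(docs):
        return [d for d in docs if any(q in d for q in query)]

    w = related(world) if mode == "query_focused" else world
    u = related(user) if mode == "query_focused" else user
    N, R = len(w), len(u)
    n_i = sum(term in d for d in w)
    r_i = sum(term in d for d in u)
    return N, n_i, R, r_i
```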

  26. Parameters • Matching • User representation • World representation • Query expansion

  27. Parameters • Matching: query focused, world focused • User representation • World representation • Query expansion

  28. Parameters • Matching: query focused, world focused • User representation • World representation • Query expansion

  29. User Representation • Stuff I’ve Seen (SIS) index • MSR research project [Dumais, et al.] • Index of everything a user has seen • Recently indexed documents • Web documents in SIS index • Query history • None

  30. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation • Query expansion

  31. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation • Query expansion

  32. World Representation • Document Representation • Full text • Title and snippet • Corpus Representation • Web • Result set – title and snippet • Result set – full text

  33. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation: full text or title and snippet; corpus: Web, result set (full text), result set (title and snippet) • Query expansion

  34. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation: full text or title and snippet; corpus: Web, result set (full text), result set (title and snippet) • Query expansion

  35. Query Expansion • All words in document v. query focused • Example snippet: “The American Cancer Society is dedicated to eliminating cancer as a major health problem by preventing cancer, saving lives, and diminishing suffering through ...”
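
A minimal sketch of query-focused expansion, under the assumption that it keeps only words appearing near query-term occurrences in the text (the slide's highlighted snippet suggests this, but the exact window size is an assumption); names are hypothetical.

```python
def query_focused_terms(text: str, query_terms, window: int = 3):
    """Keep only words within `window` positions of a query-term occurrence."""
    words = text.lower().split()
    keep = set()
    for i, w in enumerate(words):
        if w.strip('.,') in query_terms:
            keep.update(range(max(0, i - window), min(len(words), i + window + 1)))
    return [words[i] for i in sorted(keep)]

snippet = ("The American Cancer Society is dedicated to eliminating cancer "
           "as a major health problem by preventing cancer, saving lives ...")
print(query_focused_terms(snippet, {"cancer"}))
```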

  36. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation: full text or title and snippet; corpus: Web, result set (full text), result set (title and snippet) • Query expansion: all words, query focused

  37. Parameters • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation: full text or title and snippet; corpus: Web, result set (full text), result set (title and snippet) • Query expansion: all words, query focused

  38. Personalizing Web Search • Motivation • Algorithms • Results • Future Work

  39. Best Parameter Settings • Matching: query focused, world focused • User representation: all SIS, recent SIS, Web SIS, query history, none • World representation: full text or title and snippet; corpus: Web, result set (full text), result set (title and snippet) • Query expansion: all words, query focused • [Build slide highlighting the best-performing choice in each dimension]

  40. Seesaw Improves Retrieval • No user model • Random • Relevance Feedback • Seesaw

  41. Text Alone Not Enough

  42. Incorporate Non-text Features

  43. Summary • Rich user model important for search personalization • Seesaw improves text-based retrieval • Need other features to improve over the Web • Lots of room for future improvement

  44. Personalizing Web Search • Motivation • Algorithms • Results • Future Work • Further exploration • Making Seesaw practical • User interface issues

  45. Further Exploration • Explore larger parameter space • Learn parameters • Based on individual • Based on query • Based on results • Give user control?

  46. Making Seesaw Practical • Learn most about personalization by deploying a system • Best algorithm reasonably efficient • Merging server and client • Query expansion • Get more relevant results in the set to be re-ranked • Design snippets for personalization

  47. User Interface Issues • Make personalization transparent • Give user control over personalization • Slider between Web and personalized results • Allows for background computation • Creates problem with re-finding • Results change as user model changes • Thesis research – Re:Search Engine

  48. Thank you!

  49. Search Engines are for the Masses • Best common ranking • Sort results by number marked highly relevant, then by number marked relevant • DCG(i) = Gain(i) if i = 1, else DCG(i–1) + Gain(i)/log(i) • Measure distance with Kendall-Tau • Web ranking more similar to common ranking • Individual’s ranking distance: 0.469 • Common ranking distance: 0.445
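
A minimal sketch of the two measures on this slide; the gain values assigned to each relevance label and the use of a normalized Kendall-Tau distance are assumptions not stated on the slide.

```python
import math
from itertools import combinations

def dcg(gains):
    """DCG(i) = Gain(i) if i = 1, else DCG(i-1) + Gain(i)/log(i), 1-indexed."""
    total = 0.0
    for i, g in enumerate(gains, start=1):
        total += g if i == 1 else g / math.log(i)
    return total

def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs ordered differently by two rankings
    over the same set of items."""
    pos_a = {x: i for i, x in enumerate(rank_a)}
    pos_b = {x: i for i, x in enumerate(rank_b)}
    discordant = sum(
        (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
        for x, y in combinations(pos_a, 2))
    n = len(pos_a)
    return discordant / (n * (n - 1) / 2)

# Hypothetical gains: 2 for highly relevant, 1 for relevant, 0 for irrelevant
print(dcg([2, 1, 0, 1]))                                 # 2 + 1/log(2) + 0 + 1/log(4)
print(kendall_tau_distance(list("abcd"), list("acbd")))  # one swapped pair -> 0.166...
```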
