
Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data

WWW 2010. Yisong Yue (Cornell Univ.), Rajan Patel (Google Inc.), Hein Roehrig (Google Inc.).


Presentation Transcript


  1. Beyond Position Bias: Examining Result Attractiveness as a Source of Presentation Bias in Clickthrough Data • WWW 2010 • Yisong Yue (Cornell Univ.), Rajan Patel (Google Inc.), Hein Roehrig (Google Inc.)

  2. User Feedback in Search Systems • Cheap & representative feedback • Evaluation metrics • Optimization criterion • How to interpret feedback accurately? • Clicks on (web) search results • Data plentiful • Important domain

  3. Interpreting Clicks • What does a click mean? • Does a click mean the result is good? • If so, how good?

  4. How Are Clicks Biased? • In what ways do clicks not directly reflect user utility or preferences? • Presentation Bias • Only click on what they pay attention to • E.g., position bias (more clicks at top of ranking) • Understanding presentation bias essential to more accurately interpreting feedback

  5. Maybe 3rd result looked more relevant • i.e., judging a book by its cover • Maybe 3rd result attracted more attention • E.g., eye-catching • Many matching query terms (in bold)

  6. Summary Attractiveness • Goal: quantify the effect of summary attractiveness on click behavior • Web search context • First study to conduct a rigorous statistical analysis on summary attractiveness bias

  7. Controlling for Position • Position bias is the largest biasing effect • Need to control for it in order to analyze other biasing effects • Use FairPairs randomization • [Radlinski & Joachims, 2006]

  8. FairPairs Example • Original: 1 2 3 4 5 6 7 8 9 10 • FairPair1: 1 2 3 4 5 6 7 8 9 10 • Swap: 2 1 3 4 6 5 8 7 9 10 • FairPair2: 1 2 3 4 5 6 7 8 9 10 • Swap: 1 2 3 5 4 7 6 9 8 10 • Randomly choose pairing scheme • Randomly swap each intra-pair ordering independently [Radlinski & Joachims, AAAI 2006]
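The two steps on this slide (pick a pairing scheme, then flip each pair independently) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `fairpairs` and the `rng` parameter are assumptions made for the example.

```python
import random

def fairpairs(ranking, rng=random):
    """Return a FairPairs-style randomized copy of `ranking` (hypothetical helper)."""
    result = list(ranking)
    # Scheme 0 pairs ranks (1,2), (3,4), ...; scheme 1 pairs ranks (2,3), (4,5), ...
    start = rng.choice([0, 1])
    for i in range(start, len(result) - 1, 2):
        # Swap the two members of each pair independently with probability 1/2.
        if rng.random() < 0.5:
            result[i], result[i + 1] = result[i + 1], result[i]
    return result

print(fairpairs(list(range(1, 11)), random.Random(0)))
```

Because swaps only happen inside a pair, every result moves by at most one rank, which keeps the change to the presented ranking small.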

  9. Interpreting FairPairs Clicks • Conclusion: B > A • Clicks indicate pairwise preference (relative quality).
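One simple way to turn FairPairs clicks into a pairwise preference is to tally clicks per result across both orderings. The sketch below assumes both orderings were shown equally often, so that position effects cancel; the exact decision rule used in the talk is not preserved in the transcript, and the function name and data layout are made up for illustration.

```python
from collections import Counter

def preferred(impressions):
    """impressions: list of (top, bottom, clicked) tuples for one pair.

    `clicked` is the clicked result, or None if neither was clicked.
    Returns the result with more clicks, or None on a tie.
    """
    clicks = Counter(c for _top, _bottom, c in impressions if c is not None)
    a, b = sorted({doc for top, bottom, _ in impressions for doc in (top, bottom)})
    if clicks[a] == clicks[b]:
        return None
    return a if clicks[a] > clicks[b] else b

# Invented example data: each ordering is shown twice.
log = [("A", "B", "A"), ("B", "A", "B"), ("A", "B", "B"), ("B", "A", "B")]
print(preferred(log))  # prints B
```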

  10. Thought Experiment • Two results A & B • Equally relevant for some query • Ranked adjacently in search results • AB and BA shown equally often (FairPairs) • A has an attractive title. B does not. • Who gets more clicks, A or B?

  11. Click Data • Ran FairPairs randomization • On a portion of Google US web search traffic • August 1 to August 20, 2009 • 439,246 clicks collected

  12. Human Judged Ratings • Sampled a subset of 1150 FairPairs. • Asked human raters to explicitly judge which result of the pair is more relevant. • 5 judgments for each pair • Human raters must navigate to the landing page.

  13. Measuring Attractiveness • Relative measure of attractiveness • Difference of bolded query terms in title & abstract • Bottom result has +2 bolded terms in title • Bottom result has +2 bolded terms in abstract
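The relative measure on this slide can be approximated as a difference of counts. In the sketch below, "bolded" is assumed to mean an exact, case-insensitive match against a query term; real search engines may bold stemmed or related terms, and the function names and dictionary keys are hypothetical.

```python
import re

def bolded_terms(query, text):
    """Count tokens in `text` that exactly match a query term (proxy for bolding)."""
    terms = set(re.findall(r"\w+", query.lower()))
    return sum(1 for tok in re.findall(r"\w+", text.lower()) if tok in terms)

def bolding_difference(query, top, bottom):
    """Return (title_diff, abstract_diff), counted as bottom minus top."""
    title_diff = bolded_terms(query, bottom["title"]) - bolded_terms(query, top["title"])
    abstract_diff = bolded_terms(query, bottom["abstract"]) - bolded_terms(query, top["abstract"])
    return title_diff, abstract_diff

# Invented example pair for the query "cheap flights".
top = {"title": "Airline deals", "abstract": "Airline deals and offers"}
bottom = {"title": "Cheap flights", "abstract": "Compare cheap flights to Europe"}
print(bolding_difference("cheap flights", top, bottom))  # prints (2, 2)
```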

  14. Measuring Attractiveness • Clearly, query/title similarity is informative. • Good results should have titles that strongly match • But would blindly counting clicks cause us to over-value query/title similarity?

  15. Rated Clicks Model

  16. Null Hypothesis • Title & abstract bolding have 0 effect • Position and relative (judged) quality are the only factors affecting click probability.

  17. Fitted Model

  18. Leveraging All Clicks • Previous model required human judgments • We need to calibrate against relative quality • How to do this on all 400,000+ clicks? • Make independence assumptions!

  19. Intuition • Virtually all search engines predict rankings using many attributes (or features). • Query/title similarity is only one component. • Example: a document with low query/title similarity might achieve high ranking due to very relevant body text.

  20. Example • [Figure not preserved in transcript] • 1st feature: query/title similarity • 2nd feature: query/body similarity

  21. Example • [Figure not preserved in transcript] • 1st feature: query/title similarity • 2nd feature: query/body similarity

  22. Assumption • Take pairs of adjacent documents at random • Collect relative relevance ratings • Human rated preferences • Should be independent of title bolding difference • Can check using statistical model

  23. Rated Agreement Model

  24. Fitted Model • Assumption approximately satisfied for query/title similarity.

  25. Title Bias Effect (All Clicks) • Bars should be equal if not biased
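The "bars should be equal" check can be read as a test of a 50/50 split: under FairPairs randomization, and assuming relative quality is independent of the title bolding difference, the result with more title bolding should receive about half of the clicks. The sketch below uses a normal approximation to the binomial; it is not the statistical model fitted in the talk, and the counts in the usage line are invented.

```python
import math

def bias_test(clicks_more_bold, clicks_less_bold):
    """Two-sided z-test of the null hypothesis that both results are clicked equally often.

    Returns (z, p_value) using the normal approximation to Binomial(n, 0.5).
    """
    n = clicks_more_bold + clicks_less_bold
    z = (clicks_more_bold - n / 2) / math.sqrt(n / 4)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Invented counts, for illustration only.
z, p = bias_test(5200, 4800)
print(f"z = {z:.2f}, p = {p:.2g}")
```

A small p-value here would suggest a title bias only if the independence assumption holds; otherwise the imbalance could reflect genuine quality differences.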

  26. Evaluation Metrics & Optimization • Pairwise preferences common for evaluation • E.g., maximize FairPairs agreement • Goal: maximize pairwise relevance agreement • Want to be aligned with click agreement • Danger: might conclude current system is undervaluing query/title similarity • Down-weight clicks on results with more title bolding • E.g., weight clicks by exp(-w_T X_T)
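The down-weighting idea can be sketched as follows, reading the weighting on this slide as exp(-w_T · x_T), where x_T is the title bolding difference in favor of the clicked result and w_T is a fitted title-bias coefficient. That reading, the function names, and the example value of w_T are all assumptions; the fitted coefficient is not preserved in the transcript.

```python
import math

def click_weight(title_bold_diff, w_title):
    """Weight for one click: below 1 when the clicked result has more title bolding."""
    return math.exp(-w_title * title_bold_diff)

def weighted_click_totals(clicks, w_title):
    """clicks: list of (clicked_doc, title_bold_diff) pairs.

    title_bold_diff = bolded title terms of the clicked result minus those of its partner.
    """
    totals = {}
    for doc, diff in clicks:
        totals[doc] = totals.get(doc, 0.0) + click_weight(diff, w_title)
    return totals

# Invented example: A has two more bolded title terms than B.
clicks = [("A", 2), ("A", 2), ("B", -2)]
print(weighted_click_totals(clicks, w_title=0.1))
```

With `w_title=0` the totals reduce to raw click counts, so the coefficient controls how strongly the correction is applied.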

  27. Directions to Explore • Other ways to measure summary attractiveness • Use other summary content • Other forms of presentation bias • Anything that draws people’s attention • Ways to interpret and adjust for bias • More accurate ways to quantify bias • More accurate evaluation metrics

  28. Extra Slides

  29. Fitted Model (All Clicks)
