Two Models to predict Query-URL relevance in CA

Two Models to predict Query-URL relevance in CA • Tan Liwen • 2012.11.7

Introduction • The general interaction picture: Publishers, Advertisers, Users, & “Ad agency” • Each actor has its own goal (more later)

Interactions in Sponsored Search • Advertisers: • Submit ads associated with certain bid phrases • Bid for position • Pay per click (CPC) • Users: • Make queries to the search engine, expressing some intent • Search engine: • Executes the query against the web corpus + other data sources • Executes the query against the ad corpus • Displays a Search Results Page (SERP) = integration of web results, other data, and ads • Each of the search engine, advertisers, and users has its own utility

Key message: Computational advertising = a principled way to find the "best match" between a user in a context and a suitable ad.

Model 1: UBM • UBM: User Browsing Model to Predict Search Engine Click Data • Georges Dupret, Yahoo! Research Latin America • Benjamin Piwowarski, Yahoo! Research Latin America

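Search Instance (notation) • Query q • Document (URL) u / ui, shown with its title and snippet • Rank r (r=1 in the figure) • Distance d (d=1 in the figure) • Click: c=1 if the result is clicked • Examination: e=1 if the snippet is examined • Attractiveness: a=1 if the snippet is attractive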

Previous Models • The baseline hypothesis: biased • The examination hypothesis: P(e|r) is unchangeable (depends on rank only) • The cascade model: what if a session has more than one click?

Single Browsing Model • Hypothesis: • The user starts with the first result and goes down the list • For each position, the user first decides whether to look at the snippet or not • The user clicks only if the examined snippet is attractive enough • Whether or not the user clicked, the scan continues from the following position • Attractive (0/1): α_uq = P(a=1 | u, q), the attractiveness of snippet u for query q • Examination (0/1): γ_rd = P(e=1 | r, d), the probability of examination at distance d and position r

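Single Browsing Model • Model the click probability as: P(c, a, e | u, q, d, r) = P(c | a, e) · P(e | d, r) · P(a | u, q) • P(c | a, e) is deterministic: c=1 if and only if a=1 and e=1 • If c=1, then a=1 and e=1, so P(c=1 | u, q, d, r) = α_uq · γ_rd • If c=0, then P(c=0 | u, q, d, r) = 1 − α_uq · γ_rd • Here α_uq is the attractiveness of snippet u for query q and γ_rd is the examination probability at position r and distance d • Use the EM (Expectation-Maximization) algorithm to estimate α and γ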
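The EM step named on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ubm_em`, the tuple layout of `observations`, the initial value 0.5, and the fixed iteration count are all assumptions made for the example. The E-step uses the standard posteriors for a non-clicked result, P(a=1 | c=0) = α(1−γ)/(1−αγ) and P(e=1 | c=0) = γ(1−α)/(1−αγ); the M-step averages those posteriors per (query, URL) pair and per (rank, distance) pair.

```python
from collections import defaultdict


def ubm_em(observations, iterations=20, init=0.5):
    """Estimate alpha[(query, url)] and gamma[(rank, distance)] by EM.

    observations: list of (query, url, rank, distance, clicked) tuples,
    where clicked is 0 or 1. The layout is an assumption for this sketch.
    """
    alpha = defaultdict(lambda: init)  # attractiveness per (query, url)
    gamma = defaultdict(lambda: init)  # examination per (rank, distance)
    for _ in range(iterations):
        a_sum, a_cnt = defaultdict(float), defaultdict(int)
        e_sum, e_cnt = defaultdict(float), defaultdict(int)
        # E-step: posterior of a=1 and e=1 for every observation.
        for q, u, r, d, c in observations:
            a, g = alpha[(q, u)], gamma[(r, d)]
            if c:
                # A click implies the snippet was examined and attractive.
                post_a = post_e = 1.0
            else:
                no_click = max(1.0 - a * g, 1e-12)  # guard against 0/0
                post_a = a * (1.0 - g) / no_click
                post_e = g * (1.0 - a) / no_click
            a_sum[(q, u)] += post_a
            a_cnt[(q, u)] += 1
            e_sum[(r, d)] += post_e
            e_cnt[(r, d)] += 1
        # M-step: each parameter becomes the mean of its posteriors.
        for key in a_sum:
            alpha[key] = a_sum[key] / a_cnt[key]
        for key in e_sum:
            gamma[key] = e_sum[key] / e_cnt[key]
    return dict(alpha), dict(gamma)


# Toy log: u1 is always clicked at rank 1, u2 is never clicked at rank 2.
toy_log = [
    ("q", "u1", 1, 1, 1),
    ("q", "u2", 2, 1, 0),
    ("q", "u1", 1, 1, 1),
    ("q", "u2", 2, 1, 0),
]
alpha, gamma = ubm_em(toy_log)
print(alpha)
print(gamma)
```

On such a tiny log the split between α and γ for the never-clicked result is not identifiable; real logs show the same URL at several (rank, distance) pairs, which is what lets EM separate the two effects.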

Multiple Browsing Model • Query types: • Navigational, informational, looking for some specific result…. • Assumption: • Users browse the list of results differently depending on the query type • Start with M models • In which: clicked document set, skipped document set

Model 2: BBM • BBM: Bayesian Browsing Model from Petabyte-scale Data • Chao Liu, MSR-Redmond • Fan Guo, Carnegie Mellon University • Christos Faloutsos, Carnegie Mellon University

Massive Log Streams • Search log • 10+ terabytes each day (keeps increasing!) • Involves billions of distinct (query, URL) pairs • Questions • Can we infer user-perceived relevance for each (query, URL) pair? • How many passes over the data are needed? Is one enough? • Can the inference be parallelized? • Our answer: Yes, Yes, and Yes!

Exact Model Inference • For a given query • Top-M positions, usually M=10 • Positional relevance • M(M+1)/2 combinations of (r, d) pairs • n search instances • N documents impressed in total • Document relevance
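The M(M+1)/2 count can be checked by enumerating the pairs directly. The sketch below assumes the convention that r is the position of the preceding click, with r=0 meaning no click yet, and d is the distance from r to the current position, so that r + d never exceeds M. That convention and the helper name `rd_pairs` are assumptions made for illustration.

```python
def rd_pairs(M):
    """List every (r, d) with preceding-click position r in 0..M-1
    and distance d in 1..M-r, so that r + d stays within the top M."""
    return [(r, d) for r in range(M) for d in range(1, M - r + 1)]


for M in (3, 10):
    pairs = rd_pairs(M)
    # The enumeration should agree with the closed form M(M+1)/2.
    print(M, len(pairs), M * (M + 1) // 2)
```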

An Example n=3, M=3, N=4

BBM: Bayesian Browsing Model • [Figure: graphical model for one query with results URL1 to URL4; each position i has a relevance variable Si, an examination ("examine snippet") variable Ei, and a click-through variable Ci]

Dependencies in BBM • [Figure: variables S1, S2, …, Si; E1, E2, …, Ei; C1, C2, …, Ci; the annotation on Ei reads "the preceding click position before i"]

Model Inference • Ultimate goal • Observation: conditional independence

P(C|S) by Chain Rule • Likelihood of search instance • From S to R:

Putting things together • Posterior with • Re-organize by Rj’s • One count: how many times dj was not clicked when it is at position (r + d) and the preceding click is on position r • Another count: how many times dj was clicked

What This Tells Us • At most M(M+1)/2 + 1 numbers are needed to fully characterize each posterior • Count vector:

LearnBBM: One-Pass Counting • Find Rj
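A minimal sketch of the one-pass counting idea, under stated assumptions. It collects the two kinds of counts described on the earlier slide: how often each document was clicked, and how often it was skipped at position r + d when the preceding click was at position r (r=0 meaning no click yet). The names `learn_bbm_counts` and `posterior_mean`, the input layout, and the table `beta` of examination probabilities per (r, d) are hypothetical. The posterior shape used here, p(Rj) ∝ Rj^(clicks) · ∏ (1 − β_rd · Rj)^(skips at (r, d)) with a uniform prior, is an assumption that matches those counts, and β is taken as given rather than learned.

```python
from collections import defaultdict


def learn_bbm_counts(instances):
    """Single pass over search instances.

    instances: iterable of (urls, clicks), where urls is the ranked result
    list and clicks the parallel list of 0/1 flags (assumed layout).
    """
    clicked = defaultdict(int)                       # clicks per document
    skipped = defaultdict(lambda: defaultdict(int))  # skips per document and (r, d)
    for urls, clicks in instances:
        prev = 0  # position of the preceding click, 0 = none so far
        for pos, (url, c) in enumerate(zip(urls, clicks), start=1):
            if c:
                clicked[url] += 1
                prev = pos
            else:
                skipped[url][(prev, pos - prev)] += 1
    return clicked, skipped


def posterior_mean(n_clicked, skip_counts, beta, grid=1000):
    """Posterior mean of relevance R on a midpoint grid over (0, 1),
    using the assumed unnormalized density R^n * prod (1 - beta*R)^count."""
    num = den = 0.0
    for i in range(grid):
        R = (i + 0.5) / grid
        weight = R ** n_clicked
        for rd, count in skip_counts.items():
            weight *= (1.0 - beta[rd] * R) ** count
        num += R * weight
        den += weight
    return num / den


log = [(["a", "b", "c"], [0, 1, 0]), (["a", "b", "c"], [1, 0, 0])]
clicked, skipped = learn_bbm_counts(log)
beta = defaultdict(lambda: 0.5)  # placeholder examination probabilities
for url in ("a", "b", "c"):
    print(url, posterior_mean(clicked[url], skipped[url], beta))
```

Because the counts are plain sums, counts from different log shards can be added together, which is what makes the procedure map-reducible and open to incremental updates.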

Conclusions • UBM is simple; it models the user’s browsing behavior • BBM for search streams • A single pass suffices • Map-reducible for parallelism • Amenable to incremental updates • Good at mining click streams

Q&A
