Predicting Short-Term Interests Using Activity-Based Search Context

CIKM’10 • Advisor: Jia Ling, Koh • Speaker: Yu Cheng, Hsieh

Outline • Introduction • Modeling Search Activity • Study • Conclusions

Introduction • Satisfying searchers’ information needs requires a thorough understanding of their interests, which can be inferred from: - search queries - search engine result page (SERP) clicks - post-SERP browsing behavior • Construct interest models around the current query that include: - previous queries - previous clicks on SERPs • Evaluate the predictive effectiveness of these models using future actions

Modeling Search Activity • Data - The data set contained browser logs with both searching and browsing episodes. - Log entries include a timestamp for each page view and the URL of the Web page visited. - Only logs from the English-speaking United States locale were used. - Search sessions on the Bing Web search engine were extracted.

Modeling Search Activity • ODP Labeling - Context is represented as a distribution across categories in the ODP (Open Directory Project) topical hierarchy. - This provides a consistent topical representation of queries and page visits from which to build the models. - ODP category labels can also reflect topical differences in the search results for a query or in a user’s interests. - An automatic classification technique assigns an ODP category label to each page. - The 219 categories at the top two levels of the ODP hierarchy were used (called L).

Modeling Search Activity • ODP Labeling - Strategy for labeling a page: 1. Begin with URLs present in the ODP. 2. Incrementally prune non-present URLs until a match is found or a miss is declared. 3. Check for an exact match with the logistic regression classifier.
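The backoff lookup in steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ODP_LABELS` table, its entries, and the function name `lookup_with_backoff` are hypothetical, and pruning one trailing path segment at a time is an assumption about how the incremental pruning proceeds.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: URL prefix (host + path) -> ODP category label.
ODP_LABELS = {
    "example.com/sports/golf": "Sports/Golf",
    "example.com": "Business/News",
}


def lookup_with_backoff(url, table=ODP_LABELS):
    """Return the label of the longest URL prefix found in `table`, or None."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Try the full path first, then drop one trailing segment at a time
    # (assumed pruning order) until only the host name is left.
    for end in range(len(segments), -1, -1):
        candidate = "/".join([parts.netloc] + segments[:end])
        if candidate in table:
            return table[candidate]
    return None  # miss declared: no prefix of the URL is in the table


print(lookup_with_backoff("http://example.com/sports/golf/clubs.html"))
print(lookup_with_backoff("http://example.com/weather/today"))
print(lookup_with_backoff("http://unknown.org/page"))
```

A page that ends in a miss would then be handed to the classifier mentioned in step 3.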

Modeling Search Activity • Sources and Source Combinations - ODP labels are automatically assigned to the following sources: 1. Query: the top 10 search results for the query 2. SERPClick: the search results clicked by the user during the search session 3. NavTrail: Web pages that the user visits following a SERP click

Modeling Search Activity • Model Definitions – Query Model (Q) - For each query, the category labels for the top 10 search results were obtained. - Probabilities are assigned to the categories in L using: 1. normalized click frequencies for each of the top 10 results, taken from search-engine click log data 2. the distribution across all ODP category labels - ODP categories in L that are not used to label any result are assigned their prior probabilities
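One way to picture the query model is as a click-weighted vote over the labels of the top results, with unused categories falling back to a prior. The sketch below is only an illustration under stated assumptions: the function name `query_model`, the example categories, and the final renormalization step are not taken from the slides.

```python
from collections import defaultdict


def query_model(result_labels, click_weights, prior):
    """Build a distribution over the categories in `prior` for one query.

    result_labels: ODP label of each top-ranked result (assumed to be in L)
    click_weights: normalized click frequency of each result
    prior:         dict mapping every category in L to its prior probability
    """
    mass = defaultdict(float)
    for label, weight in zip(result_labels, click_weights):
        mass[label] += weight
    # Categories that label no result receive their prior probability.
    scores = {c: (mass[c] if c in mass else prior[c]) for c in prior}
    # Renormalize so the scores sum to 1 (an assumption of this sketch).
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


prior = {"Sports/Golf": 0.4, "Arts/Music": 0.4, "Health/Fitness": 0.2}
q = query_model(
    ["Sports/Golf", "Sports/Golf", "Arts/Music"], [0.5, 0.3, 0.2], prior
)
print(q)
```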

Modeling Search Activity • Model Definitions – Context Model (X) - The context model is constructed from the previous actions in the session, namely: 1. Queries 2. Web pages visited through a SERP click 3. Web pages visited on the navigational trail following a SERP click

Modeling Search Activity • Model Definition – Intent Model(I)

Modeling Search Activity • Relevance Model or Ground Truth (R) - The relevance model contains actions that occur following the current query in the session

Study

Study • Learning Optimal Context Weights - Steps: 1. Identify the optimal context weight (w) for each query on a held-out training set. 2. Create features for the query and the context that could be useful in predicting w.

Study • Learning Optimal Context Weights - To create a training set, the query, context, and relevance models were used to compute the optimal context weight per query by minimizing the regularized cross-entropy for each query independently.

Study • Learning Optimal Context Weights - The objective includes a regularizer that penalizes deviations from w = 0.5.
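The per-query optimization can be sketched as a one-dimensional search over w. The transcript does not preserve the exact formulas, so several pieces here are assumptions: the intent model is taken to be the linear mixture w·X + (1 − w)·Q, the regularizer is taken to be a quadratic penalty `lam * (w - 0.5) ** 2`, and a simple grid search stands in for whatever optimizer was actually used. All function and parameter names are hypothetical.

```python
import math


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of `predicted` measured against the `target` distribution."""
    return -sum(p * math.log(predicted[c] + eps) for c, p in target.items())


def intent_model(query, context, w):
    """Assumed form: linear mixture of context (weight w) and query (1 - w)."""
    return {c: w * context[c] + (1.0 - w) * query[c] for c in query}


def optimal_context_weight(query, context, relevance, lam=0.1, steps=101):
    """Grid-search the w in [0, 1] that minimizes the regularized cross-entropy."""
    best_w, best_loss = 0.5, float("inf")
    for i in range(steps):
        w = i / (steps - 1)
        loss = cross_entropy(relevance, intent_model(query, context, w))
        loss += lam * (w - 0.5) ** 2  # assumed penalty for leaving w = 0.5
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w


query = {"A": 0.9, "B": 0.1}
context = {"A": 0.1, "B": 0.9}
# Future actions that resemble the context should pull w above 0.5.
print(optimal_context_weight(query, context, relevance=context))
```

When the query and context models are identical, the cross-entropy term does not depend on w, so the regularizer alone decides and the search returns 0.5.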

Study • Generating Features of Query and Context - Features are divided into three classes: 1. Query class: captures characteristics of the current query and the query model. 2. Context class: captures aspects of the pre-query interaction behavior as well as features of the context model itself. 3. QueryContext class: captures how the query model and the context model compare. - These features were generated for each session in the set and used to train a predictive model.

Study • Generating Features of Query and Context - Query class

Study • Generating Features of Query and Context - Context class

Study • Generating Features of Query and Context - QueryContext class

Study • Predicting the Optimal Context Weight - 60% of the queries were used for training, 20% for validation, and 20% for testing. - 10-fold cross-validation was performed to improve the reliability of the results. - The folds were constructed by splitting on sessions, so that all queries in a session are used for exactly one of training, validation, or testing.
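Splitting on sessions rather than on individual queries can be sketched as below. This is an illustration only: the function name `split_by_session`, the session IDs, and the fixed random seed are hypothetical, and applying the 60/20/20 proportions to sessions only approximates the same proportions over queries.

```python
import random


def split_by_session(session_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Map each distinct session ID to 'train', 'validation', or 'test'."""
    unique = sorted(set(session_ids))
    random.Random(seed).shuffle(unique)
    n_train = round(fractions[0] * len(unique))
    n_valid = round(fractions[1] * len(unique))
    assignment = {}
    for i, sid in enumerate(unique):
        if i < n_train:
            assignment[sid] = "train"
        elif i < n_train + n_valid:
            assignment[sid] = "validation"
        else:
            assignment[sid] = "test"
    return assignment


# 30 queries spread over 10 sessions, three queries per session.
query_sessions = ["s%d" % (i // 3) for i in range(30)]
assignment = split_by_session(query_sessions)
# Every query inherits the split of its session, so no session is divided.
query_splits = [assignment[sid] for sid in query_sessions]
print(query_splits)
```

Repeating the procedure with different seeds, or rotating which sessions fall in each part, would give the multiple folds mentioned above.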

Study • Predicting the Optimal Context Weight - The most predictive features relate to the information divergence between the query model and the context model.
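A divergence feature of this kind can be computed directly from the two category distributions. The slides do not say which divergence measure was used; the sketch below assumes Kullback-Leibler divergence, and the function name and example distributions are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) between two distributions over the same categories."""
    return sum(
        pc * math.log((pc + eps) / (q[c] + eps)) for c, pc in p.items() if pc > 0
    )


query_dist = {"Sports/Golf": 0.7, "Arts/Music": 0.2, "Health/Fitness": 0.1}
context_dist = {"Sports/Golf": 0.2, "Arts/Music": 0.2, "Health/Fitness": 0.6}

# A larger value means the current query departs further from the session context.
print(kl_divergence(query_dist, context_dist))
```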

Study • Predicting the Optimal Context Weight

Study • Varying Context and Relevance Information

Conclusions • Studied the effectiveness of activity-based context in predicting users’ search interests. • Explored the value of modeling the current query, its context, and their combination, using different sources. • Intent models developed from many sources perform best overall. • Developed techniques to learn the optimal combinations.
