Search Query Disambiguation from Short Sessions • Lilyana Mihalkova & Raymond Mooney • The University of Texas at Austin
Query Disambiguation • Example query: "scrubs" (intended meaning unknown)
Existing Work • Well-studied problem: [e.g., Sugiyama et al. 04, Sun et al. 05, Dou et al. 07] • Common Assumption: Information about each user is available over a relatively long period of time.
Privacy Concerns • NY Times: “A Face is Exposed for AOL Searcher no. 4417749” • [Conti, 06]: “Googling Considered Harmful”
Pragmatic Concerns • Identifying users across search sessions • Log-in? • IP Address? • Managing and protecting user-specific information
Proposed Setting • Base personalization only on short-term search histories • complete search histories cannot be reconstructed • Relate current session to previous short sessions of other users, based on the search activity in these sessions
How Short is Short-Term? [Figure: histogram. x-axis: number of queries before ambiguous query; y-axis: number of sessions with that many queries]
Is This Enough Info? • Session 1: 98.7 fm → www.star987.com; kroq → www.kroq.com; scrubs → ??? • Session 2: huntsville hospital → www.huntsvillehospital.com; ebay.com → www.ebay.com; scrubs → ??? • Candidate results for "scrubs": scrubs.com, scrubs-tv.com
More Closely Related Work • [Almeida & Almeida 04]: Similar assumption of short sessions, but better suited for a specialized search engine (e.g. on computer science literature) • [Krause & Horvitz 08]: Explicitly models the tradeoff between better performance and more user information.
Main Challenge • How to harness this small amount of potentially noisy information available for each user? • Exploit the relations among users, sessions, URLs • Use statistical relational learning (SRL) [Getoor & Taskar 07]
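Using Relational Information [Figure: active session (huntsville hospital → huntsvillehospital.org; ebay → ebay.com; scrubs → ???) linked to background sessions that contain huntsville school, hospitallink.com, ebay.com, and searches for scrubs followed by clicks on scrubs.com or scrubs-tv.com]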
Details • Used Markov logic networks (MLNs) [Richardson & Domingos 06] • MLN structure is provided as domain knowledge • Weights are learned from the data • Weight learning: Adapted contrastive divergence [Lowd & Domingos 07] for incremental learning
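For reference, the standard MLN distribution from [Richardson & Domingos 06] is given below. Here F_i are the first-order formulas (the "structure" supplied as domain knowledge), w_i are their learned weights, n_i(x) is the number of true groundings of F_i in the possible world x, and Z is the normalizing constant. Reading the groundings of the query predicate as the variables being inferred is an interpretation of this deck, not something the slide states.

```latex
P(X = x) \;=\; \frac{1}{Z} \exp\!\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z \;=\; \sum_{x'} \exp\!\Big( \sum_{i} w_i \, n_i(x') \Big)
```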
Predicates • Evidence predicates • provide information about clicked URLs and keywords shared between sessions, i.e. • shares-keyword-between-clicks(ActiveS, BackgroundS, keyword) • shares-keyword-between-click-and-search(ActiveS, BackgroundS, keyword) • shares-clicks(ActiveS, BackgroundS, hostname) • provide information about clicked URLs and keywords in the current session • Query predicate • states that the user will choose a particular URL • clicks-on(ActiveS, hostname)
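A minimal sketch of how the evidence predicates above could be grounded for one pair of sessions. The `Session` type, the tokenizer, the `STOP_TOKENS` list, and the direction of the click-and-search predicate (active click, background search) are all assumptions made for illustration; the slides do not say how keywords are extracted.

```python
import re
from typing import NamedTuple


class Session(NamedTuple):
    """Hypothetical container: what one short session looks like."""
    searches: list[str]  # query strings issued in the session
    clicks: list[str]    # hostnames clicked in the session


# Assumed, illustrative list of hostname tokens that carry no meaning.
STOP_TOKENS = {"www", "com", "org", "net"}


def keywords(text: str) -> set[str]:
    """Split a query or hostname into lowercase alphanumeric tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_TOKENS}


def all_keywords(texts: list[str]) -> set[str]:
    """Union of the keywords of several queries or hostnames."""
    found: set[str] = set()
    for text in texts:
        found |= keywords(text)
    return found


def evidence_atoms(active_id: str, active: Session,
                   background_id: str, background: Session) -> set[tuple]:
    """Ground the three evidence predicates for one (active, background) pair."""
    atoms = set()
    # shares-clicks: the same hostname was clicked in both sessions.
    for host in set(active.clicks) & set(background.clicks):
        atoms.add(("shares-clicks", active_id, background_id, host))
    active_click_kw = all_keywords(active.clicks)
    # shares-keyword-between-clicks: a keyword occurs in clicks of both sessions.
    for kw in active_click_kw & all_keywords(background.clicks):
        atoms.add(("shares-keyword-between-clicks", active_id, background_id, kw))
    # shares-keyword-between-click-and-search: assumed to mean a keyword of an
    # active-session click that also occurs in a background-session search.
    for kw in active_click_kw & all_keywords(background.searches):
        atoms.add(("shares-keyword-between-click-and-search",
                   active_id, background_id, kw))
    return atoms
```

Each returned tuple corresponds to one ground atom, with the predicate name first and its arguments after it.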
Re-Ranking of Search Results • Search engine produces a list of search results • For each possible search result R, compute the probability that clicks-on(ActiveS, R) • Rank the search results by their likelihood of being clicked
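The re-ranking step itself is small once a click probability is available. In the sketch below, `click_probability` is a hypothetical stand-in for MLN inference over clicks-on(ActiveS, R); the probabilities in the usage example are made up.

```python
from typing import Callable


def rerank(results: list[str],
           click_probability: Callable[[str], float]) -> list[str]:
    """Order search results from most to least likely to be clicked.

    `click_probability` is a placeholder for whatever model supplies
    P(clicks-on(ActiveS, R)); ties keep the search engine's original order
    because Python's sort is stable.
    """
    return sorted(results, key=click_probability, reverse=True)


# Usage with invented probabilities for the "scrubs" example.
example_probs = {"scrubs.com": 0.2, "scrubs-tv.com": 0.7}
reranked = rerank(["scrubs.com", "scrubs-tv.com"], example_probs.get)
```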
MLN 1 • User will click on at least one result • User will select result chosen by previous user with whom a click is shared [Figure: example sessions with labels: some query, ambiguous query, www.someplace1.com, www.clickedResult.com]
MLN 2 • MLN 1 + • User will select result chosen by previous user with whom a keyword is shared • click-to-click, click-to-search, search-to-click, search-to-search [Figure: example sessions with labels: some query, some other, ambiguous query, www.someplace1.com, www.aClick.com, www.clickedResult.com]
MLN 3 • MLN 2 + • User will choose result that shares a keyword with a previous search or click in the current session [Figure: current session with labels: some query, www.someplace1.com, ambiguous query; candidate results www.someResult.com, www.anotherPossibility.com, www.yetAnother.com]
Data • Collected from the MSN engine in May 2006 • Contains time-stamped records of searches and clicked URLs, grouped by sessions • Average session length is 3.28 • No across-session identifiers • Used first 25 days for training/validation and last 6 days for test
Data Limitation #1 • Data does not specify which queries are ambiguous • Consider a query ambiguous if, among all pages clicked after searching for it, at least 2 fall in different high-level categories in the DMOZ (dmoz.org) hierarchy • Limit to query strings of up to two words (43.7%) • 6,360 ambiguous queries (2.4% of all two-word query strings)
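The labelling rule above can be sketched as follows. The `top_category` mapping is a hypothetical stand-in for a DMOZ lookup, the category names in the usage example are invented, and ignoring pages that have no category is an assumption.

```python
def is_ambiguous(query: str,
                 clicked_hosts: set[str],
                 top_category: dict[str, str]) -> bool:
    """Apply the slide's heuristic to one query string.

    clicked_hosts: every page clicked (by anyone) after searching for `query`.
    top_category:  hypothetical map from a page to its high-level category.
    """
    # Only query strings of up to two words are considered.
    if len(query.split()) > 2:
        return False
    # Pages without a known category are skipped (an assumption).
    categories = {top_category[h] for h in clicked_hosts if h in top_category}
    # Ambiguous when clicked pages span at least two different categories.
    return len(categories) >= 2


# Usage with invented category labels.
example_categories = {"scrubs.com": "Shopping", "scrubs-tv.com": "Arts"}
scrubs_is_ambiguous = is_ambiguous(
    "scrubs", {"scrubs.com", "scrubs-tv.com"}, example_categories)
```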
Data Limitation #2 • Data does not provide the full list of search results presented to the user; only the ones actually clicked • Assume that the URLs seen by the user are those clicked by at least one person after searching for the exact query string • Consequence: result sets have differing lengths
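A small sketch of the result-set assumption above: the candidates for a query are all URLs that anyone clicked after issuing that exact query string. The `(query, clicked_host)` log format is an assumption about how the records are represented.

```python
from collections import defaultdict


def build_result_sets(click_log: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Map each exact query string to the set of hosts clicked after it.

    click_log: (query, clicked_host) pairs pooled over all sessions.
    No normalization is applied, so "Scrubs" and "scrubs" stay separate.
    """
    result_sets: dict[str, set[str]] = defaultdict(set)
    for query, host in click_log:
        result_sets[query].add(host)
    return dict(result_sets)


# Usage: result sets end up with different sizes per query.
example_log = [
    ("scrubs", "scrubs.com"),
    ("scrubs", "scrubs-tv.com"),
    ("scrubs", "scrubs.com"),
    ("ebay", "www.ebay.com"),
]
example_result_sets = build_result_sets(example_log)
```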
Result Set Sizes [Figure: histogram. x-axis: size of result set for ambiguous query; y-axis: number of queries with that result set size]
Evaluation Metrics: MAP • Mean average precision • identical to the area under the interpolated precision/recall curve
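As a concrete reading of the metric, the sketch below computes average precision for one ranked list and then averages it over queries. Treating the clicked results as the "relevant" items is an assumption about how the metric is applied here.

```python
def average_precision(ranking: list[str], relevant: set[str]) -> float:
    """Mean of precision@k taken at the rank k of each relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # Relevant items missing from the ranking contribute zero.
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(cases: list[tuple[list[str], set[str]]]) -> float:
    """Average the per-query average precision over all test queries."""
    if not cases:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in cases) / len(cases)
```

For example, ranking `["a", "b", "c"]` with relevant items `{"a", "c"}` has precision 1/1 at rank 1 and 2/3 at rank 3, so its average precision is (1 + 2/3) / 2.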
Evaluation Metrics: AUC-ROC • Area under the ROC curve • identical to the mean average true negative rate
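The "mean average true negative rate" reading can be made concrete: for each positive (clicked) result, take the fraction of negatives ranked below it, then average over the positives. A minimal sketch, assuming a strict ranking with no tied scores:

```python
def auc_from_ranking(ranking: list[str], positives: set[str]) -> float:
    """AUC-ROC of a ranked list, computed as the mean true negative rate.

    For each positive item, the true negative rate at its position is the
    share of negatives ranked below it; averaging over positives gives the
    area under the ROC curve when there are no ties.
    """
    n_neg = sum(1 for item in ranking if item not in positives)
    n_pos = len(ranking) - n_neg
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    negatives_seen = 0
    tnr_sum = 0.0
    for item in ranking:
        if item in positives:
            # Negatives not yet seen are ranked below this positive.
            tnr_sum += (n_neg - negatives_seen) / n_neg
        else:
            negatives_seen += 1
    return tnr_sum / n_pos
```

This mirrors MAP, which averages precision at each positive; here the averaged quantity is the true negative rate instead.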
Baselines • Random: Rank randomly • Click-Sim: Rank by similarity based on shared clicks • Click-KW-Sim: Rank by similarity based on shared clicks and keywords
Click-Sim • Average similarity based on shared clicks [Figure: active session (huntsville hospital → huntsvillehospital.org; scrubs → ???) compared with background sessions that searched for scrubs and clicked scrubs.med or scrubs.tv]
Click-KW-Sim • Average similarity based on shared clicks and keywords [Figure: active session (huntsville hospital → huntsvillehospital.org; scrubs → ???) compared with background sessions that searched for scrubs and clicked scrubs.med or scrubs.tv]
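One way the Click-Sim idea could be sketched: score each candidate result by the average similarity between the active session and the background sessions that chose that result. The slides do not define the similarity function, so Jaccard overlap of click sets is used here purely as an assumption, and the `(prior_clicks, chosen_result)` representation is hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two click sets; an assumed similarity, not the authors'."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def click_sim_scores(active_clicks: set[str],
                     background: list[tuple[set[str], str]]) -> dict[str, float]:
    """Average click similarity per candidate result.

    background: for each earlier session that issued the same ambiguous query,
    the set of hosts it clicked beforehand and the result it then chose.
    """
    sims: dict[str, list[float]] = {}
    for prior_clicks, chosen in background:
        sims.setdefault(chosen, []).append(jaccard(active_clicks, prior_clicks))
    return {result: sum(v) / len(v) for result, v in sims.items()}


# Usage with made-up background sessions for the "scrubs" example.
example_scores = click_sim_scores(
    {"huntsvillehospital.org"},
    [
        ({"huntsvillehospital.org"}, "scrubs.med"),
        ({"ebay.com"}, "scrubs.med"),
        ({"kroq.com"}, "scrubs.tv"),
    ],
)
```

A Click-KW-Sim variant would, under the same caveats, add shared keywords to the similarity before averaging.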
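Results (MAP) [Figure: MAP results chart; four entries marked with *]
Results (AUC-ROC) [Figure: AUC-ROC results chart; three entries marked with *]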
Current/Future Work • Incorporating more information in the models • Actual content of clicked pages • Popularity of pages • Weighing evidence based on how close it is in time to ambiguous query • Learning separate weights for each connecting keyword or domain/group of keywords or domains • Revising the provided clauses
• The popularity of a possible result provides a strong signal, but providing relational information on top of popularity gives further performance improvements • Rank by popularity + Click-KW-Sim baseline: MAP 0.383, AUC-ROC 0.536 • Rank by popularity only: MAP 0.380, AUC-ROC 0.525
[Figure: histogram. x-axis: number of distinct clicks before ambiguous query; y-axis: number of sessions with that many clicks]