Reiner Kraft rekraft@cse.ucsc

A machine learning approach to improve precision for navigational queries in a Web information retrieval system
Reiner Kraft rekraft@cse.ucsc.edu

Motivation
• Ranking of search results:
• Requires high precision rather than high recall
• Navigational queries (homepage finding task) should have the desired result on top
• Users are impatient and don’t examine low-ranked results
• Want to incorporate users’ relevance judgments to improve the overall ranking

Project Goal
• Use an on-line learning algorithm that, given a query q, finds the homepage hq
• The rank r(q,hq) should be within the top k ranked search results, where k < 20
• More ambitious: let r(q,hq) = 1
• Improve precision of the top k search results
• The algorithm design has to be space and time efficient to be of practical use

Overall setup
• On-line learning algorithm based on the weighted majority algorithm
• Predict with the weighted median for query q
• The user is the teacher and provides reinforcements:
  • Negative vote (-): document ranked too high
  • Positive vote (+): document ranked too low
• The algorithm incorporates the feedback and updates the ranking for q
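The weighted-median prediction step can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: the function name `weighted_median` and the tie rule (return the first prediction at which the cumulative weight reaches half of the total) are assumptions.

```python
def weighted_median(predictions, weights):
    """Return the prediction at which cumulative weight first reaches half the total.

    Assumed tie rule: the smallest qualifying prediction wins.
    """
    pairs = sorted(zip(predictions, weights))  # order experts by predicted rank
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    raise ValueError("need at least one expert")
```

With equal weights the middle prediction is returned; as weight shifts toward one expert, the prediction moves toward that expert's rank.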

LearnRank 1
• Uses the good-quality ranking of the search engine for query q to initialize the experts’ weights
• Uses a matrix of experts per query q
• Each expert predicts a fixed rank (linear distribution)
• Rows of experts are managed by k master algorithms (MAs), which combine the experts’ predictions
• Each MA predicts with the weighted median
• The master rank algorithm (MRA) then combines the predictions of the MAs by sorting
• Ties need to be resolved using heuristics based on votes
• MAs use a fixed multiplicative update to punish poorly performing experts
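One possible reading of the LearnRank 1 bullets, as a small sketch. Several details are assumptions that the slides do not specify: one row (MA) per document, the geometric initialization around the search engine's rank, the penalty factor `BETA`, the rule for deciding which experts a vote punishes, and net votes as the tie-breaking heuristic. All class and function names are hypothetical.

```python
BETA = 0.5  # assumed penalty factor; the slides only say "fixed multiplicative update"


def weighted_median(ranks, weights):
    """Weighted median over experts whose ranks are already in ascending order."""
    total = sum(weights)
    cumulative = 0.0
    for rank, weight in zip(ranks, weights):
        cumulative += weight
        if cumulative >= total / 2:
            return rank
    raise ValueError("empty expert row")


class MasterAlgorithm:
    """One row of the expert matrix; assumed here to track a single document."""

    def __init__(self, initial_rank, num_ranks):
        self.ranks = list(range(1, num_ranks + 1))  # expert j always predicts rank j
        # Assumed initialization: weight halves with each step away from the
        # search engine's rank, so the first prediction equals that rank.
        self.weights = [0.5 ** abs(r - initial_rank) for r in self.ranks]
        self.net_votes = 0

    def predict(self):
        return weighted_median(self.ranks, self.weights)

    def vote(self, sign):
        """sign=+1: document ranked too low; sign=-1: document ranked too high."""
        current = self.predict()
        self.net_votes += sign
        for j, r in enumerate(self.ranks):
            # A positive vote says the document belongs above the current rank,
            # so experts predicting the current rank or worse are punished.
            wrong = r >= current if sign > 0 else r <= current
            if wrong:
                self.weights[j] *= BETA


def master_rank(masters):
    """masters: dict doc_id -> MasterAlgorithm. Returns doc ids, best first.

    Ties in predicted rank are broken by net votes (an assumed heuristic).
    """
    return sorted(masters, key=lambda d: (masters[d].predict(), -masters[d].net_votes))
```

For example, with three documents initialized at ranks 1, 2 and 3, a single positive vote on the third document moves its predicted rank from 3 to 2; the resulting tie with the second document is then resolved in favor of the document that received the vote.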

The expert weight matrix Mq
Example: MA1 predicts 1, MA2 predicts 2, MA3 predicts 3.
The MRA then predicts: (d2,1), (d3,2), (d1,3)

LearnRank 2
• Uses absolute loss based on the distance to the voted rank
• Uses a shared update
• Takes some of the weight of misleading experts and distributes it among the other experts
• Better adaptability
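A share-style update of this kind might look like the sketch below. It follows one common form (a loss update followed by a loss-dependent share step) and is not necessarily the update used in LearnRank 2. The function name `shared_update`, the parameters `beta` and `alpha`, the scaling of the loss to [0, 1], and the way `voted_rank` is obtained from a vote are all assumptions.

```python
def shared_update(weights, ranks, voted_rank, beta=0.5, alpha=0.1):
    """Return new expert weights after one vote.

    weights[j] belongs to the expert that always predicts ranks[j].
    beta (loss penalty) and alpha (share rate) are assumed parameters.
    """
    n = len(weights)
    if n < 2:
        return list(weights)  # nothing to share with
    span = max(ranks) - min(ranks) or 1
    # Absolute loss: distance between the expert's rank and the voted rank, in [0, 1].
    losses = [min(abs(r - voted_rank) / span, 1.0) for r in ranks]
    # Loss update: experts far from the voted rank lose more weight.
    updated = [w * beta ** loss for w, loss in zip(weights, losses)]
    # Share step: an expert gives up more of its weight the larger its loss;
    # an expert with zero loss gives up nothing.
    shares = [(1 - (1 - alpha) ** loss) * u for u, loss in zip(updated, losses)]
    pool = sum(shares)
    # Each expert receives an equal part of what the *other* experts gave up.
    return [u - s + (pool - s) / (n - 1) for u, s in zip(updated, shares)]
```

The share step leaves the total weight unchanged and only moves it from misleading experts toward the others, which keeps every expert's weight away from zero and lets the row recover quickly if later votes point to a different rank.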

Average precision of one query over time

Average Votes Distribution

Average Precision compared to initial search engine ranking

Conclusion
• LearnRank 1 and LearnRank 2 outperform the initial search engine ranking in terms of average precision over time
• LearnRank 2 performs better because of the shared update (more adaptive)
• The algorithms are time and space efficient and can easily be implemented in search engines
