A machine learning approach to improve precision for navigational queries in a Web information retrieval system
Reiner Kraft
rekraft@cse.ucsc.edu
Motivation
• Ranking of search results:
  • High precision matters more than recall
  • Navigational queries (homepage finding task) should have the desired result on top
  • Users are impatient and don't examine low-ranked results
• Want to incorporate users' relevance judgments to improve the overall ranking
Project Goal
• Use an on-line learning algorithm that, given a query q, finds the homepage h_q
• The rank r(q, h_q) should be within the top k ranked search results, where k < 20
• More ambitious: let r(q, h_q) = 1
• Improve precision of the top k search results
• Algorithm design has to be space- and time-efficient to be of practical use
Overall setup
• On-line learning algorithm based on the weighted majority algorithm
• Predict with the weighted median for query q
• The user is the teacher and provides reinforcements:
  • Negative vote (-): document ranked too high
  • Positive vote (+): document ranked too low
• The algorithm incorporates the feedback and updates the ranking for q
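The weighted-median prediction mentioned above can be sketched as follows. This is a minimal illustration, not the talk's implementation: the function name `weighted_median` and the tie rule (take the lower median) are assumptions.

```python
def weighted_median(predictions, weights):
    """Return the prediction at which the cumulative weight first
    reaches half of the total weight (lower weighted median)."""
    if not predictions or len(predictions) != len(weights):
        raise ValueError("need equally many predictions and weights, at least one")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    cumulative = 0.0
    # Walk through the predictions in increasing order, accumulating weight.
    for prediction, weight in sorted(zip(predictions, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return prediction
    return max(predictions)  # only reachable through floating-point rounding
```

With three experts predicting ranks 1, 2 and 3, equal weights give the middle rank 2, while `weighted_median([1, 2, 3], [5, 1, 1])` returns 1 because the first expert holds more than half of the total weight.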
LearnRank 1
• Use the good-quality ranking of the search engine for query q to initialize the experts' weights
• Uses a matrix of experts per query q
• Each expert predicts a fixed rank (linear distribution)
• Rows of experts are managed by k master algorithms (MAs), which combine the experts' predictions
• Each MA predicts with the weighted median
• The master rank algorithm (MRA) then combines the MAs' predictions by sorting
• Ties need to be resolved using heuristics based on votes
• MAs use a fixed multiplicative update to punish poorly performing experts
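The pieces listed for LearnRank 1 might fit together roughly as in the sketch below. The slides do not give the details, so several choices here are assumptions: one MA per document, a geometric initialization centred on the search engine's rank, the penalty `BETA = 0.5`, the rule for which experts a vote punishes, and tie-breaking by net votes. All class and function names are hypothetical.

```python
BETA = 0.5  # assumed fixed multiplicative penalty, 0 < BETA < 1


def weighted_median(ranks, weights):
    """Lower weighted median of the experts' fixed rank predictions."""
    total = sum(weights)
    cumulative = 0.0
    for rank, weight in sorted(zip(ranks, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return rank
    return ranks[-1]


class MasterAlgorithm:
    """One row of the expert matrix: the experts for a single document."""

    def __init__(self, initial_rank, k):
        self.ranks = list(range(1, k + 1))  # expert j always predicts rank j
        # Assumed initialization: weight halves with each step away from
        # the search engine's rank, so the MA starts out predicting it.
        self.weights = [0.5 ** abs(r - initial_rank) for r in self.ranks]
        self.votes = 0  # net votes, used only to break ties

    def predict(self):
        return weighted_median(self.ranks, self.weights)

    def vote(self, positive):
        current = self.predict()
        self.votes += 1 if positive else -1
        for j, rank in enumerate(self.ranks):
            # "+" vote: document ranked too low, punish experts at or below it.
            # "-" vote: document ranked too high, punish experts at or above it.
            wrong = rank >= current if positive else rank <= current
            if wrong:
                self.weights[j] *= BETA


def master_rank(mas):
    """MRA: sort documents by MA prediction; ties go to more net votes."""
    order = sorted(mas, key=lambda doc: (mas[doc].predict(), -mas[doc].votes, doc))
    return [(doc, position) for position, doc in enumerate(order, start=1)]
```

For example, with `mas = {"d1": MasterAlgorithm(1, 3), "d2": MasterAlgorithm(2, 3), "d3": MasterAlgorithm(3, 3)}`, a single negative vote `mas["d1"].vote(positive=False)` moves d1's prediction from rank 1 to rank 2, and `master_rank(mas)` then places d2 ahead of d1 because d1 carries the negative vote.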
The expert weight matrix M_q
Example:
MA1 predicts: 1
MA2 predicts: 2
MA3 predicts: 3
The MRA then predicts: (d2,1), (d3,2), (d1,3)
LearnRank 2
• Uses an absolute loss based on the distance to the voted rank
• Uses a shared update
• Takes some of the weight of misleading experts and distributes it among the other experts
• Better adaptability
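One common way to realize such a shared update is the fixed-share scheme: a loss update followed by a step in which every expert passes a fraction of its weight to the others. The sketch below assumes that scheme; the slides do not say which variant was used. The function name `shared_update`, the parameters `beta` and `alpha`, and the normalization of the absolute loss to [0, 1] are all assumptions.

```python
def shared_update(ranks, weights, voted_rank, beta=0.5, alpha=0.1):
    """Return new expert weights after one vote.

    ranks[j] is the fixed rank predicted by expert j, and voted_rank is the
    rank the feedback points to (how it is derived from votes is assumed).
    """
    n = len(ranks)
    if n < 2:
        raise ValueError("sharing needs at least two experts")
    span = max(ranks) - min(ranks) or 1
    # Loss update: absolute loss is the normalized distance to the voted rank,
    # so experts far from the voted rank lose the most weight.
    updated = [
        w * beta ** (abs(r - voted_rank) / span)
        for r, w in zip(ranks, weights)
    ]
    # Share update: each expert gives away a fraction alpha of its weight,
    # split evenly among the other n - 1 experts. Total weight is preserved.
    pool = alpha * sum(updated)
    return [(1 - alpha) * w + (pool - alpha * w) / (n - 1) for w in updated]
```

Because every expert always receives a share of the pool, an expert whose weight has decayed to almost zero can recover quickly once the votes start to favor its rank, which is one reading of the "better adaptability" point above.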
Conclusion
• LearnRank 1 and LearnRank 2 outperform the initial search engine ranking in terms of average precision over time
• LearnRank 2 performs better because of the shared update (more adaptive)
• The algorithms are time- and space-efficient and can easily be implemented in search engines