Task-aware query recommendation. Henry Feild, James Allan. Center for Intelligent Information Retrieval, University of Massachusetts Amherst. July 29, 2013, Dublin, Ireland.
Query: when was the dsec created
Related searches: dsec info; dsec delaware solar energy coalition created; deaf smith electric cooperative created; dayton service engineers collaborative created
Results:
- Delaware Solar Energy Coalition: The Delaware Solar Energy Coalition (DSEC) was founded… http://delsec.org
- Deaf Smith Electric Cooperative – Home: The Deaf Smith Electric Cooperative (DSEC). http://dsec.org

Previous query: what is the dupont science essay contest
Query: when was the dsec created
Related searches: dupontchallenge information; dupont essay submission; dupont challenge history; deaf smith electric cooperative created; dayton service engineers collaborative created
Results:
- DuPont Challenge: The DuPont Challenge calls on students to research, think critically… http://thechallenge.dupont.com
- DuPont: For more than 200 years, DuPont has brought world-class science and engineering… http://www.dupont.com
Potential benefits of context
- Find key terms/phrases
- Disambiguation
- Term expansion
Example queries shown on the slide: crisis intervention professional services crisis intervention services object management group andrewwatson watsonomg elliptical trainer elliptical trainer benefits michworks michigan unemployed
Likelihood of x-length tasks
57% of AOL tasks consist of 2 or more queries.
Based on a sample of 503 AOL users; tasks were manually labeled by UMass students.
A couple of issues
- Interleaved tasks
- Multi-tasking within a fixed window
Example query sequence: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
You are more likely than not to encounter off-task queries, even going back one query!
- Aggregated over 503 AOL users
- 3 days of Yahoo! data
- 3 months of AOL data
- Jones & Klinkner (CIKM 2009)
Ideal system
- Identify on-task queries
- Ignore off-task queries
Example: the interleaved sequence pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits separates into two tasks:
- pampered chef; hosting pampered chef
- elliptical trainer; elliptical trainer benefits
Research questions
1. Does on-task context help?
2. How much does off-task context hurt?
3. How well does current technology deal with mixed contexts?
Example context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Basic models
Reference query only
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Term-Query Graph Query Recommender (Bonchi et al. SIGIR’12)
Recommendations:
0.6 the benefits of an elliptical trainer
0.5 image 8.0 elliptical trainer
0.4 total trainer pro
0.4 total trainer pro reviews
Basic models
Context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Reference query only:
0.6 the benefits of an elliptical trainer
0.5 image 8.0 elliptical trainer
0.4 total trainer pro
0.4 total trainer pro reviews
Decay model
Decay weights shown on the slide: λ: 0.4, λ: 0.6, λ: 0.8, λ: 1.0
Recommendation lists shown on the slide, in slide order:
- 0.7 pampered chef merchandise; 0.4 pampered chef parties; 0.3 cooking supplies; 0.3 pampered chef stuff
- 0.6 orbitrek elliptical trainer; 0.4 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.2 total trainer pro reviews
- 0.6 pampered chef parties; 0.3 pampered chef parties; 0.2 cooking supplies; 0.2 pampered chef stuff
- 0.6 the benefits of an elliptical trainer; 0.5 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.4 elliptical trainer vs treadmill
- 0.2 elliptical trainer vs treadmill; 0.2 the benefits of an elliptical trainer; 0.1 total trainer pro; 0.1 orbitrek elliptical trainer
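One way to read the decay model: each query in the context contributes its own recommendation list, and the lists are merged with weights that shrink for older queries. The sketch below illustrates that reading only. The function name `merge_recommendations`, the weighted-sum rule, and the toy scores are assumptions for illustration, not details taken from the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def merge_recommendations(
    per_query_recs: List[Dict[str, float]],
    weights: List[float],
) -> List[Tuple[str, float]]:
    """Merge per-query recommendation lists by a weighted sum (assumed rule)."""
    if len(per_query_recs) != len(weights):
        raise ValueError("need exactly one weight per context query")
    combined: Dict[str, float] = defaultdict(float)
    for recs, weight in zip(per_query_recs, weights):
        for suggestion, score in recs.items():
            combined[suggestion] += weight * score
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Toy data (hypothetical scores): oldest query first, reference query last.
recs = [
    {"pampered chef parties": 0.5, "cooking supplies": 0.25},
    {"the benefits of an elliptical trainer": 0.5, "total trainer pro": 0.25},
]
ranked = merge_recommendations(recs, weights=[0.5, 1.0])
for suggestion, score in ranked:
    print(f"{score:.3f} {suggestion}")
```

With a lower weight on the older query, its suggestions can still surface, but they rank below equally scored suggestions for the reference query.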
Task-aware models
Same task? (Lucchese et al. WSDM’11)
Example context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Hard task threshold model
- weight = task_decay (decay over on-task queries only)
- values shown on the slide: 0.0, 0.01, 0.01, 1.0, 0.90, 0.90, 0.0, 0.02, 0.02, 1.0, 1.0
Soft task threshold model
- weight = decay × same_task_score
- values shown on the slide: λ: 0.1, 0.01, λ: 0.8, λ: 0.5, 0.90, λ: 0.1, 0.02, λ: 1.0, λ: 1.0, 1.0
Task-aware models (cont’d)
Example context: pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
Firm task threshold model 1
- weight = decay × same_task_score if on-task; 0 otherwise (soft task score for on-task queries; 0 for off-task queries)
Firm task threshold model 2
- weight = task_decay × same_task_score
Values shown on the slide: 0.0, 0.0, 0.0, 0.0, 0.01, 0.01, 0.0, 0.7, 0.5, 0.90, 0.90, 0.0, 0.02, 0.02, 1.0, 1.0, 1.0, 1.0
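The weighting rules of the decay, hard, soft, and firm models can be compared side by side in a short sketch. The slides give only the formulas (for example, weight = decay × same_task_score), so several details here are assumptions: the exponential form `beta ** distance` for the decay, the on-task threshold `tau`, all function and parameter names, and the pairing of the same-task scores 0.01, 0.90, 0.02, 1.0 with the four example queries.

```python
from typing import List


def decay_weights(n: int, beta: float) -> List[float]:
    """Decay by distance from the reference query (last item gets 1.0). Assumed form."""
    return [beta ** (n - 1 - i) for i in range(n)]


def task_decay_weights(on_task: List[bool], beta: float) -> List[float]:
    """Decay counted over on-task queries only; off-task queries get 0."""
    weights = [0.0] * len(on_task)
    rank = 0  # number of on-task queries seen so far, walking back from the reference query
    for i in range(len(on_task) - 1, -1, -1):
        if on_task[i]:
            weights[i] = beta ** rank
            rank += 1
    return weights


def context_weights(same_task: List[float], model: str,
                    beta: float = 0.5, tau: float = 0.5) -> List[float]:
    """Weight per context query (oldest first) under one of the five models."""
    on_task = [score >= tau for score in same_task]  # tau is a hypothetical threshold
    decay = decay_weights(len(same_task), beta)
    task_decay = task_decay_weights(on_task, beta)
    if model == "decay":
        return decay
    if model == "hard":
        return task_decay
    if model == "soft":
        return [d * s for d, s in zip(decay, same_task)]
    if model == "firm1":
        return [d * s if t else 0.0 for d, s, t in zip(decay, same_task, on_task)]
    if model == "firm2":
        return [td * s for td, s in zip(task_decay, same_task)]
    raise ValueError(f"unknown model: {model}")


# pampered chef; elliptical trainer; hosting pampered chef; elliptical trainer benefits
scores = [0.01, 0.90, 0.02, 1.0]
for name in ("decay", "hard", "soft", "firm1", "firm2"):
    print(name, context_weights(scores, name))
```

In this sketch the hard and firm models give the two pampered chef queries zero weight, while the soft model still lets them contribute a small amount and the plain decay model ignores task membership altogether.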
Data and evaluation
Query recommendations: 2006 AOL search log used to build a Term-Query Graph
Tasks: 2010-2011 TREC Session Track
On-task contexts
- 212 judged sessions
- 2+ queries/session
- task = session
Off-task contexts
Mixed contexts
Evaluation
- When’s the soonest a user can find a novel document?
- MRR over documents retrieved with the top scoring recommendation (ClueWeb ‘09)
- removed documents retrieved in top 10 of context queries
Example recommendations: 0.6 the benefits of an elliptical trainer; 0.5 image 8.0 elliptical trainer; 0.4 total trainer pro; 0.4 elliptical trainer vs treadmill
Example ClueWeb results: elliptical-trainer.com; sportsequipment.com; totaltrainer.com; …
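The evaluation measure can be sketched as a reciprocal rank computed after dropping documents the user has already seen. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and it assumes that removed documents do not occupy a rank, so later documents move up. Which documents count as relevant in the usage example is invented for the demonstration.

```python
from typing import Iterable, List, Set


def reciprocal_rank_novel(ranked_docs: List[str], relevant: Set[str],
                          seen: Set[str]) -> float:
    """Reciprocal rank of the first relevant document among the unseen ones."""
    novel = [doc for doc in ranked_docs if doc not in seen]
    for rank, doc in enumerate(novel, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0  # no relevant novel document was retrieved


def mean_reciprocal_rank(values: Iterable[float]) -> float:
    """Average of per-session reciprocal ranks; 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Documents retrieved for the top-scoring recommendation.
ranked = ["elliptical-trainer.com", "sportsequipment.com", "totaltrainer.com"]
seen = {"elliptical-trainer.com"}   # already in the top 10 of a context query
relevant = {"totaltrainer.com"}     # hypothetical relevance judgment
rr = reciprocal_rank_novel(ranked, relevant, seen)
print(rr)
```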
Effect of on-/off-task context
[Figure: curves for on-task context, off-task context, and reference-query only; x-axis: how far back in the user’s history we look, including the reference query]
1. Does on-task context help? Yes, on average.
2. How much does off-task context hurt? Off-task context hurts, on average.
Effect of noise (mixed contexts)
[Figure: curves for Firm1-, Firm2-, and hard-task; reference-query only; soft-task; decay]
3. How well does current technology deal with mixed contexts? Pretty well.
- Soft task threshold model is easily distracted by noise
- Decay model can’t handle even 1 noisy query
- Firm 1 & 2 and Hard Task Threshold models are robust to noise
Lucchese et al. (WSDM 2011)
Variance across task set
[Figure: per-task results; x-axis: Tasks; y-axis: MRR Difference]
Annotations on the plots:
- Some really high highs
- Some really low lows
- One tiny little gain
- Several moderate lows
Conclusions
• on-task context can be very helpful
• it can also hurt
• off-task context is bad
• current state-of-the-art task segmentation works well
Future work
• do these results hold for other query recommendation algorithms?
• are our findings consistent across additional tasks?
• can we predict when context will help?
Example
Context (same-task score in parentheses):
5. (1.00) satellite internet providers
4. (0.44) hughes internet
3. (0.02) ocdbeckham
2. (0.04) reviews xc90
1. (0.04) buy volvo semi trucks
Reference-query only recommendations (MRR, recommendation):
0.08 alabama satellite internet providers
0.25 sattelite internet
1.00 satellite internet providers
0.02 satelite internet for 30.00
0.06 satellite internet providers northern california
Decay recommendations (MRR, recommendation):
0.00 2006 volvo xc90 reviews
0.00 volvo xc90 reviews
1.00 hughesinternet.com
1.00 hughessatelite internet
0.08 alabama satellite internet providers
Hard task recommendations (MRR, recommendation):
1.00 hughesinternet.com
1.00 hughessatelite internet
0.08 alabama satellite internet providers
0.17 satellite high speed internet
0.25 sattelite internet
Data and evaluation (details)
Query recommendations: 2006 AOL search log used to build a Term-Query Graph
Tasks: 2010-2011 TREC Session Track
On-task contexts
- 212 judged sessions
- 2+ queries/session
- task = session
Off-task contexts
- 50 × 212 sessions
- all queries but the last are replaced by queries from other tasks
- 50 samples per TREC session
Mixed contexts
- 10 × 50 × 212 sessions
- added up to 10 off-task queries per session
- 50 samples per TREC session and noise level
Evaluation
- MRR over documents retrieved with the top scoring recommendation (ClueWeb ‘09)
- removed documents retrieved in top 10 of context queries
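The off-task and mixed contexts described above can be simulated with a few lines of sampling code. This is a sketch under stated assumptions, not the authors' procedure: the function names are hypothetical, off-task queries are drawn uniformly with replacement, and noise is inserted at random positions before the reference query, since the slides do not say where the added queries go.

```python
import random
from typing import List


def off_task_context(session: List[str], other_queries: List[str],
                     rng: random.Random) -> List[str]:
    """Replace every query except the last with a query sampled from other tasks."""
    replaced = [rng.choice(other_queries) for _ in session[:-1]]
    return replaced + [session[-1]]


def mixed_context(session: List[str], other_queries: List[str],
                  noise: int, rng: random.Random) -> List[str]:
    """Add `noise` off-task queries before the reference query (assumed placement)."""
    context = list(session[:-1])
    for _ in range(noise):
        position = rng.randint(0, len(context))  # inclusive bounds; len(context) appends
        context.insert(position, rng.choice(other_queries))
    return context + [session[-1]]


rng = random.Random(0)  # fixed seed so the samples are repeatable
session = ["pampered chef", "hosting pampered chef"]
others = ["elliptical trainer", "elliptical trainer benefits"]
off = off_task_context(session, others, rng)
mixed = mixed_context(session, others, noise=3, rng=rng)
print(off)
print(mixed)
```

Repeating the sampling 50 times per session, and for each noise level from 1 to 10 in the mixed case, would give the 50 × 212 and 10 × 50 × 212 session counts listed on the slide.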