Personalized Query Expansion for the Web Paul-Alexandru Chirita, Claudiu S. Firan, Wolfgang Nejdl Gabriel Barata
Motivation by Tojosan @ Flickr
What is query expansion? Add meaningful search terms to the query…
What is PIR based query expansion? Add meaningful search terms to the query… … related to the user’s interests.
Why PIR based query expansion? More personalization quality! More privacy!
Example Google search: “canon book”
Example Top 3 results: • The Canon: A Whirligig Tour of the Beautiful Basics of Science (Hardcover) @ Amazon • Western Canon @ Wikipedia • Biblical Canon @ Wikipedia
Example Expanded query: “canon book bible”
Example Top 3 results: • Biblical Canon @ Wikipedia • Books of the Bible @ Wikipedia • The Canon of the Bible @ catholicapologetics.org
Query Expansion using Desktop data by Old Shoe Woman @ Flickr
Algorithms • Expanding with Local Desktop Analysis • Expanding with Global Desktop Analysis
Expanding with Local Desktop Analysis • Term and Document Frequency • Lexical Compounds • Sentence Selection
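A minimal sketch of term and document frequency expansion, assuming the personal documents are available as plain strings. The names `tokenize`, `expansion_terms`, `expand` and the tiny stopword list are hypothetical; the slides do not specify the scoring details, so this only illustrates the general idea of ranking terms found in desktop documents that match the query.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on",
             "as", "its", "each", "every"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def expansion_terms(query, docs, k=2, use_df=False):
    """Rank candidate terms from the personal documents that match the query.

    use_df=False counts every occurrence (term frequency);
    use_df=True counts each term once per document (document frequency).
    """
    query_terms = set(tokenize(query))
    # Local analysis: only documents containing all query terms are inspected.
    hits = [d for d in docs if query_terms <= set(tokenize(d))]
    counts = Counter()
    for doc in hits:
        tokens = [t for t in tokenize(doc)
                  if t not in STOPWORDS and t not in query_terms]
        counts.update(set(tokens) if use_df else tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def expand(query, docs, k=1, use_df=False):
    """Append the top-k expansion terms to the original query."""
    return " ".join([query] + expansion_terms(query, docs, k, use_df))

docs = [
    "The canon of the bible lists every book accepted as scripture",
    "Notes on the bible canon: each book of the bible and its history",
    "Canon camera manual",
]
print(expand("canon book", docs, k=1))
```

With these three toy documents the expanded query matches the earlier example, because "bible" is the most frequent non-query term in the matching documents.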
Lexical Compounds { adjective? Noun+ }
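The pattern { adjective? noun+ } can be sketched as a scan over part-of-speech tagged tokens. This sketch assumes the text has already been tagged by some external tagger; the function name `lexical_compounds` and the tag labels `ADJ` and `NOUN` are assumptions made for the illustration, not something the slides specify.

```python
def lexical_compounds(tagged):
    """Return maximal spans matching: one optional adjective, then one or more nouns.

    `tagged` is a list of (word, tag) pairs with tags such as "ADJ" or "NOUN".
    """
    compounds = []
    i, n = 0, len(tagged)
    while i < n:
        start = i
        # Optional single adjective at the start of the span.
        if tagged[i][1] == "ADJ":
            i += 1
        # One or more nouns must follow.
        j = i
        while j < n and tagged[j][1] == "NOUN":
            j += 1
        if j > i:
            compounds.append(" ".join(word for word, _ in tagged[start:j]))
            i = j
        else:
            # No noun here, so move one token past the span start and retry.
            i = start + 1
    return compounds

tagged = [
    ("the", "DET"), ("biblical", "ADJ"), ("canon", "NOUN"),
    ("lists", "VERB"), ("old", "ADJ"), ("testament", "NOUN"), ("books", "NOUN"),
]
print(lexical_compounds(tagged))
```

Note that a lone noun also satisfies the pattern as written; a real system would probably filter or rank the extracted spans before using them as expansion terms.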
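Sentence selection can be illustrated in a similar way. The slides do not describe how sentences are scored, so the sketch below assumes the simplest possible criterion: a sentence scores one point for each distinct query term it contains. The function name `select_sentences` is hypothetical.

```python
import re

def select_sentences(query, document, top_n=1):
    """Return the top_n sentences of `document` that share the most terms with the query."""
    query_terms = set(re.findall(r"[a-z]+", query.lower()))
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

    def score(sentence):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        return len(query_terms & words)

    # sorted() is stable, so equally scored sentences keep document order.
    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:top_n]

document = ("I bought a new camera last week. "
            "The canon of the bible lists each accepted book. "
            "The weather was fine.")
print(select_sentences("canon book", document))
```

The selected sentences could then be mined for expansion terms, for example with a frequency count over their non-query words.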
Expanding with Global Desktop Analysis • Term Co-occurrence Statistics • Thesaurus based Expansion
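Term co-occurrence statistics are computed over the whole personal collection rather than over the query hits. A minimal sketch, assuming each term is represented by the set of documents it appears in and that relatedness is measured by cosine similarity over those binary occurrence vectors. The helper names `build_index`, `cosine` and `related_terms` are hypothetical.

```python
import re
from collections import defaultdict
from math import sqrt

def build_index(docs):
    """Map each term to the set of document ids in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in set(re.findall(r"[a-z]+", text.lower())):
            index[term].add(doc_id)
    return index

def cosine(index, a, b):
    """Cosine similarity of two terms over binary document-occurrence vectors."""
    docs_a, docs_b = index.get(a, set()), index.get(b, set())
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / sqrt(len(docs_a) * len(docs_b))

def related_terms(index, term, k=3):
    """Return up to k terms that co-occur most strongly with `term`."""
    scores = {t: cosine(index, term, t) for t in index if t != term}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, s in ranked[:k] if s > 0]

collection = ["canon bible", "canon bible scripture", "canon camera", "camera lens"]
index = build_index(collection)
print(related_terms(index, "canon", k=2))
```

Because the index is built once for the whole collection, the related terms for any query keyword can be looked up without re-reading the documents.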
Experiments & Evaluation by Canadian Museum of Nature @ Flickr
Experiments • 18 users • Files indexed within user selected paths, Emails and Web cache
Experiments • Each user chose 4 queries: • 1 from the top 2% of log queries (avg. length = 2.0) • 1 random log query (avg. length = 2.3) • 1 self-selected specific query (avg. length = 2.9) • 1 self-selected ambiguous query (avg. length = 1.8)
Evaluation • Evaluated algorithms: • Google: Google query output • TF, DF: Term and Document Frequency • LC, LC[O]: Regular and Optimized Lexical Compounds • TC[CS], TC[MI], TC[LR]: Term Co-occurrence Statistics using Cosine Similarity, Mutual Information and Likelihood Ratio • WN[SYN], WN[SUB], WN[SUP]: WordNet based expansion with synonyms, sub-concepts and super-concepts.
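For the mutual information variant, one common formulation is pointwise mutual information computed from document counts. The slides do not give the exact formula used, so the function below is only an assumed illustration; `pmi` and its parameter names are hypothetical.

```python
from math import log2

def pmi(n_docs, df_a, df_b, df_ab):
    """Pointwise mutual information of two terms from document counts.

    n_docs: number of documents in the collection
    df_a, df_b: documents containing term a, term b
    df_ab: documents containing both terms
    """
    if df_ab == 0:
        # The terms never co-occur, so the association is minimal.
        return float("-inf")
    # log2( P(a, b) / (P(a) * P(b)) ) with probabilities estimated as df / n_docs.
    return log2((df_ab * n_docs) / (df_a * df_b))

print(pmi(n_docs=4, df_a=2, df_b=2, df_ab=2))
```

A positive value means the two terms appear together more often than independence would predict, a value of zero means they look independent, and a negative value means they tend to avoid each other.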
Results Log queries:
Results Self-selected queries:
Introducing Adaptivity by RavenCore17 @ Flickr
Experiments • Same experimental setup as for the previous analysis.
Results Log queries:
Results Self-selected queries:
Conclusions by ThisIsIt2 @ Flickr
Conclusions • Five techniques for determining expansion terms from personal documents. • Empirical analysis showed that these approaches perform very well. • Expansion process adapts to query features. • Adaptive expansion process proved to yield significant improvements over the static one.
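The slides do not state which query features drive the adaptation. Purely as an illustration, the sketch below assumes that query length is used as a crude proxy for ambiguity, so that shorter queries receive more expansion terms. The function name `num_expansion_terms` and its limits are hypothetical and are not taken from the presentation.

```python
def num_expansion_terms(query, max_terms=4, min_terms=1):
    """Choose how many expansion terms to add, assuming shorter queries are more ambiguous."""
    length = len(query.split())
    # One term fewer for every query word beyond the first, clamped to the allowed range.
    return max(min_terms, min(max_terms, max_terms - length + 1))

for q in ["canon", "canon book", "history of the biblical canon"]:
    print(q, "->", num_expansion_terms(q))
```

A static scheme would instead return the same number of terms for every query, which is the baseline the adaptive process is compared against.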
End Any questions?