Automatic Query Reformulation with Syntactic Operators to Alleviate Search Difficulty • Huizhong Duan, Rui Li, ChengXiang Zhai • University of Illinois at Urbana-Champaign
Introduction • Search Engine • The No. 1 tool for getting information. • We use it every day. • Queries • We are trained to use keyword queries. • Advanced Query Syntax • No idea what it is…
Advanced Query Syntax • Necessity Operator • E.g. green tree +street • I’m looking for a street! • Phrase Operator • E.g. “green tree street” • Not green street with trees! • Synonym Operator • E.g. green tree ~street • Hmm, I’m not sure it’s a street/road/avenue… • …… • Syntactic Operator, Syntactic Query, Syntactic Reformulation
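The three operators on this slide can be applied mechanically to a keyword query. The sketch below is only an illustration: the function name `candidates` and the restriction to one operator occurrence per reformulation are assumptions, not something the slides specify.

```python
def candidates(query, op):
    """Enumerate reformulations of a keyword query that use `op` exactly once.

    Assumption: each candidate carries a single operator occurrence.
    """
    terms = query.split()
    out = []
    if op in ("necessity", "synonym"):
        prefix = "+" if op == "necessity" else "~"
        # Mark exactly one term with the operator symbol.
        for i in range(len(terms)):
            marked = terms[:i] + [prefix + terms[i]] + terms[i + 1:]
            out.append(" ".join(marked))
    elif op == "phrase":
        # Quote one contiguous span of two or more terms.
        for i in range(len(terms)):
            for j in range(i + 2, len(terms) + 1):
                span = '"' + " ".join(terms[i:j]) + '"'
                out.append(" ".join(terms[:i] + [span] + terms[j:]))
    else:
        raise ValueError(f"unknown operator: {op}")
    return out


if __name__ == "__main__":
    for op in ("necessity", "phrase", "synonym"):
        print(op, candidates("green tree street", op))
```

For "green tree street", the necessity operator yields three candidates (one per term), including the slide's example green tree +street.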
Syntactic Operators • Extend our ability to express our information needs. • Potentially useful in formulating more effective queries.
Syntactic Operators • Are very effective if used appropriately. • Rarely used by ordinary users. • Difficult to use due to the lack of knowledge of the dataset. • Question: Can we automatically formulate syntactic queries given users’ keyword queries?
Problem formulation • Input: a keyword query q, a syntactic operator op, and a target performance metric M. • Goal: to find a ranked list of syntactic reformulations of q through the use of op: S_op(q) = {q_1, q_2, …, q_n | M(q_1) > M(q_2) > … > M(q_n)}. • Tasks: • Implicit refine: use q, q_1, q_2, …, q_m with probabilities. • Explicit refine: output the top-ranked query q_1 if M'(q_1) = M(q_1) - M(q) > 0, and the original query q otherwise. • Diagnose query: users resort to help with an ineffective keyword query (negative / pseudo-negative feedback is available).
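The explicit-refine rule can be written as a small decision function. This is a sketch under assumptions: `explicit_refine`, `predicted`, and `baseline` are hypothetical names, and the metric values are taken to be model predictions, since the true M is not known at query time.

```python
def explicit_refine(query, predicted, baseline):
    """Return the top-ranked reformulation if it is predicted to beat the
    original query, and the original query otherwise.

    predicted: dict mapping each reformulation q_i to its predicted M(q_i)
    baseline:  predicted M(q) for the original keyword query
    """
    if not predicted:
        return query
    best = max(predicted, key=predicted.get)   # q_1, the top-ranked candidate
    gain = predicted[best] - baseline          # M'(q_1) = M(q_1) - M(q)
    return best if gain > 0 else query


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the talk.
    preds = {"green tree +street": 0.42, "+green tree street": 0.30}
    print(explicit_refine("green tree street", preds, baseline=0.35))
    print(explicit_refine("green tree street", preds, baseline=0.50))
```

The first call returns the reformulation because its predicted gain is positive; the second falls back to the keyword query.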
The Model • Learning to rank • Learns a scoring function to score each sample • Pairwise or Listwise loss function • The score indicates the ranking • Score each candidate reformulation with the learned model • “green tree street” • “green tree” street • green “tree street” • green tree street
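A pairwise learning-to-rank model can be illustrated with a linear scorer trained by perceptron updates on misordered pairs. This is a minimal sketch, not the model used in the talk: the two-dimensional feature vectors, the metric values, and the function names are all made up for illustration.

```python
def train_pairwise(samples, epochs=50, lr=0.1):
    """Learn linear weights so that candidates with a higher metric value
    receive a higher score.

    samples: list of (feature_vector, metric_value) for one query's candidates
    """
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for xa, ma in samples:
            for xb, mb in samples:
                if ma <= mb:
                    continue  # only consider pairs where a should outrank b
                diff = [a - b for a, b in zip(xa, xb)]
                margin = sum(wi * di for wi, di in zip(w, diff))
                if margin <= 0:
                    # Pair is misordered or tied: move w toward the difference.
                    w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Linear score; higher means ranked earlier."""
    return sum(wi * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    # Hypothetical features and metric values (toy data).
    train = [([0.9, 0.1], 0.6), ([0.5, 0.4], 0.4), ([0.2, 0.8], 0.1)]
    w = train_pairwise(train)
    names = ['"green tree" street', 'green "tree street"', "green tree street"]
    ranked = sorted(zip(names, train), key=lambda p: score(w, p[1][0]), reverse=True)
    for name, _ in ranked:
        print(name)
```

Only the ordering of the scores matters here, which is why the update is driven by pairs rather than by the metric values themselves.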
The features • Difficulty
The features • Distinguishability
The features • Negativity • Corresponds to a scenario where users resort to the reformulation only when they are not satisfied with the result from the keyword query • Negative feedback or pseudo negative feedback is available
Combining operators • Operator-Combination • Predict syntactic queries with different operators jointly. • Result-Combination • Predict each operator separately and select the reformulation with the best predicted performance.
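Result-Combination amounts to taking the best prediction across the per-operator outputs. In the sketch below, the name `result_combination` and the input layout are assumptions, and falling back to the original query when no candidate beats the baseline is also an assumption, borrowed from the explicit-refine rule.

```python
def result_combination(query, per_operator, baseline):
    """Pick the reformulation with the best predicted performance across
    operators that were predicted separately.

    per_operator: dict mapping operator name -> {reformulation: predicted M}
    baseline:     predicted M for the original keyword query
    """
    best_query, best_value = query, baseline
    for predicted in per_operator.values():
        for reformulation, value in predicted.items():
            if value > best_value:
                best_query, best_value = reformulation, value
    return best_query, best_value


if __name__ == "__main__":
    # Toy predictions for illustration only.
    preds = {
        "necessity": {"green tree +street": 0.42},
        "phrase": {'"green tree" street': 0.47},
    }
    print(result_combination("green tree street", preds, baseline=0.35))
```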
Experiments • Automatic reformulation: works in the negative feedback scenario • Necessity operator: more useful for long queries • Phrase operator: more useful for short queries • Result-Combination: better than Operator-Combination • Syntactic reformulation: yields further improvement over existing negative feedback methods
Case studies • Discover representative keywords/phrases
Case studies • Discover undermatched concepts
Case studies • Eliminate ambiguities caused by matching keywords separately
Conclusion • Automatic query reformulation through the use of query syntax operators • Formulated automatic syntactic reformulation as a supervised learning problem under the learning-to-rank framework • Proposed a set of effective features to represent the characteristics of syntactic queries • The method is general and applicable to more syntactic operators