Mining Term Association Patterns from Search Logs for Effective Query Reformulation

Presentation Transcript


  1. Mining Term Association Patterns from Search Logs for Effective Query Reformulation Xuanhui Wang and ChengXiang Zhai Department of Computer Science University of Illinois at Urbana-Champaign ACM CIKM 2008, Oct. 26-30, Napa Valley

  2. Ineffective Queries: “reduce space command latex”

  3. Effective Queries: “squeeze space command latex”

  4. More Examples
     • If you want to wash your vehicle
       • “vehicle wash”, “auto wash”
       • “car wash”, “truck wash”
     • If you want to buy a car
       • “auto quotes”
       • “auto sale quotes”?
       • “auto insurance quotes”?

  5. What Makes a Query Ineffective?
     • Vocabulary mismatch (addressed by term substitution)
       • “reduce space command latex” vs “squeeze space command latex”
       • “auto wash” vs “car wash”
     • Lack of discrimination (addressed by term addition)
       • “auto quotes” vs “auto sale quotes”
     • …
     How can we help improve ineffective queries?

  6. Our Contribution
     • We cast query reformulation as term-level pattern mining from search logs
     • We define two basic types of term-level patterns and propose probabilistic methods to mine them
       • Context-sensitive term substitution: “auto → car | _wash”, “car → auto | _trade”
       • Context-sensitive term addition: “+sale | auto_quotes”
     • We evaluate our methods on commercial search engine logs and show their effectiveness

  7. Problem Formulation
     [Diagram] Offline part: search logs → query collection → Task 1: Contextual Models → Task 2: Translation Models.
     Online part: given a query q = “auto wash”, Task 3: Pattern Mining produces patterns such as “auto → car | _wash”, “auto → truck | _wash”, “+southland | _auto wash”, …, which yield the reformulations “car wash”, “truck wash”, “southland auto wash”, …

  8. Task 1: Contextual Models
     • Syntagmatic relations
     • Capture terms that frequently co-occur with w inside queries
     Sample query collection: “enterprise car rental”, “rental car”, “budget car rental”, “car pricing”, “car pictures”, “car accidents”, …
     G: General context. Model P_G(* | car): rental: 0.375, enterprise: 0.125, budget: 0.125, pricing: 0.125, …

  9. Task 1: Contextual Models
     • Syntagmatic relations
     • Capture terms that frequently co-occur with w inside queries
     Sample query collection: “enterprise car rental”, “rental car”, “budget car rental”, “car pricing”, “car pictures”, “car accidents”, …
     L1: 1st left context. Model P_L1(* | car): rental: 0.333, enterprise: 0.333, budget: 0.333, …

  10. Task 1: Contextual Models
      • Syntagmatic relations
      • Capture terms that frequently co-occur with w inside queries
      Sample query collection: “enterprise car rental”, “rental car”, “budget car rental”, “car pricing”, “car pictures”, “car accidents”, …
      R1: 1st right context. Model P_R1(* | w), shown for w = car: rental: 0.4, pricing: 0.2, pictures: 0.2, accidents: 0.2, …
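
The three contextual models on slides 8 to 10 can be reproduced by simple relative-frequency counting over the sample query collection. The following is a minimal sketch under that assumption; the function name `contextual_model` and the context labels passed as strings are illustrative choices, not names from the presentation, and any smoothing the authors may apply is not shown.

```python
from collections import Counter

# Sample query collection as listed on slides 8-10.
QUERIES = [
    "enterprise car rental",
    "rental car",
    "budget car rental",
    "car pricing",
    "car pictures",
    "car accidents",
]


def contextual_model(queries, word, context="G"):
    """Estimate P_context(* | word) by relative frequency (illustrative helper).

    context: "G"  = all other terms in the same query,
             "L1" = the term immediately to the left of `word`,
             "R1" = the term immediately to the right of `word`.
    """
    counts = Counter()
    for query in queries:
        terms = query.split()
        for i, term in enumerate(terms):
            if term != word:
                continue
            if context == "G":
                counts.update(terms[:i] + terms[i + 1:])
            elif context == "L1" and i > 0:
                counts[terms[i - 1]] += 1
            elif context == "R1" and i + 1 < len(terms):
                counts[terms[i + 1]] += 1
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()} if total else {}


general = contextual_model(QUERIES, "car", "G")
left = contextual_model(QUERIES, "car", "L1")
right = contextual_model(QUERIES, "car", "R1")
```

On this collection, `general["rental"]` is 3/8 = 0.375, each entry of `left` is 1/3, and `right["rental"]` is 2/5 = 0.4, matching the values shown on the slides.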

  11. Task 2: Translation Models
      • Paradigmatic relations (“car” and “auto”)
      • Capture terms that are substitutable with w
      • Similar contexts → high translation probability
      • Translation models [formula not preserved in the transcript; its annotations read “Probability of generating s’s context from w’s contextual model”, “Size of L1 context”, and “Size of R1 context”]
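
The translation-model formula itself did not survive in the transcript, so the sketch below is a stand-in rather than the authors' definition. It assumes one plausible reading of "probability of generating s's context from w's contextual model": the average log-likelihood of s's right-context terms under an additively smoothed R1 model of w, exponentiated and normalized over the candidates. The toy queries, the smoothing parameter `mu`, and both function names are hypothetical. The weighting by L1 and R1 context sizes mentioned on the slide is not implemented.

```python
import math
from collections import Counter

# Toy query collection (hypothetical, for illustration only).
QUERIES = [
    "car wash", "auto wash", "car rental", "auto rental",
    "car insurance", "auto insurance", "pizza delivery", "pizza hut",
]


def right_context_counts(queries, word):
    """Count the terms that appear immediately to the right of `word`."""
    counts = Counter()
    for query in queries:
        terms = query.split()
        for left, right in zip(terms, terms[1:]):
            if left == word:
                counts[right] += 1
    return counts


def translation_scores(queries, w, candidates, mu=0.5):
    """Assumed scoring: how well w's smoothed R1 model explains s's R1 context."""
    vocab = {t for q in queries for t in q.split()}
    w_counts = right_context_counts(queries, w)
    w_total = sum(w_counts.values())

    def smoothed_w(a):
        # Additive smoothing keeps unseen context terms at non-zero probability.
        return (w_counts[a] + mu) / (w_total + mu * len(vocab))

    raw = {}
    for s in candidates:
        s_counts = right_context_counts(queries, s)
        s_total = sum(s_counts.values())
        if s_total == 0:
            continue  # no observed context, so no evidence for substitutability
        avg_ll = sum(n / s_total * math.log(smoothed_w(a))
                     for a, n in s_counts.items())
        raw[s] = math.exp(avg_ll)
    norm = sum(raw.values())
    return {s: v / norm for s, v in raw.items()}


scores = translation_scores(QUERIES, "car", ["auto", "pizza"])
```

In this toy collection "auto" shares all of its right contexts with "car" while "pizza" shares none, so "auto" receives the higher score, which is the behavior the slide summarizes as "similar contexts → high translation probability".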

  12. Task 3.1: Pattern Mining – Term Substitution
      q = [w_1 … w_{i-1} w_i w_{i+1} … w_n]
      Substitute w_i by s: q’ = [w_1 … w_{i-1} s w_{i+1} … w_n]
      Which word s should be chosen? [Formula not preserved in the transcript; its annotations read “Global factor: translation model” and “Local factor”]

  13. Estimating Local Factor
      Context of s: w_1 … w_{i-1} _ w_{i+1} … w_n
      [Formulas not preserved in the transcript; their annotations read “Independence” and “Ignore those terms far away”]
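
Slides 12 and 13 describe the substitution score as a global factor (the translation model) combined with a local factor that is estimated under an independence assumption while ignoring distant terms. The exact formulas are missing from the transcript, so the following is only a sketch of one way these pieces could fit together: it assumes the two factors are multiplied, and that the local factor is the product of P_L1 and P_R1 for the immediate neighbors only. All probabilities, the `floor` fallback for unseen pairs, and the function names are hypothetical.

```python
# Hypothetical translation probabilities t(s | w), here only for w = "auto".
TRANSLATION = {"auto": {"car": 0.6, "truck": 0.3, "automobile": 0.1}}

# Hypothetical contextual models: MODEL[s][a] = P(a | s) for the neighbor a.
LEFT_MODEL = {"car": {"rental": 0.3}, "truck": {"pickup": 0.2}}
RIGHT_MODEL = {
    "car": {"wash": 0.2, "rental": 0.4},
    "truck": {"wash": 0.1},
    "automobile": {"insurance": 0.3},
}


def local_factor(terms, i, s, left_model, right_model, floor=1e-6):
    """Product of P(neighbor | s) over the immediate neighbors of position i.

    Neighbors are treated as independent and all farther terms are ignored.
    `floor` is an assumed fallback for unseen (s, neighbor) pairs.
    """
    score = 1.0
    if i > 0:
        score *= left_model.get(s, {}).get(terms[i - 1], floor)
    if i + 1 < len(terms):
        score *= right_model.get(s, {}).get(terms[i + 1], floor)
    return score


def rank_substitutions(terms, i, translation, left_model, right_model):
    """Rank candidates s for position i by global factor times local factor."""
    scored = [
        (t * local_factor(terms, i, s, left_model, right_model), s)
        for s, t in translation.get(terms[i], {}).items()
    ]
    scored.sort(reverse=True)
    return [(s, score) for score, s in scored]


ranked = rank_substitutions(["auto", "wash"], 0, TRANSLATION,
                            LEFT_MODEL, RIGHT_MODEL)
```

With these made-up numbers, "car" ranks first for the query "auto wash" because it has both a high translation probability and a right context that frequently contains "wash"; "automobile" falls to the bottom because it was never observed before "wash".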

  14. Task 3.2: Pattern Mining – Term Addition
      q = [w_1 … w_{i-1} w_i … w_n]
      Adding r before w_i: q’ = [w_1 … w_{i-1} r w_i … w_n]
      [Formula not preserved in the transcript; its annotations read “Uniform” and “Similar to the Local Factor in Term Substitution Patterns”]
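
For term addition, the slide's annotations suggest a uniform prior over candidate terms combined with a factor similar to the local factor used for substitution. The sketch below assumes exactly that and nothing more: each candidate r gets the same prior, and its score is that prior times P_L1(w_{i-1} | r) · P_R1(w_i | r). The candidate list, the probabilities, the `floor` fallback, and the function name are all hypothetical.

```python
# Hypothetical contextual models: MODEL[r][a] = P(a | r) for the neighbor a.
LEFT_MODEL = {"sale": {"auto": 0.3}, "insurance": {"auto": 0.4}}
RIGHT_MODEL = {
    "sale": {"quotes": 0.2},
    "insurance": {"quotes": 0.3},
    "free": {"quotes": 0.1},
}


def rank_additions(terms, i, candidates, left_model, right_model, floor=1e-6):
    """Rank candidate terms r to insert before position i.

    Assumes a uniform prior over candidates and independent immediate
    neighbors; `floor` is an assumed fallback for unseen (r, neighbor) pairs.
    """
    prior = 1.0 / len(candidates)
    scored = []
    for r in candidates:
        score = prior
        if i > 0:
            score *= left_model.get(r, {}).get(terms[i - 1], floor)
        if i < len(terms):
            score *= right_model.get(r, {}).get(terms[i], floor)
        scored.append((score, r))
    scored.sort(reverse=True)
    return [(r, score) for score, r in scored]


# Insert a term between "auto" and "quotes".
ranked = rank_additions(["auto", "quotes"], 1,
                        ["sale", "insurance", "free"],
                        LEFT_MODEL, RIGHT_MODEL)
```

Because the prior is uniform, the ordering is decided entirely by the local factor. The ordering produced here reflects only the made-up numbers and says nothing about which refinement the authors' data would prefer.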

  15. Evaluation: Data Preparation
      • From Microsoft Live Labs
      • Timeline: history logs from 5/1/2006 to 5/20/2006; future logs from 5/20/2006 to 5/31/2006
      • History collection: 4.4M queries (1.6M are distinct), 1.3M user sessions
      • Future logs: used to construct test cases

  16. Examples of Contextual Models
      • Left and right contexts are different
      • The general context mixes them together

  17. Examples of Translation Models
      • Conceptually similar keywords have high translation probabilities
      • They make interactive exploratory search possible

  18. Examples of Term Substitution
      • Substitution is context sensitive
      • Intuitively, reworded queries are more effective

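  19. Effectiveness Comparison of Term Substitution – Experiment Design
      [Diagram] A session consists of queries Q1, Q2, …, Qk. Q2 returns results R21, R22, R23, … and Qk returns Rk1, Rk2, Rk3, …; C1, C2, and C3 are the results clicked in the session.
      How well can a reformulated query rank C1, C2, and C3 at the top?
      Q1 is reformulated into Q1’, Q2’, Q3’, whose rankings are (dx denotes any other document):
      • dx C3 C1 C2 dx … (P@5 = 0.6)
      • dx C1 dx dx dx … (P@5 = 0.2)
      • dx C2 dx C3 dx … (P@5 = 0.4)
      Best P@5 = 0.6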
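
The P@5 values on slide 19 follow directly from counting how many of the clicked results C1, C2, and C3 appear in the top five of each ranking. A minimal sketch is below; the function name `precision_at_k` is illustrative, and assigning the three rankings to Q1', Q2', and Q3' in the order they appear on the slide is an assumption.

```python
def precision_at_k(ranking, relevant, k=5):
    """Fraction of the top-k ranked documents that are in `relevant`."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


# Results clicked in the session are treated as the relevant set.
CLICKED = {"C1", "C2", "C3"}

# Top-5 rankings from the slide; "dx" stands for any other document.
RANKINGS = {
    "Q1'": ["dx", "C3", "C1", "C2", "dx"],
    "Q2'": ["dx", "C1", "dx", "dx", "dx"],
    "Q3'": ["dx", "C2", "dx", "C3", "dx"],
}

p_at_5 = {q: precision_at_k(r, CLICKED) for q, r in RANKINGS.items()}
best = max(p_at_5.values())
```

Here `p_at_5` holds 0.6, 0.2, and 0.4 for the three reformulations, and `best` is 0.6, the "Best P@5" reported on the slide.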

  20. Results
      [Plot comparing our method with [Jones’06]; x-axis: #Recommended Queries]
      Our method reformulates queries more effectively

  21. Term Addition Patterns
      Term addition patterns can refine a broad query

  22. Related Work
      • Query suggestions [e.g., Jones’06, Sahami et al’06]
        • Discover patterns at the query level
        • Rely on external resources or training data
        • Do not consider effectiveness
      • Query modifications in IR [Rocchio’71, Anick’03]
        • Expand queries from returned documents
        • Do not rely on search logs; mostly add terms
      • Related work in the NLP community [Lin’98, Rapp’02]
        • Finding synonyms or near-synonyms
        • Syntagmatic and paradigmatic relations
        • Not used for query reformulation

  23. Conclusions and Future Work
      • We propose a new way to mine search logs for patterns to address ineffective queries
        • Vocabulary mismatch
        • Lack of discrimination
      • We define and mine two basic patterns at the term level
        • Context-sensitive term substitution patterns
        • Context-sensitive term addition patterns
      • Experiments show the effectiveness of our methods
      • In the future,
        • Use relevance judgments instead of clicks
        • Exploit click information for better query reformulation

  24. Thank You!
