
Improving Similarity Measures for Short Segments of Text




Presentation Transcript


  1. Improving Similarity Measures for Short Segments of Text Scott Wen-tau Yih & Chris Meek Microsoft Research

  2. Query Suggestion • Query: mariners • How similar are they? • mariners vs. seattle mariners • mariners vs. 1st mariner bank

  3. Keyword Expansion for Online Ads • Advertiser keyword: chocolate cigarettes • Suggested expansions: candy cigarettes, chocolate cigars, chocolate candy, old fashioned candy, nostalgic candy, novelty candy • How similar are they? • chocolate cigarettes vs. cigarettes • chocolate cigarettes vs. chocolate cigars • chocolate cigarettes vs. old fashioned candy

  4. Measuring Similarity • Goal: create a similarity function • fsim: (String1, String2) → R • Rank suggestions • Fix String1 as q; vary String2 as s1, s2, …, sk • Whether the function is symmetric is not important • For query suggestion – fsim(q,s) • fsim(“mariners”, “seattle mariners”) = 0.9 • fsim(“mariners”, “1st mariner bank”) = 0.6 (see the ranking sketch below)
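
A minimal sketch of how such a function is used to rank suggestions for a fixed query q; the scores are the illustrative values from the slide, and rank_suggestions and toy_scores are names introduced here for illustration only.

```python
# Minimal sketch: ranking suggestions for a fixed query with any real-valued
# similarity function f_sim(q, s). The scores below are the illustrative
# values from the slide, not the output of a real model.

def rank_suggestions(f_sim, query, suggestions):
    """Sort candidate suggestions by similarity to the query, highest first."""
    return sorted(suggestions, key=lambda s: f_sim(query, s), reverse=True)

# Toy similarity function backed by the example scores on the slide.
toy_scores = {("mariners", "seattle mariners"): 0.9,
              ("mariners", "1st mariner bank"): 0.6}
f_sim = lambda q, s: toy_scores.get((q, s), 0.0)

print(rank_suggestions(f_sim, "mariners", ["1st mariner bank", "seattle mariners"]))
# -> ['seattle mariners', '1st mariner bank']
```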

  5. Enabling Useful Applications • Web search • Ranking query suggestions • Segmenting web sessions using query logs • Online advertising • Suggesting alternative keywords to advertisers • Matching similar keywords to show ads • Document writing • Providing alternative phrasing • Correcting spelling errors

  6. Challenges • Short text segments may not overlap • “Microsoft Research” vs. “MSR” → 0 cosine score • Ambiguous terms • “Bill Gates” vs. “Utility Bill” → 0.5 cosine score • “taxi runway” vs. “taxi” → 0.7 cosine score • Text segments may rarely co-occur in corpus • “Hyatt Vancouver” vs. “HayttVancover” → 1 page • Longer query → Fewer pages
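
To make these failure modes concrete, here is a plain bag-of-words cosine similarity (an assumption about how the slide's cosine scores were computed) that reproduces the 0, 0.5 and ~0.7 scores:

```python
# Plain bag-of-words cosine similarity, to illustrate why surface matching
# fails on short text: no word overlap gives 0, while ambiguous shared words
# can score deceptively high.
from collections import Counter
import math

def cosine(a, b):
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

print(cosine("Microsoft Research", "MSR"))   # 0.0   (no overlap)
print(cosine("Bill Gates", "Utility Bill"))  # 0.5   (ambiguous "bill")
print(cosine("taxi runway", "taxi"))         # ~0.71 (high, but meanings differ)
```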

  7. Our Contributions • Web-relevance similarity measure • Represent the input text segments as real-valued term vectors using Web documents • Improve term weighting scheme based on relevant keyword extraction • Learning similarity measure • Fit user preference for the application better • Compare learning similarity function vs. learning ranking function

  8. Outline • Introduction • Problem, Applications, Challenges • Our Methods • Web-relevance similarity function • Combine similarity measures using learning • Learning similarity function • Learning ranking function • Experiments on query suggestion

  9. Web-relevance Similarity Measure • Query expansion of x using a search engine • Let Dn(x) be the set of top n documents • Build a term vector vi for each document di ∈ Dn(x) • Elements are scores representing the relevancy of the words in document di • C(x) = (1/n) Σi vi / ||vi|| (centroid of L2-normalized vectors) • QE(x) = C(x) / ||C(x)|| (L2-normalized) • Similarity score is simply the inner product • fsim(q,s) = QE(q) · QE(s)
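
A sketch of the expansion and scoring steps on this slide, assuming each of the top-n documents has already been turned into a dict mapping words to relevancy scores (how those scores are produced is the subject of the next two slides):

```python
# Sketch of QE(x) and the inner-product similarity, over term vectors
# represented as {word: relevancy score} dicts.
import math

def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm else vec

def query_expansion(doc_vectors):
    """QE(x): L2-normalized centroid of the L2-normalized document vectors."""
    n = len(doc_vectors)
    centroid = {}
    for v in map(l2_normalize, doc_vectors):
        for w, score in v.items():
            centroid[w] = centroid.get(w, 0.0) + score / n
    return l2_normalize(centroid)

def f_sim(qe_q, qe_s):
    """fsim(q,s) = QE(q) · QE(s): inner product of the expanded term vectors."""
    return sum(score * qe_s.get(w, 0.0) for w, score in qe_q.items())
```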

  10. Web-kernel Similarity • Relevancy = TFIDF [Sahami & Heilman ‘06] • Why TFIDF? • High TF: important or relevant to the document • High DF: likely stopwords or words in template blocks, which IDF down-weights • Crude estimate of the importance of the word • Can we do better than TFIDF?
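
For reference, a minimal TF-IDF term weighting of the kind this baseline uses; the exact TF-IDF variant, the background document frequencies df, and the corpus size N are assumptions here, not given on the slide:

```python
# Minimal TF-IDF weighting for one document's term vector (a generic variant,
# not necessarily the exact formula used by Sahami & Heilman).
import math
from collections import Counter

def tfidf_vector(doc_tokens, df, N):
    """Map each word in the document to tf * idf, given corpus document
    frequencies df and corpus size N."""
    tf = Counter(doc_tokens)
    return {w: tf[w] * math.log(N / (1 + df.get(w, 0))) for w in tf}
```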

  11. Web-relevance Similarity • Relevancy = Prob(relevance | wj,di) • Keyword extraction can judge the importance of the words more accurately! [Yih et al. WWW-06] • Assign relevancy scores (probabilities) to words/phrases • Machine Learning model learned by logistic regression • Use more than 10 categories of features • Query-log frequency • High-DF words may be popular queries • The position of the word in the document • The format, hyperlink, etc.

  12. Learning Similarity • Similarity measures should depend on application • q = “Seattle Mariners”, s1 = “Seattle”, s2 = “Seattle Mariners Ticket” • Let human subjects decide what’s similar • Parametric similarity function fsim(q,s|w) • Learn the parameters (weights) from data • Use Machine Learning to combine multiple base similarity measures

  13. Base Similarity Measures • Surface matching methods • Suppose Q and S are the sets of words in a given pair of query q and suggestion s (sketched below) • Corpus-based methods • Web-relevance, Web-kernel, KL-divergence
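
Sketches of some surface-matching measures over the word sets Q and S; Matching, Dice and Cosine appear in the combination on the next slide, while Jaccard is included only as a further representative example, since the slide does not list the measures explicitly:

```python
# Surface matching over the word sets Q and S of the query and suggestion.
import math

def word_set(text):
    return set(text.lower().split())

def matching(Q, S):      # number of shared words
    return len(Q & S)

def dice(Q, S):
    return 2 * len(Q & S) / (len(Q) + len(S))

def jaccard(Q, S):
    return len(Q & S) / len(Q | S)

def cosine_words(Q, S):
    return len(Q & S) / math.sqrt(len(Q) * len(S))

# Example: dice(word_set("mariners"), word_set("seattle mariners")) -> 0.667
```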

  14. Learning Similarity Function • Data – pairs of query and suggestion (qi,sj) • Label: Relevance judgment (rel=1 or rel=0) • Features: Scores on (qi,sj) provided by multiple base similarity measures • We combine them using logistic regression: z = w1 Cosine(q,s) + w2 Dice(q,s) + w3 Matching(q,s) + w4 Web-relevance(q,s) + w5 KL-divergence(q,s) + … • fsim(q,s|w) = Prob(rel|q,s;w) = exp(z) / (1 + exp(z))
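
A sketch of this learning step; scikit-learn's LogisticRegression is my choice of implementation, and base_measures stands for a hypothetical list of base similarity functions (Cosine, Dice, Matching, Web-relevance, KL-divergence, …):

```python
# Learn fsim(q,s|w) = Prob(rel|q,s;w) by logistic regression over the
# base similarity scores of each (query, suggestion) pair.
import numpy as np
from sklearn.linear_model import LogisticRegression

def feature_vector(q, s, base_measures):
    """One feature per base measure: its similarity score for (q, s)."""
    return [f(q, s) for f in base_measures]

def train_similarity(pairs, labels, base_measures):
    """pairs: list of (q, s); labels: rel = 1 or 0 for each pair."""
    X = np.array([feature_vector(q, s, base_measures) for q, s in pairs])
    model = LogisticRegression().fit(X, labels)
    # The learned function returns Prob(rel | q, s; w) = exp(z) / (1 + exp(z)).
    return lambda q, s: model.predict_proba(
        [feature_vector(q, s, base_measures)])[0, 1]
```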

  15. Learning Ranking Function • We compare suggestions sj, sk to the same query q • Data – tuples of a query q and suggestions sj, sk • Label: [sim(q,sj) > sim(q,sk)] or [sim(q,sj) < sim(q,sk)] • Features: Scores on pairs (q,sj) and (q,sk) provided by multiple base similarity measures • Learn a probabilistic model using logistic regression • Prob([sim(q,sj) > sim(q,sk)] | q,sj,sk;w)
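
A sketch of the pairwise ranking variant; taking the difference of the two feature vectors is my assumption about how the scores on (q,sj) and (q,sk) are combined, since the slide only says both sets of scores are used as features:

```python
# Learn Prob([sim(q, s_j) > sim(q, s_k)] | q, s_j, s_k; w) by logistic
# regression over pairwise features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def pairwise_features(q, s_j, s_k, base_measures):
    """Assumed construction: difference of the base-measure score vectors."""
    f_j = np.array([f(q, s_j) for f in base_measures])
    f_k = np.array([f(q, s_k) for f in base_measures])
    return f_j - f_k

def train_ranker(tuples, labels, base_measures):
    """tuples: list of (q, s_j, s_k); labels: 1 if sim(q,s_j) > sim(q,s_k), else 0."""
    X = np.array([pairwise_features(q, sj, sk, base_measures)
                  for q, sj, sk in tuples])
    model = LogisticRegression().fit(X, labels)
    return lambda q, sj, sk: model.predict_proba(
        [pairwise_features(q, sj, sk, base_measures)])[0, 1]
```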

  16. Experiments • Data: Query suggestion dataset [Metzler et al. ’07] • |Q| = 122, |(Q,S)| = 4852; labels {Excellent, Good} vs. {Fair, Bad} • Results • 10-fold cross-validation • Evaluation metrics: AUC and Precision@k
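
For completeness, sketches of the two evaluation metrics; scikit-learn's roc_auc_score is my choice for AUC, and precision@k is computed per query over the top-k ranked suggestions:

```python
# Evaluation metric sketches: AUC over all (query, suggestion) pairs and
# precision@k over each query's ranked suggestion list.
from sklearn.metrics import roc_auc_score

def precision_at_k(ranked_labels, k=3):
    """Fraction of relevant suggestions (label 1) among the top k for one query."""
    top = ranked_labels[:k]
    return sum(top) / len(top) if top else 0.0

# auc = roc_auc_score(all_labels, all_scores)  # labels in {0, 1}, scores from f_sim
```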

  17. AUC Scores

  18. Precision@3

  19. Conclusions • Web-relevance • New term-weighting scheme from keyword extraction • Outperform existing methods on query suggestion • Learning similarity • Fit the application – better suggestion ranking • Learning similarity function vs. learning ranking function • Future work • Experiment with alternative combination methods • Explore other probabilistic models for similarity • Apply our similarity measures to different tasks
