
Learning to Rank --A Brief Review



  1. Learning to Rank --A Brief Review Yunpeng Xu

  2. Ranking and sorting • Ranking: there are only K structured (ordered) categories • Sorting: each sample has a distinct rank • Generally, no need to differentiate them

  3. Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Remaining problems, and what we can do

  4. Ranking aggregation • The need to combine different ranking results • Voting systems, welfare economics, decision making • Example: 1. Hillary Clinton > John Edwards > Barack Obama 2. Barack Obama > John Edwards > Hillary Clinton => ?

  5. Ranking aggregation (cont.) • Arrow’s impossibility theorem • Kenneth Arrow, 1951: if the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all the fairness conditions at once.

  6. Ranking aggregation (cont.) • Arrow’s impossibility theorem • Five fairness assumptions: non-dictatorship, unrestricted domain (universality), independence of irrelevant alternatives, positive association of social and individual values (monotonicity), and non-imposition (citizen sovereignty) • They cannot all be satisfied simultaneously

  7. Ranking aggregation (cont.) • Borda’s method (1770) • Given lists τ_1, …, τ_k, each containing n items • For each item j, define B_i(j) as the number of items ranked below j in τ_i • Rank all items by the total B(j) = Σ_i B_i(j) • On the slide-4 example: Hillary Clinton: 2, John Edwards: 2, Barack Obama: 2 (a tie)
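
A minimal sketch of Borda's method as just described, run on the two candidate lists from slide 4 (variable names are mine):

    from collections import defaultdict

    def borda_scores(ranked_lists):
        # An item at position p (0-based) in a list of n items gets
        # n - 1 - p points: the number of items ranked below it.
        scores = defaultdict(int)
        for lst in ranked_lists:
            n = len(lst)
            for pos, item in enumerate(lst):
                scores[item] += n - 1 - pos
        # Order items by total score, highest first.
        return sorted(scores.items(), key=lambda kv: -kv[1])

    lists = [
        ["Hillary Clinton", "John Edwards", "Barack Obama"],
        ["Barack Obama", "John Edwards", "Hillary Clinton"],
    ]
    print(borda_scores(lists))  # every candidate totals 2: a three-way tie

On these two lists every candidate scores 2, so Borda counting alone cannot break the tie.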

  8. Ranking aggregation (cont.) -- Borda • Condorcet criterion • If the majority prefers x to y, then x must be ranked above y • Borda’s method does not satisfy the Condorcet criterion, nor does any method that assigns fixed weights to rank positions

  9. Ranking aggregation (cont.) • Assumption relaxation • Maximize-consensus criterion • Equivalent to minimizing disagreement (Kemeny’s rule in social choice theory) • NP-hard! • Sub-optimal solutions using heuristics

  10. Ranking aggregation (cont.) • Basic idea: assign different weights to different experts • Supervised aggregation: weight according to a final judge (ground truth) • Unsupervised aggregation: aims to minimize the disagreement measured by certain distances

  11. Ranking aggregation (cont.) • Distance measures • Spearman footrule distance • Kendall tau distance • Kendall tau distance for multiple lists • Scaled footrule distance
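
The slide lists the distances without definitions; below is a sketch of the two most common ones for full rankings over the same item set (the Kendall tau loop is the quadratic textbook version, not an optimized O(n log n) implementation):

    def spearman_footrule(r1, r2):
        # Sum over items of |position in r1 - position in r2|.
        pos2 = {item: i for i, item in enumerate(r2)}
        return sum(abs(i - pos2[item]) for i, item in enumerate(r1))

    def kendall_tau(r1, r2):
        # Number of item pairs that the two rankings order differently.
        pos2 = {item: i for i, item in enumerate(r2)}
        inversions = 0
        for i in range(len(r1)):
            for j in range(i + 1, len(r1)):
                if pos2[r1[i]] > pos2[r1[j]]:  # pair inverted in r2
                    inversions += 1
        return inversions

    print(spearman_footrule(list("abc"), list("cba")))  # 4
    print(kendall_tau(list("abc"), list("cba")))        # 3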

  12. Ranking aggregation (cont.) - Distance Measure • Kemeny optimal ranking • Minimizes total Kendall tau distance • Still NP-hard to compute • Local Kemenization (locally optimal aggregation) • Can be computed in O(kn log n)
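
A hedged sketch of the local Kemenization idea: starting from any initial aggregation, swap adjacent items whenever a strict majority of the input lists prefers the swapped order. The bubble-style loop below is O(kn^2) for clarity (the O(kn log n) bound quoted above comes from an insertion-sort formulation), and it assumes full lists over the same items:

    def votes_for(lists, a, b):
        # Number of input lists that rank a above b.
        return sum(1 for lst in lists if lst.index(a) < lst.index(b))

    def local_kemenize(initial, lists):
        # Swap adjacent items while a strict majority prefers the swap;
        # each swap lowers total disagreement, so the loop terminates.
        ranking = list(initial)
        changed = True
        while changed:
            changed = False
            for i in range(len(ranking) - 1):
                a, b = ranking[i], ranking[i + 1]
                if votes_for(lists, b, a) > votes_for(lists, a, b):
                    ranking[i], ranking[i + 1] = b, a
                    changed = True
        return ranking

    full_lists = [list("abc"), list("acb"), list("bac")]
    print(local_kemenize(list("cba"), full_lists))  # -> ['a', 'b', 'c']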

  13. Ranking aggregation (cont.) • Supervised Rank Aggregation (SRA, WWW 07) • Ground truth: preference matrix H • Goal: rank items by a learned weighted score • The ground-truth constraints can be imposed exactly, or with relaxation

  14. Ranking aggregation (cont.) -- SRA • Method • Use Borda’s score • Objective

  15. Ranking aggregation (cont.) • Markov Chain Rank Aggregation (MCRA, WWW 05) • Map the ranked lists to a Markov chain M • Compute the stationary distribution π of M • Rank items based on π • Example: • B > C > D • A > D > E • A > B > E
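
A sketch of an MC1-style chain as the next slide describes it: add an edge i -> j whenever some input list ranks j above i, give every out-edge of a node uniform probability, and rank by the stationary distribution. The teleport term and self-loops below are additions of this sketch (to guarantee ergodicity), not part of the original method:

    import numpy as np

    def mc_aggregate(lists, alpha=0.95, iters=1000):
        items = sorted({x for lst in lists for x in lst})
        idx = {x: i for i, x in enumerate(items)}
        n = len(items)
        A = np.zeros((n, n))              # adjacency: i -> j if some
        for lst in lists:                 # list ranks j above i
            for lo in range(len(lst)):
                for hi in range(lo):
                    A[idx[lst[lo]], idx[lst[hi]]] = 1.0
        row = A.sum(axis=1, keepdims=True)
        P = np.divide(A, row, out=np.zeros_like(A), where=row > 0)
        for i in range(n):
            if row[i, 0] == 0:
                P[i, i] = 1.0             # never-beaten items loop on themselves
        P = alpha * P + (1 - alpha) / n   # teleport term for ergodicity
        pi = np.full(n, 1.0 / n)
        for _ in range(iters):            # power iteration
            pi = pi @ P
        return sorted(zip(items, pi), key=lambda kv: -kv[1])

    lists = [list("BCD"), list("ADE"), list("ABE")]
    print(mc_aggregate(lists))  # A, never outranked, gets the most mass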

  16. Ranking aggregation (cont.) - MCRA • Different transition strategies • MC1: all out-degree edges have uniform probabilities • MC2: choose a list uniformly, then choose the next item from that list • … • For a disconnected graph, define transition probabilities based on item similarity

  17. Ranking aggregation (cont.) • Unsupervised Learning Algorithm for Rank Aggregation (ULARA; Dan Roth, ECML 07) • Goal: combine rankings without supervision • Method: maximize agreement

  18. Ranking aggregation (cont.) - ULARA • Method • Algorithm: iterative gradient descent • Initially, w is uniform, then updated iteratively
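
A crude sketch of the iterative idea on this slide, not the paper's exact update: it swaps a multiplicative-weights step for ULARA's gradient step, rewarding experts whose rank positions stay close to the current weighted-average ranking. The learning rate and toy data are illustrative:

    import numpy as np

    def ulara_sketch(R, lr=0.1, iters=50):
        # R: (num_experts, num_items) matrix of rank positions (0 = best).
        k, n = R.shape
        w = np.full(k, 1.0 / k)                # start from uniform weights
        for _ in range(iters):
            mu = w @ R                         # weighted mean rank per item
            dis = ((R - mu) ** 2).sum(axis=1)  # each expert's disagreement
            w = w * np.exp(-lr * dis / n)      # shrink disagreeing experts
            w /= w.sum()                       # keep weights on the simplex
        return w, np.argsort(w @ R)            # weights, aggregate order

    R = np.array([[0, 1, 2, 3],   # expert 1's ranks for items 0..3
                  [0, 1, 3, 2],   # expert 2 mostly agrees
                  [3, 2, 1, 0]])  # expert 3 is adversarial
    w, order = ulara_sketch(R)
    print(w, order)  # weight shifts away from the adversarial expert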

  19. Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Remaining problems, and what we can do

  20. Label Ranking • Goal: map from the input space to the set of total orders over a finite set of labels (e.g., Mountain > Sea > Beach) • Related to multi-label and multi-class problems • Example: input: customer information; output: Porsche > Toyota > Ford

  21. Label Ranking (cont.) • Pairwise ranking (ECML 03) • Train a classifier for each pair of labels (λi, λj) • When judging an example x: if the classifier predicts λi > λj, count it as a vote for λi • Then rank all labels according to their votes • Total: k(k-1)/2 classifiers
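
A sketch of the pairwise scheme, with scikit-learn's LogisticRegression standing in for an arbitrary base classifier (an assumption of this sketch, as are the toy data):

    from itertools import combinations
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def train_pairwise(X, rankings, labels):
        # rankings[i] lists the labels for sample i, best to worst.
        models = {}
        for a, b in combinations(labels, 2):
            y = [r.index(a) < r.index(b) for r in rankings]  # 1 if a beats b
            models[(a, b)] = LogisticRegression().fit(X, y)
        return models

    def predict_ranking(models, x, labels):
        votes = {l: 0 for l in labels}
        for (a, b), m in models.items():
            winner = a if m.predict(x.reshape(1, -1))[0] else b
            votes[winner] += 1                 # one vote per pairwise duel
        return sorted(labels, key=lambda l: -votes[l])

    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    rankings = [["A", "B", "C"], ["A", "C", "B"],
                ["B", "C", "A"], ["C", "B", "A"]]
    models = train_pairwise(X, rankings, ["A", "B", "C"])
    print(predict_ranking(models, np.array([0.5]), ["A", "B", "C"]))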

  22. Label Ranking (cont.) • Constraint Classification (NIPS 02) • Consider a linear sorting function: rank all labels by the scores w_i · x • Goal: learn the weight vectors w_1, …, w_k

  23. Label Ranking (cont.) -- CC • Expand the feature vector x into R^{kd} • Generate positive/negative samples in the expanded space

  24. Label Ranking (cont.) -- CC • Learn a separating hyperplane in the expanded space • Can be solved by SVM
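
A sketch of the constraint-classification reduction from slides 22-24: each preference "label i > label j" on sample x becomes a point in R^{kd} with x in block i and -x in block j, and one linear separator over the expanded space encodes all k weight vectors. LinearSVC and the toy preferences are assumptions of this sketch:

    import numpy as np
    from sklearn.svm import LinearSVC

    def expand(x, i, j, k):
        # Place x in block i and -x in block j of a k*d vector.
        d = len(x)
        v = np.zeros(k * d)
        v[i * d:(i + 1) * d] = x
        v[j * d:(j + 1) * d] = -x
        return v

    def train_cc(X, prefs, k):
        # prefs: list of (sample_index, preferred_label, other_label).
        pts, ys = [], []
        for idx, i, j in prefs:
            v = expand(X[idx], i, j, k)
            pts.extend([v, -v])    # each constraint and its negation
            ys.extend([1, -1])
        clf = LinearSVC(fit_intercept=False).fit(np.array(pts), ys)
        return clf.coef_.reshape(k, -1)   # row i is the weight vector w_i

    def rank_labels(W, x):
        return np.argsort(-(W @ x))       # labels ordered by score w_i . x

    X = np.array([[1.0, 0.0], [0.0, 1.0]])
    prefs = [(0, 0, 1), (0, 0, 2), (1, 2, 0)]  # hypothetical preferences
    W = train_cc(X, prefs, k=3)
    print(rank_labels(W, X[0]))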

  25. Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Remaining problems, and what we can do

  26. Query and rank by example • Given one query, rank retrieved items according to their relevance w.r.t. the query.

  27. Query and rank by example (cont.) • Rank on manifold • Closed form at convergence • Essentially, this is a one-class semi-supervised method
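
A sketch of the closed form usually associated with ranking on a manifold, f* = (I - αS)^{-1} y with a symmetrically normalized affinity matrix S and the query as the only labeled point; the RBF affinity and the value of α are assumptions of this sketch:

    import numpy as np

    def manifold_rank(X, query_idx, alpha=0.9, sigma=1.0):
        # Pairwise RBF affinities, zero diagonal.
        sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-sq / (2 * sigma ** 2))
        np.fill_diagonal(W, 0.0)
        d = W.sum(axis=1)
        S = W / np.sqrt(np.outer(d, d))       # symmetric normalization
        y = np.zeros(len(X))
        y[query_idx] = 1.0                    # the query is the only label
        f = np.linalg.solve(np.eye(len(X)) - alpha * S, y)
        return np.argsort(-f)                 # items by relevance to query

    X = np.random.RandomState(0).randn(10, 2)
    print(manifold_rank(X, query_idx=0))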

  28. Preference learning • Given a set of items and a set of user preferences over these items, rank all items according to those preferences. • Motivated by the need for personalized search.

  29. Preference learning (cont.) • Input: preferences, i.e., a set of partial orders on X • Output: a total order on X; or, a map from X onto a structured label space Y • Preference function

  30. Existing methods • Learning to order things [W. Cohen 98] • Large margin ordinal regression [R. Herbrich 98] • PRanking with Ranking [K. Crammer 01] • Optimizing Search Engines using Clickthrough Data [T. Joachims 02] • Efficient boosting algorithm for combining preferences [Y. Freund 03] • Classification Approach towards Ranking and Sorting Problems [S. Rajaram 03]

  31. Existing methods • Learning to Rank using Gradient Descent [C. Burges 05] • Stability and Generalization of Bipartite Ranking [S. Agarwal 05] • Generalization Bounds for k-Partite Ranking [S. Rajaram 05] • Ranking with a p-norm push [C. Rudin 05] • Magnitude-Preserving Ranking Algorithms [C. Cortes 07] • From Pairwise Approach to Listwise Approach [Z. Cao 07]

  32. Large Margin Ordinal Regression • Map samples onto an axis using the inner product w · x

  33. Large Margin Ordinal Regression • Consider pairs where x_i is ranked above x_j • Then require w · x_i > w · x_j, i.e., w · (x_i - x_j) > 0 • Introduce soft margins • Solve using SVM
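
A sketch of the pairwise reduction behind the large-margin approach: every ordered pair (x_i above x_j) contributes the difference vector x_i - x_j as a positive example (and its negation as a negative one), and a standard soft-margin linear SVM learns w; items are then sorted by w · x. LinearSVC stands in for the paper's solver:

    import numpy as np
    from sklearn.svm import LinearSVC

    def fit_ranker(X, ranks):
        # ranks[i] is the (higher-is-better) rank of sample i.
        diffs, ys = [], []
        for i in range(len(X)):
            for j in range(len(X)):
                if ranks[i] > ranks[j]:
                    diffs.append(X[i] - X[j]); ys.append(1)
                    diffs.append(X[j] - X[i]); ys.append(-1)
        return LinearSVC(fit_intercept=False).fit(np.array(diffs), ys).coef_[0]

    X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 0.5]])
    w = fit_ranker(X, ranks=[0, 1, 2])
    print(X @ w)   # projections onto the learned axis, roughly increasing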

  34. Learn to order things • A greedy ordering algorithm: calculate a score for each item, then order items by score

  35. Learn to order things (cont.) • Combine different ranking functions • Learn the weights iteratively

  36. Learn to order things (cont.) • Combine preference functions • Do ranking aggregation • Update weights based on feedback

  37. Learn to order things (cont.) • Initially, w is uniform • At each step: • Compute a combined ranking function • Produce a ranking aggregation • Measure the loss and update w
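
A minimal sketch of a Hedge-style multiplicative update matching the loop above; β and the loss values are illustrative:

    def hedge_update(weights, losses, beta=0.8):
        # losses[i] in [0, 1]: disagreement of expert i with the feedback.
        new = [w * beta ** l for w, l in zip(weights, losses)]
        z = sum(new)
        return [w / z for w in new]   # renormalize to a distribution

    w = [1 / 3] * 3
    for losses in [[0.0, 0.5, 1.0], [0.1, 0.4, 0.9]]:  # two feedback rounds
        w = hedge_update(w, losses)
    print(w)   # mass shifts toward the expert with the smallest loss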

  38. RankBoost • Bipartite ranking problems • Combine weak rankers • Sort based on values of H(x)

  39. RankBoost (cont.) • Bipartite ranking problem • Initialize the sampling distribution over pairs • Repeat: learn a weak ranker, then update and normalize the sampling distribution • Finally, combine the weak rankers
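
A compact sketch of bipartite RankBoost with threshold-stump weak rankers; the exhaustive stump search and the clipping guard are simplifications of this sketch, not the paper's weak-learner interface:

    import numpy as np

    def rankboost(Xpos, Xneg, T=20):
        m, n = len(Xpos), len(Xneg)
        D = np.full((m, n), 1.0 / (m * n))      # weights over (pos, neg) pairs
        stumps = []
        for _ in range(T):
            best = None
            for f in range(Xpos.shape[1]):      # stump h(x) = 1[x_f > t]
                for t in np.unique(np.r_[Xpos[:, f], Xneg[:, f]]):
                    hp = (Xpos[:, f] > t).astype(float)
                    hn = (Xneg[:, f] > t).astype(float)
                    r = (D * (hp[:, None] - hn[None, :])).sum()
                    if best is None or abs(r) > abs(best[0]):
                        best = (r, f, t)
            r, f, t = best
            r = np.clip(r, -0.999, 0.999)       # guard against perfect stumps
            alpha = 0.5 * np.log((1 + r) / (1 - r))
            stumps.append((alpha, f, t))
            hp = (Xpos[:, f] > t).astype(float)
            hn = (Xneg[:, f] > t).astype(float)
            D *= np.exp(-alpha * (hp[:, None] - hn[None, :]))
            D /= D.sum()                        # renormalize the distribution
        def H(X):
            return sum(a * (X[:, f] > t) for a, f, t in stumps)
        return H

    rng = np.random.RandomState(0)
    H = rankboost(rng.randn(20, 3) + 1.0, rng.randn(20, 3) - 1.0)
    print(H(np.array([[2.0, 2.0, 2.0], [-2.0, -2.0, -2.0]])))  # pos scores higher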

  40. Stability and Generalization • Bipartite ranking problems • Expected rank error • Empirical rank error
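
The slide leaves both errors implicit; in the usual bipartite formulation (m positives, n negatives, ties counted half) they can be written as:

    R(f) = \Pr_{x^+ \sim D_+,\, x^- \sim D_-}\big[ f(x^+) < f(x^-) \big]
           + \tfrac{1}{2}\,\Pr\big[ f(x^+) = f(x^-) \big]

    \hat{R}(f; S) = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n}
           \Big( \mathbf{1}\big[ f(x_i^+) < f(x_j^-) \big]
           + \tfrac{1}{2}\, \mathbf{1}\big[ f(x_i^+) = f(x_j^-) \big] \Big)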

  41. Stability and Generalization (cont.) • Stability • Remove one training sample; how much does the output change? • Generalization • Generalizes to the k-partite ranking problem…

  42. Rank on graph data • Objective
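
The objective is elided on the slide; one standard form for ranking on graph data, assumed here, is a graph-Laplacian smoothness term traded off against fit to the labels y:

    Q(f) = \frac{1}{2} \sum_{i,j} W_{ij}
           \left( \frac{f_i}{\sqrt{d_i}} - \frac{f_j}{\sqrt{d_j}} \right)^{2}
           + \mu \sum_{i} \left( f_i - y_i \right)^{2}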

  43. P-norm push • Focus on the topmost ranked items • The top-left region (of the ROC curve) is the most important

  44. P-norm push (cont.) • Height of k (k is a negative sample): the number of positive samples ranked below k • Cost of sample k: g(Height(k)), where g is convex and monotonically increasing
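
Written out, the cost the slide describes is, taking g(z) = z^p (the choice that gives the method its name; large p concentrates the penalty on negatives near the top of the list):

    \mathrm{Height}(k) = \sum_{i} \mathbf{1}\big[ f(x_i^+) \le f(x_k^-) \big],
    \qquad
    R_p(f) = \sum_{k} g\big( \mathrm{Height}(k) \big)
           = \sum_{k} \Big( \sum_{i} \mathbf{1}\big[ f(x_i^+) \le f(x_k^-) \big] \Big)^{p}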

  45. P-norm push (cont.) • Run RankBoost to solve the problem

  46. Thanks!
