Learning to Rank -- A Brief Review Yunpeng Xu
Ranking and sorting • Ranking: samples fall into one of K ordered categories • Sorting: each sample has a distinct rank • Generally, no need to differentiate between them
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Ranking aggregation • The need to combine different ranking results • Arises in voting systems, welfare economics, decision making 1. Hillary Clinton > John Edwards > Barack Obama 2. Barack Obama > John Edwards > Hillary Clinton => ?
Ranking aggregation (cont.) • Arrow’s impossibility theorem • Kenneth Arrow, 1951 If the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all these conditions at once.
Ranking aggregation (cont.) • Arrow’s impossibility theorem • Five fairness conditions • non-dictatorship, unrestricted domain (universality), independence of irrelevant alternatives, positive association of social and individual values (monotonicity), non-imposition (citizen sovereignty) • These cannot all be satisfied simultaneously
Ranking aggregation (cont.) • Borda’s method (1770) • Given k ranked lists, each over the same n items • For each item j, define its score in a list as the number of items ranked below j in that list • Rank all items by their total score across lists • On the two lists above, every candidate ties: Hillary Clinton: 2, John Edwards: 2, Barack Obama: 2
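Borda's method fits in a few lines; `borda_scores` is an illustrative helper, not code from the slides:

```python
from collections import defaultdict

def borda_scores(lists):
    """Borda score of item j in one list = number of items ranked below it.
    The total score sums over all lists; ties are possible."""
    scores = defaultdict(int)
    for ranking in lists:                  # ranking: best item first
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - 1 - pos    # count of items ranked below
    return dict(scores)

# The two lists from the previous slide give a three-way tie:
lists = [["Clinton", "Edwards", "Obama"],
         ["Obama", "Edwards", "Clinton"]]
print(borda_scores(lists))  # {'Clinton': 2, 'Edwards': 2, 'Obama': 2}
```

The tie illustrates why position-weight schemes can fail to discriminate between opposing rankings.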
Ranking aggregation (cont.) -- Borda • Condorcet criterion • If the majority prefers x to y, then x must be ranked above y • Borda’s method does not satisfy the Condorcet criterion, nor does any method that assigns fixed weights to rank positions
Ranking aggregation (cont.) • Assumption relaxation • Maximize-consensus criterion • Equivalent to minimizing disagreement (Kemeny, social choice theory) • NP-hard! • Sub-optimal solutions via heuristics
Ranking aggregation (cont.) • Basic idea • Assign different weights to different experts • Supervised aggregation • Weight experts according to a final judge (ground truth) • Unsupervised aggregation • Aims to minimize the disagreement measured by certain distances
Ranking aggregation (cont.) • Distance measures • Spearman footrule distance • Kendall tau distance • Kendall tau distance for multiple lists • Scaled footrule distance
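The first two distances can be sketched directly (hypothetical helper names; by the Diaconis-Graham inequality the footrule is always within a factor of two of the Kendall tau distance):

```python
from itertools import combinations

def kendall_tau_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

def spearman_footrule(r1, r2):
    """Sum over items of |position in r1 - position in r2|."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(abs(i - pos2[item]) for i, item in enumerate(r1))

r1 = ["A", "B", "C"]
r2 = ["C", "B", "A"]
print(kendall_tau_distance(r1, r2))  # 3 (every pair is reversed)
print(spearman_footrule(r1, r2))     # 4 = |0-2| + |1-1| + |2-0|
```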
Ranking aggregation (cont.) - Distance Measures • Kemeny optimal ranking • Minimizes the Kendall tau distance • Still NP-hard to compute • Local Kemenization (locally optimal aggregation) • Can be computed in O(kn log n)
Ranking aggregation (cont.) • Supervised Rank Aggregation (SRA, WWW 07) • Ground truth: preference matrix H • Goal: rank items by the aggregated score • The weights must be consistent with H, exactly or with relaxation
Ranking aggregation (cont.) -- SRA • Method • Use the Borda scores of each input list • Objective: fit the weighted scores to the ground-truth preferences
Ranking aggregation (cont.) • Markov Chain Rank Aggregation (MCRA, WWW 05) • Map the ranked lists to a Markov chain M • Compute the stationary distribution of M • Rank items by their stationary probability • Example: • B > C > D • A > D > E • A > B > E
Ranking aggregation (cont.) - MCRA • Different transition strategies • MC1: all outgoing edges have uniform probability • MC2: choose a list uniformly, then move to the next item on that list • … • For a disconnected graph, define transition probabilities based on item similarity
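A simplified MC1-style chain can be sketched as follows; the transition construction is an assumption for illustration (jump uniformly to any item ranked strictly above the current one in some list, with a small smoothing term to keep the chain ergodic), not the paper's exact definition:

```python
import numpy as np

def mc_rank(lists, eps=0.05, iters=1000):
    """Markov-chain rank aggregation sketch: better items accumulate
    stationary probability mass, so ranking by that mass aggregates
    the input lists."""
    items = sorted({x for lst in lists for x in lst})
    idx = {x: k for k, x in enumerate(items)}
    n = len(items)
    above = {x: set() for x in items}
    for lst in lists:
        for i, x in enumerate(lst):
            above[x].update(lst[:i])          # items beating x in this list
    P = np.full((n, n), eps / n)              # uniform smoothing
    for x in items:
        targets = above[x] or {x}             # overall winners self-loop
        for y in targets:
            P[idx[x], idx[y]] += (1 - eps) / len(targets)
    pi = np.full(n, 1.0 / n)                  # power iteration
    for _ in range(iters):
        pi = pi @ P
    return sorted(items, key=lambda x: -pi[idx[x]])

# The example lists from the slide: A is never beaten, so it ends on top.
lists = [["B", "C", "D"], ["A", "D", "E"], ["A", "B", "E"]]
print(mc_rank(lists))
```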
Ranking aggregation (cont.) • Unsupervised Learning Algorithm for Rank Aggregation (ULARA; Dan Roth, ECML 07) • Goal: learn a weight for each input ranking • Method: maximize agreement among the weighted rankings
Ranking aggregation (cont.) - ULARA • Method • Algorithm: iterative gradient descent • Initially, w is uniform, then updated iteratively
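The iterate-reweight-normalize loop can be sketched as below; the exact ULARA update rule differs, and the exponential down-weighting here is an assumption chosen only to illustrate the idea that experts agreeing with the consensus gain weight:

```python
import numpy as np

def weighted_aggregate(rank_matrix, steps=50, lr=0.1):
    """Unsupervised weight learning sketch.
    rank_matrix[e, i] = position that expert e assigns to item i.
    Experts far from the current weighted consensus lose weight."""
    n_experts, n_items = rank_matrix.shape
    w = np.full(n_experts, 1.0 / n_experts)
    for _ in range(steps):
        consensus = w @ rank_matrix                        # weighted mean position
        dis = np.abs(rank_matrix - consensus).sum(axis=1)  # per-expert disagreement
        w *= np.exp(-lr * dis)                             # down-weight outliers
        w /= w.sum()                                       # normalize
    order = np.argsort(w @ rank_matrix)                    # lowest position first
    return w, order

# Three experts over four items; expert 2 is an outlier and should
# receive the smallest weight, leaving item 0 ranked first.
R = np.array([[0, 1, 2, 3],
              [0, 1, 3, 2],
              [3, 2, 1, 0]])
w, order = weighted_aggregate(R)
print(w, order)
```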
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Label Ranking • Goal: map from the input space to the set of total orders over a finite set of labels • Related to multi-label and multi-class problems • Example: Input: customer information; Output: Porsche > Toyota > Ford (or: Mountain > Sea > Beach)
Label Ranking (cont.) • Pairwise ranking (ECML 03) • Train a binary classifier for each pair of labels • To judge an example: if the classifier for a pair predicts one label over the other, count it as a vote for that label; then rank all labels by their vote counts • Requires k(k-1)/2 classifiers in total
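The voting step can be sketched as follows; `pairwise_pred` stands in for the k(k-1)/2 trained classifiers and its toy decision rule is a made-up assumption for illustration:

```python
from itertools import combinations
from collections import Counter

def rank_by_pairwise_votes(labels, pairwise_pred, x):
    """Pairwise label ranking: each of the k*(k-1)/2 classifiers casts
    one vote; labels are sorted by vote count."""
    votes = Counter({l: 0 for l in labels})
    for a, b in combinations(labels, 2):
        votes[pairwise_pred(a, b, x)] += 1
    return [l for l, _ in votes.most_common()]

# Toy stand-in for the trained classifiers: prefer the label whose
# (hypothetical) preference value is closer to the feature value x.
labels = ["Porsche", "Toyota", "Ford"]
pref = {"Porsche": 0.9, "Toyota": 0.5, "Ford": 0.1}
pairwise_pred = lambda a, b, x: a if abs(pref[a] - x) < abs(pref[b] - x) else b
print(rank_by_pairwise_votes(labels, pairwise_pred, x=0.8))
# → ['Porsche', 'Toyota', 'Ford']
```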
Label Ranking (cont.) • Constraint Classification (NIPS 02) • Consider a linear sorting function: score each label i by an inner product of a label-specific weight vector with x • Goal: learn the weight vectors; rank all labels by their scores
Label Ranking (cont.) -- CC • Expand the feature vector into a space with one block per label • Generate positive/negative samples in the expanded space
Label Ranking (cont.) -- CC • Learn a separating hyperplane in the expanded space • Can be solved with an SVM
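The expansion can be sketched with a Kesler-style construction; a plain perceptron replaces the SVM here to keep the sketch dependency-free, so this illustrates the reduction rather than the paper's exact solver:

```python
import numpy as np

def expand(x, i, j, k):
    """Embed x positively in block i and negatively in block j of a
    k*d-dimensional vector, encoding the constraint w_i.x > w_j.x."""
    d = len(x)
    z = np.zeros(k * d)
    z[i*d:(i+1)*d] = x
    z[j*d:(j+1)*d] = -x
    return z

def train_constraint_classifier(samples, k, epochs=100):
    """samples: (x, preferred_label, other_label) constraints.
    One separating hyperplane in the expanded space yields one weight
    block per label; ranking a new x = sorting labels by w_i . x."""
    d = len(samples[0][0])
    w = np.zeros(k * d)
    for _ in range(epochs):
        for x, i, j in samples:
            z = expand(np.asarray(x), i, j, k)
            if w @ z <= 0:              # constraint violated
                w += z                  # perceptron update
    return w.reshape(k, d)

# Toy data in R^2 with 3 labels: label 0 preferred when x[0] is large.
samples = [([1.0, 0.0], 0, 1), ([1.0, 0.1], 0, 2),
           ([0.0, 1.0], 1, 0), ([0.1, 1.0], 1, 2)]
W = train_constraint_classifier(samples, k=3)
scores = W @ np.array([1.0, 0.0])
print(np.argsort(-scores))   # label 0 ranked first for this input
```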
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Query and rank by example • Given one query, rank retrieved items according to their relevance w.r.t. the query.
Query and rank by example (cont.) • Ranking on a manifold • Iterate to convergence; a closed-form solution exists at convergence • Essentially a one-class semi-supervised method
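The manifold-ranking iteration can be sketched as below (Zhou et al. style: f <- alpha*S*f + (1-alpha)*y with S the symmetrically normalized affinity matrix and y marking the query; the toy chain graph is an assumption for illustration):

```python
import numpy as np

def manifold_rank(W, y, alpha=0.9, iters=200):
    """Iterative manifold ranking: spread the query's relevance over
    the affinity graph. At convergence this equals the closed form
    f* = (1 - alpha) * inv(I - alpha * S) @ y."""
    d = W.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    S = Dinv @ W @ Dinv                  # symmetric normalization
    f = np.zeros_like(y, dtype=float)
    for _ in range(iters):
        f = alpha * S @ f + (1 - alpha) * y
    return f

# Chain graph 0-1-2-3 with the query at node 0:
# relevance decays with graph distance, giving the order 0,1,2,3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 0, 0, 0])
f = manifold_rank(W, y)
print(np.argsort(-f))   # [0 1 2 3]
```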
Preference learning • Given a set of items and a set of user preferences over these items, rank all items according to the user's preferences. • Motivated by the need for personalized search.
Preference learning • Input: preferences: a set of partial orders on X Output: a total order on X or, a map from X onto a structured label space Y • Preference function
Existing methods • Learning to order things [W. Cohen 98] • Large margin ordinal regression [R. Herbrich 98] • PRanking with Ranking [K. Crammer 01] • Optimizing Search Engines using Clickthrough Data [T. Joachims 02] • Efficient boosting algorithm for combining preferences [Yoav Freund 03] • Classification Approach towards Ranking and Sorting Problems [S. Rajaram 03]
Existing methods • Learning to Rank using Gradient Descent [C. Burges 05] • Stability and Generalization of Bipartite Ranking [S. Agarwal 05] • Generalization Bounds for k-Partite Ranking [S. Rajaram 05] • Ranking with a p-norm push [C. Rudin 05] • Magnitude-Preserving Ranking Algorithms [C. Cortes 07] • From Pairwise Approach to Listwise Approach [Z. Cao 07]
Large Margin Ordinal Regression • Map each sample to an axis using an inner product with a weight vector; ranks correspond to intervals on that axis
Large Margin Ordinal Regression • Consider a pair of samples with rank(xi) > rank(xj) • Then the projection must satisfy w . xi > w . xj • Introduce soft margins for violated pairs • Solve using an SVM
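The pairwise reduction can be sketched as follows; a perceptron on difference vectors replaces the soft-margin SVM of the original work, so this is a dependency-free illustration of the constraint w . (xi - xj) > 0, not the paper's solver:

```python
import numpy as np

def fit_pairwise_ranker(X, ranks, epochs=100):
    """Each pair with ranks[i] > ranks[j] yields the constraint
    w . (X[i] - X[j]) > 0; a perceptron enforces the constraints."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in range(n):
            for j in range(n):
                if ranks[i] > ranks[j]:
                    diff = X[i] - X[j]
                    if w @ diff <= 0:   # violated pair
                        w += diff
    return w

# Items whose rank grows with the first feature.
X = np.array([[0.1, 1.0], [0.4, 0.2], [0.9, 0.5]])
ranks = np.array([1, 2, 3])
w = fit_pairwise_ranker(X, ranks)
print(np.argsort(-(X @ w)))   # [2 1 0]: recovers the rank order
```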
Learn to order things • A greedy ordering algorithm: calculate a score for each item from the preference graph and emit items in score order
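The greedy ordering can be sketched as below, assuming the score of an item is its outgoing preference weight minus its incoming weight among the items not yet placed (`pref` and `greedy_order` are illustrative names):

```python
def greedy_order(items, pref):
    """Greedy ordering sketch: repeatedly emit the item with the
    largest (out-weight - in-weight) and remove it from the graph.
    pref[(a, b)] = weight of the preference 'a before b'."""
    remaining = set(items)
    order = []
    while remaining:
        def score(v):
            out_w = sum(w for (a, b), w in pref.items()
                        if a == v and b in remaining)
            in_w = sum(w for (a, b), w in pref.items()
                       if b == v and a in remaining)
            return out_w - in_w
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

pref = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 0.5}
print(greedy_order(["a", "b", "c"], pref))  # ['a', 'b', 'c']
```

The greedy pass runs in polynomial time, which matters because finding the ordering with maximum total agreement is NP-hard.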
Learn to order things (cont.) • Combine different ranking functions • Learn the combination weights iteratively
Learn to order things • Combine preference functions • Perform ranking aggregation • Update weights based on feedback
Initially, w is uniform • At each step • Compute a combined ranking function • Produce a ranking aggregation • Measure the loss and update w
RankBoost • Bipartite ranking problems • Combine weak rankers • Sort items by the value of H(x)
RankBoost (cont.) • Bipartite ranking problem • Initialize the sampling distribution over pairs • Each round: learn a weak ranker, update the sampling distribution, normalize • Combine the weak rankers into the final ranker H
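A minimal bipartite RankBoost sketch follows, assuming threshold weak rankers on single features; the weighting alpha = 0.5*ln((1+r)/(1-r)) and the pair re-weighting follow the standard RankBoost recipe, simplified for illustration:

```python
import numpy as np

def rankboost(X, y, rounds=10):
    """Bipartite RankBoost sketch: keep a distribution D over
    (negative, positive) pairs, pick the weak ranker with the best
    weighted pair-ordering score r, and up-weight mis-ordered pairs."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    pairs = [(i, j) for i in neg for j in pos]   # want score(j) > score(i)
    D = np.full(len(pairs), 1.0 / len(pairs))
    H = np.zeros(len(y))
    for _ in range(rounds):
        best = None
        for f in range(X.shape[1]):              # weak rankers: h = [x_f > t]
            for t in np.unique(X[:, f]):
                h = (X[:, f] > t).astype(float)
                r = sum(d * (h[j] - h[i]) for d, (i, j) in zip(D, pairs))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, h)
        r, h = best
        r = np.clip(r, -0.999, 0.999)            # avoid log of zero
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        H += alpha * h
        D *= np.array([np.exp(alpha * (h[i] - h[j])) for i, j in pairs])
        D /= D.sum()                             # normalization
    return H                                     # rank items by H, descending

# Toy bipartite data: positives have larger feature values.
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])
H = rankboost(X, y)
print(np.argsort(-H))   # positives (items 2, 3) ranked above negatives
```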
Stability and Generalization • Bipartite ranking problems • Expected rank error • Empirical rank error
Stability and Generalization (cont.) • Stability • Remove one training sample: how much does the learned ranking change? • Generalization • Generalizes to the k-partite ranking problem…
Rank on graph data • Objective
P-norm push • Focus on the topmost ranked items • The top left region is the most important
P-norm push (cont.) • Height of a negative example k: the number of positive examples ranked below k • Cost of example k: g(height(k)), where g is convex and monotonically increasing (the p-norm push uses g(r) = r^p)
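The height-based cost can be sketched directly; `pnorm_push_objective` is an illustrative helper, and the example shows why a large p punishes a single negative near the top more than scattered low-ranked mistakes:

```python
def pnorm_push_objective(scores, labels, p=4):
    """The price of a negative example grows with its height: the
    number of positives it outranks. g(r) = r**p is convex and
    increasing, so large p concentrates the penalty at the top."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    obj = 0
    for s, l in zip(scores, labels):
        if l == 0:
            height = sum(1 for sp in pos if sp < s)  # positives below
            obj += height ** p
    return obj

# Two score functions with the same pairwise error (2 mis-ordered
# pairs each), but very different p-norm push cost:
labels = [1, 1, 0, 0]
print(pnorm_push_objective([0.9, 0.8, 0.7, 0.6], labels))  # 0: no errors
print(pnorm_push_objective([0.9, 0.5, 0.7, 0.6], labels))  # 1 + 1 = 2
print(pnorm_push_objective([0.9, 0.2, 0.95, 0.1], labels)) # 2**4 = 16
```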
P-norm push (cont.) • Run RankBoost to solve the resulting objective