Learning to Rank -- A Brief Review Yunpeng Xu
Ranking and sorting • Ranking: samples fall into one of K ordered categories • Sorting: each sample has a distinct rank • Generally, no need to differentiate between them
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Ranking aggregation • The need to combine different ranking results • Arises in voting systems, welfare economics, decision making 1. Hillary Clinton > John Edwards > Barack Obama 2. Barack Obama > John Edwards > Hillary Clinton => ?
Ranking aggregation (cont.) • Arrow’s impossibility theorem • Kenneth Arrow, 1951 If the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all these conditions at once.
Ranking aggregation (cont.) • Arrow’s impossibility theorem • Five fairness conditions • non-dictatorship, unrestricted domain (universality), independence of irrelevant alternatives, positive association of social and individual values (monotonicity), non-imposition (citizen sovereignty) • These cannot all be satisfied simultaneously
Ranking aggregation (cont.) • Borda’s method (1770) • Given k ranked lists, each over the same n items • For each item j, define its score in a list as the number of items ranked below j in that list • Rank all items by their total score across lists • On the two lists above, every candidate ties: Hillary Clinton: 2, John Edwards: 2, Barack Obama: 2
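Borda's method fits in a few lines; `borda_scores` is an illustrative helper, not code from the slides:

```python
from collections import defaultdict

def borda_scores(lists):
    """Borda score of item j in one list = number of items ranked below it.
    The total score sums over all lists; ties are possible."""
    scores = defaultdict(int)
    for ranking in lists:                  # ranking: best item first
        n = len(ranking)
        for pos, item in enumerate(ranking):
            scores[item] += n - 1 - pos    # count of items ranked below
    return dict(scores)

# The two lists from the previous slide give a three-way tie:
lists = [["Clinton", "Edwards", "Obama"],
         ["Obama", "Edwards", "Clinton"]]
print(borda_scores(lists))  # {'Clinton': 2, 'Edwards': 2, 'Obama': 2}
```

The tie illustrates why position-weight schemes can fail to discriminate between opposing rankings.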
Ranking aggregation (cont.) -- Borda • Condorcet criterion • If the majority prefers x to y, then x must be ranked above y • Borda’s method does not satisfy the Condorcet criterion, nor does any method that assigns fixed weights to rank positions
Ranking aggregation (cont.) • Assumption relaxation • Maximize-consensus criterion • Equivalent to minimizing disagreement (Kemeny, social choice theory) • NP-hard! • Sub-optimal solutions via heuristics
Ranking aggregation (cont.) • Basic idea • Assign different weights to different experts • Supervised aggregation • Weight experts according to a final judge (ground truth) • Unsupervised aggregation • Aims to minimize the disagreement measured by certain distances
Ranking aggregation (cont.) • Distance measures • Spearman footrule distance • Kendall tau distance • Kendall tau distance for multiple lists • Scaled footrule distance
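The first two distances can be sketched directly (hypothetical helper names; by the Diaconis-Graham inequality the footrule is always within a factor of two of the Kendall tau distance):

```python
from itertools import combinations

def kendall_tau_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0)

def spearman_footrule(r1, r2):
    """Sum over items of |position in r1 - position in r2|."""
    pos2 = {item: i for i, item in enumerate(r2)}
    return sum(abs(i - pos2[item]) for i, item in enumerate(r1))

r1 = ["A", "B", "C"]
r2 = ["C", "B", "A"]
print(kendall_tau_distance(r1, r2))  # 3 (every pair is reversed)
print(spearman_footrule(r1, r2))     # 4 = |0-2| + |1-1| + |2-0|
```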
Ranking aggregation (cont.) - Distance Measures • Kemeny optimal ranking • Minimizes the Kendall tau distance • Still NP-hard to compute • Local Kemenization (locally optimal aggregation) • Can be computed in O(kn log n)
Ranking aggregation (cont.) • Supervised Rank Aggregation (SRA, WWW 07) • Ground truth: preference matrix H • Goal: rank items by the aggregated score • The weights must be consistent with H, exactly or with relaxation
Ranking aggregation (cont.) -- SRA • Method • Use the Borda scores of each input list • Objective: fit the weighted scores to the ground-truth preferences
Ranking aggregation (cont.) • Markov Chain Rank Aggregation (MCRA, WWW 05) • Map the ranked lists to a Markov chain M • Compute the stationary distribution of M • Rank items by their stationary probability • Example: • B > C > D • A > D > E • A > B > E
Ranking aggregation (cont.) - MCRA • Different transition strategies • MC1: all outgoing edges have uniform probability • MC2: choose a list uniformly, then move to the next item on that list • … • For a disconnected graph, define transition probabilities based on item similarity
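A simplified MC1-style chain can be sketched as follows; the transition construction is an assumption for illustration (jump uniformly to any item ranked strictly above the current one in some list, with a small smoothing term to keep the chain ergodic), not the paper's exact definition:

```python
import numpy as np

def mc_rank(lists, eps=0.05, iters=1000):
    """Markov-chain rank aggregation sketch: better items accumulate
    stationary probability mass, so ranking by that mass aggregates
    the input lists."""
    items = sorted({x for lst in lists for x in lst})
    idx = {x: k for k, x in enumerate(items)}
    n = len(items)
    above = {x: set() for x in items}
    for lst in lists:
        for i, x in enumerate(lst):
            above[x].update(lst[:i])          # items beating x in this list
    P = np.full((n, n), eps / n)              # uniform smoothing
    for x in items:
        targets = above[x] or {x}             # overall winners self-loop
        for y in targets:
            P[idx[x], idx[y]] += (1 - eps) / len(targets)
    pi = np.full(n, 1.0 / n)                  # power iteration
    for _ in range(iters):
        pi = pi @ P
    return sorted(items, key=lambda x: -pi[idx[x]])

# The example lists from the slide: A is never beaten, so it ends on top.
lists = [["B", "C", "D"], ["A", "D", "E"], ["A", "B", "E"]]
print(mc_rank(lists))
```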
Ranking aggregation (cont.) • Unsupervised Learning Algorithm for Rank Aggregation (ULARA; Dan Roth, ECML 07) • Goal: learn a weight for each input ranking • Method: maximize agreement among the weighted rankings
Ranking aggregation (cont.) - ULARA • Method • Algorithm: iterative gradient descent • Initially, w is uniform, then updated iteratively
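The iterate-reweight-normalize loop can be sketched as below; the exact ULARA update rule differs, and the exponential down-weighting here is an assumption chosen only to illustrate the idea that experts agreeing with the consensus gain weight:

```python
import numpy as np

def weighted_aggregate(rank_matrix, steps=50, lr=0.1):
    """Unsupervised weight learning sketch.
    rank_matrix[e, i] = position that expert e assigns to item i.
    Experts far from the current weighted consensus lose weight."""
    n_experts, n_items = rank_matrix.shape
    w = np.full(n_experts, 1.0 / n_experts)
    for _ in range(steps):
        consensus = w @ rank_matrix                        # weighted mean position
        dis = np.abs(rank_matrix - consensus).sum(axis=1)  # per-expert disagreement
        w *= np.exp(-lr * dis)                             # down-weight outliers
        w /= w.sum()                                       # normalize
    order = np.argsort(w @ rank_matrix)                    # lowest position first
    return w, order

# Three experts over four items; expert 2 is an outlier and should
# receive the smallest weight, leaving item 0 ranked first.
R = np.array([[0, 1, 2, 3],
              [0, 1, 3, 2],
              [3, 2, 1, 0]])
w, order = weighted_aggregate(R)
print(w, order)
```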
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Label Ranking • Goal: map from the input space to the set of total orders over a finite set of labels • Related to multi-label and multi-class problems • Example: Input: customer information; Output: Porsche > Toyota > Ford (or: Mountain > Sea > Beach)
Label Ranking (cont.) • Pairwise ranking (ECML 03) • Train a binary classifier for each pair of labels • To judge an example: if the classifier for a pair predicts one label over the other, count it as a vote for that label; then rank all labels by their vote counts • Requires k(k-1)/2 classifiers in total
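The voting step can be sketched as follows; `pairwise_pred` stands in for the k(k-1)/2 trained classifiers and its toy decision rule is a made-up assumption for illustration:

```python
from itertools import combinations
from collections import Counter

def rank_by_pairwise_votes(labels, pairwise_pred, x):
    """Pairwise label ranking: each of the k*(k-1)/2 classifiers casts
    one vote; labels are sorted by vote count."""
    votes = Counter({l: 0 for l in labels})
    for a, b in combinations(labels, 2):
        votes[pairwise_pred(a, b, x)] += 1
    return [l for l, _ in votes.most_common()]

# Toy stand-in for the trained classifiers: prefer the label whose
# (hypothetical) preference value is closer to the feature value x.
labels = ["Porsche", "Toyota", "Ford"]
pref = {"Porsche": 0.9, "Toyota": 0.5, "Ford": 0.1}
pairwise_pred = lambda a, b, x: a if abs(pref[a] - x) < abs(pref[b] - x) else b
print(rank_by_pairwise_votes(labels, pairwise_pred, x=0.8))
# → ['Porsche', 'Toyota', 'Ford']
```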
Label Ranking (cont.) • Constraint Classification (NIPS 02) • Consider a linear sorting function: score each label i by an inner product of a label-specific weight vector with x • Goal: learn the weight vectors; rank all labels by their scores
Label Ranking (cont.) -- CC • Expand the feature vector into a space with one block per label • Generate positive/negative samples in the expanded space
Label Ranking (cont.) -- CC • Learn a separating hyperplane in the expanded space • Can be solved with an SVM
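The expansion can be sketched with a Kesler-style construction; a plain perceptron replaces the SVM here to keep the sketch dependency-free, so this illustrates the reduction rather than the paper's exact solver:

```python
import numpy as np

def expand(x, i, j, k):
    """Embed x positively in block i and negatively in block j of a
    k*d-dimensional vector, encoding the constraint w_i.x > w_j.x."""
    d = len(x)
    z = np.zeros(k * d)
    z[i*d:(i+1)*d] = x
    z[j*d:(j+1)*d] = -x
    return z

def train_constraint_classifier(samples, k, epochs=100):
    """samples: (x, preferred_label, other_label) constraints.
    One separating hyperplane in the expanded space yields one weight
    block per label; ranking a new x = sorting labels by w_i . x."""
    d = len(samples[0][0])
    w = np.zeros(k * d)
    for _ in range(epochs):
        for x, i, j in samples:
            z = expand(np.asarray(x), i, j, k)
            if w @ z <= 0:              # constraint violated
                w += z                  # perceptron update
    return w.reshape(k, d)

# Toy data in R^2 with 3 labels: label 0 preferred when x[0] is large.
samples = [([1.0, 0.0], 0, 1), ([1.0, 0.1], 0, 2),
           ([0.0, 1.0], 1, 0), ([0.1, 1.0], 1, 2)]
W = train_constraint_classifier(samples, k=3)
scores = W @ np.array([1.0, 0.0])
print(np.argsort(-scores))   # label 0 ranked first for this input
```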
Overview • Rank aggregation • Label ranking • Query and rank by example • Preference learning • Problems left, what we can do?
Query and rank by example • Given one query, rank retrieved items according to their relevance w.r.t. the query.
Query and rank by example (cont.) • Ranking on a manifold • Iterate to convergence; a closed-form solution exists at convergence • Essentially a one-class semi-supervised method
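The manifold-ranking iteration can be sketched as below (Zhou et al. style: f <- alpha*S*f + (1-alpha)*y with S the symmetrically normalized affinity matrix and y marking the query; the toy chain graph is an assumption for illustration):

```python
import numpy as np

def manifold_rank(W, y, alpha=0.9, iters=200):
    """Iterative manifold ranking: spread the query's relevance over
    the affinity graph. At convergence this equals the closed form
    f* = (1 - alpha) * inv(I - alpha * S) @ y."""
    d = W.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    S = Dinv @ W @ Dinv                  # symmetric normalization
    f = np.zeros_like(y, dtype=float)
    for _ in range(iters):
        f = alpha * S @ f + (1 - alpha) * y
    return f

# Chain graph 0-1-2-3 with the query at node 0:
# relevance decays with graph distance, giving the order 0,1,2,3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 0, 0, 0])
f = manifold_rank(W, y)
print(np.argsort(-f))   # [0 1 2 3]
```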
Preference learning • Given a set of items and a set of user preferences over these items, rank all items according to the user's preferences. • Motivated by the need for personalized search.
Preference learning • Input: preferences: a set of partial orders on X Output: a total order on X or, a map from X onto a structured label space Y • Preference function
Existing methods • Learning to order things [W. Cohen 98] • Large margin ordinal regression [R. Herbrich 98] • PRanking with Ranking [K. Crammer 01] • Optimizing Search Engines using Clickthrough Data [T. Joachims 02] • Efficient boosting algorithm for combining preferences [Yoav Freund 03] • Classification Approach towards Ranking and Sorting Problems [S. Rajaram 03]
Existing methods • Learning to Rank using Gradient Descent [C. Burges 05] • Stability and Generalization of Bipartite Ranking [S. Agarwal 05] • Generalization Bounds for k-Partite Ranking [S. Rajaram 05] • Ranking with a p-norm push [C. Rudin 05] • Magnitude-Preserving Ranking Algorithms [C. Cortes 07] • From Pairwise Approach to Listwise Approach [Z. Cao 07]
Large Margin Ordinal Regression • Map each sample to an axis using an inner product with a weight vector; ranks correspond to intervals on that axis
Large Margin Ordinal Regression • Consider a pair of samples with rank(xi) > rank(xj) • Then the projection must satisfy w . xi > w . xj • Introduce soft margins for violated pairs • Solve using an SVM
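The pairwise reduction can be sketched as follows; a perceptron on difference vectors replaces the soft-margin SVM of the original work, so this is a dependency-free illustration of the constraint w . (xi - xj) > 0, not the paper's solver:

```python
import numpy as np

def fit_pairwise_ranker(X, ranks, epochs=100):
    """Each pair with ranks[i] > ranks[j] yields the constraint
    w . (X[i] - X[j]) > 0; a perceptron enforces the constraints."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in range(n):
            for j in range(n):
                if ranks[i] > ranks[j]:
                    diff = X[i] - X[j]
                    if w @ diff <= 0:   # violated pair
                        w += diff
    return w

# Items whose rank grows with the first feature.
X = np.array([[0.1, 1.0], [0.4, 0.2], [0.9, 0.5]])
ranks = np.array([1, 2, 3])
w = fit_pairwise_ranker(X, ranks)
print(np.argsort(-(X @ w)))   # [2 1 0]: recovers the rank order
```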
Learn to order things • A greedy ordering algorithm: calculate a score for each item from the preference graph and emit items in score order
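The greedy ordering can be sketched as below, assuming the score of an item is its outgoing preference weight minus its incoming weight among the items not yet placed (`pref` and `greedy_order` are illustrative names):

```python
def greedy_order(items, pref):
    """Greedy ordering sketch: repeatedly emit the item with the
    largest (out-weight - in-weight) and remove it from the graph.
    pref[(a, b)] = weight of the preference 'a before b'."""
    remaining = set(items)
    order = []
    while remaining:
        def score(v):
            out_w = sum(w for (a, b), w in pref.items()
                        if a == v and b in remaining)
            in_w = sum(w for (a, b), w in pref.items()
                       if b == v and a in remaining)
            return out_w - in_w
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

pref = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 0.5}
print(greedy_order(["a", "b", "c"], pref))  # ['a', 'b', 'c']
```

The greedy pass runs in polynomial time, which matters because finding the ordering with maximum total agreement is NP-hard.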
Learn to order things (cont.) • Combine different ranking functions • Learn the combination weights iteratively
Learn to order things • Combine preference functions • Perform ranking aggregation • Update weights based on feedback
Initially, w is uniform • At each step • Compute a combined ranking function • Produce a ranking aggregation • Measure the loss and update w
RankBoost • Bipartite ranking problems • Combine weak rankers • Sort items by the value of H(x)
RankBoost (cont.) • Bipartite ranking problem • Initialize the sampling distribution over pairs • Each round: learn a weak ranker, update the sampling distribution, normalize • Combine the weak rankers into the final ranker H
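A minimal bipartite RankBoost sketch follows, assuming threshold weak rankers on single features; the weighting alpha = 0.5*ln((1+r)/(1-r)) and the pair re-weighting follow the standard RankBoost recipe, simplified for illustration:

```python
import numpy as np

def rankboost(X, y, rounds=10):
    """Bipartite RankBoost sketch: keep a distribution D over
    (negative, positive) pairs, pick the weak ranker with the best
    weighted pair-ordering score r, and up-weight mis-ordered pairs."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    pairs = [(i, j) for i in neg for j in pos]   # want score(j) > score(i)
    D = np.full(len(pairs), 1.0 / len(pairs))
    H = np.zeros(len(y))
    for _ in range(rounds):
        best = None
        for f in range(X.shape[1]):              # weak rankers: h = [x_f > t]
            for t in np.unique(X[:, f]):
                h = (X[:, f] > t).astype(float)
                r = sum(d * (h[j] - h[i]) for d, (i, j) in zip(D, pairs))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, h)
        r, h = best
        r = np.clip(r, -0.999, 0.999)            # avoid log of zero
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        H += alpha * h
        D *= np.array([np.exp(alpha * (h[i] - h[j])) for i, j in pairs])
        D /= D.sum()                             # normalization
    return H                                     # rank items by H, descending

# Toy bipartite data: positives have larger feature values.
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0, 0, 1, 1])
H = rankboost(X, y)
print(np.argsort(-H))   # positives (items 2, 3) ranked above negatives
```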
Stability and Generalization • Bipartite ranking problems • Expected rank error • Empirical rank error
Stability and Generalization (cont.) • Stability • Remove one training sample: how much does the learned ranking change? • Generalization • Generalizes to the k-partite ranking problem…
Rank on graph data • Objective
P-norm push • Focus on the topmost ranked items • The top left region is the most important
P-norm push (cont.) • Height of a negative example k: the number of positive examples ranked below k • Cost of example k: g(height(k)), where g is convex and monotonically increasing (the p-norm push uses g(r) = r^p)
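The height-based cost can be sketched directly; `pnorm_push_objective` is an illustrative helper, and the example shows why a large p punishes a single negative near the top more than scattered low-ranked mistakes:

```python
def pnorm_push_objective(scores, labels, p=4):
    """The price of a negative example grows with its height: the
    number of positives it outranks. g(r) = r**p is convex and
    increasing, so large p concentrates the penalty at the top."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    obj = 0
    for s, l in zip(scores, labels):
        if l == 0:
            height = sum(1 for sp in pos if sp < s)  # positives below
            obj += height ** p
    return obj

# Two score functions with the same pairwise error (2 mis-ordered
# pairs each), but very different p-norm push cost:
labels = [1, 1, 0, 0]
print(pnorm_push_objective([0.9, 0.8, 0.7, 0.6], labels))  # 0: no errors
print(pnorm_push_objective([0.9, 0.5, 0.7, 0.6], labels))  # 1 + 1 = 2
print(pnorm_push_objective([0.9, 0.2, 0.95, 0.1], labels)) # 2**4 = 16
```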
P-norm push (cont.) • Run RankBoost to solve the resulting objective