Rank centrality: Ranking from comparisons

Rank centrality: Ranking from comparisons SahandNegahbanSewoong Oh Devavrat Shah Yale + UIUC + MIT

Some scenarios • Given partial preferences • Compute globalranking with scores to reflect intensity • Sports • Outcome of games between teams/players • Social recommendations • Ratings of few restaurants/movies • Competitive conference/Graduate admission • Ordering of few papers/applicants

Revealed preferences • Partial preferences are revealed in different forms • Sports: Win and Loss • Social: Starred rating • Conferences: Scores • All can be viewed as pair-wise comparisons • IND beats AUS: IND > AUS • South Indies ***** vs MTR***: SI > MTR • Ranking Paper 10/10 vs Other Paper 5/10: Ranking > Other

Data and Decision • Revealed preferences lead to • Bag of pair-wise comparisons • Sports, Social, Conferences, Transactions, etc. • Question of interest • Obtain global ranking over objects of interest • Teams/Players, Restaurants, Papers, Applicants. • Along with intensity/score for each object • Using given partial preferences/pair-wise comparisons

# times 1 defeats 2 Data and Decision A12 1 A21 6 2 5 3 4 • Q1. Given weighted comparison graph G=(V, E, A) • Find ranking of/scores associated with objects • Q2. When possible (e.g. Conference/Crowd-Sourcing), choose G so as to • Minimize the number of comparisons required to find ranking/scores

Rank aggregation: Model A12 1 A21 6 2 C C A A A A A 5 3 • Data • Distribution • Ranking 4 A A B B B B B B B C C C C C 0.25 0.75 • We posit • Distribution over permutations as ground-truth • Pair-wise comparisons are drawn from this distribution

Rank aggregation: Background A12 1 A21 6 2 5 3 4 4 5 6 3 1 1 6 2 3 2 4 5 > > > > > > > > > > • Input: complete preference (not comparisons) • Axiomatic impossibility [Arrow ’51] • Some algorithms • Kemeny optimal: minimize disagreements • Extended Condorcet Criteria • NP-hard, 2-approx algorithm [Dwork et al ’01] • Borda count: average position is score • Simple • Useful axiomatic properties [Young ‘74]

Rank aggregation: Background A12 1 A21 6 2 5 3 4 [Ammar, Shah ’11] • Algorithm with comparisons • Variant of Kemeny optimal: • NP-hard • Variant of Bordacount: average position from comparison? • If pij = Aij/(Aij+ Aji) represent pair-wise marginal distribution • Then, Borda count is given as • Requires: G complete, many comparisons per pair • Also see (short list of relatd works): [Diaconis ‘87], [Alder et al ‘87], [Braverman-Mossel’09], [Caramanis et al ‘11], [Fernoud et al ’11], [Duchi et al ‘12]…

Rank aggregation: Model A12 1 A21 6 2 5 3 4 • General model • Effectively impossible to do aggregation • Practically • Restrict choice model • Popularly utilized model is instance of Thurstone’s ‘27 • Used for transportation system (cf. McFadden) • TrueSkill uses for ranking online gamers • Pricing in airline industry (cf. Talluri and Van Ryzin) • …

Choice model • Choice model (distribution over permutations) [Bradley-Terry-Luce (BTL) or MNL Model] • Each object i has an associated weight wi> 0 • When objects i and j are compared • P(i > j) = wi/(wi + wj) • Sampling model • Edges E of graph G are selected • For each (i,j) ε E, sample k pair-wise comparisons

Rank centrality A12 1 A21 6 2 5 3 4 • Random walk on comparison graph G=(V,E,A) • d = max (undirected) vertex degree of G • For each edge (i,j): • Pij= (Aji+1)/(Aij+Aji+2) x 1/(d+1) • For each node i: • Pii= 1- Σj≠iPij • Let G be connected • Let s be the unique stationary distribution of RW P • Ranking: • Use s as scores of objects • Closely related to Dwork et al ‘01 + Saaty ‘03

Rank centrality A12 1 A21 6 2 5 3 4 • Random walk on comparison graph G=(V,E,A) • Let s be the unique stationary distribution of RW P • Ranking: • Use s as scores of objects • That is, object i has higher score if • It beats object j with higher score, • Or, beats many objects.

Rank centrality A12 1 A21 6 2 5 3 4 • Random walk on comparison graph G=(V,E,A) • Let s be the unique stationary distribution of RW P • Ranking: • Use s as scores of objects • Compared to variant of Borda count:

Rank centrality: experimentInternational Cricket Ranking

Rank centrality: simulation • Error(s) = • G: Erdos-Renyi graph with edge prob. d/n d/n k

Rank centrality: performance • Theorem 1 (Negahban-Oh-Shah). • Let R= (maxijwi/wj). • Let G be Erdos-Renyi graph. • Under Rank centrality, with d = Ω(log n) • That is, sufficient to have O(R5 n log n) samples • Optimal dependence on n for ER graph • Dependence on R ?

Rank centrality: performance • Theorem 1 (Negahban-Oh-Shah). • Let R= (maxijwi/wj). • Let G be Erdos-Renyi graph. • Under Rank centrality, with d = Ω(log n) • Information theoretic lower-bound: for any algorithm

Rank centrality: performance • Theorem 2 (Negahban-Oh-Shah). • Let R= (maxijwi/wj). • Let G be any connected graph: • L = D-1 E be it’s Laplacian • Δ = 1- λmax(L) • κ = dmax /dmin • Under Rank centrality, with kd = Ω(log n) • That is, number of samples required O(R5 κ2n log n x Δ-2) • Graph structure plays role through it’s Laplacian

Rank centrality and Graph choice • Theorem 2 (Negahban-Oh-Shah). • Under Rank centrality, with kd = Ω(log n) • That is, number of samples required O(R5 κ2n log n x Δ-2) • Choice of graph G • Subject to constraints, choose G so that • Spectral gap Δ is maximized • SDP [Boyd, Diaconis, Xiao ‘04]

Some remarks on proof

Some remarks on proof • Bound on • Use of comparison theorem [Diaconis-SaloffCoste ‘94]++ • Bound on • Use of (modified) concentration of measure inequality for matrices • Finally, use this to further bound Error(s)

Rank centrality for Admissions, Conferences,… A12 1 A21 6 2 5 3 4 • MIT admission system • ACM conferences (MobiHoc ‘11, Sigmetrics ‘13) • Past few years has been used for efficient reviewing • Daily polls (cf. A. Ammar) • polls.mit.edu • Netflix • ?

Concluding remarks • Pair-wise comparisons • Universal way to look at partial preferences • Rank centrality • Simple and intuitive algorithm for rank aggregation • The comparison graph plays important role in aggregation • Choose G to maximize spectral gap of natural RW

Rank centrality: Ranking from comparisons