Rank centrality: Ranking from comparisons
Sahand Negahban, Sewoong Oh, Devavrat Shah (Yale + UIUC + MIT)
Some scenarios
• Given partial preferences
  • Compute a global ranking, with scores that reflect intensity
• Sports
  • Outcomes of games between teams/players
• Social recommendations
  • Ratings of a few restaurants/movies
• Competitive conference/graduate admission
  • Ordering of a few papers/applicants
Revealed preferences
• Partial preferences are revealed in different forms
  • Sports: wins and losses
  • Social: star ratings
  • Conferences: scores
• All can be viewed as pair-wise comparisons
  • IND beats AUS: IND > AUS
  • South Indies (*****) vs MTR (***): SI > MTR
  • Ranking Paper (10/10) vs Other Paper (5/10): Ranking > Other
Data and Decision
• Revealed preferences lead to
  • A bag of pair-wise comparisons
  • Sports, social, conferences, transactions, etc.
• Question of interest
  • Obtain a global ranking over the objects of interest
    • Teams/players, restaurants, papers, applicants
  • Along with an intensity/score for each object
  • Using the given partial preferences/pair-wise comparisons
Data and Decision
[Figure: comparison graph on nodes 1 to 6; the edge between nodes 1 and 2 carries weights A12 and A21, where A12 = # times 1 defeats 2]
• Q1. Given a weighted comparison graph G = (V, E, A)
  • Find a ranking of, and scores associated with, the objects
• Q2. When possible (e.g. conference/crowd-sourcing), choose G so as to
  • Minimize the number of comparisons required to find the ranking/scores
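As a minimal sketch of how a bag of revealed preferences could be turned into the weights A of the comparison graph, the Python below counts wins per ordered pair and converts one rater's scores into pair-wise comparisons. The function names and the rule of dropping ties are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict


def build_comparison_counts(comparisons):
    """Return A with A[i][j] = number of times i defeated j.

    `comparisons` is an iterable of (winner, loser) pairs.
    """
    A = defaultdict(lambda: defaultdict(int))
    for winner, loser in comparisons:
        A[winner][loser] += 1
    return A


def ratings_to_comparisons(ratings):
    """Turn one rater's scores (item -> score) into (winner, loser) pairs.

    Assumption: the higher-rated item wins; tied pairs are dropped.
    """
    items = sorted(ratings)
    pairs = []
    for index, a in enumerate(items):
        for b in items[index + 1:]:
            if ratings[a] > ratings[b]:
                pairs.append((a, b))
            elif ratings[b] > ratings[a]:
                pairs.append((b, a))
    return pairs


if __name__ == "__main__":
    bag = [("IND", "AUS"), ("IND", "AUS"), ("AUS", "IND")]
    bag += ratings_to_comparisons({"SI": 5, "MTR": 3})
    A = build_comparison_counts(bag)
    print(A["IND"]["AUS"], A["AUS"]["IND"], A["SI"]["MTR"])
```

The pairs that appear at least once form the edge set E, and the counts form A.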
Rank aggregation: Model
[Figure panels: Data (comparison graph on nodes 1 to 6 with edge weights A12, A21), Distribution (orderings of A, B, C with probabilities 0.25 and 0.75), Ranking]
• We posit
  • A distribution over permutations as the ground truth
  • Pair-wise comparisons are drawn from this distribution
Rank aggregation: Background
[Figure: comparison graph on nodes 1 to 6 (edge weights A12, A21) alongside complete orderings of the six objects]
• Input: complete preferences (not comparisons)
• Axiomatic impossibility [Arrow ’51]
• Some algorithms
  • Kemeny optimal: minimize disagreements
    • Extended Condorcet criterion
    • NP-hard; 2-approximation algorithm [Dwork et al. ’01]
  • Borda count: average position is the score
    • Simple
    • Useful axiomatic properties [Young ’74]
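To make "average position is the score" concrete, here is a small Python sketch of Borda count on complete rankings. The function name and the input layout (each ranking is a best-first list) are illustrative assumptions.

```python
def borda_average_position(rankings):
    """Average position of each item over complete rankings.

    Each ranking lists all items best-first; position 1 is best,
    so a lower average means a better Borda score.
    """
    totals = {}
    for ranking in rankings:
        for position, item in enumerate(ranking, start=1):
            totals[item] = totals.get(item, 0) + position
    return {item: total / len(rankings) for item, total in totals.items()}


if __name__ == "__main__":
    votes = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
    averages = borda_average_position(votes)
    # Sort by average position to obtain the aggregate ranking.
    print(sorted(averages, key=averages.get))
```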
Rank aggregation: Background
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
[Ammar, Shah ’11]
• Algorithms with comparisons
  • Variant of Kemeny optimal:
    • NP-hard
  • Variant of Borda count: average position from comparisons?
    • If p_ij = A_ij/(A_ij + A_ji) represents the pair-wise marginal distribution
    • Then the Borda count is given as [formula not preserved]
    • Requires: G complete, many comparisons per pair
• Also see (a short list of related works): [Diaconis ’87], [Alder et al ’87], [Braverman-Mossel ’09], [Caramanis et al ’11], [Fernoud et al ’11], [Duchi et al ’12]…
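The slide's own Borda expression is not reproduced here. As a hedged illustration only, one natural comparison-based Borda score is the average of p_ij over all other objects j, that is, the average empirical probability that i beats a rival. This particular formula is an assumption for the sketch and not necessarily the one used by the authors.

```python
def borda_from_comparisons(A):
    """ASSUMED variant: score_i = average over j != i of p_ij,
    with p_ij = A[i][j] / (A[i][j] + A[j][i]).

    A is an n x n list of win counts (A[i][j] = times i beat j).
    Every pair must have been compared at least once, which mirrors
    the requirement that G be complete.
    """
    n = len(A)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue
            games = A[i][j] + A[j][i]
            if games == 0:
                raise ValueError(f"pair ({i}, {j}) was never compared")
            total += A[i][j] / games
        scores.append(total / (n - 1))
    return scores


if __name__ == "__main__":
    A = [[0, 8, 9],
         [2, 0, 7],
         [1, 3, 0]]
    print(borda_from_comparisons(A))
```

The `ValueError` branch shows why this approach needs a complete graph: a single uncompared pair leaves p_ij undefined.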
Rank aggregation: Model
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
• General model
  • Effectively impossible to do aggregation
• Practically
  • Restrict the choice model
  • A popularly used model is an instance of Thurstone’s ’27
    • Used for transportation systems (cf. McFadden)
    • TrueSkill uses it for ranking online gamers
    • Pricing in the airline industry (cf. Talluri and Van Ryzin)
    • …
Choice model
• Choice model (distribution over permutations) [Bradley-Terry-Luce (BTL) or MNL model]
  • Each object i has an associated weight w_i > 0
  • When objects i and j are compared
    • P(i > j) = w_i/(w_i + w_j)
• Sampling model
  • Edges E of graph G are selected
  • For each (i, j) ∈ E, sample k pair-wise comparisons
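The BTL sampling model above can be sketched in a few lines of Python: for each selected edge, draw k independent comparisons in which i wins with probability w_i/(w_i + w_j). The function names, the 0-based indexing, and the seeded `random.Random` generator are assumptions for illustration.

```python
import random


def btl_win_probability(w_i, w_j):
    """P(i > j) under the BTL model."""
    return w_i / (w_i + w_j)


def sample_comparisons(weights, edges, k, rng=None):
    """Sample k comparisons per edge; return A with A[i][j] = wins of i over j."""
    rng = rng or random.Random(0)
    n = len(weights)
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        p = btl_win_probability(weights[i], weights[j])
        for _ in range(k):
            if rng.random() < p:
                A[i][j] += 1
            else:
                A[j][i] += 1
    return A


if __name__ == "__main__":
    weights = [3.0, 1.0, 1.0]
    edges = [(0, 1), (1, 2)]
    A = sample_comparisons(weights, edges, k=20)
    for row in A:
        print(row)
```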
Rank centrality
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
• Random walk on the comparison graph G = (V, E, A)
  • d = max (undirected) vertex degree of G
  • For each edge (i, j):
    • P_ij = (A_ji + 1)/(A_ij + A_ji + 2) × 1/(d + 1)
  • For each node i:
    • P_ii = 1 − Σ_{j≠i} P_ij
• Let G be connected
  • Let s be the unique stationary distribution of the random walk P
• Ranking:
  • Use s as the scores of the objects
• Closely related to Dwork et al. ’01 + Saaty ’03
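The random walk above can be sketched directly in Python: build P from the win counts, then approximate the stationary distribution s by repeatedly applying P. Using power iteration, the stopping tolerance, and the 0-based matrix layout are assumptions for this sketch; the slides only specify P and s.

```python
def rank_centrality(A, iterations=10_000, tol=1e-12):
    """Scores from win counts A (A[i][j] = number of times i beat j).

    Assumes the comparison graph is connected. An edge (i, j) exists
    when the pair was compared at least once.
    """
    n = len(A)
    neighbors = [
        [j for j in range(n) if j != i and A[i][j] + A[j][i] > 0]
        for i in range(n)
    ]
    d = max(len(nb) for nb in neighbors)  # max undirected degree

    # Transition matrix: the walk moves from i towards objects that beat i.
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            P[i][j] = (A[j][i] + 1) / (A[i][j] + A[j][i] + 2) / (d + 1)
        P[i][i] = 1.0 - sum(P[i])  # diagonal is still 0 here

    # Power iteration: s <- s P until the update is negligible.
    s = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(s[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(nxt, s)) < tol
        s = nxt
        if converged:
            break
    return s


if __name__ == "__main__":
    A = [[0, 8, 9],
         [2, 0, 7],
         [1, 3, 0]]
    print(rank_centrality(A))
```

Because every node keeps a self-loop of probability at least 1/(d + 1), the chain is aperiodic, so power iteration converges whenever G is connected.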
Rank centrality
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
• Random walk on the comparison graph G = (V, E, A)
  • Let s be the unique stationary distribution of the random walk P
• Ranking:
  • Use s as the scores of the objects
  • That is, object i has a higher score if
    • It beats an object j that has a higher score,
    • Or it beats many objects.
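Rank centrality
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
• Random walk on the comparison graph G = (V, E, A)
  • Let s be the unique stationary distribution of the random walk P
• Ranking:
  • Use s as the scores of the objects
• Compared to the variant of Borda count: [formula not preserved]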
Rank centrality: simulation
• Error(s) = [formula not preserved]
• G: Erdos-Renyi graph with edge probability d/n
[Figure: simulation plots; recoverable axis labels: d/n and k]
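A simulation of this kind needs two ingredients that are easy to sketch in Python: an Erdos-Renyi comparison graph with edge probability d/n, and an error measure for a score vector s. The slide's definition of Error(s) is not available, so the relative Euclidean error against the normalized true weights used below is an assumption, as are the function names.

```python
import math
import random


def erdos_renyi_edges(n, d, rng):
    """Keep each of the n(n-1)/2 possible edges independently with probability d/n."""
    p = d / n
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]


def relative_l2_error(scores, weights):
    """ASSUMED metric: ||s - w~|| / ||w~||, with s and w~ normalized to sum to 1."""
    total_s = sum(scores)
    total_w = sum(weights)
    s = [x / total_s for x in scores]
    w = [x / total_w for x in weights]
    numerator = math.sqrt(sum((a - b) ** 2 for a, b in zip(s, w)))
    denominator = math.sqrt(sum(b * b for b in w))
    return numerator / denominator


if __name__ == "__main__":
    rng = random.Random(0)
    edges = erdos_renyi_edges(n=50, d=10, rng=rng)
    print(len(edges), "edges sampled")
    print(relative_l2_error([0.6, 0.4], [3.0, 1.0]))
```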
Rank centrality: performance
• Theorem 1 (Negahban-Oh-Shah).
  • Let R = max_{i,j} w_i/w_j.
  • Let G be an Erdos-Renyi graph.
  • Under Rank centrality, with d = Ω(log n): [error bound not preserved]
  • That is, it is sufficient to have O(R^5 n log n) samples
    • Optimal dependence on n for an ER graph
    • Dependence on R?
Rank centrality: performance
• Theorem 1 (Negahban-Oh-Shah).
  • Let R = max_{i,j} w_i/w_j.
  • Let G be an Erdos-Renyi graph.
  • Under Rank centrality, with d = Ω(log n): [error bound not preserved]
• Information-theoretic lower bound, for any algorithm: [bound not preserved]
Rank centrality: performance
• Theorem 2 (Negahban-Oh-Shah).
  • Let R = max_{i,j} w_i/w_j.
  • Let G be any connected graph:
    • L = D^{-1} E is its Laplacian
    • Δ = 1 − λ_max(L)
    • κ = d_max/d_min
  • Under Rank centrality, with kd = Ω(log n): [error bound not preserved]
  • That is, the number of samples required is O(R^5 κ^2 n log n × Δ^{-2})
• Graph structure plays a role through its Laplacian
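The graph quantities in Theorem 2 can be computed numerically. The NumPy sketch below makes two assumptions that the slide does not spell out: E is the 0/1 adjacency matrix of G with D the diagonal degree matrix, and λ_max(L) denotes the largest eigenvalue magnitude of L = D^{-1} E once the trivial top eigenvalue 1 is excluded. The function name is hypothetical.

```python
import numpy as np


def graph_constants(adjacency):
    """Return (spectral gap Δ, degree ratio κ) for a connected undirected graph.

    ASSUMPTIONS: `adjacency` is a symmetric 0/1 matrix, and λ_max(L) is
    max(λ_2, -λ_n), i.e. the top eigenvalue 1 of D^{-1} E is excluded.
    """
    E = np.asarray(adjacency, dtype=float)
    degrees = E.sum(axis=1)

    # D^{-1/2} E D^{-1/2} is symmetric and shares its eigenvalues with D^{-1} E.
    inv_sqrt = 1.0 / np.sqrt(degrees)
    S = inv_sqrt[:, None] * E * inv_sqrt[None, :]
    eigenvalues = np.sort(np.linalg.eigvalsh(S))  # ascending order

    lam_max = max(eigenvalues[-2], -eigenvalues[0])
    gap = 1.0 - lam_max
    kappa = degrees.max() / degrees.min()
    return float(gap), float(kappa)


if __name__ == "__main__":
    complete_5 = np.ones((5, 5)) - np.eye(5)
    print(graph_constants(complete_5))
```

Under these assumptions a complete graph has a large gap, while a bipartite graph such as an even cycle has gap zero, which makes the sample bound vacuous.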
Rank centrality and Graph choice
• Theorem 2 (Negahban-Oh-Shah).
  • Under Rank centrality, with kd = Ω(log n): [error bound not preserved]
  • That is, the number of samples required is O(R^5 κ^2 n log n × Δ^{-2})
• Choice of graph G
  • Subject to constraints, choose G so that
    • The spectral gap Δ is maximized
    • SDP [Boyd, Diaconis, Xiao ’04]
Some remarks on proof
• Bound on [quantity not preserved]
  • Use of comparison theorem [Diaconis-Saloff-Coste ’94]++
• Bound on [quantity not preserved]
  • Use of a (modified) concentration-of-measure inequality for matrices
• Finally, use this to further bound Error(s)
Rank centrality for Admissions, Conferences,…
[Figure: comparison graph on nodes 1 to 6 with edge weights A12, A21]
• MIT admission system
• ACM conferences (MobiHoc ’11, Sigmetrics ’13)
  • Has been used over the past few years for efficient reviewing
• Daily polls (cf. A. Ammar)
  • polls.mit.edu
• Netflix
  • ?
Concluding remarks
• Pair-wise comparisons
  • A universal way to look at partial preferences
• Rank centrality
  • A simple and intuitive algorithm for rank aggregation
• The comparison graph plays an important role in aggregation
  • Choose G to maximize the spectral gap of the natural random walk