Rank Aggregation Methods for the Web

Rank Aggregation Methods for the Web CS728 Lecture 11

Web Page Ranking Methods Reviewed • PageRank – global link analysis • Indegree – local link analysis • HITS – topic-based link analysis • Voting – NNN and Correlation • Graph distance from seed • URL length and depth • Text-based methods (e.g., tf*idf)

Rank Aggregation • [Figure: several input rankings over the candidates A to F, some of them partial, are combined into a single “consensus” ranking of all candidates.]

Notations for Ranking • Given a universe U, an ordered list τ ranks a subset S of U: τ = [x1 ≥ x2 ≥ … ≥ xd], xi in S • τ(i): position (rank) of element i in τ • |τ|: number of elements in τ • full list: τ contains all the elements of U • partial list: τ ranks only some of the elements of U • top d list: all d ranked elements are above all unranked elements • Question: when are two orderings similar? Can you give a distance measure?

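Measuring Distance Between Orderings • σ, τ: two full lists; σ(i): rank of candidate i in σ • Spearman’s footrule distance: F(σ,τ) = Σi |σ(i) − τ(i)|, the total displacement of the candidates • Kendall tau distance: K(σ,τ) = the number of pairwise disagreements between the two lists, i.e. pairs (i, j) with σ(i) < σ(j) but τ(i) > τ(j)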

Example of Ordered-List Distance • S = {A,B,C,D,E} • σ, τ: two full lists, written from rank 1 to rank 5: σ = [A, C, E, D, B], τ = [C, A, B, D, E] • Spearman’s footrule distance (terms in the order A, B, C, D, E): F(σ,τ) = 1 + 2 + 1 + 0 + 2 = 6 • Kendall tau distance: K(σ,τ) = |{(A,C), (B,D), (B,E), (D,E)}| = 4
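
The two distances in this example can be checked with a few lines of Python. This is a minimal sketch; it assumes the slide's table reads σ = [A, C, E, D, B] and τ = [C, A, B, D, E] (rank 1 first), and the function names are illustrative, not from the lecture.

```python
from itertools import combinations

def positions(ranking):
    """Map each candidate to its 1-based rank in the list."""
    return {c: i for i, c in enumerate(ranking, start=1)}

def footrule(sigma, tau):
    """Spearman's footrule: sum of absolute rank differences."""
    ps, pt = positions(sigma), positions(tau)
    return sum(abs(ps[c] - pt[c]) for c in ps)

def kendall_tau(sigma, tau):
    """Kendall tau: number of candidate pairs ordered differently."""
    ps, pt = positions(sigma), positions(tau)
    return sum(
        1
        for a, b in combinations(sorted(ps), 2)
        if (ps[a] - ps[b]) * (pt[a] - pt[b]) < 0
    )

# Assumed reading of the slide's table (rank 1 first).
sigma = ["A", "C", "E", "D", "B"]
tau = ["C", "A", "B", "D", "E"]
print(footrule(sigma, tau))     # 6
print(kendall_tau(sigma, tau))  # 4
```

Both functions assume full lists over the same candidates; partial lists would need one of the extensions discussed later in the lecture.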

Optimal ranking aggregation • Optimality depends on the distance measure we use. • Optimizing with Kendall tau distance, we obtain the Kemeny optimal aggregation. • It can be shown to satisfy neutrality and consistency, important properties of rank aggregation functions. • Useful but computationally hard: computing a Kemeny optimal aggregation is NP-hard. • We will show that footrule-optimal aggregation is in P.

Two properties relate K and F • For any full lists σ, τ: K(σ,τ) ≤ F(σ,τ) ≤ 2 K(σ,τ) • So footrule-optimal aggregation gives a 2-approximation to Kemeny optimality • Reason: if σ is the Kemeny optimal aggregation of full lists τ1,…,τk and σ’ optimizes the footrule aggregation, then K(σ’, τ1,…,τk) ≤ F(σ’, τ1,…,τk) ≤ F(σ, τ1,…,τk) ≤ 2 K(σ, τ1,…,τk), where each distance to τ1,…,τk is aggregated over all k lists
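
The bound K ≤ F ≤ 2K can be sanity-checked by brute force on small universes. The sketch below enumerates every pair of full lists over n elements; it is an empirical check for small n only, not a proof, and all names are illustrative.

```python
from itertools import combinations, permutations

def footrule(s, t):
    """Sum of absolute position differences between two full lists."""
    pt = {c: i for i, c in enumerate(t)}
    return sum(abs(i - pt[c]) for i, c in enumerate(s))

def kendall_tau(s, t):
    """Number of pairs that the two full lists order differently."""
    ps = {c: i for i, c in enumerate(s)}
    pt = {c: i for i, c in enumerate(t)}
    return sum(
        1
        for a, b in combinations(s, 2)
        if (ps[a] - ps[b]) * (pt[a] - pt[b]) < 0
    )

def check_bounds(n):
    """True if K <= F <= 2K holds for every pair of full lists on n items."""
    lists = list(permutations(range(n)))
    return all(
        kendall_tau(s, t) <= footrule(s, t) <= 2 * kendall_tau(s, t)
        for s in lists
        for t in lists
    )

print(check_bounds(4))  # True
```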

Condorcet Criteria and SPAM Filters • Condorcet Criterion: an element of S which beats every other element in pairwise simple majority voting should be ranked first. • Extended Condorcet Criterion (XCC): if most voters prefer candidate a to candidate b (i.e., the number of voters i such that τi(a) < τi(b) is more than n/2), then σ should also prefer a to b (i.e., σ(a) < σ(b)). • XCC is effective in ‘spam-fighting’ and thus good to use in meta-search.

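XCC: Not always realizable • [Figure: example voter lists over candidates a, b, c whose pairwise majorities cannot all be satisfied by a single ranking; the constraint shown is σ(a) < σ(b) < σ(c).] • Not realizable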
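
A standard way to see why the pairwise-majority requirement can fail is a three-voter cycle. The voter lists below are a hypothetical example (the slide's own figure is not recoverable); the sketch searches all rankings of a, b, c for one that agrees with every pairwise majority.

```python
from itertools import permutations

# Hypothetical cyclic profile: each list is one voter's ranking, best first.
voters = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]

def majority_prefers(x, y, voters):
    """True if more than half of the voters rank x above y."""
    wins = sum(1 for v in voters if v.index(x) < v.index(y))
    return wins > len(voters) / 2

def satisfies_all_majorities(sigma, voters):
    """True if sigma ranks x above y whenever a majority prefers x to y."""
    return all(
        sigma.index(x) < sigma.index(y)
        for x, y in permutations(sigma, 2)
        if majority_prefers(x, y, voters)
    )

realizable = [
    s for s in permutations("abc")
    if satisfies_all_majorities(list(s), voters)
]
print(realizable)  # []
```

Here a beats b, b beats c, and c beats a, each by 2 votes to 1, so no ranking can honor all three majorities.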

Voting Theory: Desired Properties • Given a set of candidates and voter preferences, seek an algorithm that ranks the candidates and satisfies a set of desired properties • Which combinations of properties are realizable? • 1) Independence from Irrelevant Alternatives: the relative order of a and b in σ should depend only on the relative order of a and b in τ1,…,τn. • Ex: if τi = (a b c) changes to (a c b), the relative order of a, b in σ should not change.

Desired Properties: • 2) Neutrality: no candidate should be favored over others. • If two candidates switch positions in τ1,…,τn, they should also switch positions in σ. • 3) Anonymity: no voter should be favored over others. • If two voters switch their orderings, σ should remain the same.

Desired Properties: • 4) Monotonicity: if the ranking of a candidate is improved by a voter, its ranking in σ can only improve. • 5) Consistency: if voters are split into two disjoint sets, S and T, and both the aggregation of voters in S and the aggregation of voters in T prefer a to b, then the aggregation of all voters should also prefer a to b.

Desired Properties • 6) No Dictatorship: f(τ1,…,τn) ≠ τi, i.e. no single voter i always determines the output. • 7) Unanimity (a.k.a. Pareto optimality): if all voters prefer candidate a to candidate b (i.e., τi(a) < τi(b) for all i), then σ should also prefer a to b (i.e., σ(a) < σ(b)).

Desired Properties • 8) Democracy: satisfies the extended Condorcet criterion (XCC). • Always works for m = 2 candidates. • Not always realizable for m ≥ 3. • Theorem [May, 1952]: For m = 2, Democracy is the only rank aggregation function which is monotone, neutral, and anonymous.

Arrow’s Impossibility Theorem [Arrow, 1951] • Theorem: If m ≥ 3, then the only rank aggregation function that is unanimous and independent from irrelevant alternatives is dictatorship. • Arrow won the Nobel Prize in Economics (1972)

Borda’s method (1781) • Easy and intuitive; several “score-based” variants • Violates independence from irrelevant alternatives • Bi(c) = the number of candidates ranked below c in τi • B(c) = Σi Bi(c); candidates are sorted in decreasing order of B(c) • [Figure: four voter lists over candidates C1, C2, …; for candidate C8 the per-voter scores are Bi(C8) = 1, 2, 0, 13.]
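
Borda's score is simple enough to sketch directly. The example below assumes full lists over the same candidates and uses made-up voter lists (not the ones from the slide's figure); ties in total score are broken alphabetically, which is a choice made here only to keep the output deterministic.

```python
def borda(voters):
    """Return (ranking, scores) where a candidate's score is the total
    number of candidates ranked below it across all voter lists."""
    scores = {}
    for ranking in voters:
        m = len(ranking)
        for pos, c in enumerate(ranking):
            # pos is 0-based, so m - 1 - pos candidates sit below c.
            scores[c] = scores.get(c, 0) + (m - 1 - pos)
    # Decreasing score; alphabetical tie-break (an assumption of this sketch).
    order = sorted(scores, key=lambda c: (-scores[c], c))
    return order, scores

# Hypothetical voter lists, best candidate first.
voters = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
]
order, scores = borda(voters)
print(order)   # ['A', 'B', 'C', 'D']
print(scores)  # {'A': 8, 'B': 6, 'C': 3, 'D': 1}
```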

Partial lists • Handle partial lists by dividing the unused scores equally among all unranked candidates. • Example: 100 candidates, 70 of them ranked (scores 100 down to 31) => the unused scores 1 to 30 sum to 465, so each of the 30 unranked candidates is assigned 465/30 = 15.5

Footrule optimal aggregation • Footrule optimal aggregation can be computed in polynomial time. • It is a good approximation (within a factor of 2) of Kemeny optimal aggregation. • Proof: via minimum cost perfect matching between candidates and rank positions
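
The matching formulation can be sketched as follows: placing candidate c at position p costs Σi |τi(c) − p|, and a footrule-optimal full list is a minimum-cost assignment of candidates to positions. To stay self-contained, this sketch solves the assignment by exhaustive search, which is only feasible for a handful of candidates; a polynomial-time version would swap in an assignment solver such as the Hungarian algorithm. The voter lists and function name are illustrative.

```python
from itertools import permutations

def footrule_aggregate(voters):
    """Footrule-optimal aggregation of full lists, posed as an assignment
    of candidates to positions and solved by brute force (tiny inputs only)."""
    candidates = sorted(voters[0])
    n = len(candidates)
    pos = [{c: i for i, c in enumerate(v, start=1)} for v in voters]
    # cost[c][p - 1]: total displacement if candidate c is placed at position p.
    cost = {
        c: [sum(abs(pv[c] - p) for pv in pos) for p in range(1, n + 1)]
        for c in candidates
    }
    best, best_cost = None, None
    for perm in permutations(candidates):
        total = sum(cost[c][p] for p, c in enumerate(perm))
        if best_cost is None or total < best_cost:
            best, best_cost = list(perm), total
    return best, best_cost

# Hypothetical voter lists, best candidate first.
voters = [
    ["A", "B", "C", "D"],
    ["A", "B", "D", "C"],
    ["B", "A", "C", "D"],
]
print(footrule_aggregate(voters))  # (['A', 'B', 'C', 'D'], 4)
```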

Markov Chain method for rank aggregation. • States=candidates • Transitions depend on the preference orders given by voters • Basic idea: probabilistically switch to a “better candidate” • Rank candidates based on stationary probabilities!

Markov chain advantages • Handling partial lists and top d lists by using available comparisons to infer new ones • Handling uneven comparisons and list lengths • Computational efficiency • O(NK) preprocessing, O(K) per step for about O(N) steps

Four ways to build the transition matrix • Current state is candidate a. • MC1: Choose uniformly from the multiset of all candidates that were ranked at least as high as a by some voter. – Probability to stay at a: ~ average rank of a. • MC2: Choose a voter i uniformly at random and pick uniformly at random from among the candidates that the i-th voter ranked at least as high as a. • MC3: Choose a voter i uniformly at random and pick uniformly at random a candidate b. If the i-th voter ranked b higher than a, go to b. Otherwise, stay in a. • MC4: Choose a candidate b uniformly at random. If most voters ranked b higher than a, go to b. Otherwise, stay in a. – Rank of a ~ # of “pairwise contests” a wins.
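
As an illustration, MC4 can be sketched with a plain power iteration. Two things here are assumptions of this sketch rather than part of the slides: the voter lists are made up, and a small uniform "teleport" probability eps is mixed in so that the chain has a unique stationary distribution even when one candidate beats all others.

```python
def mc4_matrix(voters, candidates):
    """Row-stochastic MC4 transition matrix: from a, pick b uniformly;
    move to b only if a majority of voters rank b above a."""
    n = len(candidates)
    P = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(candidates):
        for j, b in enumerate(candidates):
            if a == b:
                continue
            wins_b = sum(1 for v in voters if v.index(b) < v.index(a))
            if wins_b > len(voters) / 2:
                P[i][j] = 1.0 / n
        P[i][i] = 1.0 - sum(P[i])  # otherwise stay at a
    return P

def stationary(P, eps=0.05, steps=500):
    """Power iteration with uniform teleportation (eps is an assumption)."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(steps):
        y = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        x = [(1 - eps) * yj + eps / n for yj in y]
    return x

# Hypothetical full lists, best candidate first.
candidates = ["A", "B", "C", "D"]
voters = [
    ["A", "B", "C", "D"],
    ["A", "B", "D", "C"],
    ["B", "A", "C", "D"],
]
probs = stationary(mc4_matrix(voters, candidates))
ranking = sorted(candidates, key=lambda c: -probs[candidates.index(c)])
print(ranking)  # ['A', 'B', 'C', 'D']
```

Candidates are ranked by decreasing stationary probability, so the candidate that wins the most pairwise contests accumulates the most probability mass.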

A locally Kemeny optimal aggregation is a relaxation of Kemeny optimality • A locally Kemeny optimal aggregation satisfies the extended Condorcet property and can be computed in O(kn log n) time (worst case O(n²)) • Many existing aggregation methods do not satisfy XCC. => Given τ1,…,τk, use your favorite aggregation method to obtain a full list μ, and then apply local Kemenization to μ with respect to τ1,…,τk.

Local Kemenization is a procedure to get a locally Kemeny optimal aggregation. • A local Kemenization of a full list μ with respect to τ1,…,τk: compute a locally Kemeny optimal aggregation of τ1,…,τk that is maximally consistent with μ. • This approach: (1) preserves the strengths of the initial aggregation μ; (2) ranks non-spam above spam; (3) gives a result that disagrees with μ on any pair (i, j) only if a majority of the τ’s endorse this disagreement; (4) for every d, 1 ≤ d ≤ |μ|, the restriction of the output to the top d elements of μ is a local Kemenization of the top d elements of μ.

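How do we perform local Kemenization? • Local Kemenization example [figure]: input lists τ1 = A B F E C D, τ2 = B C A E F D, τ3 = A C F D E B, τ4 = B F D C A E, τ5 = C A B F E D; initial aggregation μ = B A D C E F • The elements of μ are inserted one at a time, and each new element moves up while a majority of the τ’s disagree with its current order • A vs B (A>B: 3, A<B: 2) gives A B; B vs D (B>D: 4, B<D: 1) keeps A B D; then A B C D • Result: A B C F E D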
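
The insertion procedure can be sketched in Python. The data below is a reading of the slide's figure as five voter lists plus an initial list μ = B A D C E F; that reading is an assumption, as are the function names. Each element of μ is appended at the bottom and bubbles up while a strict majority of voters prefer it to the element directly above.

```python
def majority_prefers(x, y, voters):
    """True if more than half of the voters rank x above y."""
    wins = sum(1 for v in voters if v.index(x) < v.index(y))
    return wins > len(voters) / 2

def local_kemenization(mu, voters):
    """Insert the elements of mu one at a time; each new element moves up
    past its predecessor while a majority of voters prefer it."""
    result = []
    for x in mu:
        result.append(x)
        i = len(result) - 1
        while i > 0 and majority_prefers(result[i], result[i - 1], voters):
            result[i - 1], result[i] = result[i], result[i - 1]
            i -= 1
    return result

# Assumed reading of the figure: five voter lists and the initial list mu.
voters = [
    list("ABFECD"),
    list("BCAEFD"),
    list("ACFDEB"),
    list("BFDCAE"),
    list("CABFED"),
]
mu = list("BADCEF")
print("".join(local_kemenization(mu, voters)))  # ABCFED
```

For instance, A moves above B because three of the five lists rank A higher, while D stays below B because four of the five lists rank B higher.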

Experiments: meta-search K = Kendall distance SF = scaled footrule distance IF = induced footrule distance LK = Local Kemenization
