Common Voting Rules as Maximum Likelihood Estimators
Vincent Conitzer (joint work with Tuomas Sandholm)
An early version of this work appeared in UAI-05.
Voting (rank aggregation) rules
• Set of m candidates (alternatives) C
• n voters; each voter ranks the candidates (the voter’s vote)
  • E.g. b > a > c > d
• A voting rule f maps every (multi-)set of votes to either:
  • a winner in C, or
  • a complete ranking of C
• E.g. plurality:
  • every voter votes for a single candidate (equivalently, we only consider each voter’s top-ranked candidate)
  • the candidate with the most votes wins
• E.g. single transferable vote (STV):
  • the candidate ranked first by the fewest voters drops out and is removed from the rankings
  • repeat
  • the final ranking is the inverse of the order in which the candidates dropped out
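The two example rules can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: votes are assumed to be tuples listing candidates from most to least preferred, the function names are made up for this sketch, and ties are broken alphabetically, which the slides do not specify.

```python
from collections import Counter


def plurality_winner(votes):
    """Candidate ranked first by the most voters (ties broken alphabetically, an assumption)."""
    tally = Counter(vote[0] for vote in votes)
    best = max(tally.values())
    return min(c for c, n in tally.items() if n == best)


def stv_ranking(votes):
    """STV ranking, best first (elimination ties broken alphabetically, an assumption)."""
    remaining = set(votes[0])
    dropped = []
    while remaining:
        # Count, for every remaining candidate, the votes that currently rank it first.
        tally = {c: 0 for c in remaining}
        for vote in votes:
            top = next(c for c in vote if c in remaining)
            tally[top] += 1
        loser = min(sorted(remaining), key=lambda c: tally[c])
        dropped.append(loser)
        remaining.remove(loser)
    # The final ranking is the inverse of the elimination order.
    return dropped[::-1]


votes = (
    [("b", "a", "c", "d")] * 4
    + [("a", "c", "b", "d")] * 3
    + [("c", "a", "b", "d")] * 2
)
print(plurality_winner(votes))  # b
print(stv_ranking(votes))       # ['a', 'b', 'c', 'd']
```

On this made-up profile plurality elects b, while STV eliminates d, then c, then b, and ranks a first, so the two rules can disagree on the same votes.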
Two views of voting
• Voters’ preferences are idiosyncratic; the only purpose of voting is to find a compromise winner/ranking
• There is some absolute sense in which some candidates are better than others, independent of voters’ preferences; votes are merely noisy perceptions of candidates’ true quality
[Figure: a “correct” outcome (a winner or a ranking) generates the agents’ votes (vote 1, vote 2, …, vote n). In general the noise model is P(all votes | outcome); under the conditional independence assumption, each vote is drawn separately according to P(vote | outcome).]
• Goal: given the votes, find the maximum likelihood estimate of the correct outcome
• A different noise model yields a different maximum likelihood estimator, i.e. a different voting rule
Marquis de Condorcet [1785]
• Condorcet was interested in the “correct ranking” model
• He assumed a noise model where a voter ranks any two candidates correctly with fixed probability $p > 1/2$, independently
• With some probability this gives a cycle…
  • E.g. if the correct ranking is a > b > c, then with probability $p^2(1-p)$ a voter will prefer a > b, b > c, c > a
• But this does not matter for the MLE approach, as long as we get a probability for each (acyclic) vote
  • Equivalently, we can renormalize the probabilities over the acyclic votes
  • Equivalently, we can say that if a cyclic vote is drawn, it must be redrawn
• Condorcet solved for the MLE rule for the cases of 2 and 3 candidates
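To make the cycle probability concrete, the following sketch enumerates all eight outcomes of the three pairwise comparisons for a true ranking a > b > c. The function names and the encoding (True = the pair was judged correctly) are assumptions made for illustration.

```python
from itertools import product


def outcome_probs(p):
    """Probability of each outcome of the comparisons (a,b), (b,c), (a,c); True = judged correctly."""
    probs = {}
    for outcome in product([True, False], repeat=3):
        n_correct = sum(outcome)
        probs[outcome] = p ** n_correct * (1 - p) ** (3 - n_correct)
    return probs


def is_cyclic(ab, bc, ac):
    """With true ranking a > b > c, the cycles are a>b, b>c, c>a and b>a, c>b, a>c."""
    return (ab and bc and not ac) or (not ab and not bc and ac)


def cyclic_mass(p):
    """Total probability that a voter's three comparisons form a cycle."""
    return sum(q for outcome, q in outcome_probs(p).items() if is_cyclic(*outcome))


def acyclic_vote_probs(p):
    """Renormalized probabilities over the six acyclic outcomes (i.e. the six rankings)."""
    z = 1 - cyclic_mass(p)
    return {o: q / z for o, q in outcome_probs(p).items() if not is_cyclic(*o)}


print(cyclic_mass(0.6))
```

The two cyclic outcomes have probabilities $p^2(1-p)$ and $p(1-p)^2$, which sum to $p(1-p)$; for $p = 0.6$ that is 0.24, and the remaining mass is what gets renormalized over the six rankings.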
The Kemeny rule [1959]
• Given a ranking $r$, a vote $v$, and two candidates $a$, $b$, let $\delta_{ab}(r, v) = 1$ if $r$ and $v$ disagree on the relative ranking of $a$ and $b$, and 0 otherwise
• A Kemeny ranking $r$ minimizes $\sum_{ab} \sum_{v} \delta_{ab}(r, v)$
• Young [1986]’s observation: the Kemeny rule is the solution to Condorcet’s problem!
• Drissi & Truchon [2002] extend this to the case where $p$ is allowed to vary with the distance between two candidates in the correct ranking
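A brute-force sketch can illustrate Young's observation on a small example: the ranking with the fewest pairwise disagreements is also the one that maximizes the likelihood under Condorcet's noise model. The function names and the sample votes are made up here, and the likelihood is left unnormalized on the assumption that the normalizing constant is the same for every candidate ranking.

```python
from itertools import combinations, permutations


def disagreements(r, v):
    """Number of candidate pairs that ranking r and vote v order differently."""
    pos_r = {c: i for i, c in enumerate(r)}
    pos_v = {c: i for i, c in enumerate(v)}
    return sum(
        1
        for a, b in combinations(r, 2)
        if (pos_r[a] < pos_r[b]) != (pos_v[a] < pos_v[b])
    )


def kemeny_ranking(votes):
    """Ranking minimizing the total number of pairwise disagreements (brute force)."""
    candidates = sorted(votes[0])
    return min(
        permutations(candidates),
        key=lambda r: sum(disagreements(r, v) for v in votes),
    )


def condorcet_mle(votes, p=0.6):
    """Ranking maximizing the unnormalized likelihood of the votes under Condorcet's model."""
    candidates = sorted(votes[0])
    n_pairs = len(candidates) * (len(candidates) - 1) // 2

    def likelihood(r):
        total = 1.0
        for v in votes:
            d = disagreements(r, v)
            total *= p ** (n_pairs - d) * (1 - p) ** d
        return total

    return max(permutations(candidates), key=likelihood)


votes = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2 + [("c", "a", "b")]
print(kemeny_ranking(votes), condorcet_mle(votes))
```

Each disagreement multiplies the likelihood by $(1-p)/p < 1$, so maximizing the likelihood and minimizing the disagreement count pick the same ranking. Brute force over all $m!$ rankings is only practical for a handful of candidates.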
What is next?
• Does this suggest using the Kemeny rule?
  • Many other noise models are possible
  • Some of these may correspond to other, better-known rules
• Goal of this work: classify which common rules are a maximum likelihood estimator for some noise model
  • Positive and negative results
  • Positive results are constructive
• Motivation:
  • Rules corresponding to a noise model are more natural
  • Knowing a noise model can give us insight into the rule and its underlying assumptions
  • If we disagree with the noise model, we can modify it and obtain a new version of the rule
Conditional independence restriction
• Without any independence restriction, it turns out that any rule has a noise model:
  • P(vote set | outcome) > 0 if and only if f(vote set) = outcome
• So, we will focus on conditionally independent votes
• If a rule has a noise model in this setup, we call it
  • an MLEWIV rule if it produces a winner
  • an MLERIV rule if it produces a ranking
  • (IV = Independent Votes)
[Figure: two diagrams. Without the restriction, the “correct” outcome generates the agents’ votes jointly. Under the conditional independence assumption, the “correct” outcome generates vote 1, vote 2, …, vote n separately.]
Any scoring rule is MLEWIV and MLERIV
• A scoring rule gives a candidate $a_1$ points if it is ranked first, $a_2$ points if it is ranked second, etc.
  • plurality rule: $a_1 = 1$, $a_i = 0$ otherwise
  • Borda rule: $a_i = m - i$
  • veto rule: $a_m = 0$, $a_i = 1$ otherwise
• MLEWIV noise model: $P(v \mid w) = 2^{a_{l(v,w)}}$ (up to normalization), where $l(v,w)$ is the rank of $w$ in $v$
  • we want to choose $w$ to maximize $\prod_v 2^{a_{l(v,w)}} = 2^{\sum_v a_{l(v,w)}}$
• MLERIV noise model: $P(v \mid r) = \prod_{1 \le i \le m} (m+1-i)^{a_{l(v,r_i)}}$ (up to normalization), where $r_i$ is the candidate ranked $i$th in $r$
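As a small check of the winner construction, the sketch below computes the scoring-rule winner directly and as the maximizer of the unnormalized likelihood $\prod_v 2^{a_{l(v,w)}}$. The function names and the vote profile are invented for illustration. Exact integer arithmetic is used, so no rounding is involved.

```python
import math


def score_winner(votes, scores):
    """Winner under the scoring vector `scores` (scores[0] = points for first place)."""
    candidates = sorted(votes[0])
    return max(candidates, key=lambda c: sum(scores[v.index(c)] for v in votes))


def mle_winner(votes, scores):
    """Candidate w maximizing prod_v 2 ** a_{l(v,w)}, the unnormalized likelihood."""
    candidates = sorted(votes[0])
    return max(
        candidates,
        key=lambda w: math.prod(2 ** scores[v.index(w)] for v in votes),
    )


votes = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2 + [("c", "b", "a")]
borda = [2, 1, 0]      # a_i = m - i for m = 3
plurality = [1, 0, 0]

print(score_winner(votes, borda), mle_winner(votes, borda))          # b b
print(score_winner(votes, plurality), mle_winner(votes, plurality))  # a a
```

Because $2^x$ is increasing in $x$, the product is largest exactly when the total score $\sum_v a_{l(v,w)}$ is largest, which is why the two functions agree.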
Single Transferable Vote (STV) is MLERIV
• STV rule: the candidate ranked first by the fewest voters drops out and is removed from the rankings; repeat; the final ranking is the inverse of the order in which the candidates dropped out
• MLERIV noise model:
  • Let $r_i$ be the candidate ranked $i$th in $r$
  • Let $\delta_v(r_i) = 1$ if all the candidates ranked higher than $r_i$ in $v$ are ranked lower in $r$ (i.e. they are all contained in $\{r_{i+1}, r_{i+2}, \ldots, r_m\}$), and 0 otherwise
  • $P(v \mid r) = \prod_{1 \le i \le m} k_i^{\delta_v(r_i)}$, where $k_{i+1} \ll k_i < 1$
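One way to read this noise model: let $n_i$ be the number of votes with $\delta_v(r_i) = 1$. Since every $k_i < 1$ and $k_{i+1} \ll k_i$, maximizing $\prod_i k_i^{n_i}$ amounts to minimizing $(n_m, n_{m-1}, \ldots, n_1)$ lexicographically, provided the $k_i$ are separated widely enough. The sketch below implements that lexicographic reading rather than picking concrete $k_i$ values; the function names are made up, and the vote profile is chosen to have no ties.

```python
from itertools import permutations


def delta(v, r, i):
    """1 if every candidate ranked above r[i] in vote v is ranked below r[i] in r, else 0."""
    lower_in_r = set(r[i + 1:])
    above_in_v = v[: v.index(r[i])]
    return int(all(c in lower_in_r for c in above_in_v))


def penalty_profile(votes, r):
    """(n_m, n_{m-1}, ..., n_1): counts of votes with delta = 1, last-ranked candidate first."""
    return tuple(
        sum(delta(v, r, i) for v in votes) for i in range(len(r) - 1, -1, -1)
    )


def mle_ranking(votes):
    """Ranking minimizing the penalty profile lexicographically (brute force)."""
    candidates = sorted(votes[0])
    return min(permutations(candidates), key=lambda r: penalty_profile(votes, r))


votes = [("c", "a", "b")] * 3 + [("a", "b", "c")] * 4 + [("b", "a", "c")] * 6
print(mle_ranking(votes))  # ('a', 'b', 'c')
```

Here $n_m$ is the number of first-place votes of the last-ranked candidate, so the minimization first places the candidate with the fewest first-place votes (c, with 3) last, then the candidate with the fewest top votes once c is removed (b, with 6), which mirrors the STV elimination order.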
Lemma to prove negative results
• For any noise model with conditionally independent votes, if there is a single outcome that maximizes the likelihood of both vote set 1 and vote set 2, then it must also maximize the likelihood of vote set 3, the union of vote sets 1 and 2
• Hence, a voting rule that produces the same outcome on both set 1 and set 2 but a different one on set 3 cannot be a maximum likelihood estimator
[Figure: a correct outcome generating vote 1, …, vote k (vote set 1) and vote k+1, …, vote n (vote set 2); vote set 3 consists of all n votes.]
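The argument behind the lemma fits in one line. In the sketch below, $S_1$ and $S_2$ denote vote sets 1 and 2, $S_3$ their (multiset) union, $o^*$ the outcome that maximizes the likelihood of both, and $o$ any other outcome; these symbols are introduced here for illustration. The two equalities use the conditional independence of the votes.

```latex
\[
P(S_3 \mid o^*)
  = P(S_1 \mid o^*)\, P(S_2 \mid o^*)
  \;\ge\; P(S_1 \mid o)\, P(S_2 \mid o)
  = P(S_3 \mid o)
  \qquad \text{for every outcome } o .
\]
```

If a rule also breaks ties in a fixed way, the inequality is strict whenever $o^*$ is the unique maximizer on at least one of the two sets, so the rule's output on $S_3$ is forced to be $o^*$.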
STV rule is not MLEWIV
• STV rule: the candidate ranked first by the fewest voters drops out and is removed from the rankings; repeat. The final ranking is the inverse of the order in which the candidates dropped out
• First vote set:
  • 3 times c > a > b
  • 4 times a > b > c
  • 6 times b > a > c
  • c drops out first, then a wins
• Second vote set:
  • 3 times b > a > c
  • 4 times a > c > b
  • 6 times c > a > b
  • b drops out first, then a wins
• But: taking all votes together, a drops out first!
  • (8 first-place votes for a vs. 9 each for b and c)
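The counterexample is easy to verify mechanically. The sketch below is an illustrative STV implementation (the function name and the alphabetical tie-break are assumptions, not part of the slides) run on the two vote sets and on their union.

```python
def stv_elimination_order(votes):
    """Candidates in the order they drop out; the last one is the winner.

    Ties for elimination are broken alphabetically (an assumption).
    """
    remaining = set(votes[0])
    order = []
    while remaining:
        tally = {c: 0 for c in remaining}
        for vote in votes:
            top = next(c for c in vote if c in remaining)
            tally[top] += 1
        loser = min(sorted(remaining), key=lambda c: tally[c])
        order.append(loser)
        remaining.remove(loser)
    return order


set1 = [("c", "a", "b")] * 3 + [("a", "b", "c")] * 4 + [("b", "a", "c")] * 6
set2 = [("b", "a", "c")] * 3 + [("a", "c", "b")] * 4 + [("c", "a", "b")] * 6

print(stv_elimination_order(set1))         # ['c', 'b', 'a'] -> a wins
print(stv_elimination_order(set2))         # ['b', 'c', 'a'] -> a wins
print(stv_elimination_order(set1 + set2))  # a is eliminated first
```

On the union, b and c end up tied 13 to 13 in the last round, so which of them wins depends on the tie-break; either way the winner is not a, which is all the argument needs.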
Bucklin rule is not MLEWIV/MLERIV
• Bucklin rule:
  • For every candidate, consider the minimum k such that more than half of the voters rank that candidate among the top k
  • Candidates are ranked (inversely) by their minimum k
  • Ties are broken by the number of voters by which the “half” mark is passed
• First vote set:
  • 2 times a > b > c > d > e
  • 1 time b > a > c > d > e
  • gives final ranking a > b > c > d > e
• Second vote set:
  • 2 times b > d > a > c > e
  • 1 time c > e > a > b > d
  • 1 time c > a > b > d > e
  • gives final ranking a > b > c > d > e
• But: taking all votes together gives final ranking b > a > c > d > e
  • (with 7 voters, both a and b pass the “half” mark at k=2, but b does so with 5 voters against a’s 4)
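A short sketch of the Bucklin rule as described on this slide lets the three rankings be recomputed. The function name is made up, and the tie-break is read as "more voters ranking the candidate in the top k is better", which is an interpretation of the slide's wording.

```python
def bucklin_ranking(votes):
    """Rank candidates by the smallest k at which a strict majority ranks them in the top k.

    Ties in k are broken in favor of the candidate with more top-k voters
    (an interpretation of the slide's tie-breaking rule).
    """
    n = len(votes)
    candidates = sorted(votes[0])

    def key(c):
        for k in range(1, len(candidates) + 1):
            support = sum(1 for v in votes if v.index(c) < k)
            if 2 * support > n:
                return (k, -support)

    return sorted(candidates, key=key)


set1 = [("a", "b", "c", "d", "e")] * 2 + [("b", "a", "c", "d", "e")]
set2 = (
    [("b", "d", "a", "c", "e")] * 2
    + [("c", "e", "a", "b", "d")]
    + [("c", "a", "b", "d", "e")]
)

print(bucklin_ranking(set1))         # ['a', 'b', 'c', 'd', 'e']
print(bucklin_ranking(set2))         # ['a', 'b', 'c', 'd', 'e']
print(bucklin_ranking(set1 + set2))  # ['b', 'a', 'c', 'd', 'e']
```

On the combined set of 7 votes, this sketch finds 5 voters ranking b in the top 2 and 4 voters ranking a in the top 2, so both reach k=2 and b is placed first through the tie-break.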
Pairwise election graphs
• Pairwise election: take two candidates and see which one is ranked above the other in more votes
• The pairwise election graph has an edge of weight k from a to b if a defeats b by k votes in the pairwise election
• E.g. the votes a > b > c and b > a > c together produce the pairwise election graph with an edge of weight 2 from a to c, an edge of weight 2 from b to c, and no edge between a and b (they are tied)
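The definition translates directly into code. In this sketch the graph is represented as a dictionary mapping (winner, loser) pairs to the margin of victory; that representation and the function name are choices made for illustration.

```python
from itertools import combinations


def pairwise_graph(votes):
    """Map (winner, loser) -> margin for every pairwise election that is not tied."""
    candidates = sorted(votes[0])
    edges = {}
    for a, b in combinations(candidates, 2):
        # +1 for every vote ranking a above b, -1 for every vote ranking b above a.
        margin = sum(1 if v.index(a) < v.index(b) else -1 for v in votes)
        if margin > 0:
            edges[(a, b)] = margin
        elif margin < 0:
            edges[(b, a)] = -margin
    return edges


votes = [("a", "b", "c"), ("b", "a", "c")]
print(pairwise_graph(votes))  # {('a', 'c'): 2, ('b', 'c'): 2}
```

Tied pairs, such as a and b in this example, simply produce no edge.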
(Roughly) all pairwise election graphs can be realized
• Lemma: any graph with even weights is the pairwise election graph for some votes
• Proof: we can increase the weight of the edge from a to b by two, leaving all other edges unchanged, by adding the following two votes:
  • $a > b > c_1 > c_2 > \ldots > c_{m-2}$
  • $c_{m-2} > c_{m-3} > \ldots > c_1 > a > b$
• Hence, from here on, we will simply show the pairwise election graph rather than the votes that realize it
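The two-vote construction in the proof can be checked for a concrete number of candidates. The sketch below uses m = 5 with hypothetical candidate names c1, c2, c3 and recomputes the pairwise margins; the helper names are made up for this illustration.

```python
from itertools import combinations


def pairwise_graph(votes):
    """Map (winner, loser) -> margin for every pairwise election that is not tied."""
    candidates = sorted(votes[0])
    edges = {}
    for x, y in combinations(candidates, 2):
        margin = sum(1 if v.index(x) < v.index(y) else -1 for v in votes)
        if margin > 0:
            edges[(x, y)] = margin
        elif margin < 0:
            edges[(y, x)] = -margin
    return edges


def boosting_votes(a, b, others):
    """The two votes from the proof: they add 2 to the edge a -> b and cancel on every other pair."""
    first = (a, b, *others)
    second = (*reversed(others), a, b)
    return [first, second]


pair = boosting_votes("a", "b", ["c1", "c2", "c3"])
print(pair)
print(pairwise_graph(pair))  # {('a', 'b'): 2}
```

Every pair other than (a, b) is ordered one way in the first vote and the opposite way in the second, so only the a-to-b margin changes.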
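Copeland is not MLEWIV/MLERIV
• Copeland rule: candidate’s score = number of pairwise victories minus number of pairwise defeats
  • i.e. outdegree minus indegree of the vertex in the pairwise election graph
[Figure: three pairwise election graphs, the first being the sum of the other two, with Copeland scores listed as wins-losses.]
• Sum graph: b: 2-0 = 2, a: 2-1 = 1, c: 2-2 = 0, d: 1-2 = -1, e: 0-2 = -2
• First summand graph: a: 3-1 = 2, b: 2-1 = 1, c: 2-2 = 0, d: 1-2 = -1, e: 1-3 = -2
• Second summand graph: a: 3-1 = 2, b: 2-1 = 1, c: 2-2 = 0, d: 1-2 = -1, e: 1-3 = -2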
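Computing Copeland scores from a pairwise election graph takes only a few lines. The graphs from the slide's figure are not recoverable from the text, so the example below uses a small made-up graph; the dictionary representation and function name are also assumptions.

```python
def copeland_scores(edges, candidates):
    """Copeland score = outdegree - indegree in the pairwise election graph.

    `edges` maps (winner, loser) -> margin; the margins themselves are ignored.
    """
    scores = {c: 0 for c in candidates}
    for winner, loser in edges:
        scores[winner] += 1
        scores[loser] -= 1
    return scores


# Hypothetical graph: a and b each defeat c by 2 votes; a and b are tied.
edges = {("a", "c"): 2, ("b", "c"): 2}
print(copeland_scores(edges, ["a", "b", "c"]))  # {'a': 1, 'b': 1, 'c': -2}
```

Because only the direction of each edge matters, adding two graphs can flip or cancel edges and change the scores in ways that summing the scores would not predict, which is what the counterexample exploits.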
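Maximin is not MLEWIV/MLERIV
• Maximin rule: candidate’s score = score in its worst pairwise election
  • i.e. candidates are ordered inversely by the weight of their largest incoming edge
[Figure: three pairwise election graphs, the first being the sum of the other two, with the weight of the largest incoming edge listed per candidate.]
• Sum graph: c: 2, a: 4, d: 6, b: 8
• First summand graph: a: 6, b: 8, c: 10, d: 12
• Second summand graph: a: 6, b: 8, c: 10, d: 12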
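The maximin ordering can be read off a pairwise election graph as follows. The slide's graphs are not recoverable from the text, so the example uses a made-up three-candidate cycle; the representation and the function names are assumptions, and a candidate with no incoming edge is given a worst defeat of 0.

```python
def worst_defeats(edges, candidates):
    """Weight of the largest incoming edge per candidate (0 if the candidate never loses)."""
    worst = {c: 0 for c in candidates}
    for (winner, loser), margin in edges.items():
        worst[loser] = max(worst[loser], margin)
    return worst


def maximin_ranking(edges, candidates):
    """Candidates ordered from smallest to largest worst defeat (ties broken alphabetically)."""
    worst = worst_defeats(edges, candidates)
    return sorted(sorted(candidates), key=lambda c: worst[c])


# Hypothetical cycle: a beats b by 4, b beats c by 6, c beats a by 2.
edges = {("a", "b"): 4, ("b", "c"): 6, ("c", "a"): 2}
print(worst_defeats(edges, ["a", "b", "c"]))    # {'a': 2, 'b': 4, 'c': 6}
print(maximin_ranking(edges, ["a", "b", "c"]))  # ['a', 'b', 'c']
```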
Ranked pairs is not MLEWIV/MLERIV
• Ranked pairs rule: pairwise elections are locked in in order of margin of victory
  • i.e. larger edges are “fixed” first; an edge is discarded if it introduces a cycle
[Figure: three pairwise election graphs, the first being the sum of the other two; edges are listed in the order in which they are considered.]
• Sum graph: d > a fixed, c > d fixed, a > c discarded, b > d fixed, a > b discarded, b > c fixed; result: b > c > d > a
• First summand graph: a > c fixed, c > d fixed, d > a discarded, b > c fixed, a > b fixed; result: a > b > c > d
• Second summand graph: b > d fixed, a > b fixed, d > a discarded, b > c fixed, c > d fixed; result: a > b > c > d
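The lock-in procedure can be replayed on the edge sequences listed on this slide. The edge weights are not available in the text, so the sketch takes the edges already sorted by decreasing margin, as the slide lists them. The function name is made up, and ordering candidates by how many others they reach in the locked graph is a simplification that assumes the locked edges determine a full ranking.

```python
def ranked_pairs(ordered_edges):
    """Ranked pairs on (winner, loser) edges given in order of decreasing margin."""
    locked = {}  # candidate -> set of candidates it is locked in above

    def reaches(src, dst):
        """True if dst can be reached from src along locked edges."""
        stack, seen = [src], set()
        while stack:
            x = stack.pop()
            if x == dst:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(locked.get(x, ()))
        return False

    for winner, loser in ordered_edges:
        # Locking winner -> loser would close a cycle iff loser already reaches winner.
        if not reaches(loser, winner):
            locked.setdefault(winner, set()).add(loser)

    candidates = sorted({c for edge in ordered_edges for c in edge})
    return sorted(
        candidates,
        key=lambda c: -sum(reaches(c, d) for d in candidates if d != c),
    )


sum_graph = [("d", "a"), ("c", "d"), ("a", "c"), ("b", "d"), ("a", "b"), ("b", "c")]
summand1 = [("a", "c"), ("c", "d"), ("d", "a"), ("b", "c"), ("a", "b")]
summand2 = [("b", "d"), ("a", "b"), ("d", "a"), ("b", "c"), ("c", "d")]

print(ranked_pairs(sum_graph))  # ['b', 'c', 'd', 'a']
print(ranked_pairs(summand1))   # ['a', 'b', 'c', 'd']
print(ranked_pairs(summand2))   # ['a', 'b', 'c', 'd']
```

The replay discards the same edges as the slide: a > c and a > b in the first sequence, and d > a in each of the other two.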
Consistency & scoring rules
• A rule is consistent if, whenever it produces the same winner on two vote sets, it produces the same winner on the union of those sets
• Known result: a rule is consistent if and only if it determines the winner according to a scoring rule [Young 1975]
• Hence, the following are equivalent properties of a rule:
  • Consistency
  • Determining the winner according to a scoring rule
  • MLEWIV
• These questions are open (as far as I know):
  • What is the characterization of MLERIV rules?
  • What is the characterization of “ranking-consistent” voting rules?
  • What is the relationship between these?
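Conclusions
• We asked the question: which common voting rules are maximum likelihood estimators (for some noise model)?
• If votes are not independent given the outcome (winner/ranking), any rule is an MLE
• If votes are independent given the outcome, some rules are MLEWIV (MLE for winner), some are MLERIV (MLE for ranking), some are both. From the preceding slides:
  • Both MLEWIV and MLERIV: scoring rules (e.g. plurality, Borda, veto)
  • MLERIV but not MLEWIV: STV
  • Neither: Bucklin, Copeland, maximin, ranked pairs
Thank you for your attention!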