When Do Noisy Votes Reveal the Truth? Ioannis Caragiannis¹, Ariel D. Procaccia², Nisarg Shah² (speaker) • ¹ University of Patras & CTI • ² Carnegie Mellon University
What? Why? • What? • Alternatives to be compared • True order (unknown ground truth) • Noisy estimates (votes) drawn from some distribution around it • Q: How many votes are needed to accurately find the true order? • Why? • Practical motivation • Theoretical motivation • [Figure: alternatives a, b, c, d and example votes b > a > c > d, a > c > b > d, a > b > c > d, a > b > d > c]
Practical Motivation 1. Human Computation • EteRNA, Foldit, crowdsourcing, … • How many users/workers are required? 2. Judgement Aggregation • Jury system, experts ranking restaurants, … • How many experts are required?
Theoretical Motivation • Maximum Likelihood Estimator (MLE) view: is a given voting rule the MLE for some noise model? • Problems • Each noise model admits only one voting rule as its MLE • The noise models obtained this way can be strange • The noise model is usually unknown anyway • Our contribution • MLE is too stringent! • We just want low sample complexity • … over a family of reasonable noise models
Boring Stuff! • Voting rule (r) • Input: several rankings of the alternatives • Social choice function (traditionally): outputs a winning alternative • Social welfare function (this work): outputs a ranking of the alternatives • Noise model over rankings (G) • Specifies, for every ground truth σ* and every ranking σ, the probability Pr[σ | σ*] • Mallows' model: Pr[σ | σ*] ∝ φ^d(σ, σ*), with parameter φ ∈ (0, 1) • d = Kendall-Tau distance = #pairwise comparisons two rankings disagree on • Sample complexity of rule r for noise model G and accuracy ε • Smallest number of votes n such that, for every ground truth σ*, r outputs σ* with probability at least 1 − ε given n votes drawn from G
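To make the definitions concrete, here is a small illustrative sketch (not the authors' code): it draws votes from Mallows' model by enumerating all rankings, so it is only feasible for a handful of alternatives. The function names (kendall_tau, sample_mallows) and the parameter values are my own choices for illustration.

```python
# Illustrative sketch: draw votes from Mallows' model, where
# Pr[sigma | sigma*] is proportional to phi ** kendall_tau(sigma, sigma*).
import itertools
import random


def kendall_tau(sigma, tau):
    """Number of pairwise comparisons on which rankings sigma and tau disagree."""
    pos_tau = {a: i for i, a in enumerate(tau)}
    return sum(
        1
        for i in range(len(sigma))
        for j in range(i + 1, len(sigma))
        if pos_tau[sigma[i]] > pos_tau[sigma[j]]
    )


def sample_mallows(ground_truth, phi, n_votes, rng=random):
    """Draw n_votes rankings, each with probability proportional to phi^KT-distance."""
    rankings = list(itertools.permutations(ground_truth))
    weights = [phi ** kendall_tau(r, ground_truth) for r in rankings]
    return rng.choices(rankings, weights=weights, k=n_votes)


for vote in sample_mallows(("a", "b", "c", "d"), phi=0.5, n_votes=5):
    print(" > ".join(vote))
```

With φ close to 0 the votes concentrate on the ground truth; with φ close to 1 they approach uniform noise.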
Sample Complexity for Mallows' Model • Kemeny rule (+ any tie-breaking) = MLE • Theorem: Kemeny rule + uniformly random tie-breaking = optimal sample complexity for Mallows' model, any accuracy. • Subtlety: MLE does not always imply optimal sample complexity! • So, are the other voting rules really bad for Mallows' model? • No.
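A minimal brute-force sketch of the Kemeny rule, assuming votes are given as tuples: it returns a ranking minimizing the total Kendall-Tau distance to the votes, which is exactly the MLE ranking under Mallows' model, and breaks ties uniformly at random. Function names are illustrative; a practical implementation would not enumerate all m! rankings.

```python
# Exhaustive Kemeny rule: minimize total Kendall-Tau distance to the votes,
# breaking ties uniformly at random. Exponential in the number of alternatives.
import itertools
import random


def kendall_tau(sigma, tau):
    """Number of pairwise comparisons on which sigma and tau disagree."""
    pos_tau = {a: i for i, a in enumerate(tau)}
    return sum(
        1
        for i in range(len(sigma))
        for j in range(i + 1, len(sigma))
        if pos_tau[sigma[i]] > pos_tau[sigma[j]]
    )


def kemeny(votes, alternatives, rng=random):
    """Return a ranking with minimum total KT distance to the votes."""
    best_cost, best = None, []
    for candidate in itertools.permutations(alternatives):
        cost = sum(kendall_tau(candidate, vote) for vote in votes)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [candidate]
        elif cost == best_cost:
            best.append(candidate)
    return rng.choice(best)  # uniformly random tie-breaking


votes = [("a", "c", "b", "d"), ("a", "b", "c", "d"), ("b", "a", "c", "d")]
print(" > ".join(kemeny(votes, ("a", "b", "c", "d"))))
```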
PM-c and PD-c Rules • Pairwise Majority Consistent Rules (PM-c) • Must match the pairwise majority graph whenever it is acyclic • Condorcet consistency for social welfare functions • [Figure: pairwise majority graph over alternatives a, b, c, d]
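A rough sketch (my own naming) of the condition PM-c rules must satisfy: build the strict pairwise-majority graph from the votes and, when it is a complete acyclic tournament, return the ranking it induces; every PM-c rule must output exactly that ranking in this case.

```python
# PM-c condition: when the strict pairwise-majority graph is a complete acyclic
# tournament, return the ranking it induces; otherwise return None.


def pairwise_majority_ranking(votes, alternatives):
    """Ranking induced by the pairwise majority graph, or None if no strict order."""
    def beats(x, y):
        x_wins = sum(1 for v in votes if v.index(x) < v.index(y))
        return x_wins > len(votes) - x_wins

    out_deg = {a: sum(beats(a, b) for b in alternatives if b != a)
               for a in alternatives}
    ranking = sorted(alternatives, key=lambda a: out_deg[a], reverse=True)
    m = len(alternatives)
    # The majority graph is a complete acyclic tournament exactly when the
    # out-degrees (numbers of strict majority wins) are m-1, m-2, ..., 0.
    if all(out_deg[a] == m - 1 - i for i, a in enumerate(ranking)):
        return tuple(ranking)
    return None


votes = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("b", "a", "c", "d")]
print(pairwise_majority_ranking(votes, ("a", "b", "c", "d")))
```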
PM-c and PD-c Rules • PD-c (Position-Dominance Consistent) rules are defined similarly, but focus on positions of alternatives rather than pairwise comparisons • [Figure: Venn diagram placing the rules KM, PSR, SL, SC, BL, CP, RP within the PM-c and PD-c classes]
The Big Picture • PM-c rules: O(log m) votes suffice (m = #alternatives) • Any voting rule: Ω(log m) votes are needed • Kemeny rule + uniform tie-breaking: optimal sample complexity • Logarithmic: PM-c rules • Polynomial: many scoring rules • Strictly exponential: plurality, veto
Take-Away - I • Given any fixed noise model, sample complexity is a clear and useful criterion for selecting voting rules • Hey, what happened to the noise model being unknown?
Generalization • Stronger need: unknown noise model → working well on a family of reasonable noise models • Problems • What is "reasonable"? • HUGE sample complexity for near-extreme parameter values! • Relaxation: accuracy in the limit → ground truth with probability 1 given infinitely many samples • A novel axiomatic property
Accuracy in the Limit • d-monotonic noise models: rankings closer to the ground truth under the distance d are more probable • Monotonicity is reasonable, but why the Kendall-Tau distance in particular?
Take-Away - II • Robustness → accuracy in the limit over a family of reasonable noise models • d-monotonic noise models → reasonable • If you believe in PM-c and PD-c rules → look for distances that are both MC and PC: Kendall-Tau, footrule, maximum displacement • Cayley distance and Hamming distance are neither MC nor PC • Even the most popular rule – plurality – is not accurate in the limit for any monotonic noise model over either distance! • These distances just lose too much information for the true ranking to be recovered
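For reference, here are straightforward sketches (my own naming) of the five distances over rankings mentioned above, with rankings represented as tuples of alternatives.

```python
# Distances over rankings: Kendall-Tau, footrule, maximum displacement,
# Hamming, and Cayley. Rankings are equal-length tuples of alternatives.


def _positions(ranking):
    return {a: i for i, a in enumerate(ranking)}


def kendall_tau(s, t):
    """Number of pairwise comparisons on which s and t disagree."""
    pt = _positions(t)
    return sum(1 for i in range(len(s)) for j in range(i + 1, len(s))
               if pt[s[i]] > pt[s[j]])


def footrule(s, t):
    """Spearman's footrule: total displacement of the alternatives."""
    ps, pt = _positions(s), _positions(t)
    return sum(abs(ps[a] - pt[a]) for a in s)


def max_displacement(s, t):
    """Largest displacement of any single alternative."""
    ps, pt = _positions(s), _positions(t)
    return max(abs(ps[a] - pt[a]) for a in s)


def hamming(s, t):
    """Number of positions holding different alternatives."""
    return sum(1 for x, y in zip(s, t) if x != y)


def cayley(s, t):
    """Minimum number of (arbitrary) transpositions turning s into t:
    the number of alternatives minus the number of cycles of the permutation."""
    pt = _positions(t)
    perm = [pt[a] for a in s]
    seen, cycles = set(), 0
    for i in range(len(perm)):
        if i not in seen:
            cycles += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return len(perm) - cycles


s, t = ("a", "b", "c", "d"), ("b", "a", "d", "c")
print(kendall_tau(s, t), footrule(s, t), max_displacement(s, t),
      hamming(s, t), cayley(s, t))  # 2 4 1 4 2
```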
Distances over Rankings • MC (Majority-Concentric) Distance • Fix a ground truth ranking σ* and a distance d; for every radius k, consider the "ball" of rankings within distance k of σ* • d is MC if, for every pairwise comparison, a (weak) majority of the rankings in every such ball agrees with σ* • [Figure: concentric balls of rankings around σ*]
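A brute-force sketch (my own naming) of the MC test for small m: for every ground truth σ*, every radius k, and every pair of alternatives, it checks that a weak majority of the rankings within distance k of σ* agree with σ* on that pair. It enumerates all m! rankings, so it is only meant for very small m; any distance function over rankings can be plugged in.

```python
# Brute-force majority-concentric (MC) check for a distance over rankings.
from itertools import combinations, permutations


def kendall_tau(s, t):
    pt = {a: i for i, a in enumerate(t)}
    return sum(1 for i in range(len(s)) for j in range(i + 1, len(s))
               if pt[s[i]] > pt[s[j]])


def is_majority_concentric(dist, alternatives):
    rankings = list(permutations(alternatives))
    for sigma_star in rankings:
        dists = {r: dist(r, sigma_star) for r in rankings}
        star_pos = {a: i for i, a in enumerate(sigma_star)}
        for k in sorted(set(dists.values())):
            ball = [r for r in rankings if dists[r] <= k]
            for a, b in combinations(alternatives, 2):
                agree = sum(1 for r in ball
                            if (r.index(a) < r.index(b)) == (star_pos[a] < star_pos[b]))
                if 2 * agree < len(ball):  # a strict minority agrees -> not MC
                    return False
    return True


# Kendall-Tau is claimed above to be MC, so this should print True.
print(is_majority_concentric(kendall_tau, "abcd"))
```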
Discussion • The stringent MLE requirement → sample complexity • Connections to axiomatic and distance rationalizability views? • Noise model unknown → d-monotonic noise models • Some distances over rankings are better suited for voting than others (e.g., MC and PC distances) • An extensive study of the applicability of various distance metrics in social choice • Practical applications → extension to voting with partial information: pairwise comparisons, partial orders, top-k lists