On the Limits of Dictatorial Classification
Reshef Meir, School of Computer Science and Engineering, Hebrew University
Joint work with Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein
Strategy-Proof Classification
• An Example
• Motivation
• Our Model and previous results
• Filling the gap: proving a lower bound
• The weighted case
Strategic labeling: an example (figure: the ERM classifier on the reported dataset, making 5 errors)
There is a better classifier! (for me…)
If I just change the labels… (figure: the ERM on the manipulated dataset, making 2+5 = 7 errors)
Classification
The supervised classification problem:
• Input: a set of labeled data points {(xi, yi)}, i = 1..m
• Output: a classifier c from some predefined concept class C (e.g., functions of the form f : X → {−,+})
• We usually want c not just to classify the sample correctly, but to generalize well, i.e., to minimize R(c) ≡ E(x,y)~D[ c(x) ≠ y ], the expected number of errors w.r.t. the distribution D (the 0/1 loss function)
Classification (cont.)
• A common approach is to return the ERM (Empirical Risk Minimizer), i.e., the concept in C that is best w.r.t. the given samples (has the lowest number of errors)
• The ERM generalizes well under some assumptions on the concept class C (e.g., linear classifiers tend to generalize well)
• With multiple experts, we can't trust our ERM!
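As an illustration, here is a minimal sketch of ERM over a finite concept class with the 0/1 loss defined above; the tiny threshold class and the data points are hypothetical, chosen only to make the computation concrete.

```python
# Minimal ERM sketch over a finite concept class (hypothetical data and class).

def empirical_risk(concept, sample):
    """Fraction of sample points that the concept labels incorrectly (0/1 loss)."""
    return sum(1 for x, y in sample if concept(x) != y) / len(sample)

def erm(concept_class, sample):
    """Return the concept in the class with the lowest empirical risk."""
    return min(concept_class, key=lambda c: empirical_risk(c, sample))

# A tiny concept class: three threshold classifiers on the real line.
concept_class = [lambda x, t=t: +1 if x >= t else -1 for t in (0.0, 0.5, 1.0)]
sample = [(0.1, -1), (0.4, -1), (0.6, +1), (0.9, +1)]

best = erm(concept_class, sample)
print(empirical_risk(best, sample))   # 0.0 -- the threshold at 0.5 fits the sample perfectly
```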
Where do we find “experts” with incentives?
Example 1: A firm learning purchase patterns
• Information is gathered from local retailers
• The resulting policy affects them
• “The best policy is the policy that fits my pattern”
Example 2: Internet polls / polls of experts (figure: Users → Reported Dataset → Classification Algorithm → Classifier)
Motivation from other domains
• Aggregating partitions
• Judgment aggregation
• Facility location (on the binary cube)
A problem instance is defined by:
• A set of agents I = {1,...,n}
• A set of data points X = {x1,...,xm} (a subset of the instance space)
• For each xk ∈ X, agent i has a label yik ∈ {−,+}
• Each pair sik = ⟨xk, yik⟩ is a sample
• All samples of a single agent compose the labeled dataset Si = {si1,...,si,m(i)}
• The joint dataset S = ⟨S1, S2,…, Sn⟩ is our input
• m = |S|
• We denote the dataset with the reported labels by S’
Input: example (figure: a shared set of points X, with each agent i contributing a label vector Yi ∈ {−,+}m)
S = ⟨S1, S2,…, Sn⟩ = ⟨(X,Y1),…, (X,Yn)⟩
Mechanisms
• A mechanism M receives a labeled dataset S and outputs c = M(S) ∈ C
• Private risk of agent i (% of errors on Si): Ri(c,S) = |{k : c(xik) ≠ yik}| / mi
• Global risk (% of errors on S): R(c,S) = |{⟨i,k⟩ : c(xik) ≠ yik}| / m
• We allow non-deterministic mechanisms, and measure the expected risk
ERM
We compare the outcome of M to the ERM:
c* = ERM(S) = argminc ∈ C R(c,S),  r* = R(c*,S)
Can our mechanism simply compute and return the ERM?
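To make these definitions concrete, here is a small sketch of the private risk, the global risk, and the ERM over the joint dataset; the dictionary-based data layout and the function names are illustrative assumptions, not the paper's notation.

```python
# Sketch of the model's risk definitions (the data layout is a hypothetical choice).
# A dataset S is a dict: agent -> list of (point, label) pairs, labels in {-1, +1}.

def private_risk(concept, S, agent):
    """R_i(c, S): fraction of agent i's samples that c labels incorrectly."""
    samples = S[agent]
    return sum(1 for x, y in samples if concept(x) != y) / len(samples)

def global_risk(concept, S):
    """R(c, S): fraction of all samples (over all agents) that c labels incorrectly."""
    all_samples = [s for samples in S.values() for s in samples]
    return sum(1 for x, y in all_samples if concept(x) != y) / len(all_samples)

def erm(concept_class, S):
    """c* = argmin over the concept class of the global risk on the joint dataset."""
    return min(concept_class, key=lambda c: global_risk(c, S))

# Usage with the two constant classifiers ("all positive" / "all negative"):
ALL_POS, ALL_NEG = (lambda x: +1), (lambda x: -1)
S = {1: [("a", +1), ("b", +1)], 2: [("a", -1), ("b", +1)]}
c_star = erm([ALL_POS, ALL_NEG], S)
print(global_risk(c_star, S), private_risk(c_star, S, 2))   # 0.25 0.5
```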
Requirements (MOST IMPORTANT SLIDE)
• Good approximation: ∀S, R(M(S),S) ≤ α∙r*
• Strategy-Proofness (SP): ∀i, S, Si’: Ri(M(S−i, Si’), S) ≥ Ri(M(S), S)  (the left side is agent i's risk when lying, the right side when truthful)
• ERM(S) is 1-approximating but not SP
• ERM(S1) is SP but gives a bad approximation
Are there any mechanisms that guarantee both SP and good approximation?
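The claim that ERM(S) is not SP can be demonstrated on a tiny instance over the two constant classifiers; the numbers below are an illustrative construction rather than the example from the slides: by flipping its single positive label to negative, agent 2 changes the ERM from "all positive" to "all negative" and lowers its own error on its true labels.

```python
# Hypothetical instance showing that returning ERM(S) is manipulable.
ALL_POS, ALL_NEG = (lambda x: +1), (lambda x: -1)
CONCEPTS = [ALL_POS, ALL_NEG]

def global_risk(c, S):
    samples = [s for agent in S for s in S[agent]]
    return sum(1 for x, y in samples if c(x) != y) / len(samples)

def private_risk(c, S, i):
    return sum(1 for x, y in S[i] if c(x) != y) / len(S[i])

def erm(S):
    return min(CONCEPTS, key=lambda c: global_risk(c, S))

# True labels: agent 1 holds 3 positives; agent 2 holds 1 positive and 3 negatives.
truth = {1: [(k, +1) for k in range(3)],
         2: [(0, +1)] + [(k, -1) for k in range(1, 4)]}
print(private_risk(erm(truth), truth, 2))   # 0.75 (the ERM is "all positive")

# Agent 2 lies and reports all of its points as negative.
lie = {1: truth[1], 2: [(k, -1) for k in range(4)]}
print(private_risk(erm(lie), truth, 2))     # 0.25 (the ERM flips to "all negative")
```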
Related work
• A study of SP mechanisms in regression learning: O. Dekel, F. Fischer and A. D. Procaccia, SODA (2008), JCSS (2009) [supervised learning]
• No SP mechanisms for clustering: J. Perote-Peña and J. Perote, Economics Bulletin (2003) [unsupervised learning]
Previous work: a simple case
• Tiny concept class: |C| = 2, either “all positive” or “all negative”
Theorem:
• There is an SP 2-approximation mechanism
• There are no SP α-approximation mechanisms for any α < 2
(Meir, Procaccia and Rosenschein, AAAI 2008)
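For intuition, here is a sketch of one natural strategy-proof rule for this two-concept class: each agent votes for whichever constant fits its own reported labels better, and votes are weighted by dataset size. This is only an illustration of how strategy-proofness can arise here; it is not claimed to be the 2-approximation mechanism from the AAAI 2008 paper.

```python
# Illustrative SP rule for C = {"all positive", "all negative"}.
# An agent's report only determines its own vote, so misreporting can at most
# flip that vote away from the agent's truly preferred constant -- it never helps.

def preferred_constant(samples):
    """The constant label (+1 or -1) with fewer errors on this agent's samples."""
    positives = sum(1 for _, y in samples if y == +1)
    return +1 if positives >= len(samples) - positives else -1

def weighted_vote(S):
    """Return the constant preferred by the dataset-size-weighted majority of agents."""
    score = sum(len(samples) * preferred_constant(samples) for samples in S.values())
    return +1 if score >= 0 else -1

S = {1: [("a", +1), ("b", +1)], 2: [("a", -1), ("b", -1), ("c", -1)]}
print(weighted_vote(S))   # -1: agent 2 holds more points, so "all negative" wins
```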
Previous work: general concept classes
Theorem: Selecting a dictator at random is SP and guarantees a bounded approximation ratio
• True for any concept class C
• Generalizes well from sampled data when C has a bounded VC dimension
Open question #1: are there better mechanisms?
Open question #2: what if agents are weighted?
(Meir, Procaccia and Rosenschein, IJCAI 2009)
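A minimal sketch of the random-dictator idea, assuming the dictator is drawn uniformly and the mechanism then returns the ERM on that agent's labels alone; the exact sampling distribution and tie-breaking in the paper may differ.

```python
import random

# Random-dictator sketch: pick one agent, fit the best concept to its labels only.
# SP: a non-chosen agent's report never affects the outcome, and the chosen agent
# receives the ERM on its own reported labels, so truthful reporting is optimal.

def empirical_risk(c, samples):
    return sum(1 for x, y in samples if c(x) != y) / len(samples)

def random_dictator(concept_class, S, rng=random):
    dictator = rng.choice(list(S.keys()))          # uniform choice (an assumption here)
    return min(concept_class, key=lambda c: empirical_risk(c, S[dictator]))

CONSTS = [lambda x: +1, lambda x: -1]
S = {1: [("a", +1)], 2: [("a", -1), ("b", -1)]}
print(random_dictator(CONSTS, S)("a"))   # +1 or -1, depending on the chosen agent
```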
A lower bound
Our main result, matching the upper bound from IJCAI-09:
Theorem: There is a concept class C (where |C| = 3) for which any SP mechanism has an approximation ratio no better than that upper bound.
• Proof is by a careful reduction to a voting scenario
• We will see the proof sketch
Proof sketch
Gibbard [‘77] proved that every (randomized) SP voting rule for 3 candidates must be a lottery over dictators*. We define X = {x, y, z} and C = {cx, cy, cz} (figure: the table defining the three concepts). We also restrict the agents, so that each agent can have mixed labels on only one point.
Proof sketch (cont.)
(figure: two agents’ induced preference orders over the concepts, cz > cy > cx and cx > cz > cy)
• Suppose that M is SP
• M must be monotone on the mixed point
• M must ignore the mixed point
• M is a (randomized) voting rule
Proof sketch (cont.)
(figure: two agents’ induced preference orders over the concepts, cz > cy > cx and cx > cz > cy)
• By Gibbard [‘77], M is a random dictator
• We construct an instance where random dictators perform poorly
Weighted agents
• We must select a dictator randomly
• However, the selection probability may be based on weight
• Naïve approach: only gives a 3-approximation
• An optimal SP algorithm: matches the lower bound
Future work
• Other concept classes
• Other loss functions (linear loss, quadratic loss, …)
• Alternative assumptions on the structure of the data
• Other models of strategic behavior
• …