Strategy-Proof Classification
Reshef Meir, School of Computer Science and Engineering, Hebrew University
A joint work with Ariel D. Procaccia and Jeffrey S. Rosenschein
Strategy-Proof Classification • An Example of Strategic Labels in Classification • Motivation • Our Model • Previous work (positive results) • An impossibility theorem • More results (if there is time) (~12 minutes)
Strategic labeling: an example
[figure: the ERM classifier on the reported data makes 5 errors]
There is a better classifier! (for me…)
If I only change the labels… (2+4 = 6 errors)
Classification
The supervised classification problem:
• Input: a set of labeled data points {(x_i, y_i)}_{i=1..m}
• Output: a classifier c from some predefined concept class C (functions of the form f : X → {−,+})
• We usually want c to classify correctly not just the sample, but to generalize well, i.e. to minimize R(c) ≡ the expected error w.r.t. the distribution D: R(c) = E_{(x,y)~D}[c(x) ≠ y]
Classification (cont.)
• A common approach is to return the ERM, i.e. the concept in C that is best w.r.t. the given samples (has the lowest number of errors)
• The ERM generalizes well under some assumptions on the concept class C
• With multiple experts, we can’t trust our ERM!
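A minimal sketch of empirical risk minimization over a finite concept class with 0/1 loss; the threshold classifiers, sample, and function names are illustrative assumptions, not taken from the talk.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # a data point x with its label y in {-1, +1}

def empirical_risk(c: Callable[[float], int], sample: List[Example]) -> float:
    """Fraction of sample points that c labels incorrectly (0/1 loss)."""
    errors = sum(1 for x, y in sample if c(x) != y)
    return errors / len(sample)

def erm(concept_class: List[Callable[[float], int]], sample: List[Example]):
    """Return the concept in C with the lowest empirical risk on the sample."""
    return min(concept_class, key=lambda c: empirical_risk(c, sample))

# Illustrative concept class: three threshold classifiers on the real line.
thresholds = [lambda x, t=t: +1 if x >= t else -1 for t in (0.0, 0.5, 1.0)]
sample = [(0.2, -1), (0.4, -1), (0.6, +1), (0.9, +1)]
best = erm(thresholds, sample)
print(empirical_risk(best, sample))  # 0.0: the threshold at 0.5 fits this sample perfectly
```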
Where do we find “experts” with incentives?
Example 1: A firm learning purchase patterns
• Information gathered from local retailers
• The resulting policy affects them
• “The best policy is the policy that fits my pattern”
Example 2: Internet polls / expert systems
[diagram: Users → Reported Dataset → Classification Algorithm → Classifier]
Related work
• A study of SP mechanisms in regression learning: O. Dekel, F. Fischer and A. D. Procaccia, Incentive Compatible Regression Learning, SODA 2008
• No SP mechanisms for clustering: J. Perote-Peña and J. Perote, The Impossibility of Strategy-Proof Clustering, Economics Bulletin, 2003
A problem instance is defined by:
• A set of agents I = {1,...,n}
• A partial dataset for each agent i ∈ I: X_i = {x_i1,...,x_i,m(i)} ⊆ X
• For each x_ik ∈ X_i, agent i has a label y_ik ∈ {−,+}
• Each pair s_ik = ⟨x_ik, y_ik⟩ is an example
• All examples of a single agent compose the labeled dataset S_i = {s_i1,...,s_i,m(i)}
• The joint dataset S = ⟨S_1, S_2,…, S_n⟩ is our input
• m = |S|
• We denote the dataset with the reported labels by S'
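A minimal sketch of these objects as Python data structures; the Agent class and field names are illustrative assumptions, not code from the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Agent:
    points: List[float]         # X_i: the agent's data points
    true_labels: List[int]      # y_ik in {-1, +1}: the agent's private labels
    reported_labels: List[int]  # the labels the agent actually reports (its part of S')

def joint_size(agents: List[Agent]) -> int:
    """m = |S|: the total number of examples over all agents."""
    return sum(len(a.points) for a in agents)

agents = [
    Agent(points=[0.1, 0.3], true_labels=[-1, -1], reported_labels=[-1, -1]),
    Agent(points=[0.7, 0.9], true_labels=[+1, +1], reported_labels=[+1, +1]),
]
print(joint_size(agents))  # 4
```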
Input: an example
[figure: three agents' datasets with their −/+ labels, where X_i ∈ X^m_i and Y_i ∈ {−,+}^m_i]
S = ⟨S_1, S_2,…, S_n⟩ = ⟨(X_1,Y_1),…, (X_n,Y_n)⟩
Incentives and Mechanisms
• A mechanism M receives a labeled dataset S' and outputs c ∈ C
• Private risk of agent i: R_i(c,S) = |{k : c(x_ik) ≠ y_ik}| / m_i
• Global risk: R(c,S) = |{(i,k) : c(x_ik) ≠ y_ik}| / m
• We allow non-deterministic mechanisms: the outcome is a random variable, and we measure the expected risk
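A sketch of the two risk measures, building on the illustrative Agent structure above (an assumption, not the paper's code); the private risk is evaluated on an agent's true labels, and the global risk can be evaluated on either the true or the reported dataset.

```python
def risk(c, points, labels):
    """Fraction of the given examples that c mislabels (0/1 loss)."""
    errors = sum(1 for x, y in zip(points, labels) if c(x) != y)
    return errors / len(points)

def private_risk(c, agent):
    """R_i(c, S): c's risk on agent i's examples, w.r.t. its true labels."""
    return risk(c, agent.points, agent.true_labels)

def global_risk(c, agents, reported=False):
    """R(c, S): c's risk over the joint dataset; reported=True gives R(c, S')."""
    points = [x for a in agents for x in a.points]
    labels = [y for a in agents
              for y in (a.reported_labels if reported else a.true_labels)]
    return risk(c, points, labels)
```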
ERM
We compare the outcome of M to the ERM:
c* = ERM(S) = argmin_{c ∈ C} R(c,S),  r* = R(c*,S)
Can our mechanism simply compute and return the ERM?
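With the risk functions above, the ERM benchmark is a one-line mechanism (an illustrative sketch; the mechanism only sees the reported labels, and, as the next slide states, it is not strategy-proof).

```python
def erm_mechanism(concept_class, agents):
    """Return c* = argmin_{c in C} R(c, S') on the reported labels, and its risk."""
    c_star = min(concept_class, key=lambda c: global_risk(c, agents, reported=True))
    return c_star, global_risk(c_star, agents, reported=True)
```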
Requirements (MOST IMPORTANT SLIDE)
Are there any mechanisms that guarantee both SP and good approximation?
• Good approximation: ∀S: R(M(S),S) ≤ β∙r*
• Strategy-proofness (SP): ∀i, S, S_i': R_i(M(⟨S_-i, S_i'⟩), S) ≥ R_i(M(S), S)
• ERM(S) is 1-approximating but not SP
• ERM(S_1) is SP but gives a bad approximation
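To make the "ERM is not SP" claim concrete, here is a small hand-built instance (constructed for illustration, not taken from the talk) with two constant classifiers, reusing the sketches above: by flipping its single negative label, agent 1 switches the ERM from "all minus" to "all plus" and lowers its private risk from 3/4 to 1/4.

```python
# Two constant classifiers: "everything +" and "everything -".
all_plus = lambda x: +1
all_minus = lambda x: -1
C = [all_plus, all_minus]

truthful = [
    Agent(points=[1, 2, 3, 4], true_labels=[+1, +1, +1, -1],
          reported_labels=[+1, +1, +1, -1]),
    Agent(points=[5, 6, 7], true_labels=[-1, -1, -1],
          reported_labels=[-1, -1, -1]),
]
c_truthful, _ = erm_mechanism(C, truthful)
print(private_risk(c_truthful, truthful[0]))   # 0.75: ERM picks all_minus (3 vs 4 errors)

# Agent 1 flips its single negative label to +:
manipulated = [
    Agent(points=[1, 2, 3, 4], true_labels=[+1, +1, +1, -1],
          reported_labels=[+1, +1, +1, +1]),
    truthful[1],
]
c_manip, _ = erm_mechanism(C, manipulated)
print(private_risk(c_manip, manipulated[0]))   # 0.25: ERM now picks all_plus
```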
Restricted settings
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification under Constant Hypotheses: A Tale of Two Functions, AAAI 2008
• A very small concept class: |C| = 2
• There is a deterministic SP mechanism that obtains a 3-approximation ratio
• This bound is tight
• Randomization can improve the bound to 2
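The exact 3-approximation mechanism is in the AAAI 2008 paper; the sketch below only illustrates the general idea of an SP mechanism for |C| = 2, where each agent is reduced to the concept it prefers on its own data and the outcome is a vote weighted by dataset size. This is a reading of the setting, not necessarily the paper's mechanism, and no approximation ratio is claimed for it; it is SP because an agent's report only determines which of the two concepts it is counted as supporting.

```python
def preferred_concept(agent, c1, c2):
    """The concept with the lower risk on the agent's reported labels."""
    r1 = risk(c1, agent.points, agent.reported_labels)
    r2 = risk(c2, agent.points, agent.reported_labels)
    return c1 if r1 <= r2 else c2

def weighted_vote_mechanism(agents, c1, c2):
    """|C| = 2: each agent 'votes' for its preferred concept, weighted by
    its dataset size; the concept with the larger total weight wins."""
    weight_c1 = sum(len(a.points) for a in agents
                    if preferred_concept(a, c1, c2) is c1)
    weight_c2 = joint_size(agents) - weight_c1
    return c1 if weight_c1 >= weight_c2 else c2
```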
Restricted settings (cont.)
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification with Shared Inputs, IJCAI 2009
• Agents with similar interests (shared inputs): there is a randomized SP 3-approximation mechanism (works for any concept class C)
But not everything shines
Without restrictions on the input, we cannot guarantee a constant approximation ratio.
Our main result:
Theorem: There is a concept class C for which no deterministic SP mechanism achieves an o(m) approximation ratio.
Deterministic lower bound
Proof idea:
• First, construct a classification problem that is equivalent to a voting problem with 3 candidates
• Then, use the Gibbard-Satterthwaite theorem to prove that there must be a dictator
• Finally, the dictator's opinion might be very far from the optimal classification
Proof (1)
We do not set the labels. Instead, we denote by Y all the possible labelings of an agent's dataset.
Construction: X = {a, b}, and three classifiers c_a, c_b, c_ab (shown in a figure not reproduced here). The dataset contains two types of agents, with samples distributed unevenly over a and b.
Proof (2)
Let P be the set of all 6 orders over C.
A voting rule is a function of the form f : P^n → C.
But our mechanism is a function M : Y^n → C (its input is labels, not orders).
Lemma 1: there is a valid mapping g : P^n → Y^n, s.t. M∘g is a voting rule.
Proof (3)
Lemma 2: If M is SP and guarantees any bounded approximation ratio, then f = M∘g is dictatorial.
Proof:
• (f is onto) Any profile that some c ∈ C classifies perfectly must induce the selection of c
• (f is SP) Suppose there is a manipulation of f; by mapping the profile to labels with g, we obtain a manipulation of M, in contradiction to its SP
• By the G-S theorem, f must be dictatorial
Proof (4)
Finally, f (and thus M) can only be dictatorial. Assume w.l.o.g. that the dictator is agent 1, of type I_a. We now label the data points as follows:
• The optimal classifier is c_ab, which makes only 2 errors
• The dictator selects c_a, which makes m/2 errors
Real concept classes
• We managed to show that there are no good (deterministic) SP mechanisms, but only for a synthetically constructed class
• We are interested in more common classes that are actually used in machine learning, for example:
• Linear classifiers
• Boolean conjunctions
Linear classifiers
[figure: an analogous construction with linear classifiers c_a, c_b, c_ab over point regions "a" and "b"; the best classifier makes only 2 errors, while the dictator's classifier makes Ω(√m) errors]
A lower bound for randomized SP mechanisms
• A lottery over dictatorships is still bad: Ω(k) instead of Ω(m), where k is the size of the largest dataset controlled by a single agent (m ≈ k∙n)
• However, it is not clear how to rule out other mechanisms:
• G-S applies only to deterministic mechanisms
• Another theorem by Gibbard ['79] can help, but only under additional assumptions
Upper bounds
• Our lower bounds do not leave much hope for good SP mechanisms
• We would still like to know whether they are tight
• A deterministic SP O(m)-approximation is easy: break ties iteratively according to dictators
• What about randomized SP O(k) mechanisms?
The iterative random dictator (IRD)
(example with linear classifiers on R^1)
[figure: five iterations of the IRD on a line of labeled points, making 2, 5, 0, 0 and 1 errors respectively]
Theorem: The IRD is O(k^2)-approximating for linear classifiers on R^1
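The slides do not spell out the IRD's steps, so the sketch below only shows the simplest non-iterative baseline it builds on: a single random dictator for threshold ("linear on R^1") classifiers. That baseline is strategy-proof, since an agent's report matters only if that agent is drawn, in which case truthful reporting yields its own best classifier; the iterative refinement behind the O(k^2) bound is not reproduced here, and the function names and threshold grid are assumptions.

```python
import random

def best_threshold(points, labels, candidates):
    """ERM over threshold classifiers x -> +1 iff x >= t, for t in candidates."""
    def classifier(t):
        return lambda x, t=t: +1 if x >= t else -1
    best_t = min(candidates, key=lambda t: risk(classifier(t), points, labels))
    return classifier(best_t)

def random_dictator(agents, candidates, rng=random):
    """Pick one agent uniformly at random; return the ERM on its reported labels."""
    dictator = rng.choice(agents)
    return best_threshold(dictator.points, dictator.reported_labels, candidates)
```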
Future work
• Other concept classes
• Other loss functions
• Alternative assumptions on the structure of the data
• Other models of strategic behavior
• …