Explore privacy-preserving prediction models for more accurate results. Learn differential privacy techniques to protect sensitive data while optimizing learning algorithms. Discover the trade-offs in linear regression and classifier models to ensure data security and accuracy.
Privacy-preserving Prediction
Vitaly Feldman (Google Brain), with Cynthia Dwork
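Privacy-preserving learning
• Input: dataset $S = ((x_1, y_1), \ldots, (x_n, y_n))$
• Goal: given $x$, predict $y$
• Standard approach: a differentially private learning algorithm $A$ takes $S$ and outputs a model $h$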
Trade-offs
• Linear regression: with $\epsilon$-DP, needs a multiplicative factor more data [Bassily, Smith, Thakurta 14]
• Learning a linear classifier: needs a multiplicative factor more data [Feldman, Xiao 13]
• MNIST: reduced accuracy with small $\epsilon$ vs. 99.8% without privacy [AbadiCGMMTZ 16]
Prediction
• Users need predictions, not models
• Fits many existing systems
• Setup: users send queries to a prediction API, and the DP requirement applies to the predictions it returns
Attacks
• Black-box membership inference with high accuracy [Shokri, Stronati, Song, Shmatikov 17; LongBWBWTGC 18; SalemZFHB 18]
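Learning with DP prediction
• Accuracy-privacy trade-off
• Single prediction query
• Differentially private prediction: $M : (X \times Y)^n \times X \to Y$ is an $\epsilon$-DP prediction algorithm if for every $x \in X$, the output $M(S, x)$ is $\epsilon$-differentially private with respect to $S$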
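Label aggregation
• [HCB 16; PAEGT 17; PSMRTE 18; BTT 18]
• Split the dataset $S$ into disjoint subsets $S_1, \ldots, S_k$ and run a (non-DP) learning algorithm on each to get models $h_1, \ldots, h_k$
• Combine the predictions $h_1(x), \ldots, h_k(x)$ with a differentially private aggregation, e.g. the exponential mechanism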
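
The aggregation pipeline on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `dp_predict`, `exponential_mechanism`, and `majority_learner`, the round-robin split into `k` chunks, and the replace-one-example notion of neighboring datasets are all assumptions made for the example. Replacing one example changes only one chunk and hence at most one vote, so each vote count has sensitivity 1, and sampling a label with probability proportional to exp(ε · count / 2) is ε-DP for a single query, provided the learner is deterministic or uses independent randomness per chunk.

```python
import math
import random
from collections import Counter


def exponential_mechanism(counts, labels, epsilon, rng):
    """Sample a label with probability proportional to exp(epsilon * count / 2).

    Assumes each count has sensitivity 1 (one changed example moves one vote).
    """
    top = max(counts.get(label, 0) for label in labels)
    # Subtracting the top count only rescales the weights and avoids overflow.
    weights = [math.exp(epsilon * (counts.get(label, 0) - top) / 2.0)
               for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def dp_predict(dataset, x, learner, labels, k, epsilon, rng=None):
    """Answer a single prediction query at x by DP aggregation of k models."""
    if not 1 <= k <= len(dataset):
        raise ValueError("k must be between 1 and the dataset size")
    rng = rng or random.Random()
    # Round-robin split: example i always lands in chunk i % k,
    # so replacing one example affects exactly one chunk.
    chunks = [dataset[i::k] for i in range(k)]
    # Train a non-private model on each chunk and collect its vote at x.
    votes = Counter(learner(chunk)(x) for chunk in chunks)
    return exponential_mechanism(votes, labels, epsilon, rng)


def majority_learner(chunk):
    """Hypothetical non-DP learner: always predicts the chunk's majority label."""
    majority = Counter(y for _, y in chunk).most_common(1)[0][0]
    return lambda x: majority


if __name__ == "__main__":
    data = [((i,), 1 if i % 4 else 0) for i in range(200)]
    print(dp_predict(data, (3,), majority_learner, [0, 1], k=20, epsilon=1.0))
```

A larger `k` gives the vote margin more room to dominate the sampling noise, but leaves each model with fewer examples, which is one concrete form of the accuracy-privacy trade-off.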
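Classification via aggregation
• PAC model: let $C$ be a class of functions over $X$. For all distributions $P$ over $X \times \{0,1\}$, output $h$ such that w.h.p.:
• Realizable case (some $f \in C$ has zero error): $\mathrm{err}_P(h) \le \alpha$
• Agnostic case: $\mathrm{err}_P(h) \le \min_{f \in C} \mathrm{err}_P(f) + \alpha$
• Representation dimension [Beimel, Nissim, Stemmer 13]
• [KLNRS 08]
• For many classes [F., Xiao 13]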
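Prediction stability
• À la [Bousquet, Elisseeff 02]: $A : (X \times Y)^n \times X \to \mathbb{R}$ is a uniformly $\gamma$-stable algorithm if for every pair of neighboring datasets $S, S'$ and every $x$, $|A(S, x) - A(S', x)| \le \gamma$
• Convex regression: given a family of predictors $f(w, x)$ and a loss $\ell$, for a distribution $P$ over $X \times Y$, minimize $\mathbb{E}_{(x,y) \sim P}[\ell(f(w, x), y)]$ over a convex set of parameters $w$, where $\ell(f(w, x), y)$ is convex in $w$ for all $x, y$
• Setting analyzed: convex Lipschitz regression over a ball of bounded radius, measured by excess loss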
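
To illustrate why stability matters for private prediction, the sketch below uses one standard route, the Laplace mechanism, which is not necessarily the construction analyzed in the talk. It assumes that uniform γ-stability means predictions on neighboring datasets differ by at most γ at every point, so that adding Laplace noise of scale γ/ε to a single prediction makes it ε-DP. The names `dp_predict_stable`, `laplace_noise`, and `clipped_mean_predictor` are hypothetical, and the toy predictor (the mean of labels clipped to [0, 1], which is 1/n-stable under replacement of one example) stands in for a real stable learner.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def clipped_mean_predictor(dataset, x):
    """Toy stable predictor: ignores x and returns the mean clipped label.

    Replacing one example changes the output by at most 1/n, so the
    predictor is uniformly gamma-stable with gamma = 1/n.
    """
    labels = [min(1.0, max(0.0, y)) for _, y in dataset]
    return sum(labels) / len(labels)


def dp_predict_stable(dataset, x, predictor, gamma, epsilon, rng=None):
    """Answer one query at x: stable prediction plus Laplace(gamma/epsilon) noise."""
    rng = rng or random.Random()
    return predictor(dataset, x) + laplace_noise(gamma / epsilon, rng)


if __name__ == "__main__":
    data = [((i,), 1.0 if i < 30 else 0.0) for i in range(100)]
    gamma = 1.0 / len(data)
    print(dp_predict_stable(data, (0,), clipped_mean_predictor, gamma, epsilon=1.0))
```

The noise scale shrinks with γ, so the more stable the learner, the less accuracy is lost for a given ε.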
Beyond aggregation
• Threshold functions on a line
• Excess error for agnostic learning
• DP prediction implies generalization
Conclusions
• Natural setting for learning with privacy
• Better accuracy-privacy trade-off
• Paper (COLT 2018): https://arxiv.org/abs/1803.10266
• Open problems:
  • General agnostic learning
  • Other general approaches
  • Handling of multiple queries [BTT 18]