This work explores uniform stability in stochastic convex optimization and the generalization bounds it yields for uniformly stable algorithms. It defines uniform stability with respect to a loss function, a learning algorithm, and a dataset, and draws out the implications for differentially private prediction algorithms. It examines the relationship between gradient descent and uniformly stable optimization methods, contrasting their optimization error with their generalization error. It then introduces a new technique for bounding the generalization error of uniformly stable algorithms and discusses open questions, including high-probability generalization without strong uniformity.
Generalization bounds for uniformly stable algorithms
Vitaly Feldman (Google Brain), with Jan Vondrak
Uniform stability
• Domain $Z$ (e.g. $Z = X \times Y$)
• Dataset $S = (z_1, \dots, z_n) \in Z^n$
• Learning algorithm $A \colon Z^n \to W$
• Loss function $\ell \colon W \times Z \to \mathbb{R}$
• Uniform stability [Bousquet, Elisseeff 02]: $A$ has uniform stability $\gamma$ w.r.t. $\ell$ if for all neighboring $S, S' \in Z^n$ (differing in a single element) and all $z \in Z$: $|\ell(A(S), z) - \ell(A(S'), z)| \le \gamma$
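To make the definition concrete, here is a minimal Python sketch (not from the slides) that empirically lower-bounds the stability parameter; `train` and `loss` are hypothetical stand-ins for the algorithm $A$ and the loss $\ell$:

```python
def stability_gap(train, loss, S, z_pool, replacement_pool):
    """Empirical lower bound on the uniform stability of `train` w.r.t. `loss`:
    max over neighboring datasets S, S' (one point replaced) and points z of
    |loss(train(S), z) - loss(train(S'), z)|."""
    w = train(S)
    gap = 0.0
    for i in range(len(S)):
        for z_new in replacement_pool:
            S_prime = list(S)
            S_prime[i] = z_new          # neighboring dataset: one point swapped
            w_prime = train(S_prime)
            for z in z_pool:
                gap = max(gap, abs(loss(w, z) - loss(w_prime, z)))
    return gap
```

Note that this only certifies a lower bound: uniform stability quantifies over all datasets and all points $z$, not just the pools supplied here.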
Generalization error
• Probability distribution $P$ over $Z$, dataset $S \sim P^n$
• Population loss: $L_P(A(S)) = \mathbb{E}_{z \sim P}[\ell(A(S), z)]$
• Empirical loss: $L_S(A(S)) = \frac{1}{n} \sum_{i=1}^n \ell(A(S), z_i)$
• Generalization error/gap: $\Delta_S(A) = L_P(A(S)) - L_S(A(S))$
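The gap can be estimated by Monte Carlo, reusing the hypothetical `train` and `loss` from above plus a hypothetical `sample(k, rng)` routine drawing $k$ i.i.d. points from $P$:

```python
import numpy as np

def generalization_gap(train, loss, sample, n, n_test=100_000, rng=None):
    """Monte Carlo estimate of Delta_S = L_P(A(S)) - L_S(A(S))."""
    rng = rng or np.random.default_rng(0)
    S = sample(n, rng)                     # training set S ~ P^n
    w = train(S)
    empirical = np.mean([loss(w, z) for z in S])
    fresh = sample(n_test, rng)            # fresh points approximate P
    population = np.mean([loss(w, z) for z in fresh])
    return population - empirical
```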
Stochastic convex optimization
• For all $z \in Z$, $\ell(w, z)$ is convex and $1$-Lipschitz in $w$ over the unit ball $W = \mathcal{B}^d \subset \mathbb{R}^d$
• Minimize $L_P(w) = \mathbb{E}_{z \sim P}[\ell(w, z)]$ over $W$
• For all $P$, $w^{\mathrm{SGD}}$ being one-pass SGD with rate $\eta = 1/\sqrt{n}$: $\mathbb{E}[L_P(w^{\mathrm{SGD}})] \le \min_{w \in W} L_P(w) + O(1/\sqrt{n})$
• Uniform convergence error: $\tilde{\Theta}\big(\sqrt{d/n}\big)$
• ERM might have generalization error that grows with the dimension $d$
• [Shalev-Shwartz, Shamir, Srebro, Sridharan '09; Feldman 16]
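A sketch of the one-pass projected SGD referenced above, with rate $\eta = 1/\sqrt{n}$ and averaging of the iterates (a standard variant; the subgradient oracle `grad(w, z)` is a hypothetical stand-in):

```python
import numpy as np

def sgd(grad, S, d, eta=None, rng=None):
    """One pass of projected SGD over S for min_w E_z[loss(w, z)] on the unit
    ball, with fixed rate eta = 1/sqrt(n); returns the average iterate, which
    achieves the O(1/sqrt(n)) excess-loss rate for convex 1-Lipschitz losses."""
    n = len(S)
    eta = eta if eta is not None else 1.0 / np.sqrt(n)
    w = np.zeros(d)
    iterates = []
    for i in (rng.permutation(n) if rng is not None else range(n)):
        g = grad(w, S[i])                 # subgradient at the current iterate
        w = w - eta * g
        norm = np.linalg.norm(w)
        if norm > 1.0:                    # project back onto the unit ball
            w = w / norm
        iterates.append(w.copy())
    return np.mean(iterates, axis=0)
```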
Stable optimization
Strongly convex ERM [BE 02, SSSS 09]: $A(S) = \operatorname{argmin}_{w \in W} \frac{1}{n} \sum_{i=1}^n \ell(w, z_i) + \frac{\lambda}{2} \|w\|^2$ has uniform stability $O\big(\frac{1}{\lambda n}\big)$ and minimizes the empirical loss within $O(\lambda)$
Gradient descent on smooth losses [Hardt, Recht, Singer 16]: $T$ steps of GD on $\frac{1}{n} \sum_{i=1}^n \ell(w, z_i)$ with rate $\eta$ has uniform stability $O\big(\frac{\eta T}{n}\big)$ and minimizes the empirical loss within $O\big(\frac{1}{\eta T}\big)$
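A sketch of the strongly convex variant: full-batch gradient descent on the $\lambda$-regularized empirical loss (same hypothetical `grad` oracle; the stability and optimization constants above are only orders of magnitude):

```python
import numpy as np

def regularized_erm_gd(grad, S, d, lam, T, eta):
    """Full-batch GD on (1/n) * sum_i loss(w, z_i) + (lam/2) * ||w||^2.
    The lam-strongly-convex regularizer buys stability roughly O(1/(lam * n))
    at the price of O(lam) extra empirical loss."""
    n = len(S)
    w = np.zeros(d)
    for _ in range(T):
        g = sum(grad(w, z) for z in S) / n + lam * w
        w = w - eta * g
    return w
```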
Generalization bounds
For $A$ with loss range $[0,1]$ and uniform stability $\gamma$, with probability at least $1 - \delta$:
• [Rogers, Wagner 78]: $\Delta_S(A) \le O\big(\sqrt{(\gamma + 1/n)/\delta}\big)$
• [Bousquet, Elisseeff 02]: $\Delta_S(A) \le O\big((\gamma \sqrt{n} + 1/\sqrt{n}) \sqrt{\log(1/\delta)}\big)$; vacuous when $\gamma \ge 1/\sqrt{n}$
• NEW: $\Delta_S(A) \le O\big(\sqrt{(\gamma + 1/n) \log(1/\delta)}\big)$
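Plugging in numbers illustrates the difference (constants are suppressed, so this is only indicative): for $\gamma = 1/\sqrt{n}$, the stability level of typical SGD-style methods, the [BE 02] bound exceeds the loss range while the new bound is about $n^{-1/4}$:

```python
import numpy as np

def be02_bound(gamma, n, delta):
    """High-probability bound of [Bousquet, Elisseeff 02], up to constants."""
    return (gamma * np.sqrt(n) + 1.0 / np.sqrt(n)) * np.sqrt(np.log(1.0 / delta))

def new_bound(gamma, n, delta):
    """Bound of this work, up to constants."""
    return np.sqrt((gamma + 1.0 / n) * np.log(1.0 / delta))

n, delta = 10_000, 0.01
gamma = 1.0 / np.sqrt(n)
print(be02_bound(gamma, n, delta))   # ~ 2.2: vacuous, loss range is [0, 1]
print(new_bound(gamma, n, delta))    # ~ 0.22
```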
Comparison
[Plot comparing the generalization-error bounds of [BE 02] and this work as a function of the stability parameter $\gamma$.]
Second moment
• Previously [Devroye, Wagner 79; BE 02]: $\mathbb{E}_S[\Delta_S(A)^2] \le O\big(\gamma + \frac{1}{n}\big)$
• NEW: $\mathbb{E}_S[\Delta_S(A)^2] \le O\big(\gamma^2 + \frac{1}{n}\big)$, and this is TIGHT!
• Chebyshev: with probability at least $1 - \delta$, $\Delta_S(A) \le O\big(\big(\gamma + \frac{1}{\sqrt{n}}\big)/\sqrt{\delta}\big)$
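The second moment can also be estimated directly by redrawing $S$, reusing the hypothetical primitives from the earlier sketches:

```python
import numpy as np

def second_moment_gap(train, loss, sample, n, trials=200, n_test=10_000, rng=None):
    """Monte Carlo estimate of E_S[Delta_S^2], for comparison against the
    O(gamma^2 + 1/n) bound."""
    rng = rng or np.random.default_rng(0)
    gaps = []
    for _ in range(trials):
        S = sample(n, rng)                 # redraw the training set S ~ P^n
        w = train(S)
        emp = np.mean([loss(w, z) for z in S])
        pop = np.mean([loss(w, z) for z in sample(n_test, rng)])
        gaps.append(pop - emp)
    return np.mean(np.square(gaps))
```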
Implications
For any loss $\ell$ with range $[0,1]$, an $\epsilon$-differentially private prediction algorithm has uniform stability $e^\epsilon - 1$. This yields stronger generalization bounds for DP prediction algorithms.
• Differentially private prediction [Dwork, Feldman 18]: a randomized algorithm $M \colon Z^n \times X \to Y$ has $\epsilon$-DP prediction if for all $x \in X$, all neighboring $S, S' \in Z^n$, and every set of labels $Y_0 \subseteq Y$: $\Pr[M(S, x) \in Y_0] \le e^{\epsilon} \Pr[M(S', x) \in Y_0]$
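A minimal example of an $\epsilon$-DP prediction algorithm, not from the slides: randomized response applied to the output of an arbitrary (non-private, hypothetical) binary `predictor`. On any $x$, the output distribution changes by a factor of at most $e^\epsilon$ across neighboring datasets:

```python
import numpy as np

def dp_predict(predictor, S, x, eps, rng=None):
    """eps-DP prediction via randomized response: whatever predictor(S, x)
    returns in {0, 1}, the output probabilities are always
    (e^eps/(1+e^eps), 1/(1+e^eps)) in some order, so their ratio across
    neighboring datasets is at most e^eps. By the implication above, this is
    (e^eps - 1)-uniformly stable for any loss with range [0, 1]."""
    rng = rng or np.random.default_rng()
    y = predictor(S, x)
    keep_prob = np.exp(eps) / (1.0 + np.exp(eps))  # flip w.p. 1/(1+e^eps)
    return y if rng.random() < keep_prob else 1 - y
```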
Data-dependent functions
• Consider $f \colon Z^n \times Z \to [0,1]$, e.g. $f(S, z) = \ell(A(S), z)$
• $f$ has uniform stability $\gamma$ if for all neighboring $S, S' \in Z^n$ and all $z \in Z$: $|f(S, z) - f(S', z)| \le \gamma$
• Generalization error/gap: $\Delta_S(f) = \mathbb{E}_{z \sim P}[f(S, z)] - \frac{1}{n} \sum_{i=1}^n f(S, z_i)$
Generalization in expectation
Goal: For all $P$ and all $\gamma$-uniformly stable $f$: $\big|\mathbb{E}_{S \sim P^n}[\Delta_S(f)]\big| \le \gamma$
Concentration via McDiarmid
For all neighboring $S, S'$:
1. $|\Delta_S(f) - \Delta_{S'}(f)| \le 2\gamma + \frac{1}{n}$
2. $\mathbb{E}_{S \sim P^n}[\Delta_S(f)] \le \gamma$
McDiarmid: $\Pr_S\big[\Delta_S(f) \ge \mathbb{E}[\Delta_S(f)] + t\big] \le e^{-\frac{2t^2}{n(2\gamma + 1/n)^2}}$, which recovers the [BE 02] bound $O\big((\gamma \sqrt{n} + \frac{1}{\sqrt{n}}) \sqrt{\log(1/\delta)}\big)$
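The McDiarmid tail can be evaluated directly; the snippet below (illustrative, with the constants of the displayed bound) shows how it becomes vacuous once $\gamma$ reaches $1/\sqrt{n}$:

```python
import numpy as np

def mcdiarmid_tail(gamma, n, t):
    """P[Delta_S(f) >= E[Delta_S(f)] + t] <= exp(-2 t^2 / (n c^2)), where each
    of the n points changes Delta_S(f) by at most c = 2*gamma + 1/n."""
    c = 2.0 * gamma + 1.0 / n
    return np.exp(-2.0 * t**2 / (n * c**2))

# With gamma = 1/sqrt(n), the per-point sensitivity c ~ 2/sqrt(n), so
# n * c^2 ~ 4 and the tail no longer shrinks with n:
print(mcdiarmid_tail(gamma=1e-4, n=10_000, t=0.05))  # ~ 0.004: meaningful
print(mcdiarmid_tail(gamma=1e-2, n=10_000, t=0.05))  # ~ 1.0: vacuous
```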
Proof technique
Based on [Nissim, Stemmer 15; BNSSSU 16]
• Let $\bar{S} = (S^1, \dots, S^m)$ for $m$ independent datasets $S^j \sim P^n$
• Need to bound $\mathbb{E}\big[\max_{j \in [m]} \Delta_{S^j}(f)\big]$
• Let $\mathcal{Q}$ be a distribution over $[m]$ concentrated on $\operatorname{argmax}_j \Delta_{S^j}(f)$: UNSTABLE!
Stable max
Exponential mechanism [McSherry, Talwar 07]: given scores $g_1, \dots, g_m$ with sensitivity $\sigma$, sample $j \in [m]$ with probability $\propto e^{\epsilon g_j / (2\sigma)}$
• Stable: $\epsilon$-differentially private
• Approximates max: $\mathbb{E}[g_j] \ge \max_{j \in [m]} g_j - O\big(\frac{\sigma \log m}{\epsilon}\big)$
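A standard implementation sketch of the exponential mechanism over $m$ scores (in the proof above, the scores would be the gaps $\Delta_{S^j}(f)$, whose per-point sensitivity is roughly $2\gamma + 1/n$):

```python
import numpy as np

def exponential_mechanism(scores, eps, sensitivity, rng=None):
    """Sample index j with probability proportional to
    exp(eps * scores[j] / (2 * sensitivity)). This is eps-DP, and the sampled
    score is within O(sensitivity * log(m) / eps) of the max in expectation."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    logits = eps * scores / (2.0 * sensitivity)
    logits -= logits.max()                 # stabilize the softmax numerically
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)
```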
Game over
Pick $m = \log(1/\delta)$; combining the stability of the exponential mechanism with generalization in expectation gives, with probability at least $1 - \delta$: $\Delta_S(f) \le O\big(\sqrt{(\gamma + 1/n) \log(1/\delta)}\big)$
Conclusions
• Better understanding of uniform stability
• New proof technique for high-probability generalization bounds
• Open problems:
• Gap between upper and lower bounds
• High-probability generalization without strong uniformity