This work explores uniform stability in stochastic convex optimization and the generalization bounds it yields for uniformly stable algorithms. It defines uniform stability with respect to a loss function, a learning algorithm, and a dataset, and draws out the implications for differentially private prediction algorithms. It examines the relationship between gradient descent and uniformly stable optimization methods, contrasting their optimization error with their generalization error. It then introduces a new technique for bounding the generalization error of uniformly stable algorithms and discusses open questions, including high-probability generalization without strong uniformity.
Generalization bounds for uniformly stable algorithms
Vitaly Feldman (Google Brain), with Jan Vondrak
Uniform stability
• Domain $Z$ (e.g. $Z = X \times Y$)
• Dataset $S = (z_1, \dots, z_n) \in Z^n$
• Learning algorithm $A \colon Z^n \to W$
• Loss function $\ell \colon W \times Z \to \mathbb{R}$
• Uniform stability [Bousquet, Elisseeff 02]: $A$ has uniform stability $\gamma$ w.r.t. $\ell$ if for all neighboring $S, S' \in Z^n$ (differing in a single element) and all $z \in Z$: $|\ell(A(S), z) - \ell(A(S'), z)| \le \gamma$
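To make the definition concrete, here is a minimal Python sketch (not from the slides) that empirically lower-bounds the stability parameter; `train` and `loss` are hypothetical stand-ins for the algorithm $A$ and the loss $\ell$:

```python
def stability_gap(train, loss, S, z_pool, replacement_pool):
    """Empirical lower bound on the uniform stability of `train` w.r.t. `loss`:
    max over neighboring datasets S, S' (one point replaced) and points z of
    |loss(train(S), z) - loss(train(S'), z)|."""
    w = train(S)
    gap = 0.0
    for i in range(len(S)):
        for z_new in replacement_pool:
            S_prime = list(S)
            S_prime[i] = z_new          # neighboring dataset: one point swapped
            w_prime = train(S_prime)
            for z in z_pool:
                gap = max(gap, abs(loss(w, z) - loss(w_prime, z)))
    return gap
```

Note that this only certifies a lower bound: uniform stability quantifies over all datasets and all points $z$, not just the pools supplied here.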
Generalization error
• Probability distribution $P$ over $Z$, dataset $S \sim P^n$
• Population loss: $L_P(A(S)) = \mathbb{E}_{z \sim P}[\ell(A(S), z)]$
• Empirical loss: $L_S(A(S)) = \frac{1}{n} \sum_{i=1}^n \ell(A(S), z_i)$
• Generalization error/gap: $\Delta_S(A) = L_P(A(S)) - L_S(A(S))$
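The gap can be estimated by Monte Carlo, reusing the hypothetical `train` and `loss` from above plus a hypothetical `sample(k, rng)` routine drawing $k$ i.i.d. points from $P$:

```python
import numpy as np

def generalization_gap(train, loss, sample, n, n_test=100_000, rng=None):
    """Monte Carlo estimate of Delta_S = L_P(A(S)) - L_S(A(S))."""
    rng = rng or np.random.default_rng(0)
    S = sample(n, rng)                     # training set S ~ P^n
    w = train(S)
    empirical = np.mean([loss(w, z) for z in S])
    fresh = sample(n_test, rng)            # fresh points approximate P
    population = np.mean([loss(w, z) for z in fresh])
    return population - empirical
```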
Stochastic convex optimization
• For all $z \in Z$, $\ell(w, z)$ is convex and $1$-Lipschitz in $w$ over the unit ball $W = \mathcal{B}^d \subset \mathbb{R}^d$
• Minimize $L_P(w) = \mathbb{E}_{z \sim P}[\ell(w, z)]$ over $W$
• For all $P$, $w^{\mathrm{SGD}}$ being one-pass SGD with rate $\eta = 1/\sqrt{n}$: $\mathbb{E}[L_P(w^{\mathrm{SGD}})] \le \min_{w \in W} L_P(w) + O(1/\sqrt{n})$
• Uniform convergence error: $\tilde{\Theta}\big(\sqrt{d/n}\big)$
• ERM might have generalization error that grows with the dimension $d$
• [Shalev-Shwartz, Shamir, Srebro, Sridharan '09; Feldman 16]
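A sketch of the one-pass projected SGD referenced above, with rate $\eta = 1/\sqrt{n}$ and averaging of the iterates (a standard variant; the subgradient oracle `grad(w, z)` is a hypothetical stand-in):

```python
import numpy as np

def sgd(grad, S, d, eta=None, rng=None):
    """One pass of projected SGD over S for min_w E_z[loss(w, z)] on the unit
    ball, with fixed rate eta = 1/sqrt(n); returns the average iterate, which
    achieves the O(1/sqrt(n)) excess-loss rate for convex 1-Lipschitz losses."""
    n = len(S)
    eta = eta if eta is not None else 1.0 / np.sqrt(n)
    w = np.zeros(d)
    iterates = []
    for i in (rng.permutation(n) if rng is not None else range(n)):
        g = grad(w, S[i])                 # subgradient at the current iterate
        w = w - eta * g
        norm = np.linalg.norm(w)
        if norm > 1.0:                    # project back onto the unit ball
            w = w / norm
        iterates.append(w.copy())
    return np.mean(iterates, axis=0)
```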
Stable optimization
Strongly convex ERM [BE 02, SSSS 09]: $A(S) = \operatorname{argmin}_{w \in W} \frac{1}{n} \sum_{i=1}^n \ell(w, z_i) + \frac{\lambda}{2} \|w\|^2$ has uniform stability $O\big(\frac{1}{\lambda n}\big)$ and minimizes the empirical loss within $O(\lambda)$
Gradient descent on smooth losses [Hardt, Recht, Singer 16]: $T$ steps of GD on $\frac{1}{n} \sum_{i=1}^n \ell(w, z_i)$ with rate $\eta$ has uniform stability $O\big(\frac{\eta T}{n}\big)$ and minimizes the empirical loss within $O\big(\frac{1}{\eta T}\big)$
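A sketch of the strongly convex variant: full-batch gradient descent on the $\lambda$-regularized empirical loss (same hypothetical `grad` oracle; the stability and optimization constants above are only orders of magnitude):

```python
import numpy as np

def regularized_erm_gd(grad, S, d, lam, T, eta):
    """Full-batch GD on (1/n) * sum_i loss(w, z_i) + (lam/2) * ||w||^2.
    The lam-strongly-convex regularizer buys stability roughly O(1/(lam * n))
    at the price of O(lam) extra empirical loss."""
    n = len(S)
    w = np.zeros(d)
    for _ in range(T):
        g = sum(grad(w, z) for z in S) / n + lam * w
        w = w - eta * g
    return w
```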
Generalization bounds
For $A$ with loss range $[0,1]$ and uniform stability $\gamma$, with probability at least $1 - \delta$:
• [Rogers, Wagner 78]: $\Delta_S(A) \le O\big(\sqrt{(\gamma + 1/n)/\delta}\big)$
• [Bousquet, Elisseeff 02]: $\Delta_S(A) \le O\big((\gamma \sqrt{n} + 1/\sqrt{n}) \sqrt{\log(1/\delta)}\big)$; vacuous when $\gamma \ge 1/\sqrt{n}$
• NEW: $\Delta_S(A) \le O\big(\sqrt{(\gamma + 1/n) \log(1/\delta)}\big)$
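Plugging in numbers illustrates the difference (constants are suppressed, so this is only indicative): for $\gamma = 1/\sqrt{n}$, the stability level of typical SGD-style methods, the [BE 02] bound exceeds the loss range while the new bound is about $n^{-1/4}$:

```python
import numpy as np

def be02_bound(gamma, n, delta):
    """High-probability bound of [Bousquet, Elisseeff 02], up to constants."""
    return (gamma * np.sqrt(n) + 1.0 / np.sqrt(n)) * np.sqrt(np.log(1.0 / delta))

def new_bound(gamma, n, delta):
    """Bound of this work, up to constants."""
    return np.sqrt((gamma + 1.0 / n) * np.log(1.0 / delta))

n, delta = 10_000, 0.01
gamma = 1.0 / np.sqrt(n)
print(be02_bound(gamma, n, delta))   # ~ 2.2: vacuous, loss range is [0, 1]
print(new_bound(gamma, n, delta))    # ~ 0.22
```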
Comparison
[Plot comparing the generalization-error bounds of [BE 02] and this work as a function of the stability parameter $\gamma$.]
Second moment
• Previously [Devroye, Wagner 79; BE 02]: $\mathbb{E}_S[\Delta_S(A)^2] \le O\big(\gamma + \frac{1}{n}\big)$
• NEW: $\mathbb{E}_S[\Delta_S(A)^2] \le O\big(\gamma^2 + \frac{1}{n}\big)$, and this is TIGHT!
• Chebyshev: with probability at least $1 - \delta$, $\Delta_S(A) \le O\big(\big(\gamma + \frac{1}{\sqrt{n}}\big)/\sqrt{\delta}\big)$
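The second moment can also be estimated directly by redrawing $S$, reusing the hypothetical primitives from the earlier sketches:

```python
import numpy as np

def second_moment_gap(train, loss, sample, n, trials=200, n_test=10_000, rng=None):
    """Monte Carlo estimate of E_S[Delta_S^2], for comparison against the
    O(gamma^2 + 1/n) bound."""
    rng = rng or np.random.default_rng(0)
    gaps = []
    for _ in range(trials):
        S = sample(n, rng)                 # redraw the training set S ~ P^n
        w = train(S)
        emp = np.mean([loss(w, z) for z in S])
        pop = np.mean([loss(w, z) for z in sample(n_test, rng)])
        gaps.append(pop - emp)
    return np.mean(np.square(gaps))
```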
Implications
For any loss $\ell$ with range $[0,1]$, an $\epsilon$-differentially private prediction algorithm has uniform stability $e^\epsilon - 1$. This yields stronger generalization bounds for DP prediction algorithms.
• Differentially private prediction [Dwork, Feldman 18]: a randomized algorithm $M \colon Z^n \times X \to Y$ has $\epsilon$-DP prediction if for all $x \in X$, all neighboring $S, S' \in Z^n$, and every set of labels $Y_0 \subseteq Y$: $\Pr[M(S, x) \in Y_0] \le e^{\epsilon} \Pr[M(S', x) \in Y_0]$
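A minimal example of an $\epsilon$-DP prediction algorithm, not from the slides: randomized response applied to the output of an arbitrary (non-private, hypothetical) binary `predictor`. On any $x$, the output distribution changes by a factor of at most $e^\epsilon$ across neighboring datasets:

```python
import numpy as np

def dp_predict(predictor, S, x, eps, rng=None):
    """eps-DP prediction via randomized response: whatever predictor(S, x)
    returns in {0, 1}, the output probabilities are always
    (e^eps/(1+e^eps), 1/(1+e^eps)) in some order, so their ratio across
    neighboring datasets is at most e^eps. By the implication above, this is
    (e^eps - 1)-uniformly stable for any loss with range [0, 1]."""
    rng = rng or np.random.default_rng()
    y = predictor(S, x)
    keep_prob = np.exp(eps) / (1.0 + np.exp(eps))  # flip w.p. 1/(1+e^eps)
    return y if rng.random() < keep_prob else 1 - y
```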
Data-dependent functions
• Consider $f \colon Z^n \times Z \to [0,1]$, e.g. $f(S, z) = \ell(A(S), z)$
• $f$ has uniform stability $\gamma$ if for all neighboring $S, S' \in Z^n$ and all $z \in Z$: $|f(S, z) - f(S', z)| \le \gamma$
• Generalization error/gap: $\Delta_S(f) = \mathbb{E}_{z \sim P}[f(S, z)] - \frac{1}{n} \sum_{i=1}^n f(S, z_i)$
Generalization in expectation
Goal: For all $P$ and all $\gamma$-uniformly stable $f$: $\big|\mathbb{E}_{S \sim P^n}[\Delta_S(f)]\big| \le \gamma$
Concentration via McDiarmid
For all neighboring $S, S'$:
1. $|\Delta_S(f) - \Delta_{S'}(f)| \le 2\gamma + \frac{1}{n}$
2. $\mathbb{E}_{S \sim P^n}[\Delta_S(f)] \le \gamma$
McDiarmid: $\Pr_S\big[\Delta_S(f) \ge \mathbb{E}[\Delta_S(f)] + t\big] \le e^{-\frac{2t^2}{n(2\gamma + 1/n)^2}}$, which recovers the [BE 02] bound $O\big((\gamma \sqrt{n} + \frac{1}{\sqrt{n}}) \sqrt{\log(1/\delta)}\big)$
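The McDiarmid tail can be evaluated directly; the snippet below (illustrative, with the constants of the displayed bound) shows how it becomes vacuous once $\gamma$ reaches $1/\sqrt{n}$:

```python
import numpy as np

def mcdiarmid_tail(gamma, n, t):
    """P[Delta_S(f) >= E[Delta_S(f)] + t] <= exp(-2 t^2 / (n c^2)), where each
    of the n points changes Delta_S(f) by at most c = 2*gamma + 1/n."""
    c = 2.0 * gamma + 1.0 / n
    return np.exp(-2.0 * t**2 / (n * c**2))

# With gamma = 1/sqrt(n), the per-point sensitivity c ~ 2/sqrt(n), so
# n * c^2 ~ 4 and the tail no longer shrinks with n:
print(mcdiarmid_tail(gamma=1e-4, n=10_000, t=0.05))  # ~ 0.004: meaningful
print(mcdiarmid_tail(gamma=1e-2, n=10_000, t=0.05))  # ~ 1.0: vacuous
```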
Proof technique
Based on [Nissim, Stemmer 15; BNSSSU 16]
• Let $\bar{S} = (S^1, \dots, S^m)$ for $m$ independent datasets $S^j \sim P^n$
• Need to bound $\mathbb{E}\big[\max_{j \in [m]} \Delta_{S^j}(f)\big]$
• Let $\mathcal{Q}$ be a distribution over $[m]$ concentrated on $\operatorname{argmax}_j \Delta_{S^j}(f)$: UNSTABLE!
Stable max
Exponential mechanism [McSherry, Talwar 07]: given scores $g_1, \dots, g_m$ with sensitivity $\sigma$, sample $j \in [m]$ with probability $\propto e^{\epsilon g_j / (2\sigma)}$
• Stable: $\epsilon$-differentially private
• Approximates max: $\mathbb{E}[g_j] \ge \max_{j \in [m]} g_j - O\big(\frac{\sigma \log m}{\epsilon}\big)$
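A standard implementation sketch of the exponential mechanism over $m$ scores (in the proof above, the scores would be the gaps $\Delta_{S^j}(f)$, whose per-point sensitivity is roughly $2\gamma + 1/n$):

```python
import numpy as np

def exponential_mechanism(scores, eps, sensitivity, rng=None):
    """Sample index j with probability proportional to
    exp(eps * scores[j] / (2 * sensitivity)). This is eps-DP, and the sampled
    score is within O(sensitivity * log(m) / eps) of the max in expectation."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    logits = eps * scores / (2.0 * sensitivity)
    logits -= logits.max()                 # stabilize the softmax numerically
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)
```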
Game over
Pick $m = \log(1/\delta)$; combining the stability of the exponential mechanism with generalization in expectation gives, with probability at least $1 - \delta$: $\Delta_S(f) \le O\big(\sqrt{(\gamma + 1/n) \log(1/\delta)}\big)$
Conclusions
• Better understanding of uniform stability
• New proof technique for high-probability generalization bounds
• Open problems:
• Gap between upper and lower bounds
• High-probability generalization without strong uniformity