CMS 165 Lecture 8


Presentation Transcript


  1. CMS 165 Lecture 8: Approximation and Generalization in Neural Networks

  2. Recall from previous lecture:
  - Hypothesis class: $\mathcal{H}$, a set of functions $h : \mathcal{X} \to \mathcal{Y}$
  - A loss function: $\ell(h(x), y)$
  - Expected risk: $R(h) = \mathbb{E}_{(x,y) \sim \mathcal{D}}\big[\ell(h(x), y)\big]$
  - Expected risk minimizer: $h^{*} = \arg\min_{h \in \mathcal{H}} R(h)$
  - Given a set of samples: $S = \{(x_i, y_i)\}_{i=1}^{n}$ drawn i.i.d. from $\mathcal{D}$
  - Empirical risk: $\hat{R}(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i), y_i)$
  - Empirical risk minimizer: $\hat{h} = \arg\min_{h \in \mathcal{H}} \hat{R}(h)$
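These definitions can be made concrete with a small sketch. The toy data, squared loss, and grid-search hypothesis class below are illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise (assumed for illustration)
x = rng.uniform(-1, 1, 50)
y = 2 * x + 0.1 * rng.normal(size=50)

def empirical_risk(w, x, y):
    """Empirical risk R_hat(h) = (1/n) sum of squared losses for h_w(x) = w*x."""
    return np.mean((w * x - y) ** 2)

# Hypothesis class: linear predictors h_w(x) = w*x over a grid of weights
weights = np.linspace(-5, 5, 1001)
risks = [empirical_risk(w, x, y) for w in weights]
w_hat = weights[int(np.argmin(risks))]  # empirical risk minimizer
print(w_hat)  # close to the true slope 2
```

The empirical risk minimizer recovers a weight near the true slope, since the empirical risk concentrates around the expected risk as $n$ grows.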

  3. Measures of Complexity
  - Generalization bound: with probability at least $1 - \delta$, for all $h \in \mathcal{H}$, $R(h) \le \hat{R}(h) + O\big(\sqrt{(d \log n + \log(1/\delta))/n}\big)$, where $d$ is the VC-dimension of $\mathcal{H}$
  - VC-Dimension: the largest number of points that $\mathcal{H}$ can shatter
  - Linear class in $\mathbb{R}^{d}$: VC-dimension $d + 1$
  - Bounded linear class: complexity controlled by norm bounds rather than dimension
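To see how the VC bound behaves, the sketch below evaluates the simplified gap term numerically. The constants are omitted (real statements of the bound carry extra factors), and the chosen $d$ and $\delta$ are assumptions for illustration:

```python
import math

def vc_bound(d, n, delta):
    """Simplified VC generalization-gap term:
    sqrt((d * log n + log(1/delta)) / n).
    (Constants omitted; rigorous statements carry extra factors.)"""
    return math.sqrt((d * math.log(n) + math.log(1 / delta)) / n)

d, delta = 11, 0.05  # e.g. a linear class in R^10 has VC-dimension 10 + 1 = 11
gaps = [vc_bound(d, n, delta) for n in (100, 1000, 10000, 100000)]
print([round(g, 3) for g in gaps])  # gap shrinks roughly like 1/sqrt(n)
```

The gap decays roughly as $1/\sqrt{n}$ (up to the log factor), which is why more samples buy tighter generalization guarantees.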

  4. Rademacher complexity of NN (from the lecture notes of Percy Liang)
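The empirical Rademacher complexity of a class $\mathcal{F}$ on a sample $S$ is $\hat{\mathfrak{R}}_S(\mathcal{F}) = \mathbb{E}_{\sigma}\big[\sup_{f \in \mathcal{F}} \frac{1}{n}\sum_i \sigma_i f(x_i)\big]$ with i.i.d. sign variables $\sigma_i$. As a warm-up before neural networks, the sketch below estimates it by Monte Carlo for a norm-bounded linear class, where the supremum has a closed form (the data and norm bound are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, B = 200, 5, 1.0
X = rng.normal(size=(n, d))

# Empirical Rademacher complexity of F = {x -> <w, x> : ||w||_2 <= B}:
#   R_hat(F) = E_sigma [ sup_f (1/n) sum_i sigma_i f(x_i) ]
# For this class the supremum is attained in closed form:
#   sup_{||w|| <= B} (1/n) <w, sum_i sigma_i x_i> = (B/n) ||sum_i sigma_i x_i||_2
estimates = []
for _ in range(2000):
    sigma = rng.choice([-1.0, 1.0], size=n)  # Rademacher signs
    estimates.append(B / n * np.linalg.norm(sigma @ X))
rad = float(np.mean(estimates))
print(rad)  # at most B * sqrt(sum_i ||x_i||^2) / n, i.e. O(1/sqrt(n))
```

The estimate respects the classical bound $\hat{\mathfrak{R}}_S \le \frac{B}{n}\sqrt{\sum_i \|x_i\|^2}$; bounds of this flavor, applied layer by layer, give the norm-based Rademacher bounds for neural networks.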

  5. Decomposition of Errors: derivation for linear regression
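The slide's linear-regression derivation is not in the transcript; the generic version of the decomposition, using the definitions from slide 2, is the standard bound on the excess risk of the ERM:

```latex
R(\hat{h}) - R(h^{*})
  = \underbrace{\bigl[R(\hat{h}) - \hat{R}(\hat{h})\bigr]}_{\text{concentration}}
  + \underbrace{\bigl[\hat{R}(\hat{h}) - \hat{R}(h^{*})\bigr]}_{\le\, 0 \text{ since } \hat{h} \text{ minimizes } \hat{R}}
  + \underbrace{\bigl[\hat{R}(h^{*}) - R(h^{*})\bigr]}_{\text{concentration}}
  \le 2 \sup_{h \in \mathcal{H}} \bigl| R(h) - \hat{R}(h) \bigr|
```

Adding the approximation error $R(h^{*}) - R^{*}_{\text{Bayes}}$, which shrinks as $\mathcal{H}$ grows, gives the familiar approximation-estimation trade-off that the next slides revisit for neural networks.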

  6. Universality of NN: a single hidden layer with enough units can approximate any continuous function on a compact set (Cybenko 1989; Hornik 1991)
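A minimal sketch of the phenomenon (not the proof): fit the output weights of a one-hidden-layer ReLU network with randomly drawn hidden units to a smooth target. The target function, unit count, and random-feature setup are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function on a compact interval
x = np.linspace(-np.pi, np.pi, 400)
y = np.sin(x)

def relu_features(x, w, b):
    """One hidden layer of ReLU units: phi_j(x) = max(0, w_j * x + b_j)."""
    return np.maximum(0.0, np.outer(x, w) + b)

# Random hidden layer; only the output weights are fit (least squares).
m = 200  # number of hidden units
w = rng.normal(size=m)
b = rng.uniform(-np.pi, np.pi, size=m)
Phi = relu_features(x, w, b)
c, *_ = np.linalg.lstsq(Phi, y, rcond=None)

err = float(np.max(np.abs(Phi @ c - y)))  # sup-norm error on the grid
print(err)
```

With a few hundred units the piecewise-linear network tracks the target closely; the universality theorems guarantee that the error can be driven to zero as the width grows.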

  7. Approximation in Shallow NN. The universality proof is loose: it requires an exponential number of units. Can we get a better bound? A better basis? How does the bound improve for various classes of functions?

  8. Deep vs. Shallow Networks. What is the advantage of deep networks? Compositionality: a compositional function that a deep network represents compactly can require an exponential number of units in a shallow network.
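A classic illustration of this depth separation (an assumed example, not necessarily the one on the slide) is $d$-bit parity: a balanced tree of 2-input XORs computes it with $d - 1$ gates at depth $O(\log d)$, whereas a flat depth-2 AND/OR (DNF) representation needs $2^{d-1}$ terms:

```python
from itertools import product

def xor_tree(bits):
    """Compute parity by composing 2-input XOR gates in a balanced tree.
    Uses d - 1 gates at depth O(log d); by contrast, a flat DNF for
    d-bit parity needs 2^(d-1) terms."""
    layer = list(bits)
    gates = 0
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            nxt.append(layer[i] ^ layer[i + 1])
            gates += 1
        if len(layer) % 2:  # carry the odd element up a level
            nxt.append(layer[-1])
        layer = nxt
    return layer[0], gates

d = 8
for bits in product([0, 1], repeat=d):
    p, gates = xor_tree(bits)
    assert p == sum(bits) % 2  # tree agrees with direct parity
print(gates)  # d - 1 = 7 gates
```

The same intuition, with smooth compositional functions in place of Boolean circuits, underlies the depth-separation results for neural networks.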

  9. Classical NN theory

  10. Modern Neural Networks. From Belkin et al., “Reconciling modern machine-learning practice and the classical bias-variance trade-off”

  11. Seems to be true in practice (slides from Ben Recht)

  12. Is it really true? (slides from Ben Recht)

  13. Look closely at the data... (slides from Ben Recht)

  14. Solution? Better test sets (slides from Ben Recht)

  15. Accuracy on a harder test set (slides from Ben Recht)

  16. True even on ImageNet (slides from Ben Recht)

  17. Is this a good summary? (slides from Ben Recht)
