
Relevant Previous Algorithms

This presentation reviews linear discrimination, LMS, and gradient descent as background, then covers the features of multilayer networks, feedforward operation and the perceptron, expressive power (Fourier’s theorem, Kolmogorov’s theorem), and the back-propagation algorithm together with ways of improving it.





Presentation Transcript


  1. Relevant Previous Algorithms • Linear discrimination • LMS • Gradient descent
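As background for the algorithms listed on this slide, here is a minimal sketch of LMS-style gradient descent on a linear discriminant. The data, learning rate, and stopping rule are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def lms_train(X, t, eta=0.01, epochs=100):
    """Train a linear discriminant g(x) = w.x + b by LMS gradient descent.

    X : (n, d) array of training patterns
    t : (n,) array of target values (e.g. +1 / -1)
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, t):       # sample-by-sample (stochastic) updates
            y = w @ x + b                  # current linear output
            err = target - y               # LMS error for this pattern
            w += eta * err * x             # gradient step on the squared error
            b += eta * err
    return w, b

# Tiny usage example with made-up, linearly separable data (class given by x1)
X = np.array([[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]])
t = np.array([-1.0, -1.0, 1.0, 1.0])
w, b = lms_train(X, t)
print(w, b)
```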

  2. Features • Simple but powerful • Nonlinear functions • Multilayers • Heuristic

  3. Feedforward Operation • Example • Computing the output y • The perceptron
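To make the feedforward operation concrete, below is a minimal sketch of how a three-layer network (inputs, one hidden layer, outputs) computes its output; the weight shapes and the tanh nonlinearity are assumptions for illustration.

```python
import numpy as np

def feedforward(x, W_hidden, b_hidden, W_out, b_out):
    """Compute the outputs of a simple input -> hidden -> output network."""
    net_hidden = W_hidden @ x + b_hidden   # net activation of the hidden units
    y_hidden = np.tanh(net_hidden)         # hidden-unit outputs f(net)
    net_out = W_out @ y_hidden + b_out     # net activation of the output units
    z = np.tanh(net_out)                   # network outputs
    return z

# Example: 2 inputs, 3 hidden units, 1 output, with made-up weights
rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])
W_h, b_h = rng.normal(size=(3, 2)), np.zeros(3)
W_o, b_o = rng.normal(size=(1, 3)), np.zeros(1)
print(feedforward(x, W_h, b_h, W_o, b_o))
```

A perceptron corresponds to the special case with no hidden layer and a hard-threshold output unit.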

  4. Fourier’s Theorem, Kolmogorov’s Theorem • Universal expressive power • But the required functions are too complex, cannot be smooth, and we do not know how to find them
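For reference, Kolmogorov’s superposition theorem, the result usually cited for “universal expressive power,” can be stated roughly as below; the slide itself shows no formula, so the exact notation here is an assumption.

```latex
% Any continuous g on the d-dimensional unit hypercube (d >= 2) can be written as
g(\mathbf{x}) \;=\; \sum_{j=1}^{2d+1} \Xi_j\!\left(\sum_{i=1}^{d} \psi_{ij}(x_i)\right)
% for suitable continuous one-dimensional functions \Xi_j and \psi_{ij};
% these functions are exactly what the slide's caveats refer to.
```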

  5. Back Propagation Algorithm • Criterion function • Gradient descent • Iteration
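A minimal statement of the criterion function and the gradient-descent iteration that back-propagation implements, in the standard squared-error form; the notation is an assumption, since the slide shows no formula.

```latex
J(\mathbf{w}) \;=\; \tfrac{1}{2}\sum_{k}\bigl(t_k - z_k\bigr)^2, \qquad
\Delta \mathbf{w} \;=\; -\eta\,\frac{\partial J}{\partial \mathbf{w}}, \qquad
\mathbf{w}(m+1) \;=\; \mathbf{w}(m) + \Delta \mathbf{w}(m)
```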

  6. Back Propagation Algorithm • Hidden-to-output weight update

  7. Back Propagation Algorithm • Input-to-hidden weight update (a combined sketch of both updates follows below)
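The following sketch combines the two update rules from slides 6 and 7 for a single training pattern, assuming a tanh nonlinearity and the squared-error criterion; the variable names and shapes are illustrative assumptions.

```python
import numpy as np

def backprop_step(x, t, W_h, b_h, W_o, b_o, eta=0.1):
    """One back-propagation update for a single training pattern (x, t)."""
    # Forward pass
    net_h = W_h @ x + b_h
    y = np.tanh(net_h)                     # hidden-unit outputs
    net_o = W_o @ y + b_o
    z = np.tanh(net_o)                     # network outputs

    # Sensitivities (tanh'(u) = 1 - tanh(u)**2)
    delta_o = (t - z) * (1.0 - z ** 2)               # delta_k = (t_k - z_k) f'(net_k)
    delta_h = (1.0 - y ** 2) * (W_o.T @ delta_o)     # delta_j = f'(net_j) sum_k w_kj delta_k

    # Hidden-to-output update, then input-to-hidden update
    W_o += eta * np.outer(delta_o, y)
    b_o += eta * delta_o
    W_h += eta * np.outer(delta_h, x)
    b_h += eta * delta_h
    return W_h, b_h, W_o, b_o

# Toy usage: 2 inputs, 3 hidden units, 1 output
rng = np.random.default_rng(0)
W_h, b_h = rng.normal(scale=0.5, size=(3, 2)), np.zeros(3)
W_o, b_o = rng.normal(scale=0.5, size=(1, 3)), np.zeros(1)
backprop_step(np.array([0.5, -1.0]), np.array([1.0]), W_h, b_h, W_o, b_o)
```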

  8. Back Propagation Algorithm • Training protocols: stochastic, batch, on-line, queries • Learning curves
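The difference between the stochastic and batch protocols is only where the weight update happens; a toy one-dimensional sketch (the gradient function, data, and learning rates are made up for illustration):

```python
import random

def gradient(w, pattern):
    x, t = pattern
    return (w * x - t) * x                    # toy 1-D squared-error gradient

def train_stochastic(w, data, eta, epochs):
    for _ in range(epochs):
        random.shuffle(data)                  # patterns presented in random order
        for pattern in data:                  # update after every single pattern
            w -= eta * gradient(w, pattern)
    return w

def train_batch(w, data, eta, epochs):
    for _ in range(epochs):
        g = sum(gradient(w, p) for p in data)  # accumulate over the whole set
        w -= eta * g                            # one update per epoch
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # true relation t = 2x
print(train_stochastic(0.0, data[:], 0.05, 50), train_batch(0.0, data[:], 0.02, 200))
```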

  9. Back Propagation Algorithm • Error with small networks

  10. Back Propagation Algorithm Training Examples

  11. Bayes Discriminants & Neural Networks
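The connection summarized on this slide is the standard result that a sufficiently expressive network trained on a squared-error criterion with 0/1 target values approximates the Bayes posterior probabilities; a rough statement of the result (notation assumed, since the slide shows no formula):

```latex
% With 0/1 targets and enough data, minimizing the squared-error criterion is
% asymptotically equivalent to minimizing
\int \bigl(z_k(\mathbf{x};\mathbf{w}) - P(\omega_k \mid \mathbf{x})\bigr)^2\, p(\mathbf{x})\, d\mathbf{x},
% so the trained outputs z_k(x) approximate the posteriors P(\omega_k | x).
```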

  12. Improving B-P • Transfer function: sigmoid, i.e. hyperbolic tangent (a Gaussian is another choice) • Desired properties: nonlinear, saturating, continuous and smooth, monotonic, roughly linear for small values of net, computationally simple
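A sketch of a tanh-style sigmoid with the properties listed above (nonlinear, saturating, smooth, monotonic, roughly linear near net = 0). The particular constants a = 1.716 and b = 2/3 are values often suggested for this form; they are an assumption here, not taken from the slide.

```python
import numpy as np

def sigmoid(net, a=1.716, b=2.0 / 3.0):
    """f(net) = a * tanh(b * net): saturates at +/- a, roughly linear near net = 0."""
    return a * np.tanh(b * net)

def sigmoid_prime(net, a=1.716, b=2.0 / 3.0):
    """Derivative f'(net) = a * b * (1 - tanh(b*net)**2), needed by back-propagation."""
    return a * b * (1.0 - np.tanh(b * net) ** 2)

net = np.linspace(-5, 5, 5)
print(sigmoid(net))
print(sigmoid_prime(net))
```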

  13. Improving B-P • Scaling inputs: shift and scale each feature (can be done on-line) • Setting bias and teaching (target) values: limit the net activation and keep the inputs balanced
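One common way to implement the “shift and scale” suggestion is to standardize each input feature to zero mean and unit variance over the training set; the slide does not show the exact scaling, so this is an assumption.

```python
import numpy as np

def standardize(X):
    """Shift and scale each input feature to zero mean and unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0            # avoid division by zero for constant features
    return (X - mean) / std, mean, std

X = np.array([[100.0, 0.1], [200.0, 0.2], [300.0, 0.3]])
X_scaled, mean, std = standardize(X)
print(X_scaled)
```

The same mean and std computed on the training set should be applied to any test patterns.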

  14. Improving B-P • Training with noise • With a small training set, generate virtual training patterns • Add Gaussian noise (or small rotations) to existing patterns • Gives more information and keeps the set representative
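A sketch of generating virtual training patterns by adding Gaussian noise to a small training set, as the slide suggests; the noise level and the number of copies are assumptions.

```python
import numpy as np

def virtual_patterns(X, t, copies=5, sigma=0.05, seed=0):
    """Enlarge a small training set with noisy copies of each pattern."""
    rng = np.random.default_rng(seed)
    X_new = [X]
    t_new = [t]
    for _ in range(copies):
        X_new.append(X + rng.normal(scale=sigma, size=X.shape))  # jittered inputs
        t_new.append(t)                                          # labels unchanged
    return np.concatenate(X_new), np.concatenate(t_new)

X = np.array([[0.0, 1.0], [1.0, 0.0]])
t = np.array([0, 1])
X_big, t_big = virtual_patterns(X, t)
print(X_big.shape, t_big.shape)   # (12, 2) (12,)
```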

  15. Improving B-P • Number of hidden units • Governs the expressive power and the complexity of the decision boundary • Should be based on the pattern distribution • Rule of thumb: roughly n/10 (see the worked example below)
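The n/10 rule of thumb is usually read as: size the network so the total number of weights is roughly one tenth of the number of training samples. A tiny worked example under that reading (the network sizes are assumptions):

```python
def hidden_units_for(n_samples, n_inputs, n_outputs):
    """Pick n_hidden so that total weights (incl. biases) is about n_samples / 10."""
    target_weights = n_samples / 10.0
    # A d-nH-c network has (d+1)*nH + (nH+1)*c weights; solve for nH.
    n_hidden = (target_weights - n_outputs) / (n_inputs + 1 + n_outputs)
    return max(1, round(n_hidden))

print(hidden_units_for(n_samples=1000, n_inputs=8, n_outputs=2))  # -> about 9
```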

  16. Improving B-P • Initialize weights • Goal: fast and uniform learning
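A common way to get fast and uniform learning is to draw the initial weights uniformly from a small symmetric range that depends on a unit’s fan-in, e.g. plus or minus 1/sqrt(d); the exact range is not visible on the slide, so this is an assumption.

```python
import numpy as np

def init_weights(fan_in, fan_out, seed=0):
    """Uniform initialization in [-1/sqrt(fan_in), +1/sqrt(fan_in)]."""
    rng = np.random.default_rng(seed)
    limit = 1.0 / np.sqrt(fan_in)
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

W_h = init_weights(fan_in=8, fan_out=5)   # input-to-hidden weights
W_o = init_weights(fan_in=5, fan_out=2)   # hidden-to-output weights
print(W_h.shape, W_o.shape)
```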

  17. Improving B-P • Learning rates: convergence, speed, quality • The optimal learning rate is the one which leads to the local error minimum in one learning step (reconstructed below)
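The “( )⁻¹” fragment on the slide is the second-derivative form of the optimal rate; it follows from the usual single-weight Taylor-expansion argument, reconstructed here.

```latex
J(w) \;\approx\; J(w_0) + \frac{\partial J}{\partial w}\,(w - w_0)
      + \tfrac{1}{2}\,\frac{\partial^2 J}{\partial w^2}\,(w - w_0)^2
\quad\Longrightarrow\quad
\eta_{\mathrm{opt}} \;=\; \left(\frac{\partial^2 J}{\partial w^2}\right)^{-1}
% A gradient step of size \eta_{opt}\,\partial J/\partial w lands exactly at the
% minimum of this quadratic, i.e. it reaches the local error minimum in one step.
```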

  18. Improving B-P • Momentum • Learn more quickly on error plateaus • Speeds up learning even far from error plateaus
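A standard way to write the momentum modification; the constant alpha and this particular form are assumptions, since the slide shows no formula.

```latex
\Delta \mathbf{w}(m) \;=\; -\eta\,\frac{\partial J}{\partial \mathbf{w}}\bigg|_{m}
      \;+\; \alpha\,\Delta \mathbf{w}(m-1),
\qquad 0 \le \alpha < 1 \ (\text{typically } \alpha \approx 0.9)
% The remembered previous step keeps the weights moving across flat plateaus
% where the current gradient is nearly zero.
```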

  19. Improving B-P • Weight decay • Heuristic: keeping the weights small improves network performance • Implementation: shrink every weight slightly after each update (see below)
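The usual implementation of weight decay shrinks every weight a little after each update; the decay constant epsilon is an assumption.

```latex
w^{\text{new}} \;=\; w^{\text{old}}\,(1 - \epsilon), \qquad 0 < \epsilon \ll 1
% Equivalent to adding a penalty proportional to \|\mathbf{w}\|^2 to the
% criterion function J, which biases the solution toward small weights.
```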

  20. Improving B-P • Hints • Add information or constraints to aid category learning.

  21. Improving B-P • Stopped training • Prevents overfitting • Stop when the error falls below some preset value, or when the error on a validation set reaches a minimum • Equivalent to a form of weight decay
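A sketch of the validation-based stopping rule described above; train_epoch and validation_error are hypothetical helpers standing in for one back-propagation pass and the held-out error, and the patience threshold is an assumption.

```python
def stopped_training(weights, train_epoch, validation_error,
                     max_epochs=1000, patience=10):
    """Stop when the validation error has not improved for `patience` epochs."""
    best_err = float("inf")
    best_weights = weights
    since_best = 0
    for epoch in range(max_epochs):
        weights = train_epoch(weights)        # one training pass
        err = validation_error(weights)       # error on the held-out set
        if err < best_err:
            best_err, best_weights, since_best = err, weights, 0
        else:
            since_best += 1
            if since_best >= patience:        # validation error stopped improving
                break
    return best_weights, best_err

# Toy usage: "weights" is a single number, validation error is minimized at w = 3
w, err = stopped_training(
    weights=0.0,
    train_epoch=lambda w: w + 0.5,               # pretend training moves w upward
    validation_error=lambda w: (w - 3.0) ** 2,   # held-out error curve
)
print(w, err)
```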

  22. Improving B-P • How many hidden layers?
