
Optimizing number of hidden neurons in neural networks



Presentation Transcript


  1. Optimizing number of hidden neurons in neural networks IASTED International Conference on Artificial Intelligence and Applications, Innsbruck, Austria, February 2007. Janusz A. Starzyk, School of Electrical Engineering and Computer Science, Ohio University, Athens, Ohio, U.S.A.

  2. Outline • Neural networks – multi-layer perceptron • Overfitting problem • Signal-to-noise ratio figure (SNRF) • Optimization using signal-to-noise ratio figure • Experimental results • Conclusions

  3. Neural networks – multi-layer perceptron (MLP) [Figure: MLP network mapping inputs x to outputs z]

  4. Neural networks – multi-layer perceptron (MLP) [Figure: MLP block mapping inputs to outputs] • Efficient mapping from inputs to outputs • Powerful universal function approximator • Number of inputs and outputs: determined by the data • Number of hidden neurons: determines the fitting accuracy → a critical choice (see the sketch below)
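For reference, a minimal NumPy sketch of the one-hidden-layer MLP forward pass described above; the layer sizes and the tanh/linear activations are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One-hidden-layer MLP: inputs x -> tanh hidden layer -> linear outputs z."""
    h = np.tanh(x @ W1 + b1)   # hidden activations; width = number of hidden neurons
    return h @ W2 + b2         # linear output layer

# Illustrative shapes: 3 inputs, 5 hidden neurons (the critical parameter), 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)
z = mlp_forward(rng.normal(size=(10, 3)), W1, b1, W2, b2)  # 10 samples -> shape (10, 2)
```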

  5. Overfitting problem [Figure: MLP training on data (x, y) yields a model; the model maps new data x' to predictions y'] • Generalization: how well the trained model predicts outputs y' for new, unseen data x' • Overfitting: overestimates the function complexity, degrades the generalization capability • Bias/variance dilemma • Excessive hidden neurons → overfitting

  6. Overfitting problem • Avoiding overfitting: cross-validation and early stopping • All available data (x, y) are split into training data, whose MLP training gives the training error e_train, and testing data (x', y'), whose MLP testing gives the testing error e_test • Stopping criterion: e_test starts to increase, or e_train and e_test start to diverge (a sketch of this procedure follows) [Plot: fitting error vs. number of hidden neurons; e_train keeps decreasing while e_test reaches its minimum at the optimum number]
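A minimal sketch of that stopping rule, assuming a hypothetical train_mlp(n_hidden) routine (not from the slides) that trains an MLP with the given hidden-layer size and returns (e_train, e_test):

```python
def select_hidden_size(train_mlp, max_hidden=50):
    """Grow the hidden layer until the test error turns upward (overfitting begins).

    train_mlp(n_hidden) is a hypothetical callable returning (e_train, e_test).
    """
    best_n, best_test = 1, float("inf")
    for n_hidden in range(1, max_hidden + 1):
        e_train, e_test = train_mlp(n_hidden)
        if e_test > best_test:      # e_test started to increase: stop growing
            break
        best_n, best_test = n_hidden, e_test
    return best_n
```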

  7. Overfitting problem • Drawbacks of cross-validation: • How should the available data be divided? The testing data (x', y') are withheld from training, so part of the data is wasted • When should training stop? • Can the test error capture the generalization error? [Plot: fitting error vs. number of hidden neurons, as on the previous slide]

  8. Overfitting problem • Desired: • A quantitative measure of the unlearned useful information left in e_train • Automatic recognition of overfitting

  9. Signal-to-noise ratio figure (SNRF) • Sampled data: function value + noise • Error signal: approximation error component (useful signal, should be reduced) + noise component (should not be learned) • Assumptions: continuous function, white Gaussian noise (WGN) • Signal-to-noise ratio figure (SNRF): signal energy / noise energy • Compare SNRF_e of the error signal with SNRF_WGN of pure noise to decide whether learning should stop: if useful signal is left unlearned, continue; if noise dominates the error signal, stop

  10. Signal-to-noise ratio figure (SNRF) – one-dimensional case [Plots: training data with the approximating function; error signal = approximation error component + noise component] How can the level of these two components be measured?

  11. Signal-to-noise ratio figure (SNRF) – one-dimensional case • Key observation: the useful signal shows high correlation between neighboring samples, while WGN does not (a correlation-based estimate is sketched below)
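The slide's exact estimator appears in equations that did not survive the transcript, so the following is only one plausible correlation-based sketch: the lag-1 correlation of the error signal stands in for the signal energy, and the remaining energy for the noise.

```python
import numpy as np

def snrf_1d(e):
    """Correlation-based SNRF estimate for a 1-D error signal e.

    Assumption (the slide's equations are not preserved): the correlated
    part of e approximates the useful signal; the rest is treated as noise.
    """
    e = np.asarray(e, dtype=float)
    signal_energy = np.sum(e[:-1] * e[1:])         # neighboring-sample correlation
    noise_energy = np.sum(e ** 2) - signal_energy  # uncorrelated remainder
    return signal_energy / noise_energy
```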

  12. Signal-to-noise ratio figure (SNRF) – one-dimensional case

  13. Signal-to-noise ratio figure (SNRF) – one-dimensional case • The threshold on SNRF_WGN is set by a hypothesis test at the 5% significance level (an empirical sketch follows)
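One way to obtain such a threshold empirically is Monte Carlo simulation: generate pure WGN many times, compute its SNRF with the snrf_1d sketch above, and take the 95th percentile. This is an illustrative construction, not necessarily the derivation used on the slides.

```python
import numpy as np

def snrf_wgn_threshold(n_samples, n_trials=10_000, significance=0.05, seed=0):
    """Threshold that pure WGN exceeds only `significance` of the time.

    Relies on the snrf_1d sketch defined earlier.
    """
    rng = np.random.default_rng(seed)
    values = [snrf_1d(rng.standard_normal(n_samples)) for _ in range(n_trials)]
    return np.quantile(values, 1.0 - significance)
```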

  14. Signal-to-noise ratio figure (SNRF) – multi-dimensional case • Signal and noise levels are estimated within a neighborhood of each sample p, using its M nearest neighbors (a sketch follows)
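In more than one dimension there is no natural sample ordering, so the neighboring sample is replaced by the M nearest neighbors of each point. A sketch of that idea, extending the 1-D estimate above; the exact weighting used on the slide is not preserved, so this is an assumption.

```python
import numpy as np

def snrf_md(x, e, M=1):
    """Multi-dimensional SNRF sketch: correlate each error e[p] with the errors
    at the M nearest neighbors of x[p] (Euclidean distance)."""
    x, e = np.asarray(x, dtype=float), np.asarray(e, dtype=float)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)                  # a sample is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :M]            # indices of the M nearest neighbors
    signal_energy = np.sum(e[:, None] * e[nn]) / M
    noise_energy = np.sum(e ** 2) - signal_energy
    return signal_energy / noise_energy
```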

  15. Signal-to-noise ratio figure (SNRF) – multi-dimensional case • The per-sample signal and noise estimates are combined over all samples

  16. Signal-to-noise ratio figure (SNRF) – multi-dimensional case • With M = 1, the multi-dimensional threshold ≈ the one-dimensional threshold

  17. Optimization using SNRF • Start with a small network • Train the MLP → e_train • Compare SNRF_e with the threshold SNRF_WGN • While useful signal remains, add hidden neurons and retrain • Stopping criterion: SNRF_e < threshold SNRF_WGN, i.e. noise dominates the error signal, little information is left unlearned, and learning should stop (the loop is sketched below)
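Putting the steps together, a sketch of the growth loop; train_mlp(n_hidden) is hypothetical (it trains an MLP of the given size and returns the per-sample training-error signal), and snrf_1d is the sketch from the one-dimensional case.

```python
def optimize_hidden_neurons(train_mlp, threshold, max_hidden=100):
    """Add hidden neurons until the training error becomes noise-dominated.

    train_mlp(n_hidden) is hypothetical: trains an MLP and returns e_train
    as a per-sample error signal.
    """
    for n_hidden in range(1, max_hidden + 1):
        e_train = train_mlp(n_hidden)
        if snrf_1d(e_train) < threshold:   # noise dominates: stop growing the network
            return n_hidden
    return max_hidden
```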

  18. Optimization using SNRF • The same criterion can be applied to optimize the number of iterations in back-propagation training, avoiding overfitting (overtraining): • Set the structure of the MLP • Train the MLP with back-propagation iterations → e_train • Compare SNRF_e with the threshold SNRF_WGN • Keep training with more iterations until SNRF_e falls below the threshold (sketch below)
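A sketch of that variant, assuming a hypothetical train_step() that runs one back-propagation epoch and returns the current training-error signal; snrf_1d is again the earlier sketch.

```python
def train_until_snrf(train_step, threshold, max_iters=10_000):
    """Run back-propagation epochs until the training error is noise-dominated."""
    for it in range(1, max_iters + 1):
        e_train = train_step()             # hypothetical: one epoch, returns errors
        if snrf_1d(e_train) < threshold:   # continuing would only fit the noise
            return it
    return max_iters
```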

  19. Experimental results • Optimizing the number of iterations: a noise-corrupted 0.4 sin(x) + 0.5

  20. Optimization using SNRF • Optimizing the order of a polynomial approximation

  21. Experimental results • Optimizing the number of hidden neurons for a two-dimensional function

  22. Experimental results

  23. Experimental results • Mackey-Glass database: every 7 consecutive samples are fed to the MLP to predict the following sample (a windowing sketch follows)
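The sliding-window setup described on this slide can be built as follows; the time series itself is assumed to be given.

```python
import numpy as np

def make_windows(series, width=7):
    """Each row of X holds `width` consecutive samples; y is the sample that follows."""
    s = np.asarray(series, dtype=float)
    X = np.stack([s[i:i + width] for i in range(len(s) - width)])
    y = s[width:]
    return X, y
```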

  24. Experimental results WGN characteristic

  25. Experimental results • Puma robot arm dynamics database: the MLP maps 8 inputs (positions, velocities, torques) to the angular acceleration

  26. Conclusions • A quantitative criterion based on the SNRF to optimize the number of hidden neurons in an MLP • Overfitting is detected from the training error alone • No separate test set is required • The criterion is simple, easy to apply, efficient, and effective • The approach extends to optimizing other parameters of neural networks and other fitting problems
