Boosting Neural Networks


Presentation Transcript


  1. Boosting Neural Networks Published by Holger Schwenk and Yoshua Bengio, Neural Computation, 12(8):1869-1887, 2000. Presented by Yong Li

  2. Outline • Introduction • AdaBoost • Three versions of AdaBoost for neural networks • Results • Conclusions • Discussions

  3. Introduction • Boosting – a general method for improving the performance of a learning method. • AdaBoost is a relatively recent boosting algorithm. • There are many empirical studies of AdaBoost using decision trees as base classifiers (Breiman, 1996; Drucker and Cortes, 1996; among others). • There is also theoretical understanding (Schapire et al., 1997; Breiman, 1998; Schapire, 1999).

  4. Introduction • But applications had (at that time) all been to decision trees; there were no applications to multi-layer artificial neural networks. • The questions this paper tries to answer: • Does AdaBoost work as well for neural networks as for decision trees? • Does it behave in a similar way? • And more?

  5. AdaBoost (Adaptive Boosting) • It is often possible to increase the accuracy of a classifier by averaging the decisions of an ensemble of classifiers. • Two popular ensemble methods: Bagging and Boosting. • Bagging improves generalization performance through a reduction in variance while maintaining or only slightly increasing bias. • AdaBoost constructs a composite classifier by sequentially training classifiers while putting more and more emphasis on certain patterns.

  6. AdaBoost • AdaBoost.M2 (the multiclass, pseudo-loss variant) is used in the experiments.
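To make the boosting loop concrete, here is a minimal sketch of a resampling-style AdaBoost round in Python. It follows the simpler AdaBoost.M1 formulation rather than the AdaBoost.M2 pseudo-loss variant actually used in the paper, and train_weak_learner is a hypothetical placeholder for fitting one neural network; it is an illustration of the idea, not the authors' code.

```python
import numpy as np

def adaboost_m1(train_weak_learner, X, y, n_rounds=10, seed=0):
    """Minimal AdaBoost.M1-style loop (illustrative sketch only).

    train_weak_learner(X, y) is assumed to return a fitted classifier
    with a .predict(X) method; each round trains it on a set resampled
    according to the current weight distribution D.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    D = np.full(n, 1.0 / n)               # weight distribution over examples
    classifiers, alphas = [], []

    for t in range(n_rounds):
        idx = rng.choice(n, size=n, p=D)  # resample the training set from D
        clf = train_weak_learner(X[idx], y[idx])
        wrong = clf.predict(X) != y
        eps = float(D[wrong].sum())       # weighted training error
        if eps == 0.0 or eps >= 0.5:      # weak learner is perfect or too weak
            break
        beta = eps / (1.0 - eps)
        D[~wrong] *= beta                 # shrink weights of correct examples
        D /= D.sum()                      # renormalize: emphasis moves to mistakes
        classifiers.append(clf)
        alphas.append(np.log(1.0 / beta)) # this classifier's voting weight
    return classifiers, alphas
```

The combined classifier then predicts the class that receives the largest total alpha-weighted vote over the stored classifiers.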

  7. Applying AdaBoost to neural networks • Three versions of AdaBoost are compared in this paper: • (R) Training the t-th classifier with a fixed training set, resampled once according to the boosting weights • (E) Training the t-th classifier using a different resampled training set at each epoch • (W) Training the t-th classifier by directly weighting the cost function of the t-th neural network
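The difference between the three versions is only in how the boosting weights D_t reach the network's training procedure. The sketch below contrasts them; train_epoch is a hypothetical callback that runs one epoch of gradient descent (optionally with per-example weights), not a function from the paper.

```python
import numpy as np

def train_R(train_epoch, net, X, y, D, n_epochs, rng):
    # (R) one fixed training set, resampled once according to D_t
    idx = rng.choice(len(y), size=len(y), p=D)
    for _ in range(n_epochs):
        train_epoch(net, X[idx], y[idx])

def train_E(train_epoch, net, X, y, D, n_epochs, rng):
    # (E) a fresh training set is resampled from D_t before every epoch
    for _ in range(n_epochs):
        idx = rng.choice(len(y), size=len(y), p=D)
        train_epoch(net, X[idx], y[idx])

def train_W(train_epoch, net, X, y, D, n_epochs):
    # (W) no resampling: D_t directly weights each example's term in the
    #     cost function (e.g. a weighted squared error or cross-entropy)
    for _ in range(n_epochs):
        train_epoch(net, X, y, sample_weight=D)
```

The (W) version avoids the sampling noise of (R) and (E), at the cost of requiring the network's cost function to accept per-example weights.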

  8. Results • Experiments are performed on three data sets. • The online data set collected at the University of Paris 6 • 22 attributes (in [-1, 1]^22), 10 classes • 1200 examples for learning and 830 examples for testing • UCI Letter • 16 attributes and 26 classes • 16000 for training and 4000 for testing • Satimage data set • 36 attributes and 6 classes • 4435 for training and 2000 for testing

  9. Results of online data

  10. Results of online data • Some conclusions: • Boosting is better than Bagging. • AdaBoost is less useful for very big networks. • The (E) and (W) versions are better than (R).

  11. Results of online data • The generalization error continues to decrease after the training error reaches zero.

  12. Results of online data The number of examples with a high margin increases as more classifiers are combined by boosting. Note: there are opposing results in the literature about the margin cumulative distribution.
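For reference, the margin here is the standard voting margin of boosting theory: the weighted vote for the correct class minus the largest weighted vote for any other class, normalized to [-1, 1]. Below is a small sketch of how it could be computed from an ensemble's votes; the vote_matrix layout is an assumption made for illustration, not the paper's code.

```python
import numpy as np

def ensemble_margins(vote_matrix, y, alphas):
    """Voting margins of a boosted ensemble (illustrative sketch).

    vote_matrix[i, k] is assumed to hold the total weighted vote for
    class k on example i, i.e. sum over t of alphas[t] * (h_t(x_i) == k);
    y holds the true (integer) labels. The margin lies in [-1, 1]; it is
    positive exactly when the example is classified correctly.
    """
    total = float(np.sum(alphas))
    rows = np.arange(len(y))
    true_votes = vote_matrix[rows, y].astype(float)
    others = vote_matrix.astype(float)
    others[rows, y] = -np.inf                 # mask out the true class
    best_wrong = others.max(axis=1)
    return (true_votes - best_wrong) / total

# The cumulative margin distribution is then just the sorted margins
# plotted against the fraction of examples, e.g.:
#   m = np.sort(ensemble_margins(votes, labels, alphas))
#   fraction = np.arange(1, len(m) + 1) / len(m)
```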

  13. Results of online data Bagging has no significant influence on the margin distribution

  14. The results for the UCI Letter and Satimage data sets • Only the (E) and (W) versions are applied; they obtain essentially the same results. • The same conclusions are drawn as for the online data. (Some results are omitted.)

  15. Conclusion • AdaBoost can significantly improve neural network classifiers. • Does AdaBoost work as well for neural networks as for decision trees? Answer: yes. • Does it behave in a similar way? Answer: yes. • Overfitting: still there. • Other questions: short answers.

  16. Discussions • Empirically shows that AdaBoost works well for neural networks. • The algorithm description is misleading: D_t(i) vs. D_t(i, y) (see the note below).
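For readers puzzled by that notation, here is the AdaBoost.M2 bookkeeping the slide refers to, written out following Freund and Schapire (1997); the exact symbols used in the paper may differ slightly. D_t(i, y) is a distribution over pairs of an example i and an incorrect label y != y_i, and D_t(i) is its marginal over the wrong labels:

```latex
D_t(i) = \sum_{y \neq y_i} D_t(i, y), \qquad
\epsilon_t = \frac{1}{2} \sum_i \sum_{y \neq y_i} D_t(i, y)
  \bigl( 1 - h_t(x_i, y_i) + h_t(x_i, y) \bigr), \qquad
\beta_t = \frac{\epsilon_t}{1 - \epsilon_t},

D_{t+1}(i, y) = \frac{D_t(i, y) \,
  \beta_t^{\frac{1}{2} \left( 1 + h_t(x_i, y_i) - h_t(x_i, y) \right)}}{Z_t},
\qquad
H(x) = \arg\max_y \sum_t \log\!\left(\frac{1}{\beta_t}\right) h_t(x, y),
```

where epsilon_t is the pseudo-loss of the t-th weak classifier h_t and Z_t normalizes D_{t+1} to a distribution.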
