
Boosting


Presentation Transcript


  1. Boosting Rong Jin

  2. Bagging: bootstrap-sample the training set D into D1, D2, …, Dk, train one classifier hi on each sample Di, and combine h1, h2, …, hk with equal weight.
  Inefficiency with Bagging
  • Inefficient bootstrap sampling:
    • Every example has an equal chance to be sampled
    • No distinction between “easy” examples and “difficult” examples
  • Inefficient model combination:
    • A constant weight for each classifier
    • No distinction between accurate classifiers and inaccurate classifiers
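As a minimal sketch of the pipeline this slide diagrams (not code from the presentation): bootstrap-sample D into D1, …, Dk, train one classifier per sample, and take an equal-weight majority vote. The scikit-learn decision tree and the {-1, +1} label convention are assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # assumed base learner

def bagging_fit(X, y, k=10, seed=0):
    """Train k classifiers, each on a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)              # every example has an equal chance
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Constant weight for each classifier: a plain majority vote over h1..hk."""
    votes = np.stack([m.predict(X) for m in models])  # shape (k, n)
    return np.sign(votes.sum(axis=0))                 # labels assumed to be in {-1, +1}
```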

  3. Improve the Efficiency of Bagging
  • Better sampling strategy
    • Focus on the examples that are difficult to classify
  • Better combination strategy
    • Accurate models should be assigned larger weights

  4. Intuition: train Classifier1 on the training examples (x1, y1), …, (x4, y4); its mistakes are (x1, y1) and (x3, y3). Train Classifier2 on those mistakes; its remaining mistake is (x1, y1). Train Classifier3 on that example. Combining Classifier1 + Classifier2 + Classifier3 makes no training mistakes !! But it may overfit !!

  5. AdaBoost Algorithm
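The algorithm itself appears as an image on the original slide, so what follows is a sketch of the standard AdaBoost loop rather than a transcription of the slide. Labels in {-1, +1} and a depth-1 tree (decision stump) as the weak learner are assumptions; the weak learner is fit with example weights directly, a common substitute for resampling from Dt.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # assumed weak learner

def adaboost_fit(X, y, T=50):
    n = len(X)
    D = np.full(n, 1.0 / n)                  # D0: uniform weights, 1/n per example
    models, alphas = [], []
    for t in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        eps = D[pred != y].sum()             # weighted training error of h_t
        if eps == 0:                         # perfect weak learner: keep it and stop
            models.append(h); alphas.append(1.0)
            break
        if eps >= 0.5:                       # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        D = D * np.exp(-alpha * y * pred)    # shrink correct examples, grow mistakes
        D = D / D.sum()                      # renormalize so D_{t+1} is a distribution
        models.append(h)
        alphas.append(alpha)
    return models, alphas

def adaboost_predict(models, alphas, X):
    # H_T(x) = sum_t alpha_t * h_t(x); the final prediction is its sign
    scores = sum(a * m.predict(X) for a, m in zip(alphas, models))
    return np.sign(scores)
```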

  6. AdaBoost Example (αt = ln 2): start from the uniform distribution D0 = (1/5, 1/5, 1/5, 1/5, 1/5) over the examples (x1, y1), …, (x5, y5). Sample a training set from D0, train h1, and update the weights of the examples h1 misclassifies, giving D1 with entries 2/7 and 1/7. Sample again from D1, train h2, and update the weights again, giving D2 with entries 4/9, 2/9, and 1/9. Continue sampling, training, and updating in the same way.
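Reading the example's numbers, the update appears to double the weight of every example the current classifier misclassifies (e^{αt} = e^{ln 2} = 2), leave the rest unchanged, and renormalize; that reading is an assumption, since the slide's figure did not extract. If h1 misclassifies two of the five examples and h2 misclassifies one of those two, it reproduces the weights shown:

```latex
\[
D_1 \propto \left(\tfrac{2}{5},\tfrac{2}{5},\tfrac{1}{5},\tfrac{1}{5},\tfrac{1}{5}\right)
    = \left(\tfrac{2}{7},\tfrac{2}{7},\tfrac{1}{7},\tfrac{1}{7},\tfrac{1}{7}\right),
\qquad
D_2 \propto \left(\tfrac{4}{7},\tfrac{2}{7},\tfrac{1}{7},\tfrac{1}{7},\tfrac{1}{7}\right)
    = \left(\tfrac{4}{9},\tfrac{2}{9},\tfrac{1}{9},\tfrac{1}{9},\tfrac{1}{9}\right).
\]
```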

  7. How To Choose t in AdaBoost? • How to construct the best distribution Dt+1(i) • Dt+1(i) should be significantly different from Dt(i) • Dt+1(i) should create a situation that classifier ht performs poorly

  8. How To Choose t in AdaBoost?

  9. Optimization View for Choosing αt
  • ht(x): x → {1, -1}, a base (weak) classifier
  • HT(x): a linear combination of the base classifiers
  • Goal: minimize the training error
  • Approximate the training error with an exponential function
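Written out in the standard notation (assumed here, since the slide's formulas are images), the combined classifier and the exponential upper bound on its training error are

```latex
\[
H_T(x) = \sum_{t=1}^{T} \alpha_t\, h_t(x),
\qquad
\frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\!\bigl[\,y_i \neq \operatorname{sign}(H_T(x_i))\,\bigr]
\;\le\;
\frac{1}{n}\sum_{i=1}^{n} \exp\!\bigl(-y_i\, H_T(x_i)\bigr),
\]
```

where the inequality holds because exp(-z) ≥ 1 whenever z ≤ 0, i.e. whenever HT misclassifies (xi, yi).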

  10. AdaBoost: Greedy Optimization. Fix HT-1(x), and solve for hT(x) and αT.
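A sketch of that greedy step in the standard derivation (not a transcription of the slide's math): with HT-1(x) fixed, the exponential loss factors through the current example weights, so the minimization over hT and αT becomes a simple one-dimensional problem.

```latex
\[
\begin{aligned}
\sum_{i=1}^{n} \exp\!\bigl(-y_i H_T(x_i)\bigr)
&= \sum_{i=1}^{n} \exp\!\bigl(-y_i H_{T-1}(x_i)\bigr)\,
   \exp\!\bigl(-\alpha_T\, y_i\, h_T(x_i)\bigr) \\
&\propto \sum_{i=1}^{n} D_T(i)\,\exp\!\bigl(-\alpha_T\, y_i\, h_T(x_i)\bigr)
 = (1-\epsilon_T)\, e^{-\alpha_T} + \epsilon_T\, e^{\alpha_T},
\end{aligned}
\]
```

where εT is the weighted error of hT under DT. Choosing hT to minimize εT and setting the derivative with respect to αT to zero recovers αT = ½ ln((1 − εT)/εT).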

  11. Empirical Study of AdaBoost
  • AdaBoosting decision trees
    • Generate 50 decision trees by AdaBoost
    • Linearly combine the decision trees using the weights from AdaBoost
  • In general:
    • AdaBoost ≥ Bagging > C4.5
    • AdaBoost usually needs fewer classifiers than Bagging

  12. Bias-Variance Tradeoff for AdaBoost
  • AdaBoost can reduce both variance and bias simultaneously
  (Figure: bias and variance of a single decision tree, Bagging decision trees, and AdaBoosting decision trees.)
