1 / 10

Boosting

Boosting. Main idea: train classifiers (e.g. decision trees) in a sequence. a new classifier should focus on those cases which were incorrectly classified in the last round. combine the classifiers by letting them vote on the final prediction (like bagging).

Download Presentation

Boosting

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Boosting • Main idea: • train classifiers (e.g. decision trees) in a sequence. • a new classifier should focus on those cases which were incorrectly classified in the last round. • combine the classifiers by letting them vote on the final prediction (like bagging). • each classifier could be (should be) very “weak”, e.g. a decision stump.

  2. Example this line is one simple classifier saying that everything to the left + and everything to the right is -

  3. Boosting Intuition • We adaptively weigh each data case. • Data cases which are wrongly classified get high weight (the algorithm will focus on them) • Each boosting round learns a new (simple) classifier on the weighed dataset. • These classifiers are weighed to combine them into a single powerful classifier. • Classifiers that that obtain low training error rate have high weight. • We stop by using monitoring a hold out set (cross-validation).

  4. Boosting in a Picture boosting rounds training cases correctly classified training case has large weight in this round this DT has a strong vote.

  5. And in animation Original Training set : Equal Weights to all training samples Taken from “A Tutorial on Boosting” by Yoav Freund and Rob Schapire

  6. AdaBoost(Example) ROUND 1

  7. AdaBoost(Example) ROUND 2

  8. AdaBoost(Example) ROUND 3

  9. AdaBoost(Example)

  10. AdaBoost • Given:                                                      where • Initialise •                           . • For                               : • Find the classifier                                                         that minimizes the error with respect to the distribution Dt: • Prerequisite: εt < 0.5, otherwise stop. • Choose                     , typically                                            where εt is the weighted error rate of classifier ht. • Update: • where Zt is a normalisation factor (chosen so that Dt + 1 will be a distribution). • Output the final classifier: • The equation to update the distribution Dt is constructed so that: • Thus, after selecting an optimal classifier       for the distribution        , the examples     that the classifier • identified correctly are weighted less and those that it identified incorrectly are weighted more. • Therefore, when the algorithm is testing the classifiers on the distribution            , it will select a • classifier that better identifies those examples that the previous classifer missed.

More Related