
Kullback-Leibler Boosting

This presentation covers KLBoosting, a variant of RealBoost that learns general linear features and selects them by Kullback-Leibler divergence. It also walks through the feature selection process and parameter learning in KLBoosting compared with AdaBoost.


Presentation Transcript


  1. Kullback-Leibler Boosting. Ce Liu and Heung-Yeung Shum, Microsoft Research Asia, CVPR 2003. Presented by Derek Hoiem.

  2. RealBoost Review
  • Start with some candidate feature set
  • Initialize training sample weights
  • Loop:
    • Add the feature that minimizes the error bound
    • Reweight training examples, giving more weight to misclassified examples
    • Assign a weight to the weak classifier according to the weighted error on the training samples
  • Exit the loop after N features have been added
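The loop above can be made concrete with a minimal sketch, assuming pre-computed feature responses and real-valued weak learners built from weighted class histograms; the function name, binning scheme, and smoothing constant are illustrative choices, not the authors' implementation.

```python
import numpy as np

def realboost(responses, labels, n_rounds=10, n_bins=32, eps=1e-9):
    """Toy RealBoost loop (illustrative sketch, not the authors' code).

    responses : (n_samples, n_features) pre-computed candidate feature responses
    labels    : (n_samples,) array with values in {+1, -1}
    Returns a list of (feature_index, bin_edges, per_bin_output) weak learners.
    """
    n, m = responses.shape
    w = np.full(n, 1.0 / n)                      # initialize training sample weights
    learners = []
    for _ in range(n_rounds):
        best = None
        for j in range(m):
            edges = np.linspace(responses[:, j].min(), responses[:, j].max(), n_bins + 1)
            bins = np.clip(np.digitize(responses[:, j], edges) - 1, 0, n_bins - 1)
            wp = np.bincount(bins, weights=w * (labels > 0), minlength=n_bins)
            wm = np.bincount(bins, weights=w * (labels < 0), minlength=n_bins)
            z = 2.0 * np.sum(np.sqrt(wp * wm))   # upper bound on training error
            if best is None or z < best[0]:
                h = 0.5 * np.log((wp + eps) / (wm + eps))   # real-valued output per bin
                best = (z, j, edges, h, bins)
        _, j, edges, h, bins = best
        learners.append((j, edges, h))           # add the feature minimizing the bound
        w = w * np.exp(-labels * h[bins])        # upweight misclassified examples
        w /= w.sum()
    return learners
```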

  3. The Basic Idea of KLBoosting
  • Similar to RealBoost except:
    • Features are general linear projections
    • Generates optimal features
    • Uses KL divergence to select features
    • Finer tuning of the coefficients

  4. Linear Features
  • KLBoosting: a feature is a general linear projection of the image patch, f(x) = wᵀx, with an arbitrary projection vector w
  • VJ AdaBoost: features are Haar-like rectangle filters, i.e. projections whose weights are restricted to a few ±1 blocks
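As a hedged illustration of the contrast, the two helpers below compare a general linear projection (KLBoosting) with a Haar-like two-rectangle filter (Viola-Jones AdaBoost); the function names and rectangle layout are my own, not from the slides.

```python
import numpy as np

def klboosting_feature(patch, w):
    """General linear feature: the patch projected onto an arbitrary unit-norm vector w."""
    w = w.ravel() / np.linalg.norm(w)
    return float(patch.ravel() @ w)

def haar_two_rect_feature(patch, top, left, height, width):
    """Viola-Jones style feature: a projection whose weights are restricted to
    +1/-1 rectangular blocks (a simple left/right two-rectangle filter here)."""
    left_rect = patch[top:top + height, left:left + width]
    right_rect = patch[top:top + height, left + width:left + 2 * width]
    return float(left_rect.sum() - right_rect.sum())
```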

  5. What makes a feature good?
  • KLBoosting: maximize the KL divergence between the distributions of positive and negative examples projected onto the feature
  • RealBoost: minimize the upper bound on classification error
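A hedged sketch of the KLBoosting criterion: project both classes onto a candidate feature, histogram the (optionally weighted) responses, and score the feature by the symmetric KL divergence between the two histograms. The bin count and smoothing constant are my choices, not the paper's.

```python
import numpy as np

def kl_score(pos_resp, neg_resp, w_pos=None, w_neg=None, n_bins=64, eps=1e-9):
    """Symmetric KL divergence between histograms of feature responses on the two classes."""
    lo = min(pos_resp.min(), neg_resp.min())
    hi = max(pos_resp.max(), neg_resp.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    p, _ = np.histogram(pos_resp, bins=edges, weights=w_pos)
    q, _ = np.histogram(neg_resp, bins=edges, weights=w_neg)
    p = (p + eps) / (p + eps).sum()              # smoothed class-conditional histograms
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```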

  6. Creating the feature set: Sequential 1-D Optimization
  • Begin with a large initial set of features (linear projections)
  • Choose the top L features according to KL divergence
  • Initial feature = weighted sum of the L features
  • Search for the optimal feature along the directions of the L features (see the sketch after the example slides below)

  7. Example • Initial feature set (figure: scatter of training samples with candidate projection directions)

  8. Example • Top two features by KL divergence: w1 and w2

  9. Example • Initial feature f0: combination of w1 and w2 weighted by KL divergence

  10. Example • Optimize along w1: f1 = f0 + β·w1, with β searched over [-a1, a1]

  11. Example • Optimize along w2: f2 = f1 + β·w2, with β searched over [-a2, a2] (and repeat…)
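The example in slides 7-11 can be summarized as a sequential 1-D line search, sketched below: start from a KL-weighted combination of the top-L directions, then repeatedly move the feature along one direction at a time, keeping the step that maximizes the KL score. The step grid, number of passes, and reuse of kl_score from above are my simplifications.

```python
import numpy as np

def sequential_1d_optimize(pos, neg, top_dirs, score_fn, n_steps=21, step_range=1.0, n_passes=3):
    """Greedy 1-D optimization of a linear feature (illustrative sketch).

    pos, neg : (n, d) arrays of positive / negative samples
    top_dirs : (L, d) top-L candidate directions ranked by KL divergence
    score_fn : callable scoring two 1-D response arrays, e.g. kl_score above
    """
    def score(w):
        w = w / np.linalg.norm(w)
        return score_fn(pos @ w, neg @ w)

    # f0: combination of the top-L directions weighted by their KL scores
    kl_weights = np.array([score(w) for w in top_dirs])
    f = (kl_weights[:, None] * top_dirs).sum(axis=0)

    for _ in range(n_passes):
        for w in top_dirs:                           # optimize along one direction at a time
            betas = np.linspace(-step_range, step_range, n_steps)
            candidates = [f + b * w for b in betas]  # includes beta = 0 (keep f unchanged)
            scores = [score(c) for c in candidates]
            f = candidates[int(np.argmax(scores))]
    return f / np.linalg.norm(f)
```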

  12. Creating the feature set (figures: selecting the first feature; the first three features)

  13. Creating the feature set

  14. Classification
  • Strong classifier: H(x) = sign( Σk αk · log[ p(fk(x) | +1) / p(fk(x) | −1) ] )
  • The coefficients αk are fixed at ½ in RealBoost; KLBoosting learns them (next slide)
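A hedged sketch of the strong classifier as reconstructed above: each learned feature contributes an α-weighted log-likelihood ratio estimated from class-conditional histograms of its responses, and the sign of the sum gives the decision. The data layout and threshold handling below are illustrative.

```python
import numpy as np

def strong_classify(x, learned_features, alphas, threshold=0.0, eps=1e-9):
    """Sign of the alpha-weighted sum of log-likelihood ratios (illustrative sketch).

    learned_features : list of (w, edges, p_pos, p_neg) tuples, where p_pos / p_neg are
                       histogram estimates of p(f(x) | +1) and p(f(x) | -1)
    alphas           : coefficients learned by KLBoosting (all fixed at 1/2 in RealBoost)
    """
    total = 0.0
    for alpha, (w, edges, p_pos, p_neg) in zip(alphas, learned_features):
        response = float(x.ravel() @ w.ravel())                       # linear feature response
        j = int(np.clip(np.digitize(response, edges) - 1, 0, len(p_pos) - 1))
        total += alpha * np.log((p_pos[j] + eps) / (p_neg[j] + eps))  # log-likelihood ratio
    return 1 if total >= threshold else -1
```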

  15. Parameter Learning
  • With each added feature k:
    • Keep α1..αk−1 at their current optimal values
    • Initialize αk to 0
    • Minimize the recognition error on the training set over the α coefficients
    • Solve using a greedy algorithm
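The paper's exact greedy procedure is not spelled out on the slide, so the sketch below stands in for it with a simple coordinate-wise grid search: keep the earlier coefficients at their current values, start the new one at zero, and repeatedly adjust one α at a time to reduce training error. The grid range and pass count are arbitrary choices.

```python
import numpy as np

def learn_alphas(llr, labels, alphas_prev, n_passes=2, n_grid=41, max_alpha=2.0):
    """Greedy coordinate search over the alpha coefficients (illustrative stand-in).

    llr         : (n_samples, k) per-feature log-likelihood ratios on the training set
    labels      : (n_samples,) array with values in {+1, -1}
    alphas_prev : coefficients alpha_1..alpha_{k-1} learned so far
    """
    alphas = np.append(np.asarray(alphas_prev, dtype=float), 0.0)  # alpha_k starts at 0
    grid = np.linspace(0.0, max_alpha, n_grid)                     # candidate values (arbitrary range)

    def train_error(a):
        return float(np.mean(np.sign(llr @ a) != labels))          # recognition error on training set

    for _ in range(n_passes):
        for i in range(len(alphas)):                               # adjust one coefficient at a time
            trial = alphas.copy()
            errors = []
            for value in grid:
                trial[i] = value
                errors.append(train_error(trial))
            alphas[i] = grid[int(np.argmin(errors))]
    return alphas
```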

  16. KLBoost vs. AdaBoost • 1024 candidate features for AdaBoost

  17. Face detection: candidate features • 52,400 → 2,800 → 450

  18. Face detection: training samples
  • 8,760 faces plus their mirror images
  • 2,484 non-face images → 1.34B patches
  • Cascaded classifier allows bootstrapping (see the sketch below)
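A hedged sketch of the bootstrapping enabled by the cascade: after training the current stages, scan patches from the non-face images and keep only those the cascade still accepts, so the next stage trains on hard negatives. The function and stage interface below are illustrative.

```python
def bootstrap_negatives(nonface_patches, cascade, max_keep=10000):
    """Collect false positives of the current cascade as negatives for the next stage.

    nonface_patches : iterable of image patches known to contain no faces
    cascade         : list of stage classifiers; a patch is accepted only if every
                      stage returns a positive score
    """
    hard_negatives = []
    for patch in nonface_patches:
        if all(stage(patch) > 0 for stage in cascade):   # still (wrongly) accepted as a face
            hard_negatives.append(patch)
            if len(hard_negatives) >= max_keep:          # cap the bootstrap set size
                break
    return hard_negatives
```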

  19. Face detection: final features (figure: top ten features, grouped as global semantic, global non-semantic, and local)

  20. Results • Test time: 0.4 sec per 320x240 image • Comparison with Schneiderman (2003) (figure: detection results)

  21. Comments
  • Training time?
  • Which improves performance:
    • Generating optimal features?
    • KL feature selection?
    • Optimizing the α coefficients?
