
Kullback-Leibler Boosting

An overview of KLBoosting, a classifier that combines linear projections with histogram divergences and is applied here to face detection using KL features. The slides cover the methodology, the face-detection application, and comparisons with AdaBoost.

rconner

Presentation Transcript


  1. Kullback-Leibler Boosting • Ce Liu, Heung-Yeung Shum • Microsoft Research Asia

  2. A General Two-layer Classifier • Diagram labels: input, projection function, intermediate, discriminating function, coefficients, identification function, output

  3. Issues under Two-layer Framework • How to choose the type of projection function? • How to choose the type of discriminating function? • How to learn the parameters from samples? • Diagram labels: projection function, discriminating function; RBF, polynomial, sigmoid

  4. Our proposal: Kullback-Leibler Boosting (KLBoosting) • How to choose the type of projection function? • Kullback-Leibler linear feature • How to choose the type of discriminating function? • Histogram divergences • How to learn the parameters from samples? • Sample re-weighting (Boosting)

  5. Intuitions • Linear projection is robust and easy to compute • The histograms of the two classes along a projection are evidence for classification • The linear feature on which the histograms of the two classes differ most should be selected • If the weight distribution of the sample set changes, the histograms change as well • Increase weights for misclassified samples, and decrease weights for correctly classified samples
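The histogram intuition above can be illustrated with a small sketch. It assumes a symmetric form of the KL divergence between two weighted histograms; the slide does not give the formula, and the function names, bin range, and toy data below are hypothetical.

```python
import math

def project(w, x):
    """Linear projection of sample x onto feature vector w."""
    return sum(wi * xi for wi, xi in zip(w, x))

def weighted_histogram(values, weights, lo, hi, bins, eps=1e-6):
    """Normalised histogram of projected values; eps avoids empty bins."""
    hist = [eps] * bins
    width = (hi - lo) / bins
    for v, wt in zip(values, weights):
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        hist[idx] += wt
    total = sum(hist)
    return [h / total for h in hist]

def symmetric_kl(p, q):
    """Symmetric KL divergence (an assumed form, not taken from the slides)."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def divergence_along(w, pos, neg, w_pos, w_neg, lo=-4.0, hi=4.0, bins=8):
    """How strongly the two class histograms differ along feature w."""
    hp = weighted_histogram([project(w, x) for x in pos], w_pos, lo, hi, bins)
    hn = weighted_histogram([project(w, x) for x in neg], w_neg, lo, hi, bins)
    return symmetric_kl(hp, hn)

# Toy 2D data: the classes are separated along x, but not along y.
pos = [(2.0, 0.1), (2.5, -0.2), (3.0, 0.0), (2.2, 0.3)]
neg = [(-2.0, 0.2), (-2.5, -0.1), (-3.0, 0.1), (-2.1, -0.3)]
uniform = [0.25] * 4

kl_x = divergence_along((1.0, 0.0), pos, neg, uniform, uniform)
kl_y = divergence_along((0.0, 1.0), pos, neg, uniform, uniform)
print(kl_x > kl_y)  # the x direction separates the classes better
```

Changing the sample weights passed to `weighted_histogram` changes the histograms and therefore the divergence, which is what lets re-weighting steer later feature choices.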

  6. Linear projections and histograms

  7. KLBoosting (1) • At the kth iteration • Kullback-Leibler Feature • Discriminating function • Reweighting

  8. KLBoosting (2) • Two types of parameters to learn • KL features: • Combination coefficients: • Learning KL feature in low dimensions: MCMC • Learning weights to minimize training error • Optimization: brute-force search

  9. Flowchart • Input: samples labeled +1 / -1 • Initialize weights • Learn KL feature • Learn combining coefficients • Update weights • Recognition error small enough? N: continue iterating; Y: output classifier
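The flowchart loop can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the KL feature is picked from a small hypothetical candidate list instead of being learned, the discriminating function is assumed to be a log-ratio of class histograms, and the exponential weight update is an assumption modeled on standard boosting. Only the brute-force search for the combining coefficient comes directly from the slides.

```python
import math

LO, HI, BINS = -4.0, 4.0, 8  # hypothetical histogram range and bin count

def dot(f, x):
    return sum(a * b for a, b in zip(f, x))

def bin_of(v):
    return min(max(int((v - LO) / (HI - LO) * BINS), 0), BINS - 1)

def histogram(values, weights, eps=1e-3):
    h = [eps] * BINS
    for v, wt in zip(values, weights):
        h[bin_of(v)] += wt
    total = sum(h)
    return [b / total for b in h]

def sym_kl(p, q):
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))

def term(x, f, hp, hn):
    """Assumed discriminating function: log-ratio of the class histograms."""
    b = bin_of(dot(f, x))
    return math.log(hp[b] / hn[b])

def score(x, model):
    return sum(a * term(x, f, hp, hn) for a, f, hp, hn in model)

def error(X, y, model):
    wrong = sum(1 for x, t in zip(X, y) if (1 if score(x, model) > 0 else -1) != t)
    return wrong / len(y)

def klboost_sketch(X, y, candidates, rounds=3, alphas=(0.25, 0.5, 1.0, 2.0)):
    n_pos = sum(1 for t in y if t > 0)
    # Initialize weights, normalised within each class.
    w = [1.0 / n_pos if t > 0 else 1.0 / (len(y) - n_pos) for t in y]
    model = []
    for _ in range(rounds):
        # Pick the candidate whose weighted class histograms differ most.
        best = None
        for f in candidates:
            hp = histogram([dot(f, x) for x, t in zip(X, y) if t > 0],
                           [wi for wi, t in zip(w, y) if t > 0])
            hn = histogram([dot(f, x) for x, t in zip(X, y) if t < 0],
                           [wi for wi, t in zip(w, y) if t < 0])
            d = sym_kl(hp, hn)
            if best is None or d > best[0]:
                best = (d, f, hp, hn)
        _, f, hp, hn = best
        # Brute-force search for the combining coefficient.
        alpha = min(alphas, key=lambda a: error(X, y, model + [(a, f, hp, hn)]))
        model.append((alpha, f, hp, hn))
        # Update weights: up for misclassified samples, down for correct ones.
        w = [wi * math.exp(-t * alpha * term(x, f, hp, hn)) for wi, x, t in zip(w, X, y)]
        for cls in (1, -1):
            z = sum(wi for wi, t in zip(w, y) if t == cls)
            w = [wi / z if t == cls else wi for wi, t in zip(w, y)]
        if error(X, y, model) == 0.0:  # "recognition error small enough"
            break
    return model, error(X, y, model)

X = [(2.0, 0.1), (2.5, -0.2), (3.0, 0.0), (2.2, 0.3),
     (-2.0, 0.2), (-2.5, -0.1), (-3.0, 0.1), (-2.1, -0.3)]
y = [1, 1, 1, 1, -1, -1, -1, -1]
model, train_error = klboost_sketch(X, y, candidates=[(1.0, 0.0), (0.0, 1.0)])
```

On this toy set the loop stops after one round, because the first selected feature already separates the two classes.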

  10. A Simple Example • Figure panels: KL features, histograms, decision manifold

  11. A Complicated Case

  12. Kullback-Leibler Analysis (KLA) • Finding a KL feature in image space is a challenging task • Sequential 1D optimization • Construct a feature bank • Build a set of the most promising features • Sequentially do 1D optimization along the promising features • Conjecture: the global optimum of an objective function can be reached by searching along as many linear features as needed
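A minimal sketch of sequential 1D optimization, assuming each 1D step is a simple grid line search along one promising direction. The slide does not specify the 1D optimizer, and the function name, step grid, and toy objective below are hypothetical stand-ins (in KLA the objective would be the KL divergence of the projected histograms).

```python
def sequential_1d_optimize(objective, start, directions, steps, sweeps=1):
    """Improve `start` by searching along one direction at a time.

    For each direction d, try f + lam * d for every lam in `steps` and keep
    the best candidate, so the objective never decreases.
    """
    f = list(start)
    best = objective(f)
    for _ in range(sweeps):
        for d in directions:
            best_f = f
            for lam in steps:
                cand = [fi + lam * di for fi, di in zip(f, d)]
                val = objective(cand)
                if val > best:
                    best, best_f = val, cand
            f = best_f
    return f, best

# Toy objective with its maximum at (1, -2).
def toy_objective(f):
    return -((f[0] - 1.0) ** 2 + (f[1] + 2.0) ** 2)

feature, value = sequential_1d_optimize(
    toy_objective,
    start=[0.0, 0.0],
    directions=[(1.0, 0.0), (0.0, 1.0)],
    steps=[-2.0, -1.0, -0.5, 0.5, 1.0, 2.0],
)
```

Each pass only ever accepts an improvement, so adding more directions to the promising set can refine the result but cannot make it worse.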

  13. Intuition of Sequential 1D Optimization • Figure labels: feature bank, MCMC feature, promising feature set, result of sequential 1D optimization

  14. Optimization in Image Space • An image is a random field, not a pure random variable • Local statistics can be captured by wavelets • 111×400 small-scale wavelets for the whole 20×20 patch • 80×100 large-scale wavelets for the inner 10×10 patch • 52,400 wavelets in total compose the feature bank • The 2,800 most promising wavelets are selected • Figure labels: Gaussian family wavelets, Haar wavelets, feature bank
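The feature-bank size can be checked with a few lines of arithmetic. The reading of "111×400" as 111 wavelets at each of the 400 positions of the 20×20 patch follows the next slide; reading "80×100" the same way for the 10×10 inner patch is an assumption.

```python
# Small-scale wavelets: 111 per position on the 20x20 lattice.
small_scale = 111 * (20 * 20)
# Large-scale wavelets: assumed 80 per position on the inner 10x10 lattice.
large_scale = 80 * (10 * 10)

feature_bank_size = small_scale + large_scale
selected = 2800
selected_fraction = selected / feature_bank_size

print(feature_bank_size)            # total wavelets in the feature bank
print(round(selected_fraction, 3))  # share kept as "promising"
```

The total matches the 52,400 quoted on the slide, and the promising set is a little over 5% of the bank.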

  15. Data-driven KLA • At each position of the 20×20 lattice, compute the histograms of the 111 wavelets and the KL divergences between face and non-face images • Large-scale wavelets capture the global statistics on the 10×10 inner lattice • Compose the KL feature by sequential 1D optimization • Figure labels: face patterns, non-face patterns, feature bank (111 wavelets), promising feature set (2,800 features in total)

  16. Comparison with Other Features • Best Haar wavelet: KL = 2.944 • MCMC feature: KL = 3.246 • KL feature: KL = 10.967

  17. Application: Face Detection • Experimental setup • 20×20 patch to represent a face • 17,520 frontal faces • 1,339,856,947 non-faces from 2,484 images • 300 bins in the histogram representation • A cascade of KLBoosting classifiers • In each classifier, keep the false negative rate <0.01% and the false alarm rate <35% • 22 classifiers in total form the cascade (450 features)
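To see what the per-stage limits imply for the whole cascade, the sketch below multiplies them across the 22 stages. Treating the stage limits as independent worst cases is an assumption, not a claim from the slides, and the function names are hypothetical.

```python
def cascade_accepts(stages, x):
    """A patch is accepted only if every stage accepts it (early rejection)."""
    for stage in stages:
        if not stage(x):
            return False
    return True

def cascade_bounds(n_stages, max_false_alarm, max_false_negative):
    """Overall rates if every stage sits exactly at its limit (assumed independent)."""
    overall_false_alarm = max_false_alarm ** n_stages
    overall_detection = (1.0 - max_false_negative) ** n_stages
    return overall_false_alarm, overall_detection

# Per-stage limits from the slide: false alarm < 35%, false negative < 0.01%.
false_alarm, detection = cascade_bounds(22, 0.35, 0.0001)
print(false_alarm, detection)

# Toy usage of the cascade itself with two hypothetical threshold stages.
toy_stages = [lambda v: v > 0.2, lambda v: v > 0.5]
print(cascade_accepts(toy_stages, 0.7), cascade_accepts(toy_stages, 0.3))
```

Under these assumptions the overall false alarm rate is on the order of 1e-10 while the detection rate stays above 99.7%, which is the point of chaining many permissive stages.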

  18. KL Features of Face Detector • Figure labels: face patterns, non-face patterns, first 10 KL features, some other KL features (global semantics, frequency filters, local features)

  19. ROC Curve

  20. Some Detection Results

  21. Comparison with AdaBoost

  22. Compared with AdaBoost

  23. Summary • KLBoosting is an optimal classifier • Projection function: linear projection • Discriminating function: histogram divergence • Coefficients: optimized by minimizing training error • KLA: a data-driven approach to pursue KL features • Applications in face detection

  24. Thank you! • Harry Shum, Microsoft Research Asia • hshum@microsoft.com

  25. Compared with SVM
