Principled Asymmetric Boosting Approaches to Rapid Training and Classification in Face Detection • presented by Minh-Tri Pham, Ph.D. Candidate and Research Associate, Nanyang Technological University, Singapore
Outline • Motivation • Contributions • Automatic Selection of Asymmetric Goal • Fast Weak Classifier Learning • Online Asymmetric Boosting • Generalization Bounds on the Asymmetric Error • Future Work • Summary
Application Face recognition
Application 3D face reconstruction
Application Camera auto-focusing
Application • Windows face logon • Lenovo Veriface Technology
Appearance-based Approach • Scan the image with a probe window patch (x, y, s) at different positions and scales • Binary-classify each patch as face or non-face • Desired output: the states (x, y, s) that contain a face • Most popular approach: Viola-Jones '01–'04, Li et al. '02, Wu et al. '04, Brubaker et al. '04, Liu et al. '04, Xiao et al. '04, Bourdev-Brandt '05, Mita et al. '05, Huang et al. '05–'07, Wu et al. '05, Grabner et al. '05–'07, and many more
Appearance-based Approach • Statistics: • 6,950,440 patches in a 320×240 image • P(face) < 10⁻⁵ • Key requirement: • A very fast classifier
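To make the scale of the scan concrete, here is a minimal sketch that counts probe windows (x, y, s) over all positions and scales. The base window size, scale step, and pixel shift are illustrative assumptions, not the talk's settings, so the sketch is not expected to reproduce the 6,950,440 figure.

```python
def count_windows(width, height, base=24, scale_step=1.25, shift=1):
    """Count square probe windows over all positions and scales.

    base, scale_step and shift are assumed values for illustration only.
    """
    total = 0
    size = float(base)
    while int(size) <= min(width, height):
        s = int(size)
        nx = (width - s) // shift + 1   # horizontal positions at this scale
        ny = (height - s) // shift + 1  # vertical positions at this scale
        total += nx * ny
        size *= scale_step              # move to the next, larger scale
    return total


if __name__ == "__main__":
    print(count_windows(320, 240))
```

Whatever the exact parameters, the count grows with the image area times the number of scales, which is why almost every patch must be rejected very cheaply.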
A very fast classifier • Cascade of non-face rejectors F1, F2, …, FN: a patch that passes every stage is labelled face; a patch rejected at any stage is labelled non-face • F1, F2, …, FN: asymmetric classifiers • FRR(Fk) ≈ 0 • FAR(Fk) as small as possible (e.g. 0.5 – 0.8) • [Figure: cascade F1 → F2 → … → FN → face, with a reject branch to non-face at every stage]
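The cascade logic can be sketched as follows. The function names are hypothetical, and the rate calculation assumes that stage errors are independent, which is a simplification rather than a claim from the talk.

```python
def cascade_classify(patch, stages):
    """Return True (face) only if every stage passes the patch."""
    for stage in stages:
        if not stage(patch):
            return False  # rejected early: later stages are never evaluated
    return True


def cascade_rates(stage_far, stage_frr):
    """Overall (FAR, FRR) of a cascade, assuming independent stage errors."""
    far = 1.0
    pass_rate_for_faces = 1.0
    for a, b in zip(stage_far, stage_frr):
        far *= a                        # a non-face must slip through every stage
        pass_rate_for_faces *= 1.0 - b  # a face must survive every stage
    return far, 1.0 - pass_rate_for_faces


if __name__ == "__main__":
    far, frr = cascade_rates([0.5] * 20, [0.001] * 20)
    print(far, frr)
```

Under that assumption, a per-stage FAR of 0.5 compounds to a very small overall FAR after 20 stages, while the overall FRR stays small only if each stage's FRR is close to zero. This is why each stage is trained asymmetrically.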
Non-face Rejector • A strong combination of weak classifiers: F1 passes the patch if f1,1 + f1,2 + … + f1,K > θ, and rejects it otherwise • f1,1, f1,2, …, f1,K: weak classifiers • θ: threshold
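A single rejector can be sketched as a thresholded sum of weak-classifier scores. The name `make_rejector` and the use of `theta` for the threshold are assumptions made for illustration.

```python
def make_rejector(weak_classifiers, theta):
    """Build a stage that passes a patch when the summed weak scores exceed theta."""
    def rejector(patch):
        score = sum(f(patch) for f in weak_classifiers)
        return score > theta  # True = pass, False = reject
    return rejector


if __name__ == "__main__":
    # Toy weak classifiers acting on a single number instead of an image patch.
    stage = make_rejector([lambda p: p, lambda p: 2 * p], theta=1.0)
    print(stage(1.0), stage(0.2))
```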
Boosting • [Figure: Stage 1 (Weak Classifier Learner 1) and Stage 2 (Weak Classifier Learner 2), each separating positive and negative examples into correctly classified and wrongly classified]
Asymmetric Boosting • Weight positives λ times more than negatives • [Figure: Stage 1 (Weak Classifier Learner 1) and Stage 2 (Weak Classifier Learner 2) on positive and negative examples]
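One simple way to realize the asymmetric weighting is to scale the initial example weights, as in the sketch below. This is an assumed, minimal variant: the symbol `lam` for the weighting factor is an assumption, and some published methods spread the asymmetry over the boosting rounds instead of applying it once.

```python
def asymmetric_weights(labels, lam):
    """Initial boosting weights with each positive weighted lam times a negative.

    labels: 1 for positive (face), 0 for negative (non-face).
    """
    raw = [lam if y == 1 else 1.0 for y in labels]
    total = sum(raw)
    return [w / total for w in raw]  # normalize so the weights sum to 1


if __name__ == "__main__":
    print(asymmetric_weights([1, 0, 0], lam=2.0))
```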
Weak classifier • Classify a Haar-like feature value: input patch → feature value v → classify v → score
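The pipeline from patch to score can be sketched with an integral image, a two-rectangle Haar-like feature, and a decision stump. The ±1 stump output and all function names are assumptions for illustration; the talk's weak classifiers may map v to a score differently.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature value v: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def stump(v, threshold, polarity=1):
    """Assumed weak classifier: +1 or -1 depending on which side of threshold v falls."""
    return 1 if polarity * v > polarity * threshold else -1


if __name__ == "__main__":
    ii = integral_image([[1, 2], [3, 4]])
    v = two_rect_feature(ii, 0, 0, 2, 2)
    print(v, stump(v, 0))
```

The integral image makes every rectangle sum a four-lookup operation, so evaluating a feature costs the same at any scale.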
Main issues • Requires too much intervention from experts
A very fast classifier • Cascade of non-face rejectors • F1, F2, …, FN: asymmetric classifiers • FRR(Fk) ≈ 0 • FAR(Fk) as small as possible (e.g. 0.5 – 0.8) • How to choose bounds for FRR(Fk) and FAR(Fk)?
Asymmetric Boosting • Weight positives λ times more than negatives • How to choose λ?
Non-face Rejector • A strong combination of weak classifiers: F1 passes the patch if f1,1 + f1,2 + … + f1,K > θ, and rejects it otherwise • f1,1, f1,2, …, f1,K: weak classifiers • θ: threshold • How to choose θ?
Main issues • Requires too much intervention from experts • Very long learning time
Weak classifier • Classify a Haar-like feature value: input patch → feature value v → classify v → score • 10 minutes to learn a weak classifier
Main issues • Requires too much intervention from experts • Very long learning time • To learn a face detector (≈ 4,000 weak classifiers): • 4,000 × 10 minutes ≈ 1 month • Only suitable for objects with small shape variance
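The one-month estimate follows directly from the two figures on the slide, as this small check shows (the function name is hypothetical):

```python
def total_training_days(num_weak_classifiers, minutes_per_weak_classifier):
    """Total sequential training time in days."""
    return num_weak_classifiers * minutes_per_weak_classifier / (60 * 24)


if __name__ == "__main__":
    print(round(total_training_days(4000, 10), 1))  # just under 28 days
```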
Detection with Multi-exit Asymmetric Boosting • CVPR'08 poster paper: • Minh-Tri Pham, Viet-Dung D. Hoang, and Tat-Jen Cham. Detection with Multi-exit Asymmetric Boosting. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, 2008. • Won Travel Grant Award
Problem overview • Common appearance-based approach: a cascade F1, F2, …, FN in which an input that passes every stage is labelled object and an input rejected at any stage is labelled non-object • F1, F2, …, FN: boosted classifiers • F1 passes the input if f1,1 + f1,2 + … + f1,K > θ, and rejects it otherwise • f1,1, f1,2, …, f1,K: weak classifiers • θ: threshold
Objective • Find f1,1, f1,2, …, f1,K and θ such that: • FAR(F1) ≤ α₀ and FRR(F1) ≤ β₀ (the target error rates) • K is minimized (K is proportional to F1's evaluation time)
Existing trends (1) • Idea: • For k from 1 until convergence: • Learn a new weak classifier f1,k(x) • Adjust θ to see whether FAR(F1) ≤ α₀ and FRR(F1) ≤ β₀ can be achieved • Break the loop if such a θ exists • Issues: • Weak classifiers are sub-optimal w.r.t. the training goal • Too many weak classifiers are required in practice
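The threshold-adjustment step of this loop can be sketched as a scan over candidate thresholds. The names `theta`, `alpha0`, and `beta0` (threshold, target FAR, target FRR) are assumed symbols, and the scan is a straightforward illustration rather than the procedure used in the cited methods.

```python
def error_rates(scores, labels, theta):
    """(FAR, FRR) of the rule 'pass if score > theta'. labels: 1 = positive, 0 = negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    frr = sum(s <= theta for s in pos) / len(pos)  # positives wrongly rejected
    far = sum(s > theta for s in neg) / len(neg)   # negatives wrongly accepted
    return far, frr


def find_threshold(scores, labels, alpha0, beta0):
    """Return the lowest candidate theta meeting both targets, or None if none exists."""
    candidates = [min(scores) - 1.0] + sorted(set(scores))
    for theta in candidates:
        far, frr = error_rates(scores, labels, theta)
        if far <= alpha0 and frr <= beta0:
            return theta
    return None  # the loop would then go on to learn another weak classifier


if __name__ == "__main__":
    print(find_threshold([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0], alpha0=0.5, beta0=0.0))
```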
Existing trends (2) • Idea: • For k from 1 until convergence: • Learn a new weak classifier f1,k(x) • Break the loop if FAR(F1) ≤ α₀ and FRR(F1) ≤ β₀ • Pros: • Reduces FRR at the cost of increasing FAR, which is acceptable for cascades • Fewer weak classifiers • Cons: • How to choose λ? • Much longer training time • Solution to the first con: • Trial and error: choose λ such that K is minimized
Our solution • Learn every weak classifier using the same asymmetric goal: minimize FAR + λ·FRR, where λ = α₀/β₀ • Why?
Because… • Consider two desired bounds (or targets) for learning a boosted classifier: • Exact bound (1): FAR ≤ α₀ and FRR ≤ β₀ • Conservative bound (2): FAR/α₀ + FRR/β₀ ≤ 1 • (2) is more conservative than (1) because (2) ⇒ (1) • At λ = α₀/β₀, for every new weak classifier learned, the ROC operating point moves the fastest toward the conservative bound • [Figure: two ROC plots (FRR on the horizontal axis, FAR on the vertical axis, both from 0 to 1) showing the exact bound, the conservative bound, and the operating point after each weak classifier]
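The implication (2) ⇒ (1) can be written out in one line. The block below takes the exact bound to be FAR ≤ α₀ and FRR ≤ β₀ and the conservative bound to be the linear constraint through (β₀, 0) and (0, α₀); both forms, and the symbol λ = α₀/β₀, are an assumed reading of the two targets.

```latex
\begin{align*}
\text{(1) exact:}\quad & \mathrm{FAR} \le \alpha_0 \ \text{ and } \ \mathrm{FRR} \le \beta_0 \\
\text{(2) conservative:}\quad & \frac{\mathrm{FAR}}{\alpha_0} + \frac{\mathrm{FRR}}{\beta_0} \le 1 \\
\text{(2)} \Rightarrow \text{(1):}\quad & \frac{\mathrm{FAR}}{\alpha_0} \le 1 - \frac{\mathrm{FRR}}{\beta_0} \le 1,
\qquad \frac{\mathrm{FRR}}{\beta_0} \le 1 - \frac{\mathrm{FAR}}{\alpha_0} \le 1 \\
\text{with } \lambda = \alpha_0/\beta_0:\quad & \text{(2)} \iff \mathrm{FAR} + \lambda\,\mathrm{FRR} \le \alpha_0
\end{align*}
```

Both steps in the third line use only that FAR and FRR are non-negative. The last line shows why a goal of the form FAR + λ·FRR lines up with the conservative bound: reducing that weighted sum is the same as moving toward the boundary of (2).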
Implication • When the ROC operating point lies within the conservative bound: • FAR(F1) ≤ α₀ and FRR(F1) ≤ β₀ • Conditions met, therefore θ = 0
Multi-exit Boosting • A method to train a single boosted classifier with multiple exit nodes • [Figure: a chain of weak classifiers f1 + f2 + … + f8 grouped into F1, F2, F3; two node types are shown: a plain weak classifier, and an exit node, i.e. a weak classifier followed by a decision to continue or reject; passing the last exit gives object, any reject gives non-object] • Features: • Weak classifiers are trained with the same goal: minimize FAR + λ·FRR with λ = α₀/β₀ • Every pass/reject decision is guaranteed with FAR ≤ α₀ and FRR ≤ β₀ • The classifier is a cascade • The score is propagated from one node to the next • Main advantages: • Weak classifiers are learned (approximately) optimally • No training of multiple boosted classifiers • Far fewer weak classifiers are needed than in traditional cascades
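Evaluation of a multi-exit classifier can be sketched as a single running score with reject decisions at selected nodes. The function name, the representation of exit nodes as a set of indices, and the fixed threshold of zero at every exit are assumptions for illustration.

```python
def multi_exit_classify(patch, weak_classifiers, exit_nodes, theta=0.0):
    """Evaluate one boosted classifier that has several exit nodes.

    exit_nodes: indices of weak classifiers after which a reject decision is made.
    The running score is carried across exits instead of being reset per stage.
    """
    score = 0.0
    for i, f in enumerate(weak_classifiers):
        score += f(patch)
        if i in exit_nodes and score <= theta:
            return False  # rejected at this exit node
    return True  # survived every exit node


if __name__ == "__main__":
    weak = [lambda p: p, lambda p: -2 * p, lambda p: 5 * p]
    print(multi_exit_classify(1.0, weak, exit_nodes={1, 2}))  # running scores 1, -1, 4
    print(multi_exit_classify(1.0, weak, exit_nodes={2}))
```

The difference from a conventional cascade is that an exit node sees the accumulated score of all earlier weak classifiers, so no evidence is discarded between stages.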
Results: Goal (λ) vs. Number of weak classifiers (K) • Toy problem: to learn a (single-exit) boosted classifier F for classifying face/non-face patches such that FAR(F) < 0.8 and FRR(F) < 0.01 • Empirically best goal: • Our method chooses: • Similar results were obtained for tests on other desired error rates.
Ours vs. Others (in Face Detection) • Fast StatBoost is used as the base method for fast training of weak classifiers.
Ours vs. Others (in Face Detection) • MIT+CMU Frontal Face Test set:
Conclusion • Multi-exit Asymmetric Boosting trains every weak classifier approximately optimally • Better accuracy • Far fewer weak classifiers • Significantly reduced training time • No more trial and error when training a boosted classifier
Fast Training and Selection of Haar-like Features using Statistics • ICCV'07 oral paper: • Minh-Tri Pham and Tat-Jen Cham. Fast Training and Selection of Haar Features using Statistics in Boosting-based Face Detection. In Proc. International Conference on Computer Vision (ICCV), Rio de Janeiro, Brazil, 2007. • Won Travel Grant Award • Won Second Prize, Best Student Paper in Year 2007 Award, Pattern Recognition and Machine Intelligence Association (PREMIA), Singapore
Motivation • Face detectors today • Real-time detection speed …but… • Weeks of training time
Why is Training so Slow? • Time complexity: O(MNT log N) • 15ms to train a feature classifier • 10 minutes to train a weak classifier • 27 days to train a face detector
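The three timings on this slide are mutually consistent, as a quick back-of-the-envelope check shows. The interpretation that one weak classifier is chosen by training one feature classifier per candidate feature is an assumption, and the function names are hypothetical.

```python
def feature_classifiers_per_weak(weak_seconds, feature_seconds):
    """How many feature classifiers fit into the time of one weak classifier."""
    return round(weak_seconds / feature_seconds)


def weak_classifiers_in(days, weak_minutes):
    """How many weak classifiers can be trained sequentially in the given days."""
    return days * 24 * 60 / weak_minutes


if __name__ == "__main__":
    print(feature_classifiers_per_weak(10 * 60, 0.015))  # candidate features per weak classifier
    print(weak_classifiers_in(27, 10))                   # weak classifiers in 27 days
```

At 15 ms each, 10 minutes corresponds to about 40,000 feature classifiers, and 27 days at 10 minutes each corresponds to roughly 3,900 weak classifiers, in line with the ≈ 4,000 quoted earlier in the talk.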
Why Should the Training Time be Improved? • Tradeoff between time and generalization • E.g. training 100 times slower if we increase both N and T by 10 times • Trial and error to find key parameters for training • Much longer training time needed • Online-learning face detectors have the same problem
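The "100 times slower" figure follows from the O(MNT log N) cost when both N and T grow tenfold. The sketch below evaluates the ratio for one example size; N = 10,000 is an arbitrary illustrative value, and constant factors are ignored.

```python
import math


def relative_cost(n, t, m=1):
    """Cost model M * N * T * log N, up to a constant factor."""
    return m * n * t * math.log(n)


def slowdown(n, t, factor):
    """How much slower training gets when N and T are both multiplied by factor."""
    return relative_cost(n * factor, t * factor) / relative_cost(n, t)


if __name__ == "__main__":
    print(slowdown(10_000, 4_000, 10))
```

The N·T product alone contributes the factor of 100; the log N term adds a further modest factor on top, so "100 times" is a slight underestimate for this cost model.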