Face Alignment by Explicit Shape Regression • Xudong Cao, Yichen Wei, Fang Wen, Jian Sun • Microsoft Research Asia
Outline • Introduction • Face Alignment by Shape Regression • Implementation details • Experiments • Discussion and Conclusion
Introduction • A face shape $S = [x_1, y_1, \ldots, x_{N_{fp}}, y_{N_{fp}}]^T$ consists of $N_{fp}$ facial landmarks. • Given a face image, the goal of face alignment is to estimate a shape $S$ that is as close as possible to the true shape $\hat{S}$, i.e., to minimize the alignment error $\|S - \hat{S}\|_2$
Introduction • Most alignment approaches can be classified into two categories • optimization-based • the true shape is unknown at test time, so these methods minimize another error function that is correlated with the alignment error (the distance between the estimated and true shapes) instead • regression-based • learn a regression function that directly maps image appearance to the target output
Introduction • The shape constraint is essential in all methods • Most previous works use a parametric shape model to enforce such a constraint • This work proposes a novel regression-based approach that does not use any parametric shape model • The regressor realizes the shape constraint in a non-parametric manner
Introduction • a boosted regressor to progressively infer the shape • the early regressors handle large shape variations and guarantee robustness • the later regressors handle small shape variations and ensure accuracy • two-level boosted regression • shape-indexed features • correlation-based feature selection method
Face Alignment by Shape Regression • Use boosted regression to combine T weak regressors $(R^1, \ldots, R^t, \ldots, R^T)$ in an additive manner • Given a facial image $I$ and an initial face shape $S^0$, each regressor computes a shape increment $\delta S$ from image features and then updates the face shape
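Face Alignment by Shape Regression • $S^t = S^{t-1} + R^t(I, S^{t-1}), \; t = 1, \ldots, T$, where the $t$-th weak regressor $R^t$ updates the previous shape $S^{t-1}$ to the new shape $S^t$ • Given N training examples $\{(I_i, \hat{S}_i)\}_{i=1}^{N}$, the regressors $(R^1, \ldots, R^t, \ldots, R^T)$ are sequentially learnt until the training error no longer decreases: $R^t = \arg\min_{R} \sum_{i=1}^{N} \|\hat{S}_i - (S_i^{t-1} + R(I_i, S_i^{t-1}))\|$, where $S_i^{t-1}$ is the estimated shape in the previous stage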
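The additive update on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `apply_cascade` and `make_halfway_stage`, the flattened-list shape representation, and the toy stage that moves halfway toward a fixed target are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Shape = List[float]  # flattened landmarks [x1, y1, ..., xN, yN]
# Hypothetical signature: a stage maps (image, current shape) to a shape increment.
Regressor = Callable[[object, Shape], Shape]


def apply_cascade(image: object, initial_shape: Sequence[float],
                  regressors: Sequence[Regressor]) -> Shape:
    """Refine a shape additively: each stage adds its increment to the previous shape."""
    shape = list(initial_shape)
    for regressor in regressors:
        increment = regressor(image, shape)  # depends on the image and the current shape
        shape = [s + d for s, d in zip(shape, increment)]
    return shape


def make_halfway_stage(target: Sequence[float]) -> Regressor:
    """Toy stage (illustration only): move halfway toward a fixed target shape."""
    def stage(image: object, shape: Shape) -> Shape:
        return [0.5 * (t - s) for t, s in zip(target, shape)]
    return stage


target = [4.0, 8.0]
stages = [make_halfway_stage(target) for _ in range(3)]
result = apply_cascade(None, [0.0, 0.0], stages)
print(result)  # [3.5, 7.0]
```

The toy stages also mirror the coarse-to-fine behaviour described earlier: the first stage makes the largest correction and later stages make progressively smaller ones.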
Face Alignment by Shape Regression • Two-level cascaded regression • Learn each weak regressor $R^t$ by a second-level boosted regression, i.e., $R^t = (r^1, \ldots, r^k, \ldots, r^K)$ • The shape-indexed image features are fixed in the second level • They are indexed only relative to $S^{t-1}$ and no longer change when those r's are learnt
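Face Alignment by Shape Regression • Primitive regressor • Use a fern as the primitive regressor r; its F features and thresholds divide the training samples into $2^F$ bins • Each bin b has a regression output that minimizes the alignment error of the samples falling into it: $\delta S_b = \arg\min_{\delta S} \sum_{i \in \Omega_b} \|\hat{S}_i - (S_i + \delta S)\|$ • $\delta S_b$: regression output • the summed norm: alignment error • $\Omega_b$: the training samples falling into bin b • $S_i$: the estimated shape in the previous step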
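A fern assigns a sample to a bin by thresholding each of its features, so F features give $2^F$ possible bins. The sketch below shows one way to compute the bin index; the function name `fern_bin_index`, the `>=` comparison, and the bit ordering (first feature is the most significant bit) are assumptions chosen for illustration.

```python
from typing import Sequence


def fern_bin_index(features: Sequence[float], thresholds: Sequence[float]) -> int:
    """Map F feature values to one of 2**F bins, one bit per thresholded feature."""
    if len(features) != len(thresholds):
        raise ValueError("need exactly one threshold per feature")
    index = 0
    for value, threshold in zip(features, thresholds):
        bit = 1 if value >= threshold else 0
        index = (index << 1) | bit  # first feature ends up as the most significant bit
    return index


print(fern_bin_index([0.3, -0.1, 0.7], [0.0, 0.0, 0.5]))  # 5 (binary 101)
```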
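Face Alignment by Shape Regression • Primitive regressor (Cont.) • The solution of the minimization above is the mean of the shape differences: $\delta S_b = \frac{\sum_{i \in \Omega_b} (\hat{S}_i - S_i)}{|\Omega_b|}$, where $|\Omega_b|$ is the number of training samples in bin b • To overcome over-fitting when a bin contains insufficient training data, a shrinkage is performed: $\delta S_b = \frac{1}{1 + \beta / |\Omega_b|} \cdot \frac{\sum_{i \in \Omega_b} (\hat{S}_i - S_i)}{|\Omega_b|}$, where β is a free shrinkage parameter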
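The bin output with shrinkage can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `bin_output`, the list-of-lists residual representation (true shape minus current estimate, one row per sample in the bin), and returning a zero increment for an empty bin are choices made here, not details given on the slide.

```python
from typing import List, Sequence


def bin_output(residuals: Sequence[Sequence[float]], beta: float, dim: int) -> List[float]:
    """Shrunk mean of the shape residuals that fall into one fern bin."""
    count = len(residuals)
    if count == 0:
        return [0.0] * dim  # assumption: an empty bin leaves the shape unchanged
    mean = [sum(r[k] for r in residuals) / count for k in range(dim)]
    shrink = 1.0 / (1.0 + beta / count)  # approaches 1 as the bin fills up
    return [shrink * m for m in mean]


residuals = [[2.0, 4.0], [4.0, 8.0]]
print(bin_output(residuals, beta=2.0, dim=2))  # [1.5, 3.0]
print(bin_output(residuals, beta=0.0, dim=2))  # [3.0, 6.0]
```

With β = 0 the output is the plain mean; a larger β pulls the output of sparsely populated bins toward zero, which is the intended guard against over-fitting.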
Face Alignment by Shape Regression • Non-parametric shape constraint • The final regressed shape $S$ can be expressed as the initial shape $S^0$ plus a linear combination of all training shapes: $S = S^0 + \sum_{i=1}^{N} w_i \hat{S}_i$ • As long as the initial shape $S^0$ satisfies the shape constraint, the regressed shape is always constrained to reside in the linear subspace constructed by all training shapes
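One way to see why this holds is to unroll the additive updates. The following is a sketch, assuming that every primitive regressor outputs the (shrunk) mean of shape residuals in a bin and that the initial shapes used in training are themselves linear combinations of training shapes; the weights $c_b$ and $w_i$ and the stage indices $t, k$ are notation introduced here.

```latex
% Output of one primitive regressor: a scaled sum of residuals in bin \Omega_b
\delta S_b = c_b \sum_{i \in \Omega_b} \bigl(\hat{S}_i - S_i\bigr),
\qquad
c_b = \frac{1}{\bigl(1 + \beta / |\Omega_b|\bigr)\,|\Omega_b|}

% The current estimates S_i are training initial shapes plus earlier increments,
% so by induction every increment lies in the span of the training shapes:
\delta S_b \in \operatorname{span}\{\hat{S}_1, \ldots, \hat{S}_N\}
\quad\Longrightarrow\quad
S = S^0 + \sum_{t,k} \delta S^{t,k} = S^0 + \sum_{i=1}^{N} w_i \hat{S}_i
```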
Face Alignment by Shape Regression • Shape-indexed (image) features • Use simple pixel-difference features, i.e., the intensity difference of two pixels in the image • A pixel is indexed relative to the currently estimated shape rather than the original image coordinates • For each weak regressor $R^t$ in the first level, randomly sample P pixels; in total $P^2$ pixel-difference features are generated
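A shape-indexed pixel-difference feature can be sketched as below. This is a simplified illustration, not the authors' code: encoding a pixel as a landmark id plus a raw (dx, dy) offset, rounding to the nearest pixel, and clamping to the image border are all assumptions, and the sketch omits any scale or rotation normalization of the offsets.

```python
from typing import Sequence, Tuple

Image = Sequence[Sequence[float]]       # image[row][col] grayscale intensities
Shape = Sequence[Tuple[float, float]]   # landmark (x, y) positions
PixelIndex = Tuple[int, float, float]   # hypothetical encoding: (landmark id, dx, dy)


def sample_pixel(image: Image, shape: Shape, index: PixelIndex) -> float:
    """Read the intensity at landmark + offset, clamped to the image bounds."""
    landmark, dx, dy = index
    x, y = shape[landmark]
    col = min(max(int(round(x + dx)), 0), len(image[0]) - 1)
    row = min(max(int(round(y + dy)), 0), len(image) - 1)
    return image[row][col]


def pixel_difference(image: Image, shape: Shape, a: PixelIndex, b: PixelIndex) -> float:
    """Intensity difference of two pixels that are indexed relative to the shape."""
    return sample_pixel(image, shape, a) - sample_pixel(image, shape, b)


image = [[0.0, 10.0, 20.0],
         [30.0, 45.0, 50.0],
         [60.0, 70.0, 95.0]]
a, b = (0, 1.0, 0.0), (0, 0.0, -1.0)
print(pixel_difference(image, [(1.0, 1.0)], a, b))  # 40.0
print(pixel_difference(image, [(1.0, 2.0)], a, b))  # 50.0
```

The same pair of pixel indices reads different image locations once the landmark estimate moves, which is the point of indexing relative to the shape rather than to fixed image coordinates.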
Face Alignment by Shape Regression • n-best • randomly generating a pool of ferns and selecting the one with minimum regression error • Correlation-based feature selection • exploit the correlation between features and the regression target
Face Alignment by Shape Regression • Correlation-based feature selection (Cont.) • a good fern should satisfy two properties • each feature in the fern should be highly discriminative to the regression target • correlation between features should be low so they are complementary when composed
Face Alignment by Shape Regression • To find features satisfying such properties: 1. Project the regression target onto a random direction to produce a scalar 2. Among the $P^2$ features, select the feature with the highest correlation to the scalar 3. Repeat steps 1 and 2 F times to obtain F features 4. Construct a fern from the F features with random thresholds
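The selection steps on this slide can be sketched in plain Python. This is a naive illustration under stated assumptions, not the authors' implementation: it scores every ordered pixel pair exhaustively, skips pairs of identical pixels, draws Gaussian random directions, and picks each threshold uniformly inside the observed feature range. The names `correlation` and `select_fern_features` are hypothetical, and at least two sampled pixels are required.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_fern_features(
    pixels: Sequence[Sequence[float]],   # pixels[i][p]: intensity of sampled pixel p in example i
    targets: Sequence[Sequence[float]],  # targets[i]: regression target of example i
    num_features: int,                   # F
    rng: random.Random,
) -> List[Tuple[Tuple[int, int], float]]:
    """Return F entries ((p, q), threshold), one per selected pixel-difference feature."""
    num_pixels, dim = len(pixels[0]), len(targets[0])
    fern = []
    for _ in range(num_features):
        # Step 1: project the regression targets onto a random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scalars = [sum(d * t for d, t in zip(direction, target)) for target in targets]
        # Step 2: keep the pixel pair whose difference correlates best with the scalar.
        best_pair, best_corr, best_diffs = (0, 1), -2.0, []
        for p in range(num_pixels):
            for q in range(num_pixels):
                if p == q:
                    continue  # a pixel minus itself carries no information
                diffs = [row[p] - row[q] for row in pixels]
                corr = correlation(diffs, scalars)
                if corr > best_corr:
                    best_pair, best_corr, best_diffs = (p, q), corr, diffs
        # Step 4, done per feature: random threshold within the observed range (assumption).
        threshold = rng.uniform(min(best_diffs), max(best_diffs))
        fern.append((best_pair, threshold))  # Step 3 is the enclosing loop over F
    return fern


# Toy data: the 1-D target equals pixel 0 minus pixel 2.
pixels = [[1, 5, 2], [4, 1, 0], [3, 3, 7], [8, 2, 5], [0, 6, 1]]
targets = [[float(row[0] - row[2])] for row in pixels]
fern = select_fern_features(pixels, targets, num_features=2, rng=random.Random(0))
print(fern)
```

On the toy data every selected pair involves pixels 0 and 2, in an order that depends on the sign of the random direction. Because each feature is chosen against a different random projection, the F features tend to be complementary rather than redundant.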
Implementation details • Training data augmentation • Multiple initializations in testing • Running time performance • Parameter settings
Experiments • BioID • It consists of 1,521 near-frontal face images captured in a lab environment • LFPW • Its images are downloaded from the internet and contain large variations • LFW87 • The images mainly come from the LFW dataset • It has 87 annotated landmarks, many more than BioID and LFPW
Experiments • Comparison with previous work • Comparison to [1] on LFPW • Comparison to [12] on LFW87 • [1] Localizing parts of faces using a consensus of exemplars. In CVPR, 2011 • [12] Face alignment via component-based discriminative search. In ECCV, 2008
Experiments • Comparison with previous work (Cont.) • Comparison to previous methods on BioID • [20] Fully automatic facial feature point detection using Gabor feature based boosted classifiers • [5] Feature detection and tracking with constrained local models. In BMVC, 2006 • [14] Locating facial features with an extended active shape model. In ECCV, 2008 • [19] Facial point detection using boosted regression and graph models. In CVPR, 2010
Experiments • Algorithm validation and discussions • Two-level cascaded regression • Shape-indexed feature • The mean error of the local index method is much smaller than that of the global index method • Feature selection
Experiments • Algorithm validation and discussions (Cont.) • Feature range • the distance between the pair of pixels normalized by the distance between the two pupils
Discussion and Conclusion • By jointly regressing the entire shape and minimizing the alignment error, the shape constraint is automatically encoded • The resulting method is highly accurate and efficient, and can be used in real-time applications such as face tracking