Face Model Fitting based on Machine Learning from Multi-band Images of Facial Components
Matthias Wimmer, Christoph Mayer, Freek Stulp, Bernd Radig
Institute for Informatics, Technische Universität München, Germany
Outline of this Presentation
• Part 1: image → compute multi-band image representation → facial components
• Part 2: facial components → model fitting with learned objective functions → correctly fitted face model
• Correctly fitted face model → infer semantic information: facial expression, gaze, identity, gender, age, …
Part 1: Compute Multi-band Image Representing Facial Components
Motivation and Related Work • Descriptive feature representation of the input image • Input for the subsequent process of model fitting • Quick computation • Similar approaches: Stegmann et al., Image and Vision Computing (2003)
Our Idea • Multi-band image contains the location of facial components • Use a classifier for this task • Provide a multitude of features • Classifier decides which ones are relevant (→ quick) • Consider pixel features only (→ quick) • Pre-compute image characteristics and adjust pixel features (→ accurate) • Components: skin, lips, teeth, brows, pupils
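The pixel-wise classification idea can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the feature vectors and labels are synthetic stand-ins for the real color/location features, and the decision tree is just one classifier that implicitly selects relevant features.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy per-pixel feature vectors (standing in for color + location features)
# and labels for five facial components. Purely illustrative random data.
rng = np.random.default_rng(0)
X_train = rng.random((500, 6))       # 500 annotated pixels, 6 features each
y_train = rng.integers(0, 5, 500)    # 0=skin, 1=lips, 2=teeth, 3=brows, 4=pupils

clf = DecisionTreeClassifier(max_depth=8, random_state=0)
clf.fit(X_train, y_train)

# Classify every pixel of a (toy) image, yielding one component label per pixel,
# i.e. the multi-band image representation.
h, w = 4, 4
pixels = rng.random((h * w, 6))
bands = clf.predict(pixels).reshape(h, w)
```

A tree-based classifier is a natural fit here because it only evaluates the features it actually splits on, which keeps per-pixel classification fast.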
Prerequisites • Image database (500 images collected from the Web) • Face locator, e.g. Viola and Jones: computes rectangular regions around human faces • Manual annotations
Probability Matrices • Indicate the probability that a pixel at a given position belongs to a certain facial component • Relative to the face rectangle • Learned offline • (Shown for: skin, brows, pupils, lower lip, teeth)
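Such a probability matrix can be estimated offline by averaging the binary annotation masks of one component over all training faces, once each mask is registered to a common face-rectangle size. A minimal sketch with synthetic masks (the mask data and rectangle size are made up):

```python
import numpy as np

# 100 synthetic binary annotation masks for one facial component,
# all normalized to the same 32x32 face rectangle.
rng = np.random.default_rng(1)
rect_h, rect_w = 32, 32
masks = rng.random((100, rect_h, rect_w)) < 0.2

# Averaging the masks gives, per position within the face rectangle,
# the relative frequency (probability) of the component occurring there.
prob_matrix = masks.mean(axis=0)
```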
Image Characteristics • Describe characteristics of the entire image
Space: • Distribution of pixel locations of all facial components • Gaussian distribution of locations (mean, covariance matrix) • Computed by applying the probability mask to the face rectangle
Color: • Distribution of colors of all facial components • Gaussian distribution of colors (mean, covariance matrix)
(Figure: example image, skin color distribution, spatial distribution of skin color)
Multitude of Pixel Features
Static pixel features (16 features): • Color (RGB, NRGB, HSV, YCbCr) • Coordinates (Cartesian, polar)
Adjusted pixel features (~90 features in total): • Color relative to mean color (Euclidean, Mahalanobis) • Coordinates relative to mean location of facial components (Euclidean, Mahalanobis)
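One example of an adjusted feature is the Mahalanobis distance of a pixel's color to the image's skin-color distribution (the pre-computed mean and covariance). A sketch with synthetic skin-color samples (the sample values are invented for illustration):

```python
import numpy as np

# Toy RGB samples standing in for the skin pixels of one image.
rng = np.random.default_rng(2)
skin_pixels = rng.normal([180, 120, 100], 10, size=(1000, 3))

# Pre-computed image characteristics: Gaussian color distribution.
mean = skin_pixels.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(skin_pixels, rowvar=False))

def mahalanobis(rgb):
    """Adjusted color feature: distance of a pixel to the skin-color mean."""
    d = np.asarray(rgb) - mean
    return float(np.sqrt(d @ cov_inv @ d))

near = mahalanobis([182.0, 118.0, 101.0])  # skin-like pixel: small distance
far = mahalanobis([20.0, 240.0, 30.0])     # very different color: large distance
```

Because the distance is normalized by the covariance, the same feature adapts to the color characteristics of each individual image, which is what makes the adjusted features more accurate than static ones.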
Evaluation: Train various Classifiers • 4 classifiers for each facial component • C1: static features only • C2: adjusted color features only • C3: adjusted location features only • C4: all features
Model-based Image Interpretation • The model: contains a parameter vector that represents the model's configuration. • The objective function: calculates a value that indicates how accurately a parameterized model matches an image. • The fitting algorithm: searches for the model parameters that describe the image best, i.e. it minimizes the objective function.
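The interplay of the three components can be sketched in a few lines. Both the quadratic objective and the simple coordinate-descent search below are illustrative stand-ins, not the paper's actual objective function or fitting algorithm:

```python
import numpy as np

# Stand-in objective: small value = good match. The "best fit" parameter
# vector is hypothetical; a real objective would compare model and image.
def objective(params):
    target = np.array([1.0, -2.0, 0.5])
    return float(np.sum((params - target) ** 2))

# Stand-in fitting algorithm: greedy coordinate descent that keeps any
# single-parameter step which decreases the objective.
def fit(params, step=0.1, iters=200):
    params = params.copy()
    for _ in range(iters):
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params.copy()
                trial[i] += delta
                if objective(trial) < objective(params):
                    params = trial
    return params

fitted = fit(np.zeros(3))
```

The key point of the slide holds regardless of the concrete search strategy: the fitting algorithm only sees objective values, so the quality of the fit is bounded by the quality of the objective function.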
Ideal Objective Functions • P1: Correctness property: the global minimum corresponds to the best fit. • P2: Uni-modality property: the objective function has no local extrema. (Figure: examples violating and satisfying P1 and P2) • Ideal objective functions don't exist for real-world images • Only for annotated images: f_n(I, x) = | c_n − x |
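On annotated images the ideal objective can be written down directly as the distance between a candidate position x and the annotated correct position c_n. A minimal numpy sketch (the landmark coordinates are made up):

```python
import numpy as np

# Ideal objective f_n(I, x) = |c_n - x|: only computable when the
# correct position c_n is known from a manual annotation.
def ideal_objective(c_n, x):
    return float(np.linalg.norm(np.asarray(x) - np.asarray(c_n)))

c = (120.0, 80.0)                                # hypothetical annotated landmark
at_annotation = ideal_objective(c, c)            # global minimum (property P1)
displaced = ideal_objective(c, (125.0, 80.0))    # grows with displacement (P2)
```

By construction this function satisfies both P1 (its global minimum is the annotation) and P2 (it increases monotonically with displacement), which is exactly what makes it suitable for generating training data.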
Learning the Objective Function • Ideal objective function generates training data • Machine learning technique generates calculation rules (Figure: ideal objective function → training data → learned objective function)
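The learning step can be sketched as follows: sample displacements around annotated positions, label each sample with the ideal objective value, and fit a regressor on image features. Here the "features" are just the noisy displacement itself and the regressor is a decision tree, both purely for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Training data generated by the ideal objective: random displacements
# around the annotation, labeled with their distance to the annotation.
rng = np.random.default_rng(3)
displacements = rng.uniform(-10, 10, size=(2000, 2))
ideal_values = np.linalg.norm(displacements, axis=1)

# Toy "image features": the displacement plus noise (real features would
# be computed from the image around each candidate position).
features = displacements + rng.normal(0, 0.1, displacements.shape)

learned = DecisionTreeRegressor(max_depth=10, random_state=0)
learned.fit(features, ideal_values)

# The learned function approximates the ideal one on unseen inputs.
pred = learned.predict([[3.0, 4.0]])[0]   # ideal value would be 5.0
```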
Benefits of the Machine Learning Approach • Accurate and robust calculation rules • Locally customized calculation rules • Generalization from many images • Simple job for the designer • Critical decisions are automated • No domain-dependent knowledge required • No loops
Evaluation 1: Displacing the Correct Model (Figure: comparison of the statistics-based, ideal, and learned objective functions)
Evaluation 2: Selected Features (Figure: features selected for contour point 116)
Conclusion • Crucial decisions within Computer Vision algorithms • Don't solve them by trial and error → learn from training data • Example 1: learned classifiers for facial components • Example 2: learned objective functions
Outlook • More features for learning objective functions: higher number of features; other kinds of features (SIFT, LBP, …) • Learn with better classifiers: Relevance Vector Machines; boosted regressors • Training images: render faces with an AAM → exact ground truth (no manual work required), many images • Learn a global objective function • Learn rules to directly update the model parameters
Thank you! Online-Demonstration: http://www9.cs.tum.edu/people/wimmerm