
Face Model Fitting based on Machine Learning from Multi-band Images of Facial Components

Matthias Wimmer, Christoph Mayer, Freek Stulp, Bernd Radig
Institute for Informatics, Technische Universität München, Germany


Presentation Transcript


  1. Face Model Fitting based on Machine Learning from Multi-band Images of Facial Components Matthias Wimmer, Christoph Mayer, Freek Stulp, Bernd Radig Institute for Informatics, Technische Universität München, Germany

  2. Outline of this Presentation image → Part 1: compute multi-band image representation → facial components → Part 2: model fitting with learned objective functions → correctly fitted face model → infer semantic information (facial expression, gaze, identity, gender, age, …)

  3. Part 1: Compute a Multi-band Image Representing Facial Components

  4. Motivation and Related Work • Descriptive feature representation of the input image • Input for the subsequent process of model fitting • Quick computation • Similar approach: Stegmann et al., Image and Vision Computing (2003)

  5. Our Idea • Multi-band image contains the location of facial components • Use a classifier for this task • Provide a multitude of features • Classifier decides which ones are relevant (→ quick) • Consider pixel features only (→ quick) • Pre-compute image characteristics and adjust pixel features (→ accurate) skin lips teeth brows pupils

  6. Prerequisites • Image database (from the Web; 500 images) • Face locator, e.g. Viola and Jones • Computes rectangular regions around human faces • Manual annotations

  7. Probability Matrices • Indicate the probability that a pixel belongs to a certain facial component • Relative to the face rectangle • Learned offline skin brows pupils lower lip teeth
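Learning such probability matrices offline amounts to averaging aligned binary annotation masks. A minimal sketch of that idea, where `learn_probability_matrix` is a hypothetical helper name and the masks are assumed to be already aligned to the normalized face rectangle:

```python
import numpy as np

def learn_probability_matrix(masks):
    """Estimate a probability matrix for one facial component.

    masks: binary H x W arrays, one per training image, where 1 marks
    pixels manually annotated as the component, all aligned to the same
    normalized face rectangle.  The result gives, per pixel position,
    the relative frequency of the component over the training set.
    """
    stack = np.stack([m.astype(float) for m in masks])
    return stack.mean(axis=0)

# toy example: two 2x2 annotation masks for the same component
m1 = np.array([[1, 0], [1, 0]])
m2 = np.array([[1, 0], [0, 0]])
P = learn_probability_matrix([m1, m2])
```

The averaging is per pixel position, which is why the matrices are only meaningful relative to the face rectangle.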

  8. Image Characteristics space • Distribution of pixel locations of all facial components • Gaussian distribution of locations (mean, covariance matrix) • Describe characteristics of the entire image • Computed by applying the probability mask to the face rectangle color • Distribution of color of all facial components • Gaussian distribution of color (mean, covariance matrix) skin color distribution example image spatial distribution of skin color
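The two Gaussian characteristics above (location and color) are each just a mean and covariance estimated from per-pixel samples, which later allows Mahalanobis-adjusted features. A hedged sketch with hypothetical helper names:

```python
import numpy as np

def fit_gaussian(samples):
    """Gaussian image characteristic: mean and covariance of per-pixel
    samples (N x D), e.g. skin colors or pixel locations collected by
    applying the probability mask to the face rectangle."""
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def mahalanobis(x, mean, cov):
    """Distance of one pixel's value to the learned distribution; the
    basis of the 'adjusted' pixel features."""
    d = np.asarray(x, float) - mean
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

# toy samples (e.g. 2-D color components of masked skin pixels)
samples = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
mean, cov = fit_gaussian(samples)
```

A pixel exactly at the mean has distance 0; the covariance scales each direction, which is what distinguishes Mahalanobis from plain Euclidean distance.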

  9. Multitude of Pixel Features space • Coordinates (Cartesian, polar) • Coordinates relative to the mean location of facial components (Euclidean, Mahalanobis) color • Color (RGB, NRGB, HSV, YCbCr) Static pixel features: 16 features Adjusted pixel features • Color relative to the mean color (Euclidean, Mahalanobis) ~90 features
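To make the feature taxonomy concrete, here is an illustrative per-pixel feature assembler covering a small subset of the ~90 features; the dictionary keys and the helper name are invented for this sketch, not taken from the paper:

```python
import colorsys
import math
import numpy as np

def pixel_features(x, y, rgb, face_center, mean_color):
    """Assemble a few example per-pixel features: static coordinates
    and color spaces, plus features adjusted to the pre-computed image
    characteristics (here only the Euclidean distance to the learned
    mean color)."""
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    dx, dy = x - face_center[0], y - face_center[1]
    return {
        "x": x, "y": y,                       # Cartesian coordinates
        "radius": math.hypot(dx, dy),         # polar coordinates,
        "angle": math.atan2(dy, dx),          #   relative to face center
        "r": r, "g": g, "b": b,               # RGB color
        "h": h, "s": s, "v": v,               # HSV color
        "color_dist": float(np.linalg.norm(   # adjusted feature:
            np.asarray(rgb, float)            #   Euclidean distance to
            - np.asarray(mean_color, float))),#   the learned mean color
    }

feats = pixel_features(10, 0, (255, 0, 0), (0, 0), (255, 0, 0))
```

All of these are computable per pixel without neighborhood operations, which is what keeps the representation quick.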

  10. Evaluation: Train various Classifiers • 4 classifiers for each facial component • C1: static features only • C2: adjusted color features only • C3: adjusted location features only • C4: all features

  11. Classifiers for Lips and Teeth

  12. Part 2:Model Fitting with Learned Objective Functions

  13. Model-based Image Interpretation • The model: contains a parameter vector that represents the model's configuration. • The objective function: calculates a value that indicates how accurately a parameterized model matches an image. • The fitting algorithm: searches for the model parameters that describe the image best, i.e. it minimizes the objective function.
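The three roles separate cleanly in code: the fitting algorithm only needs the objective as a black box. A hedged sketch using a simple coordinate-descent search, which stands in for whatever search strategy the fitting algorithm actually uses:

```python
def fit_model(objective, p0, step=0.1, iters=100):
    """Generic fitting algorithm: greedily perturb each model
    parameter and keep changes that lower the objective; halve the
    step size when no perturbation helps."""
    p = list(p0)
    best = objective(p)
    for _ in range(iters):
        improved = False
        for i in range(len(p)):
            for delta in (-step, step):
                q = list(p)
                q[i] += delta
                v = objective(q)
                if v < best:
                    p, best, improved = q, v, True
        if not improved:
            step /= 2.0
    return p, best

# toy objective with its minimum (the "best fit") at p = (1, 2)
obj = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 2.0) ** 2
p, val = fit_model(obj, [0.0, 0.0])
```

Because the search only ever queries the objective, swapping in a learned objective function (Part 2) changes nothing about the fitting algorithm itself.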

  14. Local Objective Functions

  15. Ideal Objective Functions P1: Correctness property: the global minimum corresponds to the best fit. P2: Uni-modality property: the objective function has no local extrema. (figure: example functions satisfying and violating P1 and P2) • Ideal objective functions don't exist for real-world images • Only for annotated images: fn( I , x ) = | cn − x |
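The ideal local objective fn(I, x) = |cn − x| is directly computable whenever the contour point cn has been annotated by hand; as a distance to a single point it trivially satisfies both P1 and P2. A one-function sketch:

```python
import math

def ideal_objective(annotated_point, x):
    """Ideal local objective f_n(I, x) = |c_n - x|: the distance of a
    candidate position x to the manually annotated contour point c_n.
    Its global minimum is at the annotation (P1) and it has no other
    local extrema (P2), but it is only computable for annotated
    images, so it serves to generate training data, not at run time."""
    return math.dist(annotated_point, x)
```

At run time no annotation exists, which is exactly why the next step learns an approximation from image features instead.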

  16. Learning the Objective Function • The ideal objective function generates training data • A machine learning technique generates calculation rules (figure: ideal objective function → training data → learned objective function)
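The pipeline above can be sketched end to end: displace positions around each annotated point, label them with the ideal objective's value, then fit a learner. The nearest-neighbor "learned objective" and the use of raw coordinates as inputs are illustrative simplifications; the paper learns calculation rules from image features:

```python
import math
import random

def generate_training_data(annotations, n_samples=20, max_disp=1.0, seed=0):
    """The ideal objective generates training pairs: sample positions x
    around each annotated point c and label them with |c - x|."""
    rng = random.Random(seed)
    data = []
    for c in annotations:
        for _ in range(n_samples):
            x = (c[0] + rng.uniform(-max_disp, max_disp),
                 c[1] + rng.uniform(-max_disp, max_disp))
            data.append((x, math.dist(c, x)))
    return data

class NearestNeighborObjective:
    """Minimal learned objective: return the label of the closest
    training sample (a stand-in for the learned calculation rules)."""

    def __init__(self, data):
        self.data = data

    def __call__(self, x):
        return min(self.data, key=lambda s: math.dist(s[0], x))[1]

data = generate_training_data([(0.0, 0.0)])
learned = NearestNeighborObjective(data)
```

The learned function can then be evaluated on unannotated images, since it depends only on its inputs, not on the annotation.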

  17. Benefits of the Machine Learning Approach • Accurate and robust calculation rules • Locally customized calculation rules • Generalization from many images • Simple job for the designer • Critical decisions are automated • No domain-dependent knowledge required • No loops

  18. Evaluation 1: Displacing the Correct Model statistics-based objective function ideal objective function learned objective function

  19. Evaluation 2: Selected Features contour point 116

  20. Conclusion • Crucial decisions within Computer Vision algorithms • Don't solve by trial and error → learn from training data • Example 1: learned classifiers for facial components • Example 2: learned objective functions

  21. Outlook • More features for learning objective functions • Higher number of features • Other kinds of features: SIFT, LBP, … • Learn with better classifiers • Relevance Vector Machines • Boosted regressors • Training images: render faces with an AAM • Exact ground truth (no manual work required) • Many images • Learn a global objective function • Learn rules to directly update model parameters

  22. Thank you! Online-Demonstration: http://www9.cs.tum.edu/people/wimmerm
