Lip Feature Extraction and Tracking: Model-Based Approach

Model-Based Facial Feature Extraction Multimodal Interaction Dr. Mike Spann m.spann@bham.ac.uk http://www.eee.bham.ac.uk/spannm

Contents • Introduction • Lip feature extraction and tracking • Summary

Lip feature extraction and tracking • Lip feature tracking is important for combining audio and visual cues in speech recognition systems • Typically the lip boundaries (inner, outer or both) are tracked and shape features are passed to the speech recognition module • Previous approaches • Active contour models (snakes) • Energy function minimisation is used to control contour shape (curvature) and attraction to the local greylevel (colour) gradient • Results can be dependent on weighting parameters which need to be tuned

Lip feature extraction and tracking • Typically an energy function E is defined in terms of the parameterised snake v(s)=(x(s),y(s)), where s is the distance along the snake, as the sum of three terms • The first two terms represent the snake’s internal energy and control its tension and rigidity • The third term attracts the snake to object boundaries with high greylevel gradient • Often an additional term is added for a ‘balloon’ snake to either inflate or deflate the snake
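The energy expression itself is not reproduced in this transcript. One widely used form that matches the three terms described above is sketched here. The weights α, β and γ are assumed names for the tuning parameters, I denotes the image greylevel, and the exact gradient term on the original slide may differ.

```latex
E = \int \Big( \alpha \,\lvert v'(s) \rvert^{2}
             + \beta \,\lvert v''(s) \rvert^{2}
             - \gamma \,\lvert \nabla I(v(s)) \rvert^{2} \Big)\, ds
```

In this form α penalises stretching (tension), β penalises bending (rigidity), and the negative sign on the last term lowers the energy where the greylevel gradient is strong.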

Lip feature extraction and tracking

Lip feature extraction and tracking • More recent approaches to lip localisation and tracking have been model-based • A statistical shape model of the inner and outer lip contours can be built from training data • Landmarks on the contour form pointsets, each a vector of landmark coordinates • We need to align the pointsets and then build a statistical model using PCA

Lip feature extraction and tracking • Pointsets of lip feature landmarks must be normalized for translation, scale and rotation • We can use a simple iterative algorithm to align to the mean pointset
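The slide does not spell out the alignment algorithm. A minimal sketch of one common choice, iterative Procrustes-style alignment to a running mean, is given here. The function names `align_to` and `align_pointsets`, the unit-norm scaling of the mean and the synthetic ellipse data are all assumptions made for illustration.

```python
import numpy as np

def align_to(shape, ref):
    """Similarity-align one (n, 2) pointset to a reference pointset."""
    a = shape - shape.mean(axis=0)            # remove translation
    r = ref - ref.mean(axis=0)
    # Best rotation from the SVD of the cross-covariance (orthogonal Procrustes)
    u, s, vt = np.linalg.svd(a.T @ r)
    if np.linalg.det(u @ vt) < 0:             # rule out reflections
        u[:, -1] *= -1
        s[-1] *= -1
    scale = s.sum() / (a ** 2).sum()          # least-squares scale
    return scale * a @ (u @ vt)

def align_pointsets(shapes, n_iter=10):
    """Iteratively align all pointsets to their (unit-norm) mean."""
    shapes = [s - s.mean(axis=0) for s in shapes]
    mean = shapes[0] / np.linalg.norm(shapes[0])   # initial reference
    for _ in range(n_iter):
        shapes = [align_to(s, mean) for s in shapes]
        new_mean = np.mean(shapes, axis=0)
        new_mean /= np.linalg.norm(new_mean)       # fix the scale of the mean
        done = np.allclose(new_mean, mean, atol=1e-10)
        mean = new_mean
        if done:
            break
    return np.array(shapes), mean

def similarity(p, scale, angle, shift):
    """Apply a known scale, rotation and translation to a pointset."""
    c, s = np.cos(angle), np.sin(angle)
    return scale * p @ np.array([[c, -s], [s, c]]).T + np.asarray(shift)

# Synthetic check: three copies of one ellipse under different poses
t = np.linspace(0, 2 * np.pi, 12, endpoint=False)
base = np.c_[2 * np.cos(t), np.sin(t)]
shapes = [base,
          similarity(base, 1.5, 0.3, (4, -2)),
          similarity(base, 0.7, -0.5, (-1, 3))]
aligned, mean_shape = align_pointsets(shapes)
```

Because the three synthetic pointsets differ only in pose, they coincide after alignment; real training pointsets would retain the residual shape variation that the PCA model then captures.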

Lip feature extraction and tracking • PCA is based on the mean and covariance of the pointset vectors computed across the training set • We then compute our shape model by solving the eigenvector/eigenvalue equation for the covariance matrix • Λ is the diagonal matrix of eigenvalues in that equation
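As a rough illustration of this step, the sketch below computes the mean and covariance of aligned pointset vectors and solves S P = P Λ with NumPy. The symbols S and P, the function name `build_shape_model`, the 95% variance cut-off and the synthetic data are assumptions, not taken from the slides.

```python
import numpy as np

def build_shape_model(pointsets, var_kept=0.95):
    """pointsets: (N, 2n) array of aligned landmark vectors."""
    X = np.asarray(pointsets, dtype=float)
    mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False)              # covariance, 1/(N-1) normalisation
    evals, evecs = np.linalg.eigh(S)         # solves S P = P Λ for symmetric S
    order = np.argsort(evals)[::-1]          # largest eigenvalue first
    evals = np.clip(evals[order], 0, None)   # remove tiny negative round-off
    evecs = evecs[:, order]
    # Keep the smallest number of modes explaining var_kept of the variance
    t = np.searchsorted(np.cumsum(evals) / evals.sum(), var_kept) + 1
    return mean, evecs[:, :t], evals[:t]

# Synthetic demo: 50 four-landmark pointsets varying mainly along direction d
rng = np.random.default_rng(0)
d = np.ones(8) / np.sqrt(8)
X = 3.0 * rng.normal(size=(50, 1)) * d + 0.01 * rng.normal(size=(50, 8))
x_mean, P, lam = build_shape_model(X)
```

Truncating to the leading eigenvectors keeps the main modes of shape variation and discards directions dominated by landmarking noise.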

Lip feature extraction and tracking • We can represent each landmark pointset x by a corresponding shape vector b • The set of bi’s across all of the pointsets in the database represents the ith mode of variation of the original data • We can vary each bi to get realistic versions of lip shapes • Typically the range of each bi is limited according to the ith eigenvalue λi
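A small sketch of how a pointset maps to a shape vector and back, assuming the usual linear PCA model x ≈ mean + P b with orthonormal columns in P. The limit of ±3√λi used here is a common convention and an assumption; the limit on the original slide is not preserved in this transcript. The toy model values are illustrative only.

```python
import numpy as np

def shape_params(x, mean, P):
    """Project a pointset vector onto the shape modes: b = P^T (x - mean)."""
    return P.T @ (x - mean)

def clamp_params(b, evals, k=3.0):
    """Keep each b_i within +/- k * sqrt(lambda_i) (k = 3 is an assumption)."""
    lim = k * np.sqrt(evals)
    return np.clip(b, -lim, lim)

def reconstruct(b, mean, P):
    """Generate a pointset vector from shape parameters: x = mean + P b."""
    return mean + P @ b

# Toy model: two modes along the first two coordinate axes
mean = np.zeros(4)
P = np.eye(4)[:, :2]
evals = np.array([4.0, 1.0])

x = np.array([10.0, 1.0, 0.5, 0.0])
b = shape_params(x, mean, P)              # [10, 1]
b_ok = clamp_params(b, evals)             # first mode limited to 3 * sqrt(4) = 6
x_plausible = reconstruct(b_ok, mean, P)  # [6, 1, 0, 0]
```

Clamping the parameters is what keeps generated or fitted shapes within the range seen in training, at the cost of not reproducing the input exactly.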

Lip feature extraction and tracking • An active shape model samples greylevels perpendicular to the lip contour and centred at the model points

Lip feature extraction and tracking • We sample the profiles perpendicular to the contour at each model point j • Training image i then gives us a vector of greylevels gij • We concatenate all these greylevel vectors to give us a global profile vector hi • We build a statistical model out of these profile vectors to enable the main modes of variation of the profiles about the model boundaries to be computed
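The profile sampling can be sketched as follows, assuming a closed contour stored as an (n, 2) array of (x, y) landmarks, normals estimated from neighbouring landmarks and nearest-neighbour pixel lookup. The function names, the profile half-length and the synthetic disc image are hypothetical choices for illustration.

```python
import numpy as np

def contour_normals(points):
    """Unit normals of a closed contour given as an (n, 2) array of (x, y)."""
    tangent = np.roll(points, -1, axis=0) - np.roll(points, 1, axis=0)
    normal = np.c_[tangent[:, 1], -tangent[:, 0]]
    return normal / np.linalg.norm(normal, axis=1, keepdims=True)

def sample_profiles(image, points, half_len=3):
    """Global profile vector: greylevels along the normal at every landmark."""
    offsets = np.arange(-half_len, half_len + 1)
    normals = contour_normals(points)
    # (n, k, 2) sample coordinates centred on each model point
    coords = points[:, None, :] + offsets[None, :, None] * normals[:, None, :]
    cols = np.clip(np.rint(coords[..., 0]).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(coords[..., 1]).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols].ravel()        # concatenate the local profiles

# Synthetic image: a bright disc of radius 15 on a dark background
yy, xx = np.mgrid[0:64, 0:64]
image = ((xx - 32) ** 2 + (yy - 32) ** 2 <= 15 ** 2).astype(float)

# Landmarks placed on the disc boundary
t = np.linspace(0, 2 * np.pi, 16, endpoint=False)
points = np.c_[32 + 15 * np.cos(t), 32 + 15 * np.sin(t)]

h = sample_profiles(image, points)
profiles = h.reshape(len(points), -1)       # one row per landmark
```

Each row of `profiles` runs from inside the disc to outside it, so the step in greylevel sits at the centre of every profile when the landmarks lie on the boundary.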

Lip feature extraction and tracking • The weight vectors bh can be used as a parameter in a cost function to determine how well the actual profile fits the model

Lip feature extraction and tracking • The greylevels between profile vectors can be interpolated to visualise the greylevel models • Some smoothing using a median filter helps remove any artefacts of the interpolation • We can visualise several modes corresponding to the first few eigenvectors • The corresponding components of the weight vector bh can be varied according to: • For example we can set bhi to ±2√λi for i=1,2,3

Lip feature extraction and tracking • Mode 1 • Global illumination differences • Mode 2 • Lower/Upper lip intensity difference • Mode 3 • Skin/lip contrast differences • Higher modes • Illumination variations, visibility of teeth and tongue etc

Lip feature extraction and tracking • In order to apply an ASM search algorithm, a coarse estimate of the region of interest containing the lips is found • Can be input interactively or computed automatically using a segmentation or edge-based feature extraction algorithm • Provides an estimate of the scale of the lips • Limits the search area

Lip feature extraction and tracking • To use the greylevel and shape models in a search algorithm, we fit the model greylevel profile to the greylevel profile currently measured in the image • Shape and pose parameters can then be updated • We need a cost function which describes the fit between the model greylevel profile and the profile measured in the image at the current model position • Several statistical approaches are possible • Maximizing the probability assuming Gaussian distributions • Minimizing the mean square error between the profiles

Figure: sample profile h at the current model position

Lip feature extraction and tracking • We can define an error function E measuring the mismatch between the actual profile h measured at the current position estimate and our model profile hm • Substituting for hm expresses E in terms of the profile model parameters • Typically hm would comprise only the first few modes of variation
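The formula for E is not preserved in this transcript. The sketch below assumes the sum-of-squares mismatch (one of the options listed earlier) and a linear profile model hm = h_mean + P_h bh with orthonormal columns in P_h. The names `h_mean`, `P_h` and `profile_fit_error` are assumptions for illustration.

```python
import numpy as np

def profile_fit_error(h, h_mean, P_h):
    """Squared mismatch between a sampled profile h and its best model profile."""
    b_h = P_h.T @ (h - h_mean)       # weights of the retained profile modes
    h_m = h_mean + P_h @ b_h         # model profile built from those modes
    residual = h - h_m
    return float(residual @ residual), b_h

# Toy profile model with a single mode along the first coordinate
h_mean = np.zeros(3)
P_h = np.array([[1.0], [0.0], [0.0]])

h = np.array([2.0, 3.0, 4.0])
E, b_h = profile_fit_error(h, h_mean, P_h)   # residual (0, 3, 4), so E = 25
```

With only the first few modes retained, E measures the part of the sampled profile that the model cannot explain, which is small when the landmarks sit on the lip boundary.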

Lip feature extraction and tracking • The model is initialized with the mean shape computed over aligned shapes in the training set • Our goal is to minimize our energy function E in terms of the translation parameters tx and ty, a scale parameter s and a rotation angle θ, along with the profile parameter vector

Lip feature extraction and tracking • Optimization is carried out by perturbing individual parameters and evaluating their effects on the energy function E • Only a few shape modes (typically 10-20) are used in the search to ease the computational burden • Perturbations in each bi are kept within fixed limits • For a given position of the model landmarks, the profile h is sampled and bh is computed from it

Lip feature extraction and tracking • We can devise an iterative algorithm to update the pose and shape parameters sequentially based on our error measure • The algorithm alternates between ‘model space’ and ‘image space’ • The object boundary in model space is defined by the shape parameters • We can use the greylevel or colour profile information to measure the error in image space • Conversion between the two spaces is done via the pose parameters

Lip feature extraction and tracking • Figure: model space (parameters b) and image space (parameters bh)

Lip feature extraction and tracking 1. Initialize the shape parameters b to zero and the image points y 2. Generate the model point positions x from the current shape parameters b 3. Find the pose parameters tx, ty, s, θ which best fit the model points to the image points y 4. Project the model points into the image frame, x->T(x) 5. Compute the image profile vector h and, at each projected model point, search normal to the model boundary to find the image points y’ which minimize E, producing a new image profile vector h’

Lip feature extraction and tracking 6. Project the image points y’ into the model coordinate frame by inverting the transformation T 7. Update the model parameters 8. If not converged, y->y’ and go to step 2
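The loop above can be sketched as follows, under several assumptions: pointset vectors are stored as (x1, y1, ..., xn, yn), the pose T is a similarity transform fitted by least squares, the shape parameters are limited to ±3√λi, and the image search of steps 4-5 is supplied as a hypothetical callable `find_image_points`. When that callable is omitted the target points stay fixed, which reduces the loop to fitting the model to a known pointset.

```python
import numpy as np

def fit_pose(x, y):
    """Least-squares similarity transform T(p) = A p + t mapping x onto y."""
    xc, yc = x.mean(axis=0), y.mean(axis=0)
    x0, y0 = x - xc, y - yc
    denom = (x0 ** 2).sum()
    a = (x0 * y0).sum() / denom                                   # s cos(theta)
    b = (x0[:, 0] * y0[:, 1] - x0[:, 1] * y0[:, 0]).sum() / denom  # s sin(theta)
    A = np.array([[a, -b], [b, a]])
    return A, yc - A @ xc

def asm_fit(mean, P, evals, y, find_image_points=None, n_iter=20, k=3.0):
    n = mean.size // 2
    b = np.zeros(P.shape[1])                          # step 1
    lim = k * np.sqrt(evals)
    for _ in range(n_iter):
        x = (mean + P @ b).reshape(n, 2)              # step 2
        A, t = fit_pose(x, y)                         # step 3
        if find_image_points is not None:             # steps 4-5 (hypothetical)
            y_new = find_image_points(x @ A.T + t)
        else:
            y_new = y
        x_model = (y_new - t) @ np.linalg.inv(A).T    # step 6: invert T
        b_new = np.clip(P.T @ (x_model.ravel() - mean), -lim, lim)  # step 7
        converged = np.allclose(b_new, b, atol=1e-8)
        b, y = b_new, y_new                           # step 8
        if converged:
            break
    return b, A, t

# Toy model: a diamond whose single mode stretches x and squeezes y
mean = np.array([1.0, 0, 0, 1, -1, 0, 0, -1])
P = np.array([[1.0, 0, 0, -1, -1, 0, 0, 1]]).T / 2
evals = np.array([1.0])

# Target points generated from known shape and pose parameters
A_true = 2 * np.array([[np.cos(0.4), -np.sin(0.4)], [np.sin(0.4), np.cos(0.4)]])
t_true = np.array([5.0, -3.0])
y = (mean + P @ np.array([0.8])).reshape(4, 2) @ A_true.T + t_true

b_fit, A_fit, t_fit = asm_fit(mean, P, evals, y)
```

In a real tracker `find_image_points` would move each projected model point along its normal to the position that minimizes the profile error E, so the target points change from one iteration to the next.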

Lip feature extraction and tracking • Figure labels: model point, nearest image point to model point, image boundary

Lip feature extraction and tracking • It is easier to track the outer lip contour than the inner one • More constant greylevel profile • Easier to model, for example with active shape modelling • But less appropriate for lip gesture recognition and speech recognition algorithms • Using a full appearance model rather than just a shape model often gives better speech recognition performance • For example, the appearance of the teeth and tongue gives clues to particular types of vocal sounds

Lip feature extraction and tracking • Results of off-centre initialization of the ASM using local greylevel profiles after 5, 10 and 20 iterations

Lip feature extraction and tracking • Results using ASM search with local greylevel profiles

Lip feature extraction and tracking • Demo • http://www.ee.surrey.ac.uk/Projects/M2VTS/experiments/lip_tracking/index.html

Summary • We have looked at how a shape model, together with a model describing greylevel or colour variation local to the shape model landmark positions, can be used to find the lip contour location in face images • We have described an iterative model-based search algorithm for lip contour location • We have shown lip tracking results based on this algorithm
