Challenges in Learning the Appearance of Faces for Automated Image Analysis: Part I
Alessandro Verri
DISI – Università di Genova
verri@disi.unige.it
actually, I'm going to talk about: • a brief introduction (the whole thing) • what some people do for detecting faces • what we are doing
the problem(s) • geometry (position, rotation, pose, scale,…) • facial features (beards, glasses,…) • facial expressions • occlusions • imaging conditions (illumination, resolution, color contrast, camera quality,…)
where we are: face detection • we address face detection as a brute-force classification problem (sometimes sophisticated, but still brute force) • the model is encoded in the training samples, not explicitly defined
face recognition and the like • explicit image models are derived from examples, separating identity from imaging parameters
motivation we want to explore who should learn from whom… we come back to this at the end!
some approaches • knowledge-based (Yang & Huang, 94) • feature invariant (Leung et al., 95; Yow & Cipolla, 97) • template matching (Lanitis et al., 95) • appearance based • eigenfaces (Turk & Pentland, 91) • SVM (Osuna et al., 97) • naive Bayes (Schneiderman & Kanade, 98) • AdaBoost (Viola & Jones, 01)
SVM: global detector (Poggio's group) • some preprocessing is essential (equalization and normalization) • polynomial SVM applied to raw pixels • training set: • about 2,500 face images (58x58 pixels) • about 10,000 non-face images (extended to 13,000)
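The global detector above can be sketched with a modern library. This is a hedged illustration only: scikit-learn's `SVC` stands in for the original implementation, and the "patches" here are random stand-in vectors, not real face images; the 58x58 size and degree-2 kernel follow the spirit of the slide, not its exact settings.

```python
# Illustrative sketch of a polynomial-kernel SVM on flattened pixel vectors;
# data are synthetic stand-ins, not the CMU/MIT-style training sets.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# fake 58x58 "patches" flattened to vectors; real training would use
# ~2,500 face and ~13,000 non-face images after equalization/normalization
faces = rng.normal(0.6, 0.1, size=(50, 58 * 58))
nonfaces = rng.normal(0.4, 0.1, size=(50, 58 * 58))
X = np.vstack([faces, nonfaces])
y = np.array([1] * 50 + [0] * 50)

clf = SVC(kernel="poly", degree=2)  # second-degree polynomial kernel
clf.fit(X, y)
print(clf.score(X, y))              # training accuracy on the toy data
```

At test time the classifier is slid over every image position and scale, which is why the slide calls the approach brute force.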
SVM: component-based detector (Poggio's group) • some preprocessing is essential (equalization and normalization) • two-level system (always linear SVMs): • component classifiers (14: eyes, nose,…) • a geometrical-configuration classifier based on the maximal component outputs
global vs component-based • component-based performs better (more robust to pose variation and/or occlusion) • global a little faster (though they are both pretty slow, too many patches have to be stored!)
naive Bayes (Kanade's group) • multiple detectors for different views (size and orientation) • for each view: statistical modeling using predefined attribute histograms (17), about 2,000 face examples • independence is required… • very good for out-of-plane rotation, but an involved procedure for building the histograms (bootstrap, AdaBoost,…)
AdaBoost (Viola & Jones) • wavelet like features (computed efficiently) • feature selected through AdaBoost (each weak classifier depends on a single feature) • detection is obtained by training a cascade of classifiers • very fast and effective on frontal faces
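The "computed efficiently" claim on the slide rests on the integral image (summed-area table): any rectangle sum, and hence any wavelet-like feature, costs four lookups. A minimal sketch, with a made-up toy image and feature geometry:

```python
# Integral-image trick behind the Viola-Jones features (illustrative).

def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top..bottom][left..right] in O(1) via 4 lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total

def two_rect_feature(ii, top, left, h, w):
    """Haar-like feature: left half minus right half (w must be even)."""
    mid = left + w // 2
    left_sum = rect_sum(ii, top, left, top + h - 1, mid - 1)
    right_sum = rect_sum(ii, top, mid, top + h - 1, left + w - 1)
    return left_sum - right_sum

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 3))        # 78 (sum of the whole image)
print(two_rect_feature(ii, 0, 0, 3, 4))  # -12 (left two columns minus right two)
```

AdaBoost then picks one such feature per weak classifier, and the cascade rejects most non-face windows after evaluating only a handful of them.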
summing up • SVM: components based on prior knowledge, simple, very good results but rather slow (optimization approaches…) • naive Bayes: robust against rotation, prior knowledge on feature selection, rather hand-crafted statistical analysis, many models need to be learned (each with many examples) • AdaBoost: data-driven feature selection, fast, frontal faces only
what we do • we assume we are given a fair number of positive examples only (no negatives) • we want to explore how far one can get by combining fully data driven techniques based on 1D data • we look at old-fashioned hypothesis testing (false negative rate under full control)
one possible way to object detection • building models can be expensive (domain dependent) • learning from positive examples only is more difficult, but… • classical hypothesis testing controls the false negative rate naturally
testing hypotheses • hypothesis testing with only one observation • testing for independence with a rank test (which seems appropriate for comparing different features)
CBCL database • faces (19x19 pixels): training 2,429, test 472 • non-faces (19x19 pixels): training 4,548, test 23,573
training by hypothesis testing • we first compute a large number of features (for the moment about 16,000) on the training set images • a subset of good features (about 1,000) is then selected • of these, a subset of independent features is considered (ending up with about 100) • multiple statistical tests are then constructed using the training set (one test for each feature)
image measurements • grey values at fixed locations (19 x 19) • tomographies (19 vertical + 19 horizontal + 37 along the 45° diagonals) • ranklets (5,184 vertical, 5,184 horizontal, 5,184 diagonal) • a total of about 16,000 features
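The tomography features are projections of the patch: sums of grey values along rows, columns, and diagonals. A minimal sketch on a tiny patch; the exact projection set used in the work is an assumption here, but on a 19x19 patch the same pattern yields the 19 + 19 + 37(+37) counts on the slide:

```python
# Projection ("tomography") features: row, column, and diagonal sums.

def tomographies(patch):
    n = len(patch)
    vertical = [sum(patch[y][x] for y in range(n)) for x in range(n)]  # n column sums
    horizontal = [sum(row) for row in patch]                           # n row sums
    diag = [0] * (2 * n - 1)   # sums along lines of constant x + y
    anti = [0] * (2 * n - 1)   # sums along lines of constant x - y
    for y in range(n):
        for x in range(n):
            diag[x + y] += patch[y][x]
            anti[x - y + n - 1] += patch[y][x]
    return vertical + horizontal + diag + anti

patch = [[1, 2],
         [3, 4]]
feats = tomographies(patch)
print(feats)  # [4, 6, 3, 7, 1, 5, 4, 3, 5, 2]
```

Each feature is a 1D quantity, which is what makes the single-observation hypothesis tests of the previous slide applicable.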
a salient and a non-salient feature • we discard all features for which the ratio falls below the threshold s = 0.15 (this leaves us with about 2,000 features)
independent feature selection • we run independence tests on all possible pairs of salient features of the same category • we build a complete graph for each category with as many nodes as features; an edge between two features is deleted if Spearman's test rejects the independence hypothesis for the pair with probability t
independent feature selection • we then search the graph for maximally complete subgraphs (cliques), which we regard as sets of independent features • for t = 0.5 we are left with 44 vertical, 64 horizontal, and 35 diagonal ranklets, and 38 tomographies
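The graph construction above can be sketched as follows. This is a simplification under stated assumptions: a plain |rho| threshold on Spearman's rank correlation stands in for the full significance test, and a greedy pass stands in for exact maximal-clique search:

```python
# Sketch of independent-feature selection via Spearman rank correlation;
# the threshold rho_max is an illustrative stand-in for the rank test.

def rankdata(xs):
    """Ranks (1-based), with ties sharing the average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation of the ranks."""
    rx, ry = rankdata(xs), rankdata(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den else 0.0

def independent_subset(features, rho_max=0.5):
    """Greedy clique: keep a feature only if weakly correlated with kept ones."""
    keep = []
    for i, f in enumerate(features):
        if all(abs(spearman_rho(f, features[j])) <= rho_max for j in keep):
            keep.append(i)
    return keep

# f0 and f1 are monotonically related (rho = 1); f2 is not
f0 = [1, 2, 3, 4, 5, 6]
f1 = [2, 4, 6, 8, 10, 12]
f2 = [3, 1, 4, 1, 5, 2]
print(independent_subset([f0, f1, f2]))  # [0, 2] -- f1 dropped
```

Keeping only mutually independent features matters for the next step: the power of combining many tests rests on each test bringing new information.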
testing • for all image locations, all applicable scales and a fixed number t • compute the values of the good, independent features • perform multiple statistical tests at a certain confidence level • a positive example is located if t tests are passed
multiple tests • we run multiple tests living with the fact that we won’t detect a certain fraction of the objects we want to find • luckily we are in a position to decide the fraction beforehand • we gain power because each test looks at a new feature
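The testing scheme of the last two slides can be sketched as follows. A hedged illustration under assumptions not spelled out on the slides: each per-feature test keeps the central 1 - alpha mass of the positive training values, and a window is accepted if at least t tests pass; names and toy data are made up:

```python
# Sketch of the multiple-test detector trained on positive examples only.
import random

def acceptance_interval(values, alpha=0.05):
    """Empirical interval containing the central 1 - alpha mass."""
    s = sorted(values)
    lo = s[int(len(s) * alpha / 2)]
    hi = s[min(len(s) - 1, int(len(s) * (1 - alpha / 2)))]
    return lo, hi

def train_tests(positive_features, alpha=0.05):
    """positive_features[i][k] = value of feature k on positive example i."""
    n_feats = len(positive_features[0])
    return [acceptance_interval([ex[k] for ex in positive_features], alpha)
            for k in range(n_feats)]

def is_face(candidate, intervals, t):
    """Accept if at least t of the per-feature tests pass."""
    passed = sum(lo <= v <= hi for v, (lo, hi) in zip(candidate, intervals))
    return passed >= t

# toy positives: 3 features clustered around (10, 20, 30)
random.seed(0)
positives = [[10 + random.gauss(0, 1),
              20 + random.gauss(0, 1),
              30 + random.gauss(0, 1)] for _ in range(200)]
intervals = train_tests(positives, alpha=0.1)
print(is_face([10.1, 19.8, 30.2], intervals, t=3))  # face-like window
print(is_face([0.0, 5.0, 90.0], intervals, t=3))    # far from the model
```

The false-negative rate is what alpha and t control directly, which is the "decide the fraction beforehand" point above; no negative examples enter the training at any stage.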
some results (Franceschi et al., 2004) • 472 positives vs. 23,573 negatives • tomographies + ranklets • overlapping features • randomly chosen features
once you have detected a face… ask Thomas