Challenges in Learning the Appearance of Faces for Automated Image Analysis: part I

Alessandro Verri • DISI – Università di Genova • verri@disi.unige.it

actually, i’m gonna talk about: • brief introduction (the whole thing) • what some people do for detecting faces • what we are doing

the problem(s) • geometry (position, rotation, pose, scale,…) • facial features (beards, glasses,…) • facial expressions • occlusions • imaging conditions (illumination, resolution, color contrast, camera quality,…)

where we are: face detection • we address face detection as a brute force classification problem (sometimes sophisticated, but still brute force) • the model is encoded in the training samples but not explicitly defined

face recognition and the like • explicit image models are derived from examples, separating identity and imaging parameters

motivation • we want to explore who should learn from whom… we come back to this at the end!

some approaches • knowledge-based (Yang & Huang, 94) • feature invariant (Leung et al, 95; Yow & Cipolla, 97) • template matching (Lanitis et al, 95) • appearance based: eigenfaces (Turk & Pentland, 91), SVM (Osuna et al, 97), naive Bayes (Schneiderman & Kanade, 98), AdaBoost (Viola & Jones, 01)

SVM: global detector (Poggio’s group) • some preprocessing essential (equalization and normalization) • polynomial SVM applied to pixels • training set: about 2,500 face images (58x58 pixels) and about 10,000 non-face images (extended to 13,000)

SVM: component-based detector (Poggio’s group) • some preprocessing essential (equalization and normalization) • two-level system (always linear SVMs): 14 component classifiers (eyes, nose,…) followed by a geometrical configuration classifier based on maximal outputs

global vs component-based • component-based performs better (more robust to pose variation and/or occlusion) • global a little faster (though they are both pretty slow, too many patches have to be stored!)

naive Bayes (Kanade’s group) • multiple detectors for different views (size and orientation) • for each view: statistical modeling using predefined attribute histograms (17), about 2,000 face examples • independence is required… • very good for out-of-plane rotation, but the procedure for building the histograms is involved (bootstrap, AdaBoost…)

AdaBoost (Viola & Jones) • wavelet-like features (computed efficiently) • features are selected through AdaBoost (each weak classifier depends on a single feature) • detection is obtained by training a cascade of classifiers • very fast and effective on frontal faces
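The "computed efficiently" remark refers to the integral image trick used by Viola and Jones: once a table of cumulative sums is built, the sum over any rectangle costs four lookups. The sketch below is only an illustration of that idea in plain Python; the function names and the two-rectangle layout are assumptions made here, not code from the talk.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) table with ii[y][x] = sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Wavelet-like feature: left half minus right half (hypothetical layout)."""
    half = width // 2
    return (box_sum(ii, top, left, height, half)
            - box_sum(ii, top, left + half, height, half))
```

The table is built once per image, so every feature at every location and scale costs a constant number of operations.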

summing up • SVM: components based on prior knowledge, simple, very good results but rather slow (optimization approaches…) • naive Bayes: robust against rotation, prior knowledge on feature selection, rather hand-crafted statistical analysis, many models need to be learned (each with many examples) • AdaBoost: data-driven feature selection, fast, frontal faces only

what we do • we assume we are given a fair number of positive examples only (no negatives) • we want to explore how far one can get by combining fully data driven techniques based on 1D data • we look at old-fashioned hypothesis testing (false negative rate under full control)

one possible way to object detection • building models can be expensive (domain dependent) • learning from positive examples only is more difficult, but… • classical hypothesis testing controls the false negative rate naturally

testing hypotheses • HT with only one observation • testing for independence with rank test (seems appropriate for comparing different features)
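A hypothesis test with a single observation can be sketched as follows, assuming the null hypothesis is "this patch is a face" and that the distribution of one feature under the null is estimated from the positive training examples alone. The acceptance region used here (a central empirical interval) and the names `acceptance_interval` and `passes` are illustrative assumptions; the talk does not specify how the region is built.

```python
def acceptance_interval(positive_values, alpha):
    """Central interval holding roughly (1 - alpha) of the training values.

    alpha is the false negative rate we are willing to accept for this test.
    """
    xs = sorted(positive_values)
    n = len(xs)
    k = int(n * alpha / 2)  # number of samples trimmed from each tail
    return xs[k], xs[n - 1 - k]


def passes(value, interval):
    """Accept the null hypothesis if the observed value lies in the interval."""
    lo, hi = interval
    return lo <= value <= hi
```

Because the interval is fixed from positive examples only, the fraction of faces rejected by the test is set by `alpha` and no negative examples are needed.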

CBCL database • faces (19x19 pixels): training 2,429, test 472 • non-faces (19x19 pixels): training 4,548, test 23,573

training by hypothesis testing • we first compute a large number of features (for the moment about 16,000) on the training set images • a subset of good features (about 1,000) is then selected • of these, a subset of independent features is considered (ending up with about 100) • multiple statistical tests are then constructed using the training set (one test for each feature)

image measurements • grey values at fixed locations (19 x 19) • tomographies (19 vertical + 19 horizontal + 37 along the 45° diagonals) • ranklets (5,184 vertical, 5,184 horizontal, 5,184 diagonal) • a total of about 16,000 features
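The feature count can be checked with a short sketch. It assumes that a tomography is the sum of the grey values along one line of the image (a column, a row, or a diagonal); the slides do not give the exact definition or the diagonal direction, so both are assumptions here, as is the function name.

```python
def tomographies(img):
    """Line sums of a square n x n image: n columns, n rows, 2n - 1 diagonals."""
    n = len(img)
    vertical = [sum(img[y][x] for y in range(n)) for x in range(n)]
    horizontal = [sum(img[y]) for y in range(n)]
    # Diagonals are indexed by the offset d = x - y, from -(n - 1) to n - 1.
    diagonal = [
        sum(img[y][y + d] for y in range(n) if 0 <= y + d < n)
        for d in range(-(n - 1), n)
    ]
    return vertical, horizontal, diagonal


# Feature budget quoted on the slide for 19 x 19 images.
n_grey = 19 * 19               # 361 grey values
n_tomo = 19 + 19 + 37          # 75 tomographies
n_ranklets = 3 * 5184          # 15,552 ranklets
total_features = n_grey + n_tomo + n_ranklets  # 15,988, i.e. about 16,000
```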

ranklets (Smeraldi, 2002)
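As a rough illustration, a ranklet compares the ranks of the pixels in two halves of a window, in the spirit of a Haar wavelet but using the Wilcoxon/Mann-Whitney rank statistic instead of grey-value sums. The sketch below normalizes the statistic to [-1, 1]. Which half counts as "treatment", the handling of odd widths, and the function names are assumptions made for this example.

```python
def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def ranklet(treatment, control):
    """Mann-Whitney statistic of treatment vs control, rescaled to [-1, 1]."""
    n_t, n_c = len(treatment), len(control)
    ranks = midranks(list(treatment) + list(control))
    rank_sum = sum(ranks[:n_t])
    u = rank_sum - n_t * (n_t + 1) / 2  # pairs where treatment > control
    return 2 * u / (n_t * n_c) - 1


def vertical_ranklet(window):
    """Right half as treatment, left half as control (assumed convention).

    For odd widths the last column is ignored so both halves have equal size.
    """
    half = len(window[0]) // 2
    control = [p for row in window for p in row[:half]]
    treatment = [p for row in window for p in row[half:2 * half]]
    return ranklet(treatment, control)
```

Since only ranks enter the computation, the value is unchanged by any monotonic transformation of the grey levels, for example a change of contrast.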

vertical ranklets (variance-to-natural support ratio)

a salient and a non-salient feature • we discard all features for which the ratio falls below the threshold (0.15) • this leaves us with about 2,000 features

independent feature selection • we run independence tests on all possible pairs of salient features of the same category • we build a complete graph for each category with as many nodes as features • an edge between two features is deleted if Spearman’s test rejects the independence hypothesis for that pair with probability t
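A pairwise independence check based on Spearman's rank correlation might look like the sketch below. The correlation itself is standard (Pearson correlation of the ranks); the permutation-based p-value is a technique swapped in here to keep the example dependency-free, since the slides do not say how the test is calibrated. All names and defaults are illustrative assumptions.

```python
import random


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def spearman_rho(x, y):
    return pearson(midranks(x), midranks(y))


def spearman_pvalue(x, y, n_perm=500, seed=0):
    """Two-sided permutation p-value for the hypothesis of independence."""
    rng = random.Random(seed)
    rx, ry = midranks(x), midranks(y)
    observed = abs(pearson(rx, ry))
    hits = 0
    for _ in range(n_perm):
        shuffled = ry[:]
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Here `x` and `y` would hold the values of two features of the same category measured over the training faces; a small p-value is evidence against independence for that pair.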

independent feature selection • we then search the graph for maximally complete subgraphs (cliques), which we regard as sets of independent features • for t = 0.5 we are left with 44 vertical, 64 horizontal, 35 diagonal ranklets, and 38 tomographies
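The graph step can be sketched with the classic Bron-Kerbosch enumeration of maximal cliques. This is one standard way to search for cliques, not necessarily the procedure used in the talk; the helper names and the toy feature labels are assumptions.

```python
def independence_graph(nodes, dependent_pairs):
    """Start from the complete graph and delete edges between dependent pairs."""
    adj = {u: set(nodes) - {u} for u in nodes}
    for u, v in dependent_pairs:
        adj[u].discard(v)
        adj[v].discard(u)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy example: four features, two pairs found to be dependent.
graph = independence_graph(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
independent_sets = sorted(maximal_cliques(graph))
```

Every pair inside a clique kept its edge, so no pairwise test rejected independence within the set. The enumeration is exponential in the worst case, which may matter for graphs with many nodes.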

testing • for all image locations, all applicable scales and a fixed number t • compute the values of the good, independent features • perform multiple statistical tests at a certain confidence level • a positive example is located if t tests are passed
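The detection loop above can be sketched as follows, assuming each test is an acceptance interval on one feature and reading "t tests are passed" as "at least t". The function names and the window size default are illustrative; scanning over scales would repeat the same loop on resized images.

```python
def window_positions(height, width, size=19, step=1):
    """Top-left corners of all size x size windows inside the image."""
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            yield top, left


def count_passed(feature_values, intervals):
    """Number of per-feature tests whose acceptance interval holds the value."""
    return sum(lo <= v <= hi for v, (lo, hi) in zip(feature_values, intervals))


def is_face(feature_values, intervals, t):
    """Declare a detection when at least t of the tests are passed."""
    return count_passed(feature_values, intervals) >= t
```

Here `feature_values` would hold the selected good, independent features computed on one window, and `intervals` the matching acceptance regions built from the training set.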

multiple tests • we run multiple tests living with the fact that we won’t detect a certain fraction of the objects we want to find • luckily we are in a position to decide the fraction beforehand • we gain power because each test looks at a new feature
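Why the missed fraction can be fixed beforehand: if the n tests are independent and each rejects a true face with probability alpha, the number of tests a face passes is binomial, so the detection rate for "at least t of n" is a binomial tail. The sketch below computes it; the independence of the tests is the working assumption suggested by the feature selection step, and the function name is hypothetical.

```python
from math import comb


def detection_rate(n_tests, t, alpha):
    """P(a true face passes at least t of n_tests independent tests).

    Each test is passed by a true face with probability 1 - alpha.
    The false negative rate of the combined detector is 1 minus this value.
    """
    p = 1.0 - alpha
    return sum(
        comb(n_tests, k) * p ** k * alpha ** (n_tests - k)
        for k in range(t, n_tests + 1)
    )
```

For example, `detection_rate(2, 1, 0.5)` is 0.75: with two coin-flip tests, a face is missed only when both fail.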

some results (Franceschi et al, 2004) • 472 positives vs 23,573 negatives • [plot; legend: tomographies + ranklets, overlapping features, randomly chosen features]

once you have detected a face… ask Thomas
