
Challenges in Learning the Appearance of Faces for Automated Image Analysis: part I


Presentation Transcript


  1. Challenges in Learning the Appearance of Faces for Automated Image Analysis: part I alessandro verri, DISI – università di genova, verri@disi.unige.it

  2. actually, i’m gonna talk about: • brief introduction (the whole thing) • what some people do for detecting faces • what we are doing

  3. the problem(s) • geometry (position, rotation, pose, scale,…) • facial features (beards, glasses,…) • facial expressions • occlusions • imaging conditions (illumination, resolution, color contrast, camera quality,…)

  4. where we are: face detection • we address face detection as a brute-force classification problem (sometimes sophisticated, but still brute force) • the model is encoded in the training samples but not explicitly defined

  5. face recognition and the like • explicit image models are derived from examples, separating identity and imaging parameters

  6. motivation we want to explore who should learn from whom… we come back to this at the end!

  7. some approaches • knowledge-based (Yang & Huang, 94) • feature invariant (Leung et al, 95; Yow & Cipolla, 97) • template matching (Lanitis et al, 95) • appearance based • eigenfaces (Turk & Pentland, 91) • SVM (Osuna et al, 97) • naive bayes (Schneiderman & Kanade, 98) • AdaBoost (Viola & Jones, 01)

  8. SVM: global detector (Poggio's group) • some preprocessing essential (equalization and normalization) • polynomial SVM applied to pixels • training set: • about 2,500 face images (58x58 pixels) • about 10,000 non-face images (extended to 13,000)
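
Under the hood, such a global detector is just a kernel machine on raw pixel values. A minimal sketch with scikit-learn, using random stand-in data and an assumed degree-2 polynomial kernel (the slides give neither the degree nor the exact preprocessing):

    import numpy as np
    from sklearn.svm import SVC

    # hypothetical data: rows are flattened, equalized/normalized 58x58 patches
    X_train = np.random.rand(200, 58 * 58)   # stand-in for real face/non-face patches
    y_train = np.random.randint(0, 2, 200)   # 1 = face, 0 = non-face

    # polynomial-kernel SVM applied directly to pixels
    clf = SVC(kernel="poly", degree=2)       # degree 2 is an assumption, not from the slides
    clf.fit(X_train, y_train)

    # classify a new 58x58 patch
    patch = np.random.rand(58 * 58)
    label = clf.predict(patch.reshape(1, -1))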

  9. SVM: component-based detector (Poggio's group) • some preprocessing essential (equalization and normalization) • two-level system (always linear SVMs): • component classifiers (14: eyes, nose,…) • geometrical configuration classifier based on maximal outputs

  10. global vs component-based • component-based performs better (more robust to pose variation and/or occlusion) • global a little faster (though they are both pretty slow, too many patches have to be stored!)

  11. naive bayes (Kanade's group) • multiple detectors for different views (size and orientation) • for each view: statistical modeling using predefined attribute histograms (17), about 2,000 face examples • independence is assumed… very good for out-of-plane rotation but involved procedure for building histograms (bootstrap, AdaBoost…)
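
A hedged sketch of the decision rule behind this kind of detector: under the naive independence assumption, per-attribute histograms combine into a log-likelihood ratio. The attribute extraction and the histograms here are placeholders, not the actual Schneiderman-Kanade system:

    import numpy as np

    def naive_bayes_score(attrs, face_hists, nonface_hists):
        """Log-likelihood ratio under the naive (independence) assumption.
        attrs[i] is the bin index of attribute i measured on the patch;
        face_hists[i] / nonface_hists[i] are its normalized histograms
        estimated from face / non-face training examples."""
        score = 0.0
        for a, pf, pn in zip(attrs, face_hists, nonface_hists):
            score += np.log(pf[a] + 1e-9) - np.log(pn[a] + 1e-9)
        return score  # declare a face when the score exceeds a threshold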

  12. AdaBoost (Viola & Jones) • wavelet-like features (computed efficiently) • features selected through AdaBoost (each weak classifier depends on a single feature) • detection is obtained by training a cascade of classifiers • very fast and effective on frontal faces
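
The efficiency comes from the integral-image trick: once cumulative sums are precomputed, any rectangle sum is a four-lookup operation, so each wavelet-like feature costs a handful of additions. A minimal sketch (function names are mine, not from the original detector):

    import numpy as np

    def integral_image(img):
        # cumulative sums turn every rectangle sum into four table lookups
        return img.cumsum(axis=0).cumsum(axis=1)

    def rect_sum(ii, r0, c0, r1, c1):
        # sum of img[r0:r1, c0:c1] recovered from the integral image ii
        total = ii[r1 - 1, c1 - 1]
        if r0 > 0: total -= ii[r0 - 1, c1 - 1]
        if c0 > 0: total -= ii[r1 - 1, c0 - 1]
        if r0 > 0 and c0 > 0: total += ii[r0 - 1, c0 - 1]
        return total

    def two_rect_feature(ii, r, c, h, w):
        # wavelet-like feature: left half minus right half of an h x w window
        left = rect_sum(ii, r, c, r + h, c + w // 2)
        right = rect_sum(ii, r, c + w // 2, r + h, c + w)
        return left - right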

  13. summing up • SVM: components based on prior knowledge, simple, very good results but rather slow (optimization approaches…) • naive bayes: robust against rotation, prior knowledge on feature selection, rather hand crafted statistical analysis, many models need to be learned (each with many examples) • AdaBoost: data-driven feature selection, fast, frontal face only

  14. what we do • we assume we are given a fair number of positive examples only (no negatives) • we want to explore how far one can get by combining fully data driven techniques based on 1D data • we look at old-fashioned hypothesis testing (false negative rate under full control)

  15. one possible way to object detection • building models can be expensive (domain dependent) • learning from positive examples only is more difficult, but… • classical hypothesis testing controls the false negative rate naturally

  16. testing hypotheses • hypothesis testing with only one observation • testing for independence with a rank test (seems appropriate for comparing different features)

  17. CBCL database • faces (19x19 pixels): 2,429 training, 472 test • non-faces (19x19 pixels): 4,548 training, 23,573 test

  18. training by hypothesis testing • we first compute a large number of features (for the moment about 16,000) on the training set images • a subset of good features (about 1,000) is then selected • of these, a subset of independent features is considered (ending up with about 100) • multiple statistical tests are then constructed using the training set (one test for each feature)

  19. image measurements • grey value at fixed locations (19 x 19) • tomographies (19 vertical + 19 horizontal + 37 along the 45-degree diagonals) • ranklets (5184 vertical, 5184 horizontal, 5184 diagonal) • a total of about 16,000 features
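
One possible reading of the tomography features, as line-sum projections of the patch; the exact definition is not spelled out in the transcript, so treat this as an assumption:

    import numpy as np

    def tomographies(patch):
        """Projection features of a 19x19 patch: sums along columns, rows,
        and the 37 45-degree diagonals (my reading of 'tomographies')."""
        vertical = patch.sum(axis=0)      # 19 column sums
        horizontal = patch.sum(axis=1)    # 19 row sums
        n = patch.shape[0]
        diagonal = np.array([np.trace(patch, offset=k)
                             for k in range(-n + 1, n)])  # 37 diagonal sums
        return np.concatenate([vertical, horizontal, diagonal])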

  20. ranklets (Smeraldi, 2002)
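
A sketch of a vertical ranklet as I reconstruct it from Smeraldi (2002): ranks replace grey values in a Haar-like comparison of the two window halves, via the Mann-Whitney statistic. Horizontal and diagonal ranklets follow by changing the split:

    import numpy as np
    from scipy.stats import rankdata

    def vertical_ranklet(window):
        """Rank-based analogue of a vertical Haar wavelet; compares the upper
        and lower halves of the window (assumed to have an even number of
        rows) via the Mann-Whitney statistic; output lies in [-1, 1]."""
        ranks = rankdata(window).reshape(window.shape)  # ranks of all pixels, ties averaged
        n1 = window.size // 2                           # pixels in the upper half
        w = ranks[: window.shape[0] // 2].sum()         # rank sum of the upper half
        u = w - n1 * (n1 + 1) / 2.0                     # Mann-Whitney U statistic
        return 2.0 * u / (n1 * n1) - 1.0                # normalize to [-1, 1]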

  21. vertical ranklets (variance-to-natural-support ratio)

  22. a salient and a non-salient feature • we discard all features for which the ratio falls below the threshold s = 0.15 (this leaves us with about 2,000 features)
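
The transcript does not define the ratio precisely; one plausible reading, given that a ranklet's natural support is [-1, 1], is the empirical variance over the squared half-width of the support. A hedged sketch under that assumption:

    import numpy as np

    def select_salient(F, support_halfwidth=1.0, s=0.15):
        """F: (n_faces, n_features) feature values on the positive training set.
        Keep features whose variance-to-natural-support ratio reaches s;
        support_halfwidth is 1 for ranklets, which live in [-1, 1]."""
        ratio = F.var(axis=0) / support_halfwidth ** 2
        return np.where(ratio >= s)[0]   # indices of the salient features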

  23. independent feature selection • we run independence tests on all possible pairs of salient features of the same category • we build a complete graph for each category with as many nodes as features; an edge between two features is deleted if Spearman's test rejects the independence hypothesis with probability t

  24. independent feature selection • we then search the graph for maximal complete subgraphs (cliques), which we regard as sets of independent features • for t = 0.5 we are left with 44 vertical, 64 horizontal, 35 diagonal ranklets, and 38 tomographies
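
A sketch of this selection step with scipy and networkx. How the threshold t maps onto the p-value of Spearman's test is my assumption; the transcript only says the hypothesis is rejected "with probability t":

    import networkx as nx
    from scipy.stats import spearmanr

    def independent_features(F, t=0.5):
        """F: (n_faces, n_salient_features) values of one feature category.
        Start from the complete graph and delete an edge whenever Spearman's
        test rejects independence; a largest maximal clique is then read off
        as a set of mutually independent features."""
        g = nx.complete_graph(F.shape[1])
        for i, j in list(g.edges):
            _, p = spearmanr(F[:, i], F[:, j])
            if p < t:                # dependence detected at level t (my reading)
                g.remove_edge(i, j)
        # find_cliques enumerates maximal cliques: fine at this scale,
        # though exponential in the worst case
        return max(nx.find_cliques(g), key=len)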

  25. testing • for all image locations, all applicable scales, and a fixed number t of tests: • compute the values of the good, independent features • perform multiple statistical tests at a certain confidence level • a positive example is located if at least t tests are passed
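
A minimal sketch of the per-window decision; the feature extractors and the per-feature acceptance tests are assumed to be given (e.g. acceptance intervals fit on the training faces):

    def detect(patch, features, tests, t):
        """features: callables mapping a patch to a scalar; tests: callables
        returning True when that value is consistent with the face model.
        Declare a face when at least t of the tests accept."""
        passed = sum(test(f(patch)) for f, test in zip(features, tests))
        return passed >= t

In a full detector this is applied at every image location and every applicable scale, as the slide says.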

  26. multiple tests • we run multiple tests living with the fact that we won’t detect a certain fraction of the objects we want to find • luckily we are in a position to decide the fraction beforehand • we gain power because each test looks at a new feature
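
A back-of-the-envelope version of this control, simplified to the case where all k tests must pass (the actual rule on the previous slide requires only t of them):

    # with k independent tests, each accepting a true face with probability
    # 1 - a, the overall detection rate is (1 - a)**k, so the per-test level
    # for a target miss fraction f can be fixed beforehand
    k, f = 100, 0.05                 # hypothetical numbers
    a = 1 - (1 - f) ** (1 / k)       # per-test false-negative rate, ~0.0005
    print(a, (1 - a) ** k)           # overall detection rate back at 0.95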

  27. some results (Franceschi et al, 2004) • 472 positives vs 23,573 negatives • the original slide compares tomographies + ranklets, overlapping features, and randomly chosen features

  28. once you have detected a face… ask Thomas
