Mixture of SVMs for Face Class Modeling

J. Meynet, V. Popovici, J.-Ph. Thiran, MLMI 04

Outline
• Introduction
  • Presentation of the Work
  • Context
• The face detection task
  • Principal Component Analysis
  • Classification with Support Vector Machines
• Mixture of Support Vector Machines
  • With independent subsets
  • With k-means clustering
• Experiments and Results
• Conclusions and Future Work

Face Detection Methods
• Image-based detection: consider the face as a whole object
  • Eigenfaces
  • Fisher’s Linear Discriminant
  • Neural Network, SVM
  • HMM
  • SNoW
• Geometrical-based methods: find precise parts of the face and reassemble them for the final decision
  • Top-down
  • Bottom-up

Principle of the detection
• Pre-processing with a cascade of boosted Haar-like features => real-time face detector
• Principal Component Analysis (PCA): dimensionality reduction
• Classification with a mixture of SVMs: random sampling or k-means clustering
P. Viola, M. Jones, "Robust real-time object detection", International Journal of Computer Vision, 2002.

PCA and Eigenfaces
[Figure: PCA maps images into the eigenfaces space]
Sirovich, Kirby, "Low-dimensional procedure for the characterization of human faces", 1987

Distance From Feature Space
[Figure: feature space F, with DFFS and DIFS]
• DFFS
• Construction of the classification vector:
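The slide's formulas did not survive extraction, so the following is only a minimal sketch of one common way to compute the DFFS. It assumes orthonormal eigenfaces and uses the squared residual energy outside the eigenface subspace. The layout of the classification vector (PCA coefficients followed by the DFFS) is also an assumption, and all function names are hypothetical.

```python
def project(x, mean, eigenfaces):
    """Return the centered image and its coefficients on each eigenface.

    Assumes the eigenfaces are orthonormal vectors.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    coeffs = [sum(c * e for c, e in zip(centered, face)) for face in eigenfaces]
    return centered, coeffs


def dffs(x, mean, eigenfaces):
    """Squared residual energy of x outside the eigenface subspace."""
    centered, coeffs = project(x, mean, eigenfaces)
    total = sum(c * c for c in centered)
    inside = sum(y * y for y in coeffs)
    return max(total - inside, 0.0)  # clamp tiny negative rounding errors


def classification_vector(x, mean, eigenfaces):
    # Assumption: PCA coefficients followed by the DFFS as one extra feature.
    _, coeffs = project(x, mean, eigenfaces)
    return coeffs + [dffs(x, mean, eigenfaces)]


# Toy example: 3-pixel "images" and 2 orthonormal eigenfaces (illustrative values).
mean = [0.0, 0.0, 0.0]
eigenfaces = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
vec = classification_vector([3.0, 4.0, 2.0], mean, eigenfaces)
```

In the toy example the third pixel lies entirely outside the subspace, so it contributes only to the DFFS entry.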

Support Vector Machines (SVM)
• Find the hyperplane that correctly separates the data while maximising the margin.
• Optimisation:
• Lagrange multipliers α_i:
• Kernels:
V. Popovici, J.-Ph. Thiran, "Face Detection using SVM Trained in Eigenfaces Space", 4th International Conference on Audio- and Video-Based Biometric Person Authentication, Surrey, UK, 2003
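As an illustration of how the Lagrange multipliers and the kernel enter the decision, here is a hedged sketch of the standard kernel SVM decision function, f(x) = sum_i α_i y_i K(x_i, x) + b. The RBF form exp(-gamma * ||x - z||^2), the parameter name `gamma` and the toy values are assumptions, not taken from the slides.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2); this parameterisation is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svm_margin(x, support_vectors, labels, alphas, bias, gamma):
    """Signed decision value: sum of alpha_i * y_i * K(x_i, x), plus the bias."""
    return sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + bias


# Toy model with one positive and one negative support vector (illustrative values).
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
labels = [1, -1]
alphas = [1.0, 1.0]

margin_pos = svm_margin([0.0, 0.0], support_vectors, labels, alphas, 0.0, 0.5)
margin_neg = svm_margin([2.0, 0.0], support_vectors, labels, alphas, 0.0, 0.5)
margin_mid = svm_margin([1.0, 0.0], support_vectors, labels, alphas, 0.0, 0.5)
```

The sign of the returned value gives the class, and its magnitude can be read as a confidence.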

Mixture of SVMs (MSVM)
[Figure: subsets 1, 2, ..., N, N+1 of the data X; first-layer SVM-1 to SVM-N; margins m; second-layer SVM-L2]
• Why? Building a face detection system requires a large number of examples => make the training easier
• Principle
  1- Split the initial dataset into N+1 subsets
     • By random sampling
     • Or by k-means clustering
  2- Train the N first-layer SVMs
  3- Pass the (N+1)th subset through the N SVMs and train the second-layer SVM on the resulting margins.
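The random-sampling variant of step 1 can be sketched as follows. This is a minimal illustration, assuming a simple shuffle-and-deal partition into N+1 disjoint subsets. The function name and the toy sizes are hypothetical.

```python
import random


def random_partition(examples, n_subsets, seed=0):
    """Shuffle the examples and deal them into n_subsets disjoint subsets."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    # Subset i takes every n_subsets-th example, so sizes differ by at most one.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


examples = list(range(23))  # stand-ins for feature vectors
n_experts = 4               # N first-layer SVMs (illustrative value)
subsets = random_partition(examples, n_experts + 1)

expert_sets = subsets[:n_experts]      # one training set per first-layer SVM
second_layer_set = subsets[n_experts]  # held out for the second-layer SVM
```

Keeping the last subset away from the first-layer SVMs means the second layer is trained on margins of examples the experts have not seen.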

Mixture of SVMs
• Two sampling techniques:
  1- Random partitioning: M+1 independent subsets
  2- Clustering:
     - Draw one random subset for the SVM-L2
     - K-means clustering on the remaining examples: M clusters for training the SVM-L1-i
• SVM-L1-i are trained using cross-validation, with RBF kernels:
• SVM-L2 is trained on the margins: it learns a function that assembles the confidences of each individual expert.
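For the clustering variant, the following is a minimal k-means sketch (Lloyd's algorithm with Euclidean distance). The distance metric, the initialisation from random data points, the iteration count and the toy data are all assumptions. The slides do not specify the clustering details.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; returns the centroids and the clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct data points
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, clusters


# Two well-separated toy groups; each cluster would train one SVM-L1-i.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, clusters = kmeans(points, k=2)
```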

Mixture of SVMs
• Output of the mixture:
• Advantages:
  Single SVM: problem of complexity
  MSVM: problems of complexity
  => clearly advantageous
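The output formula is missing from the slide. Based on the description of SVM-L2 as a function trained on the margins of the experts, one plausible reading is that the mixture applies the second-layer decision function to the vector of first-layer margins. The sketch below assumes exactly that. The lambda "experts" and the averaging second layer are hypothetical stand-ins for trained SVM decision functions.

```python
def mixture_output(x, experts, second_layer):
    """Apply the second-layer decision function to the first-layer margins of x."""
    margins = [expert(x) for expert in experts]
    score = second_layer(margins)
    label = 1 if score >= 0 else -1  # +1: face, -1: non-face
    return label, margins


# Hypothetical stand-ins for trained decision functions (not real SVMs).
experts = [
    lambda x: x[0] - 1.0,
    lambda x: x[1] - 1.0,
    lambda x: x[0] + x[1] - 3.0,
]
second_layer = lambda m: sum(m) / len(m)

label_face, margins_face = mixture_output([3.0, 2.0], experts, second_layer)
label_other, margins_other = mixture_output([0.0, 0.0], experts, second_layer)
```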

Experiments and Results I
• Database
  • Face images from Banca and XM2VTS
  • Non-faces chosen by bootstrapping on randomly selected images

               Training   Validation
  Faces            8256         7822
  Non-faces       14000       900000

• Estimation of a correct dimensionality for the eigenfaces space
  • 20x15 images
  • Number of eigenfaces needed to keep 85% of total variation
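The dimensionality estimate can be illustrated with a short sketch: given the PCA eigenvalues, count how many leading components are needed to reach a target fraction of the total variance. The eigenvalues below are made up for illustration and the function name is hypothetical. The value actually obtained in the experiments is not recoverable from the slide.

```python
def components_for_variance(eigenvalues, fraction=0.85):
    """Smallest number of leading eigenvalues whose sum reaches the given fraction."""
    ordered = sorted(eigenvalues, reverse=True)
    target = fraction * sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target:
            return count
    return len(ordered)


# Hypothetical eigenvalue spectrum (sums to 100 for easy reading).
eigenvalues = [50.0, 25.0, 9.0, 8.0, 4.0, 3.0, 1.0]
n_eigenfaces = components_for_variance(eigenvalues)
```

With this toy spectrum the first three components cover 84% and the first four cover 92%, so four components are kept.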

Experiments and Results II
[Figure: 8256 F / 14000 NF split by random sampling or clustering; SVM-L1-i (x5): 1000 F, 2000 NF; SVM-L2: 2256 F, 4000 NF]

                  Faces (%)            Non-faces (%)
  Classifier     R.S    K-Means       R.S    K-Means
  SVM-L1-1      86.23    76.47       99.00    98.86
  SVM-L1-2      84.91    82.32       90.00    97.68
  SVM-L1-3      85.13    81.23       99.02    98.77
  SVM-L1-4      84.64    77.12       99.13    99.12
  SVM-L1-5      85.66    74.29       99.12    99.12
  SVM-L2        93.60    95.37       98.14    96.43
  (R.S: random sampling)

• Random sampling: reduces the importance of outliers or unusual examples
• Clustering: each SVM-L1-i performs like an expert on its own domain

Experiments and Results III
Results on Banca images pre-processed by boosted Haar-like features:

  Classifier    Faces (%)   Non-faces (%)   Total N° SV
  MSVM, RS        93.60         98.14          1673
  MSVM, KM        95.37         96.43          1420
  Single SVM      92.8          99.42          2504

Generalisation
• Better generalisation capabilities than a single SVM;
• MSVM improves the training time and the true positive rate;
• Fewer support vectors => lower computational complexity.

Conclusions - Future Work
• Pre-processing with boosted local feature-based classifiers
  • Real-time processing
• Dimensionality reduction by PCA (+ DFFS)
  • Decreases the complexity of the classification task
• Extension to the SVM technique which performs well on large datasets
  • Decreases the training and classification time
  • Improves discrimination capabilities
• Try other clustering techniques in the eigenfaces space, based on more appropriate metrics.
