Discriminative Decorrelation for Clustering and Classification, ECCV 12. Bharath Hariharan, Jitendra Malik, and Deva Ramanan
State-of-the-art Object Detection • HOG • Linear SVM Motivation Dalal & Triggs, Histograms of Oriented Gradients for Human Detection, CVPR05
State-of-the-art Object Detection • Mixture models • Part-based models • Train one linear classifier for each class/mixture/part • Hundreds of objects of interest • Hard negative mining • Increasing number of mixtures (exemplar SVMs) • Increasing number of parts Can we make the training fast (closed form, no iteration) without losing much in performance? Motivation P. Ott and M. Everingham, "Shared Parts for Deformable Part-based Models", CVPR11; Lubomir Bourdev et al., Detecting People Using Mutually Consistent Poselet Activations, ECCV10
What do linear SVMs do? • Discriminative feature selection! • Can we do it in closed form? • Seems possible with LDA Motivation [Figure: Image, HOG, LDA, and SVM templates]
Motivation • LDA • Closed-form solution • Parameter reduction/Regularization • Properties of the model • Clustering • Results Contents
Training set (X, Y) • Assumes normal class-conditional distributions P(x|y=0) and P(x|y=1) • Assumes equal class covariances • The log of the likelihood ratio gives a dot-product decision function: f(x) = wᵀx + b with w = Σ⁻¹(μ₁ − μ₀) Linear Discriminant Analysis (LDA) – Binary case
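For reference, a minimal derivation behind that bullet, in the slide's notation (μ₀, μ₁ the class means, Σ the shared covariance):

```latex
\log\frac{p(x \mid y=1)}{p(x \mid y=0)}
  = (\mu_1 - \mu_0)^\top \Sigma^{-1} x
    + \tfrac{1}{2}\left(\mu_0^\top \Sigma^{-1} \mu_0 - \mu_1^\top \Sigma^{-1} \mu_1\right)
  = w^\top x + b,
\qquad w = \Sigma^{-1}(\mu_1 - \mu_0)
```

The quadratic terms in x cancel because both classes share Σ, which is exactly what makes the decision function a dot product.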
Calculate the object-independent μ₀ and Σ • Assume we want the same Σ across different classes • The MLE of Σ becomes the covariance computed over all training samples • Assume the number of negative samples is much larger than the number of positives of each class • Then μ₀ and Σ can be computed once, on all training samples including positives • Even so, the number of parameters in a HOG template can be very large, making the estimation of Σ very hard Closed-form Solution
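A minimal numpy sketch of this estimate (the function name is mine, not from the paper); X pools feature vectors from all classes, positives included, since the negatives dominate:

```python
import numpy as np

def background_stats(X):
    """X: (n, d) pooled feature vectors. Returns the shared mu0 and Sigma."""
    mu0 = X.mean(axis=0)
    Xc = X - mu0
    Sigma = Xc.T @ Xc / X.shape[0]   # MLE of the covariance over all samples
    return mu0, Sigma
```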
Translation and scale invariance • Take all the negative windows from each scale and position • Stationary process • The same mean is used for all HOG cells • The covariance only models relative offsets between cells • Only one k × k block of parameters per offset • Still low rank and not invertible • Add a small amount to the diagonal (0.01) Regularization
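A hedged sketch of the stationarity trick (names and structure are mine, not the authors' code): estimate one k × k autocovariance block per cell offset, then assemble the covariance of an h × w template from those blocks and regularize the diagonal:

```python
import numpy as np

def autocovariances(feats, mu, max_off):
    """feats: list of HOG grids of shape (H, W, k); mu: the shared (k,) cell mean.
    Returns Gamma[(dy, dx)]: the k x k covariance between cells at that offset."""
    k = mu.shape[0]
    Gamma = {}
    for dy in range(-max_off, max_off + 1):
        for dx in range(-max_off, max_off + 1):
            acc, n = np.zeros((k, k)), 0
            for F in feats:
                H, W, _ = F.shape
                A = F[max(0, -dy):H - max(0, dy), max(0, -dx):W - max(0, dx)] - mu
                B = F[max(0, dy):H - max(0, -dy), max(0, dx):W - max(0, -dx)] - mu
                acc += np.einsum('ijk,ijl->kl', A, B)   # sum of k x k outer products
                n += A.shape[0] * A.shape[1]
            Gamma[(dy, dx)] = acc / n
    return Gamma

def template_cov(Gamma, h, w, k, lam=0.01):
    """Block covariance of an h x w template; offsets beyond max_off stay zero
    (correlations fade after a few cells anyway)."""
    S = np.zeros((h * w * k, h * w * k))
    for i in range(h * w):
        for j in range(h * w):
            off = (j // w - i // w, j % w - i % w)
            if off in Gamma:
                S[i * k:(i + 1) * k, j * k:(j + 1) * k] = Gamma[off]
    return S + lam * np.eye(h * w * k)   # 0.01 on the diagonal makes it invertible
```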
Structure of Σ • Σ encodes spatial correlations between oriented gradients • Sparsity • [Figure: covariance of the 9 orientation dimensions of one cell with its 4 horizontal neighbors; dark is negative, light is positive] • The precision matrix Σ⁻¹ is sparse • Correlations fade a few cells away Properties of the Covariance
Transformed (Whitened) HOG feature (WHO) • More meaningful distances • Comparison to PCA • PCA keeps the dimensions with the most variance among training samples • This removes the discriminative information! Properties of the Covariance [Figure: Image, HOG, PCA, and LDA templates]
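A minimal sketch of the whitening step, assuming the μ₀ and Σ estimated above (Σ^{-1/2} via eigendecomposition; the regularized Σ keeps the eigenvalues positive):

```python
import numpy as np

def who(x, mu0, Sigma):
    """Whitened HOG (WHO) feature for a flattened HOG window x."""
    vals, vecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    return Sigma_inv_sqrt @ (x - mu0)
```

Euclidean distances between WHO vectors correspond to Mahalanobis distances in HOG space; PCA, in contrast, keeps exactly the high-variance directions that whitening down-weights.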
Comparing to Dalal & Triggs HOG-SVM • With hard negative mining • Method • Averaging: compute the positive average HOG μ₁ • Centering: subtract μ₀ • Whitening: multiply by Σ⁻¹ • To check the whitening effect, use the centered version only • Dalal & Triggs: 79.66% • LDA: 75.10% • Centered only: 8% • Pedestrians are well aligned Results – Pedestrian Detection
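The three steps collapse into a few lines; a hedged sketch building on the earlier ones (not the authors' implementation):

```python
import numpy as np

def lda_template(pos_hogs, mu0, Sigma):
    """pos_hogs: (n_pos, d) flattened HOG windows of one class."""
    mu1 = pos_hogs.mean(axis=0)            # averaging
    w = np.linalg.solve(Sigma, mu1 - mu0)  # centering + whitening in one solve
    return w                               # score a window x as w @ x
```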
Use the transformed (WHO) feature • Recursive normalized cuts Clustering
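A minimal sketch of recursive two-way clustering on WHO features (the paper's exact cut and stopping criteria may differ; the sklearn affinity choice is my assumption):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def recursive_ncuts(Z, idx=None, min_size=10):
    """Z: (n, d) WHO features. Recursively bipartition with normalized cuts;
    returns a list of index arrays, one per final cluster."""
    if idx is None:
        idx = np.arange(len(Z))
    if len(idx) < 2 * min_size:
        return [idx]
    labels = SpectralClustering(n_clusters=2, affinity='rbf').fit_predict(Z[idx])
    left, right = idx[labels == 0], idx[labels == 1]
    if len(left) < min_size or len(right) < min_size:
        return [idx]
    return recursive_ncuts(Z, left, min_size) + recursive_ncuts(Z, right, min_size)
```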
Performance averaged over all clusters • Airplane, bus, horse • SVM and LDA scores are correlated • Low training computation • Still slow in the testing phase Some minor results
1- Use the Exemplar SVMs method • 2- Change the SVM to an LDA with the object-independent background model • 3- To make the test phase fast: • Use clustering and train a classifier for each cluster • Train an LDA for each cluster • Produce feature vector 1: dot products of a window's WHO feature with all exemplar WHOs • Produce feature vector 2: dot products of a window's WHO with all cluster WHOs • Train a linear SVM on the concatenated feature Combining Across Clusters
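A sketch of the combined descriptor and the final classifier (variable names are mine): E stacks the exemplar WHO templates, C the cluster LDA templates, and the per-window feature is their dot products with the window's WHO feature, concatenated:

```python
import numpy as np
from sklearn.svm import LinearSVC

def combined_feature(z, E, C):
    """z: (d,) WHO feature of a window; E: (n_exemplars, d); C: (n_clusters, d)."""
    return np.concatenate([E @ z, C @ z])

# X = np.stack([combined_feature(z, E, C) for z in window_features])
# clf = LinearSVC().fit(X, y)   # one linear SVM on the concatenation
```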
Train exemplar SVMs for each sample • Take the selected feature as the new transformed feature • Do image classification (combined features and nonlinear SVMs) • Clustering and linear SVMs ALMOST DONE • New similarity measure for verification (like face verification on LFW) • Model natural images with some distribution to make training exemplar SVMs fast DONE • One could use a more sophisticated model of natural patches (e.g., a GMM) • Build a pyramid of second-, third-, … order discriminant features What we wanted! To do