360 likes | 484 Views
Approximate Correspondences in High Dimensions. Kristen Grauman* Trevor Darrell MIT CSAIL (*) UT Austin…. Intra-class appearance. Key challenges: robustness. Illumination. Object pose. Clutter. Occlusions. Viewpoint. Key challenges: efficiency.
E N D
Approximate Correspondences in High Dimensions Kristen Grauman* Trevor Darrell MIT CSAIL (*) UT Austin…
Intra-class appearance Key challenges: robustness Illumination Object pose Clutter Occlusions Viewpoint
Key challenges: efficiency • Thousands to millions of pixels in an image • 3,000-30,000 human recognizable object categories • Billions of images indexed by Google Image Search • 18 billion+ prints produced from digital camera images in 2004 • 295.5 million camera phones sold in 2005
Maximally Stable Extremal Regions [Matas et al.] Shape context [Belongie et al.] Superpixels [Ren et al.] SIFT [Lowe] Spin images [Johnson and Hebert] Geometric Blur [Berg et al.] Local representations Describe component regions or patches separately Salient regions [Kadir et al.] Harris-Affine [Schmid et al.]
How to handle sets of features? • Each instance is unordered set of vectors • Varying number of vectors per instance
Partial matching Compare sets by computing a partialmatching between their features.
optimal partial matching Pyramid match overview
for sets with features of dimension Computing the partial matching • Optimal matching • Greedy matching • Pyramid match
Pyramid match overview Pyramid match measures similarity of a partial matching between two sets: • Place multi-dimensional, multi-resolution grid over point sets • Consider points matched at finest resolution where they fall into same grid cell • Approximate optimal similarity with worst case similarity within pyramid cell No explicit search for matches!
Number of newly matched pairs at level i Measure of difficulty of a match at level i Pyramid match Approximate partial match similarity [Grauman and Darrell, ICCV 2005]
, Histogram pyramid: level i has bins of size Pyramid extraction
Counting matches Histogram intersection
Example pyramid match pyramid match optimal match
Approximating the optimal partial matching x Randomly generated uniformly distributed point sets with m= 5 to 100, d=2
Learning with the pyramid match • Kernel-based methods • Embed data into a Euclidean space via a similarity function (kernel), then seek linear relationships among embedded data • Efficient and good generalization • Include classification, regression, clustering, dimensionality reduction,… • Pyramid match forms a Mercer kernel
Kernel Complexity Match [Wallraven et al.] Pyramid match Category recognition results ETH-80 data set Accuracy Time (s) Mean number of features Mean number of features
0.002 s / match 5 s / match Pyramid match kernel over spatial features with quantized appearance 2004 6/05 3/06 12/05 6/06 Time of publication Category recognition results
Vocabulary-guided pyramid match • But rectangular histogram may scale poorly with input dimension… • Build data-dependent histogram structure… • New Vocabulary-guided PM [NIPS 06]: • Hierarchical k-means over training set • Irregular cells; record diameter of each bin • VG pyramid structure stored O(kL); stored once • Individual Histograms still stored sparsely
Vocabulary-guided bins Vocabulary-guided pyramid match Uniform bins • Tune pyramid partitions to the feature distribution • Accurate for d > 100 • Requires initial corpus of features to determine pyramid structure • Small cost increase over uniform bins: kL distances against bin centers to insert points
ch(n) : child h of node n c2(n11) Vocabulary-guided pyramid match W * # new matches @ level i wij * (# matches in cell j level i - # matches in children) nij(X) : hist. X level i cell j • wij : weight for hist. X level i cell j • ~= diameter of cell • ~= dij(X) + dij(Y) • (dij(H)=max dist of H’s pts in cell i,j to center) Mercer kernel Upper bound
Results: Evaluation criteria • Quality of match scores How similar are the rankings produced by the approximate measure to those produced by the optimal measure? • Quality of correspondencesHow similar is the approximate correspondence field to the optimal one? • Object recognition accuracyUsed as a match kernel over feature sets, what is the recognition output?
Match score quality ETH-80 images, sets of SIFT features d=8 d=128 Vocabulary-guided pyramid match d=8 d=128 Uniform bin pyramid match Dense SIFT (d=128) k=10, L=5 for VG PM; PCA for low-dim feats
Match score quality ETH-80 images, sets of SIFT features
d=3 d=8 d=13 d=128 d=68 d=113 Bin structure and match counts Data-dependent bins allow more gradual distance ranges
Approximate correspondences Use pyramid intersections to compute smaller explicit matchings.
optimal per bin random per bin Approximate correspondences Use pyramid intersections to compute smaller explicit matchings.
Approximate correspondences ETH-80 images, sets of SIFT descriptors
Approximate correspondences ETH-80 images, sets of SIFT descriptors
Impact on recognition accuracy • VG-PMK as kernel for SVM • Caltech-4 data set • SIFT descriptors extracted at Harris and MSER interest points
documents as bags of words methods as sets of instructions Sets of features elsewhere diseases as sets of gene expressions