Towards efficient matching with random hashing methods…
Kristen Grauman, Gregory Shakhnarovich, Trevor Darrell
Motivation: Content-based image retrieval
• Features:
  • Harris-Affine detector (max m = 3,595)
  • MSER detector (max m = 1,707)
  • SIFT-PCA descriptors
• Data set of 30 scenes in Boston
  • 1,079 database images
  • 89 query images
[Figure: query image]
Content-based image retrieval
• Pyramid match: ~1 second / query
• Optimal match: ~2 hours / query
• Even this is far too slow for any web-scale application!
[Plot: accuracy vs. number of top retrievals]
Sub-linear time image search
• Randomized hashing techniques are useful for sub-linear query time on very large image databases
[Figure: linear scan over all N database items vs. hashing the query with h to a binary key (e.g., 0110101) and searching << N items]
Pyramid match hashing
• For fixed-size sets, Locality-Sensitive Hashing [Indyk & Motwani 1998] provides bounded approximate similarity search over bijective matching [Indyk & Thaper 2003; Grauman & Darrell CVPR 2004, 2005]
• For varying set sizes, embedding of pyramid match (with product normalization) makes random hyperplane hashing possible under the set intersection hash family of [Charikar 2002]. [Grauman PhD 2006]
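The random hyperplane hashing mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each set has already been mapped to an explicit vector (the pyramid match embedding itself is not shown), and the names `make_hyperplanes`, `hash_bits`, `hamming`, and `estimated_angle` are hypothetical. Each bit is the sign of a dot product with a random Gaussian vector, so two vectors at angle θ agree on a bit with probability 1 − θ/π.

```python
import math
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian vectors, one per hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_bits(x, hyperplanes):
    """Bit i is 1 if x lies on the non-negative side of hyperplane i."""
    return tuple(
        1 if sum(r_i * x_i for r_i, x_i in zip(r, x)) >= 0 else 0
        for r in hyperplanes
    )


def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(u != v for u, v in zip(a, b))


def estimated_angle(a, b):
    """Angle estimate from the fraction of differing bits (theta ~ pi * fraction)."""
    return math.pi * hamming(a, b) / len(a)
```

Because only the sign of each projection is kept, the key depends on the direction of a vector and not on its length, which is one reason the choice of normalization matters before hashing.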
Single Frame Pose Estimation via Approximate Nearest Neighbor regression
• Obtain large DB of pose-appearance mappings
• Exploit fast methods for approximate nearest neighbor search in high-dimensional spaces (e.g., LSH [Indyk and Motwani ’98-’00])
Approximate nearest neighbor techniques
• Similar examples fall into the same bucket in one or more hash tables
[Figure: input → hash functions → rendered (and hashed) pose DB]
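A minimal sketch of the bucket lookup described above, under stated assumptions: the slides do not say which hash family indexes the pose DB, so random hyperplane bits are used here purely for illustration, and the class `LSHIndex` with its parameters `num_tables` and `bits_per_key` is hypothetical. Each table keys every example by a short bit string; a query collects the examples that share its bucket in at least one table.

```python
import random
from collections import defaultdict


class LSHIndex:
    """Several hash tables, each keyed by a few random-hyperplane bits."""

    def __init__(self, dim, num_tables=8, bits_per_key=6, seed=0):
        rng = random.Random(seed)
        # One independent set of hyperplanes per table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_key)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.items = []

    @staticmethod
    def _key(x, planes):
        return tuple(
            int(sum(r_i * x_i for r_i, x_i in zip(r, x)) >= 0) for r in planes
        )

    def add(self, x):
        """Store x and register its index in one bucket per table."""
        idx = len(self.items)
        self.items.append(x)
        for planes, table in zip(self.planes, self.tables):
            table[self._key(x, planes)].append(idx)
        return idx

    def candidates(self, query):
        """Indices of stored items sharing a bucket with the query in any table."""
        found = set()
        for planes, table in zip(self.planes, self.tables):
            found.update(table.get(self._key(query, planes), []))
        return found
```

Only the returned candidates need an exact distance computation, which is where the sub-linear query time comes from; more tables raise recall at the cost of memory, and more bits per key shrink the buckets.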
Single Frame Pose Estimation via Approximate Nearest Neighbor regression
• Render large DB of pose-appearance mappings
• Exploit fast methods for approximate nearest neighbor search in high-dimensional spaces (e.g., LSH [Indyk and Motwani ’98-’00])
• Problem: signal distance is dominated by nuisance variables
• Idea: find the embedding (i.e., hash functions for LSH) most relevant to parameter (pose) similarity…
[Shakhnarovich et al. ’03, Shakhnarovich ’05]
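The regression step named in the slide title can be sketched as below. This is a simplified stand-in, not the method of the cited work: it assumes the candidate examples have already been retrieved (for instance from hash buckets), ranks them by Euclidean distance in feature space, and returns the plain average of the k nearest poses. The function name `knn_pose_estimate` and the data layout are hypothetical.

```python
import math


def knn_pose_estimate(query, candidates, k=3):
    """Average the poses of the k candidates nearest to the query.

    candidates: list of (feature_vector, pose_vector) pairs.
    """
    if not candidates:
        raise ValueError("no candidates retrieved for this query")
    # Rank retrieved examples by exact distance in feature space.
    ranked = sorted(candidates, key=lambda fp: math.dist(query, fp[0]))
    nearest = ranked[:k]
    pose_dim = len(nearest[0][1])
    return [sum(pose[j] for _, pose in nearest) / len(nearest) for j in range(pose_dim)]
```

If the feature distance is dominated by nuisance variables, the nearest examples here can have very different poses, which is the problem the slide goes on to address.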
Pose estimation and similarity-sensitive hashing
• NN similar in pose, not image
[Figure: input → pose-sensitive hash functions → rendered (and hashed) pose DB]
[Shakhnarovich et al. ’03, Shakhnarovich ’05]
SSE / BoostPro: Similarity Sensitive Embedding
• Compute an embedding $H: I \to \{0, 1\}^N$ such that
  • $|H(I(\theta_1)) - H(I(\theta_2))|$ is small if $\theta_1$ is close to $\theta_2$
  • $|H(I(\theta_1)) - H(I(\theta_2))|$ is large otherwise
• Use the embedding with approximate nearest neighbor retrieval (LSH)
• Find H by training a boosted classifier to learn “same-pair”, and concatenate the resulting weak learners …
[Shakhnarovich 2005]
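A minimal sketch of what such an embedding looks like once the weak learners are chosen. It assumes, for illustration only, that each weak learner is a threshold on a single feature (a stump); the names `make_embedding`, `hamming`, and `stump_pair_accuracy` are hypothetical, and the boosting loop that reweights pairs and selects stumps is not shown. A stump is judged on labelled pairs: it is right when its two bits agree on a same-pose pair or disagree on a different-pose pair.

```python
def make_embedding(stumps):
    """Concatenate stumps (feature_index, threshold) into a bit-vector embedding H."""
    def H(x):
        return tuple(1 if x[i] > t else 0 for i, t in stumps)
    return H


def hamming(a, b):
    """Distance between two embedded examples."""
    return sum(u != v for u, v in zip(a, b))


def stump_pair_accuracy(stump, pairs):
    """Fraction of (x1, x2, same_pose) pairs the stump classifies correctly.

    Unweighted for simplicity; a boosting round would use per-pair weights.
    """
    i, t = stump
    correct = sum(((x1[i] > t) == (x2[i] > t)) == same for x1, x2, same in pairs)
    return correct / len(pairs)
```

In this toy layout, a stump on a pose-related feature scores well while a stump on a nuisance feature does not, so only the former would be kept as a bit of H.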
PSH results
• ~200,000 examples in DB; 2 sec
[Shakhnarovich et al. 2003, 2005]
Conclusions
• Random hashing techniques allow broad search; well suited for very high-dimensional spaces
• Useful in domains where there is no prior knowledge about how to cluster or model data…
• Similarity (parameter) sensitive hashing can find a distance related to the task, effectively learning a problem-dependent distance measure and an efficient means to index.