Spectral Hashing Y. Weiss (Hebrew U.) A. Torralba (MIT) Rob Fergus (NYU)
What does the world look like? Motivation • High-level image statistics • Object recognition for large-scale search
Semantic Hashing [Salakhutdinov & Hinton, 2007] Query Image → Semantic Hash Function → Binary code → Query address in Address Space → Semantically similar images in database. Quite different to a (conventional) randomizing hash
1. Locality Sensitive Hashing • Gionis, A. & Indyk, P. & Motwani, R. (1999) • Take random projections of the data (e.g. a Gist descriptor) • Quantize each projection with few bits • No learning involved
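A minimal sketch of this scheme (function name hypothetical): each bit is obtained by thresholding one random projection, with no training step at all.

```python
import numpy as np

def lsh_codes(X, n_bits, rng):
    """Random-projection LSH: draw one random hyperplane per bit and
    quantize each projection to a single bit; nothing is learned."""
    R = rng.standard_normal((X.shape[1], n_bits))  # random projection directions
    return (X @ R > 0).astype(np.uint8)            # 1 bit per projection

# e.g. hash five 512-D Gist descriptors into 8-bit codes
rng = np.random.default_rng(0)
codes = lsh_codes(rng.standard_normal((5, 512)), n_bits=8, rng=rng)
```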
Toy Example • 2D uniform distribution
2. Boosting • Modified form of BoostSSC [Shakhnarovich, Viola & Darrell, 2003] • Positive examples are pairs of similar images • Negative examples are pairs of unrelated images • Learn a threshold & dimension for each bit (weak classifier)
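A crude stand-in for one round of this idea (not the actual BoostSSC procedure, and without the boosting reweighting): pick the (dimension, threshold) pair whose binary split keeps positive pairs on the same side and negative pairs on opposite sides.

```python
import numpy as np

def learn_bit(X, pos_pairs, neg_pairs):
    """Pick the (dimension, threshold) weak classifier whose split keeps
    similar pairs together and separates dissimilar pairs."""
    best_d, best_t, best_score = 0, 0.0, -np.inf
    for d in range(X.shape[1]):
        for t in np.quantile(X[:, d], [0.25, 0.5, 0.75]):  # candidate thresholds
            bits = X[:, d] > t
            score = sum(bits[i] == bits[j] for i, j in pos_pairs) \
                  - sum(bits[i] == bits[j] for i, j in neg_pairs)
            if score > best_score:
                best_d, best_t, best_score = d, t, score
    return best_d, best_t

# tiny example: dimension 0 separates the two similar pairs cleanly
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.1, 0.0]])
d, t = learn_bit(X, pos_pairs=[(0, 1), (2, 3)], neg_pairs=[(0, 2), (1, 3)])
```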
Toy Example • 2D uniform distribution
3. Restricted Boltzmann Machine (RBM) • Type of Deep Belief Network • Hinton & Salakhutdinov, Science 2006 • Single RBM layer: visible units connected to hidden units by symmetric weights W; units are binary & stochastic • Attempts to reconstruct the input at the visible layer from the activation of the hidden layer
Multi-Layer RBM: non-linear dimensionality reduction • Input: Gist vector (512 dimensions), linear units at the first layer • Layer 1 (w1): 512 → 512 • Layer 2 (w2): 512 → 256 • Layer 3 (w3): 256 → N • Output: binary code (N dimensional)
Toy Example • 2D uniform distribution
2-D Toy example: codes of 3, 7 and 15 bits. Hamming distance from query point: Red – 0 bits, Green – 1 bit, Blue – 2 bits, Black – >2 bits
Toy Results. Hamming distance from query: Red – 0 bits, Green – 1 bit, Blue – 2 bits
Semantic Hashing [Salakhutdinov & Hinton, 2007] Query Image → Semantic Hash Function → Binary code → Query address in Address Space → Semantically similar images in database. Quite different to a (conventional) randomizing hash
Spectral Hash Query Image (real-valued vectors) → Spectral Hash (non-linear dimensionality reduction) → Binary code → Query address in Address Space → Semantically similar images in database. Quite different to a (conventional) randomizing hash
Spectral Hashing (NIPS ’08) • Assume points are embedded in Euclidean space • How to binarize so Hamming distance approximates Euclidean distance? Ham_Dist(10001010,11101110)=3
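The slide's example can be checked directly; for codes stored as integers, Hamming distance is the popcount of the XOR:

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance = number of differing bit positions = popcount(a XOR b)."""
    return bin(a ^ b).count("1")

d = hamming(0b10001010, 0b11101110)  # the slide's example: 3 differing bits
```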
Spectral Hashing theory • Want to minimize Yᵀ(D−W)Y subject to: each bit on 50% of the time; bits are independent • Sadly, this is NP-complete • Relax the problem by letting Y be continuous • It then becomes an eigenvector problem on the graph Laplacian D−W
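A sketch of the relaxation on a small affinity graph (function name hypothetical): the continuous solutions are the eigenvectors of L = D − W with the smallest nonzero eigenvalues, which can then be thresholded at zero to recover bits.

```python
import numpy as np

def relaxed_codes(W, k):
    """Relaxed spectral hashing: take the k eigenvectors of the graph
    Laplacian L = D - W with smallest nonzero eigenvalues, then threshold
    at zero to get binary codes."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    vals, vecs = np.linalg.eigh(L)        # eigenvalues in ascending order
    Y = vecs[:, 1:k + 1]                  # skip the trivial constant eigenvector
    return (Y > 0).astype(np.uint8)

# two tight clusters {0,1} and {2,3}, weakly connected to each other
W = np.array([[0.00, 1.00, 0.01, 0.01],
              [1.00, 0.00, 0.01, 0.01],
              [0.01, 0.01, 0.00, 1.00],
              [0.01, 0.01, 1.00, 0.00]])
codes = relaxed_codes(W, 1)  # one bit should split the two clusters
```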
Nyström Approximation • Method for approximating eigenfunctions by interpolating between existing data points • Requires evaluating the distance to existing data points, so cost grows linearly with # points • Also overfits badly in practice
What about a novel data point? • Need a function to map new points into the space • Take the limit as n → ∞: eigenvectors converge to eigenfunctions • Need to carefully normalize the graph Laplacian • An analytical form of the eigenfunctions exists for certain distributions (uniform, Gaussian) • Constant time to compute/evaluate for a new point • For uniform: depends only on the extent of the distribution (b − a)
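For a 1-D uniform distribution on [a, b], the closed form looks like the following sketch (written here with x shifted so x − a ∈ [0, b − a]; ε is an assumed kernel-width parameter), which makes evaluating a novel point O(1):

```python
import numpy as np

def phi(x, k, a, b):
    """k-th analytic eigenfunction for a uniform distribution on [a, b]."""
    return np.sin(np.pi / 2 + k * np.pi * (x - a) / (b - a))

def lam(k, a, b, eps=1.0):
    """Matching eigenvalue: depends only on the extent (b - a), mode k,
    and the kernel width eps; higher modes have larger eigenvalues."""
    return 1.0 - np.exp(-(eps ** 2 / 2.0) * (k * np.pi / (b - a)) ** 2)
```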
The Algorithm. Input: data {xi} of dimensionality d; desired # bits k • Fit a multidimensional rectangle to the data: run PCA to align the axes, then bound a uniform distribution • For each dimension, calculate the k smallest eigenfunctions; this gives dk eigenfunctions in total • Pick the k with the smallest eigenvalues • Threshold the eigenfunctions at zero to give the binary codes
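The whole recipe can be sketched end to end as below (a simplified reading: the mode frequency kπ/(b−a) is used as a monotone proxy for the eigenvalue, which preserves the ordering):

```python
import numpy as np

def spectral_hash(X, n_bits):
    """Train spectral hashing codes: PCA-align, fit a bounding rectangle,
    rank per-dimension eigenfunctions by eigenvalue, threshold at zero."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # 1. PCA to align axes
    P = Xc @ Vt.T                                       # data in PCA coordinates
    a, b = P.min(axis=0), P.max(axis=0)                 # 2. bounding rectangle
    # 3. candidate modes k = 1..n_bits on every dimension (d*n_bits total),
    #    sorted by frequency (same ordering as by eigenvalue)
    modes = sorted((k * np.pi / (b[d] - a[d]), d, k)
                   for d in range(P.shape[1]) for k in range(1, n_bits + 1))
    # 4./5. keep the n_bits smallest, threshold the eigenfunctions at zero
    codes = np.empty((len(X), n_bits), dtype=np.uint8)
    for j, (w, d, k) in enumerate(modes[:n_bits]):
        codes[:, j] = np.sin(np.pi / 2 + w * (P[:, d] - a[d])) > 0
    return codes

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2)) * np.array([10.0, 1.0])  # elongated box
codes = spectral_hash(X, n_bits=3)
```

The lowest-frequency mode lands on the longest PCA axis, so the first bit cuts the data roughly in half along that direction.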
1. Fit Multidimensional Rectangle • Run PCA to align axes • Bound uniform distribution
3. Pick the k smallest eigenfunctions, i.e. those with the k smallest eigenvalues (e.g. k = 3)
Back to the 2-D Toy example: codes of 3, 7 and 15 bits. Hamming distance from query: Red – 0 bits, Green – 1 bit, Blue – 2 bits
Input Image representation: Gist vectors • Pixels are not a convenient representation • Use the Gist descriptor instead (Oliva & Torralba, IJCV 2001) • 512 real-valued dimensions/image (16,384 bits at 32 bits per value); no color information • L2 distance between Gist vectors is not a bad substitute for human perceptual distance
LabelMe images • 22,000 images (20,000 train | 2,000 test) • Ground truth segmentations for all • Assume L2 Gist distance is true distance
Bit allocation between dimensions • Compare the value of the cuts in the original space, i.e. before the pointwise nonlinearity.
Summary • Spectral Hashing: a simple way of computing good binary codes • Forced to make a big assumption about the data distribution • Use point-wise non-linearities to map the distribution to uniform • Need more experiments on real data
Overview • Assume points are embedded in a Euclidean space (e.g. the output from an RBM) • How to binarize the space so that Hamming distance between points approximates L2 distance?
Strategies for Binarization • Deliberately add noise during backprop: forces extreme values that overcome the noise
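One way to read that trick (the noise scale sigma is an assumption here, not a value from the talk): inject Gaussian noise into the pre-activation during training, so a unit can only produce a stable 0/1 output by saturating.

```python
import numpy as np

def noisy_sigmoid(z, sigma=4.0, rng=None):
    """Sigmoid with Gaussian noise added to the pre-activation; training
    must drive |z| well past sigma (extreme values) for the output to be
    reliably near 0 or 1."""
    rng = rng or np.random.default_rng(0)
    noise = sigma * rng.standard_normal(np.shape(z))
    return 1.0 / (1.0 + np.exp(-(z + noise)))

# strongly saturated pre-activations survive the noise
y = noisy_sigmoid(np.array([50.0, -50.0]))
```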