Presented by Andrew Guillory and Ian Simon, these slides cover work by Ronen Basri, Tal Hassner, and Lihi Zelnik-Manor on finding the linear subspace nearest to a query point. The problem is reduced to point-based nearest neighbor search by expressing squared point-to-subspace distances as dot products in a higher-dimensional space, and the slides examine the geometry of this reduction: projecting onto a hyperplane and rescaling onto a hypersphere shrink the additive constant it introduces, which then depends only on the dimensions of the points and subspaces. Extensions to subspaces of different dimensions (e.g., lines and planes) and to affine subspaces are discussed. Approximate nearest neighbor search is reviewed, including tree-based approaches such as KD-trees and locality sensitive hashing, with the paper itself using multiple KD-trees over different random projections. Experiments on synthetic data, image approximation, and reconstruction with the Yale Faces and patches datasets illustrate the speed and usefulness of the method.
Approximate Nearest Subspace Search with Applications to Pattern Recognition. Ronen Basri, Tal Hassner, Lihi Zelnik-Manor. Presented by Andrew Guillory and Ian Simon.
The Problem • Given n linear subspaces Si • and a query point q • find the subspace Si that minimizes dist(Si, q)
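A brief formalization of the query (standard definitions consistent with the slides, written out here because the original equation images are not part of this transcript):

```latex
\operatorname{dist}(S_i, q) \;=\; \min_{p \in S_i} \|p - q\|,
\qquad
i^\star \;=\; \arg\min_{i} \operatorname{dist}(S_i, q).
```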
Why? • object appearance variation = subspace • fast queries on object database • Other reasons?
Approach • Solve by reduction to nearest neighbor • point-to-point distances • in a higher-dimensional space • (not an actual reduction in the strict sense)
Point-Subspace Distance • Use squared distance. • Squared point-subspace distance can be represented as a dot product.
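The supporting equations were images and are missing from this transcript; the following is a hedged reconstruction of the identity the bullet refers to, writing Z for an orthonormal basis of the orthogonal complement of S (Z is introduced on the next slide):

```latex
\operatorname{dist}^2(q, S)
\;=\; \|Z^\top q\|^2
\;=\; q^\top Z Z^\top q
\;=\; \operatorname{tr}\!\left(Z Z^\top\, q q^\top\right)
\;=\; \operatorname{vec}(Z Z^\top) \cdot \operatorname{vec}(q q^\top).
```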
The Reduction • Let: … • Then: … (equation images not included in this transcript) • (annotation: constant over query) • ZTZ = I • Z is d-by-(d-k), columns orthonormal
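Since the equations themselves were images, here is one plausible reconstruction of the mapping these slides describe (the paper's exact vectorization and constants may differ): each subspace and the query are lifted to

```latex
u \;=\; \operatorname{vec}(Z Z^\top),
\qquad
v \;=\; -\operatorname{vec}(q q^\top),
```

and then, using the dot-product identity above,

```latex
\|u - v\|^2
\;=\; \underbrace{\|Z Z^\top\|_F^2}_{=\;d-k}
\;+\; \underbrace{\|q q^\top\|_F^2}_{=\;\|q\|^4}
\;+\; 2\,\operatorname{dist}^2(q, S),
```

so a nearest neighbor in the lifted space is a nearest subspace, up to an additive constant.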
The Reduction • For a query point q, the nearest lifted database point therefore corresponds to the nearest subspace, up to an additive constant. • Can we decrease the additive constant?
Observation 1 • All data points lie on a hyperplane. • Translate so that the hyperplane contains the origin.
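A hedged sketch of why this holds (the specific recentering below is an assumption consistent with, but not quoted from, the paper): every lifted database point satisfies tr(ZZ^T) = d - k, so subtracting a fixed multiple of the identity, e.g.

```latex
u' \;=\; \operatorname{vec}\!\left(Z Z^\top - \tfrac{d-k}{d}\, I\right),
\qquad
\operatorname{tr}\!\left(Z Z^\top - \tfrac{d-k}{d}\, I\right) = 0,
```

places all database points on a hyperplane through the origin; projecting the query onto this hyperplane changes every distance by the same amount and therefore preserves the nearest subspace while reducing the additive constant.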
Observation 2 • After the hyperplane projection, all data points lie on a hypersphere. • Rescale so that the query point also lies on the hypersphere.
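Continuing the hedged reconstruction above: after the recentering, every lifted database point has the same norm,

```latex
\|u'\|^2
\;=\; \|Z Z^\top\|_F^2 - \tfrac{(d-k)^2}{d}
\;=\; (d-k) - \tfrac{(d-k)^2}{d}
\;=\; \tfrac{k\,(d-k)}{d},
```

so the database lies on a hypersphere of radius sqrt(k(d-k)/d). Rescaling the projected query onto this sphere multiplies the query-dependent cross term by a positive constant, which again preserves the nearest subspace while further shrinking the additive constant.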
Reduction Geometry • What is happening?
Finally • Additive constant depends only on dimension of points and subspaces. • This applies to linear subspaces, all of the same dimension.
Extensions • subspaces of different dimension, e.g. lines and planes • Not all data points have the same norm. • Add an extra dimension to fix this (one possible construction is sketched below). • affine subspaces • Again, not all data points have the same norm.
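One standard way to realize the extra-dimension trick mentioned above (an illustrative assumption, not quoted from the paper): given lifted database points u_i with unequal norms, pick any M >= max_i ||u_i||^2 and embed

```latex
\tilde u_i = \begin{bmatrix} u_i \\ \sqrt{M - \|u_i\|^2} \end{bmatrix},
\qquad
\tilde v = \begin{bmatrix} v \\ 0 \end{bmatrix},
\qquad
\|\tilde u_i - \tilde v\|^2 = \|u_i - v\|^2 + M - \|u_i\|^2,
```

so every embedded database point has squared norm exactly M and the additive constant no longer varies with the individual subspace dimension.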
Approximate Nearest Neighbor Search • Find a point x with distance d(x, q) <= (1 + ε) min_i d(xi, q) • Tree-based approaches: KD-trees, metric / ball trees, cover trees • Locality sensitive hashing • This paper uses multiple KD-trees with (different) random projections
KD-Trees • Decompose space into axis-aligned rectangles (image from Dan Pelleg)
Random Projections • Multiply the data by a random matrix X with X(i,j) drawn from N(0,1) • Several different justifications: • Johnson-Lindenstrauss (data set that is small compared to its dimensionality) • Compressed sensing (data set that is sparse in some linear basis) • RP-Trees (data set that has small doubling dimension)
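The slides describe indexing the lifted points with several KD-trees built over different random projections. Below is a minimal NumPy/SciPy sketch of that pipeline (a single projection and a single tree) using the hedged reconstruction of the lifting from the earlier slides; the function names (lift_subspace, lift_query) and all constants are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
d, k, n = 20, 4, 1000            # ambient dim, subspace dim, number of subspaces

def lift_subspace(B):
    """Lift a subspace (given by a d x k orthonormal basis B) to a point.
    Z spans the orthogonal complement, so dist^2(q, S) = ||Z^T q||^2."""
    Z = np.linalg.svd(B)[0][:, k:]        # d x (d-k), orthonormal columns
    return (Z @ Z.T).ravel()              # vec(Z Z^T)

def lift_query(q):
    """Lift a query point; Euclidean distance to a lifted subspace then equals
    2 * dist^2(q, S) plus terms that are constant over the database."""
    return -np.outer(q, q).ravel()

# random subspaces, stored both as bases and as lifted points
bases = [np.linalg.qr(rng.standard_normal((d, k)))[0] for _ in range(n)]
lifted = np.array([lift_subspace(B) for B in bases])

# one Gaussian random projection + one KD-tree (the slides describe several trees)
proj_dim = 40
P = rng.standard_normal((d * d, proj_dim)) / np.sqrt(proj_dim)
tree = cKDTree(lifted @ P)

q = rng.standard_normal(d)
_, idx = tree.query(lift_query(q) @ P)    # approximate nearest subspace

# brute-force check: exact squared point-to-subspace distances
exact = np.array([q @ q - (B.T @ q) @ (B.T @ q) for B in bases])
print("ANS candidate:", idx, "  exact nearest subspace:", exact.argmin())
```

Increasing proj_dim improves accuracy at the cost of query time; querying several independently projected trees and keeping the candidate with the smallest exact point-to-subspace distance would follow the multiple-tree scheme the slides mention.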
Results • Two goals • show their method is fast • show nearest subspace is useful • Four experiments • Synthetic Experiments • Image Approximation • Yale Faces • Yale Patches
Questions / Issues • Should random projections be applied before or after the reduction? • Why does the effective distance error go down with the ambient dimensionality? • The reduction tends to make query points far away from the points in the database. Are there better approximate nearest neighbor algorithms in this case?