Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. Piotr Indyk, Rajeev Motwani. The 30th Annual ACM Symposium on Theory of Computing (STOC), 1998.
Problems • Nearest neighbor (NN) problem: • Given a set of n points P = {p1, …, pn} in some metric space X, preprocess P so as to efficiently answer queries which require finding the point in P closest to a query point q ∈ X. • Approximate nearest neighbor (ANN) problem: • Find a point p ∈ P that is an ε-approximate nearest neighbor of the query q, in that for all p' ∈ P, d(p,q) ≤ (1+ε)·d(p',q).
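To make the two definitions concrete, here is a minimal Python sketch (not from the paper): an exact nearest neighbor by brute-force linear scan, plus a check of the ε-approximation condition. The function names and the choice of Euclidean distance are illustrative assumptions.

```python
import math

def nearest_neighbor(P, q):
    # Exact NN by linear scan: O(n * d) per query, no preprocessing.
    return min(P, key=lambda p: math.dist(p, q))

def is_eps_approximate_nn(p, P, q, eps):
    # p is an eps-approximate NN of q if d(p, q) <= (1 + eps) * d(p', q) for every p' in P,
    # i.e. p is within a (1 + eps) factor of the true nearest distance.
    best = math.dist(nearest_neighbor(P, q), q)
    return math.dist(p, q) <= (1 + eps) * best

# Tiny usage example with 2-D points.
P = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]
q = (0.9, 0.8)
print(nearest_neighbor(P, q))                             # (1.0, 1.0)
print(is_eps_approximate_nn((1.0, 1.0), P, q, eps=0.1))   # True
```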
Motivation • The nearest neighbor problem is of major importance to a variety of applications, usually involving similarity search. • Data compression • Databases and data mining • Information retrieval • Image and video databases • Machine learning • Pattern recognition • Statistics and data analysis • Curse of dimensionality • The curse of dimensionality is a term coined by Richard Bellman for the problems caused by the exponential increase in volume that comes with adding extra dimensions to a (mathematical) space.
Overview of results and techniques • These results are obtained by reducing ε-NNS to a new problem: point location in equal balls (PLEB).
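For intuition, the PLEB primitive can be read as the following query, shown here only as a brute-force Python sketch with a hypothetical function name; the paper's contribution is a data structure that answers it far faster than a linear scan.

```python
import math

def pleb_query(P, r, q):
    """Point Location in Equal Balls (brute force): given centers P and a common
    radius r, return some p in P whose ball B(p, r) contains q, or None."""
    for p in P:
        if math.dist(p, q) <= r:
            return p
    return None
```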
Content • Nearest neighbor search (NNS) • ε-nearest neighbor search (ε-NNS) • Ring-Cover Trees • Point location in equal balls (PLEB) • ε-Point location in equal balls (ε-PLEB) • Locality-Sensitive Hashing • The Bucketing method • Proposition 1 • Proposition 2 • Random projections • Proposition 3
The Bucketing method • We decompose each ball into a bounded number of grid cells and store the cells in a dictionary (hash table). • The bucketing algorithm works for any ℓp norm.
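A minimal sketch of this idea, under stated assumptions (ℓ2 norm, one common radius r, a grid of side εr/√d, and hypothetical helper names): cover each ball with the grid cells around it, key each stored cell by its integer coordinates in a Python dictionary, and answer a query with a single lookup on the cell containing q. The number of stored cells per ball grows roughly like (1/ε)^d (with further dimension-dependent factors), which is where the exponential dependence on dimension shows up.

```python
import itertools, math

def build_buckets(P, r, eps):
    d = len(P[0])
    side = eps * r / math.sqrt(d)            # cell diagonal = eps * r
    buckets = {}
    for p in P:
        # Enumerate the grid cells in the bounding box of B(p, r) ...
        lo = [math.floor((x - r) / side) for x in p]
        hi = [math.floor((x + r) / side) for x in p]
        for cell in itertools.product(*(range(l, h + 1) for l, h in zip(lo, hi))):
            # ... and keep those whose center lies within (1 + eps/2) * r of p,
            # a superset of the cells that actually intersect the ball.
            center = [(c + 0.5) * side for c in cell]
            if math.dist(center, p) <= (1 + eps / 2) * r:
                buckets.setdefault(cell, p)
    return buckets, side

def bucket_query(buckets, side, q):
    # One dictionary lookup: if q lies in some ball B(p, r), its cell is stored,
    # and any center returned here satisfies d(p, q) <= (1 + eps) * r.
    cell = tuple(math.floor(x / side) for x in q)
    return buckets.get(cell)
```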