On the Impossibility of Dimension Reduction for Doubling Subsets of $L_p$ • Yair Bartal, Lee-Ad Gottlieb, Ofer Neiman
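Embedding and Distortion • $L_p$ spaces: $L_p^k$ is the metric space $(\mathbb{R}^k, \|\cdot\|_p)$, where $\|x\|_p = \left(\sum_{i=1}^{k} |x_i|^p\right)^{1/p}$ • Let $(X,d)$ be a finite metric space • A map $f: X \to L_p^k$ is called an embedding • The embedding is non-expansive and has distortion $D$ if for all $x,y \in X$: $\frac{d(x,y)}{D} \le \|f(x)-f(y)\|_p \le d(x,y)$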
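The definition of distortion above can be checked numerically on small examples. The following is a minimal sketch, not taken from the talk: the function names `lp_dist` and `distortion` are illustrative, and the map `f` is assumed to be injective so that no ratio is zero. It computes the product of the worst expansion and the worst contraction, which equals $D$ once the map is rescaled to be non-expansive.

```python
import itertools
import math

def lp_dist(x, y, p):
    """Distance between two points of L_p^k (p may be math.inf)."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

def distortion(points, d, f, p):
    """Worst expansion times worst contraction of f over all pairs.

    Assumes f is injective, so every ratio below is positive.
    """
    expansion = 0.0
    contraction = 0.0
    for x, y in itertools.combinations(points, 2):
        ratio = lp_dist(f(x), f(y), p) / d(x, y)
        expansion = max(expansion, ratio)
        contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction

# Example: the 4-cycle with its shortest-path metric, mapped to the
# corners of the unit square.
cycle = [0, 1, 2, 3]
corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}

def cycle_metric(x, y):
    return min((x - y) % 4, (y - x) % 4)

def to_square(x):
    return corners[x]
```

On this example the map is an isometry into $L_1^2$, while into $L_2^2$ the two diagonals shrink from 2 to $\sqrt{2}$, so the distortion is $\sqrt{2}$.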
JL Lemma • Lemma: Any $n$ points in $L_2$ can be embedded into $L_2^k$, $k = O((\log n)/\varepsilon^2)$, with $1+\varepsilon$ distortion • Extremely useful for many applications: • Machine learning • Compressive sensing • Nearest Neighbor search • Many others… • Limitations: specific to $L_2$, dimension depends on $n$ • There are lower bounds for dimension reduction in $L_1$, $L_\infty$
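One standard way to realize the JL Lemma is a random Gaussian projection. The sketch below is an illustration under assumptions, not the construction used in the talk: the names `jl_dimension` and `jl_project` are hypothetical, and the constant `c=8.0` in the target dimension is a placeholder for the unspecified constant hidden in the $O(\cdot)$.

```python
import math
import random

def jl_dimension(n, eps, c=8.0):
    """Target dimension k = ceil(c * ln(n) / eps^2); c is an assumed constant."""
    return math.ceil(c * math.log(n) / eps ** 2)

def jl_project(points, k, seed=0):
    """Project points of R^dim to R^k with a scaled Gaussian matrix."""
    rng = random.Random(seed)
    dim = len(points[0])
    # k x dim matrix with independent N(0, 1/k) entries
    matrix = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(dim)]
              for _ in range(k)]
    return [[sum(row[i] * pt[i] for i in range(dim)) for row in matrix]
            for pt in points]
```

The map is linear and oblivious to the data, which is why the dimension depends only on $n$ and $\varepsilon$; it is also why the approach is tied to $L_2$.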
Lower bounds on Dimension Reduction • For general $n$-point sets in $L_p$, $\Omega(\log_D n)$ dimensions are required for distortion $D$ (volume argument) • BC’03 (and also LN’04, ACNN’11, R’12) showed strong impossibility results in $L_1$ • The dimension must be $n^{\Omega(1/D^2)}$ for distortion $D$
Doubling Dimension • Doubling constant: The minimal $\lambda$ so that every ball of radius $2r$ can be covered by $\lambda$ balls of radius $r$ • Doubling dimension: $\log_2 \lambda$ • A measure for dimensionality of a metric space • Generalizes the dimension for normed spaces: $L_p^k$ has doubling dimension $\Theta(k)$ • The volume argument holds only for metrics with high doubling dimension
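For a finite metric space the doubling constant can be bounded from above by brute force. The sketch below is only an estimate under stated assumptions: the name `doubling_constant_upper_bound` is illustrative, the covering is greedy rather than optimal (so it can overshoot the minimal $\lambda$), and only radii equal to a pairwise distance or half of one are tried.

```python
def doubling_constant_upper_bound(points, d):
    """Greedy upper bound on the doubling constant of a finite metric.

    For each center and candidate radius r, the ball of radius 2r is
    covered greedily by r-balls centered at its own points.
    """
    dists = {d(x, y) for x in points for y in points if x != y}
    radii = sorted(dists | {t / 2 for t in dists})
    worst = 1
    for center in points:
        for r in radii:
            uncovered = [y for y in points if d(center, y) <= 2 * r]
            count = 0
            while uncovered:
                c = uncovered[0]  # next uncovered point becomes a center
                uncovered = [y for y in uncovered if d(c, y) > r]
                count += 1
            worst = max(worst, count)
    return worst

# Example: eight equally spaced points on a line.
line = list(range(8))

def line_metric(x, y):
    return abs(x - y)
```

On the eight-point line the greedy bound is 4, attained at $r = 1.5$: the ball of radius 3 around the point 3 holds seven points, and each greedy $r$-ball covers two of them.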
Overcoming the Lower Bounds? • One could hope for an analogous version of the JL Lemma for doubling subsets • Question: Does every set of points in $L_2$ of constant doubling dimension embed into a constant-dimensional space with constant distortion? • More ambitiously: Can any subset of $L_2$ with doubling constant $\lambda$ be embedded into $L_2^k$, $k = O((\log \lambda)/\varepsilon^2)$, with $1+\varepsilon$ distortion?
Our Result • Such a dimension reduction is impossible in the $L_p$ spaces with $p>2$ • Thm: For any $p>2$ there is a constant $c$, such that for any $n$, there is a subset $A$ of $L_p$ of size $n$ with doubling constant $O(1)$, and any embedding of $A$ into $L_p^k$ with distortion at most $D$ satisfies $D \ge \Omega\left(\left(\frac{c \log n}{k}\right)^{\frac{1}{2}-\frac{1}{p}}\right)$ • Note: any sub-logarithmic dimension requires non-constant distortion • We also show a similar bound for embedding from $L_p$ into $L_q$, for all $q \ne 2$ • Lafforgue and Naor concurrently proved this using analytic tools, and their counterexample is based on the Heisenberg group
Implications • Rules out a class of algorithms for NN-search, clustering, routing etc. • The first non-trivial result on non-linear dimension reduction for $L_p$ with $p \ne 1, 2, \infty$ • Comment: For $p=1$, there is a stronger lower bound for doubling subsets: the dimension of any embedding with distortion $D$ (into $L_1$) must be at least $n^{\Omega(1/D^2)}$ (LMN’05)
The Laakso Graph • A recursive graph, $G_{i+1}$ is obtained from $G_i$ by replacing every edge with a copy of $G_1$ • A series-parallel graph • Has doubling constant 6 [Figure: the graphs $G_0$, $G_1$, $G_2$]
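The recursive construction can be sketched in a few lines. This sketch assumes the usual gadget for $G_1$, matching the vertex names used on the later slides: an edge $(a,b)$ becomes the six edges $a$-$s$, $s$-$u$, $s$-$v$, $u$-$t$, $v$-$t$, $t$-$b$. The function names and the integer vertex labels are illustrative.

```python
import itertools
from collections import deque

def laakso(level):
    """Edge list of G_level; G_0 is the single edge (0, 1)."""
    fresh = itertools.count(2)  # labels for newly created vertices
    edges = [(0, 1)]
    for _ in range(level):
        new_edges = []
        for a, b in edges:
            s, u, v, t = (next(fresh) for _ in range(4))
            new_edges += [(a, s), (s, u), (s, v), (u, t), (v, t), (t, b)]
        edges = new_edges
    return edges

def distance(edges, src, dst):
    """Shortest-path distance with unit edge lengths (breadth-first search)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist[dst]
```

Each level multiplies the number of edges by 6 and the distance between the two original endpoints by 4, so $G_i$ has $6^i$ edges and the number of levels is logarithmic in the number of vertices.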
Simple Case: p=∞ • The Laakso graph lies in high-dimensional $L_\infty$ • Assume w.l.o.g. that there is a non-expansive embedding $f$ with distortion $D$ into $L_\infty^k$ • Proof idea: • Follow the recursive construction • At each step, find an edge whose $L_2$ stretch is increased by some value, compared to the stretch of its parent edge • When stretch(u,v) > k, we will have a contradiction, as non-expansiveness bounds each coordinate: $\mathrm{stretch}(u,v) = \sum_{j=1}^{k} \left(\frac{|f_j(u)-f_j(v)|}{d(u,v)}\right)^2 \le k$
Simple Case: p=∞ • Consider a single iteration • The pair a,b is an edge of the previous iteration • Let $f_j$ be the j-th coordinate • There is a natural embedding that does not increase stretch... • But then u,v may be distorted [Figure: vertices a, s, u, v, t, b, with $f_j(a)$ and $f_j(b)$ marked]
Simple Case: p=∞ • For simplicity (and w.l.o.g.) assume • $f_j(s) = f_j(a) + (f_j(b)-f_j(a))/4$ • $f_j(t) = f_j(a) + 3(f_j(b)-f_j(a))/4$ • $f_j(v) = f_j(a) + (f_j(b)-f_j(a))/2$ • Let $\Delta_j(u)$ be the difference between $f_j(u)$ and $f_j(v)$ • The distortion $D$ requirement imposes that for some $j$, $\Delta_j(u) > 1/D$ (normalizing so that $d(u,v)=1$) [Figure: vertices a, s, u, v, t, b, with $f_j(a)$, $f_j(b)$ and the offset $\Delta_j(u)$ marked]
Simple Case: p=∞ • The stretch of u,s will increase due to the j-th coordinate • But may decrease due to other coordinates... • Need to prove that for one of the pairs {u,s}, {u,t}, the total $L_2$ stretch increases by at least $1/D^2$ • Compared to the stretch of a,b [Figure: the same vertices in coordinate $j$, where u is offset by $\Delta_j(u)$, and in another coordinate $h$, where u is offset by $-\Delta_h(u)$]
Simple Case: p=∞ • Observe that in the j-th coordinate: • If the distance between u,s increases by $\Delta_j(u)$, • Then the distance between u,t decreases by $\Delta_j(u)$ (and vice versa) • Denote by $x$ the stretch of a,b in coordinate $j$ • The average of the $L_2$ stretch of {u,s} and {u,t} (in the j-th coordinate alone) is: $\frac{(x+\Delta_j(u))^2 + (x-\Delta_j(u))^2}{2} = x^2 + \Delta_j(u)^2$ [Figure: vertices a, s, u, v, t, b, with $f_j(a)$, $f_j(b)$ and the offset $\Delta_j(u)$ marked]
Simple Case: p=∞ • For one of the pairs {u,s}, {u,t}, the total $L_2$ stretch (over all coordinates) increases by at least $1/D^2$ • Continue with this edge • The number of iterations must be at most $kD^2$ (otherwise the stretch will be greater than $k$) • But # of iterations ≈ $\log n$ • Finally, $D \ge \Omega\left(\sqrt{(\log n)/k}\right)$ [Figure: vertices a, s, u, v, t, b]
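The counting argument on this slide can be written out as a short derivation. This is a sketch under assumptions: constants that depend on how edge lengths are normalized are suppressed, $e_m$ denotes the edge chosen at level $m$ (a symbol introduced here for illustration), and the stretch is the sum over coordinates of the squared per-coordinate stretch.

```latex
\begin{align*}
% One of the two pairs is at least as large as their average,
% and some coordinate j has \Delta_j(u) > 1/D:
\max\{\mathrm{stretch}(u,s),\, \mathrm{stretch}(u,t)\}
  &\ge \mathrm{stretch}(a,b) + \sum_{j=1}^{k} \Delta_j(u)^2
   \ge \mathrm{stretch}(a,b) + \frac{1}{D^2}, \\
% Iterating over m levels of the construction:
\mathrm{stretch}(e_m) &\ge \frac{m}{D^2}, \\
% Non-expansiveness in L_\infty^k caps the stretch at k:
\frac{m}{D^2} \le k
  &\;\Longrightarrow\; m \le k D^2, \\
% The graph has about \log n levels:
m = \Theta(\log n)
  &\;\Longrightarrow\; D \ge \Omega\!\left(\sqrt{\frac{\log n}{k}}\right).
\end{align*}
```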
Going Beyond Infinity • For $p<\infty$, we cannot use the Laakso graph • Requires high distortion to embed it into $L_p$ • Instead, we build an instance in $L_p$, inspired by the Laakso graph • The new points u,v will use a new dimension • Parameter $\varepsilon$ determines the (scaled) u,v distance [Figure: vertices a, s, t, b on a line, with u and v placed at distance $\varepsilon$ in a new dimension]
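Going Beyond Infinity • Problem: the u,s distance is now larger than 1, roughly $1+\varepsilon^p$ • Causes a loss of ≈ $\varepsilon^p$ in the stretch of each level • Since u,v are at distance $\varepsilon$, the increase to the stretch is now only $(\varepsilon/D)^2$ • When $p>2$, there is a choice of $\varepsilon$ for which the increase overcomes the loss [Figure: vertices a, s, t, b on a line, with u and v placed at distance $\varepsilon$ in a new dimension]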
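The trade-off between the gain and the loss can be made explicit. The derivation below is a sketch: the constant $c$ in the loss term and the specific choice of $\varepsilon$ are assumptions made for illustration, since the slide gives the loss only as roughly $\varepsilon^p$.

```latex
\begin{align*}
% Net change in stretch per level: gain minus loss
\left(\frac{\varepsilon}{D}\right)^2 - c\,\varepsilon^{p} > 0
  &\iff \varepsilon^{\,p-2} < \frac{1}{c\,D^{2}}, \\
% For p > 2 the left side tends to 0 with \varepsilon, so one may take
\varepsilon = \left(2 c D^{2}\right)^{-1/(p-2)}
  &\;\Longrightarrow\;
  \left(\frac{\varepsilon}{D}\right)^2 - c\,\varepsilon^{p}
  = \frac{\varepsilon^{2}}{2 D^{2}} > 0.
\end{align*}
```

For $p \le 2$ and $\varepsilon \le 1$ the quantity $\varepsilon^{p-2}$ is at least 1, so shrinking $\varepsilon$ cannot make the gain dominate, which is consistent with the restriction to $p>2$.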
Conclusion • We show a strong lower bound against dimension reduction for doubling subsets of $L_p$, for any $p>2$ • Can our techniques be extended to $1<p<2$? • The u,s distance when $p<2$ is quite large, ≈ $1+(p-1)\varepsilon^2$, so a different approach is required • General doubling metrics embed to $L_p$ with distortion $O(\log^{1/p} n)$ (for $p \ge 2$) • Can this distortion bound be obtained in constant dimension?