expander codes and pseudorandom subspaces of ℝ^N

James R. Lee, University of Washington

[joint with Venkatesan Guruswami (Washington) and Alexander Razborov (IAS/Steklov)]
Classical high-dimensional geometry [Kashin 77, Figiel-Lindenstrauss-Milman 77]: random sections of the cross polytope

For a random subspace X ⊆ ℝ^N with dim(X) = N/2 (e.g. choose X = span{v₁, …, v_{N/2}} where the vᵢ are i.i.d. on the unit sphere), almost surely every x ∈ X satisfies

  ‖x‖₁ ≥ c · N^{1/2} · ‖x‖₂

for a universal constant c > 0. In other words, every x ∈ X has its L2 mass very "spread out". This holds not only for each vᵢ, but for every linear combination.
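Kashin's theorem is existential, but the phenomenon is easy to watch numerically. The following sketch (mine, not from the talk) samples an orthonormal basis of a random subspace of dimension N/2 and inspects the ratio ‖x‖₁/(√N‖x‖₂) on random points of X; sampling only gives evidence for the "every x" statement, of course.

```python
# A numerical illustration (mine, not from the talk) of Kashin's theorem:
# sample an orthonormal basis of a random subspace X of R^N with
# dim(X) = N/2 and inspect ||x||_1 / (sqrt(N) ||x||_2) on random x in X.
import numpy as np

rng = np.random.default_rng(0)
N = 512
B = np.linalg.qr(rng.standard_normal((N, N // 2)))[0]  # orthonormal basis of X

ratios = []
for _ in range(1000):
    x = B @ rng.standard_normal(N // 2)                # random point of X
    ratios.append(np.linalg.norm(x, 1) / (np.sqrt(N) * np.linalg.norm(x, 2)))

# "Spread" means this ratio is bounded below by a universal constant c > 0.
print(f"min over samples: {min(ratios):.3f}")
```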
an existential crisis

This is a prominent example of the (now ubiquitous) use of the probabilistic method in asymptotic convex geometry. Geometric functional analysts face a dilemma we know well: almost every subspace satisfies this property, but we can't pinpoint even one.

[Szarek, ICM'06; Milman, GAFA'01; Johnson-Schechtman, Handbook'01] asked: can we find an explicit subspace on which the L1 and L2 norms are equivalent?

Why do analysts / CSists care about explicit high-dimensional constructions? Related questions about explicit, high-dimensional constructions arose (concurrently) in CS:
- explicit embeddings of L2 into L1 for nearest-neighbor search (Indyk)
- explicit compressed sensing matrices M : ℝ^N → ℝ^n for n ≪ N (DeVore)
- explicit Johnson-Lindenstrauss (dimension reduction) transforms (Ailon-Chazelle)
distortion

For a subspace X ⊆ ℝ^N, we define the distortion of X by

  Δ(X) = max_{0 ≠ x ∈ X} N^{1/2} ‖x‖₂ / ‖x‖₁.

By Cauchy-Schwarz, we always have N^{1/2} ≥ Δ(X) ≥ 1.

Random constructions: a random X ⊆ ℝ^N satisfies:
- dim(X) = (1−ε)N and Δ(X) = O(1) [Kashin 77]
- dim(X) = Ω(N) and Δ(X) ≤ 1+ε [Figiel-Lindenstrauss-Milman 77]

Example (Hadamard): let X = ker(first N/2 rows of Hadamard); then Δ(X) ≈ N^{1/4}.
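Computing Δ(X) exactly is a hard optimization problem, but cheap lower bounds are easy. A small sketch (my own; the function name and the projection heuristic are not from the talk): evaluate the distortion ratio at the projections of the standard basis vectors onto X. A random subspace stays O(1), while any subspace containing a 1-sparse vector is caught with Δ ≥ √N.

```python
# A rough empirical lower bound (my own sketch, not from the talk) on
# Delta(X) = max_{x in X} sqrt(N) ||x||_2 / ||x||_1, using projections
# of standard basis vectors onto X as candidate maximizers.
import numpy as np

def distortion_lower_bound(B):
    """B: (N, k) orthonormal basis of X. Returns a lower bound on Delta(X)."""
    N = B.shape[0]
    best = 1.0
    for i in range(N):
        x = B @ B[i]                      # projection of e_i onto X = B B^T e_i
        if np.linalg.norm(x) > 1e-12:
            best = max(best, np.sqrt(N) * np.linalg.norm(x) / np.linalg.norm(x, 1))
    return best

rng = np.random.default_rng(1)
N = 256
B = np.linalg.qr(rng.standard_normal((N, N // 2)))[0]
print("random X: Delta >=", round(distortion_lower_bound(B), 2))   # stays O(1)

# A subspace containing a 1-sparse vector has distortion ~ sqrt(N):
B_bad = np.linalg.qr(np.hstack([np.eye(N)[:, :1],
                                rng.standard_normal((N, N // 2 - 1))]))[0]
print("bad X:    Delta >=", round(distortion_lower_bound(B_bad), 2))
```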
distortion, dimension, applications

View as an embedding of ℓ₂ into ℓ₁:
- O(1) distortion, Ω(N) dimension: compressive sensing, coding in characteristic zero
- 1+ε distortion, small blowup in dimension: geometric functional analysis, nearest-neighbor search (Milman believes an explicit construction is impossible)

Compressive sensing: want a map A : ℝ^N → ℝ^n with n ≪ N, such that any r-sparse signal x ∈ ℝ^N (vector with at most r non-zero entries) can be uniquely and efficiently recovered from Ax.

Relation to distortion [Kashin-Temlyakov]: can uniquely and efficiently recover any r-sparse signal for r ≤ N/Δ(ker(A))². (Even tolerates additional "noise" in the "non-sparse" parts of the signal.)
sensing and distortion

Want a map A : ℝ^N → ℝ^n such that any r-sparse signal x ∈ ℝ^N (vector with at most r non-zero entries) can be uniquely and efficiently recovered from Ax.

Want to solve: (P0) given compressed signal y, minimize ‖x‖₀ subject to Ax = y.
Highly non-convex optimization problem; NP-hard for general A.

Basis Pursuit: (P1) given compressed signal y, minimize ‖x‖₁ subject to Ax = y.
Can use linear programming! [Lots of work has been done here: Donoho et al.; Candes-Tao-Romberg; etc.]

[KT07]: If y = Av and v has at most N/[2Δ(ker(A))]² non-zero coordinates, then (P0) and (P1) give the same answer. Let's prove this.
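Before the proof, a minimal runnable sketch of Basis Pursuit as a linear program, via the standard split x = u − w with u, w ≥ 0. It uses scipy's linprog; the sizes N, n, r are illustrative choices of mine, not from the talk.

```python
# Basis Pursuit (P1) as a linear program: min sum(u)+sum(w) s.t. A(u-w) = y,
# u, w >= 0, so that u - w = x and sum(u)+sum(w) = ||x||_1 at the optimum.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
N, n, r = 200, 100, 10
A = rng.choice([-1.0, 1.0], size=(n, N))                      # random sign matrix
v = np.zeros(N)
v[rng.choice(N, r, replace=False)] = rng.standard_normal(r)   # r-sparse signal
y = A @ v                                                     # compressed signal

c = np.ones(2 * N)                                            # objective = ||x||_1
A_eq = np.hstack([A, -A])                                     # A(u - w) = y
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
x_hat = res.x[:N] - res.x[N:]
print("recovered exactly:", np.allclose(x_hat, v, atol=1e-6))
```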
sensing and distortion

Proof of [KT07]: Write Δ = Δ(ker(A)). For x ∈ ℝ^N and S ⊆ [N], let x_S be x restricted to the coordinates in S. Suppose x ∈ ker(A) and |S| ≤ N/(2Δ)². Then by Cauchy-Schwarz and the definition of distortion,

  ‖x_S‖₁ ≤ |S|^{1/2} ‖x‖₂ ≤ |S|^{1/2} · (Δ/N^{1/2}) ‖x‖₁ ≤ ‖x‖₁/2,

so at least half of the ℓ₁ mass of every kernel vector lies outside S. Hence if v is supported on S and x ∈ ker(A), then ‖v + x‖₁ ≥ ‖v‖₁ − ‖x_S‖₁ + ‖x_{[N]∖S}‖₁ ≥ ‖v‖₁: v itself minimizes (P1), so (P0) and (P1) give the same answer.
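As a sanity check on the key inequality (mine, not part of the talk), one can sample kernel vectors of a random sign matrix and verify that no set of N/(2Δ)² coordinates carries half the ℓ₁ mass. The bound Δ = 3 below is an assumed illustrative value, not a computed distortion.

```python
# Sanity check of the key step: for x in ker(A), the l1 mass on any set of
# size N/(2*Delta)^2 should be at most ||x||_1 / 2. For a fixed x the worst
# set S is the set of its |S| largest coordinates, so we check exactly that.
import numpy as np

rng = np.random.default_rng(3)
N, n = 200, 100
A = rng.choice([-1.0, 1.0], size=(n, N))
K = np.linalg.svd(A)[2][n:].T            # (N, N-n) orthonormal basis of ker(A)

Delta = 3.0                              # assumed distortion bound, for illustration
s = int(N / (2 * Delta) ** 2)            # sparsity level from the [KT07] bound
worst = 0.0
for _ in range(2000):
    x = K @ rng.standard_normal(N - n)
    top = np.sort(np.abs(x))[::-1][:s].sum()   # largest possible ||x_S||_1, |S| = s
    worst = max(worst, top / np.abs(x).sum())
print(f"max fraction of l1 mass on {s} coords: {worst:.3f} (< 0.5 expected)")
```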
previous results: explicit

Sub-linear dimension:
- Rudin'60 (and later LLR'94) achieves dim(X) ≈ N^{1/2} and Δ(X) ≤ 3 (X = span{4-wise independent vectors})
- Indyk'00 achieves dim(X) ≈ exp((log N)^{1/2}) and Δ(X) = 1+o(1)
- Indyk'07 achieves dim(X) ≈ N/2^{(log log N)²} and Δ(X) = 1+o(1)

Our result: We construct an explicit subspace X ⊆ ℝ^N with dim(X) = (1−o(1))N and Δ(X) = N^{o(1)}; quantitatively, dim(X) ≥ (1−η)N with Δ(X) ≤ (1/η)^{O(log log N)}. In our constructions, X = ker(explicit sign matrix).
previous results: derandomization

Partial derandomization: let A_{k,N} be a random k × N sign matrix (entries are ±1 i.i.d.). Kashin's technique shows that almost surely, Δ(ker(A_{k,N})) = O(1) for k = N/2 (and dim(ker(A_{k,N})) ≥ N − k).
- Can reduce to O(N log² N) random bits [Indyk 00]
- Can reduce to O(N log N) random bits [Artstein-Milman 06]
- Can reduce to O(N) random bits [Lovett-Sodin 07]

Our result: With N^{o(1)} random bits, we get Δ(X) ≤ polylog(N). With εN random bits for any ε > 0, we get Δ(X) = O(1). [Guruswami-L-Wigderson]
the expander code construction

G = ([N], [n], E) a bipartite graph with n ≪ N, d-right-regular, and L ⊆ ℝ^d a subspace. Define

  X(G, L) = { x ∈ ℝ^N : x_{Γ(j)} ∈ L for every j ∈ [n] },

where x_S ∈ ℝ^{|S|} is x restricted to the coordinates in S ⊆ [N] and Γ(j) is the neighborhood of j.

Resembles the construction of Gallager, Tanner (L is the "inner" code). Following Tanner and Sipser-Spielman, we will show that if L is "good" and G is an "expander" then X(G, L) is even better (in some parameters).
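Concretely, if the inner subspace is presented as L = ker(B), then X(G, L) is the kernel of a tall matrix that applies B to each right-vertex neighborhood. A toy sketch (mine, not from the talk; the disjoint "graph" below has no expansion and is only meant to exercise the definition):

```python
# X(G, L) = { x : x restricted to Gamma(j) lies in L, for every right vertex j },
# assuming L is given as ker(B) and G as a list of right-vertex neighborhoods.
import numpy as np

def expander_code_subspace(neighborhoods, B, N):
    """neighborhoods: list, over right vertices j, of the d coordinates Gamma(j).
    B: (m, d) matrix with L = ker(B). Returns an orthonormal basis of X(G, L)."""
    m, d = B.shape
    rows = []
    for gamma in neighborhoods:
        R = np.zeros((m, N))
        R[:, gamma] = B                  # enforces B @ x[gamma] = 0
        rows.append(R)
    M = np.vstack(rows)
    _, s, Vh = np.linalg.svd(M)
    rank = int(np.sum(s > 1e-10))
    return Vh[rank:].T                   # kernel of M = X(G, L)

# Toy example: N = 12 coordinates, n = 4 right vertices of degree d = 3,
# inner subspace L = ker([1, 1, 1]) (vectors summing to zero).
nbrs = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
X = expander_code_subspace(nbrs, np.ones((1, 3)), 12)
print("dim X(G, L) =", X.shape[1])       # 12 coordinates - 4 constraints = 8
```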
some quantitative matters

Say that a subspace L ⊆ ℝ^d is (t, ε)-spread if every x ∈ L satisfies ‖x_{[d]∖S}‖₂ ≥ ε ‖x‖₂ for every S ⊆ [d] with |S| ≤ t.

If L is (Ω(d), Ω(1))-spread, then Δ(L) = O(1). Conversely, if L has Δ(L) = O(1), then L is (Ω(d), Ω(1))-spread.

For a bipartite graph G = ([N], [n], E), the expansion profile of G is

  Λ_G(q) = min { |Γ(S)| : S ⊆ [N], |S| = q }.

(This is expansion from left to right.)
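For a fixed x, the worst set S of size t is the set of its t largest coordinates, so spreadness can be probed by sampling x ∈ L and measuring the ℓ₂ mass left after deleting the top t entries. A heuristic estimator (my own sketch; the sampled minimum can overestimate the true ε):

```python
# Heuristic estimate of the (t, eps)-spread constant of L = column span of B:
# for each sampled x, drop its t largest coordinates and measure what remains.
import numpy as np

def spread_estimate(B, t, trials=2000, rng=np.random.default_rng(4)):
    """B: (d, k) orthonormal basis of L. Estimates min_x ||x off top-t||_2 / ||x||_2."""
    worst = 1.0
    for _ in range(trials):
        x = B @ rng.standard_normal(B.shape[1])
        tail = np.sort(np.abs(x))[:-t] if t > 0 else np.abs(x)  # drop top-t entries
        worst = min(worst, np.linalg.norm(tail) / np.linalg.norm(x))
    return worst

d = 128
B = np.linalg.qr(np.random.default_rng(5).standard_normal((d, d // 2)))[0]
print("estimated eps at t = d/8:", round(spread_estimate(B, d // 8), 3))
```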
spread-boosting theorem

Setup: G = ([N], [n], E) a bipartite graph, d-right-regular with left degree ≤ D; L ⊆ ℝ^d a (t, ε)-spread subspace.

Conclusion: If X(G, L) is (T, δ)-spread, then X(G, L) is (roughly) (t·T, ε·δ)-spread, with losses governed by D and the expansion profile of G.

How to apply: Assume D = O(1) and Λ_G(q) = Ω(q) for all q ∈ [N] (impossible to achieve). Then

X(G, L) is (½, 1)-spread ⇒ (t, ε)-spread ⇒ (t², ε²)-spread ⇒ … ⇒ (Ω(N), ε^{log_t N})-spread ⇒ Δ(X(G, L)) ≲ (1/ε)^{log_t N}.
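The chain above is just repeated multiplication of the two parameters. A few lines of arithmetic (with the idealized expansion assumption, all constants dropped, and toy values of N, t, ε chosen by me) show how quickly Ω(N)-sized sets are reached and what distortion bound falls out:

```python
# Worked arithmetic for the "how to apply" chain: each application of the
# theorem sends (T, delta) -> (t*T, eps*delta) until T reaches N, giving the
# distortion bound (1/eps)^(log_t N). Idealized; constants dropped.
import math

N, t, eps = 2**30, 2**10, 0.5
T, delta, steps = t, eps, 1
while T < N:
    T, delta, steps = T * t, delta * eps, steps + 1
print(f"steps = {steps}, final spread constant = {delta:.2e}")
print(f"distortion bound ~ (1/eps)^(log_t N) = "
      f"{(1 / eps) ** math.ceil(math.log(N, t)):.1f}")
```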
spread-boosting theorem: proof idea

A set S carrying much of the ℓ₂ mass should "leak" ℓ₂ mass outside (since L is spreading and G is an expander), unless most of the mass in S is concentrated on a small subset B, which is impossible by assumption.
Let H be a (non-bipartite) d-regular graph with second eigenvalue λ = O(d^{1/2}) (explicit constructions exist by Margulis, Lubotzky-Phillips-Sarnak). Let G be the edge-vertex incidence graph of H (an edge is connected to its two endpoints): the left side is the edges of H, the right side is the nodes of H.

Alon-Chung: every set of q edges of H (for q not too large) touches at least Ω(q/λ) = Ω(q/d^{1/2}) vertices, so Λ_G(q) = Ω(q/d^{1/2}).

A random subspace L ⊆ ℝ^d is (Ω(d), Ω(1))-spread. Letting d = N^{1/4}, the spread-boosting theorem enlarges the spread of X(G, L) step by step (the d^{1/2} lost to the expander is paid for by the Ω(d) spread of L). It takes O(log log N) steps to reach Ω(N)-sized sets ⇒ poly(log N) distortion.
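A quick empirical illustration (mine, not from the talk) of the eigenvalue hypothesis: a random d-regular multigraph, built from d/2 random permutations and their inverses, already has second eigenvalue close to 2√(d−1) = O(√d).

```python
# Second eigenvalue of a random d-regular (multi)graph vs 2*sqrt(d-1).
# Self-loops / parallel edges may occur; this is only a rough demo.
import numpy as np

rng = np.random.default_rng(6)
n, d = 500, 8
A = np.zeros((n, n))
for _ in range(d // 2):
    P = np.eye(n)[rng.permutation(n)]   # random permutation matrix
    A += P + P.T                        # each permutation contributes degree 2

eigs = np.sort(np.abs(np.linalg.eigvalsh(A)))[::-1]   # |eigenvalues|, descending
print(f"lambda_2 = {eigs[1]:.2f}  vs  2*sqrt(d-1) = {2 * np.sqrt(d - 1):.2f}")
```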
explicit construction: ingredients for L

Spectral Lemma: Let A be any k × d matrix whose columns a₁, …, a_d ∈ ℝ^k are unit vectors such that |⟨aᵢ, aⱼ⟩| ≤ λ for every i ≠ j. Then ker(A) is (t, Ω(1))-spread for t = Ω(1/λ) (when d = O(k)).

+ Kerdock codes (aka Mutually Unbiased Bases) [Kerdock'72, Cameron-Seidel'73]

⇒ (Ω(d^{1/2}), Ω(1))-spread subspaces of dimension (1−ε)d for every ε > 0.
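Kerdock codes are beyond a few lines, but the Spectral Lemma's hypothesis is easy to exhibit with a simpler stand-in (mine, not the talk's construction): the two mutually unbiased bases {identity, normalized Hadamard} give a k × 2k matrix with coherence exactly 1/√k.

```python
# Columns of A = [I | H/sqrt(k)] are unit vectors with pairwise inner
# products bounded by lambda = 1/sqrt(k) -- the Spectral Lemma hypothesis.
import numpy as np
from scipy.linalg import hadamard

k = 64
A = np.hstack([np.eye(k), hadamard(k) / np.sqrt(k)])   # k x 2k, unit columns
offdiag = np.abs(A.T @ A - np.eye(2 * k))              # off-diagonal inner products
print("coherence lambda =", offdiag.max(), " vs 1/sqrt(k) =", 1 / np.sqrt(k))
# ker(A) has dimension d - k = k; by the lemma it is spread on sets of
# size ~ 1/lambda = sqrt(k), i.e. ~ sqrt(d) up to constants.
```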
boosting L with sum-product expanders

Kerdock + Spectral Lemma gives (Ω(d^{1/2}), Ω(1))-spread subspaces of dimension (1−ε)d for every ε > 0.

Problem: If G = Ramanujan construction and L = Kerdock, the spread-boosting theorem gives nothing. (Ramanujan loses d^{1/2} and Kerdock gains only d^{1/2}.)

Solution: Produce L′ = X(G, L) where L = Kerdock and G = sum-product expander.

Sum-product theorems [Bourgain-Katz-Tao, …]: For A ⊆ F_p with |A| ≤ p^{0.99}, we have max(|A+A|, |A·A|) ≥ |A|^{1+c} for some absolute constant c > 0.
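A tiny numerical illustration of the sum-product phenomenon (mine, not from the talk): for a random smallish A ⊂ F_p, both A+A and A·A are much larger than A.

```python
# max(|A+A|, |A*A|) for a random 50-element subset of F_p, p = 10007.
import itertools, math, random

p = 10007
A = random.Random(7).sample(range(1, p), 50)
sums = {(a + b) % p for a, b in itertools.product(A, repeat=2)}
prods = {(a * b) % p for a, b in itertools.product(A, repeat=2)}
c = math.log(max(len(sums), len(prods)), len(A))
print(f"|A| = {len(A)}, |A+A| = {len(sums)}, |A*A| = {len(prods)}")
print(f"max(|A+A|, |A*A|) ~ |A|^{c:.2f}")
```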
Using [Barak-Impagliazzo-Wigderson / BKSSW] and the spread-boosting theorem, L′ is (d^{1/2+c}, Ω(1))-spread for some c > 0.
Now we can plug L′ into G = Ramanujan and get non-trivial boosting. (almost done…)
some open questions

Improve the current bounds:
- A first target would be O(1) distortion with sub-linear randomness.
- Improve the dependence on the co-dimension (important for compressed sensing): if dim(X) ≥ (1−η)N, we get distortion (1/η)^{O(log log N)}. One could hope for a polynomial dependence on 1/η.

Stronger pseudorandom properties: the Restricted Isometry Property.
- [T. Tao's blog] Find an explicit collection of unit vectors v₁, v₂, …, v_N ∈ ℝ^n with N ≫ n so that every small enough sub-collection is "nearly orthogonal."

Breaking the diameter bound:
- Show that the kernel of a random {0,1} matrix with only 100 ones per row has small distortion. Or prove that sparse matrices cannot work.
some open questions

Refuting random subspaces with high distortion:
- Give efficiently computable certificates for Δ(X) being small, or for the Restricted Isometry Property, which exist almost surely for random X ⊆ ℝ^N.

Linear-time expander decoding?
- Are there recovery schemes that run faster than Basis Pursuit?