Lecture 9: Fast Dimension Reduction and Sketching. Topics: the Johnson–Lindenstrauss lemma, the Fast JL Transform, and sparse projections for quicker computations.
Plan
• PS2 due tomorrow, 7pm
• My office hours after class
• Fast Dimension Reduction
• Sketching
• Scribe? Notes due Friday evening
Johnson–Lindenstrauss Lemma
• There is a random linear map $G \in \mathbb{R}^{k \times d}$ with $k = O(\epsilon^{-2}\log(1/\delta))$ such that, for any fixed $x$: $\|Gx\| = (1 \pm \epsilon)\|x\|$ with probability $\ge 1-\delta$
• Time to compute $Gx$: $O(kd)$
• Faster? $O(d \log d)$ time?
• Will show: $O\big(d \log d + \mathrm{poly}(\epsilon^{-1}, \log(1/\delta), \log d)\big)$ time
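As a sanity check, here is a minimal numpy sketch of the dense JL map (the constant 8 in the target dimension and the specific sizes are illustrative assumptions, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d, eps, delta = 1024, 0.25, 0.01
# Target dimension k = O(eps^-2 log(1/delta)); the constant 8 is an arbitrary choice.
k = int(np.ceil(8 / eps**2 * np.log(1 / delta)))

G = rng.standard_normal((k, d)) / np.sqrt(k)  # dense Gaussian JL matrix
x = rng.standard_normal(d)

# Computing G @ x takes O(kd) time; the norm is preserved up to 1 +- eps whp.
ratio = np.linalg.norm(G @ x) / np.linalg.norm(x)
print(ratio)
```

The $O(kd)$ cost of this dense multiply is exactly what the rest of the lecture tries to beat.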
Fast JL Transform
• Computing $Gx$ is costly because $G$ is dense
• Meta-approach: use a sparse matrix $G$?
• Suppose we sample $s$ entries per row: random signs $\pm c$ in $s$ random positions
• Analysis of one row $g$: $Z = \langle g, x \rangle$; want $Z^2 = (1 \pm \epsilon)\|x\|^2$ with good probability
• Expectation of $Z^2$: $\mathbb{E}[Z^2] = c^2 \cdot \tfrac{s}{d}\|x\|^2$, so set the normalization constant $c = \sqrt{d/s}$
• What about the variance?
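A quick empirical check that a sparse row with $s$ random $\pm 1$ entries, scaled by $\sqrt{d/s}$, gives an unbiased estimate of $\|x\|^2$ (a sketch; the sizes and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, s = 256, 8
x = rng.standard_normal(d)

def row_estimate():
    # One sparse row: s random positions with random signs, scaled by sqrt(d/s).
    idx = rng.choice(d, size=s, replace=False)
    signs = rng.choice([-1.0, 1.0], size=s)
    z = np.sqrt(d / s) * np.dot(signs, x[idx])
    return z * z

mean_est = np.mean([row_estimate() for _ in range(5000)])
print(mean_est, np.linalg.norm(x) ** 2)  # the two should be close
```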
Fast JLT: sparse projection
• The variance of $Z^2$ can be large
• Bad case: $x$ is sparse — think $x = (e_1 + e_2)/\sqrt{2}$
• Even for $s \cdot k \approx d$ (each coordinate of $x$ goes somewhere), the two nonzero coordinates collide in a row (bad) with probability only $\sim 1/k$
• We want failure probability exponentially small; for sparse $x$ this would really need $s$ close to $d$
• But, take-away: a sparse $G$ may work if $x$ is "spread around"
• New plan: first "spread $x$ around", then use a sparse projection
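To see why sparse inputs are the bad case, a small experiment (parameters illustrative) comparing the noise of the sparse-row estimator on a spread-out versus a 1-sparse unit vector:

```python
import numpy as np

rng = np.random.default_rng(1)
d, s, k = 256, 4, 64  # s nonzeros per row, k rows, so s*k = d

def estimate_sq_norm(x):
    # Average of k sparse-row estimates of ||x||^2.
    total = 0.0
    for _ in range(k):
        idx = rng.choice(d, size=s, replace=False)
        signs = rng.choice([-1.0, 1.0], size=s)
        z = np.sqrt(d / s) * np.dot(signs, x[idx])
        total += z * z
    return total / k

x_spread = np.ones(d) / np.sqrt(d)         # "spread around" unit vector
x_sparse = np.zeros(d); x_sparse[0] = 1.0  # all mass on one coordinate

std_spread = np.std([estimate_sq_norm(x_spread) for _ in range(200)])
std_sparse = np.std([estimate_sq_norm(x_sparse) for _ in range(200)])
print(std_spread, std_sparse)  # the sparse input is far noisier
```

Both estimators are unbiased, but the sparse input's estimate is dominated by the rare rows that happen to catch its one big coordinate, which is exactly the variance problem above.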
FJLT: construction
The map is $x \mapsto PHDx$, where:
• $D$ = diagonal matrix with random $\pm 1$ on the diagonal
• $H$ = Hadamard matrix (a Fourier transform)
• $HDx$ is a non-trivial rotation of $x$, and can be computed in $O(d \log d)$ time
• $P$ = projection matrix: a sparse matrix as before, of size $k \times d$
[Figure: the pipeline $x \mapsto PHDx$ — diagonal $D$, then Hadamard $H$ (fast Fourier transform) for "spreading around", then the sparse projection $P$]
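A sketch of this pipeline, with $H$ applied via the fast Walsh–Hadamard transform; the sizes are illustrative, and $P$ here samples $k$ random coordinates, scaled so norms are preserved in expectation:

```python
import numpy as np

def fwht(v):
    # Fast Walsh-Hadamard transform, O(d log d); d must be a power of 2.
    v = v.copy()
    d, h = len(v), 1
    while h < d:
        for i in range(0, d, 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(d)  # normalized so H is orthonormal

rng = np.random.default_rng(0)
d, k = 1024, 256
D = rng.choice([-1.0, 1.0], size=d)            # random-sign diagonal
coords = rng.choice(d, size=k, replace=False)  # P: k random coordinates

def fjlt(x):
    y = fwht(D * x)                    # HDx: a non-trivial rotation, O(d log d)
    return np.sqrt(d / k) * y[coords]  # sparse projection, unbiased for ||x||^2

x = rng.standard_normal(d)
ratio = np.linalg.norm(fjlt(x)) / np.linalg.norm(x)
print(ratio)
```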
Spreading around: intuition
• Idea for the Hadamard/Fourier transform:
• "Uncertainty principle": if the original $x$ is sparse, then its transform $Hx$ is dense!
• Though $H$ can "break" $x$'s that are already dense — hence the random signs in $D$
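A tiny demonstration of both points, using the Sylvester construction of $H$ (dimension 8 chosen only for readability): a 1-sparse vector transforms to a fully dense one, while one particular dense vector collapses to a single coordinate.

```python
import numpy as np

d = 8
H = np.array([[1.0]])
while H.shape[0] < d:
    H = np.block([[H, H], [H, -H]])  # Sylvester construction of the Hadamard matrix
H /= np.sqrt(d)                      # orthonormal normalization

e1 = np.zeros(d); e1[0] = 1.0
y = H @ e1
print(y)   # sparse in -> dense out: every entry is +-1/sqrt(d)

ones = np.ones(d) / np.sqrt(d)
z = H @ ones
print(z)   # dense in -> sparse out: H can "break" an already-dense vector
```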
Spreading around: proof
• Suppose $\|x\| = 1$ — without loss of generality, since the map is linear!
• Ideal spreading around: would like $|(HDx)_i| \approx 1/\sqrt{d}$ for all $i$
• Lemma: with probability at least $1-\delta$, for each coordinate $i$: $|(HDx)_i| \le O\big(\sqrt{\log(d/\delta)}\big)/\sqrt{d}$
• Proof: $(HDx)_i = \tfrac{1}{\sqrt{d}}\langle \sigma, x \rangle$, where $\sigma$ is a random $\pm 1$ vector ($D$'s signs times the signs in row $i$ of the unnormalized $H$)
• As mentioned before, $\langle \sigma, x \rangle$ "behaves like" $\langle g, x \rangle$ for Gaussian $g$ (needs proof: at the end of the lecture if time permits)
• Hence $|(HDx)_i| \le O\big(\sqrt{\log(d/\delta)}\big) \cdot \|x\| / \sqrt{d}$ with probability $1 - \delta/d$ per coordinate
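An empirical check of the lemma on the 2-sparse bad case from earlier (the bound's constant, the trial count, and $\delta$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, trials, delta = 1024, 50, 0.02
H = np.array([[1.0]])
while H.shape[0] < d:
    H = np.block([[H, H], [H, -H]])  # Sylvester Hadamard matrix
H /= np.sqrt(d)

x = np.zeros(d)
x[0] = x[1] = 1 / np.sqrt(2)  # sparse unit vector

worst = 0.0
for _ in range(trials):
    Dsigns = rng.choice([-1.0, 1.0], size=d)
    worst = max(worst, np.abs(H @ (Dsigns * x)).max())

bound = np.sqrt(np.log(d / delta) / d)  # sqrt(log(d/delta))/sqrt(d), constant 1 assumed
print(worst, bound)  # every coordinate of HDx stays well under the bound
```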
Why the projection $P$?
• Why aren't we done? Why not just choose the first few coordinates of $HDx$?
• Each coordinate has the same distribution: roughly Gaussian with variance $1/d$
• Issue: the coordinates of $HDx$ are not independent!
• Nevertheless: $\|HDx\| = \|x\|$, since $HD$ is a change of basis (a rotation in $\mathbb{R}^d$)
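A quick check (small $d$ for speed) that $HD$ is an exact rotation, so the norm is preserved even though the coordinates of $HDx$ are dependent:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64
H = np.array([[1.0]])
while H.shape[0] < d:
    H = np.block([[H, H], [H, -H]])  # Sylvester Hadamard matrix
H /= np.sqrt(d)
D = np.diag(rng.choice([-1.0, 1.0], size=d))

M = H @ D
print(np.allclose(M.T @ M, np.eye(d)))  # HD is orthogonal: a change of basis
```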
Projection
• So far: $|(HDx)_i| \le O\big(\sqrt{\log(d/\delta)}\big)/\sqrt{d}$ for each coordinate, with probability $1-\delta$
• Or: $(HDx)_i^2 \le O(\log(d/\delta))/d$ for all $i$ simultaneously, with probability $1-\delta$
• $P$ = projection onto just $k$ random coordinates (scaled by $\sqrt{d/k}$)!
• Proof: standard concentration
• Chernoff: since each sampled term is bounded by $O(\log(d/\delta))/d$, it is enough to sample $O(\epsilon^{-2}\log(1/\delta) \cdot \log(d/\delta))$ terms for a $1 \pm \epsilon$ approximation
• Hence $k = O(\epsilon^{-2}\log(1/\delta)\log(d/\delta))$ suffices
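A sketch of why sampling coordinates of a spread-out vector works, and how the error decays with $k$; here Gaussian coordinates stand in for $HDx$ (an assumption in the spirit of the spreading lemma), and the specific $d$, $k$ values and trial count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4096
# A stand-in for HDx: coordinates of a unit vector behaving roughly like N(0, 1/d).
y = rng.standard_normal(d)
y /= np.linalg.norm(y)

mean_err = {}
for k in (16, 256):
    errs = []
    for _ in range(300):
        coords = rng.choice(d, size=k, replace=False)
        est = (d / k) * np.sum(y[coords] ** 2)  # estimate of ||y||^2 = 1
        errs.append(abs(est - 1.0))
    mean_err[k] = float(np.mean(errs))
print(mean_err)  # error shrinks roughly like 1/sqrt(k)
```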
FJLT: wrap-up
• Obtain: $\|PHDx\| = (1 \pm \epsilon)\|x\|$ with probability $1-\delta$
• Dimension of $PHDx$ is $k = O(\epsilon^{-2}\log(1/\delta)\log(d/\delta))$
• Time: $O(d \log d + k)$
• Dimension not optimal: apply a regular (dense) JL map on $PHDx$ to reduce further to $O(\epsilon^{-2}\log(1/\delta))$
• Final time: $O\big(d \log d + \epsilon^{-4}\log^2(1/\delta)\log(d/\delta)\big)$
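Putting it together: a sketch of the two-stage map, FJLT first and then a dense JL matrix applied to the already-short vector (all dimensions are illustrative choices, not the optimal ones):

```python
import numpy as np

def fwht(v):
    # Fast Walsh-Hadamard transform, O(d log d); len(v) must be a power of 2.
    v = v.copy()
    d, h = len(v), 1
    while h < d:
        for i in range(0, d, 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(d)

rng = np.random.default_rng(0)
d, k_mid, k_final = 1024, 512, 128  # k_mid: FJLT output; k_final: target dimension

D = rng.choice([-1.0, 1.0], size=d)
coords = rng.choice(d, size=k_mid, replace=False)
G = rng.standard_normal((k_final, k_mid)) / np.sqrt(k_final)  # dense JL, now cheap

def two_stage(x):
    y = np.sqrt(d / k_mid) * fwht(D * x)[coords]  # FJLT: O(d log d + k_mid)
    return G @ y                                  # dense JL: O(k_final * k_mid)

x = rng.standard_normal(d)
ratio = np.linalg.norm(two_stage(x)) / np.linalg.norm(x)
print(ratio)
```

The dense stage costs only $O(k_{\text{final}} \cdot k_{\text{mid}})$ because it acts on the reduced vector, which is the point of doing the FJLT first.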