110 likes | 124 Views
Learn about Fast Dimension Reduction and Sketching in Lecture 9. Understand the Johnson Lindenstrauss Lemma for quicker computations. Explore the concept of Fast JL Transform and the use of sparse projections. Discover how to optimize dimension reductions effectively.
E N D
Plan • PS2 due tomorrow, 7pm • My office hours after class • Fast Dimension Reduction • Sketching • Scriber? • Due on Fri eve
Johnson Lindenstrauss Lemma • with probability • for • Time to compute : • Faster? • time ? • Will show: time
Fast JL Transform • Costly because is dense • Meta-approach: use matrix ? • Suppose sample entries/row • Analysis of one row: • s.t. with probability • Expectation of : • What about variance? Set normalization constant
Fast JLT: sparse projection • Variance of can be large • Bad case: is sparse • think: • Even for (each coordinate of goes somewhere) • two coordinates collide (bad) with probability ~ • want exponential in failure probability • really would need • But, take away: may work if is “spread around” • New plan: • “spread around” • use sparse
FJLT: construction • = matrix with random on diagonal • = Hadamardmatrix (Fourier transform) • A non-trivial rotation • can be computed in time • = projection matrix: sparse matrix as before, with size , with “spreading around” Projection: sparse matrix Hadamard (Fast Fourier Transform) Diagonal
Spreading around: intuition • Idea for Hadamard/Fourier Transform: • “Uncertainty principle”: if the original is sparse, then the transform is dense! • Though can “break” ’s that are already dense “spreading around” Projection: sparse matrix Hadamard (Fast Fourier Transform) Diagonal composed of
Spreading around: proof • Suppose • Without loss of generality since the map is linear! • Ideal spreading around: • would like , and • for all • Lemma: with probability at least , for each coordinate • Proof: • where is a random vector, times ! • as mentioned before, “behaves like” , for Gaussian (needs proof: at the end of the lecture if time permits) • Hence
Why projection ? • Why aren’t we done? • choose first few coordinates of ? • each has same distribution: • Roughly gaussian • Issue: • are not independent! • Nevertheless: • since is a change of basis (rotation in )
Projection • So far: • with probability • Or: with probability • = projection onto just random coordinates! • Proof: standard concentration • Chernoff: enough to sample terms for approximation • Hence suffices
FJLT: wrap-up • Obtain: • with probability • dimension of is • time: • Dimension not optimal: • apply regular (dense) JL on • to reduce further to • Final time: