Dimensionality Reduction. Random Projections.
Random Projections
• Johnson-Lindenstrauss lemma
• For:
  • $0 < \epsilon < 1/2$,
  • any (sufficiently large) set $S$ of $M$ points in $\mathbb{R}^n$,
  • $k = O(\epsilon^{-2} \ln M)$
• There exists a linear map $f: \mathbb{R}^n \to \mathbb{R}^k$ such that $(1-\epsilon)\|u-v\|^2 \le \|f(u)-f(v)\|^2 \le (1+\epsilon)\|u-v\|^2$ for all $u, v \in S$
• A random projection satisfies this with constant probability
Random Projection
• Set $k = O(\epsilon^{-2} \ln M)$
• Select $k$ random $n$-dimensional vectors
  • (one approach is to draw the entries of the $k$ vectors independently from a Gaussian distribution with mean 0 and variance 1: $N(0,1)$)
• Project the original points onto the $k$ vectors
• The resulting $k$-dimensional space approximately preserves the pairwise distances with high probability
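The steps on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a tuned implementation: the function names `gaussian_projection` and `sq_dist`, the test data, and the sizes are all made up for the example, and the $1/\sqrt{k}$ scaling is an assumption added here (the slide does not state it) so that squared distances are preserved in expectation.

```python
import math
import random


def gaussian_projection(points, k, seed=0):
    """Project n-dimensional points onto k random N(0, 1) vectors."""
    rng = random.Random(seed)
    n = len(points[0])
    # k random n-dimensional vectors with independent N(0, 1) entries.
    directions = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    # Assumed scaling: 1/sqrt(k) keeps expected squared lengths unchanged.
    scale = 1.0 / math.sqrt(k)
    return [
        [scale * sum(r * x for r, x in zip(direction, p)) for direction in directions]
        for p in points
    ]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical data: 10 points in 500 dimensions, reduced to 200 dimensions.
data_rng = random.Random(1)
points = [[data_rng.uniform(-1, 1) for _ in range(500)] for _ in range(10)]
projected = gaussian_projection(points, k=200)

# Ratio of projected to original squared distance; expected to be near 1.
ratio = sq_dist(projected[0], projected[1]) / sq_dist(points[0], points[1])
print(f"squared-distance ratio: {ratio:.3f}")
```

The printed ratio should fall inside $[1-\epsilon, 1+\epsilon]$ for moderate $\epsilon$; how tightly it concentrates around 1 depends on $k$.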
“Database friendly” RP
• Achlioptas showed that random projections with the same guarantees are possible using a projection matrix whose entries are only in $\{+1, -1\}$ or $\{+1, 0, -1\}$
• Thus computing the projection needs only additions and subtractions, not multiplications
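Theorem
Let $P$ be a set of $n$ points in $\mathbb{R}^d$, stored as an $n \times d$ matrix $A$. Given $\epsilon, \beta > 0$, let
$$k_0 = \frac{4 + 2\beta}{\epsilon^2/2 - \epsilon^3/3} \ln n.$$
For integer $k > k_0$, let $R$ be a $d \times k$ matrix with $R(i, j) = r_{ij}$, where the $r_{ij}$ are generated randomly and independently from the following distribution:
$$r_{ij} = \begin{cases} +1 & \text{with probability } 1/2 \\ -1 & \text{with probability } 1/2 \end{cases}$$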
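Let $E = \frac{1}{\sqrt{k}} A R$, and let $f: \mathbb{R}^d \to \mathbb{R}^k$ map the $i$-th row of $A$ to the $i$-th row of $E$.
With probability at least $1 - n^{-\beta}$, for all $u, v \in P$:
$$(1-\epsilon)\|u-v\|^2 \le \|f(u)-f(v)\|^2 \le (1+\epsilon)\|u-v\|^2$$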
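The theorem can be tried out numerically with the sketch below. It assumes the threshold $k_0 = \frac{4 + 2\beta}{\epsilon^2/2 - \epsilon^3/3} \ln n$ and the mapping $E = \frac{1}{\sqrt{k}} A R$ as given in Achlioptas's paper; the function names (`achlioptas_k0`, `sign_projection`, `sq_dist`) and the test data are hypothetical. Since every entry of $R$ is $\pm 1$, the inner loop uses only additions and subtractions, with one scaling per output coordinate at the end.

```python
import math
import random


def achlioptas_k0(n_points, eps, beta):
    """Assumed threshold: k0 = (4 + 2*beta) / (eps^2/2 - eps^3/3) * ln(n)."""
    return (4 + 2 * beta) / (eps ** 2 / 2 - eps ** 3 / 3) * math.log(n_points)


def sign_projection(A, k, seed=0):
    """Compute E = (1/sqrt(k)) * A * R, where R has independent +1/-1 entries."""
    rng = random.Random(seed)
    d = len(A[0])
    R = [[rng.choice((1, -1)) for _ in range(k)] for _ in range(d)]
    scale = 1.0 / math.sqrt(k)
    E = []
    for row in A:
        acc = [0.0] * k
        for x, r_row in zip(row, R):
            for j, r in enumerate(r_row):
                # Entries are +1 or -1, so no multiplication is needed here.
                if r == 1:
                    acc[j] += x
                else:
                    acc[j] -= x
        E.append([scale * a for a in acc])
    return E


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical setup: 20 points in 400 dimensions, eps = 0.5, beta = 1.
n_points, d, eps, beta = 20, 400, 0.5, 1.0
k = math.floor(achlioptas_k0(n_points, eps, beta)) + 1  # smallest integer k > k0

data_rng = random.Random(1)
A = [[data_rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_points)]
E = sign_projection(A, k)

# Distortion of every pairwise squared distance.
ratios = [
    sq_dist(E[i], E[j]) / sq_dist(A[i], A[j])
    for i in range(n_points)
    for j in range(i + 1, n_points)
]
print(f"k = {k}, min ratio = {min(ratios):.3f}, max ratio = {max(ratios):.3f}")
```

Under these assumptions the theorem guarantees that all ratios lie in $[1-\epsilon, 1+\epsilon]$ with probability at least $1 - n^{-\beta}$, which is $0.95$ for this setup.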
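The same is true if you use:
$$r_{ij} = \sqrt{3} \times \begin{cases} +1 & \text{with probability } 1/6 \\ \phantom{+}0 & \text{with probability } 2/3 \\ -1 & \text{with probability } 1/6 \end{cases}$$
The factor $\sqrt{3}$ gives each entry unit variance, matching the $\pm 1$ case.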
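A sketch of the sparse variant follows. It assumes the $\sqrt{3}$ scaling from Achlioptas's paper (needed because an entry that is $\pm 1$ with total probability $1/3$ has variance $1/3$) together with the usual $1/\sqrt{k}$ normalization; the name `sparse_projection` and the test data are hypothetical. Because about two thirds of the entries are zero, only the indices of the $+1$ and $-1$ entries are stored, and each output coordinate is a sum minus a sum.

```python
import math
import random


def sparse_projection(A, k, seed=0):
    """Project rows of A using entries sqrt(3) * {+1: 1/6, 0: 2/3, -1: 1/6}."""
    rng = random.Random(seed)
    d = len(A[0])
    # For each output coordinate j, remember which inputs are added or subtracted.
    plus = [[] for _ in range(k)]
    minus = [[] for _ in range(k)]
    for i in range(d):
        for j in range(k):
            u = rng.random()
            if u < 1 / 6:
                plus[j].append(i)      # entry +1 with probability 1/6
            elif u < 1 / 3:
                minus[j].append(i)     # entry -1 with probability 1/6
            # otherwise the entry is 0 (probability 2/3) and is skipped
    # Assumed scaling: sqrt(3) for unit variance, 1/sqrt(k) for normalization.
    scale = math.sqrt(3.0 / k)
    return [
        [
            scale * (sum(row[i] for i in plus[j]) - sum(row[i] for i in minus[j]))
            for j in range(k)
        ]
        for row in A
    ]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Hypothetical data: 10 points in 500 dimensions, reduced to 300 dimensions.
data_rng = random.Random(1)
A = [[data_rng.uniform(-1, 1) for _ in range(500)] for _ in range(10)]
E = sparse_projection(A, k=300)

ratio = sq_dist(E[0], E[1]) / sq_dist(A[0], A[1])
print(f"squared-distance ratio: {ratio:.3f}")
```

Compared with the dense $\pm 1$ matrix, this variant touches roughly one third of the input coordinates per output coordinate, which is the main practical appeal of the sparse distribution.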
The proof is similar in spirit to the previous one, but differs in the details. Again, we need to show that the squared length of a vector concentrates around its mean value in the projected space. We show that the worst-case vectors for this projection matrix are the vectors $w = \frac{1}{\sqrt{d}}(\pm 1, \ldots, \pm 1)$, and that the even moments of the projections of these vectors are dominated by the corresponding moments of the spherically symmetric projections.