Robust PCA
Principal Component Analysis (PCA) • PCA (Pearson, 1901): optimal linear reconstruction • PCA (Hotelling, 1933): projection directions of maximal variance • Probabilistic PCA (1999): Gaussian latent variable model • Robust PCA (2009)
Why use Robust PCA? To handle spike noise of high magnitude (gross corruptions), rather than Gaussian-distributed noise.
Main Problem Given M = S0 + L0, where S0 is a sparse spike-noise matrix and L0 is a low-rank matrix, the aim is to recover L0: L0 = UΣVᵀ, where U ∈ R^(m×k), Σ ∈ R^(k×k), V ∈ R^(n×k).
Difference from PCA In PCA, M = N0 + L0, where L0 is a low-rank matrix and N0 is a small i.i.d. Gaussian noise matrix. PCA seeks the best rank-k estimate of L0 by minimizing ||M − L|| subject to rank(L) ≤ k. This problem can be solved by the SVD.
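The rank-k SVD solution mentioned above can be sketched in a few lines of NumPy (the function name and test data are illustrative, not from the original slides):

```python
import numpy as np

def pca_rank_k(M, k):
    """Best rank-k approximation of M via truncated SVD
    (Eckart-Young theorem: optimal in both spectral and Frobenius norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k leading singular triplets.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]
```

By the Eckart-Young theorem, no other rank-k matrix is closer to M in spectral or Frobenius norm, which is why classical PCA reduces to one SVD.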
Principal Component Analysis (PCA) [E. J. Candès, Journal of ACM 58(1): 1-37]
Dimensionality reduction: PCA and Probabilistic PCA. PCA can be viewed as Probabilistic PCA in the limit of vanishingly small Gaussian noise.
Ill-posed problem (1) Suppose S0 is sparse and L0 is low-rank. Requirement: the low-rank part L0 must not itself be too sparse; the subspace spanned by its row (column) singular vectors must be incoherent with the standard basis.
Ill-posed problem (2) Suppose S0 is sparse and L0 is low-rank. If S0 is not only sparse but also low-rank, e.g. only its first column is nonzero, then M admits another valid decomposition. Requirement: the sparse part S0 must not be too low-rank; for example, assume its rows (columns) do not contain too many nonzero entries.
Conditions for exact recovery/decomposition For comparison: the random orthogonal model? S0 must satisfy the random sparsity model, i.e., the positions of its nonzero entries are chosen at random.
The low-rank matrix recovery problem (minimize rank(L) + λ||S||_0 subject to L + S = M) is non-convex and NP-hard. The nuclear norm is the convex envelope of the rank function, and the ℓ1 norm is the convex envelope of the ℓ0 norm, which gives the convex relaxation: minimize ||L||_* + λ||S||_1 subject to L + S = M.
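Both surrogate norms have closed-form proximal operators, which is what makes the relaxation tractable. A minimal NumPy sketch of the two operators (function names are illustrative):

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau * ||.||_1: elementwise shrinkage toward zero."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of tau * ||.||_*.
    Applies soft-thresholding to the singular values of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt
```

Soft-thresholding handles the ℓ1 term and singular value thresholding handles the nuclear-norm term; the splitting methods below alternate between exactly these two operations.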
Alternating Direction Method of Multipliers, ADMM (a variant of ALM) The problem can be solved approximately by first solving for x with y fixed, and then solving for y with x fixed. Rather than iterating these subproblems to convergence, the algorithm proceeds directly to updating the dual variable and then repeats the process: fix x and solve for y (without iterating to convergence), then fix y and solve for x, and repeat until the overall iteration converges.
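This x/y/dual alternation can be illustrated on a toy problem (not the RPCA solver itself; all names are illustrative): ADMM for minimize ½||x − b||² + λ||y||₁ subject to x = y, whose exact solution is the soft-thresholding of b.

```python
import numpy as np

def admm_demo(b, lam=1.0, rho=1.0, n_iter=500):
    """ADMM for: minimize 0.5*||x - b||^2 + lam*||y||_1  s.t.  x = y.
    u is the scaled dual variable; the minimizer is soft-thresholding of b."""
    x = np.zeros_like(b)
    y = np.zeros_like(b)
    u = np.zeros_like(b)
    for _ in range(n_iter):
        # x-subproblem (quadratic, closed form), with y fixed
        x = (b + rho * (y - u)) / (1.0 + rho)
        # y-subproblem (elementwise shrinkage), with x fixed
        t = x + u
        y = np.sign(t) * np.maximum(np.abs(t) - lam / rho, 0.0)
        # dual update, then repeat -- no inner iteration to convergence
        u = u + x - y
    return y
```

Each pass does one cheap x-step, one cheap y-step, and a dual ascent step, exactly the pattern described above.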
Big data: partition the matrix into N blocks.
The ALM method for RPCA: with A fixed, how do we solve for E? With E fixed, how do we solve for A?
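The two subproblems have closed-form answers: with A fixed, the E-update is elementwise soft-thresholding; with E fixed, the A-update is singular value thresholding. A runnable sketch of the inexact ALM iteration (λ = 1/√max(m,n) and the penalty schedule follow common defaults from the RPCA literature; the function name and constants are illustrative, not a reference implementation):

```python
import numpy as np

def rpca_ialm(M, lam=None, max_iter=500, tol=1e-7):
    """Inexact ALM for Robust PCA: decompose M into A (low-rank) + E (sparse).
    A-step: singular value thresholding; E-step: soft-thresholding."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))   # standard weight on the sparse term
    norm_fro = np.linalg.norm(M, 'fro')
    mu = 1.25 / np.linalg.norm(M, 2)     # initial penalty parameter
    mu_bar = mu * 1e7                    # cap on the penalty
    rho = 1.5                            # penalty growth factor
    Y = np.zeros_like(M)                 # Lagrange multiplier
    A = np.zeros_like(M)
    E = np.zeros_like(M)
    for _ in range(max_iter):
        # A-update (E fixed): SVT of (M - E + Y/mu) at threshold 1/mu
        U, s, Vt = np.linalg.svd(M - E + Y / mu, full_matrices=False)
        A = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-update (A fixed): shrink (M - A + Y/mu) elementwise at lam/mu
        T = M - A + Y / mu
        E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual update and penalty increase
        Z = M - A - E
        Y = Y + mu * Z
        mu = min(rho * mu, mu_bar)
        if np.linalg.norm(Z, 'fro') <= tol * norm_fro:
            break
    return A, E
```

On a synthetic low-rank matrix corrupted by large sparse spikes, this iteration recovers both components to small relative error, matching the exact-recovery conditions discussed above.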
Rank optimization and PCA