Matrix Extensions to Sparse Recovery
Yi Ma¹,² Allen Yang³ John Wright¹
¹Microsoft Research Asia ²University of Illinois at Urbana-Champaign ³University of California, Berkeley
CVPR Tutorial, June 20, 2009
FINAL TOPIC – Generalizations: sparsity to degeneracy
The tools and phenomena underlying sparse recovery generalize very nicely to low-rank matrix recovery.
• Matrix completion: given an incomplete subset of the entries of a low-rank matrix, fill in the missing values.
• Robust PCA: given a low-rank matrix which has been grossly corrupted, recover the original matrix.
THIS TALK – From sparse recovery to low-rank recovery
Examples of degenerate data:
• Face images. Degeneracy: illumination models. Errors: occlusion, corruption.
• Relevancy data. Degeneracy: user preferences co-predict. Errors: missing rankings, manipulation.
• Video. Degeneracy: temporal, dynamic structures. Errors: anomalous events, mismatches.
KEY ANALOGY – Connections between rank and sparsity
Rank is to a matrix what sparsity is to a vector: rank(A) counts the nonzero singular values of A (an ℓ⁰ measure on the spectrum), and its convex surrogate, the nuclear norm ‖A‖∗ = Σᵢ σᵢ(A), is the ℓ¹ norm of the spectrum.
This talk: exploiting this connection for matrix completion and RPCA.
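A quick numerical check of this analogy, as a minimal NumPy sketch (illustrative only, not from the talk): the rank of a matrix is the number of nonzero singular values, and the nuclear norm is their sum.

```python
# Rank is the L0 "norm" of the spectrum; the nuclear norm is its L1 norm.
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 50, 40, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r matrix

sigma = np.linalg.svd(A, compute_uv=False)   # singular values, descending
rank = np.sum(sigma > 1e-10 * sigma[0])      # count of nonzero singular values
nuclear_norm = np.sum(sigma)                 # sum of singular values

print(rank)          # 5
print(nuclear_norm)  # L1 norm of the spectrum
```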
CLASSICAL PCA – Fitting degenerate data
If degenerate observations are stacked as the columns of a matrix D ∈ ℝ^{m×n}, then D is approximately low-rank: D = A + Z with rank(A) ≪ min(m, n) and Z a small perturbation.
Principal Component Analysis via the singular value decomposition:
• Stable, efficient computation
• Optimal estimate of A under iid Gaussian noise Z
• Fundamental statistical tool, huge impact in vision, search, bioinformatics
But… PCA breaks down under even a single corrupted observation.
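The following minimal NumPy sketch (an illustration, not the authors' code; the function name pca_low_rank is hypothetical) shows both halves of this slide: the SVD gives an excellent rank-r fit under small Gaussian noise, while a single grossly corrupted entry ruins the estimate.

```python
# Classical PCA: the best rank-r approximation of D in Frobenius norm
# keeps the top r singular triples (Eckart-Young).
import numpy as np

def pca_low_rank(D, r):
    """Best rank-r approximation of D (columns = observations)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(1)
D = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 30))  # rank 2

# Small iid Gaussian noise: the rank-2 SVD estimate is very accurate.
D_noisy = D + 0.01 * rng.standard_normal(D.shape)
A_hat = pca_low_rank(D_noisy, 2)
print(np.linalg.norm(A_hat - D) / np.linalg.norm(D))  # small

# A single grossly corrupted entry badly skews the estimate.
D_corrupt = D.copy()
D_corrupt[0, 0] += 1e3
A_bad = pca_low_rank(D_corrupt, 2)
print(np.linalg.norm(A_bad - D) / np.linalg.norm(D))  # large
```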
ROBUST PCA – Problem formulation
Problem: Given D = A + E, with A low-rank (low-rank structure) and E sparse (sparse errors), recover A.
• Properties of the errors:
• Each multivariate data sample (column) may be corrupted in some entries
• Corruption can be arbitrarily large in magnitude (not Gaussian!)
• Numerous heuristic methods in the literature:
• Random sampling [Fischler and Bolles '81]
• Multivariate trimming [Gnanadesikan and Kettering '72]
• Alternating minimization [Ke and Kanade '03]
• Influence functions [de la Torre and Black '03]
• No polynomial-time algorithm with strong performance guarantees!
ROBUST PCA – Semidefinite programming formulation
Seek the lowest-rank A that agrees with the data up to some sparse error:
  min rank(A) + γ‖E‖₀ subject to A + E = D.
Not directly tractable; relax to the tightest convex surrogates:
  min ‖A‖∗ + λ‖E‖₁ subject to A + E = D.
The nuclear norm ‖A‖∗ (sum of singular values) is the convex envelope of rank over the spectral-norm ball, and ‖E‖₁ is the convex envelope of ‖E‖₀ over the ℓ∞ ball. The relaxation is a semidefinite program, solvable in polynomial time.
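For small problems the relaxation can be written almost verbatim in CVXPY. This is an illustrative sketch, not the authors' method (they use a scalable first-order solver, described later); the default λ = 1/√max(m, n) is a common choice from the later literature, not taken from this talk.

```python
# Convex relaxation of Robust PCA:
#   minimize ||A||_* + lam * ||E||_1   subject to   A + E = D.
import cvxpy as cp
import numpy as np

def rpca_cvx(D, lam=None):
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # a common choice in the literature
    A = cp.Variable((m, n))
    E = cp.Variable((m, n))
    objective = cp.Minimize(cp.normNuc(A) + lam * cp.sum(cp.abs(E)))
    problem = cp.Problem(objective, [A + E == D])
    problem.solve()  # default conic solver handles the SDP reformulation
    return A.value, E.value
```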
MATRIX COMPLETION – Motivation for the nuclear norm
Related problem: we observe only a small known subset Ω of the entries of a rank-r matrix A. Can we exactly recover A?
Convex optimization heuristic [Candès and Recht]:
  min ‖X‖∗ subject to Xᵢⱼ = Dᵢⱼ for (i, j) ∈ Ω.
For incoherent A, exact recovery with high probability once the number of observed entries exceeds on the order of nr·polylog(n) [Candès and Tao]. Spectral trimming also succeeds at comparable sample sizes for small r [Keshavan, Montanari and Oh].
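Again as an illustrative CVXPY sketch rather than the algorithm behind the cited guarantees: the heuristic minimizes the nuclear norm subject to agreement on the observed entries. The function name and boolean-mask convention are assumptions.

```python
# Nuclear-norm heuristic for matrix completion:
#   minimize ||X||_*   subject to   X_ij = D_ij for observed (i, j).
import cvxpy as cp
import numpy as np

def complete_matrix(D, mask):
    """mask: boolean array, True where the entry of D is observed."""
    M = mask.astype(float)
    X = cp.Variable(D.shape)
    constraints = [cp.multiply(M, X) == cp.multiply(M, D)]
    cp.Problem(cp.Minimize(cp.normNuc(X)), constraints).solve()
    return X.value
```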
ROBUST PCA – Exact recovery?
CONJECTURE: If D = A + E with A sufficiently low-rank and E sufficiently sparse, then solving the convex program above exactly recovers A.
Empirical evidence: [Figure: probability of correct recovery as a function of rank and error sparsity, showing a large region of perfect recovery.]
ROBUST PCA – Which matrices and which errors?
Fundamental ambiguity – very sparse matrices are also low-rank. For example, D = e₁e₁ᵀ admits two decompositions: A = e₁e₁ᵀ, E = 0 (rank-1, 0-sparse), or A = 0, E = e₁e₁ᵀ (rank-0, 1-sparse).
Obviously we can only hope to uniquely recover low-rank matrices A that are incoherent with the standard basis. Can we recover almost all low-rank matrices from almost all sparse errors?
ROBUST PCA – Which matrices and which errors?
Random orthogonal model (of rank r) [Candès & Recht '08]: A = UΣVᵀ, where the orthobases U and V are independent samples from the invariant measure on the Stiefel manifold of rank-r orthobases; the spectrum Σ is arbitrary.
Bernoulli error signs-and-support (with parameter ρₛ): each entry of E is nonzero independently with probability ρₛ, with independent random signs; the magnitude of each nonzero entry of E is arbitrary.
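A sketch of how one might sample from these two models, under my reading of them: Haar-distributed orthobases via QR of Gaussian matrices, and an error matrix with Bernoulli support, random signs, and arbitrary (here, large fixed-scale) magnitudes. The function name and the spectrum choice are illustrative.

```python
# Sample a low-rank matrix from the random orthogonal model plus a sparse
# error with Bernoulli signs-and-support.
import numpy as np

def sample_models(m, n, r, rho_s, scale=100.0, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    U, _ = np.linalg.qr(rng.standard_normal((m, r)))   # Haar-distributed orthobasis
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    A = U @ np.diag(rng.uniform(1.0, 10.0, r)) @ V.T   # arbitrary spectrum
    support = rng.random((m, n)) < rho_s               # Bernoulli(rho_s) support
    E = support * rng.choice([-1.0, 1.0], (m, n)) * scale  # arbitrary magnitudes
    return A, E, A + E                                 # low-rank, sparse, observation
```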
MAIN RESULT – Exact Solution of Robust PCA
“Convex optimization recovers almost any matrix whose rank is a sufficiently small fraction of its dimension, from errors affecting a constant fraction of the observations!”
BONUS RESULT – Matrix completion in proportional growth
“Convex optimization exactly recovers matrices whose rank grows in proportion to their dimension, even with a constant fraction of entries missing!”
MATRIX COMPLETION – Contrast with literature
• [Candès and Tao 2009]: correct completion w.h.p. when the number of observed entries exceeds roughly nr·polylog(n); does not apply to the large-rank case.
• This work: correct completion w.h.p. even when the rank grows in proportion to the dimension and a constant fraction of entries is missing. The proof exploits rich regularity and independence in the random orthogonal model.
Caveats:
• [C-T '09] is tighter for small r.
• [C-T '09] generalizes better to other matrix ensembles.
MAIN RESULT – Exact Solution of Robust PCA
“Convex optimization recovers almost any matrix whose rank is a sufficiently small fraction of its dimension, from errors affecting a constant fraction of the observations!”
ROBUST PCA – Solving the convex program
A semidefinite program in millions of unknowns – generic interior-point solvers do not scale.
Scalable solution: apply an accelerated first-order method [Nesterov, Beck & Teboulle], which minimizes a sequence of separable quadratic approximations. Each approximation is solved in closed form via soft thresholding (for E) and singular value thresholding (for A).
• Iteration complexity O(1/√ε) for an ε-suboptimal solution.
• Dramatic practical gains from continuation (solving a sequence of subproblems with decreasing penalty parameter μ, each warm-started from the last).
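A minimal sketch of the iteration described above, assuming the penalized formulation min μ‖A‖∗ + μλ‖E‖₁ + ½‖D − A − E‖²_F. This is plain (unaccelerated) proximal gradient with the two closed-form prox steps named on the slide; the Nesterov acceleration and the continuation on μ that give the stated rate and practical speedups are omitted, and the function names are hypothetical.

```python
# Proximal gradient for penalized Robust PCA. The smooth term
# 0.5*||D - A - E||_F^2 has Lipschitz constant 2 in (A, E), so the
# step size is 1/2; each step applies the two prox operators.
import numpy as np

def shrink(X, tau):
    """Soft thresholding: the prox of tau*||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the prox of tau*||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_prox_grad(D, lam=None, mu=1.0, n_iter=500):
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))       # a common choice
    A = np.zeros((m, n))
    E = np.zeros((m, n))
    for _ in range(n_iter):
        G = A + E - D                        # gradient of the smooth term
        A = svt(A - 0.5 * G, mu / 2)         # singular value thresholding (A)
        E = shrink(E - 0.5 * G, lam * mu / 2)  # soft thresholding (E)
    return A, E
```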
SIMULATION – Recovery in various growth scenarios
Correct recovery with the rank fraction and error fraction fixed and the dimension m increasing. Empirically, the number of iterations is almost constant in m: provably robust PCA at only a constant factor more computation than conventional PCA.
SIMULATION – Phase Transition in Rank and Sparsity
[Figure: fraction of successes over the (rank fraction, error fraction) square, shown on [0,1] × [0,1] with a zoom to [0,.4] × [0,.4] (10 trials), and on [0,1] × [0,1] with a zoom to [0,.5] × [0,.5] (65 trials).]
EXAMPLE – Background modeling from video
Static camera surveillance video: 200 frames, 72 × 88 pixels, significant foreground motion.
[Figure: video frames | low-rank approximation | sparse error]
EXAMPLE – Background modeling from video
Static camera surveillance video: 550 frames, 64 × 80 pixels, significant illumination variation.
[Figure: video frames | low-rank approximation (background variation) | sparse error (anomalous activity)]
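A hypothetical usage sketch for these video experiments: stack each frame as one column of D, split D into low-rank plus sparse parts, and reshape the parts back into image sequences. It assumes the rpca_prox_grad sketch above; the array layout and names are illustrative.

```python
# Separate a video into a static/low-rank background and a sparse foreground.
import numpy as np

def separate_video(frames):
    """frames: array of shape (num_frames, height, width)."""
    t, h, w = frames.shape
    D = frames.reshape(t, h * w).T         # pixels x frames
    A, E = rpca_prox_grad(D)               # low-rank + sparse decomposition
    background = A.T.reshape(t, h, w)      # illumination / static structure
    foreground = E.T.reshape(t, h, w)      # moving objects, anomalies
    return background, foreground
```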
EXAMPLE – Faces under varying illumination
29 images of one person under varying lighting. RPCA separates the low-rank illumination component from sparse errors such as specularities and self-shadowing.
[Figure: input images → RPCA → low-rank component + sparse errors (specularity, self-shadowing)]
EXAMPLE – Face tracking and alignment
Initial alignment, inappropriate for recognition. [Figure: initial face alignments]
EXAMPLE – Face tracking and alignment Final result: per-pixel alignment
SIMULATION – Phase Transition in Rank and Sparsity
[Figure: fraction of successes over the (rank fraction, error fraction) square, shown on [0,1] × [0,1] with a zoom to [0,.4] × [0,.4] (10 trials), and on [0,1] × [0,1] with a zoom to [0,.5] × [0,.5] (65 trials).]
CONJECTURES – Phase Transition in Rank and Sparsity
Hypothesized breakdown behavior as m → ∞.
[Figure: phase diagram over the unit square of rank fraction vs. error fraction]
CONJECTURES – Phase Transition in Rank and Sparsity
What we know so far:
[Figure: phase diagram with regions labeled “This work” (near the origin) and “Classical PCA” (along the zero-error axis)]
CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE I: convex programming succeeds in proportional growth.
CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE II: for sufficiently small ranks, any fraction of errors can eventually be corrected. Similar to Dense Error Correction via L1 Minimization, Wright and Ma '08.
CONJECTURES – Phase Transition in Rank and Sparsity
CONJECTURE III: for any rank fraction less than one, there exists a nonzero fraction of errors that can eventually be corrected with high probability.