
Sparsity Control for Robust Principal Component Analysis


Presentation Transcript


  1. Sparsity Control for Robust Principal Component Analysis
     Gonzalo Mateos and Georgios B. Giannakis
     ECE Department, University of Minnesota
     Acknowledgments: NSF grants CCF-1016605 and EECS-1002180
     Asilomar Conference, November 10, 2010

  2. Principal Component Analysis
     [Figures: DNA microarray data; traffic surveillance frames]
     • Motivation: (statistical) learning from high-dimensional data
     • Principal component analysis (PCA) [Pearson'1901]
       • Extraction of low-dimensional data structure
       • Data compression and reconstruction
     • PCA is non-robust to outliers [Jolliffe'86]
     • Our goal: robustify PCA by controlling outlier sparsity

  3. Our work in context
     [Figure: original frame, robust PCA reconstruction, extracted `outliers']
     • Contemporary applications
       • Anomaly detection in IP networks [Huang et al'07], [Kim et al'09]
       • Video surveillance, e.g., [Oliver et al'99]
     • Robust PCA
       • Robust covariance matrix estimators [Campbell'80], [Huber'81]
       • Computer vision [Xu-Yuille'95], [De la Torre-Black'03]
       • Low-rank matrix recovery from sparse errors [Wright et al'09]
     • Huber's M-class and sparsity in linear regression [Fuchs'99]

  4. PCA formulations
     • Training data: $\mathcal{T}_y := \{\mathbf{y}_n\}_{n=1}^N$, $\mathbf{y}_n \in \mathbb{R}^p$
     • Minimum reconstruction error:
       $\min_{\mathbf{m},\mathbf{U},\{\mathbf{s}_n\}} \sum_{n=1}^N \|\mathbf{y}_n - \mathbf{m} - \mathbf{U}\mathbf{s}_n\|_2^2$
       • Dimensionality reduction operator: $\mathbf{s}_n = \mathbf{U}^\top(\mathbf{y}_n - \mathbf{m})$
       • Reconstruction operator: $\hat{\mathbf{y}}_n = \mathbf{m} + \mathbf{U}\mathbf{s}_n$
     • Maximum variance: $\max_{\mathbf{U}^\top\mathbf{U} = \mathbf{I}_q} \sum_{n=1}^N \|\mathbf{U}^\top(\mathbf{y}_n - \bar{\mathbf{y}})\|_2^2$
     • Factor analysis model: $\mathbf{y}_n = \mathbf{m} + \mathbf{U}\mathbf{s}_n + \mathbf{e}_n$
     • Solution: $\hat{\mathbf{m}} = \bar{\mathbf{y}}$; the columns of $\hat{\mathbf{U}}$ are the $q$ dominant eigenvectors of the sample covariance matrix; $\hat{\mathbf{s}}_n = \hat{\mathbf{U}}^\top(\mathbf{y}_n - \bar{\mathbf{y}})$
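
Since the slide only states the formulations, here is a minimal sketch of the classical solution via the SVD; the function and variable names (`pca`, `Y`, `q`) are illustrative assumptions, not from the slides:

```python
import numpy as np

def pca(Y, q):
    """Classical PCA sketch: Y is p x N with data vectors y_n as columns,
    q is the target subspace dimension."""
    m = Y.mean(axis=1, keepdims=True)                 # sample mean, estimate of m
    U, _, _ = np.linalg.svd(Y - m, full_matrices=False)
    U = U[:, :q]                                      # q dominant left singular vectors
    S = U.T @ (Y - m)                                 # scores s_n (dimensionality reduction)
    return m, U, S
```

Reconstruction is then `m + U @ S`, which attains the minimum reconstruction error of the first formulation.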

  5. Robustifying PCA
     • Least-trimmed squares (LTS) regression [Rousseeuw'87]
     • LTS-based PCA for robustness:
       (LTS PCA)  $\min_{\mathbf{m},\mathbf{U}} \sum_{n=1}^{\nu} r^2_{[n]}$
       where $r^2_{[n]}$ is the $n$-th order statistic among the squared residuals $\{r_n^2 := \|\mathbf{y}_n - \hat{\mathbf{y}}_n\|_2^2\}_{n=1}^N$
     • Trimming constant $\nu$ determines the breakdown point
     • Q: How should we go about minimizing (LTS PCA)? It is nonconvex; do minimizer(s) exist?
     • A: Try all $\binom{N}{\nu}$ subsets of size $\nu$, solve each, and pick the best
       • Simple, but intractable beyond small problems
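
To make the combinatorial search concrete, a sketch of the exhaustive procedure the slide describes, reusing the `pca` sketch above; the helper name `lts_pca_bruteforce` is hypothetical:

```python
from itertools import combinations
import numpy as np

def lts_pca_bruteforce(Y, q, nu):
    """Exhaustive LTS PCA: fit PCA on every size-nu subset and keep the fit
    with the smallest sum of the nu smallest squared residuals.
    Intractable beyond small N, exactly as the slide notes."""
    N = Y.shape[1]
    best_obj, best_fit = np.inf, None
    for idx in combinations(range(N), nu):
        m, U, _ = pca(Y[:, list(idx)], q)
        Yc = Y - m
        r2 = np.sum((Yc - U @ (U.T @ Yc))**2, axis=0)  # squared residuals r_n^2
        obj = np.sort(r2)[:nu].sum()                   # sum of nu smallest order statistics
        if obj < best_obj:
            best_obj, best_fit = obj, (m, U)
    return best_fit
```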

  6. Modeling outliers
     • Natural (but intractable) estimator: (LTS PCA)
     • Introduce auxiliary variables $\{\mathbf{o}_n\}_{n=1}^N$ s.t.
       $\mathbf{o}_n = \mathbf{0}$ if $\mathbf{y}_n$ is an inlier, $\mathbf{o}_n \neq \mathbf{0}$ if $\mathbf{y}_n$ is an outlier
     • Model: $\mathbf{y}_n = \mathbf{m} + \mathbf{U}\mathbf{s}_n + \mathbf{o}_n + \mathbf{e}_n$
     • Inliers obey $\mathbf{y}_n = \mathbf{m} + \mathbf{U}\mathbf{s}_n + \mathbf{e}_n$; outliers something else
     • Inlier noise: $\{\mathbf{e}_n\}$ are zero-mean i.i.d. random vectors
     • Remarks
       • Both $\{\mathbf{o}_n\}$ and the number of outliers are unknown
       • If outliers are sporadic, then the vector $[\|\mathbf{o}_1\|_2, \dots, \|\mathbf{o}_N\|_2]^\top$ is sparse!
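
A quick way to exercise the model is to generate synthetic data with sporadic outliers; the generator below is a sketch whose name (`make_data`) and parameter values are illustrative assumptions:

```python
import numpy as np

def make_data(p=20, N=200, q=2, frac_out=0.1, sigma=0.1, seed=0):
    """Synthetic data following the slide's model y_n = m + U s_n + o_n + e_n,
    where a frac_out fraction of columns carries a nonzero (sporadic)
    outlier vector o_n."""
    rng = np.random.default_rng(seed)
    m = rng.standard_normal((p, 1))
    U = np.linalg.qr(rng.standard_normal((p, q)))[0]      # orthonormal subspace basis
    S = rng.standard_normal((q, N))
    E = sigma * rng.standard_normal((p, N))               # zero-mean i.i.d. inlier noise
    O = np.zeros((p, N))
    out = rng.choice(N, size=int(frac_out * N), replace=False)
    O[:, out] = 5.0 * rng.standard_normal((p, out.size))  # gross errors on outlier columns
    return m + U @ S + O + E, out
```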

  7. LTS PCA as sparse regression
     • Lagrangian form:
       (P0)  $\min_{\mathbf{m},\mathbf{U},\mathbf{S},\mathbf{O}} \sum_{n=1}^N \|\mathbf{y}_n - \mathbf{m} - \mathbf{U}\mathbf{s}_n - \mathbf{o}_n\|_2^2 + \lambda_0 \|\mathbf{O}\|_0$
     • Tuning $\lambda_0$ controls the sparsity in $\mathbf{O}$, and thus the number of outliers
     • Proposition 1: If $\{\hat{\mathbf{U}}, \hat{\mathbf{O}}\}$ solves (P0) with $\lambda_0$ chosen such that $\|\hat{\mathbf{O}}\|_0 = N - \nu$, then $\hat{\mathbf{U}}$ solves (LTS PCA) too.
     • Justifies the model and its estimator (P0); ties sparsity with robustness

  8. Just relax!
     • (P0) is NP-hard, so relax the $\ell_0$-(pseudo)norm to a sum of $\ell_2$-norms:
       (P2)  $\min_{\mathbf{m},\mathbf{U},\mathbf{S},\mathbf{O}} \sum_{n=1}^N \|\mathbf{y}_n - \mathbf{m} - \mathbf{U}\mathbf{s}_n - \mathbf{o}_n\|_2^2 + \lambda_2 \sum_{n=1}^N \|\mathbf{o}_n\|_2$
     • Role of the sparsity-controlling parameter $\lambda_2$ is central
     • Q: Does (P2) yield robust estimates?
     • A: Yes! Huber's estimator is a special case

  9. Entrywise outliers
     [Figure: original frames; robust PCA (P2) rejects entire images; robust PCA (P1) rejects only the outlier pixels]
     • For entrywise outliers, use $\ell_1$-norm regularization:
       (P1)  $\min_{\mathbf{m},\mathbf{U},\mathbf{S},\mathbf{O}} \sum_{n=1}^N \|\mathbf{y}_n - \mathbf{m} - \mathbf{U}\mathbf{s}_n - \mathbf{o}_n\|_2^2 + \lambda_1 \sum_{n=1}^N \|\mathbf{o}_n\|_1$

  10. Alternating minimization of (P1)
     • $\mathbf{U}$ update: reduced-rank Procrustes rotation
     • $\mathbf{O}$ update: coordinatewise soft-thresholding
     • Proposition 2: Alg. 1's iterates converge to a stationary point of (P1).
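
A hedged sketch of the alternating scheme: only the two update types (Procrustes rotation for U, coordinatewise soft-thresholding for O) come from the slide; the initialization, the m and S updates, and the iteration count are assumptions consistent with the model of slide 6:

```python
import numpy as np

def soft(X, t):
    """Coordinatewise soft-thresholding (t may be a scalar or an array)."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def robust_pca_l1(Y, q, lam, n_iter=200, seed=0):
    """Block-coordinate sketch for (P1): squared loss plus entrywise l1
    penalty on the outlier matrix O."""
    rng = np.random.default_rng(seed)
    p, N = Y.shape
    O = np.zeros((p, N))
    U = np.linalg.qr(rng.standard_normal((p, q)))[0]
    for _ in range(n_iter):
        m = (Y - O).mean(axis=1, keepdims=True)   # m update on outlier-compensated data
        Yc = Y - O - m                            # cleansed, centered data
        S = U.T @ Yc                              # S update (U has orthonormal columns)
        L, _, Rt = np.linalg.svd(Yc @ S.T, full_matrices=False)
        U = L @ Rt                                # U update: reduced-rank Procrustes rotation
        R = Y - m - U @ (U.T @ Yc)                # residuals given the refreshed subspace
        O = soft(R, lam / 2.0)                    # O update: coordinatewise soft-thresholding
    return m, U, O
```

The threshold `lam / 2.0` follows from minimizing $(r - o)^2 + \lambda_1 |o|$ per coordinate.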

  11. Refinements
     • Nonconvex penalty terms approximate the $\ell_0$-(pseudo)norm in (P0) better
       • Options: SCAD [Fan-Li'01], or sum-of-logs [Candes et al'08]
       • Iterative linearization-minimization of the penalty around the current iterate
       • Iteratively reweighted version of Alg. 1; warm start: solution of (P1) or (P2)
     • Bias reduction in $\hat{\mathbf{O}}$ (cf. weighted Lasso [Zou'06])
     • Discard the outliers identified in $\hat{\mathbf{O}}$; re-estimate by solving a missing-data problem
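
Continuing the sketch above, one plausible reweighted O update under the sum-of-logs surrogate; the stabilizer `delta` and the single-pass form are assumptions:

```python
# Reweighted O update for the robust_pca_l1 sketch (sum-of-logs surrogate):
# entries with large current outlier estimates get smaller thresholds,
# which reduces the bias the slide refers to.
delta = 1e-3                               # small stabilizer (assumed value)
W = 1.0 / (np.abs(O) + delta)              # weights from linearizing log(|o| + delta)
O = soft(R, (lam / 2.0) * W)               # entrywise reweighted soft-thresholding
```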

  12. Online robust PCA
     • Motivation: real-time data and memory limitations
     • Exponentially-weighted robust PCA
     • Approximation [Yang'95]: at time $n$, do not re-estimate past outlier vectors $\{\hat{\mathbf{o}}_\tau\}_{\tau < n}$
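
One possible reading of the slide as code: past outlier estimates are frozen, each new sample gets an outlier estimate by soft-thresholding, and the subspace is refreshed from exponentially weighted statistics. This is a loose sketch under those assumptions, not the authors' algorithm:

```python
import numpy as np

def online_robust_pca(Y, q, lam, beta=0.99):
    """Exponentially-weighted online sketch; beta is the forgetting factor."""
    p, N = Y.shape
    mu = np.zeros((p, 1))
    A = np.zeros((p, p))                        # weighted covariance of cleansed data
    U = np.eye(p, q)
    for n in range(N):
        y = Y[:, [n]]
        r = y - mu - U @ (U.T @ (y - mu))       # residual against current subspace
        o = soft(r, lam / 2.0)                  # outlier estimate at time n (never revisited)
        yc = y - o
        mu = beta * mu + (1 - beta) * yc        # exponentially weighted mean
        d = yc - mu
        A = beta * A + (1 - beta) * (d @ d.T)   # exponentially weighted covariance
        U = np.linalg.eigh(A)[1][:, -q:]        # q dominant eigenvectors
    return mu, U
```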

  13. Video surveillance
     [Figure: original frames vs. PCA and robust PCA reconstructions, with the extracted `outliers']
     Data: http://www.cs.cmu.edu/~ftorre/

  14. Online PCA in action
     [Figure: angle between $\hat{\mathbf{C}}(n)$ and $\mathbf{C}$ as a function of time $n$]
     • Inliers: drawn from the model $\mathbf{y}_n = \mathbf{m} + \mathbf{U}\mathbf{s}_n + \mathbf{e}_n$
     • Outliers: sporadic vectors with $\mathbf{o}_n \neq \mathbf{0}$
     • Figure of merit: angle between the estimated subspace $\hat{\mathbf{C}}(n)$ and the true subspace $\mathbf{C}$
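
The figure of merit is computable from principal angles; a small sketch (the name `subspace_angle` and the QR/SVD route are standard practice, not from the slides):

```python
import numpy as np

def subspace_angle(C_hat, C):
    """Largest principal angle (radians) between the column spans of C_hat
    and C: cosines of the principal angles are the singular values of
    Q1.T @ Q2 for orthonormal bases Q1, Q2."""
    Q1 = np.linalg.qr(C_hat)[0]
    Q2 = np.linalg.qr(C)[0]
    s = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(s.min(), -1.0, 1.0))  # clip guards rounding error
```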

  15. Concluding summary
     • Sparsity control for robust PCA
       • LTS PCA as $\ell_0$-(pseudo)norm regularized regression (NP-hard)
       • Relaxation: (group-)Lassoed PCA, an M-type estimator
       • Sparsity-controlling role of $\lambda$ is central
     • Batch and online robust PCA algorithms
       • i) Outlier identification, ii) robust subspace tracking
     • Refinements via nonconvex penalty terms
     • Tests on real video surveillance data for anomaly extraction
     • Ongoing research
       • Preference measurement: conjoint analysis and collaborative filtering
       • Robustifying kernel PCA and blind dictionary learning
