Direct Robust Matrix Factorization
Liang Xiong, Xi Chen, Jeff Schneider
Presented by xxx
School of Computer Science, Carnegie Mellon University
Matrix Factorization
• Extremely useful…
• Assumes the data matrix is of low rank.
• PCA/SVD, NMF, Collaborative Filtering…
• Simple, effective, and scalable.
• For anomaly detection
  • Assumption: the normal data is of low rank, and anomalies are poorly approximated by the factorization.
Robustness Issue
• Usually not robust (sensitive to outliers)
  • Because of the L2 (Frobenius) measure they use: $\min_L \|X - L\|_F$ s.t. $\mathrm{rank}(L) \le K$ (minimize the approximation error; keep L low-rank).
• For anomaly detection, of course we have outliers.
Why outliers matter
• Simulation
  • We use SVD to find the first basis of 10 sine signals.
  • To make it more fun, let us turn one point of one signal into a spike (the outlier).
[Figure: input signals and the output basis. With no outlier the basis looks cool; with a moderate outlier it is disturbed; with a wild outlier it is totally lost.]
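A minimal sketch of this simulation (not the authors' code; the signal count, spike position, and amplitude are made-up parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
# 10 near-identical sine signals (small random phase shifts).
X = np.vstack([np.sin(t + rng.uniform(0, 0.1)) for _ in range(10)])

def first_basis(M):
    # Top right singular vector = the first basis found by SVD.
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[0]

clean = first_basis(X)

X_spiked = X.copy()
X_spiked[0, 100] += 50.0  # turn one entry of one signal into a wild spike

# With the spike, the recovered basis drifts away from the sine shape.
print(abs(clean @ first_basis(X_spiked)))  # near 1 if robust; much lower here
```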
Direct Robust Matrix Factorization (DRMF)
• Throw outliers out of the factorization, and the problem is solved!
• Mathematically, this is DRMF:
  $\min_{L,S} \|X - L - S\|_F$ s.t. $\mathrm{rank}(L) \le K$, $\|S\|_0 \le e$
  • $\|S\|_0$: the number of non-zeros in S, which serves as a "trash can" for outliers.
  • $\|S\|_0 \le e$: there should be only a small number of outliers.
DRMF Algorithm
• Input: data X. Output: low-rank L; outliers S.
• Iterate (block coordinate descent):
  • Let C = X − S. Do rank-K SVD: L = SVD(C, K).
  • Let E = X − L. Do thresholding: $S_{ij} = E_{ij}$ if $|E_{ij}| \ge t$, and $S_{ij} = 0$ otherwise, where t is the e-th largest element in $\{|E_{ij}|\}$.
• That's it! Everyone can try it at home.
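A minimal sketch of this iteration in numpy, assuming a dense X, an exact rank-K SVD each round, and a fixed iteration count rather than a convergence test:

```python
import numpy as np

def drmf(X, K, e, n_iters=50):
    """Alternate a rank-K SVD with hard thresholding of the residual.

    K: target rank of L; e: allowed number of outlier entries in S.
    """
    S = np.zeros_like(X)
    for _ in range(n_iters):
        # L-step: best rank-K approximation of C = X - S.
        C = X - S
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        L = (U[:, :K] * s[:K]) @ Vt[:K]
        # S-step: keep only the e largest-magnitude residual entries.
        E = X - L
        S = np.zeros_like(X)
        if e > 0:
            t = np.sort(np.abs(E), axis=None)[-e]  # the e-th largest |E_ij|
            mask = np.abs(E) >= t
            S[mask] = E[mask]
    return L, S
```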
Related Work
• Nuclear norm minimization (NNM)
  • Effective methods with nice theoretical properties from compressive sensing.
  • NNM is the convex relaxation of DRMF: the rank constraint on L relaxes to the nuclear norm $\|L\|_*$, and the $L_0$ constraint on S relaxes to the $L_1$ norm $\|S\|_1$.
• A parallel work, GoDec by Zhou et al., appeared in ICML'11.
Pros & Cons • Pros: • No compromise/relaxation => High quality • Efficient • Easy to implement and use • Cons: • Difficult theory • Because of the rank and the L0 norm… • Non-convex. • Local minima exist. But can be greatly mitigated if initialized by its convex version, NNM. DRMF: Liang Xiong, Xi Chen, Jeff Schneider
Highly Extensible
• Structured outliers
  • Outlier rows instead of entries? Just use structured measurements (e.g., a row-wise L0 constraint; see the sketch after this list).
• Sparse input / missing data
  • Useful for recommendation and matrix completion.
• Non-negativity, as in NMF
  • Still readily solvable with the constraints.
• For large-scale problems
  • Use approximate SVD solvers.
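A sketch of the row-outlier variant mentioned above, assuming whole rows are outliers: the S-step keeps the e rows of the residual with the largest L2 norms instead of the e largest entries.

```python
import numpy as np

def drmf_rows(X, K, e, n_iters=50):
    """Row-structured DRMF sketch: S contains at most e non-zero rows."""
    S = np.zeros_like(X)
    for _ in range(n_iters):
        # L-step: best rank-K approximation of C = X - S.
        C = X - S
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        L = (U[:, :K] * s[:K]) @ Vt[:K]
        # S-step: move the e most anomalous rows of the residual into S.
        E = X - L
        S = np.zeros_like(X)
        if e > 0:
            norms = np.linalg.norm(E, axis=1)  # per-row residual norm
            out = np.argsort(norms)[-e:]       # indices of the e largest rows
            S[out] = E[out]
    return L, S
```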
Simulation Study
• Factorize noisy low-rank matrices to find entry outliers.
• SVD: plain SVD. RPCA, SPCP: two representative NNM methods.
[Figures: error of recovering normal entries; detection rate of outlier entries; running time (log scale).]
Simulation Study
• Sensitivity to outliers
  • We examine the recovery errors as the outlier amplitude grows.
  • Noiseless case: all assumptions required by RPCA hold.
Find Stranger Digits
• The USPS dataset is used. We mix a few '7's into many '1's, then ask DRMF to find those '7's, unsupervised.
  • Treat each digit as a row in the matrix.
  • Use the structured extension of DRMF: row outliers.
  • Rank the digits by reconstruction error.
• Resulting ranked list: [figure]
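A hypothetical usage along these lines, reusing the drmf_rows sketch from above (X_digits, K=3, and e=10 are made-up placeholders, not the paper's settings):

```python
# Rank rows by how badly the low-rank part reconstructs them; the
# mixed-in '7's should surface at the top of the list.
L, S = drmf_rows(X_digits, K=3, e=10)          # X_digits: one digit image per row
errors = np.linalg.norm(X_digits - L, axis=1)  # per-row reconstruction error
ranked = np.argsort(errors)[::-1]              # most anomalous rows first
```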
Conclusion
• DRMF is a direct and intuitive solution to the robust factorization problem.
• Easy to implement and use.
• Highly extensible.
• Good empirical performance.
Please direct questions to Liang Xiong (lxiong@cs.cmu.edu)