
Sparsity Control for Robustness and Social Data Analysis

Gonzalo Mateos, ECE Department, University of Minnesota. Acknowledgments: Profs. Georgios B. Giannakis, M. Kaveh, G. Sapiro, N. Sidiropoulos, and N. Waller; MURI (AFOSR FA9550-10-1-0567) grant.


Presentation Transcript


  1. Sparsity Control for Robustness and Social Data Analysis • Gonzalo Mateos, ECE Department, University of Minnesota • Acknowledgments: Profs. Georgios B. Giannakis, M. Kaveh, G. Sapiro, N. Sidiropoulos, and N. Waller; MURI (AFOSR FA9550-10-1-0567) grant • Minneapolis, MN, December 9, 2011

  2. Learning from “Big Data” • “Data are widely available, what is scarce is the ability to extract wisdom from them” (Hal Varian, Google’s chief economist) [Figure: word cloud: BIG, Fast, Productive, Ubiquitous, Revealing, Messy, Smart] K. Cukier, ``Harnessing the data deluge,'' Nov. 2011.

  3. Social-Computational Systems (SoCS) • Complex systems of people and computers • The vision: preference measurement (PM), analysis, management • Understand and engineer SoCS • The means: leverage dual role of sparsity • Complexity control through variable selection • Robustness to outliers

  4. Conjoint analysis • Marketing, healthcare, psychology [Green-Srinivasan’78] • Optimal design and positioning of new products • Strategy: describe products by a set of attributes, ‘parts’ • Goal: learn consumer’s utility function from preference data • Linear utilities: ‘How much is each part worth?’ • Success story [Wind et al’89] • Attributes: room size, TV options, restaurant, transportation

  5. Modeling preliminaries • Respondents (e.g., consumers) rate profiles; each profile comprises a set of attributes • Linear utility: estimate vector of partworths • Conjoint data collection formats: (M1) Metric ratings; (M2) Choice-based conjoint data • Online SoCS-based preference data grow exponentially • Inconsistent/corrupted/irrelevant data → outliers

  6. Robustifying PM • Least-trimmed squares (LTS) [Rousseeuw’87]: minimize the sum of the smallest squared residuals • Each summand is an order statistic among the squared residuals • The largest residuals are discarded • Q: How should we go about minimizing nonconvex (LTS)? A: Try all subsets of the retained size, solve least squares on each, and pick the best • Simple but intractable beyond small problems • Near optimal solvers [Rousseeuw’06], RANSAC [Fischler-Bolles’81] G. Mateos, V. Kekatos, and G. B. Giannakis, ``Exploiting sparsity in model residuals for robust conjoint analysis,'' Marketing Sci., Dec. 2011 (submitted).
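The exhaustive answer on this slide can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the solver from the cited work: `lts_brute_force`, `n_keep` (the number of residuals retained), and the synthetic data are hypothetical names and values introduced here. The combinatorial loop also makes the "intractable beyond small problems" remark concrete.

```python
import itertools
import numpy as np

def lts_brute_force(X, y, n_keep):
    """Exhaustive LTS sketch: least squares on every subset of n_keep rows."""
    best_cost, best_theta, best_subset = np.inf, None, None
    for subset in itertools.combinations(range(len(y)), n_keep):
        idx = list(subset)
        theta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        cost = np.sum((y[idx] - X[idx] @ theta) ** 2)
        if cost < best_cost:
            best_cost, best_theta, best_subset = cost, theta, idx
    return best_theta, best_subset

# Synthetic ratings (assumed data): 12 profiles, 2 attributes, 2 gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 2))
theta_true = np.array([1.0, -2.0])
y = X @ theta_true + 0.01 * rng.normal(size=12)
y[[3, 7]] += 10.0

theta_hat, kept = lts_brute_force(X, y, n_keep=10)
```

With 12 ratings and 10 retained, the loop already solves 66 least-squares problems; the count grows combinatorially with the sample size.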

  7. Modeling outliers • Outlier variables, one per rating: nonzero if the rating is an outlier, zero otherwise • Nominal ratings obey (M1); outliers something else • ε-contamination [Fuchs’99], Bayesian model [Jin-Rao’10] • Both the partworth vector and the outlier vector are unknown; the outlier vector is typically sparse! • Natural (but intractable) nonconvex estimator

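  8. LTS as sparse regression • Lagrangian form (P0) • Tuning parameter controls sparsity in number of outliers • Proposition 1: If a minimizer of (P0) is obtained with the tuning parameter chosen s.t. the number of nonzero outlier variables equals the number of residuals discarded in (LTS), then its partworth estimate also solves (LTS). • Formally justifies the preference model and its estimator (P0) • Ties sparse regression with robust estimation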

  9. Just relax! • (P0) is NP-hard → relax the ℓ0-norm to the ℓ1-norm, e.g., [Tropp’06], to obtain (P1) • (P1) convex, and thus efficiently solved • Role of the sparsity-controlling tuning parameter is central • Q: Does (P1) yield robust estimates? A: Yes! The Huber estimator is a special case
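A minimal sketch of how a relaxed problem of this kind can be solved. The transcript does not preserve the formula for (P1), so the code assumes the common form: minimize ||y - X theta - o||^2 + lam * ||o||_1 over the partworths theta and the outlier vector o. The names `outlier_aware_ls`, `lam`, and the synthetic data are hypothetical. Alternating exact minimization is used: least squares for theta, then entry-wise soft-thresholding of the residuals for o.

```python
import numpy as np

def soft_threshold(v, t):
    """Entry-wise soft-thresholding: shrink v toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def outlier_aware_ls(X, y, lam, n_iter=200):
    """Assumed (P1): min ||y - X theta - o||^2 + lam * ||o||_1."""
    o = np.zeros_like(y)
    for _ in range(n_iter):
        # theta-step: ordinary least squares on outlier-compensated ratings.
        theta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        # o-step: residuals below lam/2 in magnitude are declared nominal.
        o = soft_threshold(y - X @ theta, lam / 2.0)
    theta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
    return theta, o

# Synthetic ratings (assumed data) with three gross outliers.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
theta_true = np.array([2.0, -1.0, 0.5])
y = X @ theta_true + 0.05 * rng.normal(size=50)
outlier_idx = [4, 17, 30]
y[outlier_idx] += 8.0

theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)       # nonrobust baseline
theta_hat, o_hat = outlier_aware_ls(X, y, lam=1.0)     # outlier-aware estimate
```

Under this assumed form, the influence of each residual on theta is capped at lam/2, which is the mechanism behind the connection to Huber's estimator mentioned on the slide.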

  10. Lassoing outliers • Proposition 2: Minimizers of (P1) are obtained from a Lasso solution • Suffices to solve Lasso [Tibshirani’94] • Data-driven methods to select the tuning parameter • Lasso solvers return entire robustification path (RP) [Figure: coefficient paths vs. decreasing tuning parameter]
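The Lasso connection and the robustification path can be illustrated as follows. This is a sketch under assumptions, since the transcript lost the formulas: (P1) is taken to be min ||y - X theta - o||^2 + lam * ||o||_1, in which case eliminating theta leaves a Lasso in o alone, with the residual-maker matrix A = I - X X^+ acting on both y and o. The solver below is plain proximal gradient (ISTA) with warm starts over a decreasing grid; `robustification_path` and the data are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def robustification_path(X, y, lambdas, n_iter=300):
    """Solve min_o ||A (y - o)||^2 + lam * ||o||_1 for each lam, warm-started."""
    n = len(y)
    # Residual-maker matrix: projector onto the orthogonal complement of range(X).
    A = np.eye(n) - X @ np.linalg.pinv(X)
    o = np.zeros(n)
    path = []
    for lam in lambdas:
        for _ in range(n_iter):
            # ISTA step with step size 1/2 (gradient Lipschitz constant is 2).
            o = soft_threshold(o + A @ (y - o), lam / 2.0)
        # Recover the regression coefficients from the outlier-compensated data.
        theta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        path.append((lam, theta, o.copy()))
    return path

# Synthetic data (assumed) with three outliers.
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 2))
y = X @ np.array([1.0, 1.0]) + 0.05 * rng.normal(size=40)
y[[0, 1, 2]] -= 6.0

ls_theta, *_ = np.linalg.lstsq(X, y, rcond=None)
# Start slightly above the value at which every outlier variable is zero.
lam_max = 2.1 * np.max(np.abs(y - X @ ls_theta))
lambdas = lam_max * 0.5 ** np.arange(6)
path = robustification_path(X, y, lambdas)
support_sizes = [int(np.count_nonzero(o)) for _, _, o in path]
```

Reading `support_sizes` along the grid shows how decreasing the tuning parameter admits more ratings into the outlier set, which is what a data-driven selection rule would trade off.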

  11. Nonconvex regularization • Nonconvex penalty terms approximate the ℓ0-norm in (P0) better • Options: SCAD [Fan-Li’01], or sum-of-logs [Candes et al’08] • Iterative linearization-minimization of the penalty around the current iterate • Initialize with the (P1) solution • Bias reduction (cf. adaptive Lasso [Zou’06])
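One way to read the linearization-minimization idea is as iteratively reweighted ℓ1 minimization. The sketch below assumes a sum-of-logs penalty lam * sum_i log(|o_i| + delta), whose linearization around the current iterate gives a weighted ℓ1 problem with weights 1 / (|o_i| + delta). The function names, `delta`, `n_outer`, and the data are hypothetical, and the inner solver is simple alternating minimization rather than whatever the cited work uses.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def weighted_l1_outlier_ls(X, y, lam_weights, n_iter=200):
    """min ||y - X theta - o||^2 + sum_i lam_weights[i] * |o_i|."""
    o = np.zeros_like(y)
    for _ in range(n_iter):
        theta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        o = soft_threshold(y - X @ theta, lam_weights / 2.0)
    theta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
    return theta, o

def reweighted_outlier_ls(X, y, lam, delta=1e-3, n_outer=3):
    """Sum-of-logs surrogate via iterative reweighting (assumed variant)."""
    # Initialize with the unweighted l1 solution.
    theta, o = weighted_l1_outlier_ls(X, y, np.full(len(y), lam))
    for _ in range(n_outer):
        # Large outlier estimates get small weights, hence less shrinkage.
        weights = 1.0 / (np.abs(o) + delta)
        theta, o = weighted_l1_outlier_ls(X, y, lam * weights)
    return theta, o

# Synthetic data (assumed) with three positive outliers.
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))
theta_true = np.array([2.0, -1.0, 0.5])
y = X @ theta_true + 0.05 * rng.normal(size=50)
outlier_idx = [4, 17, 30]
y[outlier_idx] += 8.0

theta_l1, o_l1 = weighted_l1_outlier_ls(X, y, np.full(50, 1.0))
theta_rw, o_rw = reweighted_outlier_ls(X, y, lam=1.0)
```

Comparing `o_rw` with `o_l1` on the outlier entries shows the bias reduction: the refined estimates are shrunk less than the ℓ1 ones.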

  12. Comparison with RANSAC [Figure: simulated i.i.d. data; nominal vs. outlier samples]

  13. Nonparametric regression • Interactions among attributes? • Not captured by the linear utility model • Driven by complex mechanisms hard to model • If one trusts data more than any parametric model • Go nonparametric regression • The utility function lives in a space of “smooth” functions • Ill-posed problem • Workaround: regularization [Tikhonov’77], [Wahba’90] • RKHS with kernel and norm
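The regularization workaround can be illustrated with plain (nonrobust) kernel ridge regression. This is a sketch under assumptions: a Gaussian kernel, the criterion sum_i (y_i - f(x_i))^2 + mu * ||f||^2 over the RKHS, and synthetic data; `bandwidth`, `mu`, and the function names are hypothetical. By the representer theorem the minimizer is a kernel expansion over the training points with coefficients (K + mu I)^{-1} y.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Gram matrix of the Gaussian kernel between 1-D point sets a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def kernel_ridge_fit(x, y, bandwidth, mu):
    """Expansion coefficients alpha = (K + mu I)^{-1} y."""
    K = gaussian_kernel(x, x, bandwidth)
    return np.linalg.solve(K + mu * np.eye(len(x)), y)

def kernel_ridge_predict(x_train, alpha, x_new, bandwidth):
    """Evaluate f(x_new) = sum_i alpha_i K(x_new, x_i)."""
    return gaussian_kernel(x_new, x_train, bandwidth) @ alpha

# Noisy samples of a smooth function (assumed data).
rng = np.random.default_rng(4)
x = np.linspace(0.0, 2.0 * np.pi, 60)
y = np.sin(x) + 0.05 * rng.normal(size=x.size)

alpha = kernel_ridge_fit(x, y, bandwidth=0.5, mu=0.1)
x_test = np.linspace(0.5, 5.5, 25)
y_pred = kernel_ridge_predict(x, alpha, x_test, bandwidth=0.5)
max_err = np.max(np.abs(y_pred - np.sin(x_test)))
```

A robust variant in the spirit of the talk would add a sparse outlier vector to the data-fit term; that extension is not shown here.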

  14. Function approximation [Figure panels: true function, nonrobust predictions, robust predictions, refined predictions] • Effectiveness in rejecting outliers is apparent G. Mateos and G. B. Giannakis, ``Robust nonparametric regression via sparsity control with application to load curve data cleansing,'' IEEE Trans. Signal Process., 2012.

  15. Load curve data cleansing • Load curve: electric power consumption recorded periodically • Reliable data: key to realize smart grid vision [Hauser’09] • Outlier sources: faulty meters, communication errors, unscheduled maintenance, strikes, sport events • B-splines for load curve prediction and denoising [Chen et al’10] [Figure: Uruguay’s power consumption (MW)]

  16. NorthWrite data • Energy consumption of a government building (’05-’10) • Robust smoothing spline estimator, hours • Outliers: “Building operational transition shoulder periods” • No manual labeling of outliers [Chen et al’10] Data: courtesy of NorthWrite Energy Group, provided by Prof. V. Cherkassky

  17. Principal Component Analysis • Motivation: (statistical) learning from high-dimensional data [Figure: DNA microarray, traffic surveillance] • Principal component analysis (PCA) [Pearson’1901] • Extraction of low-dimensional data structure • Data compression and reconstruction • PCA is non-robust to outliers [Jolliffe’86] • Our goal: robustify PCA by controlling outlier sparsity

  18. Our work in context • Contemporary applications tied to SoCS • Anomaly detection in IP networks [Huang et al’07], [Kim et al’09] • Video surveillance, e.g., [Oliver et al’99] • Matrix completion for collaborative filtering, e.g., [Candes et al’09] • Robust PCA • Robust covariance matrix estimators [Campbell’80], [Huber’81] • Computer vision [Xu-Yuille’95], [De la Torre-Black’03] • Low-rank matrix recovery from sparse errors, e.g., [Wright et al’09]

  19. PCA formulations • Training data • Minimum reconstruction error • Compression operator • Reconstruction operator • Maximum variance • Component analysis model • Solution:
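The minimum reconstruction error view can be sketched with an SVD. The code is a standard textbook construction, not taken from the slides: the compression operator projects centered data onto the top-q principal directions and the reconstruction operator maps the scores back. `pca_fit`, `compress`, `reconstruct`, and the random data are hypothetical.

```python
import numpy as np

def pca_fit(X, q):
    """Return the sample mean, the top-q principal directions, and all singular values."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:q].T, s

def compress(X, mean, U):
    """Compression operator: q-dimensional scores of each row."""
    return (X - mean) @ U

def reconstruct(Z, mean, U):
    """Reconstruction operator: map scores back to the original space."""
    return mean + Z @ U.T

# Random training data (assumed): 30 vectors in 5 dimensions, reduced to q = 2.
rng = np.random.default_rng(6)
X = rng.normal(size=(30, 5))
mean, U, s = pca_fit(X, q=2)
X_hat = reconstruct(compress(X, mean, U), mean, U)

reconstruction_error = np.sum((X - X_hat) ** 2)
discarded_energy = np.sum(s[2:] ** 2)   # energy in the discarded directions
```

The two quantities at the end coincide, which is the Eckart-Young characterization of PCA as the best rank-q approximation of the centered data.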

  20. Robustifying PCA • Outlier-aware model • Interpret: blind preference model with latent profiles • (P2) • ℓ0-norm counterpart tied to (LTS PCA) • (P2) subsumes optimal (vector) Huber • ℓ1-norm regularization for entry-wise outliers G. Mateos and G. B. Giannakis, ``Robust PCA as bilinear decomposition with outlier sparsity regularization,'' IEEE Trans. Signal Process., Nov. 2011 (submitted).

  21. Alternating minimization (Alg. 1) for (P2) • Subspace update: SVD of outlier-compensated data • Outlier update: row-wise vector soft-thresholding • Proposition 3: Alg. 1’s iterates converge to a stationary point of (P2).
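The two updates named on this slide can be sketched as below. The exact form of (P2) is not preserved in the transcript, so the code assumes: minimize ||Y - L - O||_F^2 + lam * sum_n ||o_n||_2 over rank-q affine approximations L (mean plus subspace) and outlier matrices O with rows o_n. `robust_pca`, `row_soft_threshold`, `lam`, and the data are hypothetical, and this is not claimed to match Alg. 1 line by line.

```python
import numpy as np

def row_soft_threshold(R, t):
    """Shrink each row of R toward zero by t in Euclidean norm."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return R * scale

def robust_pca(Y, q, lam, n_iter=100):
    """Alternate an SVD step on Y - O with row-wise soft-thresholding of residuals."""
    O = np.zeros_like(Y)
    for _ in range(n_iter):
        Z = Y - O                                   # outlier-compensated data
        mean = Z.mean(axis=0)
        _, _, Vt = np.linalg.svd(Z - mean, full_matrices=False)
        U = Vt[:q].T                                # top-q principal directions
        L = mean + (Z - mean) @ U @ U.T             # rank-q affine approximation
        O = row_soft_threshold(Y - L, lam / 2.0)    # rows with small residuals become zero
    return mean, U, O

# Synthetic data (assumed): 2-D subspace in 10 dimensions, three outlying rows.
rng = np.random.default_rng(5)
n, p, q = 100, 10, 2
B, _ = np.linalg.qr(rng.normal(size=(p, q)))        # orthonormal basis of the true subspace
Y = 5.0 * rng.normal(size=(n, q)) @ B.T + 0.05 * rng.normal(size=(n, p))
outlier_rows = [5, 50, 77]
Y[outlier_rows] += 3.0 * rng.normal(size=(3, p))

mean, U, O = robust_pca(Y, q=q, lam=4.0)
row_norms = np.linalg.norm(O, axis=1)
flagged = set(np.argsort(row_norms)[-3:].tolist())  # three rows with the largest outlier estimates
```

Each step is an exact block minimization of the assumed cost, so the cost never increases across iterations; the problem remains nonconvex, which is why only convergence to a stationary point is claimed on the slide.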

  22. Video surveillance [Figure panels: original, PCA, robust PCA, ‘outliers’] Data: http://www.cs.cmu.edu/~ftorre/

  23. Big Five personality factors • Five dimensions of personality traits [Goldberg’93], [Costa-McCrae’92] • Discovered through factor analysis • WEIRD subjects • Big Five Inventory (BFI) • Measure the Big Five • Short questionnaire (44 items) • Rate 1-5, e.g., ‘I see myself as someone who… …is talkative’, ‘…is full of energy’ Handbook of personality: Theory and research, O. P. John, R. W. Robins, and L. A. Pervin, Eds. New York, NY: Guilford Press, 2008.

  24. BFI data • Eugene-Springfield community sample [Goldberg’08] • subjects, item responses, factors • Robust PCA identifies 8 outlying subjects • Validated via ‘inconsistency’ scores, e.g., VRIN [Tellegen’88] Data: courtesy of Prof. L. Goldberg, provided by Prof. N. Waller

  25. Online robust PCA • Motivation: Real-time data and memory limitations • At time , do not re-estimate • Exponentially-weighted robust PCA

  26. Online PCA in action • Nominal: • Outliers:

  27. Robust kernel PCA • Kernel (K)PCA [Schölkopf’97] • Challenge: high-dimensional feature space • Kernel trick: • Related to spectral clustering [Figure: input space mapped to feature space]
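The kernel trick for PCA can be sketched as follows. This is standard (nonrobust) kernel PCA rather than the robust variant of the talk: the Gram matrix is centered in feature space and eigendecomposed, so the feature vectors are never formed. `kernel_pca` and the data are hypothetical; a linear kernel is used in the example only because it lets the result be compared against ordinary PCA.

```python
import numpy as np

def kernel_pca(K, q):
    """Scores of the training points on the top-q kernel principal components."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    Kc = H @ K @ H                             # Gram matrix of centered features
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:q]      # indices of the q largest eigenvalues
    top = np.maximum(eigvals[order], 0.0)      # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(top)

# Random data (assumed): with a linear kernel, kernel PCA must agree with PCA.
rng = np.random.default_rng(7)
X = rng.normal(size=(30, 4))
scores_kpca = kernel_pca(X @ X.T, q=2)

Xc = X - X.mean(axis=0)
U, s, _ = np.linalg.svd(Xc, full_matrices=False)
scores_pca = U[:, :2] * s[:2]                  # ordinary PCA scores, up to sign
```

Swapping `X @ X.T` for any other positive semidefinite Gram matrix (for example a Gaussian kernel) gives nonlinear components at the same cost, which depends on the number of samples and not on the feature-space dimension.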

  28. Unveiling communities • Network: NCAA football teams (nodes), Fall ’00 games (edges) • teams, kernel • Identified exactly: Big 10, Big 12, ACC, SEC, Big East • Outliers: Independent teams • ARI=0.8967 Data: http://www-personal.umich.edu/~mejn/netdata/

  29. Spectrum cartography • Idea: collaborate to form a spatial map of the spectrum • Goal: find s.t. is the spectrum at position • Approach: Basis expansion model for , nonparametric basis pursuit [Figure: original and estimated spectrum maps] J. A. Bazerque, G. Mateos, and G. B. Giannakis, ``Group-Lasso on splines for spectrum cartography,'' IEEE Trans. Signal Process., Oct. 2011.

  30. Distributed adaptive algorithms • Improved learning through cooperation [Figure: wireless sensor network] • Issues and Significance: • Fast varying (non-)stationary processes • Unavailability of statistical information • Online incorporation of sensor data • Noisy communication links • Technical Approaches: • Consensus-based in-network operation in ad hoc WSNs • Distributed optimization using alternating-direction methods • Online learning of statistics using stochastic approximation • Performance analysis via stochastic averaging G. Mateos, I. D. Schizas, and G. B. Giannakis, ``Distributed recursive least-squares for consensus-based in-network adaptive estimation,'' IEEE Trans. Signal Process., Nov. 2009.

  31. Unveiling network anomalies • Approach: Flag anomalies across flows and time via sparsity and low rank • Payoff: Ensure high performance, QoS, and security in IP networks • Enhanced detection capabilities [Figure: anomalies across flows and time] M. Mardani, G. Mateos, and G. B. Giannakis, ``Unveiling network anomalies across flows and time via sparsity and low rank,'' IEEE Trans. Inf. Theory, Dec. 2011 (submitted).

  32. Concluding summary • Control sparsity in model residuals for robust learning [Figure: diagram linking signal processing, outlier-resilient estimation, and Lasso] • Research issues addressed • Sparsity control for robust metric and choice-based PM • Kernel-based nonparametric utility estimation • Robust (kernel) principal component analysis • Scalable distributed real-time implementations • Application domains • Preference measurement and conjoint analysis • Psychometrics, personality assessment • Video surveillance • Social and power networks • Experimental validation with GPIPP personality ratings (~6M): Gosling-Potter Internet Personality Project (GPIPP), http://www.outofservice.com
