Primal-Dual PCA Analysis and SVD Decomposition for Data Analysis

This text explores the concepts of Primal-Dual PCA Analysis and SVD Decomposition in data analysis, with examples and comparisons of different approaches.

Presentation Transcript


  1. Object Oriented Data Analysis, Last Time: PCA Redistribution of Energy (ANOVA); PCA Data Representation; PCA Simulation; Alternate PCA Computation; Primal – Dual PCA vs. SVD (centering by means is key)

  2. Primal - Dual PCA Toy Example 1: Random Curves, all in Primal Space: * Constant Shift, * Linear, * Quadratic, * Cubic (chosen to be orthonormal), plus (small) i.i.d. Gaussian noise; d = 40, n = 20
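A minimal sketch of how data like Toy Example 1 could be simulated. The grid on [-1, 1], the noise level `noise_sd`, the unit-variance Gaussian coefficients and the function names are assumptions made for illustration; the slides give only d = 40, n = 20 and the four orthonormal polynomial shapes.

```python
import numpy as np

def orthonormal_poly_basis(d, degree=3):
    """d x (degree + 1) matrix with orthonormal polynomial columns
    (constant, linear, quadratic, cubic) on an evenly spaced grid."""
    t = np.linspace(-1.0, 1.0, d)
    vander = np.vander(t, degree + 1, increasing=True)  # columns 1, t, t^2, t^3
    q, r = np.linalg.qr(vander)                          # orthonormalize the columns
    return q * np.sign(np.diag(r))                       # make the sign convention deterministic

def make_toy_data(d=40, n=20, noise_sd=0.1, seed=0):
    """Columns of the returned d x n matrix are random polynomial curves plus noise."""
    rng = np.random.default_rng(seed)
    basis = orthonormal_poly_basis(d)                # d x 4
    coeffs = rng.standard_normal((4, n))             # assumed i.i.d. Gaussian weights per curve
    noise = noise_sd * rng.standard_normal((d, n))   # small i.i.d. Gaussian noise
    return basis @ coeffs + noise, basis

X, basis = make_toy_data()
```

Here each column of `X` is one primal (column) curve, and each row is one dual (row) curve whose shape is driven by the Gaussian coefficients.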

  3. Primal - Dual PCA Toy Example 1: Raw Data

  4. Primal - Dual PCA Toy Example 1: Raw Data. Primal (Col.) curves similar to before. Data matrix asymmetric (but same curves). Dual (Row) curves much rougher (showing Gaussian randomness), reflecting how data were generated. Color map useful? (same as mesh view) See richer structure than before. Is it useful?

  5. Primal - Dual PCA Toy Example 1: Primal PCA Column Curves as Data

  6. Primal - Dual PCA Toy Example 1: Primal PCA. Expected to recover increasing polynomials, but didn’t happen, although can see the polynomials (order???). Mean has quadratic (since only n = 20???). Scores (projections) very random. Power Spectrum shows 4 components (not affected by subtracting Primal Mean).

  7. Primal - Dual PCA Toy Example 1: Dual PCA Row Curves as Data

  8. Primal - Dual PCA Toy Example 1: Dual PCA. Curves all very wiggly (random noise). Mean much bigger, 54% of Total Var! Scores have strong smooth structure (reflecting ordered primal e.v.’s) (recall primal e.v. ↔ dual scores). Power Spectrum shows 3 components (driven by subtraction of Dual Mean). Primal – Dual mean difference is critical.

  9. Primal - Dual PCA Toy Example 1: Dual PCA – Scatterplot

  10. Primal - Dual PCA Toy Example 1: Dual PCA - Scatterplot. Smooth curve structure, but not usual curves (since 1-d curves not quite polynomials), and only in 1st 3 components. Recall only 3 non-noise components, since constant curve went into mean (dual). Remainder is pure noise. Suggests wrong rotation of axes???

  11. Primal - Dual PCA: A 3rd Type of Analysis, called “SVD decomposition”. Main point: subtract neither mean. Viewed as a serious competitor. Advantage: gives best Mean Square Approximation of Data Matrix (vs. Primal PCA: best about col. mean; vs. Dual PCA: best about row mean). Difference in means is critical!
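The three analyses differ only in which mean is removed before the low-rank fit. The sketch below assumes a d x n data matrix whose columns are the primal curves; the function names, the random test matrix and the choice k = 2 are hypothetical, not taken from the slides.

```python
import numpy as np

def rank_k_fit(M, k):
    """Best rank-k least-squares approximation of M (truncated SVD)."""
    u, s, vt = np.linalg.svd(M, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

def three_fits(X, k):
    """Rank-k fits of a d x n matrix X whose columns are the data curves."""
    col_mean = X.mean(axis=1, keepdims=True)   # average of the column curves, d x 1
    row_mean = X.mean(axis=0, keepdims=True)   # average of the row curves, 1 x n
    return {
        "svd": rank_k_fit(X, k),                               # subtract neither mean
        "primal_pca": col_mean + rank_k_fit(X - col_mean, k),  # best about col. mean
        "dual_pca": row_mean + rank_k_fit(X - row_mean, k),    # best about row mean
    }

def sq_error(X, fit):
    """Sum of squared entrywise differences between X and a fit."""
    return float(np.sum((X - fit) ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 20)) + 3.0        # hypothetical data with a nonzero mean
errors = {name: sq_error(X, fit) for name, fit in three_fits(X, k=2).items()}
```

Note that each PCA fit is a mean plus a rank-k matrix, so its rank can be up to k + 1. The SVD fit is optimal among rank-k matrices, while each PCA fit is optimal among approximations anchored at its own mean, which is why the choice of centering matters.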

  12. Primal - Dual PCA Toy Example 1: SVD – Curves view

  13. Primal - Dual PCA Toy Example 1: SVD Curves View. Col. curves view similar to Primal PCA. Row curves quite different (from dual): former mean, now SV1; former PC1, now SV2; i.e. very similar shapes with shifted indices. Again mean centering is crucial: main difference between PCAs and SVD.

  14. Primal - Dual PCA Toy Example 1: SVD – Mesh-Image View

  15. Primal - Dual PCA Toy Example 1: SVD Mesh-Image View. Think about decomposition into modes of variation: Constant x Gaussian, Linear x Gaussian, Cubic x Gaussian, Quadratic x Gaussian. Shows up best in image view? Why is ordering “wrong”???

  16. Primal - Dual PCA Toy Example 1: All Primal. Why is SVD mode ordering “wrong”??? Well, not expected… Key is need for orthogonality: present in space of column curves, but only approximate in row Gaussians. The implicit orthogonalization of SVD (both rows and columns) gave a mixture of the polynomials.

  17. Primal - Dual PCA Toy Example 2: All Primal, GS Noise. Started with same column space. Generated i.i.d. Gaussians for row curves. Then did Gram-Schmidt orthonormalization (in row space). Visual impression: amazingly similar to original data (used same seeds of random # generators).
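The effect of orthonormalizing the row Gaussians can be checked numerically. This is a sketch under assumptions: the signal strengths in `weights` are hypothetical (chosen distinct so the singular vectors are well defined), noise is omitted, and Gram-Schmidt is carried out through a QR factorization.

```python
import numpy as np

def orthonormalize_rows(C):
    """Gram-Schmidt on the rows of C, done here via a QR factorization of C.T."""
    q, _ = np.linalg.qr(C.T)   # columns of q are an orthonormal basis for the rows of C
    return q.T

rng = np.random.default_rng(0)
d, n = 40, 20
t = np.linspace(-1.0, 1.0, d)
basis, _ = np.linalg.qr(np.vander(t, 4, increasing=True))  # orthonormal polynomial columns
weights = np.array([1.0, 2.0, 4.0, 8.0])                   # hypothetical, distinct strengths

C_raw = rng.standard_normal((4, n))   # i.i.d. Gaussian rows: only approximately orthogonal
C_gs = orthonormalize_rows(C_raw)     # exactly orthonormal rows

def alignment(C):
    """Entry (i, j) is |<polynomial i, left singular vector j>| for X = basis @ (weights * C)."""
    X = basis @ (weights[:, None] * C)
    u = np.linalg.svd(X, full_matrices=False)[0][:, :4]
    return np.abs(basis.T @ u)

align_raw = alignment(C_raw)   # polynomials get mixed across singular vectors
align_gs = alignment(C_gs)     # each singular vector is a single polynomial
```

With orthonormal rows, `align_gs` is a permutation matrix: every left singular vector matches exactly one polynomial, ordered by signal strength. With raw Gaussian rows, `align_raw` has off-pattern entries, which is the mixture of polynomials described for Toy Example 1.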

  18. Primal - Dual PCA Toy Example 2: Raw Data

  19. Primal - Dual PCA Compare with Earlier Toy Example 1

  20. Primal - Dual PCA Toy Example 2: Primal PCA, Column Curves as Data. Shows explanation (of wrong components) was correct.

  21. Primal - Dual PCA Toy Example 2: Dual PCA, Row Curves as Data. Still have big mean, but scores look much better.

  22. Primal - Dual PCA Toy Example 2: Dual PCA – Scatterplot

  23. Primal - Dual PCA Toy Example 2: Dual PCA - Scatterplot. Now polynomials look beautifully symmetric, much like chemo spectrum examples. But still only 3: same reason, dual mean ~ primal constants. Last one is pure noise.

  24. Primal - Dual PCA Toy Example 2: SVD – Matrix-Image

  25. Primal - Dual PCA Toy Example 2: SVD - Matrix-Image. Similar good effects. Again have all 4 components. So “better” to not subtract mean???

  26. Primal - Dual PCA Toy Example 3: Random Curves, all in Dual Space: 1 * Constant Shift, 2 * Linear, 4 * Quadratic, 8 * Cubic (chosen to be orthonormal), plus (small) i.i.d. Gaussian noise; d = 40, n = 20

  27. Primal - Dual PCA Toy Example 3: Raw Data

  28. Primal - Dual PCA Toy Example 3: Raw Data. Similar structure to Toy Example 1, but rows and columns trade places, and now cubics visually dominant (as expected).

  29. Primal - Dual PCA Toy Example 3: Primal PCA, Column Curves as Data. Gaussian noise. Only 3 components. Polynomial scores (as expected).

  30. Primal - Dual PCA Toy Example 3: Dual PCA, Row Curves as Data. Components as expected. No Gram-Schmidt (since stronger signal).

  31. Primal - Dual PCA Toy Example 3: SVD – Matrix-Image

  32. Primal - Dual PCA Toy Example 4: Mystery #1

  33. Primal - Dual PCA Toy Example 4: SVD – Curves View

  34. Primal - Dual PCA Toy Example 4: SVD – Matrix-Image

  35. Primal - Dual PCA Toy Example 4: Mystery #1. Structure (Primal x Dual): Constant x Gaussian; Gaussian x Linear; Parabola x Gaussian; Gaussian x Cubic. Nicely revealed by full matrix decomposition and views.

  36. Primal - Dual PCA Toy Example 5: Mystery #2

  37. Primal - Dual PCA Toy Example 5: SVD – Curves View

  38. Primal - Dual PCA Toy Example 5: SVD – Matrix-Image

  39. Primal - Dual PCA Toy Example 5: Mystery #2. Structure (Primal x Dual): Constant x Linear; Parabola x Cubic; Gaussian x Gaussian; Gaussian x Gaussian. Visible via either curves, or matrices…

  40. Primal - Dual PCA: Is SVD (i.e. no mean centering) always “better”? What does “better” mean??? A definition: provides most useful insights into data. Others???

  41. Primal - Dual PCA Toy Example where SVD is less informative: simple, two dimensional. Key is that not subtracting the mean is bad, i.e. mean direction different from PC directions, and mean less informative.

  42. Primal - Dual PCA Toy Example where SVD is less informative: Raw Data

  43. Primal - Dual PCA PC1 mode of variation (centered at mean): Yields useful major mode of variation

  44. Primal - Dual PCA PC2 mode of variation (centered at mean): Informative second mode of variation

  45. Primal - Dual PCA SV1 mode of variation (centered at 0): Unintuitive major mode of variation

  46. Primal - Dual PCA SV2 mode of variation (centered at 0): Unintuitive second mode of variation

  47. Primal - Dual PCA Summary of SVD: Does give a decomposition, i.e. sum of two pieces is data. But not good insights about data structure, since center point of analysis is far from center point of data, so mean strongly influences the impression of variation. Maybe better to keep these separate???
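The two-dimensional situation summarized above can be reproduced with a small simulation. The mean location, the spreads and the sample size below are hypothetical choices, picked only so that the mean direction differs from the PC directions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
mean = np.array([[3.0], [6.0]])                    # hypothetical mean, far from the origin
spread = np.array([[1.0], [0.3]])                  # most variation along the first axis
X = mean + spread * rng.standard_normal((2, n))    # 2 x n, columns are data points

def first_direction(M):
    """Leading left singular vector of M, with its arbitrary sign fixed."""
    u = np.linalg.svd(M, full_matrices=False)[0][:, 0]
    return u if u[np.argmax(np.abs(u))] > 0 else -u

pc1 = first_direction(X - X.mean(axis=1, keepdims=True))   # centered at the mean
sv1 = first_direction(X)                                    # centered at 0
mean_dir = X.mean(axis=1) / np.linalg.norm(X.mean(axis=1))

cos_sv1_mean = float(sv1 @ mean_dir)    # close to 1: SV1 essentially points at the mean
cos_pc1_axis = float(abs(pc1[0]))       # close to 1: PC1 follows the main axis of variation
cos_pc1_sv1 = float(abs(pc1 @ sv1))     # well below 1: the two directions disagree
```

In this sketch SV1 mostly encodes where the cloud sits relative to the origin, while PC1 describes how the points vary around their own center.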

  48. Primal - Dual PCA Bottom line on Primal PCA vs. SVD vs. Dual PCA: these are not comparable. Each has situations where it is “best” and where it is “worst”. Generally should consider all, and choose on basis of insights. See work of Lingsong Zhang on this…

  49. Real Data: Primal - Dual PCA Analysis by: Lingsong Zhang Zhang, L., (2006), "SVD movies and plots for Singular Value Decomposition and its Visualization", University of North Carolina at Chapel Hill, available at http://www.unc.edu/~lszhang/research/network/SVDmovie

  50. Real Data: Primal - Dual PCA. Use slides from a talk: LingsongZhangFunctionalSVD.pdf. Main Points: different approaches all can be “best”; show different aspects of data; generalized scree plot; “outliers” are interesting.
