SVD and LS

SVD and LS. M.A. Miceli, University of Rome I. Stats in the Château, Jouy-en-Josas, August 31 - September 4, 2009. Motivations. Problems of high dimensionality in estimation: rank < actual dimension of the data sets → inverse problems.

Presentation Transcript


  1. SVD and LS M.A. Miceli University of Rome I Stats in the Château Jouy-en-Josas August 31 - September 4 2009 M. A. Miceli “SVD and LS” - Stats in the Château - August 31 - September 4 2009

  2. Motivations
     • Problems of high dimensionality in estimation:
     • Rank < actual dimension of the data sets → inverse problems
     • Thresholds for accepting variables ease on every dimension as the number of variables/dimensions increases (e.g. the Wald test).
     • How the SVD helps in extracting robust correlations between dependent and independent variables: automatic choice of “model”.
     • Why
     • Some evidence from predicting US CPI indexes
     • Some issues about normalization

  3. Motivations
     Given a simultaneous linear system of equations
     • Collapsing the dimensionality of the system to its minimum rank = min[rank(Y), rank(X)].
     • Advantages of SVD w.r.t. Principal Components:
     • PC requires a square matrix, e.g. the autocorrelation matrix, and ranks the dimensions within that single matrix;
     • SVD ranks the correlations between the X and Y dimensions.
     • Discretionary possibility of dropping some dimensions believed negligible: we want to drop those dimensions that could be generated by a totally random system of the same dimensions (Marchenko-Pastur conditions adapted to a rectangular matrix).

  4. Definition of SVD of a matrix product
     • SVD definition: having two matrices, one can write [equation] and therefore [equation].
     If T << max(M,N)? No problems.
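
The equations on this slide did not survive in the transcript. A plausible reconstruction, offered only as a sketch, is the standard SVD of the cross-product. It assumes that X is the T×M matrix of explanatory variables and Y the T×N matrix of dependent variables (dimensions as in the CPI example later in the deck), and it borrows the factor names U_xy, S_xy, V_xy from the slides that follow:

```latex
\begin{align}
X &\in \mathbb{R}^{T \times M}, \qquad Y \in \mathbb{R}^{T \times N}, \qquad X'Y \in \mathbb{R}^{M \times N} \\
X'Y &= U_{xy}\, S_{xy}\, V_{xy}', \qquad U_{xy}'U_{xy} = I, \quad V_{xy}'V_{xy} = I \\
U_{xy}'\,(X'Y)\,V_{xy} &= S_{xy} = \operatorname{diag}(s_1, \dots, s_k), \qquad s_1 \ge \dots \ge s_k \ge 0, \quad k = \min(M, N)
\end{align}
```

Because rank(X'Y) ≤ min(T, M, N), a short sample only forces some of the s_i to zero; the decomposition itself always exists, which may be what "No problems" refers to.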

  5. Diagonalizing the LS estimator
     • Consider regressing every column y on the set of explanatory variables X:
     • we write [equation]
     • We diagonalize both matrices, (X’X) and (X’Y):
     • X’X
     • X’Y rectangular
     • NB. For a square, symmetric, positive semi-definite matrix such as X’X, the SVD IS the same as the diagonalisation. We will write [equation]
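
A minimal numerical sketch of the idea on this slide, assuming the estimator in question is the usual multivariate least-squares one, B = (X’X)⁻¹ X’Y. The data are synthetic, and the names `U_xx` and `d_xx` are illustrative, chosen to echo the Uxx and Dxx labels used later in the deck. It checks that inverting X’X through its eigen-decomposition reproduces the LS coefficients, and that the singular values of the symmetric matrix X’X coincide with its eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, N = 200, 5, 3                      # hypothetical sizes, for illustration only
X = rng.standard_normal((T, M))
B_true = rng.standard_normal((M, N))
Y = X @ B_true + 0.1 * rng.standard_normal((T, N))

XtX = X.T @ X                            # square, symmetric, M x M
XtY = X.T @ Y                            # rectangular, M x N

# Diagonalise X'X = U_xx diag(d_xx) U_xx' and invert it factor by factor
d_xx, U_xx = np.linalg.eigh(XtX)
B_diag = U_xx @ np.diag(1.0 / d_xx) @ U_xx.T @ XtY

# Reference: ordinary least squares on every column of Y at once
B_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
agree = np.allclose(B_diag, B_ls)

# For the symmetric PSD matrix X'X, singular values equal eigenvalues
s_xx = np.linalg.svd(XtX, compute_uv=False)
same_spectrum = np.allclose(np.sort(s_xx), np.sort(d_xx))

print(agree, same_spectrum)
```

The rectangular matrix X’Y has no eigen-decomposition, which is why it needs the SVD rather than a plain diagonalisation.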

  7. SVD of the covariance matrix X’Y [diagram: X’Y with factors Uxy, Sxy, Vxy]

  8. SVD mapping from column basis to row basis [diagram: X’Y, Uxy, Sxy, Vxy]

  9. SVD: splitting the product X’Y [diagram: Y Vxy (linear combinations of Y), X Uxy (linear combinations of X), Sxy]

  10. Adding diagonalisation of both X and Y matrices

  11. Returning to the original variables
     Replacing the old “B”: any advantage? [diagram labels: Vxy’, Vyy’, Inv(Dxx), Sxy, Y, X, Uxx, Uxy]
     We may cancel factors: by what criterion?
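
One way to picture "cancelling factors" is to keep only the k largest singular values of X’Y before mapping back to coefficients on the original variables. The sketch below is an assumption-laden illustration, not the slide's exact formula (which also involves the separate diagonalisations of X and Y). The helper `truncated_ls` and the synthetic data are hypothetical.

```python
import numpy as np

def truncated_ls(X, Y, k):
    """LS-type coefficients that keep only the k largest singular values of X'Y."""
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    XtY_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation of X'Y
    return np.linalg.solve(X.T @ X, XtY_k)          # (X'X)^{-1} times the truncated X'Y

rng = np.random.default_rng(1)
T, M, N = 300, 8, 4                                  # hypothetical sizes
X = rng.standard_normal((T, M))
Y = X @ rng.standard_normal((M, N)) + rng.standard_normal((T, N))

B_full = truncated_ls(X, Y, k=N)                     # all factors kept
B_ls = np.linalg.lstsq(X, Y, rcond=None)[0]          # plain least squares
B_2 = truncated_ls(X, Y, k=2)                        # two factors kept

print(np.allclose(B_full, B_ls))
print(np.linalg.matrix_rank(B_2, tol=1e-8))
```

With every factor kept the result is the ordinary LS estimator, so any gain has to come from the choice of k, which is the question the slide raises.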

  12. RMT
     • Marchenko-Pastur conditions give the density of the singular values and its interval limits for square matrices. Bouchaud, Miceli et al (2005) derive them for rectangular matrices.
     • We run exactly the same experiment “many times” with purely randomly generated matrices: the limits and densities replicate the theory.
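
The random-matrix experiment can be sketched as follows. The assumptions here are not from the slides: Gaussian entries, a QR orthonormalisation standing in for the normalization step, and 200 repetitions. The sizes T = 100, N = 9 and M = 77 are borrowed from the CPI example with window W = 100. Under these assumptions every singular value lies in [0, 1], and the empirical band can be compared with the theoretical limits.

```python
import numpy as np

def random_singular_values(T, N, M, rng):
    """Singular values of Qx'Qy for independent Gaussian X (T x M) and Y (T x N)."""
    X = rng.standard_normal((T, M))
    Y = rng.standard_normal((T, N))
    # Orthonormalise the columns so that Qx'Qx = I and Qy'Qy = I
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

rng = np.random.default_rng(0)
T, N, M = 100, 9, 77                     # sizes borrowed from the CPI example
draws = np.array([random_singular_values(T, N, M, rng) for _ in range(200)])

print("largest singular value seen: ", draws.max())
print("smallest singular value seen:", draws.min())
```

A histogram of `draws.ravel()` gives the empirical density; singular values from real data that fall inside this band are hard to distinguish from noise.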

  13. Marchenko-Pastur limits and density

  14. RMT
     • Density and limits do change depending on whether we use raw or already diagonalized data.
     • Is this “double diagonalization” worthwhile?
     • Singular values are homogeneous of degree 0 (HD0) under standardization; eigenvectors are NOT.

  15. Diagonalized “LS estimator”
     Very disturbing. We may approach the same problem in different ways:
     1. raw data
     2. normalized factors
     3. non-normalized factors
     “Unfortunately”, 3. works best. Why? … Is it because factor normalization changes the ranking of the SVD singular values and this eventually affects the factor selection? NO! Answer at the end …

  16. Example: Forecasting US CPI Indexes
     Time series are month-on-month (mom) % changes:
     • Y := 9 CPI indexes, Aug 1983 – Apr 2007
     • X := 77 macroeconomic series, Nov 1983 – Apr 2007, including 3 lags of the Ys.
     T = 282, N = 9, M = 77, rolling window W = 100 or another length. n = N/W, m = M/W.

  17. CPIs

  18. Xs

  19. Estimation by Model III

  21. Singular values: Model I – Random generated DATA

  23. Singular values for SVD on raw and random DATA

  25. Estimation by Model II. Factors are divided by their own eigenvalue.
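
A sketch of what factor normalization might look like in code. The slide only says the factors are divided by "their own eigenvalue"; the version below assumes this means dividing each principal-component factor by the square root of its eigenvalue of Z’Z (the singular value of Z), since that scaling makes the factors orthonormal. The helper `normalized_factors` and the synthetic data are hypothetical.

```python
import numpy as np

def normalized_factors(Z):
    """Principal-component factors of Z, rescaled to be orthonormal (assumed scaling)."""
    Zc = Z - Z.mean(axis=0)                  # centre each series
    d, U = np.linalg.eigh(Zc.T @ Zc)         # eigenvalues in ascending order
    order = np.argsort(d)[::-1]              # re-sort: largest eigenvalue first
    d, U = d[order], U[:, order]
    F = Zc @ U                               # raw factors, F'F = diag(d)
    return F / np.sqrt(d)                    # ASSUMPTION: divide by sqrt(eigenvalue)

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 6)) @ rng.standard_normal((6, 6))   # correlated regressors
Y = rng.standard_normal((100, 3))

Fx, Fy = normalized_factors(X), normalized_factors(Y)
s = np.linalg.svd(Fx.T @ Fy, compute_uv=False)   # singular values between the factor sets
print(s)
```

With orthonormal factors on both sides the singular values are bounded by 1, which is the scale on which the Model II limits are quoted.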

  26. Singular values: Model II – Data, NORMALIZED FACTORS. lambda max = 0.934, lambda min = 0.608

  27. Singular values: Model II – Randomly generated, NORMALIZED FACTORS. lambda max = 0.934, lambda min = 0.608. The randomly generated singular values do not look very different …
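
The quoted limits can be approximately reproduced under two assumptions that the slides do not state. The first is that the band edges for normalized rectangular matrices take the form γ± = n + m − 2mn ± 2√(mn(1−n)(1−m)), with singular-value limits √γ±, which appears to be the form derived in the rectangular-matrix work cited on the RMT slide. The second is that the window is W = 120, the in-sample length used in the out-of-sample exercise. The function name is hypothetical.

```python
import math

def normalized_sv_limits(n, m):
    """Edges (s_min, s_max) of the random singular-value band for n = N/W, m = M/W.

    ASSUMED form: gamma_pm = n + m - 2mn +/- 2*sqrt(mn(1-n)(1-m)), s = sqrt(gamma).
    """
    centre = n + m - 2.0 * m * n
    half_width = 2.0 * math.sqrt(m * n * (1.0 - n) * (1.0 - m))
    return math.sqrt(centre - half_width), math.sqrt(centre + half_width)

N, M, W = 9, 77, 120          # W = 120 is an assumption, not stated on this slide
s_min, s_max = normalized_sv_limits(N / W, M / W)
print(s_min, s_max)
```

This gives roughly 0.61 and 0.93, close to the 0.608 and 0.934 on the slides; the small gap in the lower edge may reflect rounding or a slightly different window.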

  30. Singular values for SVD on raw and random FACTORS

  31. Let’s see estimations by Model III

  32. P&L Model III - Factors on raw data

  33. P&L Model III - CPI Indexes (Model of Non Normalized Factors) – In sample [panels: with ALL SVD factors; with 2 SVD factors]

  34. Let’s see estimations by Model II (normalized factors)

  35. P&L Model II (Normalized factors) - Factors

  36. P&L Model II (Normalized factors) – CPIs

  37. Example of CPI_comdty estimation [panels: non-normalized factors; normalized factors]

  38. OUT OF SAMPLE
     • Estimation on t = 1, …, 120
     • Forecast at fixed coefficients for t = 121, …, 282
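
The split described here can be sketched with synthetic data. The sizes T = 282, N = 9, M = 77 and the 120-observation estimation sample come from the slides; the Gaussian data, the plain LS estimator and the RMSE summary are illustrative assumptions, not the author's actual series or model.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, M, T_in = 282, 9, 77, 120              # sizes from the slides
X = rng.standard_normal((T, M))              # synthetic stand-in for the macro series
B_true = rng.standard_normal((M, N)) / np.sqrt(M)
Y = X @ B_true + 0.5 * rng.standard_normal((T, N))

# Estimate once on the first 120 observations (t = 1..120)
B_hat, *_ = np.linalg.lstsq(X[:T_in], Y[:T_in], rcond=None)

# Forecast t = 121..282 with the coefficients held fixed
Y_fcst = X[T_in:] @ B_hat
rmse = np.sqrt(np.mean((Y[T_in:] - Y_fcst) ** 2, axis=0))   # one RMSE per series
print(rmse)
```

Since the coefficients are never re-estimated, any drift in the true relationship shows up directly in the forecast errors over the 162 out-of-sample periods.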

  39. P&L: Factors (Model II)

  40. Forecast on CPIs [panels: all factors; 2 factors only]
     Easier to predict: 1. medical care (since stable), 2. commodities (oil), 3. transport

  41. Forecasts on CPI Comdty

  42. Conclusions 1

  43. Conclusions on the example
