
Multicollinearity in Regression Principal Components Analysis

Standing Heights and Physical Stature Attributes Among Female Police Officer Applicants


Presentation Transcript


  1. Multicollinearity in Regression: Principal Components Analysis
     Standing Heights and Physical Stature Attributes Among Female Police Officer Applicants
     S.Q. Lafi and J.B. Kaneene (1992). “An Explanation of the Use of Principal Components Analysis to Detect and Correct for Multicollinearity,” Preventive Veterinary Medicine, Vol. 13, pp. 261-275

  2. Data Description
     • Subjects: 33 females applying for police officer positions
     • Dependent variable: Y ≡ Standing Height (cm)
     • Independent variables:
       • X1 ≡ Sitting Height (cm)
       • X2 ≡ Upper Arm Length (cm)
       • X3 ≡ Forearm Length (cm)
       • X4 ≡ Hand Length (cm)
       • X5 ≡ Upper Leg Length (cm)
       • X6 ≡ Lower Leg Length (cm)
       • X7 ≡ Foot Length (inches)
       • X8 ≡ BRACH = 100·X3/X2
       • X9 ≡ TIBIO = 100·X6/X5

  3. Data

  4. Standardizing the Predictors

  5. Correlations Matrix of Predictors and Inverse

  6. Variance Inflation Factors (VIFs)
     • A VIF measures how much a regression coefficient’s variance is inflated by correlations among the predictors
     • VIF_j = 1/(1 - R_j^2), where R_j^2 is the coefficient of multiple determination when Xj is regressed on the remaining predictors
     • Values > 10 are often considered problematic
     • VIFs can be obtained as the diagonal elements of R^(-1), the inverse of the predictors' correlation matrix
     Not surprisingly, X2, X3, X5, X6, X8, and X9 are problematic: X8 and X9 are defined as ratios of X3 to X2 and of X6 to X5.
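The two routes to the VIF named on this slide (the diagonal of the inverse correlation matrix, and 1/(1 - R_j^2) from auxiliary regressions) can be checked against each other numerically. The sketch below is only illustrative: the data are synthetic stand-ins, not the applicant measurements, and the function names `vif_from_correlation` and `vif_from_regressions` are hypothetical.

```python
import numpy as np

# Synthetic stand-in data (NOT the police applicant measurements).
rng = np.random.default_rng(0)
n = 33
x1 = rng.normal(30.0, 2.0, n)   # hypothetical upper-arm-like length
x2 = rng.normal(25.0, 1.5, n)   # hypothetical forearm-like length
x3 = 100.0 * x2 / x1            # ratio variable, built like BRACH
X = np.column_stack([x1, x2, x3])


def vif_from_correlation(X):
    """VIFs as the diagonal of the inverse predictor correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))


def vif_from_regressions(X):
    """VIFs as 1 / (1 - R_j^2), regressing each column on the others."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out


vif_r = vif_from_correlation(X)
vif_reg = vif_from_regressions(X)
print(np.round(vif_r, 1))
print(np.round(vif_reg, 1))
```

Because the third column is a ratio of the other two, all three columns are nearly linearly dependent, which is the same mechanism that inflates the VIFs for the variables entering BRACH and TIBIO.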

  7. Regression of Y on [1|X*]
     Note the surprising negative coefficients for X3*, X5*, and X9*.

  8. Principal Components Analysis
     While the columns of X* are highly correlated, the columns of W are uncorrelated (W = X*V, in the notation of slide 12, where V holds the eigenvectors).
     The λs (eigenvalues) represent the variance corresponding to each principal component.
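A small numerical check of the two claims on this slide: the component scores are uncorrelated, and their variances equal the eigenvalues. This is a sketch under assumptions: the data are synthetic, the predictors are standardized as z-scores with the n-1 divisor (the slides' exact scaling is not shown in the transcript), and W is taken to be X*V with V the eigenvectors of the predictor correlation matrix.

```python
import numpy as np

# Synthetic correlated predictors: one shared latent factor plus noise.
rng = np.random.default_rng(1)
n, p = 33, 4
latent = rng.normal(size=(n, 1))
X = latent @ np.ones((1, p)) + 0.3 * rng.normal(size=(n, p))

# Assumed standardization: centre each column, divide by its sample SD.
X_star = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to descending.
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Principal component scores.
W = X_star @ V

# Sample covariance of the scores: diagonal, with the eigenvalues on it.
cov_W = np.cov(W, rowvar=False)
print(np.round(cov_W, 6))
print(np.round(eigvals, 6))
```

With this scaling the eigenvalues sum to p, the number of predictors, so each eigenvalue divided by p is the share of total standardized variance carried by that component.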

  9. Police Applicants Height Data - I

  10. Police Applicants Height Data - II

  11. Regression of Y on [1|W]
      Note that W8 and W9 have very small eigenvalues and very small t-statistics.
      Their condition indices are 63.5 and 85.2, both well above 10.
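The transcript does not show how the condition indices were computed. One common definition, assumed here, is the square root of the largest eigenvalue of the predictor correlation matrix divided by each eigenvalue in turn. The helper name `condition_indices` is hypothetical, and the 2×2 matrix is a toy example rather than the slide's data.

```python
import numpy as np


def condition_indices(R):
    """Condition indices sqrt(lambda_max / lambda_j), largest eigenvalue first.

    Assumed definition; the slides may use a different convention.
    """
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
    return np.sqrt(eigvals[0] / eigvals)


# Toy correlation matrix for two predictors with correlation 0.99.
# Its eigenvalues are 1.99 and 0.01.
R_toy = np.array([[1.00, 0.99],
                  [0.99, 1.00]])

ci = condition_indices(R_toy)
print(np.round(ci, 2))  # second index is sqrt(1.99 / 0.01) = sqrt(199), about 14.11
```

Under this definition a component with a tiny eigenvalue gets a large index, which matches the slide's pairing of very small eigenvalues with high condition indices for W8 and W9.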

  12. Reduced Model
      • Remove the last 2 principal components because of their small, insignificant t-statistics and high condition indices
      • Let V(g) be the p×g matrix of the eigenvectors for the g retained principal components (p=9, g=7)
      • Let W(g) = X*V(g)
      • Then regress Y on [1|W(g)]
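The steps on this slide can be sketched as follows. The data are synthetic and smaller than the slide's (p=4, g=3 instead of p=9, g=7), the z-score standardization is an assumption, and `fit_with_intercept` is a hypothetical helper.

```python
import numpy as np

# Synthetic data; the slides use p=9 predictors and keep g=7 components.
rng = np.random.default_rng(2)
n, p, g = 33, 4, 3
latent = rng.normal(size=(n, 1))
X = latent @ np.ones((1, p)) + 0.3 * rng.normal(size=(n, p))
y = 160.0 + 5.0 * latent[:, 0] + rng.normal(size=n)

# Assumed standardization (z-scores with the n-1 divisor).
X_star = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Eigenvectors of the correlation matrix, largest eigenvalue first.
eigvals, V = np.linalg.eigh(np.corrcoef(X, rowvar=False))
V = V[:, np.argsort(eigvals)[::-1]]


def fit_with_intercept(Z, y):
    """Least-squares fit of y on [1 | Z]; returns intercept then slopes."""
    A = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


V_g = V[:, :g]        # p x g matrix of retained eigenvectors
W_g = X_star @ V_g    # n x g matrix of retained component scores

gamma_g = fit_with_intercept(W_g, y)             # reduced model
gamma_full = fit_with_intercept(X_star @ V, y)   # all p components

print(np.round(gamma_g, 4))
print(np.round(gamma_full, 4))
```

Because the component scores are centred and mutually orthogonal, dropping components leaves the coefficients of the retained ones unchanged; only the residual variance, and hence the standard errors, shifts.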

  13. Reduced Regression Fit

  14. Transforming Back to X-scale
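The slide's own formulas are not in the transcript. One standard way to carry out the back-transformation, assumed here, follows from W(g) = X*V(g): if γ holds the coefficients on the retained components, the implied coefficients on the standardized predictors are V(g)γ, with covariance V(g) Cov(γ) V(g)'. The sketch uses synthetic data and z-score standardization, and also converts to the raw X scale; all variable names are illustrative.

```python
import numpy as np

# Synthetic data, not the applicant measurements.
rng = np.random.default_rng(3)
n, p, g = 33, 4, 3
latent = rng.normal(size=(n, 1))
X = 50.0 + 4.0 * (latent @ np.ones((1, p))) + rng.normal(size=(n, p))
y = 160.0 + 5.0 * latent[:, 0] + rng.normal(size=n)

# Assumed standardization (z-scores with the n-1 divisor).
x_bar, s = X.mean(axis=0), X.std(axis=0, ddof=1)
X_star = (X - x_bar) / s

# Retained eigenvectors, largest eigenvalues first.
eigvals, V = np.linalg.eigh(np.corrcoef(X, rowvar=False))
V_g = V[:, np.argsort(eigvals)[::-1]][:, :g]

# Reduced regression of y on [1 | W_g].
A = np.column_stack([np.ones(n), X_star @ V_g])
gamma, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ gamma
sigma2 = (resid @ resid) / (n - g - 1)
cov_gamma = sigma2 * np.linalg.inv(A.T @ A)[1:, 1:]

# Back to the standardized-predictor scale.
beta_star = V_g @ gamma[1:]
se_beta_star = np.sqrt(np.diag(V_g @ cov_gamma @ V_g.T))

# Back to the raw X scale (valid only for the z-score scaling assumed above).
beta_raw = beta_star / s
intercept_raw = gamma[0] - beta_raw @ x_bar

# Both parameterizations give the same fitted values.
fitted_pc = A @ gamma
fitted_raw = intercept_raw + X @ beta_raw
print(np.round(beta_star, 4))
print(np.round(se_beta_star, 4))
```

The back-transformed coefficients describe the same fitted model as the component regression, so they can be set beside the original least-squares coefficients and standard errors.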

  15. Comparison of Coefficients and SEs: Original Model vs. Principal Components
