
Items to consider - 3


Presentation Transcript


1. Items to consider - 3

• Multicollinearity
• The relationship between IVs… a problem that arises when IVs are highly correlated with one another
• What to do:
  • Examine the correlation matrix of all IVs & the DV to detect any multicollinearity
  • Look for r's between IVs in excess of 0.70
  • If detected, it is generally best (or at least simplest) to re-run the MLR and eliminate one of the offending IVs from the model (see model reduction, later); a screening sketch follows below
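A minimal sketch of that screening step in Python with pandas; the DataFrame, column names, and threshold default are placeholders for illustration, not part of the slides:

```python
import pandas as pd

def flag_collinear_pairs(df: pd.DataFrame, iv_cols: list, threshold: float = 0.70):
    """Return IV pairs whose absolute Pearson r exceeds the threshold."""
    corr = df[iv_cols].corr()  # correlation matrix of the IVs
    flagged = []
    for i, a in enumerate(iv_cols):
        for b in iv_cols[i + 1:]:
            r = corr.loc[a, b]
            if abs(r) > threshold:
                flagged.append((a, b, round(float(r), 3)))
    return flagged

# Hypothetical usage, mirroring the GPA example later in the deck:
# flag_collinear_pairs(df, ["hs_gpa", "sat_total", "attitude"])
```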

2. Multicollinearity – what is it?

• It has to do with the unique and shared variance of the IVs with the criterion & with one another
• Must establish what unique variance on each predictor (IV) is related to variance on the criterion (DV)
• Example 1 (graphical):
  • y – freshman college GPA
  • predictor 1 – high school GPA
  • predictor 2 – SAT total score
  • predictor 3 – attitude toward education

3. Multicollinearity – what is it?

[Venn diagram: circle = variance for a variable; overlap = shared variance (only 2 predictors shown). Labelled regions include: the variance in y accounted for by predictor 2 after the effect of predictor 1 has been partialled out, and the common variance in y that both predictors 1 and 2 account for.]

4. Multicollinearity – what is it?

[Venn diagram: circle = variance for a variable; overlap = shared variance (only 2 predictors shown). y, x1, x2; Total R² = .66, or 66%.]

5. Multicollinearity – what is it?

[Venn diagram: circle = variance for a variable; overlap = shared variance (only 2 predictors shown). y, x1, x2; Total R² = .33, or 33%.]
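The contrast between the two totals can be reproduced numerically. A minimal NumPy sketch, assuming the 33% diagram depicts predictors that overlap heavily with one another; all numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

def r_squared(X, y):
    """R^2 of an ordinary least squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

# Nearly independent predictors: each contributes unique variance to y.
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
y = x1 + x2 + rng.standard_normal(n)
print(r_squared(np.column_stack([x1, x2]), y))    # ~ .66

# Heavily overlapping predictors (r ~ .995): they mostly explain the
# *same* variance in y, so the total R^2 is far smaller.
x2c = x1 + 0.1 * rng.standard_normal(n)
yc = x1 + x2c + 2.8 * rng.standard_normal(n)
print(r_squared(np.column_stack([x1, x2c]), yc))  # ~ .33
```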

6. Multicollinearity – what is it?

• Example 2 (words):
  • y – freshman college GPA
  • predictor 1 – high school GPA
  • predictor 2 – SAT total score
  • predictor 3 – attitude toward education

7. Multicollinearity – what is it?

• Region 1 = variance in college GPA predictable from variance in high school GPA
• Region 2 = residual variance in SAT related to variance in college GPA
• Region 3 = residual variance in attitude related to variance in college GPA
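Those region-by-region increments correspond to hierarchical (sequential) entry of predictors: each ΔR² is the residual variance in y accounted for by the predictor just added. A minimal sketch in Python with statsmodels, using synthetic stand-ins for the slide's variables (all values invented):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-ins for the slide's variables.
rng = np.random.default_rng(2)
n = 500
hs_gpa = rng.normal(3.0, 0.5, n)
sat_total = 200 * hs_gpa + rng.normal(0, 150, n)   # shares variance with HS GPA
attitude = rng.normal(50, 10, n)                   # largely independent
college_gpa = (0.6 * hs_gpa + 0.001 * sat_total
               + 0.01 * attitude + rng.normal(0, 0.4, n))
df = pd.DataFrame({"college_gpa": college_gpa, "hs_gpa": hs_gpa,
                   "sat_total": sat_total, "attitude": attitude})

# Hierarchical entry: each Delta R^2 is the residual variance in y
# accounted for by the predictor just added.
r2_1 = smf.ols("college_gpa ~ hs_gpa", df).fit().rsquared
r2_2 = smf.ols("college_gpa ~ hs_gpa + sat_total", df).fit().rsquared
r2_3 = smf.ols("college_gpa ~ hs_gpa + sat_total + attitude", df).fit().rsquared
print(f"HS GPA alone: R2 = {r2_1:.3f}")
print(f"+ SAT adds      {r2_2 - r2_1:.3f}")
print(f"+ attitude adds {r2_3 - r2_2:.3f}")
```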

8. Multicollinearity – what is it?

• Consider these: [three Venn diagrams, A, B & C, with differing predictor–criterion and predictor–predictor overlap]
• Which would we expect to have the largest overall R², and which the smallest?

9. Multicollinearity – what is it?

• R² will be at least .7 for B & C, but only at least .3 for A
• No chance of R² for A getting much larger, because the intercorrelations of the X's are as large for A as for B & C

10. Multicollinearity – what is it?

• R will probably be largest for B:
  • predictors are correlated with Y
  • not much redundancy among predictors
• R probably greater in B than in C, as C has considerable redundancy among its predictors

11. What effect does the big M have?

• Can inflate the standard errors (SEs) of the regression coefficients (those involved in the multicollinearity)
• This can lead to insignificant findings for those coefficients
• So predictors that may be significant when used in isolation may not be significant when used together
• Can also lead to imprecision in the regression coefficients (mistakes in estimating the change in Y for a unit change in the IV)
• So a model with multicollinearity is misleading & can have redundancy among the predictors (see the simulation sketch below)
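A minimal simulation sketch in Python with statsmodels of the standard-error inflation; the coefficients and the correlation level are invented for illustration:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
noise = rng.standard_normal(n)

def se_of_b1(x2):
    """Fit y = b0 + b1*x1 + b2*x2 and return the standard error of b1."""
    y = 1.0 + 0.5 * x1 + 0.5 * x2 + noise
    X = sm.add_constant(np.column_stack([x1, x2]))
    return sm.OLS(y, X).fit().bse[1]

# With an uncorrelated partner, SE(b1) stays small ...
print("uncorrelated x2:", se_of_b1(rng.standard_normal(n)))
# ... with a near-duplicate partner (r ~ .98), SE(b1) inflates roughly
# five-fold, so a genuinely useful predictor can look non-significant.
print("collinear x2:   ", se_of_b1(x1 + 0.2 * rng.standard_normal(n)))
```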

12. What do we do about the big M?

• Many opinions
• E.g. O'Brien (2007). A Caution Regarding Rules of Thumb for Variance Inflation Factors. Quality & Quantity, 41(5), 673-690
• Can use "VIF" (variance inflation factor) and "tolerance" values in SPSS ("problem" variables are those with VIF > 4; tolerance = 1/VIF)
• Can painstakingly examine all possible versions of the model (putting each predictor in 1st)
• We'll just signal multicollinearity with an r > .70 and enforce removal of at least one of the variables,
• and signal possible multicollinearity with an r between .5 and .7 and suggest examining the model with and without one of the variables (a VIF sketch follows below)
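Outside SPSS, the same VIF/tolerance check can be run with statsmodels; a minimal sketch on synthetic data, where sat_total is deliberately built to correlate strongly with hs_gpa (names and values hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical IVs; sat_total shares most of its variance with hs_gpa.
rng = np.random.default_rng(3)
n = 300
hs_gpa = rng.normal(3.0, 0.5, n)
ivs = pd.DataFrame({
    "hs_gpa": hs_gpa,
    "sat_total": 200 * hs_gpa + rng.normal(0, 40, n),
    "attitude": rng.normal(50, 10, n),
})

X = sm.add_constant(ivs)
for i, name in enumerate(ivs.columns, start=1):  # index 0 is the constant
    vif = variance_inflation_factor(X.values, i)
    print(f"{name}: VIF = {vif:.2f}, tolerance = {1 / vif:.2f}")
```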

13. The Goal of MLR

• The big picture…
• What we're trying to do is create a model predicting a DV that explains as much of the variance in that DV as possible, while at the same time we:
  • meet the assumptions of MLR
  • best manage the other issues – sample size, number of predictors, outliers, multicollinearity, r with the dependent variable, significance in the model
  • stay parsimonious (can be very important)
