This guide explores various issues in regression diagnostics, including influential points, heteroscedasticity, autocorrelation, and multicollinearity. It provides methods to identify and address these problems, such as influential point diagnostics, weighted least squares, and solutions for multicollinearity. Learn how to improve the accuracy and reliability of your regression analysis.
Problems • Influential points and outliers • Heteroscedasticity • Autocorrelation • Multicollinearity: strong relationships among the independent variables
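Residuals - review • Unstandardized residuals: e = y − ŷ = (I − H)y, where H = X(X´X)⁻¹X´ is the hat matrix • Predicted residuals: e(-i) = e_i / (1 − h_ii), the residual of observation i when the model is fitted without it (h_ii = i-th diagonal element of H)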
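Residuals - review • Standardized residuals: each residual divided by an estimate of its standard deviation • Jackknife residuals: the same scaling, but with the error variance estimated with observation i left out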
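The residual types reviewed above can be sketched for the simplest case, a regression on one independent variable. This is a minimal illustration rather than the deck's own computation: the function names are hypothetical, and the standardized residual is assumed to be e_i / (s·√(1 − h_ii)), which is one common convention (some software divides by s alone).

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def residual_diagnostics(x, y):
    """Return unstandardized residuals, leverages, predicted,
    standardized and jackknife residuals for a one-predictor model."""
    n, p = len(x), 2                      # p = number of estimated coefficients
    b0, b1 = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Leverages: diagonal elements of the hat matrix H
    h = [1 / n + (xi - mx) ** 2 / sxx for xi in x]
    sse = sum(ei ** 2 for ei in e)
    s2 = sse / (n - p)                    # error variance from all observations
    predicted = [ei / (1 - hi) for ei, hi in zip(e, h)]
    # Assumed convention: divide by s * sqrt(1 - h_ii)
    standardized = [ei / math.sqrt(s2 * (1 - hi)) for ei, hi in zip(e, h)]
    jackknife = []
    for ei, hi in zip(e, h):
        # Error variance with observation i left out (no refit needed)
        s2_del = (sse - ei ** 2 / (1 - hi)) / (n - p - 1)
        jackknife.append(ei / math.sqrt(s2_del * (1 - hi)))
    return e, h, predicted, standardized, jackknife
```

An observation that does not fit the pattern of the rest tends to have a jackknife residual much larger than its standardized residual, because leaving it out shrinks the estimated error variance.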
I. Influentials • Observations whose omission from the computation causes a big change in the regression coeffs. • Goal: to find and exclude them
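Influentials - diagnosis • DFBETA(-i) = b − b(-i): change in the coeffs when observation i is omitted • Rule of thumb: problem if |NDFBETA| > 2/√n (NDFBETA = normalized DFBETA) • Note: for the normalized DFFIT, problem if |NDFFIT| > 2/√(n/p), where p = number of estimated coeffs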
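A brute-force way to see what DFBETA measures is to refit the model once per omitted observation. The sketch below does this for the slope of a one-predictor regression. The function names are hypothetical, and the normalization (dividing by s(-i)/√Sxx) is assumed to follow the usual DFBETAS definition.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def dfbeta_slope(x, y):
    """Return (raw, normalized) DFBETA values of the slope,
    one entry per omitted observation."""
    n = len(x)
    _, b1 = fit_line(x, y)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)   # 1/Sxx is the slope's element of (X'X)^-1
    raw, normalized = [], []
    for i in range(n):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        a0, a1 = fit_line(xs, ys)           # fit without observation i
        sse = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(xs, ys))
        s_del = math.sqrt(sse / (n - 1 - 2))  # n-1 points, 2 coefficients
        raw.append(b1 - a1)                 # DFBETA(-i) = b - b(-i)
        normalized.append((b1 - a1) / (s_del / math.sqrt(sxx)))
    return raw, normalized

def flag_influentials(x, y):
    """Indices where |normalized DFBETA| exceeds the 2/sqrt(n) rule of thumb."""
    _, normalized = dfbeta_slope(x, y)
    cutoff = 2 / math.sqrt(len(x))
    return [i for i, v in enumerate(normalized) if abs(v) > cutoff]
```

Refitting n times is fine for small data sets; statistical packages obtain the same quantities from the hat matrix without refitting.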
II. Heteroscedasticity • Assumption for regression: the variance of the error is the same for all values of the indep. variables • Checking: charts of residuals vs. ind. vars • Consequence: coeff estimates stay unbiased but are inefficient; the usual standard errors are biased, so t-tests are unreliable • Tests: Glejser, Goldfeld-Quandt • Analytical solution: weighted LS (WLS)
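Weighted LS gives each observation a weight inversely proportional to its error variance, so noisy observations count less. A minimal sketch for one independent variable follows. The function name and the choice of weights are assumptions for illustration; in practice the weights come from a model of how the variance changes.

```python
def wls_line(x, y, w):
    """Weighted least squares for y = b0 + b1*x.

    Minimizes sum(w_i * (y_i - b0 - b1*x_i)^2); returns (b0, b1).
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

# Hypothetical example: error standard deviation assumed proportional to x,
# so the weights are 1 / x^2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.8, 6.5, 7.4, 11.0]
weights = [1 / xi ** 2 for xi in x]
b0, b1 = wls_line(x, y, weights)
```

With all weights equal, the function reduces to ordinary least squares.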
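Glejser's test • Model for the absolute residuals on the ind. vars: |e_i| = a0 + a1·x_i + v_i; a statistically significant a1 indicates heteroscedasticity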
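The Glejser procedure can be sketched in two steps: fit the original regression, then regress the absolute residuals on the independent variable and inspect the t-statistic of the slope. The code below assumes the linear form of the auxiliary model and a single independent variable; the function names are hypothetical.

```python
import math

def ols_with_se(x, y):
    """OLS for y = b0 + b1*x; returns (b0, b1, standard error of b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b0, b1, se_b1

def glejser(x, y):
    """Return (a1, t) from regressing |residual| on x.

    A large |t| (compared with a t distribution on n - 2 degrees of
    freedom) suggests that the error spread changes with x.
    """
    b0, b1, _ = ols_with_se(x, y)
    abs_res = [abs(yi - (b0 + b1 * xi)) for xi, yi in zip(x, y)]
    _, a1, se_a1 = ols_with_se(x, abs_res)
    return a1, a1 / se_a1
```

Other forms of the auxiliary model, for example using √x or 1/x in place of x, can be tried by transforming the second argument before the auxiliary fit.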
III. Multicollinearity • Estimate: b = (X´X)⁻¹X´y • Strong relationship between ind. variables: X´X is a singular or nearly singular matrix • Consequence: standard errors of coeffs are inflated, t-tests are statistically insignificant, estimates are not stable
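Multicollinearity Diagnosis • Correlation of ind. vars: problem if |cor. coeff| > 0.8 • Other options: a) Tolerance = 1 − R²j, where R²j comes from regressing the j-th ind. var on the other ind. vars b) VIF = 1/(1 − R²j), the diagonal elements of R⁻¹ (R = correlation matrix of the ind. vars) c) Condition index: square root of the ratio of the eigenvalues max lambda/min lambda; ROT*: > 30 → problem *ROT = rule of thumb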
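For exactly two independent variables these diagnostics have closed forms, which makes the definitions easy to check by hand: R²j equals the squared correlation r², and the eigenvalues of the 2×2 correlation matrix are 1 + |r| and 1 − |r|. The sketch below uses those facts. The function names are hypothetical, and basing the condition index on the correlation matrix is an assumption (some texts use a scaled X´X that includes the intercept).

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

def collinearity_two_predictors(x1, x2):
    """Return (r, tolerance, VIF, condition index) for two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2               # R_j^2 = r^2 with only two predictors
    vif = 1 / tolerance
    # Eigenvalues of [[1, r], [r, 1]] are 1 + |r| and 1 - |r|
    cond_index = math.sqrt((1 + abs(r)) / (1 - abs(r)))
    return r, tolerance, vif, cond_index
```

With more than two independent variables the same quantities require inverting the correlation matrix and computing its eigenvalues numerically.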
Multicollinearity Solution • Ignore • Leave out a variable • Get more data • Use factor analysis (see later) • Ridge regression: biased estimates but smaller standard errors (a small constant is added to the diagonal elements of X´X)
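Ridge regression replaces (X´X)⁻¹X´y with (X´X + kI)⁻¹X´y for a small constant k > 0. The sketch below shows this for two centered independent variables, where the 2×2 system can be solved directly. The function name and the symbol k are assumptions for illustration; choosing k is a separate problem not covered here.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate for y on two centered predictors (no intercept).

    Solves (X'X + k*I) b = X'y; k = 0 gives ordinary least squares.
    """
    def center(v):
        m = sum(v) / len(v)
        return [vi - m for vi in v]

    x1, x2, y = center(x1), center(x2), center(y)
    # Elements of X'X, with k added on the diagonal
    a11 = sum(u * u for u in x1) + k
    a22 = sum(u * u for u in x2) + k
    a12 = sum(u * v for u, v in zip(x1, x2))
    # Elements of X'y
    g1 = sum(u * v for u, v in zip(x1, y))
    g2 = sum(u * v for u, v in zip(x2, y))
    det = a11 * a22 - a12 ** 2
    b1 = (a22 * g1 - a12 * g2) / det
    b2 = (a11 * g2 - a12 * g1) / det
    return b1, b2
```

Increasing k moves the determinant away from zero, which stabilizes the solution at the cost of shrinking the coefficients toward zero.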
IV. Autocorrelation • Assumption for regression: the errors of individual observations are independent of one another • Checking: charts of residuals vs. time • Consequence: coeff estimates are inefficient; the usual standard errors are biased, so t-tests are unreliable • Tests: Durbin-Watson • Solution: generalized LS (weighted LS is the special case with uncorrelated errors) • Autocorrelation arises mainly in time series (usually not used in social sciences)
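The Durbin-Watson statistic is computed from the residuals in time order as d = Σ(e_t − e_{t−1})² / Σe_t². A minimal sketch, with a hypothetical function name:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order.

    Values near 2 suggest no first-order autocorrelation, values
    toward 0 positive autocorrelation, values toward 4 negative.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den
```

For long series d is approximately 2(1 − r), where r is the correlation between successive residuals. Judging significance requires the tabulated Durbin-Watson bounds, which this sketch does not include.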