Regression diagnostics
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/Talks/
Jan-20
Agenda
• Linear regression diagnostics
  • Assumptions
  • Robust results
• (Logistic regression)
• (Poisson regression)
2 January 2020 H.S.
Linear regression
Birth weight by gestational age
Workflow
• Scatterplots
• Bivariate analysis
• Regression
  • Model fitting
    • Cofactors in/out
    • Interactions
  • Test of assumptions
    • Independent errors
    • Linear effects
    • Constant error variance
  • Influence (robustness)
Scatterplot
Results
• Outcome: birth weight
• Covariates: gestational age, sex, parity
• Model: linear regression
• Note: synthetic data
Model diagnostics
• Model
• Assumptions
  • Independent errors (residuals)
  • Linear effects
  • Constant error variance
• Robustness
Checking assumptions
1. Independent residuals
• No diagnostic tool
• Possible violations
  • Pupils nested in schools: weak correlations
  • Repeated measurement: strong correlations
• Models
  • Adjust for clustering
  • Linear mixed models
  • GEE
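
The three modelling options on this slide could look roughly as follows in Stata. This is a sketch on simulated data, not the presentation's own program: the variables `school`, `u`, `x` and `y` are hypothetical, and `mixed` is the name used from Stata 13 onward (older versions call it `xtmixed`).

```stata
* Hypothetical data: 40 schools with 25 pupils each
clear
set seed 12345
set obs 40
generate school = _n
* School-level random effect shared by all pupils in a school
generate u = rnormal(0, 2)
expand 25
generate x = rnormal()
generate y = 10 + 1.5*x + u + rnormal(0, 4)

* 1) Adjust for clustering: cluster-robust standard errors
regress y x, vce(cluster school)

* 2) Linear mixed model with a random intercept per school
mixed y x || school:

* 3) GEE with an exchangeable working correlation
xtset school
xtgee y x, family(gaussian) link(identity) corr(exchangeable)
```

All three should give similar coefficients for `x` here; what changes is how the standard errors account for the correlation within schools.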
2. Linear effects
• Save residuals and predicted values
• Plot resid vs pred
• If non-linear:
  • Plot resid vs cont. vars
  • Add square term
  • or cut in categories
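
A minimal sketch of these steps in Stata, assuming simulated data. The variable names `weigth`, `gest` and `sex` follow the commands shown elsewhere in the slides; the data-generating lines and the new names `pred`, `res` and `gestcat` are illustrative assumptions.

```stata
* Hypothetical synthetic data with a curved effect of gestational age
clear
set seed 2020
set obs 500
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = 3500 + 18*(gest-280) - 0.3*(gest-280)^2 + 100*sex + rnormal(0, 400)

* Fit the linear model, then save predicted values and residuals
regress weigth gest sex
predict pred, xb
predict res, residuals

* Residuals versus predicted: look for curvature around the zero line
scatter res pred, yline(0)

* If it looks non-linear: residuals versus each continuous covariate
scatter res gest, yline(0)

* Alternative to a square term: cut gestational age into quartile groups
xtile gestcat = gest, nquantiles(4)
regress weigth i.gestcat sex
```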
Linear effect test
• Model 1: only linear terms
• Model 2: linear terms + square term
• A significant test of Model 2 against Model 1 means non-linearity
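
One way to run such a comparison in Stata is sketched below, on simulated data. The slide does not say which test was used, so both a Wald test of the square term and a likelihood-ratio test are shown as assumptions; the stored-estimate names `m1` and `m2` are hypothetical.

```stata
* Hypothetical synthetic data with a curved effect of gestational age
clear
set seed 2020
set obs 500
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = 3500 + 18*(gest-280) - 0.3*(gest-280)^2 + 100*sex + rnormal(0, 400)
generate gest2 = gest^2

* Model 1: only linear terms
regress weigth gest sex
estimates store m1

* Model 2: linear terms + square term
regress weigth gest gest2 sex
estimates store m2

* Wald test of the square term in Model 2
test gest2

* Likelihood-ratio test of Model 2 against Model 1
lrtest m1 m2
```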
3. Constant residual variance
• Plot resid vs pred
• If non-constant variance:
  • Robust regression
  • Weighted regression
Constant variance test
• A significant test means non-constant variance
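
The slide does not name the test. As an assumption, the sketch below uses Stata's `estat hettest` (Breusch-Pagan / Cook-Weisberg) on simulated data whose error spread grows with the distance from term, followed by the robust-variance fit.

```stata
* Hypothetical synthetic data with non-constant error variance
clear
set seed 2020
set obs 500
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = -1500 + 18*gest + 100*sex + rnormal()*(200 + 15*abs(gest-280))

regress weigth gest sex

* Residual-versus-fitted plot: a fan shape suggests non-constant variance
rvfplot, yline(0)

* Test of constant variance: a small p-value indicates heteroskedasticity
estat hettest

* Same coefficients, robust (sandwich) standard errors
regress weigth gest sex, vce(robust)
```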
Weighted regression
• Estimate residual variance
• Weights = 1/variance
• Weighted regression
• Effects
  • Takes care of heteroskedasticity
  • “Robustification”
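
The slide does not say how the residual variance is estimated. One common choice, used here purely as an assumption, is to regress the log of the squared residuals on the covariates and use the back-transformed fitted values as variance estimates. The names `res`, `lnres2`, `lnvar` and `w` are hypothetical.

```stata
* Hypothetical synthetic data with non-constant error variance
clear
set seed 2020
set obs 500
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = -1500 + 18*gest + 100*sex + rnormal()*(200 + 15*abs(gest-280))

* Step 1: estimate the residual variance as a function of the covariates
regress weigth gest sex
predict res, residuals
generate lnres2 = ln(res^2)
regress lnres2 gest sex
predict lnvar, xb

* Step 2: weights = 1/variance
generate w = 1/exp(lnvar)

* Step 3: weighted regression (analytic weights are inverse-variance weights)
regress weigth gest sex [aweight = w]
```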
Summary of assumptions
• Dependent residuals
  Mixed models: xtmixed
• Non-linear effects
  gen gest2=gest^2
  regress weigth gest gest2 sex
• Non-constant variance
  regress weigth gest sex, robust
Measures of influence
• Measure change in:
  • Predicted (y)
  • Deviance
  • Coefficients (beta)
• Remove obs 1, see change
• Remove obs 2, see change
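
Stata computes delete-one measures like these after `regress` without actually refitting the model for every observation. The sketch below uses simulated data with one planted extreme point. Choosing DFITS and Cook's distance for the change in predicted values, and `dfbeta` for the change in coefficients, is an assumption about which measures are meant; `id`, `dfit` and `cook` are hypothetical names.

```stata
* Hypothetical synthetic data with one planted influential observation
clear
set seed 2020
set obs 200
generate id = _n
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = -1500 + 18*gest + 100*sex + rnormal(0, 400)
replace gest = 230 in 1
replace weigth = 5500 in 1

regress weigth gest sex

* Change in predicted value when each observation is left out
predict dfit, dfits
predict cook, cooksd

* Change in each coefficient when each observation is left out
* (creates one _dfbeta_* variable per covariate)
dfbeta

* Inspect the observations with the largest Cook's distance
gsort -cook
list id gest weigth dfit cook in 1/5
```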
Influence idea
• Outlierness
  • Residuals
• Leverage
  • Distance from x-mean
• Influence
  • Combination
Leverage versus residuals²
Added variable plot (partial regression leverage)
• “Adjusted” scatterplot
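
Both plots are available as post-estimation commands after `regress` in Stata. The sketch below assumes simulated data with one planted extreme point; `id` is a hypothetical label variable.

```stata
* Hypothetical synthetic data with one planted influential observation
clear
set seed 2020
set obs 200
generate id = _n
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = -1500 + 18*gest + 100*sex + rnormal(0, 400)
replace gest = 230 in 1
replace weigth = 5500 in 1

regress weigth gest sex

* Leverage versus normalized squared residuals, points labelled by id
lvr2plot, mlabel(id)

* Added variable plot for gest: weigth against gest, both adjusted for sex
avplot gest, mlabel(id)

* Added variable plots for all covariates in one graph
avplots
```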
Summary: Robustness, influence
• Linear regression sensitive!
• Look for influential points
  • Leverage versus residual plots
  • Added variable plots
  • Delta-beta
• Rerun regression without influential points and look for change in:
  • coefficients
  • constant term
  • p-values
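
The rerun step might be sketched like this in Stata. The Cook's distance cutoff of 4/n is a common rule of thumb, not something stated in the slides, and the stored-estimate names `full` and `trimmed` are hypothetical.

```stata
* Hypothetical synthetic data with one planted influential observation
clear
set seed 2020
set obs 200
generate gest = rnormal(280, 12)
generate sex = runiform() < 0.5
generate weigth = -1500 + 18*gest + 100*sex + rnormal(0, 400)
replace gest = 230 in 1
replace weigth = 5500 in 1

* Fit on all observations and flag influential ones
regress weigth gest sex
estimates store full
predict cook, cooksd

* Refit without observations above the assumed cutoff of 4/n
regress weigth gest sex if cook < 4/_N
estimates store trimmed

* Compare coefficients, constant term and p-values side by side
estimates table full trimmed, b se p
```

Large shifts between the two columns suggest that the conclusions depend on a few observations.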
Logistic, Poisson regression
• Assumptions
  • Independent errors: as before
  • Linear effects: as before
  • Constant error variance: no!
• Robustness
  • Linear not robust!
  • Poisson medium robust
  • Logistic fairly robust