
Lecture 10 – Model fitting continued



Presentation Transcript


  1. Lecture 10 – Model fitting continued • Low birth weight dataset – final model • Does the number of primary care visits add anything? • Fit the model with and without this categorical variable and do an LR test • Can also perform a multivariate Wald test to test the entire group together • Can compare 1 visit to 2 or more visits in a Wald test BIOST 536 Lecture 10
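The LR test above compares nested fits: twice the gain in log-likelihood is referred to a chi-square distribution with df equal to the number of added parameters (2 here, since the 3-level visits variable adds two indicators). The lecture does this in Stata; the sketch below, with made-up log-likelihoods rather than the lecture's output, shows the arithmetic in Python.

```python
from math import exp

def lr_test_2df(loglik_reduced, loglik_full):
    """Likelihood-ratio test comparing nested logistic models.

    The full model adds a 3-level categorical covariate
    (0 / 1 / 2+ visits), i.e. 2 extra parameters, so the test has
    2 degrees of freedom.  For a chi-square with 2 df the survival
    function has the closed form exp(-x / 2).
    """
    lr_stat = 2.0 * (loglik_full - loglik_reduced)
    p_value = exp(-lr_stat / 2.0)   # exact for 2 df
    return lr_stat, p_value

# Hypothetical log-likelihoods (not from the lecture output):
stat, p = lr_test_2df(-103.0, -101.5)   # stat = 3.0
```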

  2. [Stata model output not transcribed]

  3. Comparison of models • LR test • Multivariate Wald test for # of primary care visits (0, 1, 2+) • In the output we have a Wald test for 1 visit vs. 0 and for 2+ visits vs. 0 • How does 1 visit compare to 2+? • If the number of visits is neither predictive nor a confounder of the smoking association, drop this variable.
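The Wald comparison of 1 visit versus 2+ visits is a test of β1 = β2, which needs the covariance of the two estimates as well as their variances, all read off the fitted model's coefficient covariance matrix. A Python sketch with hypothetical numbers (not the lecture's output):

```python
from math import sqrt, erf

def wald_contrast(b1, b2, v11, v22, v12):
    """Wald z-test of H0: beta1 = beta2 (e.g. 1 visit vs 2+ visits).

    Var(b1 - b2) = Var(b1) + Var(b2) - 2 Cov(b1, b2), with the
    variances and covariance taken from the fitted model.
    """
    se = sqrt(v11 + v22 - 2.0 * v12)
    z = (b1 - b2) / se
    # two-sided p-value from the standard normal cdf
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))
    return z, p

# Hypothetical estimates and (co)variances, for illustration only:
z, p = wald_contrast(0.40, 0.25, 0.04, 0.05, 0.01)
```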

  4. Profile likelihood confidence intervals • Usually give Wald-based CIs for the OR • Relies on normality of the β estimates • Could lead to a conflict: the 95% Wald CI for the OR excludes 1.0, but the LR test has p-value > 0.05 • Can construct profile likelihood ratio CIs • Model: logit P(Y = 1) = β0 + β1X1 + β2X2 + … + βkXk • X1 is the scientific variable of interest and the others are confounders/predictors to control for • Find the MLE β̂ that maximizes the log-likelihood • Now consider what happens to the log-likelihood as we move away from this estimate along the β1 axis in either direction • As we move away from β̂1, the log-likelihood decreases (and the other betas change since we still maximize over the remaining covariates) • The resulting curve is called a profile log-likelihood

  5. Profile log-likelihood • Find the point on the curve that would be a significant difference from the overall MLE for β1 • Example: suppose β̂1 = 1.0 and the profile log-likelihood is as plotted • The plot shows 2 × the change, so find the β1 values where 2 × Δ log-lik = 3.84 (the 0.05 critical value for a chi-square distribution with 1 df) • In the example the 95% CI for β1 is (0.12, 1.88), so exponentiate to get the CI for the odds ratio
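Numerically, the profile CI endpoints are the two points where the curve has dropped enough that 2 × Δ log-lik reaches 3.84. A sketch of that search, using an assumed quadratic profile calibrated to reproduce the slide's interval (0.12, 1.88); a real profile log-likelihood is obtained by re-maximizing the other βs at each fixed β1 and need not be quadratic or symmetric:

```python
from math import sqrt

CUTOFF = 3.84   # 0.05 critical value, chi-square with 1 df

def profile_ci(prof_loglik, b_mle, step=0.01, tol=1e-8):
    """Find the two points on either side of the MLE where
    2 * (max log-lik - profile log-lik) reaches the chi-square
    cutoff, by walking outward and then bisecting."""
    ll_max = prof_loglik(b_mle)

    def drop(b):                         # 2 * change in log-lik
        return 2.0 * (ll_max - prof_loglik(b))

    def endpoint(direction):
        inner, outer = b_mle, b_mle + direction * step
        while drop(outer) < CUTOFF:      # walk out past the cutoff
            inner, outer = outer, outer + direction * step
        while abs(outer - inner) > tol:  # bisect back to it
            mid = 0.5 * (inner + outer)
            if drop(mid) < CUTOFF:
                inner = mid
            else:
                outer = mid
        return 0.5 * (inner + outer)

    return endpoint(-1.0), endpoint(+1.0)

# Illustrative quadratic profile matching the slide's numbers
# (MLE 1.0, 95% CI (0.12, 1.88)); assumed for this sketch only.
sigma = 0.88 / sqrt(CUTOFF)
prof = lambda b: -0.5 * ((b - 1.0) / sigma) ** 2
lo, hi = profile_ci(prof, 1.0)
```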

  6. Profile log-likelihood • The 95% profile-based CI will not necessarily be symmetric around the MLE; the Wald CI is symmetric since it is β̂ ± 1.96 SE(β̂) • Profile CIs are not included in standard Stata • Can download a routine written by Mark Pearce (Univ of Newcastle upon Tyne), Stata Technical Bulletin 56, July 2000 • In Stata type search logprof and install

  7. Profile log-likelihood • The 95% profile-based CI is close to the Wald-based CI • Can do the same procedure for each variable in the model • More critical when a 95% CI for the log(OR) is close to including or excluding 0

  8. Profile log-likelihood • As the β for the variable of interest is manipulated, the other βs are re-maximized • For example, fix the β for smoking at the lower limit of the 95% profile-based CI and look at the other βs and the log-likelihood • Can fix a term in the model by using an “offset”, which forces the coefficient of the offset term to be 1 • Change in log-lik when β_smoke = 1.02757 versus β_smoke = 0.26976 is 2 × [(-101.9740) - (-103.8947)] = 3.84
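As a check on the slide's arithmetic (log-likelihoods taken from the slide; the offset mechanics themselves are done in Stata):

```python
# The slide fixes the smoking coefficient at a candidate value by
# entering beta_smoke * smoke as an "offset" (a term whose
# coefficient is forced to 1) and re-maximizing the other betas.
# The quoted log-likelihoods then reproduce the chi-square cutoff:
ll_unconstrained = -101.9740   # beta_smoke free (MLE 1.02757)
ll_constrained   = -103.8947   # beta_smoke fixed at 0.26976
stat = 2.0 * (ll_unconstrained - ll_constrained)
# stat = 3.8414, i.e. the 3.84 critical value: 0.26976 sits right
# at the profile-likelihood confidence limit
```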

  9. Collinearity of variables • Some covariates can be highly correlated • Inclusion of two or more correlated covariates can suggest non-significance when in fact they might be important predictors • Framingham data example • Consider possible predictors of CHD such as age, height, weight, SBP, DBP, and serum cholesterol (at two time points) • Number of prior pre-term labors may be too thin to model – consider dichotomizing none versus 1 or more (everptl)

  10. Collinearity of variables • The overall logistic regression model shows high significance, but the individual Wald tests do not reflect that • A multivariate joint Wald test of height & weight together does not show much, but try an LR test

  11. Collinearity of variables • The LR test shows no effect either – refit the model without the restriction on missing values

  12. Collinearity of variables • Consider eliminating the “non-significant” SBP and DBP • The joint Wald test indicates that they are highly significant – the problem is collinearity • Consider instead removing the cholesterol values • Same problem – collinearity is masking the statistical significance • Consider dropping either of the two cholesterol measures

  13. Collinearity of variables • Dropping each in turn • About the same model fit, but we lose more observations using sc2

  14. Collinearity of variables • SBP and DBP are both in the model and may be highly correlated – look at the correlation of the β coefficients • Look at the profile likelihood

  15. Collinearity of variables • The plot generated has a wide range – changes to DBP have a simultaneous impact on the SBP coefficient

  16. Collinearity of variables • Deletion of either SBP or DBP shows the other to be significant • The model may be more understandable with only one included • Both could be included in a prediction model!
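A rough way to see why collinearity masks individual significance: with two standardized predictors correlated at r, each coefficient's variance is inflated by the factor 1/(1 - r²) (exactly so in linear regression, approximately in logistic). A sketch with an assumed correlation; the actual SBP/DBP correlation in the Framingham data is not shown here:

```python
from math import sqrt

def variance_inflation(r):
    """Variance inflation factor for one of two covariates whose
    pairwise correlation is r: VIF = 1 / (1 - r^2).  The standard
    error is inflated by sqrt(VIF), which is what shrinks the
    individual Wald statistics even when the pair is jointly
    highly significant."""
    return 1.0 / (1.0 - r * r)

# Assuming r = 0.8 purely for illustration:
vif = variance_inflation(0.8)      # 1 / (1 - 0.64) = 2.78
se_inflation = sqrt(vif)           # standard errors ~1.67x wider
```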

  17. Data sparseness • Models can also show poor fit when binary covariates are included that have sparse numbers of observations or a high degree of overlap • Main effects and interaction terms were being fit • Look at the SEs for the βs (not the ORs) and see if any is very large relative to its coefficient • Stata is not good at warning about overfit models

  18. Diagnostics • Deletion diagnostics measure the effect of individual observations on • Own predicted value • Contribution to the Pearson measure of fit • Contribution to the deviance measure of fit • Effect on the individual regression coefficients βs (“delta betas”) • Joint effect on all the regression coefficients βs (“Cook’s distance”) • Mostly Stata computes these for covariate combinations, not individuals • If the number of covariate combinations is large (e.g. several continuous covariates) this may be sufficient • May not be what we want though

  19. Diagnostics • Goal is to identify observations that • Are not well fit by the model • Are outliers in the covariate space • Have extreme influence on the fitted coefficients • Good models • Have residuals that appear to be random • Fit all the data • Are stable and not influenced by one or a few observations

  20. Diagnostics • Covariate combinations • J = number of distinct covariate combinations • mj = number of observations in pattern j • yj = number of responses in pattern j • pj = estimated probability in pattern j • Assumptions • yj ~ Binomial(mj, pj) • logit(pj) = linear combination of the covariates and βs • Definitions • hj = diagonal element of the hat matrix (to be defined) • rj = Pearson residual (defined earlier) • sj = standardized Pearson residual (defined earlier) • dj = deviance residual (defined earlier)
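Using those definitions, the three residuals for a single covariate pattern can be computed directly; a Python sketch with one hypothetical pattern:

```python
from math import sqrt, log

def residuals(y, m, p, h):
    """Pearson, standardized Pearson, and deviance residuals for one
    covariate pattern with y responses out of m observations, fitted
    probability p, and hat-matrix diagonal (leverage) h."""
    r = (y - m * p) / sqrt(m * p * (1.0 - p))   # Pearson
    s = r / sqrt(1.0 - h)                       # standardized Pearson
    # deviance residual, signed like y - m*p; guard the y = 0 and
    # y = m boundary cases where one log term vanishes
    d2 = 0.0
    if y > 0:
        d2 += 2.0 * y * log(y / (m * p))
    if y < m:
        d2 += 2.0 * (m - y) * log((m - y) / (m * (1.0 - p)))
    d = sqrt(d2) if y >= m * p else -sqrt(d2)
    return r, s, d

# One hypothetical pattern: 4 events out of 10, fitted probability
# 0.3, leverage 0.2
r, s, d = residuals(4, 10, 0.3, 0.2)
```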

  21. Diagnostics [formulas not transcribed]

  22. Diagnostics • Plot the above diagnostics against • the fitted value or logit • leverage • Large values may indicate a strong influence, or even just an error in recording of the covariate values or outcome value • These values are influenced by all the coefficients – we may only be interested in one or two particular covariates (the scientific variables of interest), and not in the effect on the coefficients of the other ancillary covariates

  23. Diagnostics • Can get these covariate-specific delta betas for the scientific variable of interest and see if a particular covariate combination is very influential on the association with the outcome • Can also do this at the individual level, not just the covariate combination • In small data sets the results can be highly susceptible to a very few observations

  24. Diagnostics • Leverage values • Leverage hj is the diagonal element of the “hat” matrix • In logistic regression hj = binomial variance × normalized distance from the center of the covariate space • Large leverage values result from observations with high variance and/or from covariate combinations that are extreme relative to the others • Covariate combinations that have estimated probabilities close to 0 or 1 do not have high leverage
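For a model with an intercept and one covariate, the decomposition on the slide can be written out explicitly: hj = wj × (1/Σw + (xj − x̄w)²/Σw(x − x̄w)²), with binomial variance wj = mj pj (1 − pj) and x̄w the weighted mean. A sketch with hypothetical covariate patterns; as a check, the leverages sum to the number of fitted parameters (2 here):

```python
def leverages(x, m, p):
    """Hat-matrix diagonals for a logistic model with intercept and
    one covariate: h_j = w_j * x_j' (X'WX)^{-1} x_j, where
    w_j = m_j p_j (1 - p_j) is the binomial variance.  Rearranged,
    this is the 'binomial variance * distance from the (weighted)
    center of the covariate space' decomposition on the slide."""
    w = [mj * pj * (1.0 - pj) for mj, pj in zip(m, p)]
    sw = sum(w)
    sx = sum(wj * xj for wj, xj in zip(w, x))
    sxx = sum(wj * xj * xj for wj, xj in zip(w, x))
    xbar = sx / sw                      # weighted center
    sxx_c = sxx - sw * xbar * xbar      # weighted centered sum of squares
    return [wj * (1.0 / sw + (xj - xbar) ** 2 / sxx_c)
            for wj, xj in zip(w, x)]

# Three hypothetical covariate patterns:
h = leverages(x=[1.0, 2.0, 3.0], m=[10, 10, 10], p=[0.2, 0.5, 0.8])
```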

  25. Diagnostics • Example – Framingham data with three continuous covariates • Many covariate combinations • Generate the leverage, Cook’s distance, ΔPearson, Δdeviance, and estimated probability, respectively

  26. Diagnostic plots

  27. Diagnostic plots

  28. Diagnostic plots

  29. Diagnostic plots

  30. Observations with high deviance change

  31. Cases with few risk factors – other explanation? Smoking?

  32. Diagnostic plots

  33. Leverage • Look at individuals with high leverage • They tend to have middle-range probabilities • They can be extreme on covariate values

  34. Diagnostics – true delta betas • Would like to estimate the effect of each individual on the three specific coefficients • Need to use matrix computation in Stata to do this

  35. Diagnostics – true delta betas • Put the score vectors into a matrix and multiply by the covariance matrix • The result is the influence function • Put back into variables • dbet1 = change in the age coefficient if the observation is deleted • dbet2 = change in the cholesterol coefficient if the observation is deleted • dbet3 = change in the SBP coefficient if the observation is deleted • Robust standard errors are derived from the influence function
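For a single coefficient, the score-times-covariance computation reduces to a one-step deletion formula. A Python sketch for the slope of a no-intercept logistic model, with hypothetical individual-level data; the lecture's version uses Stata matrices over three coefficients at once:

```python
def delta_betas(x, y, p):
    """One-step approximation to the change in the slope coefficient
    when observation i is deleted, for a no-intercept logistic model:

        dbeta_i ~ x_i (y_i - p_i) / ((1 - h_i) * sum(w x^2)),

    with weight w_i = p_i (1 - p_i) and leverage
    h_i = w_i x_i^2 / sum(w x^2).  This is the 'score vector times
    covariance matrix' influence computation, reduced to one
    parameter (sum(w x^2) plays the role of X'WX)."""
    w = [pi * (1.0 - pi) for pi in p]
    info = sum(wi * xi * xi for wi, xi in zip(w, x))   # scalar X'WX
    out = []
    for xi, yi, pi, wi in zip(x, y, p, w):
        h = wi * xi * xi / info                        # leverage
        out.append(xi * (yi - pi) / ((1.0 - h) * info))
    return out

# Hypothetical binary outcomes y and fitted probabilities p:
db = delta_betas(x=[1.0, 2.0, 3.0], y=[0, 1, 1], p=[0.3, 0.5, 0.7])
```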

  36. Diagnostics – true delta betas • In this case the individual effect of an observation may be small since we have many observations • Would look at the extreme values of these individual influence functions to determine their actual effect on the fitted model

  37. Diagnostic plots

  38. Diagnostic plots

  39. Diagnostics • What to do with influential observations? • Check values for accuracy • Determine if your “significant” findings are really a function of a few individuals with high values • Can report a sensitivity analysis with and without those values
