
HSRP 734: Advanced Statistical Methods June 12, 2008


Presentation Transcript


    1. HSRP 734: Advanced Statistical Methods June 12, 2008

    2. General Considerations for Multivariable Analyses

    3. An Effective Modeling Cycle

    4. Overview Model building: applies beyond Logistic regression Model diagnostics: specific to Logistic regression

    5. Model Building

    6. Model selection “Proper model selection rejects a model that is far from reality and attempts to identify a model in which the error of approximation and the error due to random fluctuations are well balanced.” - Shibata, 1989

    7. Model building Models are just that: approximating models of a truth How best to quantify approximation? Depends upon study goals (prediction, explanatory, exploratory)

    8. Principle of Parsimony “Everything should be made as simple as possible, but no simpler.” – Albert Einstein Choose a model with “the smallest # of parameters for adequate representation of the data.” – Box & Jenkins

    9. Principle of Parsimony Bias vs. Variance trade-off as # of variables/parameters increases Collect sample to learn about population (make inference) Models are just that: approximating models of a truth Balance errors of underfitting and overfitting

    10. Why include multiple predictors in a model? Interaction (effect modification) Confounding Increase precision (reduce unexplained variance) Method of adjustment Exploratory for unknown correlates

    11. Interpreting Coefficients When there is more than one variable in the model, the interpretation changes Continuous: “ß1: For a unit change in X, there is a ß1 change in Y, adjusting for the other variables in the model.”
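In the logistic setting, the adjusted coefficient on slide 11 is usually reported as an odds ratio, exp(ß1). A minimal sketch (the coefficient value is invented for illustration):

```python
import math

# Hypothetical fitted coefficient for age (per 1-year increase),
# adjusting for the other variables in the model.
beta_age = 0.0488

# A logistic coefficient is a change in log-odds, so a unit change
# in X multiplies the odds of the outcome by exp(beta).
odds_ratio = math.exp(beta_age)
print(round(odds_ratio, 3))  # ~1.05: odds rise about 5% per year of age
```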

    12. Relationship between Variables

    13. Interaction vs. Confounding Confounding is a BIAS we want to REMOVE Interaction is a PROPERTY we want to UNDERSTAND Confounding Apparent relationship of X (exposure of interest) with Y is distorted due to the relationship of Z (confounder) with X (and Y) Interaction Relationship between X and Y differs by the level of Z (when X and Z interact)
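Slide 13's distinction can be made concrete with stratified 2x2 tables. A sketch with invented counts, chosen so Z is related to both X and Y:

```python
def odds_ratio(a, b, c, d):
    # 2x2 table: a = exposed cases, b = exposed non-cases,
    #            c = unexposed cases, d = unexposed non-cases
    return (a * d) / (b * c)

# Hypothetical counts: within each level of Z, X has no effect (OR = 1)
or_z0 = odds_ratio(20, 180, 10, 90)
or_z1 = odds_ratio(50, 50, 100, 100)

# Crude table (strata collapsed): X now looks protective
or_crude = odds_ratio(20 + 50, 180 + 50, 10 + 100, 90 + 100)
print(or_z0, or_z1, round(or_crude, 2))
# Stratum ORs agree with each other but differ from the crude OR:
# confounding by Z. If the stratum ORs had differed from each other,
# that would be interaction instead.
```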

    14. Model building Science vs. Art Different philosophies Some agreement on what is worse Not many agree on a best approach

    15. Model building: Two approaches Data-based approach Non-data based

    16. How do you decide what predictor variables to include?

    17. Selecting Predictor Variables

    18. Rule of Model Parsimony

    19. Variable Selection

    20. Data-based: Using p-values Popular (Remember Johnny from Cobra Kai?) Selection methods: Forward, Backwards, Stepwise Bivariate screening, then multivariable on those initially significant

    21. Automatic Selection

    22. Forward Selection

    23. Backwards Elimination

    24. Stepwise Selection

    25. Criticisms of P-value based Model Building Automates selection rather than incorporating subject-matter thinking Multiple comparisons issue If multicollinearity is present, selection is made arbitrarily ß’s, SEß’s are biased (Harrell Jr., 2001) Test statistics don’t have the right distribution (Grambsch, O’Brien, 1991)

    26. Selection methods using p-values If using these methods there is some preference given to Backwards elimination selection Some evidence of performing better than Forward selection (Mantel, 1970) At least initial full model is accurate
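The backwards elimination loop of slides 20-26 can be sketched as follows. The p-values here are invented stand-ins; in practice each entry would come from refitting the logistic model after every deletion:

```python
# Hypothetical Wald p-values for each candidate model, keyed by the
# set of variables currently in the model (stand-ins for real refits).
P_VALUES = {
    frozenset({"age", "sex", "bmi"}): {"age": 0.010, "sex": 0.040, "bmi": 0.620},
    frozenset({"age", "sex"}):        {"age": 0.008, "sex": 0.030},
}

def backward_elimination(variables, alpha=0.05):
    current = set(variables)
    while current:
        pvals = P_VALUES[frozenset(current)]   # "refit" the model
        worst = max(pvals, key=pvals.get)      # least significant term
        if pvals[worst] <= alpha:              # everything significant: stop
            return sorted(current)
        current.remove(worst)                  # drop it and refit
    return []

print(backward_elimination(["age", "sex", "bmi"]))  # ['age', 'sex']
```

Starting from the full model is what gives backwards elimination its edge on slide 26: at least the initial fit includes every candidate term.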

    27. Non P-value based Methods

    28. Theoretical Considerations

    29. Prior Literature Considerations

    30. Information Criteria: AIC, BIC

    31. Data-based: Using AIC AIC is unbiased estimator of the theoretical distance of a model to an unknown true mechanism (that actually generated the data)

    32. AIC is unbiased estimator of the theoretical distance of a model to an unknown true mechanism (that actually generated the data) How is this so??? If you are really curious… Data-based: Using AIC

    33. A Gross Simplification of AIC

    34. Data-based: Using AIC Useful for selecting best model out of candidate model set (not great if all are poor) The size of 1 AIC value is not important but rather relative size to other AIC’s Models need not be nested but have same sample size (Burnham & Anderson, 2002)
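The AIC comparison on slides 30-34 reduces to a small computation. A sketch with invented log-likelihoods (all models assumed fit to the same data):

```python
# AIC = -2 ln L + 2k, where k is the number of estimated parameters.
def aic(log_likelihood, k):
    return -2.0 * log_likelihood + 2.0 * k

# Hypothetical candidate model set.
candidates = {
    "age only":        aic(-240.1, 2),
    "age + sex":       aic(-236.0, 3),
    "age + sex + bmi": aic(-235.6, 4),
}

best = min(candidates, key=candidates.get)
# Only relative size matters: report each AIC as a difference from the best.
deltas = {m: round(v - candidates[best], 1) for m, v in candidates.items()}
print(best, deltas)
```

Note how the extra parameter in the largest model costs more (in the 2k penalty) than its small log-likelihood gain is worth: the bias-variance balance of slide 9.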

    35. Treatment Effect Approach

    36. Model Building for Treatment Effect Goal If important confounders or interactions are not included, the picture of the outcome-exposure relationship can be obscured

    37. Still will consider Parsimony If we include many covariates that are not confounders or interactions, some may only add “noise” to the model The added noise could obscure the picture of the outcome-exposure relationship

    38. Data-based: Prediction goal When Parsimony matters: find the most accurate model that is also the most parsimonious (smallest # of predictors) When it doesn’t matter: pure accuracy is the goal at any cost Example: Quality control Plausible but not typical

    39. Best Predictive Model Approach

    40. Book on Model building Chapters 6, 7 Basically takes the approach of trying to accurately establish the outcome-exposure relationship

    41. Book recommendations Multistage strategy: Determine variables under study from research literature and/or that are clinically or biologically meaningful Assess interaction prior to confounding Assess for confounding Additional considerations for precision

    42. Book recommendations Use backwards elimination of modeling terms Retain lower-order terms if higher-order terms are significant: Keep both variables if their 2-way interaction is significant Keep lower power terms if the highest power is significant

    43. Model building We will focus on treatment effect goal Will consider book guidelines

    44. Note about Model Building Differences between “Best” model and nearest competitors may be small Ordering among “Very Good” models may not be robust to independent challenges with new data

    45. Note about Model Building Be careful not to overstate importance of variables included in “Best” model Remember that “Best” model odds ratios & p-values tend to be biased away from the null Cross-validation approaches allow estimation of prediction errors associated with variable selection and also provide comparisons between sets of best models

    46. SAS Lab: ICW

    47. Model Diagnostics

    48. After selecting a model Want to check modeling fit and diagnostics to ensure adequacy Could be worried about: Influential data points Correlated predictor variables Leaving out variables or using wrong form Overall model fit and prediction value

    49. Problems to check for Convergence problems Model goodness-of-fit Functional form (confounding, interaction, higher order for continuous) Multicollinearity Outlier effects

    50. Convergence problems SAS usually converges, but sometimes you will get a message: “There is possibly a quasicomplete separation in the sample points. The ML estimate may not exist. Validity of the model fit is questionable.”

    51. Convergence problems Quasi-complete separation = occurs whenever there is complete separation except for a single value of the predictor Complete separation = some linear combination of the predictors perfectly predicts the outcome Problem is they’re too good! Example: CHD=1 whenever Gender=Male

    52. Quasi-complete separation Typically easy to diagnose. Why? SAS prints a log warning. SE’s are gigantic, OR’s or CI’s are extreme. What to do about it?

    53. Quasi-complete separation Options: If continuous, create groups If multi-group categorical, collapse groups If dichotomous, group another way if possible Drop variable Drop cases from analyses
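Separation is easy to spot in a cross-tab before fitting anything. A toy version of slide 51's example (data invented; CHD = 1 whenever Gender = "M"):

```python
from collections import Counter

# Toy data: every male is a CHD case, females are mixed.
gender = ["M", "M", "M", "F", "F", "F", "F"]
chd    = [1, 1, 1, 0, 0, 1, 0]

# An empty cell in the cross-tab is the tell-tale sign: with no
# CHD-free males, the ML estimate for the gender coefficient runs
# off toward infinity (huge SEs, extreme ORs/CIs).
table = Counter(zip(gender, chd))
print(table[("M", 0)], table[("M", 1)])  # 0 3
```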

    54. Diagnostics Modeling fit: Hosmer & Lemeshow goodness of fit c statistic (area under ROC curve) Generalized R-square Residual analyses: Examine for outliers in X space (hii’s) Examine for odd combinations of Y, X Examine for influential points on ß’s (on all or on specific ones)

    55. Hosmer-Lemeshow Goodness-of-fit LACKFIT option in LOGISTIC Generate predicted probabilities from the fitted model Group into g intervals (usually 10) based on predicted probability and compare to observed frequencies Calculate a Chi-square statistic with df = # of intervals - 2

    56. Considerations for H-L GOF test Is a conservative test Low power to detect specific types of lack of fit (e.g., nonlinearity in a predictor variable) Highly dependent on how the observations are grouped Use caution in concluding the model is adequate just because the p-value is large
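A minimal sketch of the grouping-and-chi-square computation behind LACKFIT (SAS handles ties and grouping somewhat differently, so results may not match exactly):

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow chi-square: sort by predicted probability,
    split into groups, compare observed vs expected events.
    Refer the statistic to a chi-square with df = groups - 2."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        m = len(chunk)
        obs = sum(yi for _, yi in chunk)   # observed events in group
        exp = sum(pi for pi, _ in chunk)   # expected events in group
        stat += (obs - exp) ** 2 / exp + ((m - obs) - (m - exp)) ** 2 / (m - exp)
    return stat

# Perfectly calibrated toy data: observed matches expected in each group.
y = [1, 0, 0, 0, 1, 1, 1, 0]
p = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
print(hosmer_lemeshow(y, p, groups=2))  # 0.0
```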

    57. Generalized R-square

    58. Area under ROC curve: c statistic The Receiver Operating Characteristic (ROC) curve is a plot of the proportion of correctly predicted events (Sensitivity) against 1 - the proportion of correctly predicted non-events (1 - Specificity) The sharper the initial rise of the ROC curve, the better the predicting model

    59. Area under ROC curve: c statistic The c statistic is the area under the ROC curve and is a statistic that quantifies predictive ability Examples for c (Ashton, 1995): Good = 0.831 Bad = 0.493

    60. c=0.696
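The c statistic has a pairwise interpretation that is easy to compute directly: the fraction of (event, non-event) pairs in which the event received the higher predicted probability. A sketch with invented probabilities:

```python
def c_statistic(y, p):
    """Concordance: over all (event, non-event) pairs, count the pairs
    where the event got the higher predicted probability (ties = 1/2)."""
    events     = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    pairs = concordant = 0.0
    for pe in events:
        for pn in non_events:
            pairs += 1
            if pe > pn:
                concordant += 1
            elif pe == pn:
                concordant += 0.5
    return concordant / pairs

print(c_statistic([1, 1, 0, 0], [0.9, 0.4, 0.4, 0.1]))  # 0.875
```

A c of 0.5 means the model discriminates no better than chance; 1.0 is perfect discrimination.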

    61. Multicollinearity Diagnosing multicollinearity is similar to what was done for regression This is because it is a problem of the predictor variables One approach: can just use VIF in an analogous Linear Regression model Better approach: weight by predicted probabilities in an initial step

    62. Multicollinearity If VIF > 7, attention is warranted If VIF > 10, indication of multicollinearity What do you do if you have it? Combine variables into an index Consider data reduction (e.g., PCA) Drop variables
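For two predictors the VIF reduces to 1/(1 - r²), where r is their correlation; with more predictors, r² is replaced by the R² from regressing each predictor on all the others. A sketch with invented, nearly duplicated predictors:

```python
def pearson_r(x, z):
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    cov = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    vx = sum((a - mx) ** 2 for a in x)
    vz = sum((b - mz) ** 2 for b in z)
    return cov / (vx * vz) ** 0.5

# Hypothetical predictors: z is nearly a copy of x.
x = [1, 2, 3, 4, 5, 6]
z = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9]

r = pearson_r(x, z)
vif = 1.0 / (1.0 - r ** 2)   # two-predictor VIF
print(vif > 10)  # True -> strong multicollinearity between x and z
```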

    63. hii: extreme points in X space hii’s are the Leverage values; the diagonal values of the Hat matrix Observations that are unusual in the combination of predictors can be quantified by hii’s

    64. Deviance residuals: obs not explained by model well Deviance residuals can identify cases that are not explained well by the model The sum of the squared deviance residuals is the Deviance = -2lnL Why not plot di vs. hii ?

    65. DFBETAs: influential points on ß’s Measures how much each regression coefficient changes with the ith case deleted Actual change is divided by the SEß If one case changes ßK substantially then observation is highly influential
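The leave-one-out idea behind DFBETAs can be sketched on the simplest possible model, where the only "coefficient" is the sample mean (real logistic DFBETAs come from the INFLUENCE/IPLOTS options in PROC LOGISTIC; the data below are invented):

```python
def dfbetas(y):
    """DFBETA sketch for an intercept-only model: the change in the
    coefficient (here, the mean) when case i is deleted, scaled by
    the coefficient's SE from the full fit."""
    n = len(y)
    full = sum(y) / n
    var = sum((v - full) ** 2 for v in y) / (n - 1)
    se = (var / n) ** 0.5                 # SE of the mean, full data
    out = []
    for i in range(n):
        loo = (sum(y) - y[i]) / (n - 1)   # coefficient with case i deleted
        out.append((full - loo) / se)     # scaled change
    return out

# One influential case dominates the fit:
print([round(d, 2) for d in dfbetas([2.0, 2.1, 1.9, 2.0, 8.0])])
# -> [-0.25, -0.23, -0.27, -0.25, 1.0]
```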

    66. C-bar: confidence interval displacement Measure of the overall change in all the coefficients with the ith case deleted Similar to Cook’s distance in linear regression If one case changes ß’s substantially then observation is highly influential

    67. SAS Lab: ICW

    68. Looking ahead Extensions & Advanced methods Review with Q&A Exam 1: June 26th
