
Problems with Statistical Predictions and Decision-Making Approaches

This article discusses the limitations of common regression modeling approaches in statistical prediction and their impact on decision-making. It provides insights into the validation, calibration, discrimination, and clinical usefulness of prediction models, and explores strategies to improve performance and prevent poor decision-making.


Presentation Transcript


  1. Why Most Statistical Predictions Cannot Reliably Support Decision-Making: Problems Caused By Common Regression Modeling Approaches Ewout Steyerberg, Professor of Clinical Biostatistics and Medical Decision Making, Leiden, October 2017

  2. Why most prediction models are false • Methods at development • Rigorous validation

  3. Validation

  4. Three competing models • MMRPredict (NEJM 2006) - common regression modeling approach; small data set • MMRPro (JAMA 2006) - Bayesian modeling approach; moderate-size data set • PREMM (JAMA 2006; Gastroenterology 2011; JCO 2016) - sensible regression modeling approach; large data set Which model wins? Which may do harm?

  5. 6 clinic-based, 5 pop-based cohorts

  6. Discrimination Clinic-based Population-based

  7. Calibration plots: observed vs predicted • Calibration slope as a measure of overfitting
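The calibration slope mentioned on this slide can be estimated by refitting a two-parameter logistic model on validation data: regress the outcome on the model's linear predictor and read off the slope. A pure-Python sketch (the Newton-Raphson fitter, the simulated data, and all names are illustrative, not code from the talk):

```python
import math
import random

def calibration_slope(y, lp, iters=25):
    """Fit logit P(y=1) = a + b*lp by Newton-Raphson; return the slope b.
    b = 1 means perfect calibration; b < 1 signals overfitting
    (predictions more extreme than the observed outcomes warrant)."""
    a, b = 0.0, 1.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for yi, xi in zip(y, lp):
            p = 1 / (1 + math.exp(-(a + b * xi)))
            w = p * (1 - p)
            g_a += yi - p          # gradient terms
            g_b += (yi - p) * xi
            h_aa += w              # (negative) Hessian terms
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return b

# Simulated validation set: the "model" doubles the true log-odds,
# i.e. it is overconfident, so the calibration slope comes out near 0.5.
random.seed(1)
true_lp = [random.gauss(0, 1) for _ in range(2000)]
y = [1 if random.random() < 1 / (1 + math.exp(-x)) else 0 for x in true_lp]
slope = calibration_slope(y, [2 * x for x in true_lp])
```

In practice this fit is done with standard logistic regression software; the hand-rolled Newton loop here only keeps the sketch dependency-free.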

  8. Calibration

  9. Calibration

  10. Clinical usefulness • Statistical performance: discrimination and calibration • Consider full range of predictions • Decision-analytic performance: • Define a decision threshold: act if risk > threshold • TP and FP classifications • Net Benefit as a summary measure: NB = (TP - w FP) / n, with w = harm/benefit (Vickers & Elkin, MDM 2006)
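The Net Benefit formula on this slide can be computed directly from classifications at a chosen threshold, with the weight w equal to the threshold odds. A minimal sketch with hypothetical toy data (`net_benefit`, `y`, and `p` are illustrative names, not code from the talk):

```python
# Net Benefit per Vickers & Elkin (MDM 2006):
# NB = (TP - w * FP) / n, with w = harm/benefit = pt / (1 - pt)

def net_benefit(y_true, risks, threshold):
    """Net Benefit of acting when predicted risk exceeds `threshold`."""
    n = len(y_true)
    act = [r > threshold for r in risks]
    tp = sum(1 for a, y in zip(act, y_true) if a and y == 1)
    fp = sum(1 for a, y in zip(act, y_true) if a and y == 0)
    w = threshold / (1 - threshold)  # harm:benefit odds at the threshold
    return (tp - w * fp) / n

# Toy example: 1 = mutation carrier, 0 = non-carrier
y = [1, 1, 0, 0, 0, 1, 0, 0]
p = [0.9, 0.4, 0.2, 0.05, 0.3, 0.7, 0.1, 0.6]
nb = net_benefit(y, p, 0.25)  # 3 TP, 2 FP, w = 1/3
```

Evaluating `net_benefit` across a range of thresholds and plotting the results gives the decision curves shown on the following slides.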

  11. Decision curve analysis Clinic-based Population-based

  12. Overview • Clinical context: testing for Lynch syndrome • Statistical and decision-analytic performance • Could poor performance have been foreseen? Prevented?

  13. Example of “barbarian modeling strategy”

  14. Selection based on statistical significance

  15. Many predictors, >37 df; dichotomized
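The cost of dichotomization on this slide can be made concrete: once a continuous predictor such as age is split at a cut-point, all patients on the same side of the cut receive identical predictions. A sketch with hypothetical coefficients (all numbers are invented for illustration, not taken from the models discussed):

```python
import math

def risk_continuous(age, intercept=-6.0, beta=0.08):
    """Logistic risk with age kept continuous (hypothetical coefficients)."""
    return 1 / (1 + math.exp(-(intercept + beta * age)))

def risk_dichotomized(age, intercept=-2.5, beta=1.0, cut=50):
    """Logistic risk with age dichotomized at `cut` (hypothetical)."""
    return 1 / (1 + math.exp(-(intercept + beta * (age > cut))))

# A 51-year-old and an 85-year-old:
r51, r85 = risk_continuous(51), risk_continuous(85)
d51, d85 = risk_dichotomized(51), risk_dichotomized(85)
# Continuous age preserves the risk gradient (r51 < r85);
# the dichotomized model cannot separate the two patients (d51 == d85).
```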

  16. Exaggerated effects

  17. Sample size issues Robust: strong, vigorous, sturdy, tough, powerful, powerfully built, solidly built, as strong as a horse/ox, muscular, sinewy, rugged, hardy, strapping, brawny, burly, husky

  18. Poor performance foreseeable? • Simulate modeling strategy • Small sample size • 38 events at development • 35 events vs >2000 at validation • Stepwise selection • Univariate and multivariable statistical testing • Dichotomization • New cohort: n=19,866; 2,051 mutations

  19. Poor calibration Poor discrimination

  20. Poor decision-making: illustration with 10 random samples

  21. Could poor performance be prevented? • PREMM modeling strategy • Coding of family history • Continuous age

  22. SiM 2007

  23. Could poor performance be prevented? • PREMM modeling strategy • Coding of family history • Continuous age • Larger sample size

  24. Better discrimination and calibration if a) more sensible modeling and b) larger sample size

  25. Substantially better decision-making if a) more sensible modeling and b) larger sample size

  26. Discussion • Avoid stepwise selection • Prespecification with summary variables • Advanced estimation • Avoid dichotomization • Keep continuous • Increase sample size • Combining development and validation sets • Collaborative efforts • Rigorous validation • Statistical and decision-analytic perspective

  27. Evaluation of decision-making • Net Benefit: “utility of the method” • Peirce, Science 1884 • Youden index: sens + spec – 1 • Net Benefit • Vickers, MDM 2006 • Weight FP:TP = H:B = odds(threshold) (Vergouwe 2003) • Decision Curve Analysis

  28. Youden index and Net Benefit
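The contrast between the two summaries can be shown on one set of counts: the Youden index (sens + spec - 1) weighs sensitivity and specificity equally, while Net Benefit weights false positives by the threshold odds w = pt / (1 - pt). A sketch with hypothetical confusion-matrix counts (not data from the talk):

```python
def youden(tp, fp, tn, fn):
    """Youden index: sensitivity + specificity - 1 (threshold-free weighting)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens + spec - 1

def net_benefit(tp, fp, n, threshold):
    """Net Benefit: false positives weighted by the threshold odds."""
    w = threshold / (1 - threshold)
    return (tp - w * fp) / n

# Same classifier, two summaries:
tp, fp, tn, fn = 30, 20, 140, 10
j = youden(tp, fp, tn, fn)                       # 0.75 + 0.875 - 1
nb = net_benefit(tp, fp, tp + fp + tn + fn, 0.2) # (30 - 0.25*20) / 200
```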

  29. Avoid miscalibration by overfitting • Shrinkage: reduce coefficients by multiplying by s, s < 1; e.g. multiply by 0.8 • Penalization: Ridge regression shrinks during fitting; LASSO shrinks to zero (implicit selection); Elastic Net combines Ridge and LASSO • Machine learning?
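The heuristic shrinkage on this slide can be sketched directly: multiply the fitted logistic coefficients by s < 1 to pull predictions toward the average risk. All coefficients and names below are hypothetical illustrations; a full correction would also re-estimate the intercept so the average prediction matches the event rate:

```python
import math

def predict_risk(intercept, coefs, x):
    """Logistic prediction from an intercept and coefficient vector."""
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1 / (1 + math.exp(-lp))

coefs = [1.2, -0.8]              # hypothetical fitted log-odds ratios
s = 0.8                          # shrinkage factor from the slide
shrunk = [s * b for b in coefs]  # coefficients pulled toward zero

x = [2.0, 1.0]
raw = predict_risk(-1.0, coefs, x)    # overconfident prediction
cal = predict_risk(-1.0, shrunk, x)   # shrunk toward the mean risk
# For a high-risk patient, shrinkage moves the prediction back toward
# the average; for a low-risk patient it moves the prediction up.
```

Penalized estimation (Ridge, LASSO, Elastic Net) achieves the same effect during fitting rather than after it.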
