
Matthias Kormaksson, May 3rd, 2019

Novartis Pharmaceuticals, Advanced Exploratory Analytics. A novel statistical learning method for time-to-event outcomes, with comparison to other ML approaches on Heart Failure data. Matthias Kormaksson, May 3rd, 2019. Agenda: Objectives, Heart Failure Studies, Methods, LASSO, GAM





Presentation Transcript


  1. Novartis Pharmaceuticals Advanced Exploratory Analytics A novel statistical learning method for time-to-event outcomes w/ comparison to other ML approaches on Heart Failure data Matthias Kormaksson May 3rd, 2019

  2. Agenda • Objectives • Heart Failure Studies • Methods • LASSO • GAM • Random Survival Forest • GAMLASSO • Experimental Comparison • Discrimination (C-index) • Calibration • Conclusions Business Use Only

  3. Objectives • Simpson et al. (2016): • Develop multi-domain prognostic models for Heart Failure time-to-event endpoints. • Can the benchmark model be improved by using state-of-the-art Machine Learning?

  4. Heart Failure Studies • Data: PARADIGM (training set), ATMOSPHERE (test set) • Response: Time to event (CV Death, HF-Hospitalization, Composite-endpoint, All Cause Death) • Categorical Predictors: Sex, Ethnicity, PriorHeartFailureFLAG, ... • Continuous Predictors: Age, Potassium, NT-proBNP, ...

  5. Methods • LASSO: regularized regression that shrinks the size of the coefficients (some to exactly zero), thus facilitating variable selection. • Generalized Additive Model (GAM): models non-linear relationships between risk and baseline predictors. • Random Survival Forest: an ensemble (survival) tree method for the analysis of right-censored time-to-event data. • GAMLASSO: a novel statistical learning method that is a hybrid of GAM and LASSO, thus facilitating both non-linear modeling and variable selection.

  6. Cox-Methods (columns: Continuous, Categorical) • LASSO: all terms linear; L1-penalty on all coefficients • GAM: […]-penalty on […] (or no penalty); […]-type penalty on the non-linear terms (or […]-penalty) • GAMLASSO: […]-penalty on […]; […]-type penalty on the non-linear terms
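
The mathematical symbols on this slide did not survive extraction. As a reading aid, the standard textbook forms of the penalized Cox objectives are sketched below. The notation (\beta, f_k, \lambda, x_i, z_{ik}) is an assumption introduced here, not the presenter's, and which of these penalties the slide assigns to which column cannot be recovered from the transcript.

```latex
% Cox partial log-likelihood for linear predictors \eta_i,
% observed times t_i and event indicators \delta_i
\ell(\eta) = \sum_{i:\,\delta_i = 1}
  \Big[ \eta_i - \log \sum_{j:\, t_j \ge t_i} \exp(\eta_j) \Big]

% LASSO: all terms linear, L1 penalty on every coefficient
\eta_i = x_i^\top \beta, \qquad
\hat\beta = \arg\max_{\beta} \; \ell(\eta) - \lambda \sum_{j} |\beta_j|

% GAM: smooth functions f_k of the continuous predictors,
% each with a quadratic (L2-type) roughness penalty
\eta_i = x_i^\top \beta + \sum_{k} f_k(z_{ik}), \qquad
\text{penalty} = \sum_{k} \lambda_k \int f_k''(z)^2 \, dz
```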

  7. GAMLASSO Algorithm (columns: Continuous, Categorical) Set […], then iterate (until convergence) between these steps: • LASSO step: L1-penalty on […]; Fit […]; Set […] • GAM step: L1-penalty on […]; Fit […]; Set […]
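
The alternating structure on this slide (a LASSO step and a GAM step, repeated until convergence) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: it uses squared-error loss instead of the Cox partial likelihood, a Gaussian kernel smoother as a stand-in for a penalized-spline GAM fit, and it passes the fitted values of one step to the other as an offset. The function names, `lam`, and `bandwidth` are hypothetical and are not taken from the presentation or its R-package.

```python
import math

def soft_threshold(value, lam):
    """Soft-thresholding operator used by coordinate-descent LASSO."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_step(X, target, lam, beta, sweeps=100):
    """Coordinate descent for (1/2n)*||target - X beta||^2 + lam*||beta||_1."""
    n, p = len(X), len(X[0])
    beta = list(beta)
    for _ in range(sweeps):
        for j in range(p):
            rho, norm = 0.0, 0.0
            for i in range(n):
                # Residual with coordinate j left out of the fit
                partial = target[i] - sum(
                    X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta

def smooth_step(z, target, bandwidth):
    """Gaussian kernel smoother of target on z (stand-in for a GAM fit)."""
    fitted = []
    for zi in z:
        w = [math.exp(-0.5 * ((zi - zj) / bandwidth) ** 2) for zj in z]
        fitted.append(sum(wj * tj for wj, tj in zip(w, target)) / sum(w))
    return fitted

def alternate(X, z, y, lam=0.05, bandwidth=0.3, max_iter=50, tol=1e-8):
    """Alternate between the two steps, each using the other's fit as offset."""
    n, p = len(y), len(X[0])
    beta, smooth = [0.0] * p, [0.0] * n
    for _ in range(max_iter):
        old = [sum(X[i][k] * beta[k] for k in range(p)) + smooth[i]
               for i in range(n)]
        # LASSO step: fit the linear terms to what the smooth part leaves over
        beta = lasso_step(X, [y[i] - smooth[i] for i in range(n)], lam, beta)
        linear = [sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)]
        # Smooth step: fit the non-linear term to what the linear part leaves over
        smooth = smooth_step(z, [y[i] - linear[i] for i in range(n)], bandwidth)
        new = [linear[i] + smooth[i] for i in range(n)]
        if max(abs(a - b) for a, b in zip(old, new)) < tol:
            break
    return beta, smooth

# Synthetic, noise-free data: one relevant and one irrelevant binary predictor,
# plus a non-linear effect of a continuous predictor z.
n = 40
z = [i / (n - 1) for i in range(n)]
X = [[1.0 if i % 2 == 0 else 0.0, 1.0 if i % 3 == 0 else 0.0] for i in range(n)]
y = [2.0 * X[i][0] + math.sin(2 * math.pi * z[i]) for i in range(n)]
beta, smooth = alternate(X, z, y)
print(beta)
```

In this toy setting the L1 penalty shrinks the coefficient of the relevant predictor toward zero and suppresses the irrelevant one, while the smoother absorbs part of the non-linear signal; a Cox version would replace both least-squares fits with penalized partial-likelihood fits.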

  8. Random Survival Forest [Diagram: Data → Bootstrap samples → Survival Tree 1, Survival Tree 2, ..., Survival Tree n → CBH 1, CBH 2, ..., CBH n → Ensemble Cumulative Baseline Hazard]
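
The flow in this diagram (bootstrap samples, one cumulative hazard estimate per sample, then an ensemble average) can be sketched as follows. This is a simplified illustration under stated assumptions: each "tree" is replaced by a single Nelson-Aalen estimate on its bootstrap sample with no splitting on predictors, so it shows only the bagging and averaging stage, not a full random survival forest. The function names and the example data are hypothetical.

```python
import random

def nelson_aalen(times, events, grid):
    """Nelson-Aalen estimate H(t) = sum over event times s <= t of d(s) / n(s)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    estimate = []
    for g in grid:
        h = 0.0
        for s in event_times:
            if s > g:
                break
            deaths = sum(1 for t, e in zip(times, events) if t == s and e)
            at_risk = sum(1 for t in times if t >= s)
            h += deaths / at_risk
        estimate.append(h)
    return estimate

def ensemble_cumulative_hazard(times, events, grid, n_trees=50, seed=0):
    """Average the per-bootstrap-sample cumulative hazards over n_trees samples."""
    rng = random.Random(seed)
    n = len(times)
    total = [0.0] * len(grid)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        boot_t = [times[i] for i in idx]
        boot_e = [events[i] for i in idx]
        for k, h in enumerate(nelson_aalen(boot_t, boot_e, grid)):
            total[k] += h
    return [v / n_trees for v in total]

# Made-up right-censored data: event = 1 means the event was observed.
times = [2, 3, 3, 5, 8, 9, 12, 15]
events = [1, 1, 0, 1, 0, 1, 1, 0]
grid = [1, 5, 10, 15]
ens = ensemble_cumulative_hazard(times, events, grid)
print(ens)
```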

  9. Discrimination (C-index) [results not captured in transcript] *Simpson et al. (2016) reported C=0.71 and C=0.70 respectively
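
For readers unfamiliar with the metric, a minimal sketch of Harrell's concordance index for right-censored data follows. It assumes that a higher risk score means a shorter expected time to event, ignores pairs with tied times, and counts tied scores as half concordant; the slides do not say which C-index variant was used, and the data below are made up.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a subject with an observed event
        for j in range(n):
            if times[j] > times[i]:  # subject j outlived subject i's event time
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

times = [5, 8, 12, 20]
events = [1, 0, 1, 0]          # 1 = event observed, 0 = censored
risk = [0.9, 0.4, 0.1, 0.6]    # hypothetical model risk scores
c = concordance_index(times, events, risk)
print(c)  # 0.75: three of the four comparable pairs are ordered correctly
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values of about 0.70.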

  10. Calibration

  11. Calibration (Nam-D'Agostino) * Simpson replicated model score = 14.22 * Simpson replicated model score = 9.81 * Simpson replicated model score = 13.10 Compare with […]
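
For context, the Nam-D'Agostino statistic is usually written as a Hosmer-Lemeshow-type sum over risk groups. The form below is the commonly cited one; the notation is introduced here, and the number of groups G, the time horizon t, and the reference value the slide compares against are not recoverable from the transcript.

```latex
% Subjects are sorted by predicted risk and split into G groups.
% n_g         : number of subjects in group g
% KM_g(t)     : Kaplan-Meier estimate of the event probability by time t in group g
% \bar p_g(t) : mean model-predicted event probability by time t in group g
\chi^2_{ND} = \sum_{g=1}^{G}
  \frac{ n_g \,\big( KM_g(t) - \bar p_g(t) \big)^2 }
       { \bar p_g(t)\,\big( 1 - \bar p_g(t) \big) }
```

Smaller values indicate closer agreement between observed and predicted risk within groups.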

  12. Conclusions • The carefully constructed benchmark model is robust and comparable to the best Machine Learning (ML) models. • The advantage of ML models over the carefully constructed benchmark model is automation. • Random survival forest suffered from poor calibration (an important performance metric that is often overlooked). • LASSO's main advantage is variable selection, while GAM's main advantage is non-linear modeling. GAMLASSO enjoys the best of both worlds and fared well in comparison to the other methods.

  13. R-package

  14. Acknowledgements Key team members • Guenther Mueller-Velten, DEV Biostatistics CM & EM • Core member of HF Study Group • Planning and coordination of statistical contributions • Hui Wang, (former) GCE remote contractor, DEV Biostatistics CM & EM • Planning and implementation of the multi-domain prognostic model • David James, DEV Stats Meth & Consulting* • Methodological consultant • Assessment of various machine learning approaches for modeling of outcomes • Indrayudh Ghosal, PhD Student, Cornell University • Implemented the R-package during his summer internship at Novartis.
