Evaluation of Support Vector Machines for Risk Modeling in Interventional Cardiology

Michael E. Matheny, M.D.

Goal • Comparison of support vector machines and logistic regression risk modeling performance over time for the outcome of death in pre-intervention cardiac catheterization patients.

Pre-intervention Risk Assessment • Percutaneous Coronary Intervention (PCI) is a high-volume procedure with significant morbidity and mortality • Risk of death in PCI varies widely with co-morbidities • Accurate case-level risk estimates can greatly aid patient and physician decision-making

Domain Data Quality • The American College of Cardiology has published a standardized data dictionary (ACC-NCDR) and mandates that accredited centers maintain detailed data on all PCI patients • Some states, including Massachusetts, now require reporting of case data based on the ACC-NCDR

Current Risk Model Standard: Logistic Regression (LR) • Gold standard for risk modeling in interventional cardiology • Type of generalized linear model • Used in analysis of a binary outcome • Predicted probability is bounded by 0 and 1 • Feature (variable) selection: from all available data; known risk factors from prior studies; selected subset of data based on study design
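The "bounded by 0 and 1" property comes from the logistic link, which can be sketched as follows. This is a minimal illustration only: the intercept, coefficients, and feature values are hypothetical and are not taken from the study.

```python
import math

def logistic(z: float) -> float:
    """Map a linear predictor z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for large negative z
    return ez / (1.0 + ez)

def predicted_risk(intercept, coefficients, features):
    """Predicted probability of the binary outcome for one case."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return logistic(z)

# Hypothetical model: two binary risk factors, only the first one present.
risk = predicted_risk(-4.0, [0.8, 1.5], [1.0, 0.0])
print(f"predicted risk: {risk:.4f}")
```

However large or small the linear predictor becomes, the output stays between 0 and 1, which is why it can be read as a case-level probability of death.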

Alternative Risk Model: Support Vector Machine (SVM) • Key features • Kernel functions introduce non-linearity in the hypothesis space without explicitly requiring a non-linear algorithm: linear, polynomial, radial basis function • Training is a convex optimization problem, so the solution found is a global minimum
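The three kernel families named above can be written out in a few lines. This is a sketch using common textbook forms; the exact parameterization used by GIST (for example how its radial width factor maps to sigma) is not given in the slides, so `coef0` and `sigma` here are assumptions.

```python
import math

def linear_kernel(x, y):
    """Plain dot product: no non-linearity is introduced."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Dot product raised to a power; degree=1 with coef0=0 is the linear kernel."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, sigma=1.0):
    """Radial basis function: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

In each case the SVM algorithm itself stays the same; only the similarity function between two cases changes.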

Risk Model Evaluation: Discrimination • Provides an estimate of population-level accuracy • Measured by the area under the Receiver Operating Characteristic (ROC) curve • The ROC curve plots sensitivity against 1 - specificity at different thresholds
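The area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way with a direct pairwise comparison; the labels and scores are made-up values for illustration, not study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Because only the ranking of scores matters, this measure can stay high even when the predicted probabilities themselves are poorly calibrated.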

Risk Model Evaluation: Calibration • Provides an estimate of case-level accuracy • Hosmer-Lemeshow goodness-of-fit test • Primarily used in logistic regression • Measures how well the observed and expected event frequencies match • Handles data sparsity better than more common methods (deviance, Pearson's chi-square) • P > 0.05 indicates no significant lack of fit and is taken as good calibration
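A minimal sketch of the Hosmer-Lemeshow statistic follows, assuming the usual construction: cases are sorted by predicted risk, cut into equal-sized groups, and observed and expected event counts are compared within each group. The choice of 10 groups and of `groups - 2` degrees of freedom are conventional assumptions, not details stated in the slides, and the closed-form p-value helper only handles even degrees of freedom.

```python
import math

def hosmer_lemeshow(labels, probs, groups=10):
    """Return (chi-square statistic, degrees of freedom)."""
    pairs = sorted(zip(probs, labels))  # order cases by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0.0:  # skip groups with mean risk of exactly 0 or 1
            statistic += (observed - expected) ** 2 / variance
    return statistic, groups - 2

def chi2_sf_even_dof(x, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("this helper supports positive even dof only")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total
```

With the default 10 groups the test has 8 degrees of freedom, so `chi2_sf_even_dof(statistic, 8)` gives the p-value to compare against 0.05.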

Source Data • Brigham & Women’s Hospital • Interventional Cardiology Database • January 1, 2002 – October 30, 2004 • 5383 cases • Data split two ways, each into 2/3 training (3588 cases) and 1/3 test (1795 cases) • Sequential split: cases sorted chronologically and split at October 27, 2003 • Random split
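The two splitting schemes can be sketched as below. The case records are synthetic placeholders (a hypothetical `make_synthetic_cases` helper with an `id` and a `date` field); only the case count and the 2/3 to 1/3 proportion come from the slide.

```python
import random
from datetime import date, timedelta

def make_synthetic_cases(n):
    """Hypothetical stand-in for the registry: n cases with dates in the study window."""
    start = date(2002, 1, 1)
    return [{"id": i, "date": start + timedelta(days=i % 1034)} for i in range(n)]

def sequential_split(cases):
    """Sort chronologically; the earliest 2/3 train, the latest 1/3 test."""
    ordered = sorted(cases, key=lambda c: c["date"])
    cut = len(ordered) * 2 // 3
    return ordered[:cut], ordered[cut:]

def random_split(cases, seed=0):
    """Shuffle with a fixed seed, then take 2/3 for training and 1/3 for test."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 2 // 3
    return shuffled[:cut], shuffled[cut:]

train, test = sequential_split(make_synthetic_cases(5383))
print(len(train), len(test))  # 3588 1795
```

The sequential split tests a model on cases that are strictly later than its training data, which is what exposes drift over time; the random split mixes time periods and hides it.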

Sample Demographics: Overview

Model Features

Logistic Regression: Model Development • Stata 8.2 (College Station, TX) • Backwards stepwise technique • Exclusion threshold (P = 0.05 to 0.15) • Feature selection

Logistic Regression: Feature Selection • Model development on the sequential training set • Backwards stepwise selection (P = 0.10) used for feature selection • Further stepwise feature removal to optimize the area under the ROC curve and Hosmer-Lemeshow (HL) goodness of fit
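The backwards stepwise idea can be sketched generically as a loop that refits the model and drops the least significant feature until every remaining feature falls below the exclusion threshold. This is not the Stata procedure itself: `fit_p_values` is a hypothetical callback standing in for a real model fit, and the feature names and p-values in the usage example are invented.

```python
def backward_eliminate(features, fit_p_values, threshold=0.10):
    """Repeatedly drop the feature with the largest p-value at or above threshold."""
    selected = list(features)
    while selected:
        p_values = fit_p_values(selected)  # refit on the current feature set
        worst = max(selected, key=lambda f: p_values[f])
        if p_values[worst] < threshold:
            break  # every remaining feature is below the exclusion threshold
        selected.remove(worst)
    return selected

# Toy stand-in for a model fit: fixed, invented p-values per feature.
toy_p = {"age": 0.01, "diabetes": 0.20, "shock": 0.001, "smoker": 0.50}
kept = backward_eliminate(list(toy_p), lambda fs: {f: toy_p[f] for f in fs})
print(kept)  # ['age', 'shock']
```

In a real fit the p-values change each time a feature is removed, which is why the callback is invoked again on every pass.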

Logistic Regression: Feature Selection

Logistic Regression: Evaluation

Support Vector Machine: Model Development • GIST 2.1.1 (Columbia University, New York, NY) • Stata 8.2 (College Station, TX) • All variables used • Kernel choice: polynomial (degree 1-6); radial width factor, related to sigma (0.1-20) • Probabilistic output methodology: the discriminant is the distance from the hyperplane; an LR model is fit using the discriminant as the only feature; this is an established method to convert SVM classification to regression; it allows use of HL goodness of fit
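The probabilistic output step, fitting a logistic regression with the SVM discriminant as its only feature, can be sketched as below. The slides indicate this was done in Stata; the plain gradient-descent fit here is a stand-in under that assumption, and the discriminant values and labels are invented for illustration.

```python
import math

def _sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_discriminant_lr(discriminants, labels, learning_rate=0.1, epochs=2000):
    """Fit p = sigmoid(slope * d + intercept) by gradient descent on log loss."""
    slope, intercept = 0.0, 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_slope = grad_intercept = 0.0
        for d, y in zip(discriminants, labels):
            error = _sigmoid(slope * d + intercept) - y
            grad_slope += error * d
            grad_intercept += error
        slope -= learning_rate * grad_slope / n
        intercept -= learning_rate * grad_intercept / n
    return slope, intercept

def calibrated_probability(d, slope, intercept):
    """Convert one SVM discriminant value into an estimated probability."""
    return _sigmoid(slope * d + intercept)

# Invented discriminants (signed distances from the hyperplane) and outcomes.
distances = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
outcomes = [0, 0, 1, 0, 1, 1]
slope, intercept = fit_discriminant_lr(distances, outcomes)
print(calibrated_probability(1.5, slope, intercept))
```

Once each case has a probability rather than a class label, the same Hosmer-Lemeshow test used for LR can be applied to the SVM output.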

Support Vector Machine: Polynomial Evaluation

Support Vector Machine: Radial Evaluation

Discussion: Discrimination, All Models • All models showed excellent discrimination • No model differed significantly from the others in discrimination • Discrimination was relatively insensitive to changes in the data and stayed stable across widely varying levels of calibration

Discussion: LR Calibration • For these data, LR was unable to maintain calibration, likely because of temporal data drift • The LR models required manual feature selection and expert knowledge to achieve calibration on the training data sets

Discussion: SVM Calibration • Some versions of both kernel types maintained calibration on both data sets • For both kernels, calibration was maintained across larger parameter ranges on the random data set than on the sequential data set • Current assessments of discrimination and calibration on the training set are insufficient to choose the optimal kernel parameter

Conclusions • SVMs could be superior to LR at maintaining calibration over time in this domain • Further exploration is needed to develop additional markers of model robustness • Further work is needed to evaluate the optimal time intervals for creating new models or recalibrating old ones

