Evaluating Risk Adjustment Models. Andy Bindman MD Department of Medicine, Epidemiology and Biostatistics. Evaluating Model’s Predictive Power. Linear regression (continuous outcomes) Logistic regression (dichotomous outcomes). Evaluating Linear Regression Models.
Evaluating Risk Adjustment Models Andy Bindman MD Department of Medicine, Epidemiology and Biostatistics
Evaluating Model’s Predictive Power • Linear regression (continuous outcomes) • Logistic regression (dichotomous outcomes)
Evaluating Linear Regression Models • R² is percentage of variation in outcomes explained by the model - best for continuous dependent variables • Length of stay • Health care costs • Ranges from 0-100% • Generally more is better
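As a minimal sketch of the definition above, R² can be computed as one minus the ratio of residual to total sum of squares. The function name `r_squared` and the length-of-stay numbers are illustrative assumptions, not data from the lecture.

```python
def r_squared(observed, predicted):
    """Share of outcome variation explained by the predictions.

    For a least-squares fit with an intercept this lies between 0 and 1.
    """
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_resid / ss_total

# Hypothetical lengths of stay (days) and model predictions
los_observed = [2, 4, 5, 7, 12]
los_predicted = [3, 4, 6, 6, 11]

r2 = r_squared(los_observed, los_predicted)
print(f"R2 = {r2:.0%}")
```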
Risk Adjustment Models • Typically explain only 20-25% of variation in health care utilization • Explaining this amount of variation can be important if the remaining variation is largely random • Example: supports equitable allocation of capitation payments from health plans to providers
More to Modeling than Numbers • R² biased upward by more predictors • Approach to categorizing outliers can affect R², as predicting less skewed data gives higher R² • Model subject to random tendencies of particular dataset
Evaluating Logistic Models • Discrimination - accuracy of predicting outcomes among all individuals depending on their characteristics • Calibration - how well prediction works across the range of risk
Discrimination • C index - compares all possible pairs of individuals drawn one from each outcome group (alive vs dead) to see if the risk adjustment model predicts a higher likelihood of death for the one who died (concordant) • Ranges from 0-1 based on proportion of concordant pairs plus half of ties
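The pairwise comparison described above can be sketched directly. This is an illustrative brute-force version, assuming a list of predicted risks and a parallel list of 0/1 death indicators; the name `c_index` and the sample values are hypothetical.

```python
def c_index(predicted_risk, died):
    """Proportion of (died, survived) pairs that are concordant, ties counted as half."""
    risk_dead = [r for r, d in zip(predicted_risk, died) if d]
    risk_alive = [r for r, d in zip(predicted_risk, died) if not d]
    concordant = 0
    ties = 0
    for rd in risk_dead:
        for ra in risk_alive:
            if rd > ra:        # model gave the patient who died the higher risk
                concordant += 1
            elif rd == ra:     # identical predicted risk
                ties += 1
    return (concordant + 0.5 * ties) / (len(risk_dead) * len(risk_alive))

# Hypothetical predicted risks and outcomes (1 = died)
risks = [0.02, 0.05, 0.05, 0.10, 0.30, 0.40]
outcomes = [0, 0, 1, 0, 1, 1]

c = c_index(risks, outcomes)
print(round(c, 3))
```

Here 7 of the 9 pairs are concordant and 1 is tied, so the index is 7.5/9.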
Adequacy of Risk Adjustment Models • C index of 0.5 no better than random • C index of 1.0 indicates perfect prediction • Typical risk adjustment models 0.7-0.8
C statistic • Area under ROC curve for a predictive model no better than chance at predicting death is 0.5 • Models with improved prediction of death by • 0.5 SDs better than chance results in c statistic = 0.64 • 1.0 SDs better than chance results in c statistic = 0.76 • 1.5 SDs better than chance results in c statistic = 0.86 • 2.0 SDs better than chance results in c statistic = 0.92
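The slide does not say how these values were derived. They are consistent with one common assumption: risk scores are normally distributed with equal variance in both outcome groups, and the group means differ by d standard deviations, in which case the area under the ROC curve is Φ(d/√2). The sketch below works under that assumption only; `auc_from_separation` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def auc_from_separation(d):
    """AUC when two equal-variance normal score distributions differ by d SDs (assumed model)."""
    return NormalDist().cdf(d / math.sqrt(2))

separations = [0.5, 1.0, 1.5, 2.0]
c_statistics = [round(auc_from_separation(d), 2) for d in separations]
for d, c in zip(separations, c_statistics):
    print(f"{d} SDs -> c statistic {c}")
```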
Best Model Doesn’t Always Have Biggest C statistic • Adding health conditions that result from complications will raise c statistic of model but not make the model better for predicting quality.
Spurious Assessment of Model Performance • Missing values can lead to some patients being dropped from models • Be certain when comparing models that the same group of patients is being used for all models otherwise comparisons may reflect more than model performance
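One way to guard against this is to restrict every model to the patients who have complete data for all predictors used by any of the models. The sketch below assumes patient records stored as dictionaries with `None` for missing values; the field names and records are hypothetical.

```python
def complete_cases(rows, variables):
    """Keep only patients with a recorded value for every listed variable."""
    return [r for r in rows if all(r.get(v) is not None for v in variables)]

# Hypothetical patient records; None marks a missing value
patients = [
    {"id": 1, "age": 65, "creatinine": 1.0, "died": 0},
    {"id": 2, "age": 72, "creatinine": None, "died": 1},
    {"id": 3, "age": 58, "creatinine": 1.4, "died": 0},
]

model_a_vars = ["age"]                 # simpler model
model_b_vars = ["age", "creatinine"]   # model with an extra predictor

# Compare both models on the patients usable by every model
shared = complete_cases(patients, set(model_a_vars) | set(model_b_vars))
shared_ids = [p["id"] for p in shared]
print(shared_ids)
```

Without this step, the simpler model would be evaluated on three patients and the larger model on two, so any difference in performance could reflect the patient mix as well as the models.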
Calibration - Hosmer-Lemeshow • Size of C index does not indicate how well model performs across range of risk • Stratify individuals into groups (e.g. 10 groups) of equal size according to predicted likelihood of adverse outcome (e.g. death) • Compare actual vs expected outcomes for each stratum • Want a nonsignificant p value for each stratum and across strata (Hosmer-Lemeshow statistic)
Hosmer-Lemeshow • For k strata the chi squared has k-2 degrees of freedom • Can obtain false negative (non significant p value) by having too few cases in a stratum
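The stratify-and-compare procedure can be sketched as follows. This is a simplified illustration that assumes the number of patients is at least the number of strata and that no stratum has an average predicted risk of exactly 0 or 1; the function name and the data are hypothetical.

```python
def hosmer_lemeshow(predicted, died, groups=10):
    """Return (chi-squared statistic, degrees of freedom) over equal-size risk strata."""
    pairs = sorted(zip(predicted, died))   # order patients by predicted risk
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        stratum = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(stratum)
        expected = sum(p for p, _ in stratum)   # expected deaths in stratum
        observed = sum(d for _, d in stratum)   # actual deaths in stratum
        mean_p = expected / size
        chi2 += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
    return chi2, groups - 2

# Hypothetical data: 30 patients in three risk strata of 10
predicted = [0.1] * 10 + [0.2] * 10 + [0.5] * 10
died = [1, 1] + [0] * 8 + [1, 1] + [0] * 8 + [1] * 4 + [0] * 6

stat, df = hosmer_lemeshow(predicted, died, groups=3)
print(f"chi-squared = {stat:.2f} on {df} degree(s) of freedom")
```

The p value comes from the chi-squared distribution with k-2 degrees of freedom; if SciPy is installed, `scipy.stats.chi2.sf(stat, df)` returns it.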
Calculating Expected Outcomes • Solve the multivariate model incorporating an individual’s specific characteristics • For continuous outcomes the predicted values are the expected values • For dichotomous outcomes the sum of the derived predictor variables produces a “logit” which can be algebraically converted to a probability • Probability = $e^{\ln \text{odds}} / (1 + e^{\ln \text{odds}})$
Individual’s CABG Mortality Risk • 65 y.o obese non white woman with diabetes and serum creatinine of 1 mg/dl presents with an urgent need for CABG surgery. What is her risk of death?
Individual’s Predicted CABG Mortality Risk • 65 y.o obese non white woman with diabetes presents with an urgent need for CABG surgery. What is her risk of death? • Log odds = -9.74 + 65(0.06) + 1(0.37) + 1(0.16) + 1(0.42) + 1(0.26) + 1(1.15) + 1(0.09) = -3.39 • Probability of death = $e^{\ln \text{odds}} / (1 + e^{\ln \text{odds}})$ = 0.034/1.034 = 3.3%
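Carrying out the arithmetic with the intercept and coefficients listed on the slide gives a log odds of about -3.39 and a probability of death of about 3.3%. The sketch below only reproduces that calculation; the slide does not label which coefficient belongs to which risk factor beyond age, so the terms are left unnamed.

```python
import math

intercept = -9.74
# (coefficient, value) pairs as given on the slide; the first term is age
terms = [(0.06, 65), (0.37, 1), (0.16, 1), (0.42, 1), (0.26, 1), (1.15, 1), (0.09, 1)]

log_odds = intercept + sum(coef * value for coef, value in terms)
probability = math.exp(log_odds) / (1 + math.exp(log_odds))

print(f"log odds = {log_odds:.2f}")
print(f"probability of death = {probability:.1%}")
```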
Observed CABG Mortality Risk • Actual outcome of whether individual lived or died • Observed rate for a group is number of deaths per the number of people in that group
Actual and Expected CABG Surgery Mortality Rates by Patient Severity of Illness in New York • Chi squared p = .16
Stratifying by Risk • Hosmer Lemeshow provides a summary statistic of how well model is calibrated • Also useful to look at how well model performs at extremes (high risk and low risk)
Validating Model – Eye Ball Test • Face validity/Content validity • Does empirically derived model correspond to a pre-determined conceptual model? • If not is that because of highly correlated predictors? A dataset limitation? A modeling error?
Validating Model in Other Datasets: Predicting Mortality following CABG • Jones et al, JACC, 1996
Recalibrating Risk Adjustment Models • Necessary when the observed outcome rate differs from the expected rate derived from a different population • This could reflect quality of care or differences in coding practices • Assumption is that the relative weights of the predictors to one another are correct • Recalibration is an adjustment to all predictor coefficients to force average expected outcome rate to equal observed outcome rate
Recalibrating Risk Adjustment Models • New York AMI mortality rate is 15% • California AMI mortality rate is 13% • Is care or coding different? • To use the New York derived risk adjustment model to predict expected deaths in California, need to adjust predictors (e.g. multiply by 13/15)
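The 13/15 adjustment can be illustrated with a simplified sketch that rescales each patient's predicted probability so the average expected rate matches the observed rate. This is an assumption made for illustration: the slides describe adjusting the predictor coefficients, whereas this version scales the predictions directly. The patient probabilities and the name `recalibrate` are hypothetical.

```python
def recalibrate(expected_probs, observed_rate):
    """Scale predicted probabilities so their mean equals the observed rate."""
    mean_expected = sum(expected_probs) / len(expected_probs)
    factor = observed_rate / mean_expected
    # Cap at 1.0 so a scaled value is still a valid probability
    return [min(1.0, p * factor) for p in expected_probs], factor

# Hypothetical New York model predictions for four California patients (mean 15%)
ny_model_probs = [0.05, 0.10, 0.15, 0.30]
california_observed_rate = 0.13

adjusted, factor = recalibrate(ny_model_probs, california_observed_rate)
print(f"scaling factor = {factor:.3f}")
print(f"mean adjusted rate = {sum(adjusted) / len(adjusted):.2%}")
```

Scaling keeps the ranking of patients by risk unchanged, which matches the stated assumption that the relative weights of the predictors are correct.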
Summary • Summary statistics provide a means for evaluating the predictive power of multivariate models • Care should be taken to look beyond summary statistics to ensure that the model is not overspecified and that it conforms to a conceptual model • Models should be validated with internal and ideally external data • Next time we will review how a risk-adjustment model can be used to identify providers who perform better and worse than expected given their patient mix