
Evaluating Risk Adjustment Models




  1. Evaluating Risk Adjustment Models Andy Bindman MD Department of Medicine, Epidemiology and Biostatistics

  2. Goals of Risk-Adjustment • Account for pertinent patient characteristics before making inferences about effectiveness, efficiency, or quality of care • Minimize confounding bias due to nonrandom assignment of patients to different providers or systems of care • Confirm the importance of specific predictors

  3. Why Risk-Adjustment? • Monitoring and comparing outcomes of care (death, readmission, adverse events, functional status, quality of life) • Monitoring and comparing utilization of services and resources (LOS, cost) • Monitoring and comparing patient satisfaction • Monitoring and comparing processes of care

  4. How Is Risk Adjustment Done? • On large datasets • Uses measured differences in compared groups • Models the impact of measured differences between groups on variables shown, known, or thought to predict the outcome, so as to isolate the effect of the predictor variable of interest

  5. When Risk-Adjustment May Be Inappropriate • Processes of care which virtually every patient should receive (e.g., immunizations, discharge instructions) • Adverse outcomes which virtually no patient should experience (e.g., incorrect amputation) • Nearly certain outcomes (e.g., death in a patient with prolonged CPR in the field) • Too few adverse outcomes per provider

  6. When Risk-Adjustment May Be Unnecessary • If inclusion and exclusion criteria can adequately adjust for differences • If assignment of patients is random or quasi-random

  7. When Risk-Adjustment May Be Impossible • If selection bias is an overwhelming problem • If outcomes are missing or unknown for a large proportion of the sample • If risk factor data (predictors) are extremely unreliable, invalid, or incomplete

  8. Data Sources for Risk-Adjustment • Administrative data are collected primarily for a different purpose, but are commonly used for risk-adjustment • Medical records data are more difficult to use, but contain far more information • Patient surveys may complement either or both of the other sources

  9. Advantages of Administrative Data • Universally inclusive population-based • Computerized, inexpensive to obtain and use • Uniform definitions • Ongoing data monitoring and evaluation • Diagnostic coding (ICD-9-CM) guidelines • Opportunities for linkage (vital stat, cancer)

  10. Disadvantages of Administrative Data • Missing key information about physiologic and functional status • No control over data collection process • Quality of diagnostic coding varies across hospitals • Incentives to upcode (DRG creep), possibly avoid coding complications • Inherent limitations of ICD-9-CM

  11. Doing Your Own Risk-Adjustment vs. Using an Existing Product • Is an existing product available or affordable? • Would an existing product meet my needs? - Developed on similar patient population - Applied previously to the same condition or procedure - Data requirements match availability - Conceptual framework is plausible and appropriate - Known validity

  12. Conditions Favoring Use of an Existing Product • Need to study multiple diverse conditions or procedures • Limited analytic resources • Need to benchmark performance using an external norm • Need to compare performance with other providers using the same product • Focus on resource utilization, possibly mortality

  13. A Quick Survey of Existing Products: Hospital/General Inpatient • APR-DRGs (3M) • Disease Staging (SysteMetrics/MEDSTAT) • Patient Management Categories (PRI) • RAMI/RACI/RARI (HCIA) • Atlas/MedisGroups (MediQual) • Cleveland Health Quality Choice • Public domain (MMPS, CHOP, CSRS, etc.)

  14. A Quick Survey of Existing Products: Intensive Care • APACHE • MPM • SAPS • PRISM

  15. A Quick Survey of Existing Products: Outpatient Care • Resource-Based Relative Value Scale (RBRVS) • Ambulatory Patient Groups (APGs) • Physician Care Groups (PCGs) • Ambulatory Care Groups (ACGs)

  16. How Do Commercial Risk-Adjustment Tools Perform? • Better predictors of use/death than age and sex alone • Better retrospectively (~30-50% of variation explained) than prospectively (~10-20%) • Lack of agreement among measures • More than 20% of inpatients were assigned very different severity scores depending on which tool was used (Iezzoni, Ann Intern Med, 1995)

  17. Building Your Own Risk-Adjustment Model • Previous literature • Expert opinion - Generate specific hypotheses, plausible mechanisms - Translate clinically important concepts into measurable variables (e.g., cardiogenic shock) - Separate factors that could be risk for disease or complication of treatment • Data dredging (retrospective)

  18. Empirical Testing of Risk Factors • Univariate/bivariate analyses to eliminate low-frequency, insignificant, or counterintuitive factors • Test variables for linear, exponential, or threshold effects • Test for interactions

  19. Potential Risk Factors for CABG Outcomes • Age, gender, race, ht, wt, BMI • Ejection fraction, NY Heart Class, # of vessels • Comorbidity - hypertension, CHF, COPD, DM, hepatic failure, renal failure, calcified aorta • Acute treatment/complications - IABP, thrombolysis, PTCA, PTCA complication, hemodynamic instability • Past hx - previous surgery, PTCA, MI, stroke, fem-pop bypass • Behaviors - smoking

  20. Significant Risk Factors for Hospital Mortality for Coronary Artery Bypass Graft Surgery in New York State, 1989-1992

  21. Significant Risk Factors for Hospital Mortality for Coronary Artery Bypass Graft Surgery in New York State, 1989-1992

  22. Risk Factors in Large Data Sets: Can you have too much power? • Clinical vs. statistical importance • Risk of overfitting, and need for a comprehensible model, mandate data reduction • Consider forcing in clinically important predictors

  23. Evaluating Model Quality • Linear regression (continuous outcomes) • Logistic regression (dichotomous outcomes)

  24. Evaluating Linear Regression Models • R2 is percentage of variation in outcomes explained by the model - best for continuous dependent variables • Ranges from 0-100% • Generally more is better but biased upward by more predictors • Sometimes explaining a small amount of variation is still important
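The R² computation described on this slide can be sketched in plain Python (an illustrative helper, not part of the original deck):

```python
def r_squared(actual, predicted):
    """Fraction of outcome variance explained by the model:
    1 - (residual sum of squares / total sum of squares)."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

# Perfect predictions explain all of the variation;
# predicting the mean for everyone explains none of it.
print(r_squared([1, 2, 3], [1, 2, 3]))  # 1.0
print(r_squared([1, 2, 3], [2, 2, 2]))  # 0.0
```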

  25. Evaluating Logistic Models • c statistic - compares all random pairs of individuals in each outcome group (alive vs dead) to see if risk adjustment model predicts a higher likelihood of death for those who died • Ranges from 0-1 • c value of 0.5 means model is no better than random • c value of 1.0 indicates perfect performance
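The pairwise definition of the c statistic on this slide translates directly into code; a minimal sketch (counting ties as half-concordant, a common convention not stated on the slide):

```python
def c_statistic(predicted_risk, died):
    """For every (death, survivor) pair, score 1 if the model gave the
    patient who died the higher predicted risk, 0.5 for a tie, else 0;
    the c statistic is the average score over all such pairs."""
    risks_died = [p for p, y in zip(predicted_risk, died) if y == 1]
    risks_alive = [p for p, y in zip(predicted_risk, died) if y == 0]
    n_pairs = len(risks_died) * len(risks_alive)
    concordant = sum(1.0 if d > a else 0.5 if d == a else 0.0
                     for d in risks_died for a in risks_alive)
    return concordant / n_pairs
```

A model that ranks every death above every survivor scores 1.0; one whose predictions are no better than random scores about 0.5, matching the interpretation on the slide.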

  26. How Well the Model Predicts Outcomes Across the Range of Risks - Hosmer-Lemeshow • Stratify individuals into groups (e.g., 10 groups) of equal size according to predicted likelihood of an adverse outcome (e.g., death) • Compare actual vs. predicted deaths for each stratum • Hosmer-Lemeshow chi-square statistic (8 degrees of freedom for 10 deciles) • The goal is to demonstrate a nonsignificant p value
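The stratify-and-compare procedure above can be sketched as follows (a simplified illustration: equal-size strata by predicted risk, with the standard per-stratum variance term in the denominator):

```python
def hosmer_lemeshow(predicted, outcomes, groups=10):
    """Chi-square comparing observed vs. expected events within
    equal-size strata of predicted risk. Compare the result to a
    chi-square distribution with (groups - 2) degrees of freedom."""
    data = sorted(zip(predicted, outcomes))  # order patients by risk
    n = len(data)
    chi2 = 0.0
    for g in range(groups):
        stratum = data[g * n // groups:(g + 1) * n // groups]
        n_g = len(stratum)
        observed = sum(y for _, y in stratum)
        expected = sum(p for p, _ in stratum)
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return chi2
```

A small chi-square (nonsignificant p) means observed and expected deaths agree across the risk range, which is exactly what the slide says the analyst hopes to demonstrate.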

  27. Actual and Expected Mortality Rates for Different Levels of Patient Severity of Illness (chi-squared p = 0.16)

  28. Goodness-of-fit tests for AMI mortality models OSHPD: AMI Outcomes Project, 1996

  29. Aggregating to the Group Level • Sum observed and predicted events • Statistical problems arise when the total number of predicted events is small • Chi-squared comparisons of groups assume, as a rule of thumb, a minimum of about five expected events per group

  30. Comparing Observed and Expected Outcomes • Observed events or rates of events • Expected events or rates of events • Risk-adjusted events or rates = site-specific (observed/expected) × average observed rate across all sites
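The indirect-standardization formula on this slide, written as a one-line helper (variable names are illustrative):

```python
def risk_adjusted_rate(observed_events, expected_events, overall_rate):
    """Risk-adjusted rate = site-specific (observed/expected) ratio
    times the average observed rate across all sites."""
    return (observed_events / expected_events) * overall_rate

# A site with 12 observed vs. 10 expected deaths, when average
# mortality across all sites is 3%, gets an adjusted rate of 3.6%.
```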

  31. Validating Model • Face validity/Content validity • Gold standard = external validation with new data • Separate development and validation data sets - Randomly split samples - Samples from different time periods/areas • Re-estimate model using all available data
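The randomly split development/validation scheme mentioned above can be sketched like this (an illustrative helper; the split fraction and fixed seed are assumptions, the seed added for reproducibility):

```python
import random

def split_sample(records, frac=0.5, seed=0):
    """Shuffle the data and split it into development and
    validation sets of the requested proportion."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * frac)
    return shuffled[:cut], shuffled[cut:]
```

The model is estimated on the development half and its discrimination and calibration are re-checked on the held-out half; once validated, as the slide notes, the model can be re-estimated using all available data.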

  32. Bootstrap Procedure: “If things had been a little different” • Multiple (e.g. 1000) random samples derived from original sample with replacement • Estimate model’s performance in each new random sample • Can derive C.I.s of model coefficients from empirical results of “new samples”
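The resampling procedure on this slide, sketched as a percentile-interval bootstrap (the statistic, sample, and seed in the example are illustrative assumptions):

```python
import random

def bootstrap_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Draw n_boot same-size resamples of the data with replacement,
    compute the statistic on each, and return the empirical
    (1 - alpha) percentile interval."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lo = estimates[int(n_boot * alpha / 2)]
    hi = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi
```

The same idea applies to model coefficients: refit the model on each resample and read the confidence interval off the empirical distribution of each coefficient, as the slide describes.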

  33. Consistency in the evidence • Similar findings over time help to rule out random effects • Differences between observed and expected may be due to things other than ‘quality’ • Confirmation through very different types of evidence is a major goal • View the risk adjusted estimates as ‘yellow flags’, not ‘smoking guns’
