Models for cost analysis in health care: a critical and selective review

Department of Bioinformatics, Ljubljana, 1st April 2005
Dario Gregori, Department of Public Health and Microbiology, University of Torino
Giulia Zigon, Department of Statistics, University of Firenze
Rosalba Rosato, Eva Pagano, Servizio di Epidemiologia dei Tumori, Università di Torino, CPO Piemonte
Simona Bo, Gianfranco Pagano, Dipartimento di Medicina Interna, Università di Torino
Alessandro Desideri, Service of Cardiology, Castelfranco Veneto Hospital
Department of Public Health and Microbiology, University of Torino

Outline
• Cost-effectiveness and cost analysis
• Problems in cost analysis of clinical data
  • zero costs
  • skewness
  • censoring
• Models for cost data
• Two case studies
  • Diabetes costs in the Molinette cohort
  • COSTAMI trial

The Molinette Diabetes Cohort
• 3892 subjects, comprising all type 2 diabetic patients resident in the Piedmont region who attended the Diabetic Clinic of the San Giovanni Battista Hospital in Torino (Piedmont, Italy) during 1995 and were alive on 1st January 1996.
• Mortality and hospitalization follow-up was carried out until 30th June 2000.
• A sub-cohort of 2550 patients with at least one hospitalization in the subsequent years was also identified.
• Demographic data (age, sex) and clinical data for the year 1995 (duration of disease, i.e. years of diabetes, and number of other co-morbidities) were recorded.
• Costs (in euros) for day-hospital and ordinary hospitalizations were calculated according to the Italian DRG system.

The COSTAMI study
• 487 patients with uncomplicated AMI were randomly assigned to three different strategies:
  • (132 patients) early (days 3-5) use of pharmacological stress echocardiography, and discharge on days 7-9 in case of a negative test result;
  • (130 patients) pre-discharge exercise ECG, that is a maximal, symptom-limited test on days 7-9, followed by discharge in case of a negative test result;
  • (225 patients) clinical evaluation and hospital discharge on days 7-9.
• The suggested strategy in case of a positive test for strategies 1 and 2 was coronary angiography followed by ischaemia-guided revascularisation (Desideri et al., 2003).
• A 1-year follow-up for medical costs was carried out. The cost of hospitalization was estimated from the mean reimbursement for the diagnosis-related groups (DRG).

The CE Incremental Ratio
• Goal is to compare efficacy with costs
• T1, T2: treatment groups of patients
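
A conventional way to write the incremental ratio for the two groups is sketched below. The symbols \bar{C}_j and \bar{E}_j (mean cost and mean efficacy in group T_j) are notation assumed here, chosen to agree with the ΔC and ΔE used on the slides that follow.

```latex
\mathrm{ICER} \;=\; \frac{\Delta C}{\Delta E}
\;=\; \frac{\bar{C}_1 - \bar{C}_2}{\bar{E}_1 - \bar{E}_2}
```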

The Cost-Efficacy plane
[Figure: cost-efficacy plane with axes ΔC and ΔE, upper and lower thresholds, and regions R1, R1A, R1B, R1c, R2, R2A, R2B, R2c]

Dominance
• Laska & Wakker work (late 80’s)
• ΔC < 0, ΔE > 0: T1 is dominant
• ΔC > 0, ΔE < 0: T2 is dominant
• ΔC > 0, ΔE > 0: T1 more effective and more costly
• ΔC < 0, ΔE < 0: T1 less costly but less effective
If effects are equivalent or of no interest, then the approach is the analysis of costs alone.
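
The four cases above can be sketched as a small Python function. The function name and the returned labels are illustrative choices, and ΔC and ΔE are taken as T1 minus T2, which is the reading implied by the slide.

```python
def classify_dominance(delta_c: float, delta_e: float) -> str:
    """Classify T1 versus T2 from incremental cost and efficacy (T1 minus T2)."""
    if delta_c < 0 and delta_e > 0:
        return "T1 is dominant"
    if delta_c > 0 and delta_e < 0:
        return "T2 is dominant"
    if delta_c > 0 and delta_e > 0:
        return "T1 more effective and more costly"
    if delta_c < 0 and delta_e < 0:
        return "T1 less costly but less effective"
    # At least one difference is exactly zero: not covered by the four quadrants
    return "on an axis"


# Hypothetical increments: T1 saves 250 euros and gains 0.03 efficacy units
print(classify_dominance(-250.0, 0.03))
```

Only the two mixed-sign quadrants settle the comparison on their own; in the other two the incremental ratio has to be weighed against a threshold.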

Typical goals in cost-analysis
• To get an estimate of the mean costs of treating the disease
• In experimental settings: to test for differences between two or more groups
• In observational settings: to identify patient/structure characteristics influencing costs
• To get an estimate of the expected costs, at a fixed time point, for specific types of patients (cost profiling)

Typical problems in cost-analysis
• The possibly large mass of observations with zero cost
• The asymmetry of the distribution: a minority of individuals has high medical costs compared with the rest of the population
• Possible presence of censoring:
  • Right censoring due to loss to follow-up or administrative rule (O’Hagan 2002)
  • Death censoring: dead patients are seen as lost to follow-up, to compensate for higher/earlier mortality at lower costs (Dudley et al, 1993)
• General requirements:
  • the censoring must be independent, or non-informative. This condition is needed because the individuals still under observation must be representative of the population at risk in each group; otherwise the observed failure rate in each group will be biased
  • the assumption of proportional hazards may be violated by medical costs, which accumulate at different rates

Proportionality on cost accumulation and censoring (Etzioni, 1999)

Accumulation under alternatives (without covariates)

Censoring: some conflicting definitions

Cost distribution
Number of zero-cost patients: 2226

Accumulation of costs over time

Studies with no zero mass
• OLS on untransformed use or expenditures
• OLS for log(y) to deal with skewness
• Box-Cox generalization
• Gamma regression model with log link
• Generalized Linear Models (GLM)
• Robustness to skewness
• Reduce influence of extreme cases
• Good forecast performance
• No systematic misfit over range of predictions
• Efficiency of estimator

Linear models
• The Ordinary Least Squares (OLS) model assumes a linear form for the costs, ci = xiβ + εi, estimated via Gauss-Markov or ML, the latter requiring normality and constant variance of the residuals
• To reduce skewness in the residuals, the Box-Cox transform of ci can be used: (ci^λ - 1)/λ for λ ≠ 0, and ln(ci) for λ = 0
• Problems:
  • normality is still assumed
  • bias is [expression not preserved]
  • thus, heteroscedasticity, if present, raises additional efficiency and inference problems on the transformed scale
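
As a minimal sketch of the Box-Cox transform mentioned above, the standard one-parameter form can be written in plain Python. The function name and the example cost are assumptions made for illustration; the transform is defined only for strictly positive costs.

```python
import math


def box_cox(cost: float, lam: float) -> float:
    """Standard one-parameter Box-Cox transform of a strictly positive cost."""
    if cost <= 0:
        raise ValueError("Box-Cox requires a strictly positive value")
    if lam == 0:
        return math.log(cost)  # limiting case as lambda tends to 0
    return (cost ** lam - 1.0) / lam


# Hypothetical cost of 100 euros under a few values of lambda
for lam in (1.0, 0.5, 0.0):
    print(lam, box_cox(100.0, lam))
```

λ = 1 leaves the scale unchanged apart from a shift, and λ = 0 gives the log model of the next slide.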

Log-normal models
• A particular case of transformation is ln(Cij) ~ N(γj, σj²) for two treatments j = 1, 2
• In this case, E(Cij) = exp(γj + 0.5 σj²), and a test of H0: γ1 - γ2 = 0 is a test on the geometric means. This was argued to be less interesting for policy makers, but observe that
  • H0: exp(γ1 + 0.5 σ1²) = exp(γ2 + 0.5 σ2²) implies
  • H0: γ1 - γ2 = 0 iff σ1² = σ2²
• Hence a test on the geometric means is equivalent to one on the arithmetic means only when the variances are homogeneous across treatment groups
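
The point about variances can be checked numerically with a small sketch. The parameter values below are hypothetical and chosen only to show two arms with equal geometric means but different arithmetic means.

```python
import math


def lognormal_arithmetic_mean(gamma: float, sigma2: float) -> float:
    """E(C) when ln(C) ~ N(gamma, sigma2)."""
    return math.exp(gamma + 0.5 * sigma2)


def lognormal_geometric_mean(gamma: float) -> float:
    """Geometric mean of C when ln(C) ~ N(gamma, sigma2)."""
    return math.exp(gamma)


# Hypothetical arms: same log-scale mean, different log-scale variances
mean_arm1 = lognormal_arithmetic_mean(8.0, 1.0)
mean_arm2 = lognormal_arithmetic_mean(8.0, 2.0)
mean_ratio = mean_arm2 / mean_arm1  # equals exp(0.5 * (2.0 - 1.0))
print(lognormal_geometric_mean(8.0), mean_arm1, mean_arm2, mean_ratio)
```

With equal γ the geometric means coincide, yet the arithmetic means differ by the factor exp(0.5 (σ2² - σ1²)).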

Box-Cox transform varying λ

The threshold-logit model
• Used to model the probability of having costs in excess of a given threshold, usually chosen as the median q2 or the third quartile q3 of the cost distribution
• It does not require normality, and also works for very skewed cost distributions
• Problems:
  • it does not give an estimate of the mean costs, although it estimates the covariates’ effects on costs
  • conclusions are sensitive to the threshold chosen, which, in addition, is sample-based

GLM models
• To avoid the bias incurred by transforming the costs directly, the idea is to model a transformation of the expectation, g(E[ci|xi]) = xiβ
• The distribution for the response is usually taken to be Gamma(), and the link function g() is
  • the identity function I() for additive effects
  • the log() for multiplicative models, allowing in this case back-transformation without bias
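
A minimal sketch of how a Gamma GLM with log link can be fitted is given below, using iteratively reweighted least squares (IRLS) for a single covariate in plain Python. The function name and the demonstration data are assumptions for illustration; in practice a statistics package would be used. With a log link and variance proportional to µ², the IRLS weights are constant, so each step is an ordinary least-squares fit of the working response.

```python
import math


def fit_gamma_log_glm(x, y, n_iter=50):
    """Fit E[y|x] = exp(b0 + b1 * x) with Gamma-type variance by IRLS."""
    n = len(x)
    b0, b1 = math.log(sum(y) / n), 0.0  # start from the intercept-only fit
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response for the log link: eta + (y - mu) / mu
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Constant weights, so the update is an unweighted regression of z on x
        z_bar = sum(z) / n
        b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        b0 = z_bar - b1 * x_bar
    return b0, b1


# Noise-free hypothetical data generated from exp(1 + 0.5 x)
x_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
y_demo = [math.exp(1.0 + 0.5 * xi) for xi in x_demo]
b0_hat, b1_hat = fit_gamma_log_glm(x_demo, y_demo)
print(b0_hat, b1_hat)
```

Because the mean itself is modeled on the log scale, predicted costs are exp(b0 + b1 x) directly, with no retransformation factor.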

GLM and QL/GEE estimate
• Use data to find distributional family and link
  • Family “down weights” noisy high mean cases
  • Link can handle linearity
• Note difference in roles from Box-Cox
  • Box-Cox power addresses mostly symmetry in error
  • GLM with power function addresses linearity of response on scale to be chosen
• Estimating equations of the GLM/GEE/GMM modeling approach [equation not preserved]
• Given correct specification of E[y|x] = µ(xβ), key issues relate to second-order or efficiency effects
• This requires consideration of the structure of v(y|x)

Variance determination
• Accommodates skewness and related issues via variance weighting rather than transform/retransform methods
• Assumes Var[y|x] = α × [E(y|x)]^γ = α × [exp(xβ)]^γ
• For GLM, solutions are
  • Adopt alternative "standard" parametric distributional assumptions:
    • γ = 0 (e.g. Gaussian NLLS)
    • γ = 1 (e.g. Poisson)
    • γ = 2 (e.g. Gamma)
    • γ = 3 (e.g. Wald or inverse Gaussian)
  • Estimate γ via:
    • linear regression of log((y - µ)²) on [1, log(µ)] (modified "Park test" by least squares)
    • gamma regression of (y - µ)² on [1, log(µ)] (modified "Park test" estimated by GLM)
    • nonlinear regression of (y - µ)² on αµ^γ
• Given the choice of γ, one can form V(x) and conduct (more efficient) second-round estimation and inference
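
The least-squares version of the modified Park test can be sketched as follows. The function name and the data are hypothetical; the fitted means µ are assumed to come from a first-round model, and observations with a zero residual would need to be handled before taking logs.

```python
import math


def park_test_gamma(y, mu):
    """Slope of the least-squares line of log((y - mu)^2) on log(mu)."""
    log_mu = [math.log(m) for m in mu]
    log_sq_res = [math.log((yi - m) ** 2) for yi, m in zip(y, mu)]
    n = len(mu)
    u_bar = sum(log_mu) / n
    v_bar = sum(log_sq_res) / n
    sxy = sum((u - u_bar) * (v - v_bar) for u, v in zip(log_mu, log_sq_res))
    sxx = sum((u - u_bar) ** 2 for u in log_mu)
    return sxy / sxx


# Hypothetical fitted means, with residuals proportional to the mean,
# so the squared residuals grow as mu^2
mu_demo = [100.0, 200.0, 400.0, 800.0]
y_demo_park = [m + 0.3 * m for m in mu_demo]
gamma_hat = park_test_gamma(y_demo_park, mu_demo)
print(gamma_hat)
```

An estimate near 2 would point towards the Gamma family, near 1 towards a Poisson-type variance, following the list above.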

Monte Carlo Simulation (Mannings, 2000)
• Data generation
  • Skewness in dependent measure
    • Log normal with variance 0.5, 1.0, 1.5, 2.0
  • Heavier tailed than normal on the log scale
    • Mixture of log normals
  • Heteroscedastic responses
    • Std. dev. proportional to x
    • Variance proportional to x
  • Alternative pdf shapes
    • monotonically declining or bell-shaped
    • Gamma with shapes 0.5, 1.0, 4.0
• Estimators considered
  • Log-OLS with
    • homoscedastic retransformation
    • heteroscedastic retransformation
  • Generalized Linear Models (GLM), log link
    • Nonlinear Least Squares (NLS)
    • Poisson
    • Gamma

Effect of skewness on the raw scale

Effects of heavy tails on the log scale

Effects of shape for Gamma

Effect of heteroscedasticity on the log scale

Simulation summary
• All estimators are consistent, except Log-OLS with homoscedastic retransformation when the log-scale error is actually heteroscedastic
• GLM models suffer substantial precision losses in the face of a heavy-tailed (log) error term. If kurtosis > 3, there are substantial gains from least squares or robust regression
• Substantial gains in precision come from an estimator that matches the data-generating mechanism

The “zero” problem
• Problems with standard model
  • OLS may predict negative values
  • Zero mass may respond differently to covariates
  • These problems may be bigger when the mass at 0 is larger
• Alternative estimators
  • Ignore the problem
  • ln(c+k)
  • Tobit and Adjusted Tobit models (Heckman type model)
  • Two-part models

The log(c+k) solution
• Solution: add a positive constant k to costs
• Advantages
  • Easy
  • Log addresses skewness, constant deals with ln(0)
• Disadvantages
  • Zero mass may respond differently to covariates
  • Many set k=1 arbitrarily
  • Value of k matters, need grid search for optimum
  • Poorly behaved (Duan 1983)
  • Retransformation problem aggravated at low end

Latent Variables
• Sometimes binary dependent variable models are motivated through a latent variables model
• The idea is that there is an underlying variable y* that can be modeled as y* = b0 + xb + e, but we only observe y = 1 if y* > 0, and y = 0 if y* ≤ 0

The Tobit Model
• Can also have latent variable models that don’t involve binary dependent variables
• Say y* = xb + u, u|x ~ Normal(0, s²)
• But we only observe y = max(0, y*)
• The Tobit model uses MLE to estimate both b and s for this model
• Important to realize that b estimates the effect of x on y*, the latent variable, not y

Interpretation of the Tobit Model
• Unless the latent variable y* is what’s of interest, can’t just interpret the coefficient
• E(y|x) = Φ(xb/s)xb + sφ(xb/s), so
• ∂E(y|x)/∂xj = bj Φ(xb/s), where Φ and φ are the standard normal distribution and density functions
• If normality or homoskedasticity fail to hold, the Tobit model may be meaningless
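
The two Tobit expressions above can be evaluated with the standard library alone, as in this sketch. The function names and the numerical values are assumptions for illustration; xb denotes the linear predictor and s the standard deviation of the latent error.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_mean(xb: float, s: float) -> float:
    """E(y|x) for y = max(0, y*), with y* ~ Normal(xb, s^2)."""
    return norm_cdf(xb / s) * xb + s * norm_pdf(xb / s)


def tobit_marginal_effect(bj: float, xb: float, s: float) -> float:
    """dE(y|x)/dxj: the latent coefficient scaled by P(y > 0 | x)."""
    return bj * norm_cdf(xb / s)


# Hypothetical values: linear predictor 0.7, error s.d. 1.3, coefficient 2.0
print(tobit_mean(0.7, 1.3), tobit_marginal_effect(2.0, 0.7, 1.3))
```

The marginal effect on the observed outcome is always smaller in magnitude than bj, since it is scaled by a probability.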

Tobit fit to diabetes data

Tobit: some notes
• Only works well if dependent variable is censored Normal
• Places many restrictions on parameters, error term
• Hypersensitive to minor departures from normality
• (Almost) never recommended for health economics

Mixed models
• By the basic rule of expectation one can partition E(c|x) = Pr(c > 0|x) × E(c|c > 0, x)
• Thus, the expectation is split into two parts:
  1. Pr(any use or expenditures)
    • Full sample
    • Use logit or probit regression
  2. Level of use or expenditures
    • Conditional on c > 0 (subsample with c > 0)
    • Use appropriate continuous model
• Estimates of mean costs are obtained using Duan’s (1983) smearing estimator (mean of the exponentiated residuals)
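
A minimal sketch of how the two parts combine, assuming the second part is an OLS regression on log costs so that Duan's smearing factor applies. The function names, residuals and predictions below are hypothetical.

```python
import math


def smearing_factor(log_scale_residuals):
    """Duan's smearing factor: mean of the exponentiated log-scale residuals."""
    return sum(math.exp(r) for r in log_scale_residuals) / len(log_scale_residuals)


def two_part_mean(p_positive: float, log_prediction: float, smear: float) -> float:
    """E(c|x) = Pr(c > 0|x) * E(c|c > 0, x), with a log-OLS second part."""
    return p_positive * math.exp(log_prediction) * smear


# Hypothetical log-scale residuals from the second-part regression
smear_demo = smearing_factor([-0.5, 0.0, 0.5])
# Hypothetical patient: 54% chance of any cost, predicted log cost of 8.0
expected_cost = two_part_mean(0.54, 8.0, smear_demo)
print(smear_demo, expected_cost)
```

The smearing factor exceeds 1 whenever the residuals are not all zero, which is the correction that naive exponentiation of the log prediction would miss.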

Diabetes two-part model

Marginal effect in the two-part model
• Continuous variable x [equation not preserved]
• P(y>0) = 0.54
• E(Y|Y>0) = 7509.82
• For years of diabetes, this means βlogit = 0.025 and βols = 49.83
• Marginal effect is 208€ per year of diabetes

Weighted-regression models
• To adjust for censoring, the basic idea is to weight the costs by the inverse of the probability of being alive, mimicking the basic Horvitz-Thompson estimator.
• Thus, the Bang-Tsiatis (2000) basic estimator is µ̂ = (1/n) Σi δi M(Ti) / K(Ti), where δ is the censoring indicator, M(t) is the cumulative cost up to time t and K() is the Kaplan-Meier estimate
• Bang-Tsiatis (2000) proposed an improved version accounting for the cost history lost due to censoring, allowing the cost function M() and the Kaplan-Meier to be estimated in each of the K intervals, defined optimally according to Lin (1993)
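
The simple weighted estimator can be sketched in plain Python as below. Several conventions are assumptions made for this illustration: δ = 1 marks a patient whose cost history is complete (not censored), K() is the Kaplan-Meier estimate of the censoring distribution evaluated just before each follow-up time, and ties are handled naively. The function names and the four-patient data set are hypothetical.

```python
def km_censoring_survival(times, delta):
    """Kaplan-Meier estimate for the censoring time, returned as t -> K(t-)."""
    censoring_times = sorted({t for t, d in zip(times, delta) if d == 0})
    steps = []  # (time, value of K just after that time)
    surv = 1.0
    for u in censoring_times:
        at_risk = sum(1 for t in times if t >= u)
        n_censored = sum(1 for t, d in zip(times, delta) if t == u and d == 0)
        surv *= 1.0 - n_censored / at_risk
        steps.append((u, surv))

    def k_minus(t):
        value = 1.0
        for u, s in steps:
            if u < t:  # only censoring times strictly before t count
                value = s
        return value

    return k_minus


def weighted_mean_cost(times, delta, costs):
    """Inverse-probability-weighted mean cost over all n patients."""
    k_minus = km_censoring_survival(times, delta)
    n = len(times)
    # Censored patients (delta = 0) contribute zero to the sum but count in n
    return sum(d * m / k_minus(t) for t, d, m in zip(times, delta, costs)) / n


# Hypothetical data: the second patient is censored at time 2
times_demo = [1.0, 2.0, 3.0, 4.0]
delta_demo = [1, 0, 1, 1]
costs_demo = [100.0, 50.0, 300.0, 400.0]
print(weighted_mean_cost(times_demo, delta_demo, costs_demo))
```

Patients observed after the censoring at time 2 are weighted up by 1/K = 1.5, standing in for the censored patient; with no censoring the estimator reduces to the sample mean.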

Improving estimation (Jiang, 2004)
• The bootstrap confidence interval had much better coverage accuracy than the normal-approximation one when medical costs had a skewed distribution.
• When there is light censoring on medical costs (<25%), the bootstrap confidence interval based on the simple weighted estimator is preferred for its simplicity and good coverage accuracy.
• For heavily censored cost data (censoring rate >30%) with larger sample sizes (n>200), the bootstrap confidence intervals based on the partitioned estimator have superior performance in terms of both efficiency and coverage accuracy
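
As an illustration of the bootstrap idea only, the sketch below builds a percentile bootstrap interval for the plain mean of uncensored costs. This is a simplification: it is not the weighted or partitioned estimator discussed above, and the percentile method is one of several bootstrap variants, not necessarily the one used in the cited work. The function name, data and settings are hypothetical.

```python
import random


def bootstrap_mean_ci(costs, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean cost."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(costs)
    # Resample patients with replacement and recompute the mean each time
    means = sorted(sum(rng.choices(costs, k=n)) / n for _ in range(n_boot))
    lower = means[int((1.0 - level) / 2.0 * n_boot)]
    upper = means[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper


# Hypothetical skewed costs with a mass at zero and one very expensive patient
costs_boot = [0.0, 0.0, 0.0, 120.0, 300.0, 450.0, 800.0, 1500.0, 4000.0, 25000.0]
ci_lower, ci_upper = bootstrap_mean_ci(costs_boot)
print(ci_lower, ci_upper)
```

To bootstrap a censoring-adjusted estimator instead, the same resampling of patients would be applied to the full records (follow-up time, indicator and cost), recomputing the estimator on each resample.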

Censored estimation (diabetes cohort)

Survival models
• The cost function is defined as [equation not preserved]
• The hazard of having an “excess” of costs is modeled [equation not preserved], either avoiding (Cox’s model) or requiring (Weibull model) the full specification of the baseline λ0
• To avoid the assumption of proportional accumulation over time (Etzioni, 1999), an alternative model is the Aalen additive regression (Zigon, 2005) [equation not preserved], where the hazard rate is a linear combination of the variables x(c), and the α(c) are functions estimated from the data
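
For orientation, the textbook forms of the two hazard models named above, written on the cost scale c instead of time, are sketched below. These are assumed standard forms and not necessarily the exact expressions on the original slide; the Weibull baseline is shown in one common parametrization, and κ, ρ and p are notation introduced here.

```latex
% Proportional hazards on the cost scale (Cox: baseline left unspecified)
\lambda(c \mid x) = \lambda_0(c)\,\exp(x\beta)

% Weibull: fully specified baseline, one common parametrization
\lambda_0(c) = \kappa\,\rho\,c^{\rho - 1}

% Aalen additive regression: cost-varying coefficient functions
\lambda(c \mid x) = \alpha_0(c) + \sum_{j=1}^{p} \alpha_j(c)\,x_j(c)
```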

Survival approach: some notes
• Coefficients are interpretable as the “risk” of having costs greater than actual ones
• If proportionality does not hold, then
  • Baseline cost-hazard with strata
  • Partition of the costs axis
  • Model non-proportionality by cost-dependent covariates β(c)X = βX(c)
  • Refer to other models (accelerated failure or additive hazards)

Diabetes Full cohort

Issues and models in cost-analysis (X = satisfied, o = partially satisfied)

Estimates on the Molinette Cohort
• We compared the performance of the survival models with two “benchmarks” widely (and often inappropriately) used in the literature, OLS and the threshold-logit model, using the non-zero costs cohort
• Both the normality assumption (Shapiro-Wilk test p<0.0001) and the proportional hazards assumption (Grambsch-Therneau test p<0.001) were rejected

Covariates effects

Estimates of the mean

Cost profiling

Effect of covariates (Aalen model) on Λ(c)
