Explore the use of parametric modelling for cost data analysis, comparing Lognormal and Gamma distributions. Simulation results on bias, coverage probability, and RMSE to guide efficient estimator selection. Includes empirical cost distribution summary and commentary on model assumptions.
Parametric modelling of cost data: some simulation evidence
Andrew Briggs, University of Oxford
Richard Nixon, MRC Biostatistics Unit, Cambridge
Simon Dixon, University of Sheffield
Simon Thompson, MRC Biostatistics Unit, Cambridge
2003 CHEBS Seminar, Friday 7th November
Parametric modelling of cost data: Background
• Cost data are typically non-normally distributed, with high skew and kurtosis
• Arithmetic mean cost is of interest to policy makers
• Central Limit Theorem ensures sample mean is consistent estimator
• Commentators have proposed parametric modelling of cost data to improve efficiency
• In particular, Lognormal distribution commonly advocated
• Alternatively, Gamma distribution is an increasingly popular choice
Parametric modelling of cost data: Choice of estimator
• If data are Lognormal, an efficient estimator of mean cost is exp(lm + lv/2), where lm and lv are the mean and variance of the log-transformed costs
• If data are Gamma distributed, the maximum likelihood estimate of the population mean is the sample mean
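The two estimators on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the example costs are hypothetical, and the use of the n-1 divisor for lv is an assumption, since the slide does not say which variance estimate is intended.

```python
import math
import statistics


def lognormal_mean_estimate(costs):
    """Estimate mean cost as exp(lm + lv/2) from log-transformed costs.

    lm is the sample mean of log cost and lv the sample variance of log cost.
    Assumption: lv uses the n-1 divisor; the slide does not specify this.
    """
    logs = [math.log(c) for c in costs]
    lm = statistics.fmean(logs)
    lv = statistics.variance(logs)
    return math.exp(lm + lv / 2.0)


def sample_mean(costs):
    """Arithmetic mean of the raw costs (the Gamma MLE of the population mean)."""
    return statistics.fmean(costs)


# Hypothetical, strictly positive costs used only to exercise the functions.
example_costs = [120.0, 250.0, 310.0, 480.0, 950.0, 2300.0]
print(sample_mean(example_costs))
print(lognormal_mean_estimate(example_costs))
```

Both functions require strictly positive costs for the log transform; zero-cost observations would need separate handling that the slides do not discuss.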
Parametric distributions: Simulation experiment
• Lognormal / Gamma distributions
• Population mean was set to be 1000
• Five choices of coefficient of variation (CoV = 0.25, 0.5, 1.0, 1.5, 2.0) to define distribution parameters
• Samples of five different sizes (n = 20, 50, 200, 500, 2000) drawn from each distribution for each CoV
• 2 x 5 x 5 = 50 experiments
• Bias, coverage probability and RMSE all recorded
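A minimal sketch of one cell of this design, using only the Python standard library, might look as follows. The parameter mappings from mean and CoV are the standard ones for each distribution. Everything else is an assumption made for illustration: the number of replications, the seed, the function names, and the normal-approximation 95% interval used for coverage are not given in the slides. Only the sample mean is evaluated here.

```python
import math
import random
import statistics


def lognormal_params(mean, cov):
    """Return (mu, sigma) of log cost for a lognormal with given mean and CoV."""
    sigma2 = math.log(1.0 + cov ** 2)        # CoV^2 = exp(sigma^2) - 1
    mu = math.log(mean) - sigma2 / 2.0       # mean = exp(mu + sigma^2 / 2)
    return mu, math.sqrt(sigma2)


def gamma_params(mean, cov):
    """Return (shape, scale) for a gamma with given mean and CoV."""
    shape = 1.0 / cov ** 2                   # CoV = 1 / sqrt(shape)
    scale = mean * cov ** 2                  # mean = shape * scale
    return shape, scale


def draw_sample(dist, mean, cov, n, rng):
    """Draw n costs from the named distribution ('lognormal' or 'gamma')."""
    if dist == "lognormal":
        mu, sigma = lognormal_params(mean, cov)
        return [rng.lognormvariate(mu, sigma) for _ in range(n)]
    shape, scale = gamma_params(mean, cov)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def run_experiment(dist, cov, n, true_mean=1000.0, reps=500, seed=1):
    """Bias, RMSE and 95% coverage of the sample mean for one experiment.

    Assumptions: reps, seed and the 1.96 * SE interval are illustrative only.
    """
    rng = random.Random(seed)
    errors = []
    covered = 0
    for _ in range(reps):
        costs = draw_sample(dist, true_mean, cov, n, rng)
        estimate = statistics.fmean(costs)
        se = statistics.stdev(costs) / math.sqrt(n)
        errors.append(estimate - true_mean)
        if estimate - 1.96 * se <= true_mean <= estimate + 1.96 * se:
            covered += 1
    bias = statistics.fmean(errors)
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    return {"bias": bias, "rmse": rmse, "coverage": covered / reps}


# One of the 50 cells: Gamma data, CoV = 0.5, n = 50.
result = run_experiment("gamma", cov=0.5, n=50, reps=200)
print(result)
```

Looping `run_experiment` over both distributions, the five CoV values and the five sample sizes would reproduce the 2 x 5 x 5 grid; the larger sample sizes are slow in pure Python, so a vectorised library may be preferable in practice.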
Empirical cost distributions: Summary statistics for 3 data sets [table: Raw cost; Log transformed cost]
Empirical cost distributions: Data set 1: CPOU [figure panels: Raw cost; Log transformed cost]
Empirical cost distributions: Data set 2: IV Fluids [figure panels: Raw cost; Log transformed cost]
Empirical cost distributions: Data set 3: Paramedics [figure panels: Raw cost; Log transformed cost]
Parametric cost modelling: Comments & conclusions
• “All models are wrong” (Box 1976)
• “No data are normally distributed” (Nester 1996)
• Costs are estimated from resource use times unit cost
• Any parametric assumption relating to costs is at best an approximation
• Simulations confirm that there are efficiency gains if appropriate distribution is chosen
• But incorrect assumptions can lead to very misleading conclusions
• Sample mean performs well and is unlikely to lead to inappropriate inference
• Only when there are sufficient data to permit detailed modelling is the choice of an alternative estimator warranted
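The point about incorrect assumptions can be illustrated with a small sketch that applies the lognormal-based estimator exp(lm + lv/2) to data that are actually Gamma distributed, alongside the sample mean. The settings below (CoV = 2.0, n = 200, 100 replications, the seed, and the n-1 divisor for lv) are assumptions chosen for illustration and are not results from the presentation.

```python
import math
import random
import statistics

TRUE_MEAN = 1000.0   # population mean used in the simulation design
COV = 2.0            # assumed coefficient of variation for this illustration
N = 200              # assumed sample size
REPS = 100           # assumed number of replications


def lognormal_mean_estimate(costs):
    """exp(lm + lv/2) computed from log-transformed costs (n-1 divisor assumed)."""
    logs = [math.log(c) for c in costs]
    return math.exp(statistics.fmean(logs) + statistics.variance(logs) / 2.0)


rng = random.Random(2003)
shape = 1.0 / COV ** 2           # gamma shape implied by the CoV
scale = TRUE_MEAN * COV ** 2     # gamma scale so that shape * scale = mean

sample_means = []
lognormal_estimates = []
for _ in range(REPS):
    # Data are Gamma, so the lognormal assumption is deliberately wrong here.
    costs = [rng.gammavariate(shape, scale) for _ in range(N)]
    sample_means.append(statistics.fmean(costs))
    lognormal_estimates.append(lognormal_mean_estimate(costs))

median_sample_mean = statistics.median(sample_means)
median_lognormal = statistics.median(lognormal_estimates)
print("median sample mean:", median_sample_mean)
print("median lognormal-based estimate:", median_lognormal)
```

Under these assumed settings the lognormal-based estimate sits far above the true mean of 1000, while the sample mean stays close to it. Other CoV values and sample sizes may behave differently.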