Parametric modelling of cost data: some simulation evidence
Andrew Briggs, University of Oxford
Richard Nixon, MRC Biostatistics Unit, Cambridge
Simon Dixon, University of Sheffield
Simon Thompson, MRC Biostatistics Unit, Cambridge
2003 CHEBS Seminar, Friday 7th November
Parametric modelling of cost data: Background
• Cost data are typically non-normally distributed, with high skew and kurtosis
• Arithmetic mean cost is of interest to policy makers
• Central Limit Theorem ensures the sample mean is a consistent estimator
• Commentators have proposed parametric modelling of cost data to improve efficiency
• In particular, the Lognormal distribution is commonly advocated
• Alternatively, the Gamma distribution is an increasingly popular choice
Parametric modelling of cost data: Choice of estimator
• If data are Lognormal, an efficient estimator of mean cost is exp(lm + lv/2), where lm and lv are the mean and variance of the log-transformed costs
• If data are Gamma distributed, the maximum likelihood estimate of the population mean is the sample mean
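The two estimators named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the use of the n-1 (sample) variance for lv is an assumption, since the slide does not say which variance is meant.

```python
import math
import statistics


def lognormal_mean_estimator(costs):
    """Estimate mean cost as exp(lm + lv/2) from log-transformed costs.

    lm is the mean of log cost and lv its variance. Using the n-1 sample
    variance here is an assumption; the slide does not specify it.
    """
    logs = [math.log(c) for c in costs]
    lm = statistics.fmean(logs)
    lv = statistics.variance(logs)
    return math.exp(lm + lv / 2.0)


def sample_mean_estimator(costs):
    """Arithmetic mean cost; also the Gamma maximum likelihood estimate of the mean."""
    return statistics.fmean(costs)


if __name__ == "__main__":
    # Hypothetical, right-skewed cost values used purely for illustration.
    costs = [120.0, 250.0, 310.0, 480.0, 900.0, 2600.0]
    print("lognormal estimator:", lognormal_mean_estimator(costs))
    print("sample mean:        ", sample_mean_estimator(costs))
```

The two numbers differ on any finite sample; how much they differ depends on how well the lognormal assumption fits the data.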
Parametric distributions: Simulation experiment
• Lognormal / Gamma distributions
• Population mean was set to be 1000
• Five choices of coefficient of variation (CoV = 0.25, 0.5, 1.0, 1.5, 2.0) to define distribution parameters
• Samples of five different sizes (n = 20, 50, 200, 500, 2000) drawn from each distribution for each CoV
• 2 x 5 x 5 = 50 experiments
• Bias, coverage probability and RMSE all recorded
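The design above can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the parameterisations use the standard moment identities (Lognormal: sigma^2 = ln(1 + CoV^2), mu = ln(mean) - sigma^2/2; Gamma: shape = 1/CoV^2, scale = mean x CoV^2), only the sample mean is evaluated, coverage uses a normal-approximation 95% interval, and the replication count and seed are hypothetical choices.

```python
import math
import random
import statistics

POP_MEAN = 1000.0
COVS = [0.25, 0.5, 1.0, 1.5, 2.0]
SAMPLE_SIZES = [20, 50, 200, 500, 2000]
DISTRIBUTIONS = ["lognormal", "gamma"]


def lognormal_params(mean, cov):
    """Return (mu, sigma) on the log scale giving the requested mean and CoV."""
    sigma2 = math.log(1.0 + cov ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def gamma_params(mean, cov):
    """Return (shape, scale) giving the requested mean and CoV."""
    shape = 1.0 / cov ** 2
    scale = mean * cov ** 2
    return shape, scale


def draw_sample(dist, cov, n, rng):
    """Draw n costs from the named distribution with mean POP_MEAN."""
    if dist == "lognormal":
        mu, sigma = lognormal_params(POP_MEAN, cov)
        return [rng.lognormvariate(mu, sigma) for _ in range(n)]
    shape, scale = gamma_params(POP_MEAN, cov)
    return [rng.gammavariate(shape, scale) for _ in range(n)]


def run_experiment(dist, cov, n, reps=1000, seed=0):
    """Record bias, RMSE and coverage of the sample mean for one design cell.

    reps and seed are illustrative; the 95% interval is a normal
    approximation (estimate +/- 1.96 standard errors), which is an assumption.
    """
    rng = random.Random(seed)
    errors = []
    covered = 0
    for _ in range(reps):
        sample = draw_sample(dist, cov, n, rng)
        estimate = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / math.sqrt(n)
        errors.append(estimate - POP_MEAN)
        if abs(estimate - POP_MEAN) <= 1.96 * std_err:
            covered += 1
    bias = statistics.fmean(errors)
    rmse = math.sqrt(statistics.fmean([e * e for e in errors]))
    return {"bias": bias, "rmse": rmse, "coverage": covered / reps}


# The full 2 x 5 x 5 grid of experiments.
GRID = [(d, c, n) for d in DISTRIBUTIONS for c in COVS for n in SAMPLE_SIZES]

if __name__ == "__main__":
    for dist, cov, n in GRID:
        print(dist, cov, n, run_experiment(dist, cov, n, reps=200))
```

Swapping the estimator inside `run_experiment` (for example, for exp(lm + lv/2)) would allow the same cells to be compared across estimators.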
Empirical cost distributions: Summary statistics for 3 data sets [table: raw cost and log transformed cost]
Empirical cost distributions: Data set 1: CPOU [figure: raw cost and log transformed cost]
Empirical cost distributions: Data set 2: IV Fluids [figure: raw cost and log transformed cost]
Empirical cost distributions: Data set 3: Paramedics [figure: raw cost and log transformed cost]
Parametric cost modelling: Comments & conclusions
• “All models are wrong” (Box 1976)
• “No data are normally distributed” (Nester 1996)
• Costs are estimated from resource use times unit cost
• Any parametric assumption relating to costs is at best an approximation
• Simulations confirm that there are efficiency gains if an appropriate distribution is chosen
• But incorrect assumptions can lead to very misleading conclusions
• The sample mean performs well and is unlikely to lead to inappropriate inference
• Only when there are sufficient data to permit detailed modelling is the choice of an alternative estimator warranted