Utilising rank and DCE data to value health status on the ‘QALY’ scale using conventional and Bayesian methods. John Brazier and Theresa Cain with Aki Tsuchiya and Yaling Yang Health Economics and Decision Science, ScHARR, University of Sheffield, UK
Prepared for the CHEBS Focus Fortnight
Outline • Concerns with current cardinal methods for valuing health states • Problems in using ordinal data • Application of rank and DCE methods to valuing Asthma health states using conventional methods • Application of Bayesian methods to analysing DCE data • Implications for research and policy
Problems with cardinal methods for valuing health states • Time trade-off (TTO) and standard gamble (SG) are seen as cognitively complex tasks that may be too difficult for some respondents (e.g. children, the very elderly) • TTO values are contaminated by time preference, SG values by risk attitude and rating scale values by end-point bias (among other things) • Hence a role for ordinal methods (rank and discrete choice)
Ordinal tasks: Ranking and discrete choice experiments • Ranking - respondents are asked to order a set of health states from best to worst; traditionally used as a warm-up exercise prior to VAS/SG/TTO-based preference elicitation • Discrete choice experiments (DCE) - respondents are typically asked to choose between two health states (A and B)
Problems with using ordinal data to value health for QALYs • DCE and rank models estimate a latent health state utility value, but with arbitrary anchors • QALYs require health states to be valued on the full health (one) and being dead (zero) scale • Key problem is linking results of DCEs to the full health-dead scale
Previous work using ordinal data Ranking: • Early application of Thurstone’s method by Kind (1982) • Use of conditional logit on rank data by Salomon (2003) on EQ-5D and McCabe et al. (2005) on SF-6D and HUI2 – some success DCE: • DCE applications in health economics have mainly been concerned with the relative weight of different attributes of health care rather than with valuing health per se • DCE has been considered unsuitable for assessing cost effectiveness (because the utility scale is not comparable between studies)
Past attempts to apply DCE to valuing HRQoL • Hakim and Pathak (1999) applied DCE to valuing EQ-5D states • used ‘pick one’ from 12 choice sets (each containing 3 states plus dead) • Exploratory and did not produce weights • McKenzie et al (2001) estimated weights for asthma symptoms • no link to full health-dead scale • Viney et al (2004) included attributes for HRQoL and survival – but did not estimate health state values
Alternative approaches to using DCE The latent utility scale needs to be anchored on the full health-dead scale, and there are a number of ways to do this: • Value the ‘pits’ state externally by TTO/SG (Ratcliffe and Brazier, 2005) • Include a dead state in the pairwise choice set* • Use the question ‘is this a state worth living’ in the best-worst scaling method (Flynn et al., 2005) * Method used in this study
Background to AQLQ study • The Asthma Quality of Life Questionnaire (AQLQ), developed by Professor Juniper, is a condition-specific measure with 32 questions, each with 7 levels, covering 4 dimensions • A simplified health state classification, the AQL-5D, was developed from the AQLQ based on a sample of items covering 5 domains: concern, breathlessness, pollution and environment, sleep and activity
AQL-5D
• Feel concerned about having asthma: [1] None of the time [2] A little or hardly any of the time [3] Some of the time [4] Most of the time [5] All of the time
• Feel short of breath as a result of asthma: [1] None of the time [2] A little or hardly any of the time [3] Some of the time [4] Most of the time [5] All of the time
• Experience asthma symptoms as a result of air pollution: [1] None of the time [2] A little or hardly any of the time [3] Some of the time [4] Most of the time [5] All of the time
• Asthma interferes with getting a good night’s sleep: [1] None of the time [2] A little or hardly any of the time [3] Some of the time [4] Most of the time [5] All of the time
• Overall, the activities I have done have been limited: [1] Not at all [2] A little [3] Moderate or some [4] Extremely or very [5] Totally
Health state 32345
• Feel concerned about having asthma some of the time [3]
• Feel short of breath as a result of asthma a little or hardly any of the time [2]
• Experience asthma symptoms as a result of air pollution some of the time [3]
• Asthma interferes with getting a good night’s sleep most of the time [4]
• Overall, totally limited with all the activities done [5]
Valuation survey: sampling and interview • Representative sample of the adult general population invited to participate At the interview, each respondent: • Ranked health states from best to worst (7 AQLQ health states, full health (i.e. the best AQLQ state), the worst AQLQ state and immediate death) • Valued 8 AQLQ health states by time trade-off (York MVH variant) against a shorter time in full health • 100 health states were valued in this way across the sample
Methods: postal follow-up • Approx. 4 weeks after the interview, respondents received a DCE questionnaire in the post • An optimal statistical design for the DCE, based on level balance, orthogonality and minimum overlap, was produced by a programme in SAS (Huber and Zwerina, 1996) • 12 pairwise comparisons were produced and randomly allocated to two versions of the questionnaire, with 6 choices in each • Two additional pairs were presented to respondents, each comparing an AQL-5D state with dead
Statistical model for rank and DCE data
General model: $\mu_{ij} = f(\beta' x_{ij} + \Phi D + u_{ij})$
where
• $\mu_{ij}$ is the latent utility function of respondent $i$ for state $j$
• $x_{ij}$ is a vector of dummy explanatory variables, one for each level of each dimension of the classification; for example, $x_{32}$ denotes dimension $\alpha = 3$, level $\lambda = 2$
• $D$ is a dummy variable for the state of being dead, which takes the value 1 for being dead and 0 otherwise
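The dummy coding described in the general model can be sketched as follows. This is only an illustration under stated assumptions: level 1 of each dimension is taken as the reference category (consistent with results being reported for levels 2 to 5), the dead state is coded with all dimension dummies set to zero, and the names `encode_state` and `dummy_names` are hypothetical helpers, not part of the study.

```python
# Assumed ordering of the five AQL-5D dimensions in a state code such as "32345".
DIMENSIONS = ["concern", "breath", "pollution", "sleep", "activity"]
LEVELS = [2, 3, 4, 5]  # assumption: level 1 is the omitted reference category


def dummy_names():
    """Names of the 20 level dummies, e.g. 'concern2', ..., 'activity5'."""
    return [f"{dim}{level}" for dim in DIMENSIONS for level in LEVELS]


def encode_state(state):
    """Return (x, D): the dummy vector x and the dead dummy D for one state.

    `state` is a 5-digit AQL-5D code such as '32345', or the string 'dead'.
    Coding dead with all-zero x is an assumption made for this sketch.
    """
    if state == "dead":
        return [0] * (len(DIMENSIONS) * len(LEVELS)), 1
    if len(state) != len(DIMENSIONS) or any(c not in "12345" for c in state):
        raise ValueError(f"not a valid AQL-5D state: {state!r}")
    x = []
    for digit in state:
        level = int(digit)
        # One dummy per non-reference level; at most one is switched on.
        x.extend(1 if level == candidate else 0 for candidate in LEVELS)
    return x, 0


x, dead = encode_state("32345")
active = [name for name, value in zip(dummy_names(), x) if value == 1]
print(active, dead)
```

For state 32345 the active dummies are concern3, breath2, pollution3, sleep4 and activity5, and the dead dummy is zero.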
Modelling health state values Modelling: • TTO: individual level model (random effects) • DCE: random effects probit model • Ranking: rank ordered logit model Rescaling: • Re-scale by dividing the $\beta$ coefficient on each dimension level by the coefficient for being dead • The rescaled coefficients give predicted health state values on the same scale as TTO valuations, although the predicted values will not necessarily equal those obtained using the TTO technique
Results of valuation survey Rank/TTO interview: • 308 respondents (response rate 40%) • Representative in terms of gender, age, education • 2455 TTO valuations across 100 health states DCE: • 168 returned questionnaires (response rate 55%) • 1336 pairwise comparisons
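The rescaling step can be illustrated with a small sketch. The coefficients below are made-up numbers for illustration only, not estimates from the study. The sketch also assumes that full health is the reference state with latent utility zero, so that a health state's value is one minus the sum of its rescaled coefficients; the slides do not state this convention explicitly.

```python
def rescale(betas, phi_dead):
    """Divide each latent coefficient by the coefficient on the dead dummy."""
    if phi_dead == 0:
        raise ValueError("the dead coefficient must be non-zero")
    return {name: beta / phi_dead for name, beta in betas.items()}


def state_value(active_dummies, rescaled):
    """Value on the full health (1) to dead (0) scale.

    Assumption: full health has latent utility 0, so value = 1 - sum of
    rescaled decrements for the levels that describe the state.
    """
    return 1.0 - sum(rescaled[name] for name in active_dummies)


# Hypothetical latent-scale coefficients (NOT results from the study).
latent = {"concern3": -0.25, "activity5": -0.75}
phi = -1.25  # hypothetical coefficient on the dead dummy

rescaled = rescale(latent, phi)
value = state_value(["concern3", "activity5"], rescaled)
dead_value = 1.0 - phi / phi  # dead maps to zero by construction

print(round(value, 3), dead_value)
```

Under this convention a state whose summed latent decrement exceeds the dead coefficient in magnitude receives a negative value, i.e. it is valued as worse than dead.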
Results - impact of dimension level on TTO scores (individual level random effects model with main effects)
Dependent variable: TTO values; MAE = 0.051

Level   Concern   Breath    Pollution   Sleep     Activity
2       -0.047*   -0.024    -0.017      -0.013    -0.029
3       -0.064*   -0.045*   -0.028      -0.029    -0.044*
4       -0.074*   -0.107*   -0.063*     -0.054*   -0.139*
5       -0.095*   -0.116*   -0.099*     -0.069*   -0.164*

* statistically significant at the 0.05 level
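As a worked example, the main-effects coefficients reported on this slide can be summed for the example state 32345. The sketch only computes the total decrement; the model constant is not reported here, so turning the decrement into a predicted TTO value would require an assumption about the intercept (for instance, that it equals 1 for the best state). The function name `total_decrement` is a hypothetical helper.

```python
# Main-effects TTO coefficients as reported on the slide (level 1 is the reference).
TTO_COEF = {
    "concern":   {2: -0.047, 3: -0.064, 4: -0.074, 5: -0.095},
    "breath":    {2: -0.024, 3: -0.045, 4: -0.107, 5: -0.116},
    "pollution": {2: -0.017, 3: -0.028, 4: -0.063, 5: -0.099},
    "sleep":     {2: -0.013, 3: -0.029, 4: -0.054, 5: -0.069},
    "activity":  {2: -0.029, 3: -0.044, 4: -0.139, 5: -0.164},
}
ORDER = ["concern", "breath", "pollution", "sleep", "activity"]


def total_decrement(state):
    """Sum the coefficients for the levels in a 5-digit AQL-5D state code."""
    if len(state) != len(ORDER) or any(c not in "12345" for c in state):
        raise ValueError(f"not a valid AQL-5D state: {state!r}")
    total = 0.0
    for dim, digit in zip(ORDER, state):
        level = int(digit)
        if level > 1:  # level 1 contributes nothing
            total += TTO_COEF[dim][level]
    return total


decrement = total_decrement("32345")
print(round(decrement, 3))  # -0.334
```

If the intercept were assumed to be 1, this would correspond to a predicted value of about 0.666 for state 32345; that figure depends entirely on the assumed constant.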
Overall comparison • TTO model predicts observed TTO values best (lowest MAE) • Rank model predicts observed TTO values nearly as well as TTO model • DCE model is associated with largest difference from observed TTO values and seems to have a steeper gradient (i.e. more extreme values)
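The comparison above relies on the mean absolute error (MAE) between predicted and observed TTO values. A minimal sketch of that statistic follows, assuming MAE is computed across health states as the average absolute gap between observed and predicted values; the slides do not spell out the definition, and the numbers below are invented purely to exercise the function.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Made-up values for three hypothetical states (NOT data from the study).
observed = [0.90, 0.70, 0.40]
predicted = [0.85, 0.75, 0.50]

mae = mean_absolute_error(observed, predicted)
print(round(mae, 3))
```

The same function could be applied to the rescaled rank and DCE predictions to compare each model against the observed TTO values.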
Research questions 1. Is DCE really easier than TTO/SG or VAS? 2. Does DCE produce different estimates from TTO and SG? 3. Theoretical basis for using DCE rather than conventional TTO or SG 4. Basic DCE design issues 5. Analysis – mixed logit or Bayesian models 6. Does the dead dummy solve the problem?
Does including dead solve the problem? • A more natural solution is to include survival as an attribute – but this has a multiplicative relationship to QoL and so would require a far larger design • Using ‘dead’ requires the ‘pits’ health state of the classification to be considered worse than dead by some respondents – so it is not suitable for milder classifications • What about those who do not think any state is worse than dead (85% in this sample)? • For those who do not think any state is worse than dead, their data tell us nothing about their strength of preference for QoL compared to quantity of life • Are the 85% all non-traders? SF-6D (67%), HUI3 (33%) and EQ-5D (14%)