Evidence-based decision making. How would you like that done, Minister ?

Tony Ades, RSS Avon Group, February 7 2008. Academic Unit of Primary Health Care, Department of Community Based Medicine

Outline • If you were a Minister of Health, and DoH had a “decision-making machine (DMM)”, what properties would you want it to have ? • How is evidence used to make decisions ? • Some examples (good and bad) • Obstacles to building a DMM.

Evidence-Based Medicine • response to increasing amounts of evidence, arbitrary / selective use of evidence. • formal protocol inclusion / exclusion • formal synthesis (“meta-analysis”) of studies • Hierarchy of evidence • Objective: “to help clinicians make decisions”

Evidence-based decision-making … needs an evidence-based model
[Diagram, built up over four slides, linking: Evidence, Model, Costs, Benefits, Net Benefit, Decision]

How are health care decisions made?
1. Estimate the Costs C_j of each strategy j: costs of intervention, “downstream” treatment costs
2. Estimate the Benefits L_j of each strategy: quality-adjusted life years
3. Decide the “value” of an additional Life Year: £20,000 - £30,000
4. Work out the Net Benefit of each strategy j: NB_j = L_j × 20000 - C_j
5. Choose the strategy j* with the highest Net Benefit
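
The five steps above can be sketched in a few lines of Python. The strategy names, costs and QALY figures below are hypothetical placeholders chosen for illustration; only the Net Benefit formula and the £20,000 - £30,000 range come from the slide.

```python
# HYPOTHETICAL strategies: (name, cost C_j in GBP, benefit L_j in QALYs), per person.
# These numbers are illustrative only and do not come from the talk.
STRATEGIES = [
    ("no screening", 0.0, 0.000),
    ("targeted screening", 40.0, 0.003),
    ("universal screening", 90.0, 0.005),
]


def net_benefit(qalys, cost, value_per_qaly=20_000):
    """Step 4: NB_j = L_j * (value of a life year) - C_j."""
    return qalys * value_per_qaly - cost


def best_strategy(strategies, value_per_qaly=20_000):
    """Step 5: return the name of the strategy with the highest Net Benefit."""
    name, _, _ = max(
        strategies, key=lambda s: net_benefit(s[2], s[1], value_per_qaly)
    )
    return name


if __name__ == "__main__":
    for threshold in (20_000, 30_000):
        print(threshold, best_strategy(STRATEGIES, threshold))
```

With these made-up inputs the preferred strategy changes between the two ends of the £20,000 - £30,000 range, which is one reason the value placed on a life year has to be stated explicitly.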

Desirable properties of a decision making “machine” • Makes complete use of evidence • Checks consistency of evidence • Propagates uncertainty

Complete use of evidence • EBM-style Protocol for study inclusion / exclusion. Aim: all relevant evidence • Repeatability. ? > 95% overlap between different reviewers would help guarantee “completeness” • Wider inclusion: If parameters a and b are in a model, data on the product ab is relevant • But what counts as “relevant” ? Anything that reduces decision uncertainty is relevant

Consistency of evidence • All the studies informing the parameter a should agree. Pooling • The combined information on a, together with the combined information on b, should agree with the combined (independent) information on the product ab. External validation, “triangulation”, backward-propagation of evidence

Uncertainty propagation
Uncertainty in parameters (or data) is propagated through the model to be reflected in uncertainty in the decision.
Average Net Benefit over the joint distribution of parameters θ: E[NB_j] = E_θ[NB_j(θ)]
• Choose strategy j* with the highest Expected Net Benefit.
• “Probabilistic Decision Analysis”, usually done by “forward” Monte Carlo
• Tells the Decision Maker the probability that the optimal decision is wrong
• The distribution should be the joint posterior distribution of the parameters θ
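
A minimal sketch of “forward” Monte Carlo for two strategies, A (comparator) and B. The distributions for the QALY gain and extra cost of B are hypothetical, and they are drawn independently here for brevity; as the slide says, a real analysis should draw from the joint posterior distribution of θ.

```python
import random


def simulate(n_draws=20_000, value_per_qaly=20_000, seed=1):
    """Average Net Benefit over draws of theta and estimate Pr(optimal choice is wrong)."""
    rng = random.Random(seed)
    totals = {"A": 0.0, "B": 0.0}
    wins = {"A": 0, "B": 0}
    for _ in range(n_draws):
        # One draw of theta. HYPOTHETICAL distributions, not from the talk.
        qaly_gain = rng.gauss(0.004, 0.002)   # QALYs gained by B relative to A
        extra_cost = rng.gauss(50.0, 10.0)    # extra cost of B relative to A (GBP)
        nb = {"A": 0.0, "B": qaly_gain * value_per_qaly - extra_cost}
        for name, value in nb.items():
            totals[name] += value
        wins[max(nb, key=nb.get)] += 1        # which strategy is best for this draw
    expected = {name: total / n_draws for name, total in totals.items()}
    best = max(expected, key=expected.get)    # j*: highest Expected Net Benefit
    p_wrong = 1 - wins[best] / n_draws        # share of draws where j* is not best
    return expected, best, p_wrong


if __name__ == "__main__":
    expected, best, p_wrong = simulate()
    print(expected, best, round(p_wrong, 3))
```

The decision rests on the expected Net Benefit, while `p_wrong` reports how often the chosen strategy loses across the draws.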

What kind of machine can do this ? … • Complete use of evidence achieved through Bayesian or Likelihood methods. Likelihood quite difficult computationally • Consistency of evidence: detecting inconsistency is normal practice in statistics – but less so in decision modelling. Validation considered important, but formal inference seldom brought to bear • Uncertainty propagation : essentially Bayesian idea. Monte Carlo (MC) framework very convenient for decision modelling: hence simulation from a Bayesian posterior distribution. Or try to mimic through bootstrap ? Bayesian Markov chain MC, eg WinBUGS …very convenient

Completeness, consistency, uncertainty propagation: hardly controversial ! So, what’s the problem ?
• Some modellers are not following this script
• Script not always easy to follow
  • Understanding the data at all
  • Responding to inconsistencies
  • Inference for computationally complex models
Next: Some examples of what is done badly, what can be done that is good, and what is still difficult to do well

TOPIC 1: Chlamydia screening (… not following the script) • Most common STI. Treatable, but often ‘silent’ • Complications of chlamydia: Pelvic Inflammatory Disease, Ectopic Pregnancy, Infertility • => Prima facie case for screening. But is it cost-effective ? • Two UK studies: ClaSS (“pro-active”), HPA (“opportunistic”). Both UK Government funded. • Each produced models for CEA: an epidemiology / transmission model and a natural history model, embedded in a cost-effectiveness analysis.

Chlamydia complication rates - ClaSS study
• Pr(PID | chlamydia) based on the Uppsala Study.
• GP registers with screening information linked with hospital registers of PID.
• Pr(PID | chlamydia) estimate based on all PID, not just those screened +ve. Therefore not linked to chlamydia incidence / prevalence
  • Over-estimates, as it includes PID not caused by chlamydia
  • And under-estimates, as it misses PID treated by GPs
• Use of this estimate assumes (i) chlamydia incidence is the same in UK and Sweden (ii) the over- and under-estimation biases are known to exactly cancel each other out.
• Resulting estimate: about 3.6%

Chlamydia complication rates - HPA study • No single, evidence-based, estimate • 3 alternative scenarios: 1%, 10%, 30% • But authors observe that 10% is around the maximum credible UK complication rate, based on 1.7% pa PID incidence observed in GPRD • A “deterministic sensitivity analysis”. Useful to know that screening is not cost-effective if the complication rate is < 10% … but not if we have no idea what the true rate is ! • Ignores correlations between parameters …

Chlamydia complication rates: What is the evidence ?
Prospective:
• UK chlamydia prevalence p = 1.5% (Women aged 15-44, NATSAL 2000)
• Duration of an episode d = 120 days (recent meta-analysis)
• => UK chlamydia incidence h = p/d = 1.5% × 365.25/120 = 4.6% pa
• Probability of Pelvic Inflammatory Disease (PID) per episode of chlamydia: c = 0.15 (meta-analysis of prospective studies)
• => incidence of PID due to chlamydia: ch = 0.69% pa
Retrospective:
• UK incidence of all PID (GPRD): ch/B = 1.7% pa, but not all PID is caused by chlamydia
• Proportion of all PID due to chlamydia (retrospective study): B = 0.39
• => incidence of PID due to chlamydia: ch = 0.66% pa
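
The arithmetic on this slide can be checked directly. The sketch below uses only the point estimates quoted above and ignores their uncertainty; the variable names are illustrative. Without intermediate rounding the prospective route gives about 0.68% pa (the slide's 0.69% uses h rounded to 4.6%).

```python
DAYS_PER_YEAR = 365.25

# Prospective route
prevalence = 0.015          # p: UK chlamydia prevalence (NATSAL 2000)
duration_days = 120         # d: duration of an episode
pid_per_episode = 0.15      # c: Pr(PID) per episode of chlamydia

incidence = prevalence * DAYS_PER_YEAR / duration_days   # h = p/d, per year
pid_prospective = pid_per_episode * incidence            # ch

# Retrospective route
all_pid_incidence = 0.017   # ch/B: incidence of all PID (GPRD), per year
fraction_chlamydial = 0.39  # B: proportion of all PID due to chlamydia

pid_retrospective = all_pid_incidence * fraction_chlamydial  # ch

if __name__ == "__main__":
    print(f"chlamydia incidence h      = {incidence:.1%} pa")
    print(f"PID via prospective data   = {pid_prospective:.2%} pa")
    print(f"PID via retrospective data = {pid_retrospective:.2%} pa")
```

The two independent routes land within a few hundredths of a percentage point of each other, which is the kind of “triangulation” the consistency slide asks for.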

Towards complete use of evidence • Meta-analyses of prevalence, duration. • Meta-analyses of prospective studies of PID • Several retrospective studies of PID • Similar prospective & retrospective studies on other complications: ectopic pregnancy and infertility • Further prospective studies following PID forward to infertility, and further retrospective studies looking back from infertility to PID. • Large network of evidence, to be assembled, synthesised, checked for consistency • BUT some difficulties interpreting literature, adjusting for biases, understanding recruitment into studies, taking test sensitivity / specificity into account … etc

TOPIC 2: How it should be done ! Antenatal screening for HIV: universal or targeted ?
a = Proportion Sub-Saharan African (SSA)
b = Proportion Injecting Drug Users (IDU)
1-a-b = Proportion ‘Low-Risk’
c = HIV prevalence SSA
d = HIV prevalence IDU
e = HIV prevalence Rest (‘Low Risk’): no direct evidence

Complete use of evidence: HIV in London, 1998
Data item                              Function of parameters     %      Num    Denom
Proportion Sub-Saharan African (SSA)   a                          10.6   11044  104577
Proportion Injecting Drug Users (IDU)  b                          1.36   12     882
HIV prevalence SSA                     c                          1.63   252    15428
HIV prevalence IDU                     d                          2.11   10     473
HIV prevalence non-SSA                 [db + e(1-a-b)]/(1-a)      0.054  74     136139
Non-SSA is a mixture of Low-Risk & IDU: its HIV prevalence is a weighted average of Low-Risk prevalence & IDU prevalence … which provides information on e, the Low-Risk prevalence
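
To see how the non-SSA data item “provides information on e”, the weighted-average relation can be solved for e using the raw proportions from the table. This is only a point-estimate sketch under the stated model; the approach described in the talk is a full synthesis that also carries the sampling uncertainty in every item.

```python
# Raw proportions from the table (numerator / denominator).
a = 11044 / 104577            # proportion SSA
b = 12 / 882                  # proportion IDU
d = 10 / 473                  # HIV prevalence in IDU
non_ssa_prev = 74 / 136139    # HIV prevalence in non-SSA women


def low_risk_prevalence(a, b, d, non_ssa_prev):
    """Solve non_ssa_prev = [d*b + e*(1-a-b)] / (1-a) for e."""
    return (non_ssa_prev * (1 - a) - d * b) / (1 - a - b)


e = low_risk_prevalence(a, b, d, non_ssa_prev)

if __name__ == "__main__":
    print(f"implied Low-Risk prevalence e = {e:.4%}")
```

The implied e is small and positive; because b and d each rest on very few events (12 and 10), a point solution like this is fragile, which is why the uncertainty needs to be propagated rather than ignored.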

Complete use of evidence: another example
Data item                              Function of parameters     %      Num    Denom
Proportion Sub-Saharan African (SSA)   a                          10.6   11044  104577
Proportion Injecting Drug Users (IDU)  b                          1.36   12     882
HIV prevalence SSA                     c                          1.63   252    15428
HIV prevalence IDU                     d                          2.11   10     473
HIV prevalence non-SSA                 [db + e(1-a-b)]/(1-a)      0.054  74     136139
Overall HIV prevalence                 ac + db + e(1-a-b)         0.187  254    102287
Overall prevalence is a weighted average of all 3 risk groups … Possibility of INCONSISTENCY

… including proportion already diagnosed in the HIV model…

… the larger data set … 12 independent data items : 9 parameters
1. Proportion Sub-Saharan African (SSA): a
2. Proportion Injecting Drug Users (IDU): b
3. HIV prevalence SSA: c
4. HIV prevalence IDU: d
5. HIV prevalence non-SSA: [db + e(1-a-b)]/(1-a)
6. Overall HIV prevalence: ac + db + e(1-a-b)
7. Diagnosed HIV in SSA as a proportion of all diagnosed HIV: fca/[fca + gdb + he(1-a-b)]
8. Diagnosed HIV in IDU as a proportion of non-SSA diagnosed HIV: gdb/[gdb + he(1-a-b)]
9. Overall proportion diagnosed: [fca + gdb + he(1-a-b)] / [ca + db + e(1-a-b)]
10. Proportion of infected IDUs diagnosed: g
11. Proportion serotype B in infected women (SSA): w
12. Proportion serotype B in infected women (Non-SSA): db/[db + e(1-a-b)] + we(1-a-b)/[db + e(1-a-b)]

Uncertainty propagation from posterior
Uncertainty in the data is propagated back to the model parameters, and then through the model to the decision. The decision is based on Expected Incremental Net Benefit (INB), the difference between the Expected Net Benefit of Universal and of Targeted Screening.
inb = 105000 * (1-a-b) * (M * e * (1-h) - T * (1 - e*h))
• 105000 is the number of pregnant women in London per year
• (1-a-b) is the proportion Low-Risk
• M is the Net Benefit of diagnosing an Infected Mother
• T is the cost of an HIV test
If Exp[INB] is +ve, choose Universal; if -ve, Targeted.
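
A sketch of how the INB formula could be evaluated over parameter draws. Everything except the formula and the 105000 figure is an assumption made for illustration: the values of M and T, the Beta distributions standing in for posterior samples, and the reading of h as the proportion of infected Low-Risk women already diagnosed (by analogy with g). In the talk the draws come from a joint posterior (eg via WinBUGS), not from independent distributions like these.

```python
import random

N_WOMEN = 105_000  # pregnant women in London per year (from the slide)


def inb(a, b, e, h, M, T):
    """Incremental Net Benefit of Universal over Targeted screening."""
    return N_WOMEN * (1 - a - b) * (M * e * (1 - h) - T * (1 - e * h))


def summarise(draws, M, T):
    """Return (Expected INB, Pr(INB > 0)) over a list of parameter draws."""
    values = [inb(M=M, T=T, **theta) for theta in draws]
    expected = sum(values) / len(values)
    p_positive = sum(v > 0 for v in values) / len(values)
    return expected, p_positive


# HYPOTHETICAL stand-in for posterior draws, sampled independently.
rng = random.Random(0)
draws = [
    {
        "a": rng.betavariate(11044, 104577 - 11044),  # from the SSA counts
        "b": rng.betavariate(12, 882 - 12),           # from the IDU counts
        "e": rng.betavariate(20, 90_000),             # assumed
        "h": rng.betavariate(5, 5),                   # assumed
    }
    for _ in range(5_000)
]

# M and T are assumed values in GBP, not figures from the talk.
expected, p_positive = summarise(draws, M=50_000.0, T=3.0)

if __name__ == "__main__":
    print(f"Exp[INB] = {expected:,.0f}   Pr(INB > 0) = {p_positive:.3f}")
```

The decision rule uses the sign of `expected`, while `p_positive` plays the role of the “Probability INB +ve” figure reported alongside it.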

Uncertainty propagation [Figure: plots of h, e and inb] Probability INB +ve = 0.974. Choose “universal” with confidence

TOPIC 3: Inconsistency - it’s not so easy after all. Highlighted items fit poorly [slide repeats the list of 12 data items above; the highlighting is not preserved in this text]

Information on b and information on d conflicting with information on bd [slide repeats the same list of 12 data items]

How to deal with inconsistency ?
DETECTION of inconsistency is based on statistical methods. But identifying the SOURCE of inconsistency is NOT a statistical issue:
• Synthesise, check, detect, reconsider evidence.
• Then drop or bias-adjust the ‘suspect’ data item.
Post-hoc: danger of abuse. Final estimates will depend on which item is dropped or adjusted.
• Before synthesis, critically examine every item of information.
• Elicit expert opinion on bias and relevance of each item
• Devise bias priors for each item
• Overall evidence synthesis
Works well as a two-stage process in simple pair-wise meta-analysis (Turner & Spiegelhalter). Identifiability problems, convergence problems …

How often is this done ? Real examples of Bayesian synthesis based on “all available evidence” ? • DoH estimates of HIV prevalence • DoH estimates of HCV prevalence • NHS / HTA report on antenatal screening for Early Onset neonatal Group B Strep • Several NICE analyses: mixed treatment comparisons. (Sort of)

TOPIC 4: Things we cannot yet do: Uncertainty propagation in STI models • Infectious disease models have “dynamic” elements. (Lower incidence => lower prevalence => lower incidence) • “Compartmental” models. Susceptible, Infected, Immune states. Differential equations. • Typically, no formal parameter estimation by ML or Bayesian methods. • Instead “iterative recalibration”, ie tweaking. • “Scenario”-type sensitivity analysis. No propagation of uncertainty, no formal inference …
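
To illustrate the “dynamic” feedback between prevalence and incidence, here is a deliberately simplified two-state (Susceptible / Infected) compartmental model integrated with Euler steps. It omits the Immune state and everything about partnerships, and the rates `beta` and `gamma` are hypothetical; it is not the ClaSS or HPA model.

```python
def sis_prevalence(beta, gamma, i0=0.01, dt=0.1, steps=5000):
    """Euler integration of dI/dt = beta*I*(1-I) - gamma*I.

    I is the infected fraction, beta the transmission rate and gamma the
    recovery rate (all per unit time). Returns I after steps*dt time units.
    """
    i = i0
    for _ in range(steps):
        new_infections = beta * i * (1 - i)  # incidence depends on current prevalence
        recoveries = gamma * i
        i += dt * (new_infections - recoveries)
    return i


if __name__ == "__main__":
    # Lowering transmission lowers incidence, hence prevalence, hence incidence again.
    for beta in (0.5, 0.4, 0.3):
        print(beta, round(sis_prevalence(beta, gamma=0.25), 4))
```

For beta > gamma this simple model settles at prevalence 1 - gamma/beta, so a modest change in a rate parameter shifts the long-run outcome non-linearly. That sensitivity is part of why “tweaking” parameters by hand is a poor substitute for formal estimation.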

TOPIC 5: Things we cannot yet do: Uncertainty propagation in STI models • STI models ESPECIALLY difficult. As well as the compartmental model, ALSO need to capture rate of formation / break-up of partnerships, numbers of concurrent partners. • ClaSS and HPA used “microsimulation” models. For each set of parameter values, several thousand individuals form relationships, acquire chlamydia, develop complications etc, and are tracked over a period of years until they stabilise. • Computer intensive. Neither ML nor Bayes solutions currently available: so no uncertainty propagation

Why does this matter ?
• Heroic & time-consuming deterministic SA required.
• Conclusions like “Screening not CE if PID rate < 10%” are not secure.
  (a) We need to know Pr(PID rate < 10%)
  (b) Given data on PID incidence, the PID progression rate may be “high” if chlamydia incidence is assumed low, or “low” if chlamydia incidence is assumed high.
• Given (indirect) data on chlamydia incidence, chlamydia prevalence is highly correlated with chlamydia duration. But the chlamydia duration parameter is highly correlated with the transmission rate (per sexual contact) … which is correlated with the concurrency rate … etc

Methods research needed • How to formulate STI models in a form where they are tractable enough to use Bayesian MCMC OR • Develop Bayesian computation methods so that they can accommodate current formulations of STI models.

Summary • We do know what counts as “evidence-based decision making”: completeness, consistency, uncertainty. eg NICE • Some people have not been trying hard enough ! • Not always easy. More research needed on: • How to respond to inconsistent evidence • How to do proper statistical inference (and so uncertainty propagation) with infectious disease models, especially STI models

Multi-Parameter Evidence Synthesis page: • Slides, papers, programs: http://www.bristol.ac.uk/cobm/research/mpes
