METHODS FOR SYNTHESISING EVIDENCE FROM STUDIES EVALUATING DIAGNOSTIC PERFORMANCE OF A MEDICAL TEST FOR ECONOMIC DECISION MODELLING Alex Sutton & Nicola Cooper Centre for Biostatistics and Genetic Epidemiology, Department of Health Sciences, University of Leicester, UK. Acknowledgements: Steve Goodacre (University of Sheffield) Jo Lord (NICE)
BACKGROUND • Increasingly decision models are being developed to inform complex clinical/economic decisions • e.g. NICE technology appraisals • Decision models provide: • Explicit quantitative & systematic approach to decision making • Compares at least 2 alternatives • Useful way of synthesising evidence from multiple sources (e.g. effectiveness data from trials, adverse event rates from observational studies, etc.)
BACKGROUND • Decision modelling techniques commonly used for: • i) Extrapolation of primary data beyond endpoint of a trial, • ii) Indirect comparisons when no ‘head-to-head’ trials • iii) Investigation of how cost-effectiveness of clinical strategies/interventions changes with values of key parameters (often not observable in primary data analysis), • iv) Linking intermediate endpoints to ultimate measures of health gain (e.g. QALYs) • v) Incorporation of country specific data relating to disease history and management.
BACKGROUND • Economic decision models are more established for the evaluation of medical interventions than for the evaluation of diagnostic tests • Evaluation of diagnostic tests: • Addresses how to get the appropriate treatments to the appropriate people • Methodologically more challenging • AIM: To consider how evidence from diagnostic studies should be synthesised and incorporated into economic decision models
OUTLINE • Comprehensive decision modelling • Clinical evaluation of diagnostic tests • Meta-analysis of diagnostic tests • Economic evaluation of diagnostic tests • Putting it all together: Economic decision model for deep vein thrombosis (DVT) • Discussion
EVIDENCE-BASED MODELS • Decision models contain many unknown parameters & evidence may include published data, controlled trial data, observational study data, or expert knowledge. • Need to utilise/synthesise available evidence • Model parameters can include: • clinical effectiveness, • costs, • disease progression rates, & • utilities
EVIDENCE-BASED MODELS • Evidence-based models – Require systematic methods for evidence synthesis to estimate model parameters with appropriate levels of uncertainty • “Two-stage” process - evidence synthesis performed in statistical computer package (e.g. Stata) & pooled estimate input into a spreadsheet model (e.g. EXCEL) often without uncertainty • COMPARED TO • Single comprehensive framework - incorporating evidence synthesis, data manipulation & model evaluation within one coherent framework
EVIDENCE-BASED MODELS • Advantages of the single comprehensive modelling framework compared to the 2-stage approach: • Transparent framework, as all analysis is within one computer programme • Facilitates sensitivity analysis & updating • Distribution for pooled result(s) estimated from the evidence synthesis is transformed into the appropriate format & input into the model - no distributional assumption necessary
COMPREHENSIVE DECISION MODEL FRAMEWORK [Flow diagram: DATA SOURCES (RCT1, RCT2, RCT3, OBS1, OBS2, ROUTINE, EXPERT) feed EVIDENCE SYNTHESIS (meta-analysis, generalised synthesis, Bayes theorem, opinion pooling, in combination), which supplies MODEL INPUTS (clinical effect, adverse events, cost, utility) to the DECISION MODEL]
MCMC SIMULATION • Replacing analytical (closed form) methods by simulation • Monte Carlo (MC) • Applied extensively in decision modelling using software which allows sampling from a wide variety of distributions. Also termed probabilistic sensitivity analysis • Markov chain Monte Carlo (MCMC) • Used when not possible to derive posterior distribution algebraically; i.e. provides a means of sampling from posterior distribution even when form of that distribution unknown
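A minimal sketch (not from the original slides) of the plain Monte Carlo idea behind probabilistic sensitivity analysis: each uncertain parameter is drawn from an assumed distribution and propagated through a toy cost model. All parameter values and distributions below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims = 10_000

# Probabilistic sensitivity analysis by plain Monte Carlo: every uncertain
# parameter is sampled from an assumed distribution instead of a point value.
p_event = rng.beta(60, 740, n_sims)                        # baseline event risk
rel_risk = np.exp(rng.normal(np.log(0.5), 0.1, n_sims))    # treatment effect (RR)
cost_event = rng.gamma(shape=100, scale=20, size=n_sims)   # cost if the event occurs

# Propagate the samples through a deliberately simple cost model
cost_control = p_event * cost_event
cost_treated = p_event * rel_risk * cost_event
incremental = cost_treated - cost_control

print(f"Mean incremental cost: {incremental.mean():.2f}")
print("95% interval:", np.percentile(incremental, [2.5, 97.5]).round(2))
```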
ADVANTAGES OF BAYESIAN METHODS FOR DECISION MODELLING • Flexible framework for complex models • Incorporation of greater parameter uncertainty (e.g. allows for the fact that between-study precision in the meta-analysis is estimated from the data) • Full allowance made for potential inter-relationships between all parameters in both the decision model & the meta-analysis • Incorporation of expert opinion directly, or regarding the relative credibility of different data sources • Can make direct probability statements, such as the probability that a new treatment is cost-effective (CEACs) • WinBUGS - freely available specialist Bayesian software • http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
EXAMPLE: SIMPLE DECISION TREE • Cost implications of using prophylactic antibiotics to prevent wound infection following caesarean section • Current rate of wound infection in UK taken from large registry (p1)=6000/75000 (8%) • Want to estimate p2 for UK hospitals
METHOD OUTLINE • Cochrane review of 61 RCTs(Smaill & Hofmeyr 2001) evaluating prophylactic antibiotics use for caesarean section • Meta-analysis of 61 RCTs to obtain Odds Ratio (OR)
META-ANALYSIS [Forest plot of the 61 RCTs; pooled odds ratio 0.34 (0.25 to 0.45)]
METHOD OUTLINE • Risk of infection without treatment from large UK registry (p1 = 0.08) • Derive risk of infection if antibiotics introduced to UK hospitals (p2)
RESULTS • p1 = 0.080 (0.078 to 0.082) • p2 = [OR x p1/(1-p1)] / [1 + OR x p1/(1-p1)] = 0.02 (0.02 to 0.03)
RESULTS • p1 = 0.080 (0.078 to 0.082); p2 = 0.02 (0.02 to 0.03) • Cost using antibiotics (model code):
Treatment <- p2*(cwd+ctrt) + (1-p2)*(ctrt+cnwd)
Control <- p1*cwd + (1-p1)*cnwd
Diff <- Treatment - Control
• Difference in cost: £16.93 (£7.96 to £25.76)
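The Python sketch below reproduces the logic of this example outside WinBUGS: p1 comes from the registry counts, the odds ratio from the pooled meta-analysis estimate (approximated here by a lognormal distribution), and p2 is derived by applying the OR to the baseline odds. The unit costs cwd, cnwd and ctrt are invented placeholders, so the resulting figures will not match the £16.93 on the slide.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Baseline infection risk p1 from the UK registry: 6000 events in 75000 sections
p1 = rng.beta(6000, 75000 - 6000, n)

# Pooled odds ratio 0.34 (0.25 to 0.45), approximated by a lognormal distribution
log_or_sd = (np.log(0.45) - np.log(0.25)) / (2 * 1.96)
odds_ratio = np.exp(rng.normal(np.log(0.34), log_or_sd, n))

# Apply the OR to the baseline odds to obtain the risk with antibiotics (p2)
odds1 = p1 / (1 - p1)
p2 = odds_ratio * odds1 / (1 + odds_ratio * odds1)

# Placeholder unit costs (NOT the values used in the original model)
cwd, cnwd, ctrt = 200.0, 0.0, 20.0   # wound infection, no infection, antibiotics

treatment = p2 * (cwd + ctrt) + (1 - p2) * (cnwd + ctrt)
control = p1 * cwd + (1 - p1) * cnwd
diff = treatment - control

print("p1 median (95% interval):", np.percentile(p1, [50, 2.5, 97.5]).round(3))
print("p2 median (95% interval):", np.percentile(p2, [50, 2.5, 97.5]).round(3))
print("Cost difference:", np.percentile(diff, [50, 2.5, 97.5]).round(2))
```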
RESULTS: COST-EFFECTIVENESS PLANE [Scatter plot of Bayesian (MCMC) simulations on the cost-effectiveness plane; quadrants labelled: treatment more effective but more costly / control dominates / treatment less costly but less effective / treatment dominates]
CALCULATING THE COST-EFFECTIVENESS ACCEPTABILITY CURVE (CEAC) • Incremental Net (Monetary) Benefit framework • CE decision rule: adopt the new treatment if ΔC/ΔE < Rc • Re-arranging: INB = Rc x ΔE - ΔC > 0 • NOTE: Rc = a decision maker's willingness to pay for an additional unit of benefit (i.e. a QALY)
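As a hedged illustration of how a CEAC is computed, the snippet below applies the INB rule to invented posterior samples of incremental cost and effect; the distributions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Invented posterior samples of incremental effect (QALYs) and cost (pounds)
delta_e = rng.normal(0.02, 0.01, n)
delta_c = rng.normal(300.0, 150.0, n)

# The CEAC plots, against Rc, the probability that INB = Rc*dE - dC is positive
for rc in [0, 10_000, 20_000, 30_000, 50_000]:
    inb = rc * delta_e - delta_c
    print(f"Rc = {rc:>6}: P(cost-effective) = {(inb > 0).mean():.2f}")
```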
EVALUATION OF DIAGNOSTIC TESTS • Consider a population to be made up of 2 groups: • Those with a disease • Those without the disease • A test aims to identify people as belonging to one of these two groups • Often a 'Gold Standard' test can perfectly distinguish the groups, but cannot be used in routine practice (e.g. pathology) • Other, imperfect tests are available, yielding continuous diagnostic markers
SENSITIVITY vs. SPECIFICITY [Figure: probability density functions of the diagnostic variable D for Group 0 (healthy) and Group 1 (diseased), with a decision threshold DT separating the test-negative (TN) and test-positive (TP) regions] • Sensitivity = number of true positives / total with disease • Specificity = number of true negatives / total without disease
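A tiny illustration of these two definitions, using made-up 2x2 counts:

```python
# Sensitivity and specificity from a 2x2 table (counts are made up)
tp, fn = 90, 10     # diseased group: true positives, false negatives
tn, fp = 160, 40    # healthy group: true negatives, false positives

sensitivity = tp / (tp + fn)   # proportion of diseased correctly identified
specificity = tn / (tn + fp)   # proportion of healthy correctly identified
print(sensitivity, specificity)   # 0.9 0.8
```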
TRACING OUT THE RECEIVER OPERATING CHARACTERISTIC (ROC) CURVE [Figure: ROC curve plotting TP rate (Se) against FP rate (1-Sp), both from 0 to 1; moving to a lower or higher threshold shifts the operating point along the curve]
SELECTING THE THRESHOLD T [Figure: accuracy (Se x Sp) plotted against threshold DT; point T gives the maximum-accuracy threshold] • Ignores relative opportunity costs of FP and FN results
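The sketch below (illustrative only, with simulated normal markers) shows how sweeping the threshold DT traces out sensitivity and specificity, and how maximising Se x Sp picks a single threshold T.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated continuous diagnostic marker D for each group (illustrative values)
healthy = rng.normal(0.0, 1.0, 5000)
diseased = rng.normal(1.5, 1.0, 5000)

# Sweep the threshold DT to trace sensitivity and specificity
thresholds = np.linspace(-3, 5, 201)
se = np.array([(diseased > t).mean() for t in thresholds])   # TP rate
sp = np.array([(healthy <= t).mean() for t in thresholds])   # TN rate

# Threshold T maximising the accuracy criterion Se x Sp from the slide
best = np.argmax(se * sp)
print(f"DT = {thresholds[best]:.2f}, Se = {se[best]:.2f}, Sp = {sp[best]:.2f}")
```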
COMPARING TESTS [Figure: ROC curves for Test 1 and Test 2, with the Area Under Curve (AUC) indicated] • Test 2 has maximum AUC • What if curves cross over? • Ignores costs of test & side effects etc. (see later)
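For completeness, a small sketch of comparing two hypothetical tests by empirical AUC, computed with the trapezoidal rule over a traced ROC curve; the simulated marker distributions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, 5000)
test1 = rng.normal(1.0, 1.0, 5000)   # diseased-group marker, weaker separation
test2 = rng.normal(1.8, 1.0, 5000)   # diseased-group marker, stronger separation

def empirical_auc(diseased, healthy, n_grid=500):
    """Trapezoidal area under an empirically traced ROC curve."""
    lo = min(healthy.min(), diseased.min())
    hi = max(healthy.max(), diseased.max())
    thresholds = np.linspace(lo, hi, n_grid)
    tpr = np.array([(diseased > t).mean() for t in thresholds])
    fpr = np.array([(healthy > t).mean() for t in thresholds])
    order = np.argsort(fpr)
    tpr, fpr = tpr[order], fpr[order]
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

print("AUC test 1:", round(empirical_auc(test1, healthy), 3))
print("AUC test 2:", round(empirical_auc(test2, healthy), 3))
```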
META-ANALYSIS OF DIAGNOSTIC TEST EVALUATION DATA • Used when multiple studies are available • More complicated than for effectiveness data • At least 4 different methods proposed • Vary in assumptions & sophistication • As well as usual sources of heterogeneity, diagnostic threshold may vary (explicitly or implicitly) between studies • Each study only adds one point in ROC space • All methods have “issues”
METHOD 1: Pool sensitivity and specificity independently • Assumes all studies evaluated at the same threshold
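A minimal sketch of Method 1, assuming a fixed-effect inverse-variance pooling on the logit scale and invented study counts (a random-effects version would add a between-study variance):

```python
import numpy as np
from scipy.special import expit

# Invented 2x2 counts for a handful of studies (0.5 continuity correction)
tp = np.array([40, 70, 55, 30]) + 0.5
fn = np.array([ 6,  9,  5,  4]) + 0.5
tn = np.array([80, 90, 60, 50]) + 0.5
fp = np.array([20, 30, 15, 12]) + 0.5

def pool_logit(successes, failures):
    """Fixed-effect inverse-variance pooling of a proportion on the logit scale."""
    y = np.log(successes / failures)                 # per-study logit
    w = 1.0 / (1.0 / successes + 1.0 / failures)     # inverse of its variance
    pooled = np.sum(w * y) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)

print("Pooled sensitivity (95% CI):", pool_logit(tp, fn))
print("Pooled specificity (95% CI):", pool_logit(tn, fp))
```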
METHOD 2: Sens. & Spec. Bivariate meta-analysis model • Correlation between sensitivity and specificity taken into account
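A sketch of the bivariate approach, fitting a normal-normal bivariate random-effects model to logit sensitivity and logit specificity by maximum likelihood; the study counts are invented and the within-study variances use the usual approximation, so this is an illustration rather than a full implementation of any particular published model.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Invented 2x2 counts per study, with a 0.5 continuity correction
tp = np.array([45, 30, 60, 25, 50]) + 0.5
fn = np.array([ 5,  4,  8,  3,  6]) + 0.5
fp = np.array([20, 15, 25, 10, 18]) + 0.5
tn = np.array([80, 60, 90, 40, 70]) + 0.5

y = np.column_stack([np.log(tp / fn), np.log(tn / fp)])   # (logit Se, logit Sp)
s2 = np.column_stack([1/tp + 1/fn, 1/tn + 1/fp])          # within-study variances

def neg_loglik(theta):
    """Bivariate random-effects likelihood (up to a constant)."""
    mu = theta[:2]
    sd = np.exp(theta[2:4])                 # between-study standard deviations
    rho = np.tanh(theta[4])                 # between-study correlation
    Sigma = np.array([[sd[0]**2, rho*sd[0]*sd[1]],
                      [rho*sd[0]*sd[1], sd[1]**2]])
    nll = 0.0
    for yi, s2i in zip(y, s2):
        V = Sigma + np.diag(s2i)            # marginal covariance for this study
        r = yi - mu
        nll += 0.5 * (np.log(np.linalg.det(V)) + r @ np.linalg.solve(V, r))
    return nll

fit = minimize(neg_loglik, x0=np.zeros(5), method="Nelder-Mead")
print("Pooled sensitivity:", round(float(expit(fit.x[0])), 3))
print("Pooled specificity:", round(float(expit(fit.x[1])), 3))
```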
METHOD 3: Combining Diagnostic Odds Ratios • Used to summarise information contained within sensitivity and specificity – useful for meta-analysis (difficult to interpret clinically) • Traces out an SROC curve which is symmetrical around the line: sensitivity = specificity
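A sketch of Method 3, pooling the log diagnostic odds ratio by fixed-effect inverse-variance weighting over invented study counts:

```python
import numpy as np

# Invented 2x2 counts per study (0.5 continuity correction)
tp = np.array([40, 70, 55, 30]) + 0.5
fn = np.array([ 6,  9,  5,  4]) + 0.5
tn = np.array([80, 90, 60, 50]) + 0.5
fp = np.array([20, 30, 15, 12]) + 0.5

# Per-study diagnostic odds ratio and its log-scale variance
log_dor = np.log((tp * tn) / (fn * fp))
var_log_dor = 1/tp + 1/fn + 1/fp + 1/tn

# Fixed-effect inverse-variance pooling of log DOR
w = 1.0 / var_log_dor
pooled = np.sum(w * log_dor) / np.sum(w)
se = np.sqrt(1.0 / np.sum(w))
print(f"Pooled DOR: {np.exp(pooled):.1f} "
      f"({np.exp(pooled - 1.96 * se):.1f} to {np.exp(pooled + 1.96 * se):.1f})")
```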
METHOD 4: Asymmetric ROC • Littenberg and Moses method based on (transformed) linear regression
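A sketch of the Littenberg-Moses construction: regress D = logit(Se) - logit(1-Sp) on S = logit(Se) + logit(1-Sp), then back-transform the fitted line into an SROC curve. The study counts are invented and the regression is unweighted for simplicity.

```python
import numpy as np
from scipy.special import expit, logit

# Invented per-study 2x2 counts (0.5 continuity correction)
tp = np.array([40, 70, 55, 30]) + 0.5
fn = np.array([ 6,  9,  5,  4]) + 0.5
tn = np.array([80, 90, 60, 50]) + 0.5
fp = np.array([20, 30, 15, 12]) + 0.5

se_i = tp / (tp + fn)
fpr_i = fp / (fp + tn)

# Littenberg-Moses transformation: D = log DOR, S indexes the implicit threshold
D = logit(se_i) - logit(fpr_i)
S = logit(se_i) + logit(fpr_i)

# Ordinary least squares of D on S (unweighted, for simplicity)
b, a = np.polyfit(S, D, 1)

# Back-transform the fitted line: logit(Se) as a function of logit(FPR),
# then evaluate over a grid of false positive rates to trace the SROC curve
fpr_grid = np.linspace(0.01, 0.99, 9)
sroc_se = expit(a / (1 - b) + (1 + b) / (1 - b) * logit(fpr_grid))
for x, y_ in zip(fpr_grid, sroc_se):
    print(f"FPR = {x:.2f}  ->  SROC sensitivity = {y_:.2f}")
```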
ISSUES • Output format of the models differs: points, ellipses, SROC curves • None of these methods allows explicit incorporation of threshold data (even if known) • Further options of fixed/random study effects and weighting schemes - can make a lot of difference! • Not clear that multiple summary data are necessarily better than one good IPD study from which an ROC curve can be derived • Little work done on how these methods 'interface' with a decision model
DIAGNOSTIC TEST DECISION MODELS • For a full (economic) evaluation consider “bigger picture” of how test(s) fit in with treatment and clinical outcomes beyond the test (as well as costs) • Decision modelling techniques used to evaluate diagnosis because: • i) RCT evaluation through to treatment and clinical outcomes can be large, costly and lengthy • ii) All tests/test combinations of interest may not have been compared in RCTs • Diagnostic test models outlined using an individual study of diagnostic performance (Laking et al., submitted)
COMPREHENSIVE DECISION MODEL FRAMEWORK [Flow diagram: DATA SOURCES (OBS1, OBS2, OBS3, RCT1, RCT2, ROUTINE, EXPERT) feed EVIDENCE SYNTHESIS (meta-analysis of test accuracy, meta-analysis of clinical effect, Bayes theorem, opinion pooling, in combination), which supplies MODEL INPUTS (test accuracy, clinical effect, cost, utility) to the MODEL]
5) EXAMPLE: DIAGNOSTIC TESTING FOR DEEP VEIN THROMBOSIS (DVT)
DEEP VEIN THROMBOSIS (DVT) • May form pulmonary embolus (PE) • PE may be fatal • May cause post-thrombotic syndrome (PTS) • Treated with anticoagulants • Anticoagulants may cause haemorrhage • Accurate diagnosis is important
SYSTEMATIC REVIEW AND META-ANALYSIS • Aimed to identify all diagnostic cohort studies comparing a test to the gold standard • Diagnostic tests for DVT (number in brackets = papers included in the meta-analysis): • Wells score (22) • * D-dimer (111) • Plethysmography (89) • * Ultrasound (143) • Contrast venography (gold standard) • Detailed exploration of heterogeneity + complications of distal and proximal DVT, but no room to report here
INDIVIDUAL STUDIES OF D-DIMER • Good sensitivity but poor specificity • Substantial heterogeneity • Publication bias?
INDIVIDUAL ULTRASOUND STUDIES • High accuracy • Needs a (highly) trained operator (expensive)
DVT DECISION MODEL OBJECTIVE: To evaluate the cost-effectiveness of diagnostic strategies for DVT In the "real" evaluation: • Literature review: 16 diagnostic strategies • NHS survey: 11 additional strategies • Theoretical: 5 additional "strategies" • 32 possible options using combinations of tests For illustration purposes, each option is evaluated singly: • Ultrasound v. D-dimer v. Nothing (no treatment) • Structure of the model after testing is also slightly simplified
THEORETICAL POPULATION • 1000 patients with suspected DVT • 150 assumed to have (proximal) DVT • Mean age 60 years • 60% female
WHICH DIAGNOSTIC META-ANALYSIS METHOD TO USE FOR DECISION MODEL? 1) Independent sensitivity & specificity? [Pooled sensitivity and specificity plotted separately for D-dimer and Ultrasound]
WHICH DIAGNOSTIC META-ANALYSIS METHOD TO USE FOR DECISION MODEL? 2) Asymmetric SROC based on regression? [Fitted asymmetric SROC curves for D-dimer and Ultrasound]
WHICH POINT ON SROC CURVE SHOULD BE USED? • Evaluate the decision model along the curve to identify the specificity and sensitivity combinations which maximise net benefit (Rc x effect - cost, where Rc = decision-maker's willingness to pay per additional QALY) • Threshold may change with Rc • Compare tests using these thresholds
WHICH POINT ON SROC CURVE SHOULD BE USED? (continued)

Willingness to pay       D-dimer                    Ultrasound
per additional QALY      Specificity  Sensitivity   Specificity  Sensitivity
£0                       0.01         1.00          0.01         1.00
£5,000                   0.01         1.00          0.60         0.96
£10,000                  0.01         1.00          0.70         0.96
£15,000                  0.01         1.00          0.70         0.96
£20,000                  0.01         1.00          0.80         0.94
£25,000                  0.01         1.00          0.80         0.94
£30,000                  0.01         1.00          0.80         0.94
£50,000                  0.50         0.93          0.80         0.94

(An illustrative sketch of this search along the SROC curve follows below.)
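A sketch of the search summarised in the table above, with an invented SROC relationship and invented costs and QALY losses: for each willingness-to-pay value Rc, step along the (specificity, sensitivity) curve and keep the operating point with the highest expected net benefit. None of the numbers correspond to the actual DVT model inputs.

```python
import numpy as np
from scipy.special import expit, logit

# Invented SROC relationship: sensitivity as a function of specificity
def sroc_sensitivity(specificity, a=3.0, b=0.1):
    fpr = 1.0 - specificity
    return expit(a / (1 - b) + (1 + b) / (1 - b) * logit(fpr))

# Illustrative consequences per patient (all values assumed)
prevalence = 0.15          # 150 of 1000 suspected DVT patients truly have DVT
qaly_loss_fn = 0.30        # QALYs lost by missing a DVT (false negative)
qaly_loss_fp = 0.02        # QALYs lost by unnecessary anticoagulation (false positive)
cost_fn, cost_fp, cost_test = 2000.0, 400.0, 50.0

spec_grid = np.linspace(0.01, 0.99, 99)
sens_grid = sroc_sensitivity(spec_grid)

for rc in [0, 10_000, 20_000, 30_000, 50_000]:
    # Expected QALY loss and cost per patient at each operating point
    q_loss = (prevalence * (1 - sens_grid) * qaly_loss_fn
              + (1 - prevalence) * (1 - spec_grid) * qaly_loss_fp)
    cost = (prevalence * (1 - sens_grid) * cost_fn
            + (1 - prevalence) * (1 - spec_grid) * cost_fp + cost_test)
    net_benefit = -rc * q_loss - cost        # Rc x effect - cost (losses negated)
    best = np.argmax(net_benefit)
    print(f"Rc = {rc:>6}: best Sp = {spec_grid[best]:.2f}, "
          f"Se = {sens_grid[best]:.2f}")
```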