Systematic Reviews of Test Accuracy Paul Glasziou Screening and Test Evaluation Program School of Population Health University of Queensland
What are tests used for? Log of reasons by several docs: • Diagnosis – what is the problem? • Monitoring – has it changed? • Prognosis – risk/stage within Dx • Treatment planning, e.g., location • Stalling for time!
Evidence for inkblot test • A promising test, inkblots, has been developed for the diagnosis of blotitis. Some local doctors have been using it (off-label). • In your journal reading you come across a study which shows inkblots have an 80% sensitivity but only 20% specificity. • What do you do/think now?
Is the test helpful? The Youden Index • Youden Index = sensitivity + specificity - 1 • For a test to be useful: sensitivity + specificity > 1 (Youden Index > 0) • Example: coin toss with +ve = "heads": sensitivity = 0.5, specificity = 0.5, so Youden = 0
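A minimal sketch of the Youden Index calculation, applied to the coin toss and to the inkblot figures quoted earlier (80% sensitivity, 20% specificity). The function name `youden_index` is an illustrative choice, not something from the slides.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden Index = sensitivity + specificity - 1."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    return sensitivity + specificity - 1.0


# Coin toss with +ve = "heads": no better than chance.
coin_toss = youden_index(0.5, 0.5)

# Inkblot test from the earlier slide: 80% sensitivity, 20% specificity.
inkblot = youden_index(0.8, 0.2)

print(f"coin toss: {coin_toss:.2f}")
print(f"inkblots:  {inkblot:.2f}")
```

On these figures the inkblot test scores the same as the coin toss: its sensitivity and specificity sum to 1, so a positive result is equally likely in diseased and non-diseased patients.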
ROC plot of 2-coin toss (figure not reproduced; points labelled by toss outcomes: H, H; H, T; T, H)
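The original plot is not recoverable, so the following is only a sketch of the idea, under the assumption that a "2-coin test" is called positive by some rule on the two tosses. The three rules below are hypothetical. Because the coins ignore disease status, the chance of a positive result is the same in diseased and non-diseased patients, so every rule lands on the ROC diagonal.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two fair coins.
OUTCOMES = list(product("HT", repeat=2))

# Hypothetical decision rules (assumptions, not taken from the slide).
RULES = {
    "both heads": lambda toss: toss == ("H", "H"),
    "first coin heads": lambda toss: toss[0] == "H",
    "at least one head": lambda toss: "H" in toss,
}


def roc_point(rule):
    """Return (false positive rate, sensitivity) for a coin-based rule."""
    p_positive = Fraction(sum(1 for toss in OUTCOMES if rule(toss)), len(OUTCOMES))
    # The coins are independent of disease, so both rates equal p_positive.
    return p_positive, p_positive


roc_points = {name: roc_point(rule) for name, rule in RULES.items()}

for name, (fpr, sens) in roc_points.items():
    youden = sens - fpr  # equals sensitivity + specificity - 1
    print(f"{name}: FPR={fpr}, sensitivity={sens}, Youden={youden}")
```

Changing the rule only moves the point along the diagonal; it never lifts the Youden Index above 0.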
SYSTEMATIC Review: Steps 1. Formulate question(s) 2. Find all studies 3. Assess, Synthesise 4. Applicability analysis
EBM and Systematic Review • EBM Steps: Answerable Question; Search; Appraise; Apply. Time: 30 seconds • Systematic Review Steps: Answerable Question; Search ++++; Appraise x 2; Synthesize; Apply. Time: 6 months
Systematic Reviews • Finding • Electronic search • Supplementary search • Appraising • Quality Assessment • Selection & extraction • Synthesis • Summary Table • Plots: summary & diagnostic • Summary estimators
YOUR MISSION: Are exercise scans accurate for CAD? • Are exercise scans (e.g., thallium) accurate in predicting coronary artery disease? • Search strategy • Inclusion/exclusion criteria • ? • Synthesis of selected studies • ?
FINDING all Studies • Is there an existing systematic review? • Electronic Search • Initial Search • MEDLINE • Other databases: EMBASE, CINAHL, CCTR, ... • Further search • Check references of relevant papers & reviews and • Find terms (words or MeSH terms) you didn’t use • Search again! (snowballing) • Supplementary search • Hand search • Write to researchers
Problems with searching • Finding overpublished work • Duplicate publications common • Finding unpublished work • Negative trials unpublished?
Publication Bias: the problem • Negative studies less likely to be published than ‘Positive’ • How does this happen? • Follow-up of 737 studies at Johns Hopkins (Dickersin, JAMA, 1992) • Positive SUBMITTED more than negative (2.5 times)
Systematic Reviews • Finding • Electronic search • Supplementary search • Appraising • Quality Assessment • Selection & extraction • Synthesis • Summary Table • Plots: summary & diagnostic • Summary estimators
Selective Criticism of Evidence: Biased appraisal increases polarization (Lord et al, J Pers Soc Psy, 1979, p2098)
Selective Criticism of Evidence • 28 reviewers assessed one “study”, with results randomly positive or negative (Cog Ther Res, 1977, p161-75)
Assessment of Quality and Selection of Studies • Quality varies, therefore: • Standardized assessment (?blind*) • Group/rank by quality • Select a threshold, e.g. all prospective studies with blind reading of reference and index tests. * Assessment of quality blind to study outcome
Diagnostic Accuracy Study: Basic Design • Series of patients → Index test → Reference standard → Blinded cross-classification
Spectrum Bias • Selected patients → Index test → Reference standard → Blinded cross-classification
Verification Bias • Series of patients → Index test → Reference standard → Blinded cross-classification
Differential Reference Bias • Series of patients → Index test → Ref. Std. A / Ref. Std. B → Blinded cross-classification
Observer Bias • Series of patients → Index test → Reference standard → Unblinded cross-classification
“Case-control” design • HF patients / controls → Index test → Blinded cross-classification
Assessing a Study of a Test (Jaeschke et al, JAMA, 1994, 271: 389-91) • Was an appropriate spectrum of patients included? (Spectrum Bias) • Were all patients subjected to a Gold Standard? (Verification Bias) • Was there an independent, "blind" comparison with a Gold Standard? (Observer Bias; Differential Reference Bias) • Were the methods described so you could repeat the test?
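A rough numerical illustration of why the verification question matters. All counts are hypothetical, and the verification scheme is an assumption made for the sketch: every index-test positive gets the gold standard, but only 10% of index-test negatives do.

```python
from fractions import Fraction


def accuracy(tp: int, fn: int, fp: int, tn: int):
    """Return (sensitivity, specificity) from a 2x2 cross-classification."""
    return Fraction(tp, tp + fn), Fraction(tn, tn + fp)


# Hypothetical full cohort in which everyone is verified.
full = {"tp": 80, "fn": 20, "fp": 90, "tn": 810}

# Assumed partial verification: all test positives, 10% of test negatives.
verified_share_of_negatives = Fraction(1, 10)
partial = {
    "tp": full["tp"],
    "fp": full["fp"],
    "fn": int(full["fn"] * verified_share_of_negatives),
    "tn": int(full["tn"] * verified_share_of_negatives),
}

true_sens, true_spec = accuracy(**full)
biased_sens, biased_spec = accuracy(**partial)

print(f"all verified:       sens={float(true_sens):.3f} spec={float(true_spec):.3f}")
print(f"partly verified:    sens={float(biased_sens):.3f} spec={float(biased_spec):.3f}")
```

In this made-up cohort, verifying mostly the test positives inflates the apparent sensitivity and deflates the apparent specificity, even though the test itself is unchanged.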
Empirical Effects of Bias Lijmer JG et al. JAMA 1999;282:1062-1067
How well are diagnostic studies reported? • 112 studies in 4 major journals (1978-1993)
Standard                        N (%)
Spectrum composition            30 (27)
Avoidance of workup bias        51 (46)
Avoidance of review bias        43 (38)
Test accuracy precision         12 (11)
Indeterminate test results      26 (23)
Test reproducibility            26 (23)
Accuracy in subgroups            9 (8)
Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research. JAMA 1995;274:645-651
Steering Committee: Bossuyt, Bruns, Gatsonis, Glasziou, Irwig, Lijmer, Moher, Rennie, de Vet
Systematic Reviews • Finding • Electronic search • Supplementary search • Appraising • Quality Assessment • Selection & extraction • Synthesis • Summary Table • Plots: summary & diagnostic • Summary estimators
Are the studies consistent? • Are variations in results between studies consistent with chance? (Test of homogeneity: has low power) • If NO, then WHY? • Variation in study methods (biases) • Variation in intervention • Variation in outcome measure (e.g. timing) • Variation in population
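One simple way to ask whether between-study variation is consistent with chance is a Pearson chi-square test of homogeneity on the per-study sensitivities. The sketch below is only one possible approach, not necessarily the one the author had in mind, and the study counts are hypothetical. It returns the statistic and degrees of freedom; the p-value is left to a chi-square table.

```python
def homogeneity_chi2(studies):
    """Pearson chi-square for equal sensitivity across studies.

    `studies` is a list of (true positives, false negatives) among the
    diseased patients of each study. Returns (statistic, degrees of freedom).
    """
    total_tp = sum(tp for tp, _ in studies)
    total_fn = sum(fn for _, fn in studies)
    if total_tp == 0 or total_fn == 0:
        raise ValueError("pooled sensitivity of 0 or 1: statistic is undefined")
    total = total_tp + total_fn

    statistic = 0.0
    for tp, fn in studies:
        n = tp + fn
        expected_tp = n * total_tp / total  # count expected under the pooled sensitivity
        expected_fn = n * total_fn / total
        statistic += (tp - expected_tp) ** 2 / expected_tp
        statistic += (fn - expected_fn) ** 2 / expected_fn
    return statistic, len(studies) - 1


# Hypothetical studies: sensitivities of 90%, 60% and 80%.
chi2_stat, dof = homogeneity_chi2([(45, 5), (30, 20), (40, 10)])

# The 5% critical value for 2 degrees of freedom is about 5.99.
print(f"chi-square = {chi2_stat:.2f} on {dof} df")
```

As the slide notes, this kind of test has low power, so a non-significant result with few or small studies does not show that the studies agree.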
Present absolute numbers for test results • Figure: Distribution of plasma concentrations of B type natriuretic peptide in normal elderly people and in those with left ventricular systolic dysfunction confirmed by echocardiography (BMJ, 2000; 320: 906-8)
Appropriate plots, e.g., ROC curve • Figure: Receiver operator characteristic curve for plasma B type natriuretic peptide in normal elderly people and in those with left ventricular systolic dysfunction confirmed by echocardiography (BMJ, 2000; 320: 906-8)
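A sketch of how an ROC curve is built from raw test values: sweep a cut-off across the observed values and record the false positive rate and sensitivity at each one. The concentrations below are invented for illustration and are not the BMJ data; the assumption is that higher values indicate disease.

```python
def roc_points(diseased, normal):
    """Return (false positive rate, sensitivity) pairs, one per cut-off.

    A result is called positive when the value is >= the cut-off.
    """
    cutoffs = sorted(set(diseased) | set(normal), reverse=True)
    points = [(0.0, 0.0)]  # cut-off above every observed value
    for cutoff in cutoffs:
        sensitivity = sum(v >= cutoff for v in diseased) / len(diseased)
        false_positive_rate = sum(v >= cutoff for v in normal) / len(normal)
        points.append((false_positive_rate, sensitivity))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points (sorted by increasing FPR)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical concentrations (arbitrary units).
diseased_values = [60, 80, 90, 120]
normal_values = [20, 30, 50, 70]

curve = roc_points(diseased_values, normal_values)
auc = area_under_curve(curve)
print(f"AUC = {auc:.4f}")
```

Reporting the curve rather than a single sensitivity and specificity pair lets readers of a review see how accuracy changes with the chosen cut-off.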
“… doing a meta-analysis is easy, doing one well is hard.” Ingram Olkin
Reproducibility: Agreement of histopathologists
Organ            Feature          Agreement     Kappa         Reference
Rectal cancer    Grading          50% to 69%    0.11 to 0.5   Thomas
Hodgkins         Classification   56%           0.44          Holman
Melanoma         Depth            82%; 64%      0.68; 0.23    Breslow; Clark
Breast cancer    Classification   73%           0.46          Stenkvist
Ken Fleming, Evidence-based pathology. EBM 1997
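The table reports both raw agreement and kappa; the gap between them is the agreement expected by chance. A minimal sketch of Cohen's kappa for two raters follows, using a hypothetical cross-tabulation of two pathologists (the counts are invented).

```python
def cohens_kappa(table):
    """Return (observed agreement, kappa) for a square rater-by-rater table.

    table[i][j] = number of cases rater A put in category i and rater B in j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if the two raters classified independently.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1 - expected)


# Hypothetical: two pathologists classify 100 slides as malignant/benign.
ratings = [
    [40, 10],  # A malignant: B malignant, B benign
    [15, 35],  # A benign:    B malignant, B benign
]

agreement, kappa = cohens_kappa(ratings)
print(f"agreement = {agreement:.0%}, kappa = {kappa:.2f}")
```

With these counts, 75% raw agreement corresponds to a kappa of only 0.5, which is the same pattern the table shows for breast cancer classification (73% and 0.46).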
Registered vs Published Studies: Ovarian Cancer chemotherapy: single v combined (Simes, J. Clin Oncol, 86, p1529)
Publication Bias: Solution • All trials registered at inception, • The National Clinical Trials Registry: Cancer Trials • National Institutes of Health Inventory of Clinical Trials and Studies • International Registry of Perinatal Trials • etc (see Cochrane Handbook) • Unethical NOT to make results available • Whether published or not, data submitted to Registry