Systematic Reviews of Diagnostic Studies Guides for appraisal Acknowledgements: Paul Glasziou, Jon Deeks, Madhukar Pai, Patrick Bossuyt, and Matthias Egger.
Information Overload • 20,000 biomedical periodicals (6M articles) • 17,000 biomedical books annually • 30,000 recognized diseases • 15,000 therapeutic agents (250/yr) • MEDLINE • 4,000 journals surveyed • 11,000,000 citations • 1.27 million articles related to oncology • 35,000 articles related to ear, nose, or throat surgery
What makes a Review “Systematic”? • Based on a clearly formulated question • Identifies relevant studies • Appraises quality of studies • Summarizes evidence by use of explicit methodology • Comments based on evidence gathered
Origin of Clinical Questions Diagnosis: how to select and interpret diagnostic tests Prognosis: how to anticipate the patient’s likely course Therapy: how to select treatments that do more good than harm Prevention: how to screen and reduce the risk for disease
ROAD MAP FOR DIAGNOSTIC REVIEWS
Steps in a Systematic Review • Framing the question (Q) • Identifying relevant publications (F) • Assessing study quality (A) • Summarising evidence and interpreting findings (S)
Step 1- Framing the Question (Q) • Clear, unambiguous, structured question • Questions formulated re: PPICO • Populations of interest • Prior test(s) (if appropriate) • Intervention • Comparisons (if appropriate) • Outcomes
Unstructured Question • Is cervicovaginal fetal fibronectin useful? • For what? • For whom? • What is meant by “useful”?
Structured Question • Does a positive cervicovaginal fetal fibronectin test (test/intervention) predict spontaneous preterm birth (outcome) in asymptomatic women (patient)?
Step 2 – Identifying relevant publications (F) • Wide search of medical/scientific databases • Medline • Cochrane Reviews • Ovid • Relevance to focused question PPICO • Population • Prior test • Intervention • Comparator • Outcome
Publication and reporting biases (of all studies conducted, only a subset is published, and grey literature is often not reviewed) • Positive Results Bias • Grey Literature Bias • Time-Lag Bias • Language and Country Bias • Multiple Publication Bias • Selective Citation Bias • Database Indexing Bias • Selective Outcome Reporting Bias. Health Technology Assessment, 2000; 4(10):1-115
Registered vs. Published Studies: Ovarian cancer chemotherapy, single vs. combined (Simes, J Clin Oncol, 1986, p1529)
Search filters for diagnostic studies [ROC plot: sensitivity vs. 1-specificity] No search filter: 39 studies retrieved (Lucas Bachmann)
[ROC plot: sensitivity vs. 1-specificity] With search filter: 12 studies retrieved, 27 missed (Lucas Bachmann)
Assessment of Study Quality (A) • Quality varies, therefore use a standardized assessment (ideally blind*) • Group/rank studies by quality • Select a threshold, e.g. include only prospective studies with blind reading of reference and index tests. * assessment of quality blind to study outcome
Quality Score: Mammals example • Setting: in natural habitat? (No = 0; Yes = 1) • Complete information: whole animals? (No = 0; Yes = 1) • Level of evidence: photographs? (No = 0; Yes = 1)
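To make the quality-threshold idea concrete, here is a minimal Python sketch (not from the slides) that scores studies against No = 0 / Yes = 1 criteria and keeps only those at or above a chosen cut-off. The criterion names and study records are invented for illustration.

```python
# Minimal sketch: score studies on yes/no quality items and apply a threshold.
# Criterion names and study data are illustrative assumptions, not from the slides.

CRITERIA = ["prospective_design", "blind_index_reading", "blind_reference_reading"]

studies = [
    {"id": "Study A", "prospective_design": 1, "blind_index_reading": 1, "blind_reference_reading": 1},
    {"id": "Study B", "prospective_design": 1, "blind_index_reading": 0, "blind_reference_reading": 1},
    {"id": "Study C", "prospective_design": 0, "blind_index_reading": 0, "blind_reference_reading": 0},
]

def quality_score(study):
    """Sum of No = 0 / Yes = 1 items, as in the mammals example."""
    return sum(study[c] for c in CRITERIA)

threshold = 2  # e.g. keep studies meeting at least 2 of 3 criteria
included = [s["id"] for s in studies if quality_score(s) >= threshold]
print(included)  # ['Study A', 'Study B']
```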
Exercise I: Study Quality [scoring exercise based on the mammals quality score]
Assessing a Study of a Test (Jaeschke et al, JAMA, 1994, 271: 389-91) • Was an appropriate spectrum of patients included? (spectrum bias) • Were all patients subjected to a gold standard? (verification bias) • Was there an independent, "blind" comparison with a gold standard? (observer bias; differential reference bias) • Were the methods described well enough for you to repeat the test?
Diagnostic Accuracy Study: Basic Design • Series of patients → index test and reference standard → blinded cross-classification
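The blinded cross-classification of index test against reference standard gives a 2x2 table, from which the usual accuracy measures follow. A minimal sketch is below; the counts are made up for illustration.

```python
# Hedged sketch: accuracy measures from a hypothetical 2x2 cross-classification.
tp, fp, fn, tn = 80, 20, 10, 90  # invented counts

sensitivity = tp / (tp + fn)               # 0.89
specificity = tn / (tn + fp)               # 0.82
lr_pos = sensitivity / (1 - specificity)   # likelihood ratio of a positive test, ~4.9
lr_neg = (1 - sensitivity) / specificity   # likelihood ratio of a negative test, ~0.14
dor = lr_pos / lr_neg                      # diagnostic odds ratio, ~36

print(f"Sens={sensitivity:.2f} Spec={specificity:.2f} "
      f"LR+={lr_pos:.1f} LR-={lr_neg:.2f} DOR={dor:.0f}")
```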
Spectrum Bias • Selected patients (rather than a consecutive series) → index test and reference standard → blinded cross-classification
Verification Bias • Series of patients → index test; only some patients receive the reference standard → blinded cross-classification
Differential Reference Bias • Series of patients → index test; some patients verified with reference standard A, others with reference standard B → blinded cross-classification
Observer Bias • Series of patients → index test and reference standard → unblinded cross-classification
"Case-control" design • Patients with the condition (HF patients) and controls recruited separately → index test → blinded cross-classification
Empirical Effects of Bias Lijmer JG et al. JAMA 1999;282:1062-1067
Step 4 – Summarising the Evidence (S) • Extracting data from trials • Combining data – meta-analysis • Does it make sense to combine?
What is a meta-analysis? • A way to calculate an average • Estimates an ‘average’ or ‘common’ effect • Improves the precision of an estimate by using all available data
What is a meta-analysis? • An optional part of a systematic review [Venn diagram: meta-analyses form a subset of systematic reviews]
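As a concrete illustration of estimating an "average" effect from all available data, here is a minimal sketch of a fixed-effect, inverse-variance weighted pooled estimate. The study estimates and standard errors are invented (think of them as log odds ratios from individual studies).

```python
import math

# Minimal fixed-effect meta-analysis sketch: inverse-variance weighted average.
# Estimates and standard errors are illustrative assumptions.
estimates = [0.8, 1.1, 0.6, 0.9]      # per-study effect estimates (e.g. log odds ratios)
std_errors = [0.30, 0.25, 0.40, 0.20]

weights = [1 / se**2 for se in std_errors]
pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled = {pooled:.2f} "
      f"(95% CI {pooled - 1.96*pooled_se:.2f} to {pooled + 1.96*pooled_se:.2f})")
```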
Threshold effects Decreasing threshold increases sensitivity but decreases specificity Increasing threshold increases specificity but decreases sensitivity
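A small simulation makes the threshold effect visible: on assumed test-value distributions for diseased and non-diseased groups, lowering the positivity cut-off raises sensitivity and lowers specificity. The distributions below are illustrative assumptions, not data from the slides.

```python
import random

# Illustrative threshold-effect sketch on simulated continuous test values.
random.seed(1)
diseased     = [random.gauss(2.0, 1.0) for _ in range(1000)]
non_diseased = [random.gauss(0.0, 1.0) for _ in range(1000)]

for cutoff in (0.5, 1.0, 1.5):
    sens = sum(x >= cutoff for x in diseased) / len(diseased)
    spec = sum(x < cutoff for x in non_diseased) / len(non_diseased)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```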
Accuracy effects Over-estimation of accuracy e.g. spectrum bias Under-estimation of accuracy e.g. poor reference standard
Spectrum effects Variation in the diseased study participants Variation in the non-diseased study participants
Methods of Meta-analysis • Separate pooling of sensitivity and specificity (and likelihood ratios) • Inappropriate when studies are highly heterogeneous • Underestimates accuracy if there is heterogeneity in threshold
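A hedged sketch of the separate-pooling approach: each study's sensitivity and specificity is logit-transformed, averaged with inverse-variance weights, and back-transformed. The 2x2 counts are invented, and a 0.5 continuity correction is used to avoid zero cells.

```python
import math

# Sketch: separate pooling of sensitivity and specificity on the logit scale.
studies = [  # (tp, fp, fn, tn) -- invented counts
    (45, 10, 5, 90),
    (30, 20, 10, 140),
    (60, 15, 15, 110),
]

def pooled_proportion(numerators, denominators):
    """Inverse-variance weighted mean of logit-transformed proportions."""
    logits, weights = [], []
    for k, n in zip(numerators, denominators):
        p = (k + 0.5) / (n + 1.0)                 # continuity-corrected proportion
        var = 1 / (k + 0.5) + 1 / (n - k + 0.5)   # variance of the logit
        logits.append(math.log(p / (1 - p)))
        weights.append(1 / var)
    pooled_logit = sum(w * l for w, l in zip(weights, logits)) / sum(weights)
    return 1 / (1 + math.exp(-pooled_logit))

sens = pooled_proportion([tp for tp, fp, fn, tn in studies],
                         [tp + fn for tp, fp, fn, tn in studies])
spec = pooled_proportion([tn for tp, fp, fn, tn in studies],
                         [tn + fp for tp, fp, fn, tn in studies])
print(f"pooled sensitivity {sens:.2f}, pooled specificity {spec:.2f}")
```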
Methods of Meta-analysis • Creation of a summary ROC (SROC) curve • The DOR is often reasonably consistent across studies • Deals with variation in threshold • Moses/Littenberg method – allows for trends in DOR with threshold • A summary curve has no unique operating point, which makes it harder to interpret • More advanced methods (HSROC, bivariate normal, random effects) estimate variability and uncertainty in the summary values • Investigate why studies have different results
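A minimal sketch of the Moses/Littenberg SROC method named above: for each study compute D = logit(TPR) - logit(FPR) (the log DOR) and S = logit(TPR) + logit(FPR) (a proxy for threshold), fit D = a + b*S by ordinary least squares, and map the fitted line back to ROC space. The counts are invented; as the slide notes, hierarchical models (HSROC, bivariate) are preferred in practice.

```python
import math

# Moses/Littenberg SROC sketch with invented 2x2 counts (tp, fp, fn, tn).
studies = [(45, 10, 5, 90), (30, 20, 10, 140), (60, 15, 15, 110), (25, 5, 10, 60)]

def logit(p):
    return math.log(p / (1 - p))

D, S = [], []
for tp, fp, fn, tn in studies:
    tpr = (tp + 0.5) / (tp + fn + 1.0)   # continuity-corrected sensitivity
    fpr = (fp + 0.5) / (fp + tn + 1.0)   # continuity-corrected false positive rate
    D.append(logit(tpr) - logit(fpr))    # log diagnostic odds ratio
    S.append(logit(tpr) + logit(fpr))    # threshold proxy

# Ordinary least squares fit of D = a + b*S.
n = len(studies)
s_bar, d_bar = sum(S) / n, sum(D) / n
b = (sum((s - s_bar) * (d - d_bar) for s, d in zip(S, D))
     / sum((s - s_bar) ** 2 for s in S))
a = d_bar - b * s_bar

def sroc_tpr(fpr_value):
    """Sensitivity on the fitted SROC curve at a given false positive rate."""
    return 1 / (1 + math.exp(-((a + (1 + b) * logit(fpr_value)) / (1 - b))))

print(f"a={a:.2f}, b={b:.2f}; SROC sensitivity at 90% specificity: {sroc_tpr(0.10):.2f}")
```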
Variation in studies: Fetal fibronectin in asymptomatic women. SROC (95% CI) for predicting birth before 37 weeks' gestation (28 studies)
Does it make sense to combine? • Do we need studies to be exactly the same? • When can we say we are measuring the same thing?
Are the studies consistent? • Are variations in results between studies consistent with chance? (Test of homogeneity: has low power) • If NO, then WHY? • Variation in study methods (biases) • Variation in intervention • Variation in outcome measure (e.g. timing) • Variation in population
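Below is a minimal sketch of a Cochran's Q test for homogeneity across study estimates (imagined log DORs with their standard errors), together with the I-squared statistic. As the slide notes, the test has low power, so a non-significant Q does not establish that the studies are consistent.

```python
import math

# Cochran's Q and I^2 sketch; study estimates and standard errors are invented.
estimates = [1.2, 0.9, 1.6, 0.7]      # e.g. per-study log DORs
std_errors = [0.30, 0.25, 0.40, 0.35]

weights = [1 / se**2 for se in std_errors]
pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
Q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
df = len(estimates) - 1
i_squared = max(0.0, (Q - df) / Q) * 100  # % of variation beyond chance

# For a p-value, compare Q against a chi-square distribution with df degrees
# of freedom (e.g. scipy.stats.chi2.sf(Q, df)); kept out here to stay stdlib-only.
print(f"Q = {Q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```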
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth • Q – Clearly focused question?
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth • Objective: • To determine the accuracy with which a cervicovaginal fetal fibronectin test predicts spontaneous preterm birth in women with or without symptoms of preterm labour. Q
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth • Q – Clearly focused question? • F – Found all available evidence?
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth • Electronic search: Medline (1966-2000), Embase (1980-2000), PASCAL (1973-2001), BIOSIS (1969-2001), the Cochrane Library (2000:4), MEDION (1974-2000), National Research Register (2000:4), SCISEARCH (1974-2001), and conference papers (1973-2000). • Grey literature: Contacted individual experts and the manufacturer of the fetal fibronectin test. • Cross-checking: Checked reference lists of known reviews and primary articles to identify cited articles not captured by electronic searches. F
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth • Q – Clearly focused question? • F – Found all available evidence? • A – Studies are critically appraised?
Exercise II: Fetal fibronectin for predicting spontaneous preterm birth A