Evidence Based Medicine in the Golden Age
January 2007
Joshua T. Napier, MD, CPT, MC, USA
US Army Clinical Consultant, DoD Pharmacoeconomic Center
Overview
• Rationale for evidence based medicine
• Assessing quality of a paper
• Using available resources
• Statistics
• Conclusion
Rationale for Evidence Based Medicine
• In one study, researchers found that clinical questions arose 3.2 times for every 10 patients seen, but an answer was sought for only 64% of them¹
• Physicians traditionally spend less than 2 minutes seeking an answer to a question; the most common sources are fellow physicians or other colleagues, drug references, or texts
• Medical myths propagated from as far back as medical school become truisms:
• Never use beta-blockers in patients with CHF
• Replacement for vitamin B12 deficiency caused by pernicious anemia must not be done orally
• Patching the eye improves comfort and healing in patients with corneal abrasions
• Patients with an IVC filter should be on 1 mg of coumadin daily
• We are practicing in "the Golden Age" of medical information
¹ Ely et al. BMJ 1999;319:358-61
EBM 1995 vs EBM 2007
1995 style:
• Clinical question arises
• Physician conducts a literature search yielding multiple articles
• Physician selects the best articles, evaluates the research, determines its validity, and decides what to do
• Patient expires in the waiting room
• Few physicians had the time or will to do this on a regular basis
2007 style:
• Clinical question arises
• Physician uses one of many validated, regularly updated online or other electronic resources
• Physician finds patient-oriented evidence in the form of an evidence based clinical practice guideline, an outcomes based systematic review, or a meta-analysis of the published literature
• Physician puts the information into practice
The difference: availability and use of quality work that has already been done!
Important Points to Remember in Evaluating Evidence
• Assessing the quality of what is read
• The hierarchy of published evidence:
• Meta-analyses and systematic reviews
• Randomized, double-blind, placebo-controlled trials
• Cohort studies
• Case-control studies
• Cross-sectional surveys
• Case reports
Assessing Quality of Literature: Evidence Based Medicine (EBM) Principles
• EBM helps to assess the quality of clinical evidence:
• rigorous study design (randomization, blinding, equal treatment of all study groups, fair comparison)
• appropriate statistical tests (baseline adjustments)
• inclusion of patients similar to yours
• appropriate inclusion and exclusion criteria
• appropriate objective clinical endpoints
• appropriate definition of adverse events
• valid conclusions based on studied outcomes and statistical tests
Assessing Quality of Literature: Common Reasons Why Journals Reject Papers
• The study did not address an important scientific issue
• The study was not original (someone else had already done the same or a similar study)
• The study did not actually test the authors' hypothesis
• A different type of study should have been done
• Practical difficulties (in recruiting subjects, for example) led the authors to compromise on the original study protocol
• The sample size was too small
• The study was uncontrolled or inadequately controlled
• The statistical analysis was incorrect or inappropriate
• The authors drew unjustified conclusions from their data
• There is a significant conflict of interest (one of the authors, or a sponsor, might benefit financially from publication, and insufficient safeguards were in place to guard against bias)
• The paper is so badly written that it is incomprehensible
Hierarchy of Evidence: Meta-analyses and Systematic Reviews
• Meta-analyses and systematic reviews:
• Condense the results of several trials (tens to hundreds) into an understandable format
• Can determine differences in efficacy and safety between interventions (drugs, procedures)
• Treatment effect demonstrated as an odds ratio or graphed on a forest plot to show differences
• Results may also be shown as "number needed to treat" (NNT)
• NNT: the number of patients who must be treated with the drug to avoid one outcome (1/ARR; see the later statistics section). Most commonly used treatments have an NNT of 100 or less
• Ex. NNT of 25 for ramipril in patients at high CV risk: for every 25 patients treated with ramipril, one adverse CV event (death, MI, stroke) is avoided (see the sketch below)
• Popular sources: Cochrane, the UK's NICE, AHRQ, OHSU's DERP
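A minimal arithmetic sketch of the NNT calculation, in Python; the event rates below are hypothetical round numbers chosen only to reproduce an NNT of 25, not figures from the ramipril trial:

def nnt(cer: float, eer: float) -> float:
    """Number needed to treat: 1 / absolute risk reduction (ARR = CER - EER)."""
    arr = cer - eer  # absolute risk reduction
    return 1 / arr

# Hypothetical: 18% of high-CV-risk control patients have an adverse CV event
# versus 14% on treatment -> ARR = 0.04 -> treat 25 patients to avoid one event.
print(round(nnt(cer=0.18, eer=0.14)))  # 25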
[Forest plot: meta-analysis of placebo-controlled trials, sleep latency at one week, eszopiclone (1 mg, 2 mg, 3 mg) vs. zolpidem (5 mg, 10 mg); points left of the line favor treatment. Source: DERP review]
Hierarchy of Evidence: Meta-analyses and Systematic Reviews
• Limitations of meta-analyses and other systematic reviews:
• Not always available, especially for new treatments or new questions
• Cochrane reviews are available for gabapentin in diabetic peripheral neuropathic pain, post-herpetic neuralgia, and seizures, but there are no reviews for pregabalin
• Included trials may differ widely in patient populations, background level of health, inclusion/exclusion criteria, and other aspects of study design, which may make true comparisons difficult
• May include individual studies that are not of good quality
Hierarchy of Evidence: Randomized, Double-Blind Controlled Trials
• Meta-analyses
• Randomized, double-blind, controlled trials:
• Surprises may occur when treatments are randomly assigned rather than chosen consciously by clinicians or patients
• Clinical outcomes result from many causes, of which treatment is only one:
• Underlying severity of illness
• Presence of comorbid conditions
• Known and unknown prognostic factors
• In aggregate these tend to swamp any effect of therapy, and they influence the decision to offer the treatment at issue
• Non-randomized studies of efficacy are limited in distinguishing useful from useless or harmful therapy
• Studies where treatment is not allocated randomly tend to show larger (usually "false positive") treatment effects than randomized studies
• Randomization assures (if the sample is large enough) that both known and unknown determinants of outcome are evenly distributed between groups, so the outcome is clinically generalizable
Hierarchy of Evidence: Randomized, Double-Blind Controlled Trials
Advantages:
• Allows rigorous evaluation of a single variable (effect of drug treatment versus placebo, for example) in a precisely defined patient group (postmenopausal women aged 50-60 years)
• Prospective design (data are collected on events that happen after you decide to do the study)
• Uses hypothetico-deductive reasoning (seeks to falsify, rather than confirm, its own hypothesis)
• Potentially eradicates bias by comparing two otherwise identical groups
• Allows for meta-analysis (combining the numerical results of several similar trials at a later date)
Disadvantages:
• Expensive and time consuming, hence:
• Many RCTs are either never done, performed on too few patients, or undertaken for too short a period
• Most are funded by large research bodies (university or government sponsored) or drug companies, which ultimately dictate the research agenda
• Surrogate endpoints are often used in preference to clinical outcome measures
• May introduce "hidden bias," especially through imperfect randomization and failure to blind assessors to randomization status
[Fig 1: Sources of bias to check for in a randomised controlled trial. Greenhalgh T. BMJ 1997;315:305-8]
Randomized, Well-Controlled Research: Myths Dispelled
• Extracranial-intracranial bypass (anastomosis of a branch of the external carotid artery to a branch of the internal carotid artery):
• Performed to prevent strokes in patients whose symptomatic cerebrovascular disease was not surgically accessible
• Evidence for the procedure: comparison of outcomes among non-randomized cohorts, in which surgical patients appeared to fare much better
• Large multicenter randomized trial: the only effect of surgery was that patients were worse off in the immediate post-op period, with no benefit in long-term outcome
• Other surprises from randomized, controlled trials:
• Steroids may increase mortality in sepsis
• Steroid injections do not ameliorate facet-joint back pain
• Plasmapheresis does not benefit patients with polymyositis
Hierarchy of Evidence: Cohort Studies, Case-Control Studies
• Meta-analyses
• Randomized, double-blind controlled trials
• Cohort studies:
• Two groups chosen based on exposure to something (vaccine, drug, treatment, etc.) and followed up to determine outcome
• Follow-up is usually measured in years
• A great way to look at the natural course of a condition or disease
• No randomized control group; may have selection bias
• Case-control studies:
• A group is identified (by disease, exposure, etc.) and "matched" with a similar control group without the particular exposure
• Usually concerned with etiology
• Useful for studying rare conditions
• Hard to show causality: association of A with B does not prove that A caused B
Hierarchy of Evidence: Cross-Sectional Surveys, Case Reports
• Meta-analyses and systematic reviews
• Randomized, double-blind controlled trials
• Cohort studies
• Case-control studies
• Cross-sectional surveys:
• A representative sample of subjects (or patients) is interviewed, examined, or otherwise studied to answer a specific clinical question
• Data are collected at a single time but may refer retrospectively to past experiences, e.g. a study of charts to see how often patients' blood pressure has been recorded in the past five years
• Case reports/case series and anecdotal reports: the lowest level of evidence
• Important for directing future work and raising concern (two reported cases of phocomelia in infants whose mothers used thalidomide)
• Often not peer-reviewed; often only incomplete data available
• Case reports alone are not valid evidence of efficacy, since only a few patients are evaluated
• Modafinil in chronic fatigue syndrome: one case report of success in one patient, versus a randomized, double-blind, placebo-controlled trial in 50 patients that showed no statistically significant improvement in validated outcomes
EBM Checklists: Article About Therapy
Are the results valid?
• Was the assignment of patients to treatment randomized?
• Were all patients who entered the trial properly accounted for and attributed at its conclusion? Was follow-up complete? Were patients analyzed in the groups to which they were randomized (intention-to-treat analysis)?
• Were patients, their clinicians, and study personnel "blind" to treatment?
• Were the groups similar at the start of the trial? Were baseline prognostic factors (demographics, comorbidity, disease severity, other known confounders) balanced? If different, were these adjusted for?
• Aside from the experimental intervention, were the groups treated equally? Co-intervention? Contamination? Compliance?
What are the results?
• How large is the treatment effect? Absolute risk reduction? Relative risk reduction? (See the sketch after this checklist.)
• Did the study have a sufficiently large sample size?
• How precise is the estimate of the treatment effect? Confidence intervals?
Will the results help me in patient care?
• Can the results be applied to my patients? Are patients similar for demographics, severity, comorbidity, and other prognostic factors? Is there a compelling reason why they should not be applied?
• Were all clinically relevant outcomes considered? Are surrogate endpoints valid?
• Are the benefits worth the harms and costs? NNT for different outcomes?
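A minimal sketch, with made-up trial counts, of the effect-size questions above: ARR, RRR, NNT, and a 95% confidence interval for the ARR using the normal approximation to the risk difference:

import math

# Hypothetical 2x2 trial results (not from any real study).
events_ctrl, n_ctrl = 40, 200   # 20% event rate in the control arm
events_trt, n_trt = 24, 200     # 12% event rate in the treatment arm

cer = events_ctrl / n_ctrl      # control event rate
eer = events_trt / n_trt        # experimental event rate

arr = cer - eer                 # absolute risk reduction
rrr = arr / cer                 # relative risk reduction
nnt = math.ceil(1 / arr)        # number needed to treat, conventionally rounded up

# 95% CI for the ARR via the normal approximation to the risk difference.
se = math.sqrt(cer * (1 - cer) / n_ctrl + eer * (1 - eer) / n_trt)
lo, hi = arr - 1.96 * se, arr + 1.96 * se

print(f"ARR {arr:.3f} (95% CI {lo:.3f} to {hi:.3f}), RRR {rrr:.0%}, NNT {nnt}")
# ARR 0.080 (95% CI 0.009 to 0.151), RRR 40%, NNT 13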
EBM Checklists: Article About Diagnostic Tests
Are the results valid?
• Was there an independent, blind comparison with a reference standard? Is the reference standard used acceptable? Were both the reference standard and the test applied to all patients?
• Did the patient sample include an appropriate spectrum of patients to whom the test will be applied?
• Did the results of the test being evaluated influence the decision to perform the reference standard ("verification" or "work-up" bias)?
• Were the test's methods described clearly enough to permit replication? Preparation of the patient? Performance of the test? Analysis and interpretation of results?
• Did the study have a sufficiently large sample size?
What are the results?
• What are the likelihood ratios for the test results?
Will the results help me in patient care?
• Will the test be reproducible and well interpreted in my practice setting?
• Are the results applicable to my patients? Similar distribution of disease severity? Similar distribution of competing diseases? Compelling reasons why the results should not be applied?
• Will the test results change my management? Test and treatment thresholds? High or low LRs?
• Will my patients be better off because of the test? Is the target disorder dangerous if left undiagnosed? Is the test risk acceptable? Does effective treatment exist? Will information from the test lead to a change of management beneficial to the patient?
EBM Checklists: Meta-Analyses and Systematic Reviews
Are the results valid?
• Did the overview address a focused clinical question? Patients? Exposures? Outcomes? Therapy? Causation? Diagnosis? Prognosis?
• Were the criteria used to select articles for inclusion appropriate? Patients? Exposures? Outcomes? Methodological standards?
• Is it unlikely that important, relevant studies were missed? Bibliographic databases? Reference lists? Personal contacts?
• Was the validity of the included studies appraised? Validity criteria?
• Were assessments of studies reproducible? Blinded reviewers? Inter-observer agreement?
• Were the results similar from study to study? Tests of homogeneity?
What are the results?
• What are the overall results of the review? Overall ORs, RRs? Weighting of studies? (See the pooling sketch after this checklist.)
• How precise were the results? Confidence intervals?
• Did the study have a sufficiently large sample size?
Will the results help me in patient care?
• Can the results be applied to my patients? Are patients similar for demographics, severity, comorbidity, and other prognostic factors? Is there a compelling reason why they should not be applied?
• Were all clinically relevant outcomes considered? Are surrogate endpoints valid?
• Are the benefits worth the harms and costs? NNT for different outcomes?
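To make "weighting of studies" concrete, here is a minimal sketch, with invented per-study counts, of one common approach: fixed-effect inverse-variance pooling of log odds ratios, the kind of calculation behind an overall OR on a forest plot:

import math

# Invented per-study 2x2 counts: (events_trt, n_trt, events_ctrl, n_ctrl).
studies = [(12, 100, 20, 100),
           (30, 250, 45, 250),
           (8, 80, 14, 80)]

weights, weighted_logs = [], []
for a, n1, c, n2 in studies:
    b, d = n1 - a, n2 - c                  # non-events in each arm
    log_or = math.log((a * d) / (b * c))   # log odds ratio for this study
    var = 1/a + 1/b + 1/c + 1/d            # approximate variance of the log OR
    weights.append(1 / var)                # inverse-variance weight
    weighted_logs.append(log_or / var)

pooled = sum(weighted_logs) / sum(weights)
se = math.sqrt(1 / sum(weights))
print(f"pooled OR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(pooled - 1.96 * se):.2f} to {math.exp(pooled + 1.96 * se):.2f})")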
Statistics Commonly Used
• Relative Risk (RR) = EER/CER (EER = experimental event rate; CER = control event rate)
• Relative Risk Reduction (RRR) = (CER - EER)/CER
• Absolute Risk Reduction (ARR) = CER - EER
• Number Needed to Treat (NNT) = 1/ARR
• Sensitivity = TP/(TP + FN); Specificity = TN/(TN + FP)
• Negative Likelihood Ratio (-LR) = (1 - sensitivity)/specificity
• Positive Likelihood Ratio (+LR) = sensitivity/(1 - specificity)
• The likelihood ratio incorporates both the sensitivity and specificity of the test and provides a direct estimate of how much a test result will change the odds of having a disease. The +LR tells you how much the odds of the disease increase when a test is positive; the -LR tells you how much the odds decrease when a test is negative.
• You combine the likelihood ratio with information about the prevalence of the disease, the characteristics of your patient pool, and this particular patient to determine the post-test odds of disease.
• To quantify the effect of a diagnostic test, first specify the pre-test odds: usually related to the prevalence of the disease in the population, adjusted upward or downward for the characteristics of your overall patient pool or of the individual patient.
• Once you have specified the pre-test odds, multiply them by the likelihood ratio to get the post-test odds (worked sketch below).
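A small sketch with made-up numbers tying the diagnostic-test formulas together: compute the likelihood ratios from sensitivity and specificity, then carry a pre-test probability through to a post-test probability:

# Hypothetical test characteristics (not from any real study).
sens, spec = 0.90, 0.85

lr_pos = sens / (1 - spec)    # +LR = sensitivity / (1 - specificity) = 6.0
lr_neg = (1 - sens) / spec    # -LR = (1 - sensitivity) / specificity ~ 0.12

# Pre-test probability, e.g. disease prevalence adjusted for your patient pool.
pretest_p = 0.10
pretest_odds = pretest_p / (1 - pretest_p)    # probability -> odds

post_odds = pretest_odds * lr_pos             # after a positive test
post_p = post_odds / (1 + post_odds)          # odds -> probability

print(f"+LR {lr_pos:.1f}, -LR {lr_neg:.2f}; "
      f"post-test probability after a positive test: {post_p:.0%}")
# +LR 6.0, -LR 0.12; post-test probability after a positive test: 40%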
Conclusion
• Physicians have been using evidence to make medical decisions since the early 1900s; the problem was that the research wasn't always good
• There are well established techniques for critical appraisal of published literature of all types
• A variety of available resources professionally evaluate the evidence, most of them online
• Use the quality work that has already been done