Plan

Presentation Transcript


  1. Plan • GRADE background • certainty in evidence (quality, confidence evidence) • evidence profiles • strength of recommendation • exercises in applying GRADE

  2. experience participating in guideline panels? • clinical epidemiology methodology course? • is grading recommendations a good idea? if so, why? • experience with grading systems in use?

  3. Grading good idea, but which grading system to use? • many available • Australian National Health and Medical Research Council (NHMRC) • Oxford Centre for Evidence-Based Medicine • Scottish Intercollegiate Guidelines Network (SIGN) • US Preventive Services Task Force • American professional organizations • AHA/ACC, ACCP, AAP, Endocrine Society, etc. • cause of confusion, dismay

  4. Common international grading system? • GRADE (Grading of Recommendations Assessment, Development and Evaluation) • international group • Australian NHMRC, SIGN, USPSTF, WHO, NICE, Oxford CEBM, CDC, CC • ~ 35 meetings over last 14 years • (~10 – 70 attendees)

  5. GRADE GUIDANCE • 2004 BMJ, first description • 2008 BMJ six part series • for guideline users • 2010-13, 21 part series, 15 published • for systematic review authors, HTA practitioners, guideline developers

  6. Grading system – for what? • interventions • management strategy 1 versus 2 • what GRADE is not about • individual studies (it rates the body of evidence)

  7. What GRADE is not primarily about • diagnostic accuracy questions • in patients with a sore leg, what is the accuracy of a blood test (D-dimer) in sorting out whether a deep venous thrombosis is the cause of the pain? • prognosis • what it is about: diagnostic impact • are patients better off (improved outcomes) when doctors use the D-dimer test?

  8. 80+ Organizations [figure: organizations by year, 2005 to 2011]

  9. GRADE uptake

  10. What are we grading? • two components • certainty in estimate of effect adequate to support decision (quality of body of evidence) • high, moderate, low, very low

  11. Likelihood of and confidence in an outcome

  12. Semantic Issue: Label for trustworthiness • Quality • initial choice, defined as confidence • natural to clinicians, but confusion with risk of bias • Confidence • what we actually mean, but confusion with confidence intervals, and experts always confident • Certainty • avoids both confusions, experts might acknowledge uncertainty • current preferred term

  13. What are we grading? • two components • certainty in evidence adequate to support decision (quality of body of evidence) • high, moderate, low, very low • strength of recommendation • strong and weak • weak alternatives • conditional, contingent, discretionary

  14. Health care question (PICO) • systematic reviews of studies (S1 to S5) • outcomes (OC1 to OC4), classified as critical or important • generate an estimate of effect for each outcome • rate the quality of evidence for each outcome, across studies • RCTs start high, observational studies start low • rate down (-): study limitations, imprecision, inconsistency of results, indirectness of evidence, publication bias likely • rate up (+): large magnitude of effect, dose response, plausible confounders would ↓ effect when an effect is present or ↑ effect if effect is absent • final rating of quality for each outcome: high, moderate, low, or very low • rate overall quality of evidence (lowest quality among critical outcomes) • decide on the direction (for/against) and grade strength (strong/weak*) of the recommendation considering: quality of the evidence, balance of desirable/undesirable outcomes, values and preferences • decide if any revision of direction or strength is necessary considering: resource use • *also labeled “conditional” or “discretionary”
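
The rating steps in this flow can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and the idea of counting "levels" to rate down or up are hypothetical conveniences, and in practice each rating decision is a judgment, not arithmetic.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_certainty(study_design, rate_down=0, rate_up=0):
    """Return the certainty rating for one outcome across studies.

    study_design: "rct" starts high, "observational" starts low.
    rate_down: total levels removed for risk of bias, inconsistency,
               indirectness, imprecision, publication bias.
    rate_up:   total levels added for large effect, dose response,
               or plausible confounding that works against the finding.
    """
    start = {"rct": 3, "observational": 1}[study_design]
    index = start - rate_down + rate_up
    # The rating cannot go below "very low" or above "high".
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


def overall_certainty(critical_outcome_ratings):
    """Overall rating is the lowest rating among the critical outcomes."""
    return min(critical_outcome_ratings, key=LEVELS.index)
```

For example, `rate_certainty("rct", rate_down=1)` gives "moderate", and `overall_certainty(["high", "low", "moderate"])` gives "low".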

  15. Structured question • patients: • males over 50 presenting with fatigue, malaise and erectile dysfunction with laboratory evidence of decreased testosterone • intervention: testosterone • comparator: no testosterone • outcomes?

  16. Rating certainty • Where to start RCTs and observational studies (High, moderate, low, very low)? • Recall antioxidant vitamins • Observational studies less cancer, CV outcomes • RCTs no difference • Result observed repeatedly • What went wrong?

  17. Determinants of confidence • RCTs start high • observational studies start low • what can lower confidence? • risk of bias • inconsistency • indirectness • imprecision • publication bias

  18. Risk of Bias - RCTs • what to consider? • well established • concealment • intention to treat principle observed • blinding • completeness of follow-up • more recent • selective outcome reporting bias • Stopping early for benefit

  19. RoB – Observational Studies • what to consider? • accurate assessment of exposure • adjusted analysis for all important prognostic factors, accurately measured • accurate assessment of outcome • completeness of follow-up

  20. Risk of Bias differs – what to do? • 6 studies, 100 patients each • 3 studies low risk of bias, 3 high • rate down for risk of bias?

  21. Consistency

  22. Consistency

  23. Consistency of results • How did you decide? • Similarity of point estimates • less similar, less happy • Overlap of confidence intervals • less overlap, less happy

  24. [Forest plot; x-axis: RRR (95% CI), scale from -40 to 56]

  25. Homogeneous • test for heterogeneity: what is the p-value? • what is the null hypothesis for the test for heterogeneity? • H0: RR1 = RR2 = RR3 = RR4 • p = 0.99 for heterogeneity

  26. Heterogeneous • test for heterogeneity: what is the p-value? • p-value for heterogeneity < 0.001

  27. I² Interpretation • 0%: no worries • 25%: only a little concerned • 50%: getting concerned • 75%: very concerned • 100%: why are we pooling?

  28. Homogeneous • what is the I²? • p = 0.99 for heterogeneity • I² = 0%

  29. Heterogeneous • what is the I²? • p-value for heterogeneity < 0.001 • I² = 89%
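
For readers who want to see where I² comes from, here is a minimal sketch. It assumes the usual definition, I² = (Q - df) / Q with Q being Cochran's Q computed from inverse-variance weights, applied to log relative risks. The function names and the inputs in the usage note are hypothetical, not data from the slides.

```python
def cochran_q(effects, std_errors):
    """Cochran's Q: weighted squared deviations from the pooled estimate.

    effects: per-study effect estimates (e.g. log relative risks).
    std_errors: their standard errors.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def i_squared(effects, std_errors):
    """I² as a percentage: share of variability beyond what chance explains."""
    q = cochran_q(effects, std_errors)
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0
```

Four studies with identical estimates give an I² of 0%. Two equally precise studies with estimates of -0.5 and 0.3 (standard error 0.1 each) give Q = 32 on 1 degree of freedom, so I² is about 97%.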

  30. Relative Risk with 95% CI for Vitamin D Non-vertebral Fractures

  31. Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose >400)

  32. Relative Risk with 95% CI for Vitamin D (Non-Vertebral Fractures, Dose = 400)

  33. Should we believe sub-group analysis? • within-study comparison? No • unlikely chance? Yes, p = 0.006 • consistent across studies? Yes • one of small number of a priori hypotheses with direction? Yes • biologically compelling? Yes

  34. Credibility of sub-group analysis • scale from 0 (no way) to 100 (sure thing)

  35. Confidence judgments: Directness • populations • older, sicker or more co-morbidity • interventions • warfarin in trials vs clinical practice • outcomes • important versus surrogate outcomes • glucose control versus CV events

  36. Directness • interested in A versus B • available data: A vs C, B vs C • example: alendronate (A), risedronate (B), placebo (C)
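
One common way to estimate A versus B from A vs C and B vs C data is an adjusted indirect comparison, named here as an illustration and not taken from the slide. The sketch below assumes relative risks on the log scale and independent trials. The function name and any inputs are hypothetical.

```python
import math


def indirect_comparison(log_rr_ac, se_ac, log_rr_bc, se_bc):
    """Adjusted indirect comparison of A vs B through common comparator C.

    Returns (rr_ab, ci_low, ci_high) with an approximate 95% CI.
    The variances add, so the indirect estimate is always less
    precise than either direct comparison.
    """
    log_rr_ab = log_rr_ac - log_rr_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    ci_low = math.exp(log_rr_ab - 1.96 * se_ab)
    ci_high = math.exp(log_rr_ab + 1.96 * se_ab)
    return math.exp(log_rr_ab), ci_low, ci_high
```

The wider interval is one reason indirect evidence of this kind may warrant rating down for indirectness: even when both drugs look equally good against placebo, the A versus B estimate stays uncertain.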

  37. Imprecision • small sample size • small number of events • wide confidence intervals • uncertainty about magnitude of effect • how do you decide what is too wide? • primary criterion: • would decisions differ at ends of CI

  38. Precision • atrial fib at risk of stroke • warfarin increases serious GI bleeding • 3% per year • 1,000 patients, 1 fewer stroke • 30 more bleeds for each stroke prevented • 1,000 patients, 100 fewer strokes • 3 strokes prevented for each bleed • where is your threshold? • how many strokes in 100 with 3% bleeding?
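
The arithmetic behind these two scenarios can be checked with a short sketch. The function name is hypothetical; the 3% bleeding risk and the 1,000-patient cohort are the slide's figures.

```python
def bleeds_per_stroke_prevented(strokes_prevented, bleed_risk=0.03, patients=1000):
    """Extra serious bleeds incurred for each stroke prevented.

    strokes_prevented: strokes avoided in the treated cohort.
    bleed_risk: added yearly risk of serious GI bleeding on warfarin.
    patients: cohort size the stroke count refers to.
    """
    extra_bleeds = bleed_risk * patients
    return extra_bleeds / strokes_prevented


# One end of the confidence interval: 1 fewer stroke per 1,000 patients.
few_strokes = bleeds_per_stroke_prevented(1)

# Other end: 100 fewer strokes per 1,000 patients.
many_strokes = bleeds_per_stroke_prevented(100)
strokes_per_bleed = 1 / many_strokes
```

With 1 stroke prevented the trade is 30 bleeds per stroke. With 100 prevented it is about 3.3 strokes per bleed, which the slide rounds to 3. A decision that flips between these two ends is the signal to rate down for imprecision.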

  39-42. [Figures not recovered; axis labels: 1.0% and 0]

  43. Example: clopidogrel or ASA? • pts with threatened stroke • RCT of clopidogrel vs ASA • 19,185 patients • ischaemic stroke, MI, or vascular death compared • 939 events (5.32%) with clopidogrel • 1021 events (5.83%) with aspirin • RR 0.91 (95% CI 0.83 – 0.99) (p = 0.043) • rate down for precision?

  44. Clopidogrel or ASA for threatened vascular events • RCT 19,185 patients • RR 0.91 (95% CI 0.83 – 0.99) • [figure labels: 1.7%, 0.9%, 0.1%; axis marks 1.0% and 0]
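
The percentages on this slide are consistent with converting the relative risk and its confidence limits into absolute risk reductions. The sketch below shows that conversion. The 10% baseline risk is an assumption, chosen only because it reproduces 0.9%, 0.1% and 1.7%; reading 1.0% as a decision threshold is also an assumption. The function name is hypothetical.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Absolute risk reduction implied by a relative risk at a given baseline."""
    return baseline_risk * (1.0 - relative_risk)


BASELINE = 0.10    # assumed control-group risk, not stated on the slide
THRESHOLD = 0.01   # assumed smallest reduction that would change the decision

point = absolute_risk_reduction(BASELINE, 0.91)   # point estimate, RR 0.91
lower = absolute_risk_reduction(BASELINE, 0.99)   # upper RR limit, smallest benefit
upper = absolute_risk_reduction(BASELINE, 0.83)   # lower RR limit, largest benefit

# If the interval spans the threshold, the two ends support different decisions.
crosses_threshold = lower < THRESHOLD < upper
```

Under these assumptions the interval runs from 0.1% to 1.7% around a point estimate of 0.9%, so it includes the 1.0% mark and rating down for imprecision is worth considering.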

  45. small trials, large effect • likely to be overestimate • analogy to stopping early • lack of prognostic balance • solution: optimal information size • # of pts from conventional sample size calculation • specify control group risk, α, β, Δ

  46. Fluoroquinolone prophylaxis in neutropenia: infection-related mortality Total number of events: 47

  47. Fluoroquinolone prophylaxis in neutropenia: infection-related mortality • sample size 1,002 • optimal information size: α 0.05, β 0.20, Δ 0.25 RRR, CER 7% • N = 6,000
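
A conventional two-group sample size calculation with these inputs lands close to the slide's N = 6,000. The sketch below assumes the simple normal-approximation formula for comparing two proportions, with a two-sided α and no continuity correction. The slide does not say which formula was used, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def optimal_information_size(control_event_rate, rrr, alpha=0.05, beta=0.20):
    """Total patients needed to detect a relative risk reduction.

    control_event_rate: risk in the control group (CER).
    rrr: relative risk reduction to detect (the slide's delta).
    alpha: two-sided type I error; beta: type II error (power = 1 - beta).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    p_control = control_event_rate
    p_treated = control_event_rate * (1 - rrr)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    n_per_group = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2
    return 2 * math.ceil(n_per_group)


ois = optimal_information_size(control_event_rate=0.07, rrr=0.25)
```

The result is several times larger than the 1,002 patients actually enrolled, which is the slide's argument for rating down for imprecision despite an apparently large effect.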

  48. Publication bias • high likelihood could lower quality • when to suspect • number of small studies • industry sponsored
