Grading evidence and recommendations: Starting with GRADE basics vs. utilizing the full framework. AHRQ Annual Meeting 2010: “Better Care, Better Health: Delivering on Quality for All Americans,” September 28, 2010. Yngve Falck-Ytter, M.D., Associate Professor of Medicine
Grading evidence and recommendations: Starting with GRADE basics vs. utilizing the full framework. AHRQ Annual Meeting 2010: “Better Care, Better Health: Delivering on Quality for All Americans,” September 28, 2010. Yngve Falck-Ytter, M.D., Associate Professor of Medicine, Case Western Reserve University, Cleveland, Ohio. Holger Schünemann, M.D., Ph.D., Chair, Department of Clinical Epidemiology & Biostatistics, Michael Gent Chair in Healthcare Research, McMaster University, Hamilton, Canada
Disclosures In the past 5 years, Dr. Falck-Ytter received no personal payments for services from industry. His research group received research grants from Three Rivers, Valeant and Roche that were deposited into non-profit research accounts. He is a member of the GRADE working group which has received funding from various governmental entities in the US and Europe, such as the AHRQ. Some of the GRADE work he has done is supported in part by grant # 1 R13 HS016880-01 from the Agency for Healthcare Research and Quality (AHRQ).
Content Part 1 • A 7 minute version of GRADE Part 2 • Rapid interactive exchange contrasting GRADE basic vs. the full GRADE approach • Advantages of a structured approach • Asking good clinical questions • Systematic review vs. ad hoc approaches • Grading the quality of evidence • How to determine the strength of recommendations
Question to the audience Decisions in your medical practice are based on: • Training, experience and knowledge of respected colleagues • Patient preferences • Convincing evidence (non experimental) from case reports, case series, disease mechanism • RCTs, systematic reviews of RCTs and meta-analyses • All of the above
Evidence-based clinical decisions • Clinical circumstances • Patient values and preferences • Research evidence • Expertise (Haynes et al. 2002)
A real-world example… P: In patients with acute hepatitis C… I: Should anti-viral treatment be used… C: Compared to no treatment… O: To achieve viral clearance? Evidence Recommendation Organization B Class I AASLD (2009) II-1 “Should be initiated…” VA (2006) 1+ A SIGN (2006) -/- -/- “Most authorities…” B “It works…” AGA (2006) AWMF (2004)
Question to the audience By now… • …you are thoroughly confused • …you send her to a doctor because treatment is recommended • …you send her to a doctor but she can expect that, according to guidelines, she will not be treated • …you look at the evidence yourself because past experience tells you that guidelines don’t help
GRADE is outcome-centric • Old system: a single grade for the body of evidence (e.g., I, II, III, V, B) • GRADE: quality is rated for each outcome • Outcome #1 Quality: High • Outcome #2 Quality: Moderate • Outcome #3 Quality: Low
Systematic review • Formulate question (P, I, C, O) • Select outcomes • Rate importance of each outcome (critical, important, less important) • Outcomes across studies: create evidence profile with GRADEpro • Rate quality of evidence for each outcome (high, moderate, low, very low): RCTs start high, observational data start low • Grade down for: risk of bias, inconsistency, indirectness, imprecision, publication bias • Grade up for: large effect, dose response, confounders • Summary of findings & estimate of effect for each outcome
Guideline development • Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes • Panel formulates recommendations: for or against (direction), strong or weak (strength) • By considering: quality of evidence, balance benefits/harms, values and preferences • Revise if necessary by considering: resource use (cost) • Wording: “We recommend using…” • “We suggest using…” • “We recommend against using…” • “We suggest against using…”
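The four recommendation phrasings at the end of this slide follow from two choices: direction (for or against) and strength (strong or weak). A minimal Python sketch of that mapping is shown here. The function name and the string labels are illustrative assumptions, not part of GRADE or GRADEpro.

```python
def recommendation_stem(direction: str, strength: str) -> str:
    """Return the opening phrase of a recommendation.

    direction: "for" or "against" (hypothetical labels)
    strength:  "strong" or "weak" (hypothetical labels)
    """
    if direction not in ("for", "against"):
        raise ValueError("direction must be 'for' or 'against'")
    if strength not in ("strong", "weak"):
        raise ValueError("strength must be 'strong' or 'weak'")
    # Strong recommendations use "recommend"; weak ones use "suggest".
    verb = "recommend" if strength == "strong" else "suggest"
    # A recommendation against the intervention adds "against".
    against = " against" if direction == "against" else ""
    return f"We {verb}{against} using…"


if __name__ == "__main__":
    for direction in ("for", "against"):
        for strength in ("strong", "weak"):
            print(direction, strength, "->", recommendation_stem(direction, strength))
```

Keeping direction and strength as separate inputs mirrors the slide, where the panel decides each one explicitly.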
Question to the audience Which question follows a well structured clinical PICO format: • What is the evidence that food allergens cause eosinophilic esophagitis? • Is it known what the evidence is that aspirin can prevent progression of dysplasia to cancer in Barrett’s esophagus? • In patients undergoing hip replacement, does warfarin compared to aspirin reduce venous thromboembolism, pulmonary embolism and mortality?
That’s an excellent question • Translating informal clinical questions into specific PICO questions is central to GRADE • Even if an organization has limited resources, taking care of this step actually saves resources: • Helps limit your scope • Specifies the search strategy more clearly • Guides data extraction • Helps with formulating recommendations
Importance of outcomes Deciding on the importance of outcomes for decision making, on a scale of 1 to 9: 1–3 Less important • 4–6 Important • 7–9 Critically important P: In patients after hip replacement… I: Should warfarin rather than… C: Aspirin be given… O: To reduce symptomatic venous thromboembolism and mortality?
Question to the audience Deciding on the importance of outcomes for decision making (scale of 1 to 9). Please rate outcome: Dying from pulmonary embolism • (1, 2, 3): Less important for decision making • (4, 5, 6): Important for decision making • (7, 8, 9): Critically important for decision making
Question to the audience Deciding on the importance of outcomes for decision making (scale of 1 to 9). Please rate outcome: Asymptomatic deep vein thrombosis in the calf (e.g., as seen on mandatory venography at end of study) • (1, 2, 3): Less important for decision making • (4, 5, 6): Important for decision making • (7, 8, 9): Critically important for decision making
Question to the audience Deciding on the importance of outcomes for decision making (scale of 1 to 9). Please rate outcome: Stomach ulcer bleeding requiring endoscopy • (1, 2, 3): Less important for decision making • (4, 5, 6): Important for decision making • (7, 8, 9): Critically important for decision making
Question to the audience Deciding on the importance of outcomes for decision making (scale of 1 to 9). Please rate outcome: Regular blood work and dose adjustments • (1, 2, 3): Less important for decision making • (4, 5, 6): Important for decision making • (7, 8, 9): Critically important for decision making
Rating the importance of outcomes • Train content experts to identify the outcomes that are critical for decision making • Rating is done before, during and after the evidence review • The rating may change in light of new information
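The 1 to 9 scale used in the preceding audience questions maps onto three categories. As a small sketch, assuming a single integer rating per outcome (the function name is hypothetical and not part of any GRADE tool), the binning could be written in Python as:

```python
def importance_category(rating: int) -> str:
    """Map a 1-9 importance rating to the category used on the slides."""
    # Reject booleans, non-integers, and values outside the 1-9 scale.
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 9:
        raise ValueError("rating must be an integer from 1 to 9")
    if rating <= 3:
        return "less important"        # ratings 1, 2, 3
    if rating <= 6:
        return "important"             # ratings 4, 5, 6
    return "critically important"      # ratings 7, 8, 9


if __name__ == "__main__":
    for rating in range(1, 10):
        print(rating, importance_category(rating))
```

Because the rating may change in light of new information, a function like this would simply be re-run whenever the panel revises a rating.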
Taking it to the next level • Early involvement of consumers in the guideline development process • Selecting systematic reviews that are known to make an effort to include consumer views (e.g., Cochrane etc.) • Can be used to identify research gaps
Evidence review stage What format of evidence do you use? (Decision tree; node labels in extracted order, with cost markers $$$ and $) $$$ • Using mainly systematic reviews (SR) • Mainly using single study data • Have the resources • Don’t have the resources • Ready to use SR • Not ready to use SR • Out-source • Do it in-house • Search for SR • Use GRADE without evidence profiles • Update SR • Ad hoc reviews • $ • Utilize the full GRADE framework (± evidence profiles)
Question to the audience Select the best answer: You can find high quality systematic reviews for “free” here: • AHRQ • The Cochrane Library • Canadian Agency for Drugs and Technologies in Health (CADTH) • National Institute for Clinical Excellence (NICE), UK • All of the above
Taking it to the next level • What to look for when selecting evidence review centers • Commissioning systematic reviews: Making sure the center understands GRADE requirements • What SR methodology they use • What databases they can search • What software they use • How they document their work
Question to the audience GRADE rating evidence: The quality of evidence may need downgrading if: • The outcome is reduction of elevated pressure in the eye (IOP) instead of loss of vision • There are large losses to follow-up • Some trials show benefits while others report harms • The confidence interval is wide and there are few events • All of the above
Quality of evidence: beyond risk of bias Definition: The extent to which our confidence in an estimate of the treatment effect is adequate to support a particular recommendation • Methodological limitations (risk of bias): allocation concealment, blinding, intention-to-treat, follow-up, stopped early • Inconsistency of results • Indirectness of evidence (sources of indirectness: indirect comparisons, patients, interventions, comparators, outcomes) • Imprecision of results • Publication bias
Quality assessment criteria Study design and starting quality of evidence: Randomized trials start High • Observational studies start Low. Quality of evidence: High • Moderate • Low • Very low. Lower if…: Study limitations (design and execution) • Inconsistency • Indirectness • Imprecision • Publication bias. Higher if…: What can raise the quality of evidence?
Question to the audience A systematic review of observational studies showed a relationship between front sleeping position (versus back position) and sudden infant death syndrome (SIDS): OR 2.93 (1.15, 7.47). Rate the quality of evidence for the outcome SIDS: • High • Moderate • Low • Very low
Question to the audience You review all colonoscopies for average risk screening in your health system and document the percentage of patients who developed a perforation after the procedure (evidence of free air on imaging). No comparison group without colonoscopy is available. Rate the quality of evidence for the outcome perforation: • High • Moderate • Low • Very low
Question to the audience Several RCTs have shown the effectiveness of natalizumab to induce remission in Crohn’s disease. Study/post-marketing data showed 31 cases of potentially lethal progressive multifocal leukoencephalopathy (PML, JC virus related). Rate the quality of evidence for PML: • High • Moderate • Low • Very low
Quality assessment criteria Study design and starting quality of evidence: Randomized trials start High • Observational studies start Low. Quality of evidence: High • Moderate • Low • Very low. Lower if…: Study limitations (design and execution) • Inconsistency • Indirectness • Imprecision • Publication bias. Higher if…: Large effect (e.g., RR 0.5) • Very large effect (e.g., RR 0.2) • Evidence of dose-response gradient • All plausible confounding would reduce a demonstrated effect
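The criteria on this slide can be read as a simple procedure: pick the starting level from the study design, move down for each "lower if" factor, and move up for each "higher if" factor. The Python sketch below illustrates that reading under loud assumptions: the factor labels and function name are hypothetical, the number of levels per factor is supplied by the rater, and adding the levels and clamping the result to the four-level scale is a simplification of what is a panel judgment in practice.

```python
LEVELS = ("very low", "low", "moderate", "high")
START = {"randomized": "high", "observational": "low"}
DOWNGRADE_FACTORS = {
    "study limitations", "inconsistency", "indirectness",
    "imprecision", "publication bias",
}
UPGRADE_FACTORS = {"large effect", "dose-response gradient", "plausible confounding"}


def rate_outcome(design, down=None, up=None):
    """Rate one outcome.

    design: "randomized" or "observational"
    down:   dict of downgrade factor -> levels to subtract (rater's judgment)
    up:     dict of upgrade factor -> levels to add (rater's judgment)
    """
    down = down or {}
    up = up or {}
    unknown = (set(down) - DOWNGRADE_FACTORS) | (set(up) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    if any(n < 0 for n in list(down.values()) + list(up.values())):
        raise ValueError("levels must be non-negative")
    # Start from the level implied by study design, then grade down and up.
    level = LEVELS.index(START[design])
    level = level - sum(down.values()) + sum(up.values())
    # Assumption: clamp to the scale (cannot go below very low or above high).
    return LEVELS[max(0, min(level, len(LEVELS) - 1))]


if __name__ == "__main__":
    print(rate_outcome("randomized", down={"imprecision": 1}))
    print(rate_outcome("observational", up={"large effect": 1}))
```

Recording each factor by name, instead of a single net number, keeps the reasons for the final rating visible.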
“Categories” of quality (1) • High: Further research is very unlikely to change our confidence in the estimate of effect • Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate • Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate • Very low: Any estimate of effect is very uncertain
Conceptualizing quality (2) • High: We are very confident that the true effect lies close to that of the estimate of the effect. • Moderate: We are moderately confident in the estimate of effect: The true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different. • Low: Our confidence in the effect is limited: The true effect may be substantially different from the estimate of the effect. • Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
Taking it to the next level • Advantages of systematically assessing quality of evidence • Downgrading and upgrading “on-the-fly” can introduce errors
Question to the audience PICO: Should children with otitis media be treated with antibiotics? Rate the overall quality of evidence for this clinical question by evaluating all critical outcomes (use the evidence profile): • High • Moderate • Low • Very low
Quality rating outcomes across studies Clinical question (P, I, C, O) • Select outcomes • Rate importance of each outcome (critical, important, less important) • Grade down or up (high, moderate, low, very low) • Overall quality of evidence Panel • Formulate recommendations: • For or against (direction) • Strong or weak (strength) • By considering: • Quality of evidence • Balance benefits/harms • Values and preferences • Revise if necessary by considering: • Resource use (cost)
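The process overview earlier in the deck states that overall quality across outcomes is based on the lowest quality among the critical outcomes. A minimal Python sketch of that rule follows. The function name, tuple layout, and outcome names are placeholders for illustration and do not come from an actual evidence profile.

```python
RANK = {"very low": 0, "low": 1, "moderate": 2, "high": 3}


def overall_quality(outcomes):
    """Return the lowest quality rating among critical outcomes.

    outcomes: iterable of (name, importance, quality) tuples, where
    importance is "critical", "important", or "less important".
    """
    critical = [quality for _, importance, quality in outcomes
                if importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    # Only critical outcomes count; the weakest one sets the overall rating.
    return min(critical, key=lambda quality: RANK[quality])


if __name__ == "__main__":
    # Placeholder outcomes, not taken from a real evidence profile.
    example = [
        ("outcome A", "critical", "high"),
        ("outcome B", "critical", "moderate"),
        ("outcome C", "important", "low"),
    ]
    print(overall_quality(example))
```

In this example the low rating of the merely "important" outcome does not pull the overall rating down, since only critical outcomes are considered.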
Question to the audience PICO: Should children with otitis media be treated with antibiotics? Rate the overall strength of recommendation: • “We recommend early antibiotics in children with acute otitis media” • “We suggest early antibiotics…” • “We suggest against using antibiotics initially…” • “We recommend against using antibiotics initially…”
Strength of recommendation “The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
4 determinants of the strength of recommendation Factors that can weaken the strength of a recommendation, each with its explanation: • Lower quality evidence: The higher the quality of evidence, the more likely a strong recommendation is warranted. • Uncertainty about the balance of benefits versus harms and burdens: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted. • Uncertainty or differences in patients’ values: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted. • Uncertainty about whether the net benefits are worth the costs: The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.
Implications of a strong recommendation • Patients: Most people in this situation would want the recommended course of action and only a small proportion would not • Clinicians: Most patients should receive the recommended course of action • Policy makers: The recommendation can be adopted as a policy in most situations
Implications of a weak recommendation • Patients: The majority of people in this situation would want the recommended course of action, but many would not • Clinicians: Be prepared to help patients make a decision that is consistent with their own values (decision aids and shared decision making) • Policy makers: There is a need for substantial debate and involvement of stakeholders
Taking it to the next level • Explicit separation of quality of evidence from making recommendations • Correctly balancing the benefits against the undesirable effects • Special challenges: resource use • Increasing transparency in the process of making recommendations
Question to the audience Should patients with chronic hepatitis C be treated with interferon/ribavirin combination? There is high quality evidence for benefits and high quality evidence for harms. Rate the overall strength of recommendation: • “We recommend treatment of chronic hepatitis C” • “We suggest treatment…” • “We suggest against treating patients…” • “We recommend against treating patients…”
Patient values & preferences • In the absence of evidence, guideline panels have to function as surrogates to estimate values and preferences (V&P) • Consumer involvement can help • Attaching V&P statements to guideline recommendations increases transparency
Taking it to the next level • Systematically searching the literature for studies of values and preferences • Systematic reviews of V&P • Querying the guideline panel to rate health utilities of outcomes using case scenarios
Question to the audience Please select the most appropriate answer. The reason you attended this session: • Just interested in the topic • Have been involved in narrative evidence reviews, but have not used any formal grading system • Have used a grading system but not GRADE • Using or considered using GRADE
Question to the audience Please select the most appropriate answer. Selecting a system to rate the quality of evidence and strength of recommendations, such as GRADE: • Appears too expensive to implement • Appears valuable, but still requires substantial upfront expense • Appears to have some upfront cost but long-term savings • I use GRADE – it has been paying off for me
Basic dimensions Guideline work aligns along 3 basic dimensions • High quality vs. low quality • Fast vs. slow • Expensive vs. cheap
Ideal vs. practical ad hoc GRADE approaches (Stage • Elements • Advantage • Comment)
Ideal • Elements: systematic review, GRADE eTables, quality of evidence, strength of recommendation • Advantage: follows highest standards; methodologically most rigorous; easily maintainable; fully transparent process • Comment: access to methodologist; access to evidence centers; initially more resource intensive, long-term savings
Intermediary • Elements: ad hoc review, GRADE eTables, quality of evidence, strength of recommendation • Advantage: still retains major advantages of the “ideal approach” • Comment: risk of bias higher; access to methodologist recommended; only minimal additional cost
Initiation • Elements: ad hoc review, GRADE eTables, quality of evidence, strength of recommendation • Advantage: option to fully “upgrade” to an “ideal approach”; foundation of a methodologically sound system • Comment: risk of bias higher; access to methodologist as needed; no additional cost
Sources of funding • Funders may have an agenda • Industry – tricky • Foundations • Public – AHRQ, criteria • EHC program fit (3: available, relevance for public payer, priority condition) • Importance (7: e.g., public interest etc.) • No duplication • Feasibility • Impact (6: e.g., addresses inequity)