Grading evidence and recommendations. 1 February 2005. Professional good intentions and plausible theories are insufficient for selecting policies and practices for protecting, promoting and restoring health. Iain Chalmers.
How can we judge the extent of our confidence that adherence to a recommendation will do more good than harm?
GRADE: Grades of Recommendation Assessment, Development and Evaluation
GRADE Working Group
Members: • David Atkins, chief medical officer (a) • Dana Best, assistant professor (b) • Peter A Briss, chief (c) • Martin Eccles, professor (d) • Yngve Falck-Ytter, associate director (e) • Signe Flottorp, researcher (f) • Gordon H Guyatt, professor (g) • Robin T Harbour, quality and information director (h) • Margaret C Haugh, methodologist (i) • David Henry, professor (j) • Suzanne Hill, senior lecturer (j) • Roman Jaeschke, clinical professor (k) • Gillian Leng, guidelines programme director (l) • Alessandro Liberati, professor (m) • Nicola Magrini, director (n) • James Mason, professor (d) • Philippa Middleton, honorary research fellow (o) • Jacek Mrukowicz, executive director (p) • Dianne O’Connell, senior epidemiologist (q) • Andrew D Oxman, director (f) • Bob Phillips, associate fellow (r) • Holger J Schünemann, associate professor (g, s) • Tessa Tan-Torres Edejer, medical officer/scientist (t) • Helena Varonen, associate editor (u) • Gunn E Vist, researcher (f) • John W Williams Jr, associate professor (v) • Stephanie Zaza, project director (w)
Affiliations: a) Agency for Healthcare Research and Quality, USA b) Children's National Medical Center, USA c) Centers for Disease Control and Prevention, USA d) University of Newcastle upon Tyne, UK e) German Cochrane Centre, Germany f) Norwegian Centre for Health Services, Norway g) McMaster University, Canada h) Scottish Intercollegiate Guidelines Network, UK i) Fédération Nationale des Centres de Lutte Contre le Cancer, France j) University of Newcastle, Australia k) McMaster University, Canada l) National Institute for Clinical Excellence, UK m) Università di Modena e Reggio Emilia, Italy n) Centro per la Valutazione della Efficacia della Assistenza Sanitaria, Italy o) Australasian Cochrane Centre, Australia p) Polish Institute for Evidence Based Medicine, Poland q) The Cancer Council, Australia r) Centre for Evidence-based Medicine, UK s) University of Buffalo, USA t) World Health Organisation, Switzerland u) Finnish Medical Society Duodecim, Finland v) Duke University Medical Center, USA w) Centers for Disease Control and Prevention, USA
Opinions do not necessarily represent those of the institutions with which the members of the GRADE Working Group are affiliated.
What do you know about GRADE? • Have prepared a guideline • Read the BMJ paper • Have prepared a systematic review and a summary of findings table • Have attended a GRADE meeting, workshop or talk
Why bother about grading? • People draw conclusions about the • quality of evidence • strength of recommendations • Systematic and explicit approaches can help • protect against errors • resolve disagreements • facilitate critical appraisal • communicate information • However, there is wide variation in currently used approaches
Who is confused? Grades used by different organizations: • USPSTF: evidence II-2, recommendation B • ACCP: evidence C+, recommendation 1 • GCPS: evidence Strong, recommendation Strongly recommended
Still not confused? Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease: • AHA: evidence B, recommendation Class I • ACCP: evidence C+, recommendation 1 • SIGN: evidence IV, recommendation C
Quality of evidence The extent to which one can be confident that an estimate of effect or association is correct. It depends on the: • study design (e.g. RCT, cohort study) • study quality/limitations (protection against bias; e.g. concealment of allocation, blinding, follow-up) • consistency of results • directness of the evidence including the • populations (those of interest versus similar; for example, older, sicker or more co-morbidity) • interventions (those of interest versus similar; for example, drugs within the same class) • outcomes (important versus surrogate outcomes) • comparison (A - C versus A - B & C - B)
Quality of evidence The quality of the evidence (i.e. our confidence) may also be REDUCED when there is: • Sparse or imprecise data • Reporting bias The quality of the evidence (i.e. our confidence) may be INCREASED when there is: • A strong association • A dose response relationship • All plausible confounders would have reduced the observed effect • All plausible biases would have increased the observed lack of effect
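The factors listed on the two "Quality of evidence" slides can be read as a simple stepwise judgement. The sketch below is only an illustration and is not taken from the slides: it assumes that randomised trials start at "high" and observational studies at "low", and that each serious concern or strengthening factor moves the grade by one level. The function name `grade_quality` and its parameters are hypothetical.

```python
# Quality categories from lowest to highest, as named on the slides.
LEVELS = ["very low", "low", "moderate", "high"]

# Assumed starting points (not stated on the slides).
START = {"rct": 3, "observational": 1}


def grade_quality(design, downgrades=0, upgrades=0):
    """Return a quality category for one outcome.

    design     -- "rct" or "observational"
    downgrades -- levels lost (study limitations, inconsistency,
                  indirectness, sparse or imprecise data, reporting bias)
    upgrades   -- levels gained (strong association, dose response,
                  plausible confounders working against the effect)
    """
    index = START[design] - downgrades + upgrades
    # Clamp so the result always stays within the four categories.
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


if __name__ == "__main__":
    print(grade_quality("rct", downgrades=1))
    print(grade_quality("observational", upgrades=1))
```

In practice these are judgements rather than arithmetic; the one-level step size is a simplifying assumption made for the sketch.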
Categories of quality • High: Further research is very unlikely to change our confidence in the estimate of effect. • Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate. • Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate. • Very low: Any estimate of effect is very uncertain.
Judgements about the overall quality of evidence • Most systems not explicit • Options: • strongest outcome • primary outcome • benefits • weighted • separate grades for benefits and harms • no overall grade • weakest outcome • Based on lowest of all the critical outcomes • Beyond the scope of a systematic review
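The rule that the overall grade is "based on lowest of all the critical outcomes" can be sketched in a few lines. This is a minimal illustration under assumptions: the function name `overall_quality`, the input layout, and the outcome labels are hypothetical, and the example grades are invented rather than taken from any evidence profile.

```python
# Quality categories from lowest to highest, as named on the slides.
QUALITY_ORDER = ["very low", "low", "moderate", "high"]


def overall_quality(outcomes):
    """Return the lowest quality grade among the critical outcomes.

    outcomes -- dict mapping outcome name to (quality, is_critical)
    """
    critical = [quality for quality, is_critical in outcomes.values()
                if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    # The position in QUALITY_ORDER defines the ranking.
    return min(critical, key=QUALITY_ORDER.index)


if __name__ == "__main__":
    # Invented grades, for illustration only.
    example = {
        "outcome A": ("high", True),
        "outcome B": ("low", True),
        "outcome C": ("very low", False),  # not critical, so ignored
    }
    print(overall_quality(example))
```

Outcomes that are not judged critical do not pull the overall grade down, which is why "outcome C" is ignored in the example.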
Strength of recommendation The extent to which one can be confident that adherence to a recommendation will do more good than harm. • trade-offs (the relative value attached to the expected benefits, harms and costs) • quality of the evidence • translation of the evidence into practice in a specific setting • uncertainty about baseline risk
Judgements about the balance between benefits and harms • Before considering cost and making a recommendation • For a specified setting, taking into account issues of translation into practice
Clarity of the trade-offs between benefits and the harms • the estimated size of the effect for each main outcome • the precision of these estimates • the relative value attached to the expected benefits and harms • important factors that could be expected to modify the size of the expected effects in specific settings; e.g. proximity to a hospital
Balance between benefits and harms • Net benefits: The intervention does more good than harm. • Trade-offs: There are important trade-offs between the benefits and harms. • Uncertain net benefits: It is not clear whether the intervention does more good than harm. • No net benefits: The intervention does not do more good than harm.
Judgements about recommendations This should include considerations of costs; i.e. “Is the net gain (benefits minus harms) worth the costs?” • Do it • Probably do it • No recommendation • Probably don’t do it • Don’t do it
Will GRADE lead to change? Should healthy asymptomatic postmenopausal women have been given oestrogen + progestin for prevention in 1992? • Quality of evidence across studies for • CHD • Hip fracture • Colorectal cancer • Breast cancer • Stroke • Thrombosis • Gall bladder disease • Quality of evidence across critical outcomes • Balance between benefits and harms • Recommendations
Evidence profile: Quality assessment. Oestrogen + progestin for prevention in 1992 (before WHI and HERS). Oestrogen + progestin versus usual care
Further developments • Diagnostic tests • Complexity • Costs • (Equity) • Empirical evaluations
Taking account of costs • Include important (disaggregated) costs in evidence summaries and balance sheets when relevant • May be useful to aggregate and value (in monetary terms) • Always include disaggregated resource utilisation • Note when important information is missing • Published cost-effectiveness analyses are rarely helpful • Assess the quality of the evidence for important costs (consumption of resources) as for other effects (Were quantities measured reliably?) • If costs are critical to a decision, low quality evidence can lower the overall quality of evidence • Costs are negotiable (the value of resources) • There are many possible criteria for making a recommendation
Should activated protein C be given to patients in severe sepsis? An example with costs
GRADE evidence profile. Activated Protein C for sepsis • Name: Jaeschke and Schünemann • Date: September 2004 • Question: Should APC be used for severe sepsis? • Setting: ICU in Copenhagen • Baseline risk: Severe sepsis or septic shock > 24 h • References: Effectiveness: Bernard 2001. Efficacy and safety of recombinant human activated protein C for severe sepsis. NEJM 2001;344:699 and Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993. • Cost-effectiveness: Manns 2002. An economic evaluation of activated protein C treatment for severe sepsis. NEJM 2002;347:993.
Possible criteria for making a recommendation • Treatment effect • Adverse effects • Cost • Cost-effectiveness • Equity • Seriousness of the problem • Administrative restrictions
Empirical evaluations • Critical appraisal of other systems • Pilot test + sensibility • “Case law” + practical experience • Guidance for judgements • Single studies • Sparse data or imprecise data • Agreement • Validity? • Comparisons with other systems • Alternative presentations
Comparison of GRADE and other systems • Explicit definitions • Explicit, sequential judgements • Components of quality • Overall quality • Relative importance of outcomes • Balance between health benefits and harms • Balance between incremental health benefits and costs • Consideration of equity • Evidence profiles • International collaboration • Consistent judgements? • Communication?
We will serve the public more responsibly and ethically when research designed to reduce the likelihood that we will be misled by bias and the play of chance has become an expected element of professional and policy making practice, not an optional add-on. Iain Chalmers
A prerequisite: Practitioners and policy makers must make much clearer that they need rigorous evaluative research to help ensure that they do more good than harm. Iain Chalmers