1. Critical Appraisal: Systematic Reviews and Clinical Practice Guidelines for Drug Therapy. Nancy J. Lee, PharmD, BCPS
Research fellow, Drug Effectiveness Review Project
Oregon Evidence-based Practice Center
Oregon Health and Science University
To receive 1.25 AMA PRA Category 1 Credits you must review this section and answer CME questions at the end.
Release date: January 2009 Expiration date: January 2012
Hello and welcome to the module entitled: Critical appraisal of systematic reviews and clinical practice guidelines. My name is Nancy Lee and I am a research fellow with the Drug Effectiveness Review Project at the Oregon Evidence-based Practice Center.
To receive continuing medical education credit you must review this section and answer the CME questions at the end of this module with a passing score of 75%.
Keep in mind, the contents of this module build upon concepts reviewed in a prior module titled Critical Appraisal of Randomized Controlled Trials. Review of that module is recommended but not required before proceeding.
Next slide
2. The attachments tab in the upper right-hand corner contains documents that supplement this presentation. Handouts of the slides and a glossary of terms can be found under this tab and are available to print out for your use. URLs to online resources are also located here.
Next slide
3. This work was made possible by a grant from the state Attorney General Consumer and Prescriber Education program, which is funded by the multi-state settlement of consumer fraud claims regarding the marketing of the prescription drug Neurontin.
Next slide
4. This program has been planned and implemented in accordance with the Essential Areas and Policies of the Accreditation Council for Continuing Medical Education (ACCME) and is jointly sponsored by the University of Texas Southwestern Medical Center and the Federation of State Medical Boards Research and Education Foundation.
Next slide
5. CME information The course director is Barbara Schneidman, MD, MPH, with the Federation of State Medical Boards Research and Education Foundation.
The program directors are as follows:
David Pass, MD
Director of the Health Resources Commission at the Oregon Office for Health Policy and Research
Dean Haxby, PharmD
Associate professor of pharmacy practice at the Oregon State University College of Pharmacy
Daniel Hartung, PharmD, MPH
Assistant professor of pharmacy practice also at the Oregon State University College of Pharmacy
This educational activity is intended for persons who are involved in committees dealing with medication use policies and for health care professionals who are involved with medication prescribing.
The educational objectives are described below but will be reviewed in detail in the following slides.
Next slide
6. CME policies The continuing education sponsors require us to report the policies for this program, which are described below.
Next slide.
7. Each speaker for this program has completed and signed a conflict of interest statement. The faculty members' relationships to any commercial entities are listed on this slide.
Next slide
8. Learning objectives I. Systematic reviews
Recognize benefits and limitations
Assess quality of systematic reviews
Identify the differences between systematic reviews, narrative reviews, and meta-analyses
Recognize components of forest and funnel plots used in systematic reviews with meta-analyses
II. Guidelines
Identify strengths and weaknesses
Assess and recognize quality components
Review grading of the strength of evidence used in guidelines
For this module we will be reviewing systematic reviews and clinical practice guidelines. By the end of both sessions, our objectives are for you to be able to recognize the benefits and limitations of systematic reviews, assess the quality of systematic reviews, appreciate the differences among systematic reviews, narrative reviews, and meta-analyses, and recognize the components of forest and funnel plots that may be used in systematic reviews with meta-analyses. Similarly, for guidelines, our objectives are for you to appreciate the strengths and weaknesses of guidelines, assess and recognize quality factors, and review the grading of the strength of evidence used in guideline development.
I would also like to mention that our focus for this module is on systematic reviews and guidelines for drug therapy, not on diagnostic tests or prognosis, although key concepts of quality assessment can be applied in those areas as well.
Next slide.
9. I. Systematic Reviews:Outline Why, When, What?
Benefits and limitations
Steps in conducting Systematic Reviews
Scientific process
Quality assessment of Systematic Reviews
Tools and checklists
Before we begin, we will first address the why, when, and what of systematic reviews, and also review their benefits and limitations.
Next slide
10. Why are systematic reviews needed? Too much information
Not enough time
More than 2 million articles published yearly from more than 200 biomedical journals
Results can often be contradicted by subsequent trials
Taken together, a clearer picture can emerge
Minimize biases
Increase statistical power
Improve generalizability
Improve allocation of resources for other needed trials = minimize funding of unnecessary trials
So, why are systematic reviews needed?
Over the years we've been taught that the RCT design is the gold standard, which is more often than not a true statement when evaluating drug therapy. But quite often we are faced with too much information and not enough time (animate) to get through all the trials that are published every year.
It is estimated (animate) that more than 2 million articles are published annually in more than 200 different biomedical journals, and keeping up with such literature can be overwhelming. In addition, not all trials show similar findings, and results of isolated trials (animate) are frequently contradicted by subsequent studies, which can lead to confusion. In fact, in a study by Ioannidis (2005), about 1/3 of the studies evaluated were either contradicted or reported stronger effects than subsequent studies, suggesting that relying on a single high-profile trial has the potential to misrepresent the true effect and can therefore be harmful in decision making.
Well-conducted systematic reviews can help synthesize information from various trials and, when appropriate, observational studies can also be included. When these publications are taken together, (animate) a clearer picture can emerge. There is the potential for minimizing biases observed among trials, increasing statistical power (especially when only smaller trials exist), improving generalizability, and better allocating resources by minimizing the funding of unnecessary or duplicate trials.
Next slide
Ioannidis JP (2005) Contradicted and initially stronger effects in highly cited
clinical research. JAMA 2005; 294:218-28.
11. Fergusson D, et al. Clin Trials 2005; 2:218-32. One example of where results from systematic reviews could have had a beneficial impact on resource allocation was a study by Fergusson and colleagues titled "Randomized controlled trials of aprotinin in cardiac surgery: could clinical equipoise have stopped the bleeding?" In the context of this particular study, clinical equipoise was defined as whether or not there was awareness of existing medical evidence that could indicate whether randomization of individuals to competing drug therapies or placebo was justified.
Briefly, aprotinin is a serine protease inhibitor used to limit perioperative bleeding and therefore reduce the need for allogeneic (donor) red blood cell transfusions. Randomized trials studying aprotinin's effect on the proportion of cardiac surgery patients receiving at least 1 transfusion have been published since 1987. Since that time, 2 systematic reviews (one in 1992 and one in 1997) were conducted. Both reviews showed that aprotinin was more effective than control in reducing the need for transfusions. Despite these results, 19 more RCTs were funded and conducted, raising the question: did trialists review all the literature before conducting their studies?
Next slide
12. As you may have guessed, trialists did not systematically review the medical literature before submitting their proposal for their own trials.
Fergusson and colleagues identified a total of 69 trials (N=8000 patients) on aprotinin and conducted a cumulative meta-analysis ordered by study trial date. This quantitative analysis produces an updated measure of aprotinin's effect on bleeding by pooling, or combining, trials after each study is completed. This forest plot depicts the cumulative analysis.
As you can see, the very first trial (1987) showed a benefit with aprotinin (albeit with a wide confidence interval). By around 1992, after trial #12, the cumulative effect estimate stabilized around an OR of 0.25-0.35, indicating that fewer patients in the aprotinin group than in the control group received at least 1 blood transfusion. Throughout the meta-analysis the upper limit of the confidence interval never crossed 0.65.
Even after the largest trial (which enrolled 1784 patients) was published, only 16% of subsequent trials referenced that particular study. Overall, only about 20% of the trials conducted and published referenced results from previously published studies of aprotinin in cardiac surgery. The results of this review make us wonder not only about the necessity of all these subsequent trials but also whether they really made a clinical impact.
This is just 1 example of the importance of conducting high quality systematic reviews in influencing practice. There are more examples like this that have demonstrated the benefit of systematic reviews in answering and confirming medical practices that help save lives.
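The cumulative pooling described above can be sketched numerically. The sketch below is a minimal illustration of inverse-variance fixed-effect pooling of odds ratios, updated after each trial is added; the 2x2 trial counts are hypothetical and are not the actual aprotinin data.

```python
import math

# Hypothetical per-trial counts: (events_tx, n_tx, events_ctl, n_ctl).
# Illustrative only; not the actual aprotinin trials.
trials = [
    (10, 50, 25, 50),
    (8, 40, 18, 40),
    (15, 100, 35, 100),
]

def log_or_and_var(a, n1, c, n2):
    """Log odds ratio and its variance from a 2x2 table."""
    b, d = n1 - a, n2 - c
    log_or = math.log((a * d) / (b * c))
    var = 1/a + 1/b + 1/c + 1/d
    return log_or, var

def cumulative_meta(trials):
    """Fixed-effect (inverse-variance) pooled OR after each successive trial."""
    results = []
    w_sum = wy_sum = 0.0
    for a, n1, c, n2 in trials:
        y, v = log_or_and_var(a, n1, c, n2)
        w = 1 / v                      # inverse-variance weight
        w_sum += w
        wy_sum += w * y
        pooled = wy_sum / w_sum        # pooled log OR so far
        se = math.sqrt(1 / w_sum)
        ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
        results.append((math.exp(pooled), ci))
    return results

for i, (or_, (lo, hi)) in enumerate(cumulative_meta(trials), 1):
    print(f"After trial {i}: pooled OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With each added trial the confidence interval narrows, which is exactly the pattern the Fergusson cumulative forest plot shows: once the estimate stabilizes, further trials change little.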
Next slide
13. When are systematic reviews needed? When an important question needs to be addressed
Gaps in the literature or conflicting results
When there is uncertainty regarding an intervention
Uncertainty may lie in:
Population, Intervention, Outcomes
When several primary studies exist
Lack of strong evidence
So, we've just reviewed reasons why systematic reviews are needed and also discussed some of their potential benefits. But, of course, the next question is: when are they needed?
Systematic reviews are not needed for every clinical scenario. They are needed when an important question needs to be addressed, such as in areas where there are gaps in the literature or conflicting results; when there is uncertainty regarding an intervention, particularly within a population, dose, or type of outcome; and when several smaller primary studies exist but strong evidence is lacking.
Pause.
Next slide
Lancet: requiring a systematic review before trial publication (Young and Horton 2005)
14. Limitations of systematic reviews Only as good as what is available and what is included
Issue of publication bias
Restricted to published results
Quality of individual trials
Garbage In, Garbage Out
Good quality systematic reviews typically do not address all the issues relevant for decision making
Evidence outside the scope of the review may be relevant and needed for decision making
Cost and implementation implications may not always be addressed
Although systematic reviews are helpful in synthesizing large bodies of evidence and in answering or confirming therapies in clinical practice, they of course have their limitations. One is that systematic reviews are only as good as what is available and what is included. With regard to publication bias: if negative trials are not published and therefore not available for synthesis, the results may be biased toward positive published studies. Publication bias is a significant issue that is difficult to control, and we will discuss it further later in this module.
The second issue deals with the quality of included trials. If the quality of the included studies is poor, then the final product is also poor regardless of how well the systematic review was conducted; this is the garbage in, garbage out concept.
Another limitation is that even good quality reviews cannot address all the issues relevant or necessary for decision making by clinicians and policy makers. Evidence outside the scope of the review may be needed, and oftentimes cost information specific to each decision-making scenario is not addressed and may need to be considered separately.
Next slide
15. Limitations of systematic reviews Unrealistic expectations
What if results conflict with a good quality large landmark trial?
About 10-23% of large trials disagreed with meta-analyses*
May not always include the most up to date studies
When was the last literature search conducted?
Estimate: 3-5 years**
Does not make decisions for the user
These are not guidelines
The reader uses their own judgment
*Ioannidis JP, et al. JAMA 1998; 279:1089-93. **Shojania KG, et al. Ann Intern Med 2007; 147:224-33.
Other limitations result from readers' unrealistic expectations that results from systematic reviews must always be right. What if results from a systematic review conflict with a good quality, large landmark trial? It is estimated that about 10-23% of large clinical trials differ from meta-analyses. This is not to say that one study design is better than the other, but that discrepancies can occur.
Another important limitation is that systematic reviews may not always include the most up to date studies. Therefore, it is always important to ask: when was the literature last searched? Depending on the topic area, it is estimated that the information contained in systematic reviews becomes out of date within 3-5 years. It is wise to check the date of the last search and the date of publication.
And finally, systematic reviews do not make decisions for users; they are not guidelines. The user must apply their own judgment, based on the evidence, to make decisions.
Next slide
16. What it is and isn't Adapted from Cook DJ, et al. Ann Intern Med 1997; 126:376-80. Before going any further, I would like to take some time to highlight some of the differences between systematic reviews and narrative (or traditional) reviews.
First, systematic reviews are different from narrative reviews (also referred to as traditional reviews). Generally speaking, a systematic review refers to the entire process of collecting, reviewing, and presenting all evidence. Systematic reviews tend to have a more focused clinical question, whereas narrative reviews are often broad in scope and can cover information on pharmacokinetics, pharmacodynamics, and pathophysiology in addition to summarizing trial results.
Although more narrative reviews now search bibliographic databases such as MEDLINE or Embase, prespecified and comprehensive search methods are typically not performed.
Study eligibility criteria are often not specified in narrative reviews, and if they are reported, the criteria for study inclusion or exclusion may not be uniformly applied, which can introduce several biases such as study selection bias. You may have come across very biased narrative reviews that seemed to have picked certain studies that supported the authors' preformed conclusions and omitted other studies that did not.
The key difference between narrative and systematic reviews, however, is in the appraisal of the literature (shaded highlight); more specifically, the quality assessment of the included trials. Authors of narrative reviews may appraise the evidence well, but more often than not they do not consistently assess the methodologic rigor, or internal validity, of included trials.
Next slide
17. "The advantage of using carefully done, systematic reviews becomes clear when we observe how often mistakes are made when research is reviewed non-systematically, whether by experts or others. The costs of mistaken conclusions based on non-systematic reviews can be high." I believe Andrew Oxman, Director of the Health Services Research Unit at the National Institute of Public Health in Oslo, Norway, best summarized the difference between narrative (or traditional) reviews and well-conducted systematic reviews by stating: "The advantage of using carefully done systematic reviews becomes clear when we observe how often mistakes are made when research is reviewed non-systematically, whether by experts or others. The costs of mistaken conclusions based on non-systematic reviews can be high."
Next slide
18. I. Systematic Reviews:Outline Why, When, What?
Benefits and limitations
Steps in conducting Systematic Reviews
Scientific process
Quality assessment of Systematic Reviews
Tools and checklists
Conducting a good, or high quality, systematic review is based on a scientific process similar to concepts observed in good trial design, and requires us to take a closer look at the steps involved in conducting a systematic review before going over how to assess its quality.
Next slide
19. Systematic Reviews: A scientific process Figure 1. © 1997 BMJ Publishing Group Ltd., from Greenhalgh T. BMJ 1997; 315:672-5. Performing systematic reviews is a scientific process involving multiple steps, as shown in this figure. We will go over each step in detail in the following slides. Just like a well-conducted clinical trial, a well-done systematic review also tries to minimize potential sources of bias by prespecifying the key questions and inclusion/exclusion criteria, performing dual review of study selection, data abstraction, and quality assessment, and evaluating the appropriateness of quantitative methods.
Next slide.
20. Developed a priori
Most important
Relevant and sensible to practitioners and patients?
Typically not changed during the review process
What are we asking?
Efficacy
Effectiveness
Well-defined?
PICOS
Any exclusions?
Language restrictions or type of study design
What's the purpose and question?
Before formulating a question, a clear purpose for conducting a systematic review should be specified; for instance: what will this review address? Will this fill any gaps? Or will this add new information?
Formulating a relevant, clear, and focused question is probably the most important step in the process, and it is developed a priori. The key question or questions set the tone, or scope, of the report, and the question should be relevant and sensible to clinicians and patients. The key question is typically not changed in the middle of a systematic review unless there is a significant and important reason for doing so; any reasons for changing a question should be explicitly reported in the review for transparency.
In the case of comparative drug effectiveness reviews, there are generally 2 types of questions that are addressed: 1) can it work, or 2) how well does it work over time? The first question addresses the efficacy of a treatment, while the second deals more with the effectiveness of treatment. For both types of questions, the harms of the drug therapy should also be assessed.
Good questions that are well defined also include information about the population, intervention, and outcome, and in some instances the study setting. It is also a good idea to specify any exclusions, such as language restrictions or restrictions on type of study design.
Next slide
21. What was the study eligibility?
Determines what studies get included in a systematic review
Formed a priori
Applied uniformly by at least 2 reviewers (dual review)
Study inclusion and exclusion criteria should relate to the areas defined by PICO(S)
Population
Intervention
Comparator
Outcome
Setting/study design
The next step is determining the study eligibility, or what studies get included in a systematic review. Study eligibility criteria are also often written a priori to minimize study selection bias and are applied uniformly by at least 2 reviewers, again to minimize bias. The study eligibility criteria are written in relation to the areas defined by PICOS.
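As a purely illustrative sketch, prespecified PICOS criteria can be captured as data so that they are applied the same way to every candidate study. All field names, drug names, and study records below are hypothetical:

```python
# Hypothetical PICOS eligibility criteria, fixed a priori.
criteria = {
    "population": {"adults with type 2 diabetes"},
    "interventions": {"sitagliptin"},
    "designs": {"RCT", "systematic review"},
    "min_weeks": 12,  # minimum study duration
}

def eligible(study: dict) -> bool:
    """Apply the prespecified criteria identically to every study."""
    return (
        study["population"] in criteria["population"]
        and study["intervention"] in criteria["interventions"]
        and study["design"] in criteria["designs"]
        and study["duration_weeks"] >= criteria["min_weeks"]
    )

candidate = {
    "population": "adults with type 2 diabetes",
    "intervention": "sitagliptin",
    "design": "RCT",
    "duration_weeks": 24,
}
print(eligible(candidate))  # this hypothetical record meets the criteria
```

In practice, screening is done by two human reviewers in dual review; the point of the sketch is only that the criteria are fixed up front and applied uniformly.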
Next slide.
22. Study eligibility
What are the consequences of being too inclusive or exclusive?
Too inclusive
Scope is too large
Lose focus of question
Main point may be lost
May be difficult to interpret
Too exclusive
Scope is too narrow
Potential to exclude important trials
May end up not having enough evidence
If unaware, could lead to biased conclusions
Although determining study eligibility seems easy enough, there are consequences to criteria that are too inclusive or too exclusive.
If the study eligibility criteria are too broad, or inclusive, this could enlarge the scope of the review and, as a consequence, muddle the focus of the key question; the main point of the review can be lost, which makes it difficult to interpret and use.
If the study eligibility criteria are too stringent, or exclusive, this could significantly narrow the scope of the review and exclude potentially important studies. Being too exclusive can also lead to biased conclusions or create a situation where the reviewer has to make assumptions about missing information.
Next slide
23. Example: Study eligibility
Here is an example of eligibility criteria adapted from a systematic review that assessed the efficacy, effectiveness, and harms of sitagliptin (a new DPP-4 inhibitor) compared with placebo and with other antihyperglycemic agents in patients with type 2 diabetes.
As you can see, the population includes adults and children, and the intervention spells out which treatments will be included. The outcomes of interest are listed; generally, the outcomes should be health outcomes, or patient-oriented outcomes, rather than intermediate or surrogate markers of health outcomes.
The study design can be specified here as well. For instance, for efficacy and effectiveness, RCTs and existing good-quality systematic reviews can be used; for harms, RCTs, systematic reviews, and large comparative cohort observational studies were allowed.
A minimum duration of >12 weeks was specified, based on the minimum time period over which a change in A1c, or glycemic control, can be observed.
And finally, if any studies were to be excluded, this should be specified here. Arbitrarily excluding relevant studies, or excluding certain studies because their results conflict with a preformed conclusion, is a serious flaw. Reasons for study exclusion should, again, be explicitly reported and should be reasonable.
Next slide
24. Finding all relevant studies: Search strategy
Medical librarian important
Key search terms should at the very least be reported
Were any significant studies missing?
If yes, why?
Once study eligibility has been laid out, the next step is to search for the evidence by creating a net of search terms to cast out into various bibliographic databases.
Development of a comprehensive search strategy is another important step in systematic reviews, and involving a medical librarian in this step is highly recommended. Remember, a comprehensive search minimizes the potential for missing relevant trials.
At a minimum key search terms should be reported in systematic reviews and an assessment of any missing studies should be routinely performed by both the authors and readers of the systematic review.
Next slide.
25. Example: Search strategy
This is just a little snapshot of one part of a comprehensive search strategy.
The first line reports which database was searched and the time period covered, so if you notice that a large trial was missed, you may want to check the dates to see if this is a possible reason. As you can see, in this snapshot MEDLINE was searched from 1950 to 2007, and the search was then updated to 2008.
The next numbered lines are the actual search strategy with the terms used; the first search term is the CAS number, a unique numerical identifier assigned to the exenatide compound before the drug was even named.
The next search terms are the various names for exenatide or terms used to describe it.
Line 8 here shows that the search was limited to human subjects and publications in English.
Next slide
26. Finding all relevant studies: Sources
Electronic databases
MEDLINE (Ovid/PubMed)
Cochrane Library
EMBASE
PsychINFO
CINAHL
Hand searching
Reference lists of trials and/or reviews
Journals
Sources for unpublished information
FDA website
Clinical Trials.gov
Registries
Industry dossiers
Part of accomplishing a comprehensive search involves searching multiple sources of information. This entails searching more than one bibliographic database; in fact, searching multiple databases can still miss about 15% of relevant trials, so hand searching, along with searching websites and industry dossiers, is necessary to find as many potentially relevant studies as possible.
Next slide.
27. Selection of studies
Review titles and abstracts from initial search
Review of full text articles
Uniform application of study eligibility criteria
Dual review for each step
Disagreements resolved by consensus
Once the search for the evidence is completed, studies need to be selected. Typically, systematic reviewers start with long lists of titles and abstracts, and these citations are reviewed by at least 2 reviewers. Full-text articles are then retrieved for another round of study selection; this step is also performed in dual review.
During the study selection process, study eligibility criteria should be uniformly applied and disagreements are usually resolved by consensus or with input from a third party.
Next slide
28. Issue of publication bias
Adapted from Cochrane Open Learning, Module 15: Publication bias, 2002.
As previously mentioned, another aspect of accomplishing a comprehensive search strategy and minimizing bias in the study selection process is the issue of publication bias. Publication bias is probably one of the larger challenges to reviewers because it is hard to manage and control, and most systematic reviews will be subject to some publication bias. Publication bias refers to the tendency for positive studies to be published in biomedical journals more often than negative studies.
In general, positive trials are likely to be published more rapidly, are more likely to be published in English, and are more likely to be published more than once. All of these issues may exaggerate the treatment effects observed in systematic reviews and meta-analyses.
The failure to publish negative results, whether by peer reviewers, journal editors, investigators, or pharmaceutical manufacturers, may knowingly or unknowingly shift the results of systematic reviews and meta-analyses toward the positive trials.
Next slide
29. Scargle J. Journal of Scientific Exploration 2000; 14(1):91-106.
The issue of publication bias is not new; it has been around for about two decades, and as Rosenthal put it
.---- (type 1 errors or false positive findings)
Next slide
30. Investigating for presence of publication bias
Visually check for asymmetry in funnel plots
NOT a tool to diagnose bias
Potential sources of asymmetry
True heterogeneity
Data irregularities
Chance
Other statistical methods
Ask a biostatistician
Egger, et al. BMJ 1997; 315:629-34. Figure 1 from Peters, et al. JAMA 2006; 295:676-80.
A method commonly used to assess publication bias in systematic reviews that include meta-analyses is the funnel plot, which plots each trial's result against its sample size, allowing a visual check for asymmetry. I include this slide not for you to be able to understand every detail of funnel plots, but for you to be aware of this method and its limitations, especially if it is reported in a systematic review with meta-analysis.
Ideally, the funnel plot should look balanced, like the symmetric inverted funnel shown in this figure. Unfortunately, funnel plotting is not the most reliable method to check for or diagnose publication bias. There are other potential sources of asymmetry in funnel plots: true clinical heterogeneity among the included trials can skew the plot, as can data irregularities and simple chance. There are other statistical methods to further assess the asymmetry, but these are typically just further exercises that may not help answer the question.
So, the bottom line with funnel plots is that they need to be interpreted with caution.
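For readers curious how an asymmetry check is actually computed, below is a minimal sketch of Egger's regression test (Egger et al., BMJ 1997), which regresses each trial's standard normal deviate (effect divided by its standard error) on its precision. The effect estimates are invented for illustration, and an intercept away from zero suggests asymmetry but does not by itself diagnose publication bias:

```python
import numpy as np

# Hypothetical log odds ratios and standard errors from six trials.
log_or = np.array([-0.50, -0.35, -0.20, -0.42, -0.10, -0.60])
se     = np.array([ 0.30,  0.22,  0.12,  0.25,  0.10,  0.35])

snd = log_or / se        # standard normal deviate for each trial
precision = 1.0 / se     # larger trials have higher precision

# Egger's test: fit snd = slope * precision + intercept.
slope, intercept = np.polyfit(precision, snd, 1)
print(f"Egger intercept: {intercept:.2f}")
```

The same caution applies here as to the visual check: a nonzero intercept can reflect heterogeneity or chance rather than publication bias.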
Next slide
31. Ways to minimize publication bias in the review process
Identify duplicate publications
Contact study authors or manufacturer
Often difficult to obtain information
Time intensive
Check sources for grey literature
FDA review documents
Clinical trial registries
Databases
Check for any language restrictions
Rising, et al. PLoS Med 5(11):e217.
Not all is lost with publication bias, however; there are a few ways to minimize its potential:
The first is for both the author and the reader to critically check whether there are duplicate publications of the same study. Double counting results in meta-analyses can further skew the treatment effect, and therefore it is essential that duplicate publications be accounted for.
Next, authors of systematic reviews can contact lead investigators or the manufacturer for unpublished trial data. Often this is a very time-intensive process, and in the end the information may not be provided; however, it may be worth the time in some cases.
Thirdly, authors of reviews can check sources of grey literature, such as FDA review documents, clinical trial registries, or databases, for unpublished information. And finally, in some instances, authors may need to consider whether language restrictions may have an impact on publication bias: is this a topic area where more trials are likely to be published in non-English journals? If so, it might be necessary to invest the funds to hire someone to translate the non-English studies.
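The duplicate-publication check described above can be sketched in code. The keying fields here (normalized first author, publication year, enrolled sample size) and the records are hypothetical; real deduplication typically also compares titles, sites, and outcomes:

```python
# Flag possible duplicate publications before pooling, keyed on a
# normalized first author, publication year, and sample size.
records = [
    {"author": "Smith", "year": 2006, "n": 250},
    {"author": "smith ", "year": 2006, "n": 250},  # likely the same trial
    {"author": "Jones", "year": 2007, "n": 120},
]

seen, unique, duplicates = set(), [], []
for r in records:
    key = (r["author"].strip().lower(), r["year"], r["n"])
    (duplicates if key in seen else unique).append(r)
    seen.add(key)

print(len(unique), "unique,", len(duplicates), "flagged as possible duplicate")
```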
Next slide
32. Quality assessment of included studies
>25 different tools
Jadad scale, Risk of Bias tool, DERP method (for trials)
Other scales or checklists (for observational studies)
How were poor- or low-quality trials handled in the review?
Were these excluded?
Sensitivity analyses?
Now that we've touched on comprehensive search methods, study selection, and the issue of publication bias, each article should be assessed for its quality, or internal validity. Was quality assessment conducted and reported?
As mentioned in a previous module, there are more than 25 different quality assessment tools and checklists. These tools evaluate the different components of internal validity which include: randomization, allocation concealment, blinding, statistical analysis, and attrition. Evaluating each component for potential risk of bias is important because inclusion of poorer quality trials can inflate overall treatment effects observed in systematic reviews. These concepts are discussed in more detail in the module for randomized controlled trials.
In addition to quality assessment of individual trials, it is important to evaluate how poor- or lower-quality trials were handled in the systematic review. Generally, authors of systematic reviews should report what they did with poor-quality studies. Were these excluded? Or were sensitivity analyses that included and excluded poor-quality studies conducted and explored?
Next slide
33. Example
Bjelakovic, et al. Lancet 2004; 364:1219-28.
This table was included in a systematic review of antioxidant supplements for the prevention of GI cancers. The nice feature of this table is that it reports the overall quality of each included study and also the results for each component of internal validity that the authors assessed, making it easy for the reader to see which trials were included and to assess the overall quality of the review and the implications of the final results.
Next slide.
34. Data abstraction
Dual abstraction and review
Types of data abstracted:
Study design
Setting
Population characteristics (age, sex, ethnicity)
Inclusion/exclusion criteria
Interventions
Comparisons
Number screened, eligible, enrolled
Number withdrawn
Method of outcome ascertainment
Results
Adverse events
At the same time studies are being evaluated for quality, data abstraction of study characteristics and results can occur. Again, well-conducted systematic reviews will tend to use at least 2 reviewers to abstract data, or at least 1 person to abstract while another checks the work.
The types of data abstracted are listed on this slide; the list may vary depending on the topic area. Authors of systematic reviews may pilot their abstraction tool on a few studies; however, frequently changing the data abstraction components during the review is not recommended, as it may introduce bias into the review.
Next slide
35. Data synthesis
Two methods: qualitative and quantitative
Qualitative
Discussion of results (synthesis)
in relation to each other
in relation to study quality
Not a reporting of results from each study
Adapted from Cochrane Collaboration open learning materials for reviewers, 2002-2003.
After data has been abstracted, the information must be synthesized. There are 2 general methods of synthesis: qualitative and quantitative.
Systematic reviews are qualitative in nature, and authors of reviews should at a minimum discuss the results of studies in relation to each other and in relation to the quality of each study. This is different from just reporting results from each study, which narrative reviews tend to do; systematic reviews should try to synthesize and summarize the major findings.
Next slide
36. Data synthesis
Quantitative or meta-analyses
Statistical method for combining results from >1 study
Advantage: provides an estimate of treatment effect
Disadvantage: misleading estimate if used inappropriately
Misuse of terminology
Systematic review and Meta-analysis = NOT the same
Adapted from Cochrane Collaboration open learning materials for reviewers, 2002-2003.
The second method of data synthesis is quantitative analysis, also referred to as meta-analysis. Meta-analysis can be used in addition to qualitative synthesis when combining, or pooling, results from more than 1 study.
The advantage of meta-analysis is that it provides a numerical estimate of the treatment effect (in the form of an RR, OR, NNT, or mean difference). The disadvantage is that this numerical estimate can be misleading if trials are combined or pooled inappropriately.
Another complication with meta-analysis is the terminology. Oftentimes, people misuse the terms systematic review and meta-analysis and use them interchangeably. The Venn diagram shows that you can have systematic reviews with or without meta-analyses, and you can have meta-analyses that are not part of systematic reviews at all. Someone could easily find a few studies that support their preformed conclusions and calculate a result with no attempt to be systematic. These types of meta-analyses are very biased and should not be confused with, or referred to as, being systematic in nature. Remember, meta-analysis is a statistical method for combining results!
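As a concrete illustration of what a meta-analysis computes, here is a minimal fixed-effect, inverse-variance pooling of three hypothetical log risk ratios. The numbers are invented, and fixed-effect pooling is only one approach; random-effects models are often preferred when heterogeneity is present:

```python
import math

# Hypothetical per-trial log risk ratios and their standard errors.
log_rr = [-0.22, -0.10, -0.35]
se     = [0.10, 0.08, 0.15]

weights = [1 / s**2 for s in se]          # inverse-variance weights
pooled = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)
se_pooled = math.sqrt(1 / sum(weights))   # SE of the pooled estimate

rr = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * se_pooled),
      math.exp(pooled + 1.96 * se_pooled))
print(f"Pooled RR = {rr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
# → Pooled RR = 0.84, 95% CI 0.75 to 0.94
```

Note how the more precise trials (smaller standard errors) dominate the pooled estimate; this is exactly why pooling clinically dissimilar trials can produce a misleading number.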
Next slide
37. Meta-analysis
The review should provide enough information about the included studies for you to judge whether combining results was appropriate.
Two types of heterogeneity
Clinical heterogeneity
Does it make clinical sense to combine these studies?
Statistical heterogeneity
Are there inconsistencies in the results?
Calculation of the Q or I-squared statistic
Common sources of heterogeneity
Clinical diversity between studies, conflicts of interest, and differences in study quality Adapted from Cochrane Collaboration open learning materials for reviewers 2002-2003. The decision to combine or not combine studies in meta-analyses involves an element of judgment and authors of systematic reviews and meta-analyses should discuss their reasons for combining or not combining results across studies. Authors of systematic review and meta-analysis should also provide the reader with enough information to judge whether or not combining results was reasonable.
There are 2 types of heterogeneity, clinical and statistical, and assessing both is very important when determining whether combining studies is appropriate.
Assessing clinical heterogeneity asks whether the studies were reasonably similar with regard to population, intervention, and outcome, and also whether combining the studies makes clinical sense at all. Assessing statistical heterogeneity asks whether there are inconsistencies in the results among the included studies. The Q or I-squared statistic is usually used to quantify statistical heterogeneity.
Common sources of heterogeneity include clinical diversity between studies (such as differences in patient characteristics, settings, interventions, or outcomes), conflicts of interest, and differences in study quality.
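To make the statistical side concrete, here is a hedged sketch of how Cochran's Q and the I-squared statistic can be computed from per-study log effect estimates and their standard errors, using inverse-variance weights. The numbers are illustrative, not from any real review:

```python
# Sketch of Cochran's Q and I-squared from per-study log effect estimates
# and standard errors (illustrative numbers, not from any real review).
def q_and_i2(estimates, std_errors):
    weights = [1 / se**2 for se in std_errors]           # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled)**2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: % of total variability beyond what chance alone would produce
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Log risk ratios from three hypothetical trials
q, i2 = q_and_i2([-0.4, -0.1, 0.3], [0.15, 0.20, 0.25])
print(f"Q = {q:.2f}, I^2 = {i2:.0f}%")
```

As a rough rule of thumb from the Cochrane literature, I-squared values above roughly 50% suggest substantial heterogeneity, which should prompt a hard look at whether pooling is sensible.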
Next slide
38. Example: Clinical heterogeneity? Here is an example of clinical heterogeneity from a meta-analysis by Nissen and colleagues from the NEJM. Part of the introduction reads:-----
Based on this introduction, we anticipate that the studies included in this meta-analysis enrolled patients with type 2 diabetes who used rosiglitazone. However, when looking at the actual included populations, you can see that other populations, such as patients with chronic psoriasis, Alzheimer's disease, or impaired glucose tolerance, were included as well.
Combining these very different populations with patients with diabetes adds significant heterogeneity and could bias the results or render them confusing or meaningless.
Next slide.
39. How to read a Forest plot Forest plot adapted from Bjelakovic, et al. Lancet 2004; 364:1219-28. Another method used to help assess clinical heterogeneity (more specifically for meta-analyses) is to visually inspect a forest plot.
Each horizontal line with a box represents an included trial in the meta-analysis.
The center vertical line represents the line of no effect, and for this review this line is at 1 for RR. This line also divides the 2 treatment groups: the right side favors --- and the left side favors ---.
For each trial, the square box indicates the point estimate or result. The size of the square is proportional to the weight or size of the study; the larger the trial, the more precise the estimate.
The length of the horizontal line represents the confidence interval for the point estimate.
And finally the diamonds represent the pooled or combined results across the trials. The width of the diamond represents the confidence interval while the vertical tips indicate the point estimate.
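To make this anatomy concrete, here is a toy sketch that renders a text-mode forest plot for a few hypothetical risk ratios. A real forest plot such as Bjelakovic's is drawn on a log scale; this simplified version uses a linear axis:

```python
# Toy text-mode forest plot for hypothetical risk ratios. Published forest
# plots use a log scale; a linear axis is used here for simplicity.
studies = [                     # (name, RR, lower CI, upper CI)
    ("Trial A", 0.60, 0.40, 0.90),
    ("Trial B", 0.85, 0.55, 1.30),
    ("Trial C", 1.10, 0.70, 1.70),
]

def render(name, rr, lo, hi, width=40, max_rr=2.0):
    col = lambda x: min(width - 1, int(x / max_rr * width))
    line = [" "] * width
    for i in range(col(lo), col(hi) + 1):
        line[i] = "-"           # horizontal line: the confidence interval
    line[col(rr)] = "o"         # box: the point estimate
    line[col(1.0)] = "|"        # vertical line of no effect (RR = 1)
    return f"{name:<8}{''.join(line)}  {rr:.2f} ({lo:.2f}-{hi:.2f})"

for s in studies:
    print(render(*s))
```

Reading the output, Trial A's whole interval sits left of the `|` (a significant result), while Trials B and C cross the line of no effect, just as you would judge them visually on a real plot.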
Next slide
40. Two common methods
Fixed effects model
Assumes homogeneity
Random effects model
Assumes heterogeneity
Use both methods and select 1 to present
Should briefly discuss why a certain method was selected
What statistical method was used for the meta-analysis? Once you've deemed it reasonable to combine studies, the next step is to determine which statistical method was used to combine the results. For this module, we will not review the differences between these methods in detail. Instead, we will just define the terms and what they assume so you are at least aware of these methods.
The 2 commonly reported statistical methods for combining results across studies are: fixed and random effects model.
A fixed effects model is based on a mathematical assumption that every study is evaluating a common treatment effect. This model assumes that results across studies differ by chance alone, and therefore assumes homogeneity.
The random effects model, or DerSimonian-Laird method, is an alternative approach that does not assume a common or fixed treatment effect exists. This model assumes that the true treatment effects in the studies may differ from one another, and therefore assumes heterogeneity.
Typically, both models are run and 1 is selected for presentation. The authors of the analysis should briefly discuss why a certain method was reported.
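The two models can be sketched side by side. The following is a hedged illustration, with made-up numbers, of inverse-variance fixed effects pooling and the DerSimonian-Laird moment estimator for the between-study variance (tau-squared):

```python
# Hedged sketch of fixed effects vs. DerSimonian-Laird random effects pooling
# of log effect estimates (illustrative numbers, not from any real review).
def pool(estimates, std_errors):
    w = [1 / se**2 for se in std_errors]                 # inverse-variance weights
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - fixed)**2 for wi, y in zip(w, estimates))
    df = len(estimates) - 1
    # DerSimonian-Laird moment estimator of between-study variance tau^2
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # Random effects weights add tau^2 to each study's variance
    w_re = [1 / (se**2 + tau2) for se in std_errors]
    random_ = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    return fixed, random_, tau2

fixed, random_, tau2 = pool([-0.4, -0.1, 0.3], [0.15, 0.20, 0.25])
print(f"fixed = {fixed:.3f}, random = {random_:.3f}, tau^2 = {tau2:.3f}")
```

Because the random effects weights fold in tau-squared, large trials dominate less; with heterogeneous inputs like these, the two pooled estimates differ, which is exactly why authors should justify the model they present.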
Next slide
41. Invalid methods of synthesis Picking and choosing
Pick what you like, ignore what you don't like
Searching for proof
Data dredging or data mining
Vote counting
Counting the number of studies with positive and negative results without considering study quality
We just reviewed 2 methods of data synthesis: qualitative and quantitative. However, it is also common to observe invalid methods of data synthesis, often seen in poor quality systematic reviews and narrative reviews. These methods include: picking and choosing which evidence to report on while ignoring or minimizing what you don't like;
The next is
searching for proof, which is similar to data dredging or data mining; that is, digging long and hard enough to find something that is statistically significant;
And finally vote counting which is counting the number of studies with positive and negative results without the consideration of study quality.
Next slide
42. Bridging the results to the conclusion Do conclusions reflect the uncertainty in the evidence?
Are gaps identified and recommendations for future research provided?
Now that we've finished going through each of the steps involved in the systematic review process and the methods for data synthesis and analysis, the next step is to summarize the findings. For most clinicians and decision makers, the conclusion is probably the most sought-out and read section of studies and systematic reviews. Therefore, it is crucial for authors and readers of these reviews to evaluate whether the conclusions are clear and accurately reflect the results.
Do the conclusions reflect any uncertainty in the evidence that was found?
Are there still gaps in the evidence that need to be addressed in future research?
Next slide
43. I. Systematic Reviews: Outline Why, When, What?
Benefits and limitations
Steps in conducting Systematic Reviews
Scientific process
Quality assessment of Systematic Reviews
Tools and checklists
So, we just completed reviewing the steps involved in conducting systematic reviews. Evaluating whether each step of the systematic review process was followed with as little bias as possible serves as the foundation for assessing its quality. Since we've already reviewed much of the process in detail and discussed some quality aspects, we will just briefly review a few tools and resources that can help you assess systematic review quality.
Next slide
44. Key questions to ask when assessing quality of systematic reviews Is there a clear, focused, clinically relevant question?
Were study eligibility criteria reported and rationale provided (if needed)?
Was the search for relevant studies detailed and exhaustive?
Were included trials assessed for quality and were the assessments reproducible?
How were the data synthesized, and was this appropriate?
Are the conclusion statements clear, and do they reflect the results from the evidence that was reviewed?
To assess the quality of systematic reviews, a few questions need to be addressed. These include the questions, we just reviewed which are:
Is there a clear, focused, and clinically relevant question?
How were the studies included in the review selected? Were reasons reported as to why certain studies were not included?
How systematic was the search for studies? Was the strategy fairly exhaustive?
Were the individual studies assessed for their quality, and are the assessments reproducible?
How were the data from the studies synthesized?
And finally, do the concluding statements make sense? Do they reflect the results from the evidence that was reviewed?
Next slide
45. Tools and lists for assessing systematic review quality >10 different scales and checklists
Oxman and Guyatt
Sacks, et al
DERP method
As with assessing internal validity of randomized controlled trials, there are multiple different tools and checklists available to help guide you in evaluating quality.
The commonly referenced tools include the Oxman and Guyatt tool, the Sacks et al tool, and the DERP method for assessing systematic review quality.
It is important to note that no one tool is considered the gold standard. Some quality assessment tools or checklists have been validated by researchers, but for the most part, well-thought-out tools evaluate the key concepts in assessing systematic review quality. Also, remember that, similar to assessing individual study quality or internal validity, this requires some judgment and has an element of subjectivity; therefore, it is prudent to use dual review when evaluating the quality of systematic reviews as well.
Next slide
46. Oxman and Guyatt Shea B, et al. Eval Health Prof 2002; 25(1):116-29. This is the Oxman and Guyatt tool. Each question is answered and a total score is given to rate the overall quality.
As you can see on this slide, each question evaluates whether the steps in the review process were conducted in a manner that was uniform and that would minimize potential biases.
Next slide
47. Using Oxman and Guyatt method From DERP report. http://www.ohsu.edu/drugeffectiveness Here is an example of the Oxman and Guyatt method in action. Dual review was conducted and this is the final product. The top of the form lists an abridged version of the questions from the previous slide. Each question was answered and a final score out of 7 was given.
As you can see, the systematic review that was evaluated reported that searches were conducted in MEDLINE from 1966 to May 2007. The sources that were searched were reported, and as you can see, more than 5 different sources of literature were searched, suggesting a comprehensive search. Study eligibility criteria, also referred to as inclusion criteria, were documented, and the authors of this specific systematic review also reported criteria for exclusion.
Next, dual review was performed in the selection of studies and each included trial was evaluated for its internal validity. The systematic review provided a table with their assessments.
Next slide
48. Using Oxman and Guyatt method (continued) Questions 6 and 7 were evaluated and were found to be adequately fulfilled, although there were a few areas of uncertainty relating to how poor quality studies were handled in the systematic review. As you may have inferred from the mention of the random effects model and the I-squared test, this systematic review also conducted a meta-analysis. Based on the study information and justifications provided by the authors, it was deemed that the findings were combined appropriately. And finally, the summary and conclusions reflected the results without overstepping the available evidence. An overall score of 6 out of 7 was given.
Next slide
49. Sacks, et al 1. Prospective design
a. Protocol
b. Literature search
c. Lists of trials analyzed
d. Log of rejected trials
e. Treatment assignment
f. Ranges of patients
g. Ranges of treatment
h. Ranges of diagnosis
2. Combinability
a. Criteria
b. Measurement
3. Control of bias
a. Selection bias
b. Data-extraction bias
c. Interobserver agreement
d. Source of support
4. Statistical analysis
a. Statistical methods
b. Statistical errors
c. Confidence intervals
d. Subgroup analysis
5. Sensitivity analysis
a. Quality assessment
b. Varying methods
c. Publication bias
6. Application of results
a. Caveats
b. Economic impact
7. Language
This is the Sacks and colleagues tool. Compared with the Oxman and Guyatt tool, this tool is more detailed in each area, such as the design of the review, assessment of the combinability of trials, control of bias, statistical analyses, sensitivity analyses, application of results, and language.
This tool also asks the reader to assess whether publication bias was evaluated by the authors of the systematic review.
Next slide.
50. DERP method From DERP report. http://www.ohsu.edu/drugeffectiveness This is another example of a quality assessment tool for systematic reviews, used by the Drug Effectiveness Review Project. This tool is similar to the 2 previous tools, but unlike the Oxman and Guyatt tool, which uses a scoring system, this one takes a more qualitative approach of rating the systematic review as good, fair, or poor. A rating of good requires near-perfect fulfillment of each criterion. A poor rating is based on either a serious fatal flaw or a combination of flaws,
while the fair rating includes the rest. Each quality rating lies on a spectrum of good, fair, and poor, and again, dual review is performed.
Feel free to take a closer look at each component.
Next slide
51. Database of Abstracts of Reviews of Effects Another great resource for readers who desire a quick quality assessment of systematic reviews is the Database of Abstracts of Reviews of Effects, or DARE. DARE is a service provided by the Centre for Reviews and Dissemination. Its focus is on providing readers, clinicians, and policy makers with brief, clear, and concise summaries of systematic reviews on health and social interventions, especially highlighting strengths and weaknesses. Each month thousands of citations are screened to identify potential systematic reviews. These are independently assessed by two researchers for inclusion on the DARE website.
To ensure the quality of the assessments, abstracts are written and independently checked by health services researchers with in-depth knowledge and experience of systematic review methods. A copy of the abstract is sent to the original authors of the systematic review for any additional information. Authors are invited to reply with corrections to factual errors, further information, and other relevant research. Where applicable, this information is added to the records. The DARE database is updated monthly, but due to the sheer volume of reviews, there appears to be a time lag.
Next slide
52. Here is an example of an abstract that evaluated a systematic review on estrogen therapy for the treatment of hot flashes in postmenopausal women.
Next slide
53. Here is the CRD summary/appraisal of the review with concluding remarks.
Next slide
54. Summary: Systematic Reviews Advantages and disadvantages
Can minimize biases that exist in individual studies
May not answer all questions of interest
Systematic reviews and meta-analyses are not synonymous
Meta-analysis is a statistical method of combining studies
Each step of the process should be questioned
Comprehensive search of evidence
Quality assessment of individual trials
Appropriate method of synthesis
So, this concludes the segment on appraising systematic reviews.
We've discussed some advantages and disadvantages of this type of evidence. One advantage of systematic reviews is that, taken together, they can help minimize potential biases observed in individual studies. One disadvantage is that systematic reviews have their limitations and may not answer all questions of interest.
Hopefully you understand that systematic reviews and meta-analyses are not the same. Meta-analysis is a statistical method used for combining study results.
And finally, I hope you understand that a well-done, high quality systematic review follows a rigorous scientific process and that each step of the process should be questioned. In particular, you should evaluate whether a comprehensive search of the evidence was performed, whether quality assessment of individual trials was carried out, and whether appropriate methods of synthesis were used.
Next slide.
55. Appraisal of guidelines
Welcome to the critical appraisal of clinical practice guidelines!
Next slide
56. II. Guidelines: Outline What is the purpose and what are the potential benefits and limitations?
Why do we need to critically assess guidelines?
Quality assessment of guidelines
Tools to help evaluate guidelines In this section, we will review the purpose of clinical practice guidelines and their benefits and limitations. We will also review a few reasons why critical evaluation is warranted, cover the key concepts in assessing the quality of guidelines, and finally end with a review of a few tools that you can use to assess the quality of clinical guidelines.
Next slide
57. Guidelines: steps beyond a review Incorporates the judgments and values involved in making recommendations
Addresses larger spectrum of issues relevant for clinical decision making
Purpose:
Provide clinical practice recommendations
Improve quality of care and outcomes
Seek to influence change in clinical practice
Reduce inappropriate variation in practice
Shed light on gaps in the evidence
As mentioned before, systematic reviews are not guidelines, and guidelines are not systematic reviews, even though this body of evidence can be used in the process. Development of clinical practice guidelines requires a few more steps beyond just a systematic review of the literature. It incorporates the judgments and values of clinicians, patients, and other stakeholders by addressing broader clinically important issues.
The most obvious purpose of creating a guideline is to provide clinical practice recommendations that result in improved care and outcomes. Guideline developers also try to influence a change in current practice, possibly by reducing inappropriate variations in practice. Because large bodies of evidence are considered in guideline development, gaps in the evidence may be revealed, which could generate scientific research and progress.
Next slide
58. Guidelines are not intended to
Provide a black and white answer for complex situations
Substitute for clinical insight and judgment
Be a legal resource in malpractice cases
Prompt providers to withdraw availability or coverage of therapies
Hinder or discourage scientific progress Woolf S, et al. BMJ 1999; 318:527-30. Guidelines however, are not intended to provide clinicians or decision makers with a black and white or cookbook answer for complex medical situations. Nor are they meant to replace clinical insight and judgment. Guidelines are meant provide a clinicians with as much evidence as available to make decisions. Guidelines are also not mean to be used as a legal or binding document in malpractice cases. Often, guidelines can be misconstrued by both policy makers and clinicians as the only right way to practice medicine. In the same light, guidelines are not intended for providers or health groups to withdraw availability or coverage of therapiesagain, guidelines are guides. And finally, recommendations from guidelines are not meant to hinder or discourage scientific progress or research.
Next slide
59. Why is it necessary to critically assess clinical practice guidelines? There are > 2,500 published guidelines
Multiple guidelines with differing recommendations
Not all guidelines are of good/high quality
Consensus-based
Evidence-based (systematic methods, transparent)
Many stakeholders who are invested in the influence of their guidelines
Government organizations and healthcare systems
Professional societies
Pharmaceutical industry

So, why is it necessary to critically assess clinical practice guidelines? Isn't it enough that we've critically looked at trials and systematic reviews?
Evaluating clinical practice guidelines with scrutiny is needed because of the wide variability in the quality of how guidelines are produced, used, and abused.
There are more than 2,500 guidelines, most of which are consensus-based. These are more often than not heavily influenced by expert opinion rather than based on a systematic and transparent approach to using evidence, and they are less time- and resource-intensive to produce. Evidence-based or evidence-informed clinical practice guidelines, on the other hand, are limited in number compared with consensus-based guidelines because of their time- and resource-intensive approach. These guidelines are definitely preferred and are less likely to be biased and abused relative to consensus guidelines.
It is imperative that clinical guidelines be scrutinized because of their significant impact on clinical practice. Many stakeholders are invested and recommendations in guidelines affect patients, physicians, policies, professional societies, and industry.
Next slide
60. A glance at guidelines from 1988-1998: 3 items assessed
Description of professionals involved
Search undertaken
Explicit grading of evidence for recommendation
Grilli, et al. Lancet 2000; 355:103-6.

Another important reason to critically evaluate clinical practice guidelines is the increasing rate at which guidelines are published by professional societies, which have been taking active roles in the development process. Large growth in guidelines without critical evaluation of the process could undermine the credibility and benefits of having guidelines.
Grilli and colleagues wanted to empirically evaluate 3 aspects that affect quality of guidelines put forth by specialty groups over a 10 year period between 1988 and 1998.
431 guidelines were retrieved and evaluated on 3 aspects: 1) Type of professionals involved in guideline groups, 2) Sources of information used to retrieve evidence, and 3) whether there was explicit grading or quality assessment of the included evidence in guidelines.
As you can see, a large proportion of guidelines came from cardiology, oncology, and neurology specialty groups. More than one third of the guidelines concerned treatment recommendations. And the largest surge of guidelines over the 10-year span was published between 1994 and 1998.
Next slide
61. Results from Grilli, et al

Grilli and colleagues found that 67% of the guidelines from 1988-1998 did not describe the type of professionals involved in guideline groups. Only 28% of guidelines included at least 1 professional or representative from outside the specialty. Astonishingly, 87% of the guidelines did not specify whether a systematic search for evidence was undertaken. And less than 20% of guidelines provided information describing how the strength of the body of evidence was graded. Only 5% of guidelines fulfilled all 3 criteria, compared with 54% that did not meet any of the 3 criteria. This example provides compelling evidence of the need to critically evaluate and scrutinize the quality of guidelines.
Next slide
62. Assessing quality: Who was involved in the decision-making process?
Were all relevant perspectives considered?
To what extent were the funders of the guideline involved in the process?
Conflicts of interests declared for each participant?
Were all important practice options and clinically relevant outcomes considered?
What was excluded and was rationale provided?
How were the relative values of the outcomes weighed in terms of importance?
When assessing guideline quality, a few key questions need to be asked. The first: Who was involved in decision making, and were all relevant perspectives included and considered during the guideline development process?
Well-conducted and thoughtful guidelines try to involve as many disciplines as possible. When a single specialty group is formed, the group will be inherently biased in favor of performing procedures or treatments in which it has a vested interest. Individual biases may be better balanced in multidisciplinary groups, which may produce more valid guidelines.
Next: To what extent were the funders of the guideline involved in the process? And were conflicts of interest declared for each of the participants? Reporting such information makes the process more transparent to any potential sources of bias, especially if one group appears to be heavily represented relative to others.
It is also important to assess whether all important practice options and clinically relevant outcomes were considered. If certain outcomes were not considered, members of the guideline committee should provide rationale as to why certain items were excluded.
Once a list of relevant outcomes has been decided upon, it is worth asking how the outcomes were weighed in terms of importance. Is there a reason why one outcome is considered more important than another? Is the outcome preference influenced by the funders or a single specialty society, or is it a reflection of a multidisciplinary group? This information is nice to have but may not always be available to readers of clinical practice guidelines.
Next slide
63. Consensus guidelines: Assessment, diagnosis, and treatment of diabetic peripheral neuropathic pain. Mayo Clinic Proceedings 2006; 81(4):S1-36. This is an example of the front page of a consensus guideline for the treatment of diabetic neuropathy. 11 pain specialists were invited and involved in the development of this guideline over just a 2-day period in New Orleans. Looking at the list of participants, there were no primary care physicians, internal medicine physicians, or geriatricians involved, even though these clinicians were the target audience.
Financial and commercial conflicts of interest were reported for 10 of the 11 members, all of whom had received honoraria or research funding from at least 1 of the manufacturers of medications listed in the first-tier or recommended category. 8 members had received some funding from Eli Lilly, which provided an educational grant for the production of this guideline.
Next slide.
64. Assessing quality: How was evidence retrieved?
Was it comprehensive?
Was there explicit description of how evidence was used?
Systematic reviews?
Was there an approach to the hierarchy of evidence?
Was quality of the evidence assessed and reported?
How was the body of evidence graded?

The next series of questions revolves around the evidence base and the quality of the evidence used in guidelines, which is probably the crux of determining more credible guidelines.
First, how was evidence retrieved? Was the process of retrieving the evidence explicit, transparent, and comprehensive? Was there a process for using different levels of evidence (such as expert opinion) versus systematic reviews? Was there a specific approach to the hierarchy of evidence?
Next, how was the quality of the evidence assessed in the guideline development process, and was this information reported? Based on the quality of the evidence, how was the body of evidence graded so that clinical recommendations could be made? Or were recommendations based more on expert opinion and experience?
Next slide
65. Chou, et al. Ann Intern Med 2007; 147:505-14. Here is a snapshot of the methods section of a guideline for low back pain sponsored by the American Pain Society and the American College of Physicians. This guideline specified who determined the expert panel involved in the process, reported the methods used in retrieving evidence, explained how evidence was used based on the hierarchy of evidence, noted what was excluded from the review, and so on.
Next slide
66. GRADE-ing the strength of the evidence and recommendations reported in guidelines
To provide a systematic and explicit approach to making judgments involved in a guideline process that can be used by all guideline developers
One method of assessing the body of evidence

One recently developed method for assessing the body of evidence in guideline development is the GRADE (Grading of Recommendations Assessment, Development and Evaluation) method.
The GRADE method was developed by a multidisciplinary group of clinicians and methodologists interested in turning the old and variable methods of assessing the body of literature in guidelines into something more transparent, systematic, and replicable.
Next slide
67. The approach considers: Strength of the body of evidence
Study design
Risk of bias or limitations
Consistency of results
Precision
Directness of evidence
Strength of recommendation:
Strong vs. Weak

Briefly, the GRADE approach considers a minimum of 4 domains (here we list 5): study design, risk of bias across trials, consistency of results across multiple trials, precision, and directness of the evidence from multiple trials.
An overall GRADE of the body of evidence is categorized as high, moderate, low, or very low quality. Grading the body of evidence in this manner helps with determining whether strong or weak recommendations will be made.
Next slide
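The downgrade logic described on this slide can be sketched as a toy function. This is an illustration only, not an official GRADE implementation: real GRADE assessments are structured qualitative judgments, and evidence can also be upgraded (for example, for large effect sizes). The function name and concern labels here are hypothetical.

```python
# Toy sketch of the GRADE rating logic described above (illustration only).
# RCT evidence starts at "high" quality; observational evidence starts at "low".
# Each domain judged to have a serious limitation (risk of bias, inconsistency,
# imprecision, indirectness) moves the rating down one level.

LEVELS = ["very low", "low", "moderate", "high"]

def grade_body_of_evidence(study_design, concerns):
    """Return an overall quality level for a body of evidence.

    study_design: "rct" or "observational" (hypothetical labels)
    concerns: set of domains with serious limitations,
              e.g. {"inconsistency", "imprecision"}
    """
    level = 3 if study_design == "rct" else 1  # start high for RCTs, low otherwise
    level -= len(set(concerns))                # downgrade one level per concern
    return LEVELS[max(level, 0)]               # floor at "very low"

# RCT evidence with inconsistent results and imprecise estimates
# ends up two levels below "high":
print(grade_body_of_evidence("rct", {"inconsistency", "imprecision"}))  # low
```

In this simplification, the overall rating then informs whether a strong or weak recommendation can be made, as described on the slide.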
68. Example: GRADE table. This is an example of a GRADE table with all the factors that were considered in determining the effectiveness of pneumococcal vaccine for preventing invasive pneumococcal disease.
http://www.who.int/immunization/pp_pneumo_grade_tables/en/index.html
69. Finally, this is the AGREE (Appraisal of Guidelines for Research and Evaluation) instrument, a checklist for critically assessing clinical practice guidelines. The instrument was developed by researchers and policy makers who wanted to improve the quality of clinical practice guidelines by establishing a minimum framework for the process.
This tool evaluates and questions each step of the guideline development process, including: scope and purpose, stakeholder involvement, rigor of development, clarity of recommendations, applicability, and editorial independence.
The key concepts, however, continue to revolve around who was involved, how the evidence was retrieved and assessed, and how the body of evidence was considered.
Next slide
70. Summary: Guidelines Incorporate values and judgments
Can improve care by reducing variation in practice
Not meant to provide black and white answers for complex problems
Not all guidelines are the same
Consensus-based approach
Evidence-informed approach
Important to question each step of the process
Who was involved?
How was evidence retrieved, synthesized, and graded?
How were recommendation decisions made?
In summary, unlike systematic reviews, clinical practice guidelines incorporate the values and judgments of those involved in the development process. Clinical practice guidelines can be helpful where there is wide variability in practice despite good evidence, with the caveat that guidelines are not meant to provide black-and-white answers for all situations or complex problems.
It is important to remember that not all guidelines are created equally;
there are significant differences and implications between consensus-based approaches and evidence-based or evidence-informed approaches.
And it is always important to question each step of the guideline development process: Who was involved? How was evidence retrieved, synthesized, and graded? And how were the recommendations made?
This concludes this module.
Next slide.
71. Acknowledgements Attorney General Consumer and Prescriber Education Program
Members of the technical advisory committee of this grant
Office for Oregon Health Policy and Research
The University of Texas Southwestern Medical Center
The Federation of State Medical Boards Research and Education Foundation
I would like to acknowledge the Attorney General Consumer and Prescriber Education Program, the members of the technical advisory committee for review of the module, the Office for Oregon Health Policy and Research, the University of Texas Southwestern Medical Center, and the Federation of State Medical Boards Research and Education Foundation.
Next slide
72. CME instructions Please complete the survey, CME questions, and program evaluation after this slide
Don't forget to click the finish button at the end of the CME questions
You should be directly linked to a CME form which you will need to fill out and fax, email, or mail in order to receive credit hours
Please complete the survey, CME questions, and program evaluation before exiting the session. After you have completed the CME questions, don't forget to click the finish button; this will link you directly to a CME form, which you will need to fill out and fax, email, or mail in order to receive your credit hours.
73. systematic_quiz