This training provides an overview of systematic reviews, their importance in informing policy and practice, and their role in providing directions for further research. The course covers the reliability and validity of review methods, as well as the Cochrane and Campbell Collaborations, and explores how research syntheses address questions of treatment efficacy and effectiveness.
C2 Training: May 9–10, 2011
Introduction to Systematic Reviews
Part 1: The science of research synthesis
• Summarize existing empirical research to
  • Inform policy and practice
  • Provide directions for further research
• Use empirical evidence about the reliability and validity of review methods
  • Cochrane Collaboration (methodological reviews), Campbell Collaboration
Evidence for practice and policy Adapted from: Gibbs (2003), Davies (2004)
Efficacy and Effectiveness
• Much attention is paid to these issues
• They are not more important than other topics, but effects matter
• There is much room for improvement in how we analyze, synthesize, and understand treatment effects, even though many reviews deal with these topics
The problem: Studies pile up
• “What can you build with thousands of bricks?” (Lipsey, 1997)
• Many studies are conducted on the same topic
• Which one(s) do we use? How do we use them?
Rationale for Research Synthesis
Combining the results of multiple studies:
1. Provides more compelling evidence than the results of any single study
  • Single studies can have undue influence on practice and policy
  • We don’t use single-subject (N=1) designs to assess public opinion, and we shouldn’t rely on single studies to answer important questions (e.g., about treatment effects); a brief sketch of the precision gained by pooling follows below
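To make the precision argument concrete, here is a minimal arithmetic sketch in Python (the numbers are invented for illustration): under inverse-variance pooling, combining k equally precise studies shrinks the standard error of the estimate by a factor of sqrt(k).

```python
import math

# Hypothetical example: five small trials, each estimating the same
# treatment effect with a standard error of 0.25 (invented values).
se_single = 0.25
k = 5

# With inverse-variance weights and equal precision, the pooled
# standard error shrinks by a factor of sqrt(k).
se_pooled = se_single / math.sqrt(k)

print(f"Single-study SE: {se_single:.3f}")  # 0.250
print(f"Pooled SE (k=5): {se_pooled:.3f}")  # 0.112
```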
Rationale for Research Synthesis (cont.)
2. Provides new opportunities to investigate
  • What works best for whom under what conditions
  • Why results may vary across studies that differ in
    • Research designs
    • Sample characteristics (populations/problems)
    • Intervention/implementation
    • Comparison conditions
    • Measures
    • Geographic/cultural context and setting
  • Using analyses that capitalize on natural variations across studies
How do we build evidence?
• What are our blueprints?
• What are the raw materials?
• What methods are used to combine results across studies?
Blueprints
• Plans for the review
• Reviews vary in the amount of planning, transparency, and rigor
• Three approaches:
  • Traditional, narrative reviews (still very common in the social and behavioral sciences)
  • Systematic reviews
  • Meta-analysis
Traditional reviews
• Convenience samples of published studies
• Narrative description of studies
• “Cognitive algebra” or “vote counting” to synthesize results
  • Relies on statistical significance in primary studies, which may be “underpowered” (too small or too weak to detect effects); the simulation sketched below shows why vote counting misleads in this situation
• Decision rules are not transparent
• Vulnerable to many sources of bias…
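To see why vote counting fails with underpowered studies, consider a small simulation (a sketch with invented numbers, not data from any real review): a true standardized effect of d = 0.3 studied in twenty trials with 30 participants per arm. Each trial has roughly 20% power, so most come out non-significant; vote counting suggests the treatment “doesn’t work,” while simple pooling recovers the true effect.

```python
import math
import random
import statistics

random.seed(1)

# Invented setup: a real effect of d = 0.3, twenty small two-arm
# trials with 30 participants per arm (each badly underpowered).
TRUE_D, N_PER_ARM, N_STUDIES = 0.3, 30, 20

def run_trial():
    """Simulate one trial; return (effect estimate, significant at .05?)."""
    treat = [random.gauss(TRUE_D, 1.0) for _ in range(N_PER_ARM)]
    ctrl = [random.gauss(0.0, 1.0) for _ in range(N_PER_ARM)]
    d_hat = statistics.mean(treat) - statistics.mean(ctrl)
    se = math.sqrt(2.0 / N_PER_ARM)          # approximate SE of d
    return d_hat, abs(d_hat / se) > 1.96     # two-tailed z test

results = [run_trial() for _ in range(N_STUDIES)]
n_sig = sum(sig for _, sig in results)
pooled = statistics.mean(d_hat for d_hat, _ in results)

# Vote counting sees mostly "failures"; pooling recovers the effect.
print(f"Significant results: {n_sig}/{N_STUDIES}")  # expect roughly 4/20
print(f"Pooled effect estimate: {pooled:.2f}")      # close to 0.30
```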
Publication bias
• Studies with statistically significant, positive results are approximately three times more likely to be published than similar studies with null or negative results (Song et al., 2009, inception cohort)
  • That is, the likelihood of publication is related to the direction and significance of results, net of the influence of other variables (Dickersin, 2005; Scherer et al., 2004; Song et al., 2009; Torgerson, 2006)
• Sources of publication bias are complex
  • Investigators are less likely to submit null results for conference presentations (Song et al., 2009) and publication (Dickersin, 2005; Song et al., 2009)
  • Peer reviewers and editors may be less likely to accept/publish null results (Mahoney, 1977 vs. Song et al., 2009)
• A sketch of one common statistical check for this bias follows below
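One common way reviewers probe for publication and small-study bias is a funnel-plot asymmetry test such as Egger’s regression, which regresses each study’s standardized effect (d/SE) on its precision (1/SE); an intercept far from zero suggests that small studies report systematically larger effects. A minimal sketch, using hypothetical effect sizes and standard errors (included as an illustration of the idea, not part of the slide content):

```python
# Hypothetical effect sizes (d) and standard errors (SE): the small,
# imprecise studies report the largest effects, a classic funnel
# asymmetry pattern consistent with publication bias.
d  = [0.62, 0.55, 0.48, 0.30, 0.25, 0.21, 0.18]
se = [0.30, 0.28, 0.22, 0.15, 0.12, 0.10, 0.08]

y = [di / si for di, si in zip(d, se)]  # standardized effects
x = [1.0 / si for si in se]             # precisions

# Ordinary least squares by hand (no external libraries needed).
n = len(d)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = my - slope * mx

# A large positive intercept flags small-study (publication) bias.
print(f"Egger intercept: {intercept:.2f}")
print(f"Slope (bias-adjusted effect): {slope:.2f}")
```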
Dissemination biases
Studies with significant results are:
• Published faster (Hopewell et al., 2001)
• Cited and reprinted more often (Egger & Smith)
• More likely to be published in English than in other languages (Egger, Zellweger-Zahner et al.)
• Easier to locate (especially in English)
Outcome reporting bias
• Within studies with mixed results, significant results are more likely to be
  • reported (mentioned at all)
  • fully reported (i.e., data provided)
  • (Chan et al., 2004, 2005; Williamson et al., 2006)
• See also a recent article on Tamiflu in the New York Review of Books
Confirmation bias
• The tendency to seek and accept information that confirms prior expectations (hypotheses) and to ignore evidence to the contrary (Bacon, 1621/1960; Wason, 1960, 1968; Mahoney, 1977; Fugelsang et al., 1994; Nickerson, 1998; Schrag, 1999)
Allegiance bias
• Researchers’ preferences predict results (Luborsky et al., 1999)
Other sources of bias
• Selection bias
• Trivial properties of studies or reports affect recall and evaluation of information
  • Memorable titles (Bushman & Wells, 2001)
Problems
• Publication and reporting biases are cumulative (Altman, 2006)
• Tend to inflate estimates of effects
• Serve to maintain orthodoxy (popular theories/treatments)
• These biases are ubiquitous, but often ignored
Better blueprints: Systematic reviews
• Aim to minimize bias and error in the review process
• Develop and follow a pre-determined plan (protocol)
• Use transparent (well-documented, replicable) procedures to locate, analyze, and synthesize the results of previous studies
Systematic reviews (SRs)
Steps to reduce bias and error:
• Set explicit inclusion/exclusion criteria
• Develop and document strategies for locating all relevant studies (regardless of publication status)
• Check inter-rater agreement (reliability) on key decisions, data extraction, and coding; one common agreement statistic is sketched below
• Conduct formal study quality assessment (risk of bias)
• Use meta-analysis (when possible) to synthesize results across studies
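For the reliability checks above, reviewers often report Cohen’s kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch with hypothetical include/exclude decisions from two screeners:

```python
from collections import Counter

# Hypothetical screening decisions from two independent raters.
rater_a = ["include", "include", "exclude", "include", "exclude",
           "exclude", "include", "exclude", "include", "include"]
rater_b = ["include", "exclude", "exclude", "include", "exclude",
           "exclude", "include", "exclude", "exclude", "include"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: probability both raters pick the same category
# if each decided independently at their own marginal rates.
ca, cb = Counter(rater_a), Counter(rater_b)
expected = sum(ca[k] * cb[k] for k in ca) / n ** 2

# Kappa is 1 for perfect agreement, 0 for chance-level agreement.
kappa = (observed - expected) / (1 - expected)
print(f"Observed agreement: {observed:.2f}")  # 0.80
print(f"Cohen's kappa: {kappa:.2f}")          # 0.62
```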
Meta-analysis (MA)
A set of statistical procedures used to assess:
• Averages across studies
• Variations across studies
• Potential sources of variation (moderators)
• Risk of bias (e.g., tests for publication and small-sample bias)
A minimal pooling sketch follows below.
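As an illustration of the first two items, here is a minimal inverse-variance (fixed-effect) pooling sketch with a basic heterogeneity check (Cochran’s Q and I-squared). The effect sizes and standard errors are hypothetical; a real review would also consider random-effects models and moderator analyses.

```python
import math

# Hypothetical study-level effect sizes and standard errors.
effects = [0.45, 0.20, 0.33, 0.10, 0.28]
ses     = [0.20, 0.15, 0.25, 0.12, 0.18]

# Inverse-variance weights: precise studies count more.
weights = [1.0 / se ** 2 for se in ses]
pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

# Cochran's Q: weighted squared deviations from the pooled estimate.
q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
df = len(effects) - 1

# I^2: share of observed variation beyond what chance would produce.
i2 = max(0.0, (q - df) / q) * 100

print(f"Pooled effect: {pooled:.2f} (SE {pooled_se:.2f})")
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%")
```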
Systematic reviews don’t always include meta-analysis
• Might include a narrative synthesis (or no synthesis)
• Can include multiple meta-analyses
Meta-analyses are not always based on systematic reviews
• Many use convenience samples of published studies
• These are vulnerable to publication and dissemination biases
Some “systematic reviews” aren’t
• Evidence-based standards for SRs and MA are based on methodological research (Cochrane Library)
• Standards for the conduct of SRs were developed by the Cochrane and Campbell Collaborations (Higgins & Green, 2009)
• Standards for reporting SRs and MA: PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses; Moher et al., 2009)
• These standards are not followed by US Evidence-based Practice Centers, most peer-reviewed journals, etc.
Quality of raw materials matters
• High-quality materials (studies) are needed to produce a strong, reliable product
• The “best” materials depend on what we are building (aims)
What are studies made of? Multiple concepts and measures
Building evidence
• A simple example: one study, 30 outcome measures
• What did the investigators make of the results?
• What did reviewers make of the results?
• (Littell, 2008)
An example: Parent training vs. Multisystemic Therapy (Brunk et al., 1987)
• 43 families of abused/neglected children were randomly assigned to parent training (PT) groups or Multisystemic Therapy (MST)
• 33 of 43 families completed treatment and provided data on outcomes immediately after treatment
• 30 outcomes (scales and subscales)
Results obtained: Parent training vs. Multisystemic Therapy (Brunk et al., 1987) [slide shows the outcome data]
What did the investigators make of these results?
• Data were provided on
  • all 7 statistically significant results
  • 12 of 22 non-significant results
• Outcome reporting bias
What did the investigators make of these results? (from the abstract)
• Both groups showed decreased psychiatric symptoms, reduced stress, and reduced severity of identified problems
• MST was more effective than PT at restructuring parent-child relations
• PT was more effective than MST at reducing identified social problems
What did the investigators build? A balanced report (with some missing data)
What did the published reviews build?
“Parents in both groups reported decreases in psychiatric [symptoms] and reduced overall stress....both groups demonstrated decreases in the severity of the identified problems....[MST] improved [parent-child] interactions, implying a decreased risk for maltreatment of children in the MST condition” (p. 293).
Summary: Published reviews describing Brunk et al.
• Most reviews used a single phrase to characterize the results of this study, highlighting the advantages of one approach (MST)
• This ignores valuable information on the relative advantages, disadvantages, and equivalent results of the different approaches
Reviews include multiple studies
How do reviewers add them up (synthesize results)?
What methods do reviewers use?
• Analysis of reviews of research on the effects of MST published after 1996
  • 86+ reviews: more reviews than studies!
• Assessed 66 reviews
  • Many were “lite” reviews (relying on other reviews)
  • 37 reviews cited one or more primary studies (Littell, 2008)
What methods do reviewers use?
• Of the 37 reviews that cited one or more primary studies, most were traditional, narrative summaries of convenience samples of published reports (Littell, 2008)
• Most concluded that MST “works” (is consistently more effective than alternatives)
• Some concluded that MST is effective across problems, populations, and settings
  • citing Brunk et al. (1987) [only] as the evidence for effects in cases of child abuse and neglect (Burns et al., 2000; Kazdin, 2003; Kazdin & Weisz, 1998; Lansverk, 2007)
Better ways to build evidence
• Using the science of research synthesis
• An example…
A Cochrane/Campbell review
• Multisystemic Therapy for social, emotional, and behavioral problems in youth aged 10-17 (Littell, Popa, & Forsythe, 2004, 2005)
• Protocol published in 2004; review published in 2005 in the Cochrane Library and the Campbell Library
• Series of articles in Children and Youth Services Review (including a debate with MST developers)
• An update is underway now
Campbell/Cochrane review methods
• Searched for relevant studies regardless of publication status
  • Information retrieval specialists
  • Multiple strategies for locating grey literature
• Searched for missing data from published studies
  • Contacts with investigators
• Formal data extraction and coding
  • Reliability checks
• Formal study quality assessment
  • Identified methodological problems that had not been mentioned in the literature
  • Ranked studies by methodological quality
• Separate syntheses (meta-analyses) for conceptually distinct outcomes
Summary: SR of MST
• Effects are not consistent across studies
• Few studies; most were conducted by program developers in the USA
• All studies have mixed results across outcomes, except those with null results on all outcomes
• This is contrary to the conclusions of most published reviews, which suggest that the effectiveness of MST is well established and consistent across studies
Why traditional reviews and well-meaning experts can be misleading
• Scholars are human
  • We rely on “natural” methods to filter and synthesize data
• The human brain is
  • good at detecting patterns, maintaining homeostasis, and defending territory
  • bad at complex math and at revising beliefs (Runciman, 2007)
• Research synthesis is too complex for informal methods and “cognitive algebra”