Using meta-analyses in your literature review BERA Doctoral Workshop 3rd September 2008 Professor Steven Higgins Durham University s.e.higgins@durham.ac.uk
Acknowledgements • This presentation is an outcome of the work of the ESRC-funded Researcher Development Initiative: “Training in the Quantitative synthesis of Intervention Research Findings in Education and Social Sciences” which ran from 2008-2011. • The training was designed by Steve Higgins and Rob Coe (Durham University), Carole Torgerson (Birmingham University) and Mark Newman and James Thomas (Institute of Education, London University). • The team acknowledges the support of Mark Lipsey, David Wilson and Herb Marsh in preparation of some of the materials, particularly Lipsey and Wilson’s (2001) “Practical Meta-analysis” and David Wilson’s slides at: http://mason.gmu.edu/~dwilsonb/ma.html (accessed 9/3/11). • The materials are offered to the wider academic and educational community under a Creative Commons licence: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License • You should only use the materials for educational, not-for-profit use and you should acknowledge the source in any use.
Aims • To support understanding of meta-analysis of intervention research findings in education; • To extend understanding of reviewing quantitative research literature; • To describe the techniques and principles of meta-analysis involved to support understanding of its benefits and limitations; • To provide references and examples to support further work.
ESRC Researcher Development Initiative • Quantitative synthesis of intervention research findings in education • Collaboration between • Durham University • York University • Institute of Education, London
Why review? • Ask the person next to you what the purpose of the literature review is in their thesis • See how many different purposes you can think of • Join another pair and identify which are the 3 you think are the most important
Why review? • Summarise existing knowledge • What we know, and how we know it • For what purpose? • Expectation • Scenery • State of the art (summary) • Positioning (conceptual) • Progressing knowledge (logic)
The PhD literature review • Narrative summary of the area • Grand tour of the concepts and terminology • Synthesis of empirical findings • Background to the study
A systematic review • is usually more comprehensive; • is normally less biased, being the work of more than one reviewer; • is transparent and replicable (Andrews, 2005)
Examples of systematic reviews • EPPI Centre • UK based - wide range of educational topics • The Campbell Collaboration • 5 education reviews • Best Evidence Encyclopedia • Johns Hopkins - aimed at practice
Systematic reviewing • Key question • Search protocol • Inclusion/exclusion criteria • Coding and Mapping • In-depth review (sub-question) • Techniques for systematic synthesis
Systematic reviews • Research and policy • Specific reviews to answer particular questions • What works? - impact and effectiveness research with a tendency to focus on quantitative and experimental designs
Literature reviewing - conceptual relations Narrative review Systematic review Meta-analysis
Meta-analysis • Synthesis of quantitative data • Cumulative • Comparative • Correlational • “Surveys” educational research (Lipsey and Wilson, 2001)
Origins • 1952: Hans J. Eysenck concluded that there were no favorable effects of psychotherapy, starting a raging debate which 25 years of evaluation research and hundreds of studies failed to resolve • 1978: To prove Eysenck wrong, Gene V. Glass statistically aggregated the findings of 375 psychotherapy outcome studies • Glass (and colleague Smith) concluded that psychotherapy did indeed work - “the typical therapy trial raised the treatment group to a level about two-thirds of a standard deviation on average above untreated controls; the average person receiving therapy finished the experiment in a position that exceeded the 75th percentile in the control group on whatever outcome measure happened to be taken” (Glass, 2000). • Glass called the method “meta-analysis” (adapted from Lipsey & Wilson, 2001)
Historical background • Underpinning ideas can be identified earlier: • K. Pearson (1904) Averaged correlations for typhoid mortality after inoculation across 5 samples • R. A. Fisher (1944) “When a number of quite independent tests of significance have been made … although few or none can be claimed individually as significant, yet the aggregate gives an impression that the probabilities are on the whole lower than would often have been obtained by chance” (p. 99). Source of the idea of cumulating probability values • W. G. Cochran (1953) Discusses a method of averaging means across independent studies. Set out much of the statistical foundation for meta-analysis (e.g., inverse variance weighting and homogeneity testing) (adapted from Lipsey & Wilson, 2001)
Significance versus effect size • Traditional test is of statistical ‘significance’ • The difference is unlikely to have occurred by chance • However it may not be: • Large • Important, or even • Educationally ‘significant’
The rationale for using effect sizes • Traditional reviews focus on statistical significance testing • Highly dependent on sample size • Null finding does not carry the same “weight” as a significant finding • Meta-analysis focuses on the direction and magnitude of the effects across studies • From “Is there a difference?” to “How big is the difference?” • Direction and magnitude represented by “effect size”
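The dependence of significance testing on sample size can be illustrated with a small sketch. It assumes two equal-sized groups and uses a normal (z) approximation instead of an exact t-test; the function name `two_sided_p` and the study sizes are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Approximate two-sided p-value for a standardised mean difference.

    Assumes two equal-sized groups and a normal (z) approximation,
    so values for small samples are only rough.
    """
    standard_error = sqrt(2 / n_per_group)  # approximate SE of a small effect size
    z = effect_size / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same effect size (0.2) in a small and a large hypothetical study.
p_small = two_sided_p(0.2, n_per_group=20)
p_large = two_sided_p(0.2, n_per_group=500)
print(f"n = 20 per group:  p = {p_small:.3f}")
print(f"n = 500 per group: p = {p_large:.4f}")
```

Under these assumptions the identical effect is "not significant" in the small study and "significant" in the large one, which is why a vote count of significant findings can mislead.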
Effect size • Comparison of impact • Same AND different measures • Significance vs effect size • Does it work? vs How well does it work?
‘Effect size’ • Standardised way of looking at gain scores • Different methods for calculation • (Experimental group mean - control group mean) / standard deviation
What is “effect size”? • Standardised way of looking at difference • Different methods for calculation • Odds Ratio • Correlational (Pearson’s r) • Standardised mean difference • Difference between control and intervention group as proportion of the dispersion of scores
Calculating effect size • Experimental group gain minus control group gain, divided by the standard deviation of the groups
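A minimal sketch of the standardised mean difference (experimental mean minus control mean, over the standard deviation) might look like the following. It assumes the pooled standard deviation of the two groups is used as the denominator, which is one common choice among several; the function name and the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def standardised_mean_difference(experimental, control):
    """(experimental mean - control mean) / pooled standard deviation."""
    n_e, n_c = len(experimental), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_variance = (
        (n_e - 1) * stdev(experimental) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_e + n_c - 2)
    return (mean(experimental) - mean(control)) / sqrt(pooled_variance)


# Invented test scores, for illustration only.
experimental_scores = [14, 16, 18, 20, 22]
control_scores = [12, 14, 16, 18, 20]
d = standardised_mean_difference(experimental_scores, control_scores)
print(f"effect size d = {d:.2f}")
```

A positive value means the experimental group scored higher than the control group, in units of standard deviation.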
Effect size and impact From: Marzano, R. J. (1998) A Theory-Based Meta-Analysis of Research on Instruction. Aurora, Colorado, Mid-continent Regional Educational Laboratory. Available at: http://www.mcrel.org:80/topics/products/83/ (accessed 2/9/08).
Interpreting effect sizes • Relative effects - average is about 0.37 - 0.4 (Sipe and Curlette, 1997; Hattie, Biggs and Purdie, 1996) • Doing something different makes a difference • Visualising the difference
How much is the impact? • 0.1 = percentile gain of about 4 points, i.e. a school ranked 50th in a league table of 100 schools would move from 50th to about 46th place • 0.5 = percentile gain of about 19 points, i.e. move from 50th to about 31st place • 1.0 = percentile gain of 34 points, i.e. move from 50th to 16th place • (Gains assume normally distributed scores)
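Percentile gains of this kind can be derived from the normal curve. The sketch below assumes scores are normally distributed and that the starting point is the 50th percentile; `percentile_gain` is a hypothetical helper name.

```python
from statistics import NormalDist


def percentile_gain(effect_size: float) -> float:
    """Percentile points gained from the 50th percentile, assuming normal scores."""
    return 100 * NormalDist().cdf(effect_size) - 50


for d in (0.1, 0.5, 1.0):
    print(f"effect size {d}: gain of about {percentile_gain(d):.0f} percentile points")
```

The conversion is only as good as the normality assumption, so the results are best read as rough guides to magnitude.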
Other interpretations • 0.2 “small” = difference in height between 15 and 16 year olds • 0.5 “medium” = difference in height between 14 and 18 year olds • 0.8 “large” = difference in height between 13 and 18 year olds (Cohen, 1969)
Meta-analysis • Key question • Search protocol • Inclusion/exclusion criteria • Coding • Statistical exploration of findings • Mean • Distribution • Sources of variance
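The "mean" step of the statistical exploration is often an inverse-variance weighted average, so that more precise studies count for more. The following is a sketch of a fixed-effect pooled estimate only; the effect sizes and variances are invented, and `fixed_effect_mean` is a hypothetical name.

```python
from math import sqrt


def fixed_effect_mean(effects, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1 / v for v in variances]  # precise studies get larger weights
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    standard_error = sqrt(1 / sum(weights))
    return pooled, standard_error


# Invented study results: effect size and sampling variance for each study.
effects = [0.2, 0.5, 0.8]
variances = [0.01, 0.04, 0.04]

pooled, se = fixed_effect_mean(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled effect = {pooled:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Here the first study has the smallest variance, so the pooled estimate sits closer to 0.2 than the simple average of the three effects would.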
Some findings from meta-analysis Pearson et al. 2005 • 20 research articles, 89 effects ‘related to digital tools and learning environments to enhance literacy acquisition’. Weighted effect size of 0.489 indicating technology can have a positive impact on reading comprehension Bernard et al. 2004 • Distance education and classroom instruction - 232 studies, 688 effects - wide range of effects (‘heterogeneity’); asynchronous DE more effective than synchronous
More findings Hattie and Timperley, 2007 • ‘The Power of Feedback’, synthesis of other meta-analyses on feedback to provide a conceptual review 196 studies, 6972 effects - average effect of feedback on learning 0.79
Rank (or guess) some effect sizes… • Formative assessment • CASE (Cognitive Acceleration Through Science Education) • Individualised instruction • ICT • Homework • Direct instruction
Rank order of effect sizes • 1.04 CASE (Cognitive Acceleration Through Science Education) (Boys science GCSE - Adey & Shayer, 1991) • 0.6 Direct instruction (Sipe & Curlette, 1997) • 0.43 Homework (Hattie, 1999) • 0.32 Formative assessment (KMOFAP) • 0.31 ICT (Hattie, 1999) • 0.1 Individualised instruction (Hattie, 1999)
‘Super-syntheses’ • Syntheses of meta-analyses • Relative effects of different interventions • Assumes variation evens out across studies with a large enough dataset (Marzano/Hattie) or attempts to control for the variation statistically (Sipe & Curlette)
Hattie, Biggs and Purdie, 1996 Synthesis of study skills interventions Meta-analysis of 51 studies of study skills interventions. Categorised the interventions using the SOLO model (Biggs & Collis, 1982), classified studies into four hierarchical levels of structural complexity and as either ‘near’ or ‘far’ transfer. The results support situated cognition, and suggest that training for other than simple mnemonic tasks should be in context, use tasks within the same domain as the target content, and promote a high degree of learner activity and metacognitive awareness. (average effect 0.4)
Sipe and Curlette, 1997 • “A metasynthesis of factors relating to educational achievement” - testing Walberg’s ‘educational productivity’ model - synthesis of 103 meta-analyses
Marzano, 1998 ‘Theory driven’ Self system - metacognition - cognition/knowledge • Self 0.74 • Metacognitive 0.72 • Cognitive 0.55
Discussion • Work with a colleague to put the statements in order of how comparable you think the research findings are • Join another pair (or pairs) and decide how comfortable you would be with comparing the findings
Issues and challenges in meta-analysis • Conceptual • Reductionist - the answer is 42 • Comparability - apples and oranges • Atheoretical - ‘flat-earth’ • Technical • Heterogeneity • Publication bias • Methodological quality
Reductionist or ‘flat earth’ critique The “flat earth” criticism is based on Lee Cronbach’s assertion that a meta-analysis looks at the “big picture” and provides only a crude average. According to Cronbach, “… some of our colleagues are beginning to sound like a Flat Earth Society. They tell us that the world is essentially simple: most social phenomena are adequately described by linear relations; one-parameter scaling can discover coherent variables independent of culture and population; and inconsistencies among studies of the same kind will vanish if we but amalgamate a sufficient number of studies…The Flat Earth folk seek to bury any complex hypothesis with an empirical bulldozer…” (Cronbach, 1982, in Glass, 2000).
Comparability • Apples and oranges • Same test • Different measures of the same construct • Different measures of different constructs • What question are you trying to answer? • How strong is the evidence for this? “Of course it mixes apples and oranges; in the study of fruit, nothing else is sensible; comparing apples and oranges is the only endeavor worthy of true scientists; comparing apples to apples is trivial” (Glass, 2000).
Empirical not theoretical? • What is your starting point? • Conceptual/ theoretical critique • Marzano • Hattie • Sipe and Curlette
Technical issues • Interventions • Publication bias • Methodological quality • Sample size • Homogeneity/ heterogeneity
Interventions • “Super-realisation bias” (Cronbach et al., 1980) • Small-scale interventions tend to get larger effects • Enthusiasm, attention to detail, quality of personal relationships
Publication bias • Statistically significant (positive) findings are more likely to be published • Smaller studies need a larger effect size to reach significance • So published small studies tend to report larger effects • ‘Funnel plot’ sometimes used to explore this: scatterplot of the effects from individual studies (horizontal axis) against study size (vertical axis)
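The mechanism behind publication bias can be sketched with a small simulation. Everything here is an assumption made for illustration: a true effect of 0.2, 20 participants per group, a normal approximation for the sampling error, and a crude rule that only significant positive results get "published".

```python
import random
from math import sqrt
from statistics import mean

TRUE_EFFECT = 0.2   # assumed true effect size
N_PER_GROUP = 20    # assumed size of each small study
N_STUDIES = 2000    # number of simulated studies

rng = random.Random(42)  # fixed seed so the run is repeatable
standard_error = sqrt(2 / N_PER_GROUP)  # approximate SE of each study's effect

# Each simulated study observes the true effect plus sampling error.
all_effects = [rng.gauss(TRUE_EFFECT, standard_error) for _ in range(N_STUDIES)]

# Crude publication filter: keep only significant positive results (z > 1.96).
published = [d for d in all_effects if d / standard_error > 1.96]

print(f"mean of all studies:       {mean(all_effects):.2f}")
print(f"mean of 'published' ones:  {mean(published):.2f}")
print(f"share published:           {len(published) / N_STUDIES:.0%}")
```

Under these assumptions the published studies overstate the true effect considerably, which is the asymmetry a funnel plot is meant to reveal.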
Methodological quality • Traditional reviews privilege methodological rigour • Low quality studies, higher effect sizes (Hattie, Biggs & Purdie, 1996) • No difference (Marzano, 1998) • High quality studies, higher effect sizes (Lipsey & Wilson, 1993) • Depends on your definition of quality
Sample size “Median effect sizes for studies with sample sizes less than 250 were two to three times as large as those of larger studies.” (Slavin & Smith, 2008)
Heterogeneity • Variation in effect sizes • Investigate to find clusters (moderator variables) • Assumption that the effect will be consistent
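Variation in effect sizes is commonly summarised with Cochran's Q statistic and the I² index (the percentage of variation beyond what sampling error alone would produce). The sketch below assumes each study supplies an effect size and its sampling variance; the data and the function name `heterogeneity` are invented for illustration.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    weights = [1 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled effect.
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of Q in excess of its expected value (df), floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Invented study results: effect size and sampling variance for each study.
study_effects = [0.2, 0.5, 0.8]
study_variances = [0.01, 0.04, 0.04]

q, df, i_squared = heterogeneity(study_effects, study_variances)
print(f"Q = {q:.2f} on {df} df, I-squared = {i_squared:.0f}%")
```

A large I² suggests the studies are not estimating one common effect, which is the cue to look for moderator variables.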
Questions and reactions • With a colleague see if you can identify a question arising from the presentation so far • What is your reaction to the technique • How useful is it • Generally • To your own work?
Strengths of Meta-Analysis • Uses explicit rules to synthesise research findings • Can find relationships across studies which may not emerge in qualitative reviews • Does not (usually) exclude studies for methodological quality to the same degree as traditional methods • Statistical data used to determine whether relationships between constructs need clarifying • Can cope with large numbers of studies which would overwhelm traditional methods of review