The suitability of using RCTs in educational research

ESRC Conference: Methodological challenges for the 21st Century, Nov 22nd, 2007. Dr Carole Torgerson, Senior Research Fellow, Institute for Effective Education, University of York.

“A careful look at randomized experiments will make clear that they are not the gold standard. But then, nothing is. And the alternatives are usually worse.” Berk RA. (2005) Journal of Experimental Criminology 1, 417-433.

History of RCTs • The first known RCT in humans was a 1932 study of counselling to improve academic achievement in undergraduates; several other RCTs in educational settings followed (e.g., a 1933 trial of the efficacy of examinations for undergraduate students). • The first health trials were the patulin study in 1944, followed by the 1948 streptomycin trial. Forsetlund et al, Econ. Innov. New Techn. 2007: 371.

Background • Intense opposition to RCTs from some educational researchers and policy makers • No tradition of UK-funding of large-scale randomised field trials in education evaluating policy initiatives • Large rigorous trials are possible in education research • For important policy issues RCTs could and should be used

Large randomised field trials • ‘Project Star’: Phase 1 of the Tennessee Class-Size Experiment, US • Computers and literacy learning, US • Vouchers for private schools, US • Minimally qualified, minimally trained classroom assistants, India • Comparing briefly trained with fully qualified teachers, US

Computers and literacy • Governments across the world have made massive investments in educational computer technology (£1.7 billion in the UK since 1997). Few trials have been undertaken. • Rouse et al, US, evaluated Fast ForWord in a trial of 512 children – no noticeable effects were observed. Rouse et al, 2004, NBER working paper, 10315

UK Computer trial • We recently completed a trial of 156 children looking at the effect of computers on literacy learning. • In a typical English secondary school, all Year 7 pupils (aged 11) were randomly allocated either to 10 hours of computer teaching over two weeks or to a control group that received ‘normal’ teaching. • Randomisation was performed independently by the York Trials Unit. • The sample size gave > 80% power to show an effect size of 0.5 (moderate) for the computer programme. • We measured two outcomes: spelling and reading ability (spelling was our a priori primary outcome measure). • Results were similar to those of the US study: no evidence of any benefit of using computers for literacy learning. Brooks et al, Ed Studies, 32(2), 2006.
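The power statement on this slide can be checked roughly with a normal approximation. The sketch below is illustrative only: it assumes the 156 children were split equally (78 per group) and a two-sided test at the 5% level, neither of which is stated on the slide, and the function name `power_two_sample` is hypothetical.

```python
from statistics import NormalDist


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided comparison of two group means.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected z statistic when the true standardised difference is effect_size
    expected_z = effect_size * (n_per_group / 2) ** 0.5
    return 1 - z.cdf(z_crit - expected_z)


# Assumed: equal allocation of the 156 children, alpha = 0.05 (two-sided)
power = power_two_sample(effect_size=0.5, n_per_group=78)
print(f"Approximate power: {power:.2f}")
```

Under these assumptions the approximation comes out a little under 0.9, which is consistent with the stated > 80% power.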

Results • [Chart not reproduced; the P values shown were P = 0.74 and P = 0.001.]

Conclusion • This trial, which is the largest ever undertaken in the UK, shows no evidence of benefit of computer technology on spelling progression and supports the findings of the large field trial in the US. • The small difference in spelling scores is not statistically significant. • There is a statistically significant difference in reading scores; however, this favours the control group. • The use of software packages for literacy learning in schools could and should be tested in large, rigorously designed RCTs. Brooks et al. (2006). Educational Studies 32:133-43.

School vouchers • New York ‘voucher lottery’ • Initial results suggested a positive benefit of educational vouchers; however, an intention-to-treat analysis found little or no benefit of attending a private school (chosen by the parents). Krueger and Zhu 2002; NBER working paper, 9418
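The difference between an intention-to-treat comparison and a comparison by what pupils actually received can be sketched as follows. The records, scores and function names are entirely hypothetical; this is not the voucher trial's data or its actual analysis, only an illustration of why the two comparisons can disagree.

```python
from statistics import mean

# Hypothetical records: (offered_voucher, attended_private_school, test_score)
pupils = [
    (True, True, 62), (True, True, 58), (True, False, 41), (True, False, 45),
    (False, False, 50), (False, False, 52), (False, True, 60), (False, False, 48),
]


def itt_difference(records):
    """Compare pupils by the group they were randomised to."""
    offered = [score for assigned, _, score in records if assigned]
    not_offered = [score for assigned, _, score in records if not assigned]
    return mean(offered) - mean(not_offered)


def as_treated_difference(records):
    """Compare pupils by what they actually received (not protected by randomisation)."""
    attended = [score for _, treated, score in records if treated]
    did_not = [score for _, treated, score in records if not treated]
    return mean(attended) - mean(did_not)


print("Intention to treat:", itt_difference(pupils))
print("As treated:", round(as_treated_difference(pupils), 1))
```

In this made-up example the as-treated comparison looks strongly positive because the pupils who take up a private school place differ from those who do not, while the intention-to-treat comparison, which keeps the randomised groups intact, shows essentially no difference.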

Classroom assistants • India - due to resource scarcity, classroom assistants with 2 weeks of training were introduced using random allocation. • An RCT of > 15,000 students was used to evaluate the intervention. The programme was effective, increasing maths and English scores (effect sizes of 0.14 and 0.28 in the first and second years respectively). • Note - small to modest effect size, but cost effective due to the low cost of the intervention. Banerjee et al, 2005; NBER working paper 11904

Teach For America (TFA) • In 1989 the TFA programme was introduced: • Selected highly qualified people and gave them 5 weeks of training over the summer; • TFA teachers placed in schools in poor neighbourhoods; • RCT comparing TFA teachers vs control teachers (traditionally and alternatively certified, and uncertified), published in 2004; • 1800 pupils were randomised to 100 classes: 44 with a TFA teacher and 56 with a conventional teacher.

TFA Evaluation • On admission to primary school, children were randomised to be taught by TFA teachers or control teachers. • For maths scores: children’s gains were significantly better when taught by TFA teachers than when taught by control teachers. • For literacy scores: no differences. Decker et al. Mathematica Policy Research, 2004

Some characteristics of a high quality pragmatic/field trial • Large number of schools, or classes, at least 40-50, with 30 or so children per school. • Long intervention with post- and follow-up tests. • Randomisation independent from the researchers who develop the intervention and collect the data. • Data collection and testing undertaken by researchers or teachers blind to group allocation. • Such trials are often the ‘norm’ for health education – why not for non-health education?
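One reason the number of schools matters more than the number of children is the clustering of pupils within schools. A minimal sketch of the standard design-effect formula, DEFF = 1 + (m - 1) × ICC, is given below. It assumes whole schools are randomised; the intra-cluster correlation (ICC) of 0.2 is an illustrative assumption and does not come from the slides.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from randomising clusters of equal size."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent pupils the clustered sample is 'worth'."""
    return n_clusters * cluster_size / design_effect(cluster_size, icc)


# 40 schools of 30 children, as on the slide; ICC = 0.2 is assumed
deff = design_effect(cluster_size=30, icc=0.2)
ess = effective_sample_size(n_clusters=40, cluster_size=30, icc=0.2)
print(f"Design effect: {deff:.1f}, effective sample size: {ess:.0f} of 1200")
```

With an ICC of this size, adding children per school increases the effective sample size very little, whereas adding schools increases it proportionally.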

Conclusion? • Large RCTs of important educational policy are possible, and regularly undertaken in the US and other countries. • Are there important policy areas that need evaluating in the UK? E.g. Sure Start • What about the use of systematic synthetic phonics teaching? It applies to all children aged 5 from September 2007 (> 500,000 children). What is the evidence for this policy?

Systematic review of phonics instruction • How effective are different approaches to phonics teaching in comparison to each other (including the specific area of analytic versus synthetic phonics)?

Findings • 12 individually randomised trials were identified. • All were very small and only one was from the UK. • Pooling all the trials in a meta-analysis found a small, statistically significant effect on reading accuracy (moderate weight of evidence). • 3 trials directly compared synthetic with analytic phonics instruction. No difference between the two approaches was found, although this was based on weak evidence.

Meta-analysis: Forest plot

Sensitivity analysis • Phonics meta-analysis had significant heterogeneity. • Removal of a single, small, outlier reduced this heterogeneity. However, removal of this study resulted in the advantage of phonics instruction being no longer statistically significant.
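The kind of sensitivity analysis described here can be sketched as a leave-one-out re-analysis. The example below uses simple fixed-effect inverse-variance pooling with Cochran's Q as the heterogeneity statistic; this is a simplification chosen for brevity and not necessarily the model used in the review. The effect sizes and standard errors are invented for illustration and are not the phonics trial data.

```python
from math import sqrt


def pool_fixed_effect(studies):
    """Inverse-variance pooling of (effect, standard_error) pairs.

    Returns (pooled_effect, pooled_standard_error, cochran_q).
    """
    weights = [1 / se ** 2 for _, se in studies]
    total = sum(weights)
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total
    q = sum(w * (effect - pooled) ** 2 for w, (effect, _) in zip(weights, studies))
    return pooled, sqrt(1 / total), q


def leave_one_out(studies):
    """Re-pool the studies with each one removed in turn."""
    return [pool_fixed_effect(studies[:i] + studies[i + 1:])
            for i in range(len(studies))]


# Hypothetical trials: three similar results and one small outlier
trials = [(0.20, 0.15), (0.30, 0.20), (0.25, 0.18), (2.50, 0.60)]

pooled, se, q = pool_fixed_effect(trials)
print(f"All trials: effect {pooled:.2f}, z = {pooled / se:.2f}, Q = {q:.1f}")
for i, (p, s, q_i) in enumerate(leave_one_out(trials)):
    print(f"Without trial {i + 1}: effect {p:.2f}, z = {p / s:.2f}, Q = {q_i:.1f}")
```

Comparing Q and the z statistic with and without each trial shows whether a single study is driving both the heterogeneity and the statistical significance of the pooled result.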

Different types of phonics • Three small trials compared synthetic vs analytic phonics. • A meta-analysis of these trials found no difference between the two approaches.

Synthesis: synthetic vs analytic

• “I am clear that synthetic phonics should be the first strategy in teaching all children to read.” Ruth Kelly – Times Mar 21st 2006. • “The case for synthetic phonics is overwhelming.” Jim Rose – Times Mar 21st 2006.

What could have happened? • A large field trial could have been undertaken. This could have taken the form of a ‘waiting list’ control-group design, with half of schools implementing a synthetic phonics programme in 2007-2008 and the other half starting in 2008-2009. Or alternatively, a ‘stepped wedge design’ could have been used.
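A stepped wedge allocation of the kind mentioned above can be sketched as follows: schools are randomly ordered and start the programme at different steps, so every school begins as a control and ends up receiving the intervention. The school names, the number of steps and the function name are hypothetical.

```python
import random


def stepped_wedge_schedule(schools, n_steps, seed=None):
    """Randomly assign schools to start steps 1..n_steps.

    Returns {school: [0/1 per period]} for periods 0..n_steps, where 1 means
    the school is delivering the intervention in that period.
    """
    rng = random.Random(seed)
    shuffled = list(schools)
    rng.shuffle(shuffled)
    # Deal the shuffled schools round-robin into start steps
    start_step = {school: (i % n_steps) + 1 for i, school in enumerate(shuffled)}
    return {
        school: [int(period >= step) for period in range(n_steps + 1)]
        for school, step in start_step.items()
    }


schedule = stepped_wedge_schedule(["A", "B", "C", "D", "E", "F"], n_steps=3, seed=1)
for school, periods in sorted(schedule.items()):
    print(school, periods)
```

In practice the allocation would be generated by someone independent of the researchers delivering the programme, in line with the earlier point about independent randomisation.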

Discussion • Large rigorous trials are possible in educational research. • Intense opposition to RCTs from many educational researchers. • Lack of UK funding of large-scale randomised trials in education. • For important policy issues RCTs could and should be used.

Dr Carole Torgerson Senior Research Fellow IEE, University of York, Heslington, YORK YO10 5DD Telephone: 01904 328152 Email: cjt3@york.ac.uk
