Before and After Studies: A Reminder

Introduction to the Design and Analysis of Trials can be found at: http://www-users.york.ac.uk/~djt6/

Background • Many researchers (?) use before and after studies – they are, of course, nearly completely useless. • Why? This is because of: • Regression to the mean • Temporal changes

Which Researchers (?) use before and after? • Clinicians, teachers assessing individuals. • Action researchers. • Audit.

Temporal Change • Things change, people get better, and policies change, all of which may make a difference. • A before and after study CANNOT possibly cope with these temporal events.

Regression to the Mean • Is a group phenomenon that applies when we measure a group of people and then re-measure them. • Those with values below or above the mean will tend to regress back towards the mean on re-measurement.
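
A minimal simulation sketch of the idea above. Everything in it is an assumption for illustration (the population mean of 50, the spread of true scores, the size of the measurement noise); no one is treated between the two measurements, so any movement is regression to the mean.

```python
import random
import statistics

rng = random.Random(0)
n = 10000

# Assumed model: each person has a stable true score, and every
# measurement adds independent random noise to it.
true_scores = [rng.gauss(50.0, 10.0) for _ in range(n)]
first = [t + rng.gauss(0.0, 10.0) for t in true_scores]
second = [t + rng.gauss(0.0, 10.0) for t in true_scores]

# Sort people by their FIRST measurement and take the two extreme tenths.
pairs = sorted(zip(first, second))
tenth = n // 10
lowest, highest = pairs[:tenth], pairs[-tenth:]

def group_means(group):
    """Return (mean first measurement, mean second measurement) for a group."""
    return (statistics.mean([a for a, _ in group]),
            statistics.mean([b for _, b in group]))

low_first, low_second = group_means(lowest)
high_first, high_second = group_means(highest)

print(f"lowest tenth:  first {low_first:.1f}, re-measured {low_second:.1f}")
print(f"highest tenth: first {high_first:.1f}, re-measured {high_second:.1f}")
```

Both extreme groups move back towards 50 on re-measurement even though nothing was done to them.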

Before and after treatment for neck pain Improvement highly significant p < 0.0001

Plot of difference scores • A symptom of regression to the mean appears when you plot change scores (baseline – follow up) against baseline scores: a correlation indicates RTM. • Thus, those with the lowest baseline improve the most and those with the highest improve the least.
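
The check described above can be sketched on simulated data with no treatment effect at all. The variances are assumptions chosen for illustration, and `pearson` is a small helper written here, not a library call. Under these assumed variances the expected correlation is about 0.5, produced purely by measurement noise.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
n = 5000

# Assumed model: stable true score plus independent noise at each visit.
true_scores = [rng.gauss(50.0, 10.0) for _ in range(n)]
baseline = [t + rng.gauss(0.0, 10.0) for t in true_scores]
follow_up = [t + rng.gauss(0.0, 10.0) for t in true_scores]

# Change score defined as on the slide: baseline minus follow up.
change = [b - f for b, f in zip(baseline, follow_up)]

r = pearson(change, baseline)
print(f"correlation of change score with baseline: {r:.2f}")
```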

Scatterplot showing RTM Correlation of Change Score with baseline values = 0.33 p < 0.0001

Some benefit of vaccination is due to regression to mean

Meningitis • After vaccination, new cases of meningitis fell from about 240 to 35, an 85% decrease. HOWEVER, of the 205 cases that were ‘prevented’, the majority (120) were due to ‘regression to the mean’ effects; ONLY 41% were probably due to the efficacy of the vaccine.
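
The arithmetic behind these figures, as a short sketch. The counts are the approximate ones quoted on the slide; the variable names are just for illustration.

```python
cases_before = 240   # approximate new cases before vaccination (from the slide)
cases_after = 35     # new cases after vaccination (from the slide)
rtm_cases = 120      # 'prevented' cases attributed to regression to the mean

prevented = cases_before - cases_after        # cases apparently prevented
decrease = prevented / cases_before           # proportional fall in cases
vaccine_cases = prevented - rtm_cases         # remainder credited to the vaccine
vaccine_share = vaccine_cases / prevented     # share of the fall due to the vaccine

print(f"apparently prevented: {prevented}")
print(f"overall decrease: {decrease:.0%}")
print(f"share probably due to the vaccine: {vaccine_share:.0%}")
```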

Education intervention • Wheldall selected 40 pupils whose reading was at least 2 years behind their peers. • Half were exposed to an intervention. Wheldall Educational Review 2000;52:29.

Before and after reading programme Difference highly statistically significant p < 0.001

Before and after reading programme Differences between groups NOT statistically significant

RTM misunderstanding • “the mean gain scores translated to impressive effect sizes of 0.6.” • “It could be argued that it is asking too much of any program to demonstrate enhanced efficacy on top of such high existing efficacy” • “…control group gains were largely attributable to pre-existing …literacy programme..” • Perhaps, BUT much of the gain will be due to RTM.

Evaluation of School intervention • A secondary school routinely offered children who scored badly on a reading test an ICT intervention. • This was shown to improve children’s literacy.

ICT and Reading

Did it work? • Impossible to tell. Regression to the mean and temporal effects do not allow us to find this out. • Fortunately, we are doing an RCT of ICT and reading.

RTM and Policy Decisions • Government policy targets the 10 worst areas for street crime. One year later there is a 17% fall in crime, some or all of it due to RTM. • A 40% increase in gun crime results in a month’s amnesty for firearms, which will probably ‘work’ through RTM.

Annual Increase in offences with firearms Amnesty

Exam marking • In MSc double-blind marking, two markers disagree at the extremes of the distribution. • We might fool ourselves that one marker is ‘hard’ and the other a ‘softie’ but really it is RTM.
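
A hedged sketch of the marking example: two simulated markers who are, by construction, equally strict. The mark distribution and the size of the marking noise are assumptions for illustration only.

```python
import random
import statistics

rng = random.Random(4)
n = 5000

# Assumed model: each script has a true quality; both markers add
# independent noise of the same size, so neither is harder than the other.
quality = [rng.gauss(60.0, 8.0) for _ in range(n)]
marker_a = [q + rng.gauss(0.0, 5.0) for q in quality]
marker_b = [q + rng.gauss(0.0, 5.0) for q in quality]

# Rank scripts by marker A's mark and look at the two extreme tenths.
pairs = sorted(zip(marker_a, marker_b))
tenth = n // 10
bottom, top = pairs[:tenth], pairs[-tenth:]

def mean_gap(group):
    """Mean of (marker A's mark minus marker B's mark) over a group of scripts."""
    return statistics.mean([a - b for a, b in group])

gap_top = mean_gap(top)
gap_bottom = mean_gap(bottom)

print(f"scripts A marked highest: A minus B = {gap_top:+.1f}")
print(f"scripts A marked lowest:  A minus B = {gap_bottom:+.1f}")
```

Judged on A's top scripts, B looks ‘hard’; judged on A's bottom scripts, B looks like a ‘softie’. Both impressions come from the same pair of identical markers.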

RTM and exam scripts

Policy Changes • Regression to the mean is an excellent method of ‘proving’ something works; • Failing schools or hospitals can have an ‘expensive’ management change and there is a good chance that regression to the mean will do the job.

Proving ‘Effective’ Treatments • RTM is an excellent phenomenon to ‘prove’ to doubting clinicians the value of a new treatment. • Choose an outcome measure with a high variance (e.g., single BP measure, FEV). Identify patients with extreme values (preferably only measured once), treat and re-measure. The group mean ought to decline (not all patients will improve but most will).

Dealing with RTM • Taking sequential measurements and averaging them (e.g., 3 BP measurements averaged) will reduce the problem. • The only way to reliably deal with the problem is through randomised trials. • Which is why before and after data are generally regarded as almost USELESS.
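
Both remedies can be sketched in one simulation. All numbers are assumptions for illustration (the blood pressure distribution, the noise, the hypothetical cut-off of 160), and no one receives any real treatment, so every fall below is regression to the mean.

```python
import random
import statistics

rng = random.Random(2)
N = 20000                          # people screened (assumed)
TRUE_MEAN, TRUE_SD = 140.0, 10.0   # assumed spread of true systolic BP
NOISE_SD = 10.0                    # assumed within-person measurement noise
THRESHOLD = 160.0                  # hypothetical cut-off for 'high BP'

def reading(true_bp):
    """One noisy measurement of a person's true blood pressure."""
    return true_bp + rng.gauss(0.0, NOISE_SD)

def screen_and_follow_up(n_screen):
    """Select people whose mean of n_screen readings is high, then re-measure once.

    Returns a list of (screening value, follow-up reading) pairs.
    """
    selected = []
    for _ in range(N):
        true_bp = rng.gauss(TRUE_MEAN, TRUE_SD)
        screen = statistics.mean([reading(true_bp) for _ in range(n_screen)])
        if screen >= THRESHOLD:
            selected.append((screen, reading(true_bp)))
    return selected

def mean_fall(pairs):
    """Mean of (screening value minus follow-up reading)."""
    return statistics.mean([s - f for s, f in pairs])

single = screen_and_follow_up(1)     # selected on one reading
averaged = screen_and_follow_up(3)   # selected on the mean of three readings
fall_single = mean_fall(single)
fall_averaged = mean_fall(averaged)

# Randomise the single-reading group into two arms; the 'treatment' does nothing.
rng.shuffle(single)
half = len(single) // 2
treated, control = single[:half], single[half:]
arm_difference = mean_fall(treated) - mean_fall(control)

print(f"apparent fall, selected on 1 reading:  {fall_single:.1f}")
print(f"apparent fall, selected on 3 readings: {fall_averaged:.1f}")
print(f"treated minus control fall:            {arm_difference:.1f}")
```

Averaging shrinks the spurious fall but does not remove it. Only the comparison between randomised arms correctly shows an effect close to zero, because both arms regress by the same amount.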

Ceiling and Floor Effects • As well as RTM, before and after studies are blighted by ceiling and floor problems. • Often measurement instruments have a floor (e.g., 0) or a ceiling (e.g., 100%), which means if someone’s value is close to either of these extremes they cannot change much except towards the mean.
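
A small sketch of the ceiling problem, assuming an instrument scored from 0 to 100 with some measurement noise. The uniform spread of true scores and the noise size are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(3)

def observed(true_score):
    """A noisy score clipped to the instrument's range of 0 to 100."""
    return min(100.0, max(0.0, true_score + rng.gauss(0.0, 5.0)))

# Measure everyone twice with no intervention and keep those who
# scored 95 or more the first time.
changes_near_ceiling = []
for _ in range(20000):
    true_score = rng.uniform(0.0, 100.0)
    first, second = observed(true_score), observed(true_score)
    if first >= 95.0:
        changes_near_ceiling.append(second - first)

mean_change = statistics.mean(changes_near_ceiling)
largest_gain = max(changes_near_ceiling)

print(f"mean change for those near the ceiling: {mean_change:+.1f}")
print(f"largest possible-looking gain observed: {largest_gain:+.1f}")
```

Anyone starting at 95 or above can gain at most 5 points but can lose far more, so the group's average change is downwards whatever is done to them.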

League Tables • Classic problem of RTM with ceiling and floor effects. For example, schools that get close to 100% 5 GCSEs cannot do any better, whereas schools with very low levels can only go upwards. This phenomenon is skillfully exploited by politicians to show an effect. Similarly with hospital league tables. • Same problem applies to quality of life measures. EuroQol for example, has ceiling problems.

Summary • Before and after studies are the weakest evaluative method of proving something does or does not work. • To control for temporal changes and regression to the mean, controlled trials are required.

Conclusion • You can prove virtually any ‘crackpot’ theory using RTM. • NEED a control group.
