What makes a good quality trial? Professor David Torgerson York Trials Unit
Background • Whilst the RCT is the most rigorous research design, some trials are better than others. • It is important that trials use the best methods and report these.
Reporting Guidelines • Because of a history of poor trial reporting, a group of trial methodologists developed the CONSORT statement. Subsequently, major medical journals (e.g. BMJ, Lancet, JAMA) have adopted this as editorial policy. • This sets out the minimum items that trials should report to be published in these journals.
Internal versus External Validity • Internal validity is most important: are the trial results correct for the sample used? • External validity less important: is the trial result applicable to the general population? • A trial cannot be externally valid if it is not also internally valid.
Important quality items • Allocation method; • method of randomisation; • secure randomisation. • Intention to treat analysis. • Blinding. • Attrition. • Sample size.
Allocation Method • How was the allocation method devised? • Was secure allocation used? • Secure allocation means that the sequence is generated, and participants are allocated, separately from the person recruiting them.
Secure allocation • Why do we need secure, preferably independent, allocation? • Because some researchers try to ‘subvert’ the allocation • In a survey of 25 researchers 4 (16%) admitted to keeping ‘a log’ of previous allocations to try and predict future allocations. Brown et al. Stats in Medicine, 2005,24:3715.
Subversion - evidence • Schulz has described, anecdotally, a number of incidents of researchers subverting allocation by looking at sealed envelopes through x-ray lights. • Researchers have confessed to breaking open filing cabinets to obtain the randomisation code. • In a surgical trial with 5 centres, 3 were found to be independently subverting the allocation. Schulz JAMA 1995;274:1456.
Mean ages of groups Kennedy & Grant. 1997;Controlled Clin Trials 18,3S,77-78S
Recent Blocked Trial “This was a block randomised study (four patients to each block) with separate randomisation at each of the three centres. Blocks of four cards were produced, each containing two cards marked with "nurse" and two marked with "house officer." Each card was placed into an opaque envelope and the envelope sealed. The block was shuffled and, after shuffling, was placed in a box.” Kinley et al., BMJ 2002 325:1323.
Or did they do this? • “Randomisation was accomplished using a balanced block design (four patients to each block) with a separate randomisation process at each of the three centres. A separate series of consecutively numbered, opaque sealed envelopes was administered at each research centre” Kinley et al. 2001 Health Technology Assessment, Vol 5, no 20, p 4.
What is wrong here? Kinley et al., BMJ 325:1323.
Problem? • If block randomisation in blocks of 4 had been used, the group sizes within each centre should not differ by more than 2 patients. • Two centres had a numerical disparity of 11. Either blocks of 4 were not used or the sequence was not followed.
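The reasoning on this slide can be checked with a short sketch. It assumes fixed blocks of four with two cards per arm, as in the quoted trial report; the function names and the seed are hypothetical and chosen only for illustration.

```python
import random

def blocked_sequence(n_blocks, seed=0):
    """Build an allocation sequence from shuffled blocks of four (2 per arm)."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    sequence = []
    for _ in range(n_blocks):
        cards = ["nurse", "nurse", "house officer", "house officer"]
        rng.shuffle(cards)
        sequence.extend(cards)
    return sequence

def max_imbalance(sequence, arm="nurse"):
    """Largest absolute difference in group sizes at any point in the sequence."""
    diff = 0
    worst = 0
    for allocation in sequence:
        diff += 1 if allocation == arm else -1
        worst = max(worst, abs(diff))
    return worst

sequence = blocked_sequence(n_blocks=250)
# If the sequence is followed, the imbalance can never exceed 2.
print(max_imbalance(sequence))
```

Every completed block returns the two groups to equal size, so the difference can only reach 2 part-way through a block. A disparity of 11 within one centre is not compatible with this scheme.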
More Evidence • Hewitt and colleagues examined the association between p values and adequate concealment in 4 major medical journals. • Trials with inadequate concealment largely used opaque envelopes. • The average p value for inadequately concealed trials was 0.022 compared with 0.052 for adequate trials (test for difference p = 0.045). Hewitt et al. BMJ 2005;330:1057-1058
Intention to Treat Analysis • Were all allocated participants analysed in their original groups? • Active treatment analysis, analysing by treatment received, can result in bias.
Non use of ITT - Example • “It was found in each sample that approximately 86% of the students with access to reading supports used them. Therefore, one-way ANOVAs were computed for each school sample, comparing this subsample with subjects who did not have access to reading supports.” (Feldman and Fish, J Educ Computing Res 1991, p 39-31).
Can it change findings? • In New York a randomised trial of vouchers for private schools was undertaken. Vouchers were offered to poor parents to enable them to send their child to a private school of their choice. Initial analysis was undertaken of the children using changes in their test scores. However, many pre-tests were missing and some post-tests. Complete case analysis indicated voucher children got better test scores than children in state schools.
BUT… • The initial analysis did not use ITT as some data were missing. A further analysis of post test scores (state exams) where there was nearly complete case ascertainment found NO difference in test scores between the groups. Krueger & Zhu 2002, NBER Working Paper 9418
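A minimal sketch of why analysing by treatment received can mislead. The data below are invented for illustration and come from none of the trials cited; they assume that the participants who do not take up the intervention are also those with lower scores. The function names are hypothetical.

```python
from statistics import mean

# Hypothetical records: (allocated arm, actually used the intervention, outcome score)
participants = [
    ("intervention", True, 60), ("intervention", True, 62),
    ("intervention", True, 58), ("intervention", True, 60),
    ("intervention", False, 40), ("intervention", False, 44),
    ("control", False, 60), ("control", False, 58),
    ("control", False, 40), ("control", False, 62),
    ("control", False, 44), ("control", False, 60),
]

def itt_difference(rows):
    """Compare groups as randomised, whatever treatment was actually received."""
    intervention = [score for arm, _, score in rows if arm == "intervention"]
    control = [score for arm, _, score in rows if arm == "control"]
    return mean(intervention) - mean(control)

def as_treated_difference(rows):
    """Compare by treatment received, which breaks the randomisation."""
    received = [score for _, used, score in rows if used]
    not_received = [score for _, used, score in rows if not used]
    return mean(received) - mean(not_received)

print(itt_difference(participants))         # 0: no difference between randomised groups
print(as_treated_difference(participants))  # 9: apparent benefit created by selection
```

In this invented data set the intervention has no effect, yet the analysis by treatment received shows a 9 point advantage because the non-users were lower scorers to begin with.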
Blinding • Who knew who got what when? • Was the participant blind? • Was the practitioner blind? • Most IMPORTANT: was outcome assessment blind? • This is particularly important for subjective outcomes or outcomes in a grey area (e.g., when marking an essay, knowledge of group allocation may lead to better or lower scores)
Attrition • What was the final number of participants compared with the number randomised? • What happened to those lost along the way? • Was there equal attrition?
Attrition • Rule of thumb: attrition below 5% is not really a problem. • Above 5%, attrition needs to be equal between groups; otherwise there is potential for bias. • Is information on the characteristics of lost participants presented and does this suggest that they are similar between groups?
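The rule of thumb can be written down as a small check. This is a sketch only: the 5% threshold comes from the slide, while the function names and the example numbers are hypothetical.

```python
def attrition_rate(randomised, analysed):
    """Proportion of randomised participants lost before analysis."""
    return (randomised - analysed) / randomised

def attrition_summary(arms, threshold=0.05):
    """arms maps an arm name to (number randomised, number analysed)."""
    by_arm = {name: attrition_rate(r, a) for name, (r, a) in arms.items()}
    total_randomised = sum(r for r, _ in arms.values())
    total_analysed = sum(a for _, a in arms.values())
    overall = attrition_rate(total_randomised, total_analysed)
    return {
        "overall": overall,
        "by_arm": by_arm,
        # Above the threshold, unequal loss between arms is a warning sign.
        "above_threshold": overall > threshold,
        "gap_between_arms": max(by_arm.values()) - min(by_arm.values()),
    }

# Hypothetical trial: 100 randomised per arm, 90 and 98 analysed.
summary = attrition_summary({"intervention": (100, 90), "control": (100, 98)})
print(summary["above_threshold"], round(summary["gap_between_arms"], 2))
```

Here overall attrition is 6% and the loss is concentrated in one arm, so under the rule of thumb this trial would need closer scrutiny.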
Sample size • Was the sample size adequate to detect a ‘reasonable’ or credible difference? • How was the sample size calculated?
Sample Size • Small trials will miss important differences. • Bigger is better in trials. • Why was the number chosen? For example “given an incidence of 10% we wanted to have 80% power to show a halving to 5%” or “we enrolled 100 participants”. • Custom and practice in education trials tends towards a sample size of around 30. • Trials should be large enough to detect an Effect Size of at least 0.5 (i.e., 128 participants or more)
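The two calculations mentioned on this slide can be sketched with the usual normal-approximation formulas. The sketch assumes a two-sided 5% significance level, 80% power and equal group sizes; the slide does not state which formula was used, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def z(p):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(p)

def n_per_group_effect_size(effect_size, alpha=0.05, power=0.80):
    """Participants per group to detect a standardised mean difference."""
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / effect_size ** 2)

def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Participants per group to detect a difference between two proportions."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z(1 - alpha / 2) + z(power)) ** 2 * variance / (p1 - p2) ** 2)

print(n_per_group_effect_size(0.5))         # 63 per group, 126 in total
print(n_per_group_proportions(0.10, 0.05))  # 432 per group
```

The normal approximation gives 126 in total for an effect size of 0.5; the figure of 128 on the slide is consistent with a calculation based on the t distribution, which adds about one participant per group.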
A Quality Comparison of RCTs in Health & Education Carole Torgerson1, David Torgerson2, Yvonne Birks2, Jill Porthouse2 Departments of Educational Studies1 and Health Sciences2, University of York Torgerson et al. British Educational Research Journal, 2005, 761.
Are Trials of Good Quality? • We sought to ascertain whether there was a differential quality between health care and educational trials. • Are trials improving in quality? • We looked at a sample of trials from different journals from 1990 to 2001 and looked at before and after CONSORT adoption.
Change in concealed allocation (figure): P = 0.04; P = 0.70. NB: No education trial used concealed allocation.
Blinded Follow-up (figure): P = 0.03; P = 0.54; P = 0.13.
Underpowered (figure): P = 0.22; P = 0.01; P = 0.76.
Mean Change in Items (figure): P = 0.03; P = 0.001; P = 0.07.
Has CONSORT had an Effect? • As trialists we KNOW that pre-test post-test or before and after data are the weakest form of quantitative evidence. • Evidence from this BEFORE and AFTER study does NOT support the view that CONSORT has had an effect on the quality of reporting. Need to look at time-series data. • Before CONSORT there was a strong trend towards improving quality of reporting; this trend has continued since CONSORT.
Quality Improvement • In a multiple regression analysis, calendar year was a stronger predictor of the number of items scored than the pre- versus post-CONSORT period. • Journal quality was highly predictive, with ‘good’ quality general journals reporting significantly more items than specialist health journals.
CONSORT Effect • Although our study seemed not to show an effect of CONSORT, others have. Moher et al. compared the BMJ, Lancet and JAMA (CONSORT adopters) with the N Engl J Med (initial non-adopter) and found better quality reporting in the adopting journals. Moher et al. JAMA 2001, 285:1992.
Quality and citations • Are better quality trials cited more often than poor quality trials? • Unfortunately not: a recent citation review suggests that it is journal quality rather than trial quality which dominates citation rates. Nieminen et al. BMC 2006, 6:42
Conclusion • Evidence based policy demands good quality trials that are reported well. • Many health care trials are of poor quality; educational trials are worse. • Increasing the numbers of RCTs will not improve policy making UNLESS these trials are of good quality.