Andrew Thomson
Analysis of Cluster Randomized Trials when the outcome is a rate or a proportion and the number of clusters is small
Recent quotes
• “These problems suggest that additional research is required on the development of methodology for trials enrolling a small number of clusters”
• “Issues arising at the analysis stage of a trial such as the choice between population averaged and cluster specific approaches… deserve further attention”
• “While the results are encouraging, it would be of interest to see how the method performs with smaller groups”
Methods to be used
• Standard chi-squared
• Adjusted chi-squared (Donner & Donald, Rao & Scott)
• t-test, including logistic regression followed by residual analysis
• Random effects modelling
• GEEs
• Small-sample corrected GEEs (Mancl & DeRouen, Bell & McCaffrey)
• Bayesian methods
• Non-parametric methods
Outcome measures
• Size
• Coverage
• Power
• Bias of treatment effect
• Only look at power of tests of suitable size?
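The outcome measures above can be computed from the replicates of a simulation study along the following lines. This is a minimal sketch, not the analysis code behind these slides: the `SimResult` record and the `summarise` function are hypothetical names, and it assumes each replicate yields a point estimate, a confidence interval and a p-value for the treatment effect.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SimResult:
    """One simulated trial (hypothetical record layout)."""
    estimate: float  # estimated treatment effect
    ci_low: float    # lower confidence limit
    ci_high: float   # upper confidence limit
    p_value: float   # p-value for the test of no treatment effect


def summarise(results, true_effect, alpha=0.05):
    """Bias, coverage and rejection rate over a list of SimResult objects.

    The rejection rate estimates the size of the test when the data were
    simulated under the null, and the power when simulated under the alternative.
    """
    bias = mean(r.estimate for r in results) - true_effect
    coverage = mean(
        1.0 if r.ci_low <= true_effect <= r.ci_high else 0.0 for r in results
    )
    rejection_rate = mean(1.0 if r.p_value < alpha else 0.0 for r in results)
    return {"bias": bias, "coverage": coverage, "rejection_rate": rejection_rate}
```

Restricting the power comparison to methods whose estimated size is close to the nominal level would then be a filter on `rejection_rate` from the null-scenario runs.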
Simulation parameters (1)
• What is small? Eldridge published 200 CRTs, median number of clusters per arm was 17
• GEEs are not suitable with fewer than approximately 20 clusters per arm
• CREATE designs
• Feasible designs, e.g. 1000 people per cluster, π1 = 0.02, OR = 2, k = 0.5 results in 14 clusters per arm. More than this is unlikely ever to be needed, since a higher k is unlikely. k = 0.2 → 8 clusters, k = 0.3 → 10 clusters
Simulation parameters (2)
• Cluster size
• Variable cluster size
• Baseline prevalence
• Non-normal errors
• Stratification / Matching?
• Rates…
A tangent: k v ICC
• k² is defined as Var(πi) / E(πi)²
• ICC = Var(πi) / (E(πi) × (1 − E(πi)))
• It follows that ICC = k² × π / (1 − π), where π = E(πi)
• This is assumed in the control arm
• The assumption of a common k across arms is not the same as a constant ICC across arms
• These different assumptions imply different variances of πi in the intervention arm
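The relationship between k and the ICC follows from substituting Var(πi) = k² × π² into the ICC definition. A small sketch of the conversion in both directions is given below; the function names `icc_from_k` and `k_from_icc` are illustrative and do not come from the slides.

```python
def icc_from_k(k: float, pi: float) -> float:
    """ICC implied by a coefficient of variation k at mean proportion pi."""
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    # Var = k^2 * pi^2, so ICC = Var / (pi * (1 - pi)) = k^2 * pi / (1 - pi)
    return k ** 2 * pi / (1.0 - pi)


def k_from_icc(icc: float, pi: float) -> float:
    """Coefficient of variation k implied by an ICC at mean proportion pi."""
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    return (icc * (1.0 - pi) / pi) ** 0.5


# The same k corresponds to a different ICC at each mean proportion,
# which is why a common k across arms is not a common ICC across arms.
for pi in (0.02, 0.01):
    print(pi, icc_from_k(0.5, pi))
```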
Variance in the intervention arm
• Constant k, risk ratio of 0.5: Var(π2) = Var(π1) / 4
• Constant ICC, risk ratio of 0.5: Var(π2) = Var(π1) × (1 − π/2) / (2 × (1 − π)), where π is the control-arm proportion
• For small π, this is approximately Var(π1) / 2
• I.e. the variance in the intervention arm under constant k is about half of that under constant ICC
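The two variance results can be checked numerically. The sketch below takes a risk ratio of 0.5, the value for which the constant-k result is Var(π1) / 4, and a control-arm proportion of 0.02 chosen for illustration. The function names are hypothetical.

```python
def var_ratio_constant_k(rr: float) -> float:
    """Var(pi2) / Var(pi1) when k is the same in both arms.

    Var(pi_j) = k^2 * pi_j^2 and pi2 = rr * pi1, so the ratio is rr^2.
    """
    return rr ** 2


def var_ratio_constant_icc(pi1: float, rr: float) -> float:
    """Var(pi2) / Var(pi1) when the ICC is the same in both arms.

    Var(pi_j) = ICC * pi_j * (1 - pi_j), so the ICC cancels in the ratio.
    """
    pi2 = rr * pi1
    return pi2 * (1.0 - pi2) / (pi1 * (1.0 - pi1))


pi1, rr = 0.02, 0.5  # illustrative values
print("constant k:  ", var_ratio_constant_k(rr))
print("constant ICC:", var_ratio_constant_icc(pi1, rr))
```

For small π1 the constant-ICC ratio is close to the risk ratio itself, while the constant-k ratio is its square, which is the factor-of-two gap noted above when the risk ratio is 0.5.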
Why does this matter?
• Discrepancies in the sample size calculations (the slide showed two alternative formulae, not reproduced here)
More discrepancies
• Obvious: the “+1” when there is no clustering
• With small k or ICC, this difference still exists
• As k increases, the “+1” term ‘mops up’ the effect of a smaller variance using k
• As k gets very large, you start to get a slight difference, e.g. k = 0.5, π1 = 0.02, π2 = 0.01 requires 13 (14) clusters
• With a small number of clusters, this can greatly increase the cost of a trial. Which is better? Will look at this using a simulation study, looking at closeness to the nominal power of 80%
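The comparison can be sketched numerically. The slide formulae are not reproduced in this text, so the code below assumes the two standard forms: a k-based formula with a “+1” correction (as in Hayes & Bennett) and an ICC-based formula that inflates the unclustered sample size by the design effect 1 + (n − 1) × ICC. It also assumes a two-sided 5% test, 80% power, 1000 people per cluster, and the ICC derived from k in the control arm. All function names are illustrative.

```python
from statistics import NormalDist


def z_sum_sq(alpha: float = 0.05, power: float = 0.80) -> float:
    """(z_{1-alpha/2} + z_{power})^2 for a two-sided test."""
    nd = NormalDist()
    return (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2


def clusters_k(pi0, pi1, n, k, alpha=0.05, power=0.80):
    """Clusters per arm, k-based formula with the '+1' term (assumed form)."""
    binomial = (pi0 * (1 - pi0) + pi1 * (1 - pi1)) / n
    between = k ** 2 * (pi0 ** 2 + pi1 ** 2)
    return 1 + z_sum_sq(alpha, power) * (binomial + between) / (pi0 - pi1) ** 2


def clusters_icc(pi0, pi1, n, icc, alpha=0.05, power=0.80):
    """Clusters per arm, ICC-based design-effect formula (assumed form)."""
    individuals = (
        z_sum_sq(alpha, power)
        * (pi0 * (1 - pi0) + pi1 * (1 - pi1))
        / (pi0 - pi1) ** 2
    )
    return individuals * (1 + (n - 1) * icc) / n


pi0, pi1, n, k = 0.02, 0.01, 1000, 0.5
icc = k ** 2 * pi0 / (1 - pi0)  # ICC implied by k in the control arm
print("k-based:  ", clusters_k(pi0, pi1, n, k))
print("ICC-based:", clusters_icc(pi0, pi1, n, icc))
```

Under these assumptions the unrounded results are roughly 13 and 14 clusters per arm, and with k = 0 and ICC = 0 the two functions differ by exactly the “+1” term.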
Yet more problems…
• What do I choose as my variance estimate in the intervention arm for simulating data?
• Possible solution? Replace k by the raw variance estimates in the formula?
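One way to make the choice explicit when simulating is to parameterise each arm directly by the mean and variance of the cluster-level proportions. The sketch below assumes Beta-distributed cluster proportions, which is an illustrative choice and not something stated in these slides. The function names, the seed and the design values are likewise hypothetical.

```python
import random


def beta_params(pi: float, var: float):
    """Beta(a, b) parameters giving mean pi and variance var."""
    if not 0.0 < var < pi * (1.0 - pi):
        raise ValueError("need 0 < var < pi * (1 - pi)")
    total = pi * (1.0 - pi) / var - 1.0  # a + b
    return pi * total, (1.0 - pi) * total


def simulate_arm(n_clusters, cluster_size, pi, var, rng):
    """Event counts per cluster: Beta cluster proportions, then binomial counts."""
    a, b = beta_params(pi, var)
    events = []
    for _ in range(n_clusters):
        p_i = rng.betavariate(a, b)  # true proportion in this cluster
        # Binomial draw as a sum of Bernoulli trials (stdlib only)
        events.append(sum(rng.random() < p_i for _ in range(cluster_size)))
    return events


rng = random.Random(2024)      # arbitrary seed for reproducibility
pi1, pi2, k = 0.02, 0.01, 0.5  # illustrative design values
var1 = (k * pi1) ** 2          # control-arm variance from k
var2_k = (k * pi2) ** 2        # intervention arm, constant k
var2_icc = var1 * pi2 * (1 - pi2) / (pi1 * (1 - pi1))  # constant ICC

control = simulate_arm(14, 1000, pi1, var1, rng)
intervention_k = simulate_arm(14, 1000, pi2, var2_k, rng)
intervention_icc = simulate_arm(14, 1000, pi2, var2_icc, rng)
```

Passing the raw variance for each arm, instead of k or the ICC, keeps the simulation independent of which of the two assumptions is used in the sample size formula.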
Other questions
• Covariates: 1? 2? Magnitude of impact?
• Bayesian models: assume a hierarchical logistic model, with different variances for each arm? Will we have enough data to do this and not end up with unfeasibly wide CIs?
• Can one assume this for random effects models? Issues of convergence…