BCG REVAC: Cluster Randomization Trial
Data Analysis: Statistical Issues
Bernd Genser, PhD
Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador
Email: bernd.genser@bgstats.com
Slides available at: www.bgstats.com/port/links/downloads
Seminário ABRASCO, Métodos em Epidemiologia: Estudos de Coorte (ABRASCO Seminar, Methods in Epidemiology: Cohort Studies), Rio de Janeiro, 01-AUG-2005
The BCG trial from a statistician's point of view
• Main objective: obtain an unbiased, consistent estimate of the vaccine efficacy (VE), with 95% CI, of a BCG dose given to school children in a population with high coverage of neonatal BCG vaccination
• Secondary objective: identify effect modifiers (city, BCG scar, …)
Issues to be addressed in the statistical analysis
1) Potential confounding and effect modification
   - Trial design: complex multi-level covariate structure
   - Adjusting/controlling for confounding by fixed and time-varying (e.g. age) TB predictors
   - Heterogeneity of VE across covariate strata expected
2) Cluster randomization: adjusting the estimates for potential intra-cluster correlation
3) Expected low incidence of TB: more clusters than cases expected => traditional statistical methods for cluster randomized trials (CRT) could not be applied
Analytical Solutions for the BCG trial (1)
• Issue 1: Dealing with potential confounding variables
• Controlled by study design (stratification/randomization):
   - Allocation groups were highly balanced in confounding variables => no statistical adjustment required for these covariates
   - Matching by school size additionally accounts for the effect of "cluster size"
• Adjusted in the statistical analysis:
   - TB incidence is well known to depend strongly on age => age modeled as a time-varying variable
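One common way to model age as a time-varying variable in a rate model is to split each child's follow-up into age bands, so that every piece of person-time carries its current age group. The sketch below is only an illustration of that idea, not the trial's actual code; the function name `split_by_age_band` and the cut points (chosen to mirror the age groups 7-8, 9-10, 11-12, 13-14, >14) are assumptions.

```python
def split_by_age_band(age_at_entry, follow_up_years, cut_points):
    """Split one child's follow-up into (age_band, person_years) pieces.

    cut_points are the lower edges of the age bands (hypothetical values);
    the first band is open below and the last band is open above.
    """
    start = age_at_entry
    end = age_at_entry + follow_up_years
    edges = [float("-inf")] + sorted(cut_points) + [float("inf")]
    pieces = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Person-time this child spends inside the band [lo, hi)
        overlap = min(end, hi) - max(start, lo)
        if overlap > 0:
            pieces.append(((lo, hi), overlap))
    return pieces


# A child entering at age 10.5 and followed for 5 years
for band, years in split_by_age_band(10.5, 5.0, [7, 9, 11, 13, 15]):
    print(band, years)
```

Each piece would then enter a Poisson model as its own record, with the log of its person-years as the offset and the age band as a covariate.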
Dealing with covariates in the BCG trial
[Diagram: each covariate is labelled with how it was handled: by design (stratification, randomization or matching), by statistical adjustment, or by subgroup analysis.]
Evaluation of the random allocation procedure: individual-level covariates

                                  Total population                                Recruited study children
Covariate                         Intervention   Control        Total            Intervention   Control        Total
                                  n=180655       n=173751       n=354406         n=124340       n=115594       n=239934
City (% children in Salvador)     58.6%          53.0%          55.9%            58.8%          49.0%          54.1%
Age (mean, sd)                    11.53 (2.16)   11.46 (2.17)   11.51 (2.16)     11.53 (2.08)   11.44 (2.10)   11.48 (2.09)
Age group
  < 7                             0.0%           0.3%           0.2%             0.0%           0.3%           0.1%
  7-8                             15.6%          16.3%          16.0%            14.3%          15.6%          14.9%
  9-10                            24.9%          24.6%          24.7%            25.5%          25.3%          25.4%
  11-12                           29.2%          29.0%          29.1%            31.0%          30.6%          30.8%
  13-14                           28.0%          28.3%          28.1%            28.1%          28.1%          28.1%
  > 14                            2.3%           1.5%           1.9%             1.1%           0.1%           0.6%
Gender (% males)                  49.5%          49.5%          49.5%            48.2%          48.5%          48.4%
BCG scar reading
  total                           76.0%          72.5%          74.3%            100.0%         100.0%         100%
  after exclusion because of age  76.3%          72.5%          74.2%
BCG scar count
  no scar                         11.7%          10.8%          11.3%            16.6%          16.0%          16.3%
  one scar                        58.4%          56.6%          57.5%            83.4%          84.0%          83.7%
  two scars                       4.3%           3.9%           4.1%             0%             0%             0%
  no data                         25.6%          28.7%          27.1%            0%             0%             0%
Vaccination (% vaccinated)        66.7%          0.5%           34.2%            94.6%          0.7%           49%

Evaluation of the random allocation procedure: cluster-level covariates

                                  Total population                                    Recruited study children
Covariate                         Intervention    Control         Total              Intervention    Control         Total
                                  n=180655        n=173751        n=354406           n=124340        n=115594        n=239934
Schools (count)                   388 (50.8%)     375 (49.2%)     763 (100%)         386 (51.3%)     365 (48.7%)     751 (100%)
Cluster size
  mean (sd)                       465 (325)       463 (290)       464 (308)          322 (233)       317 (215)       319 (224)
  min; max                        26; 2368        36; 1764        26; 2368           11; 1430        10; 1334        10; 1430
Gender (% males)
  mean (sd)                       49.7% (7.0%)    50.1% (5.4%)    49.9% (6.3%)       48.4% (7.4%)    49.3% (5.8%)    48.8% (6.7%)
  min; max                        0%; 84.7%       35.2%; 99.1%    0%; 99.1%          0%; 83.3%       35%; 98.8%      0%; 98.8%
Scar reading (% yes)
  mean (sd)                       75.1% (12.2%)   71.0% (17.0%)   73.1% (14.9%)      100%            100%            100%
  min; max                        0%; 95.7%       0%; 95.1%       0%; 95.7%
Scar count (% 0 or 1)
  mean (sd)                       69.2% (12.2%)   65.8% (16.3%)   67.5% (14.4%)      100%            100%            100%
  min; max                        0%; 90.3%       0%; 91.3%       0%; 91.3%

Data available for Salvador only  Intervention    Control         Total
Socio-economic condition
  0-25                            2.5%            3.1%            2.8%
  26-50                           10.0%           11.1%           10.5%
  51-75                           30.3%           29.6%           30.0%
  76-HI                           57.3%           56.2%           56.7%

Data available for Manaus only    Intervention    Control         Total
Incidence of TB
  mean (sd)                       121.6 (91.3)    126.3 (74.8)    123.5 (84.8)
  min; max                        14.5; 618.0     14.5; 618       14.5; 618
Incidence of leprosy
  mean (sd)                       8.8 (9.8)       7.8 (7.2)       8.4 (8.8)
  min; max                        0.3; 66.9       0; 66.9         0; 66.9

Analytical Solutions for the BCG trial (2)
• Issue 2: Dealing with effect modification
• Subgroup analyses conducted by:
   - Number of BCG scars (first or second dose)
   - City (Salvador and Manaus)
   - Clinical form/certainty level
• Strong evidence of effect heterogeneity found:
   - We decided to analyze children with 1 and 0 scars separately: the effects of a first and of a second dose are completely different scientific questions => no interaction model fitted
   - All analyses were presented overall and by city and clinical form
Analytical Solutions for the BCG trial (3)
• Issue 3: Adjusting the estimates for the "design effect"
• Statistical problem: between-cluster variation (= intra-cluster correlation), induced by an unexplained dependence structure between children from the same school, usually caused by common unknown/unobserved risk factors
• Consequence: standard statistical approaches can substantially underestimate the true variance of the effect estimators (overdispersion), so confidence intervals are too narrow
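For roughly equal cluster sizes, the size of this problem is usually summarized by the textbook design effect, DEFF = 1 + (m - 1) × ICC, where m is the mean cluster size. The sketch below only illustrates that formula; the function names and the ICC value are hypothetical, and the mean cluster size of 319 is taken from the allocation table for recruited children.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Textbook variance inflation factor for equal-sized clusters."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_se(naive_se, deff):
    """Scale a naive standard error by the square root of the design effect."""
    return naive_se * math.sqrt(deff)


# Hypothetical ICC of 0.001 with a mean of 319 children per school
deff = design_effect(319, 0.001)
print(deff, inflate_se(0.1195, deff))
```

Even a very small ICC can matter when clusters are large, which is why the naive and adjusted confidence intervals are worth comparing.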
Analytical Solutions for the BCG trial (4)
• Statistical approaches to deal with intra-cluster correlation
• For binary or quantitative outcomes:
   - Direct adjustment of confidence intervals is possible by estimating the intracluster (intraclass) correlation coefficient (ICC)
• For count outcomes (Poisson-distributed data):
   - Explicit estimation of the ICC is not possible
   - The magnitude of the design effect can be examined by comparing unadjusted and adjusted CIs
   - Novel univariate approaches directly adjust the CIs and P-values for the clustering
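For a binary outcome, one widely used way to estimate the ICC is the one-way ANOVA estimator computed from per-cluster sizes and event counts. The sketch below assumes that estimator; the slides do not say which ICC estimator was used, and the function name `anova_icc` and the toy data are illustrative only.

```python
def anova_icc(cluster_sizes, cluster_events):
    """One-way ANOVA estimator of the ICC for a binary outcome.

    cluster_sizes[i] is the number of children in cluster i and
    cluster_events[i] the number of them with the outcome.
    """
    k = len(cluster_sizes)
    n_total = sum(cluster_sizes)
    p_overall = sum(cluster_events) / n_total
    pairs = list(zip(cluster_sizes, cluster_events))
    # Between-cluster and within-cluster sums of squares
    ssb = sum(n * (y / n - p_overall) ** 2 for n, y in pairs)
    ssw = sum(y * (1 - y / n) for n, y in pairs)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    # Adjusted mean cluster size for unequal clusters
    m0 = (n_total - sum(n * n for n in cluster_sizes) / n_total) / (k - 1)
    return (msb - msw) / (msb + (m0 - 1) * msw)


# Toy data: four schools with different outcome proportions
print(anova_icc([50, 60, 40, 55], [5, 12, 2, 9]))
```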
Analytical Solutions for the BCG trial (5)
• Two basic approaches for CRT with Poisson data:
• A) Analyses at the cluster level
   - "Cluster summary statistic", meta-analysis techniques: not recommended in our trial because of the very low cluster-specific incidence, i.e. more clusters than cases
• B) Analyses at the individual level
   - New approach for univariate analysis: ratio estimator approach for overdispersed Poisson data (Rao & Scott, Stat Med 1999, implemented in the software ACLUSTER)
   - Direct adjustment of confidence intervals using a robust variance estimator
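The general idea of a ratio estimator with a robust variance can be sketched as follows: each arm's rate is total cases over total person-years, and its variance is estimated from the cluster-level residuals by the usual linearization formula for a ratio. This is a generic illustration under those assumptions, not the exact Rao & Scott procedure or the ACLUSTER implementation; all function names and the toy data are hypothetical.

```python
import math


def rate_and_variance(events, person_years):
    """Rate (sum of events / sum of person-years) over clusters and its
    linearization variance based on cluster-level residuals."""
    k = len(events)
    total_t = sum(person_years)
    rate = sum(events) / total_t
    ss = sum((y - rate * t) ** 2 for y, t in zip(events, person_years))
    variance = k / (k - 1) * ss / total_t ** 2
    return rate, variance


def rate_ratio_ci(ev1, py1, ev0, py0, z=1.96):
    """Rate ratio (arm 1 vs arm 0) with a CI built on the log scale."""
    r1, v1 = rate_and_variance(ev1, py1)
    r0, v0 = rate_and_variance(ev0, py0)
    log_rr = math.log(r1 / r0)
    # Delta method: var(log r) is approximately var(r) / r^2
    se = math.sqrt(v1 / r1 ** 2 + v0 / r0 ** 2)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Toy data: three schools per arm (cases, person-years)
print(rate_ratio_ci([1, 0, 2], [100, 100, 200], [2, 1, 3], [100, 100, 200]))
```

Because the variance is built from whole-cluster residuals, it stays usable when most clusters have zero cases, which is the situation described above.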
Analytical Solutions for the BCG trial (6)
• Multivariate modeling: Poisson regression
   - Basic assumption: constant rate over the follow-up time
   - Can be relaxed by including time-varying variables (e.g. age)
• Extensions for clustered data:
• Parametric random effects or multi-level modelling:
   - Intra-cluster correlation modeled by a cluster-specific random effect
   - Disadvantage: strong distributional assumptions
   => Random effects models not recommended for this trial:
   - violation of distributional assumptions
   - convergence problems
   - large bias in the variance estimation of the random effect
• Better: semi-parametric approach based on Generalized Estimating Equations (GEE):
   - calculates an adjusted variance estimator by an iterative algorithm
   - assumes a "working correlation structure"
   - Advantage: no distributional assumptions
   - Disadvantage: very computer-intensive for large datasets because of the computational complexity: time for the BCG data: 1 hour! (1000
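For reference, the GEE approach can be written in its standard textbook form. The notation below is an assumption, since the slides introduce no symbols: school i has outcome vector y_i, person-years t_ij, covariate matrix X_i and working correlation matrix R(alpha).

```latex
% Poisson mean model with log link and person-time offset
\mu_{ij} = t_{ij}\,\exp\!\left(x_{ij}^{\top}\beta\right),
\qquad A_i = \operatorname{diag}(\mu_{ij}),
\qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}} = A_i X_i

% Estimating equations with working covariance V_i
\sum_{i=1}^{K} D_i^{\top} V_i^{-1}\left(y_i - \mu_i(\beta)\right) = 0,
\qquad V_i = \phi\, A_i^{1/2} R(\alpha)\, A_i^{1/2}

% Robust ("sandwich") covariance estimator
\widehat{\operatorname{Cov}}(\hat\beta) = M_0^{-1} M_1 M_0^{-1},
\quad M_0 = \sum_{i=1}^{K} D_i^{\top} V_i^{-1} D_i,
\quad M_1 = \sum_{i=1}^{K} D_i^{\top} V_i^{-1}
            (y_i - \hat\mu_i)(y_i - \hat\mu_i)^{\top} V_i^{-1} D_i
```

The model-based (naive) covariance is M_0^{-1}; the two agree when the working covariance is correct, so a large gap between them points to a design effect.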
Results of the Poisson regression models

                        VE    95% CI         RR     95% CI          beta    95% CI                SE(beta)   Wald     P-value
                              lb     ub             lb     ub               ln_lb      ln_ub
All cases, second dose
  Standard Poisson       9    -15    28      0.91   0.72   1.15    -0.09    -0.32850   0.13976    0.1195     -0.789   0.430
  GEE Poisson            9    -16    29      0.91   0.71   1.16    -0.09    -0.33624   0.14762    0.1234     -0.764   0.445
Non-pulmonary, second dose
  Standard Poisson      37     -4    61      0.63   0.39   1.04    -0.46    -0.94161   0.03922    0.2502     -1.847   0.065
  GEE Poisson           37     -3    61      0.63   0.39   1.03    -0.46    -0.94896   0.02489    0.2484     -1.860   0.063
Pulmonary, second dose
  Standard Poisson      -1    -32    13      1.01   0.87   1.32     0.01    -0.13926   0.27763    0.1064      0.094   1.000
  GEE Poisson           -1    -24    18      1.01   0.82   1.24     0.01    -0.19657   0.21647    0.1054      0.094   1.000

Naive and robust variance estimates were very similar: no "design effect" observed
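The columns of the results table are linked by simple transformations: RR = exp(beta), VE = 100 × (1 - RR), the CI is computed on the log scale, and Wald = beta / SE(beta). The sketch below shows that arithmetic; the function name `ve_summary` is hypothetical, and the input beta of -0.09437 is an assumption (the midpoint of the tabulated ln_lb and ln_ub for the first row, since the table shows beta rounded to -0.09).

```python
import math
from statistics import NormalDist


def ve_summary(beta, se, level=0.95):
    """Derive RR, VE (in %), their CIs, the Wald statistic and a two-sided
    normal P-value from a log rate ratio and its standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    rr = math.exp(beta)
    rr_lb = math.exp(beta - z * se)
    rr_ub = math.exp(beta + z * se)
    wald = beta / se
    p_value = 2 * (1 - NormalDist().cdf(abs(wald)))
    return {
        "RR": rr,
        "RR_CI": (rr_lb, rr_ub),
        "VE": 100 * (1 - rr),
        # The upper RR bound gives the lower VE bound and vice versa
        "VE_CI": (100 * (1 - rr_ub), 100 * (1 - rr_lb)),
        "Wald": wald,
        "P": p_value,
    }


# First row: all cases, second dose, standard Poisson
print(ve_summary(-0.09437, 0.1195))
```

The same arithmetic explains why a small change in SE(beta) between the standard and GEE rows moves the VE bounds by only a point or so.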
Statistical software for analysing/planning CRT
• STATA 7/8/9, general-purpose statistical package, Stata Corporation, www.stata.com
   - GLM with GEE, random effects or robust variance estimation to adjust for clustering
• STATA 9, MLwiN: multi-level models, www.multilevel.ioe.ac.uk
• ACLUSTER: Software for the Design and Analysis of Cluster Randomized Trials, www.update-software.com/acluster
   - Easy computation of the intraclass correlation coefficient
   - Direct adjustment approaches for univariate analysis
   - Power analysis for the three types of cluster randomized study design
Literature
• Statistics in Medicine (2001); 20 (Special Issue): Design and Analysis of Cluster Randomized Trials
• Donner A, Klar N (2000). Design and analysis of cluster randomisation trials. Arnold Publications, London.