
Data Analysis – Statistical Issues Bernd Genser, PhD

BCG REVAC - Cluster Randomization Trial. Data Analysis – Statistical Issues. Bernd Genser, PhD, Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador. Email: bernd.genser@bgstats.com. Slides available at: www.bgstats.com/port/links/downloads


Presentation Transcript


  1. BCG REVAC - Cluster Randomization Trial. Data Analysis – Statistical Issues. Bernd Genser, PhD, Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador. Email: bernd.genser@bgstats.com. Slides available at: www.bgstats.com/port/links/downloads. Seminário ABRASCO - Métodos em Epidemiologia: ESTUDOS DE COORTE, Rio de Janeiro, 01 Aug 2005

  2. The BCG trial from a statistician's point of view
     Main objective: obtain an unbiased, consistent estimate of vaccine efficacy (VE), with its 95% CI, for a BCG dose given to school children in a population with high coverage of neonatal BCG vaccination
     Secondary objective: identify effect modifiers (city, BCG scar, …)
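To make the main objective concrete, here is a minimal sketch of VE computed as one minus the rate ratio, with a Wald 95% CI built on the log scale. The function name and the case counts are hypothetical, and the standard error shown is the plain Poisson one, so it does not yet account for cluster randomization.

```python
import math

def vaccine_efficacy(cases_vacc, pyears_vacc, cases_ctrl, pyears_ctrl, z=1.96):
    """Return (VE, lower, upper) in percent, where VE = (1 - rate ratio) * 100.

    Uses the usual Poisson standard error of log(RR), sqrt(1/a + 1/b),
    which assumes independent individuals (no cluster adjustment).
    """
    rr = (cases_vacc / pyears_vacc) / (cases_ctrl / pyears_ctrl)
    se_log_rr = math.sqrt(1 / cases_vacc + 1 / cases_ctrl)
    rr_lo = math.exp(math.log(rr) - z * se_log_rr)
    rr_hi = math.exp(math.log(rr) + z * se_log_rr)
    # A high RR bound maps to a low VE bound, so the limits swap.
    return (1 - rr) * 100, (1 - rr_hi) * 100, (1 - rr_lo) * 100

# Hypothetical counts, not trial data.
ve, lo, hi = vaccine_efficacy(30, 100_000, 40, 100_000)
print(f"VE = {ve:.0f}% (95% CI {lo:.0f}% to {hi:.0f}%)")
```

Because the CI is symmetric only on the log(RR) scale, the VE interval is asymmetric and can extend below zero, as it does for these made-up counts.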

  3. Issues to be addressed in the statistical analysis
     1) Potential confounding and effect modification
        - Trial design: complex multi-level covariate structure
        - Adjusting/controlling for confounding by fixed and time-varying (e.g. age) TB predictors
        - Heterogeneity of VE across covariate strata expected
     2) Cluster randomization: adjusting the estimates for potential intra-cluster correlation
     3) Expected low incidence of TB: more clusters than cases expected => traditional statistical methods for CRTs could not be applied

  4. Analytical Solutions for the BCG trial
     Issue 1: Dealing with potential confounding variables
     • Controlled by study design (stratification/randomization):
       - Allocation groups were highly balanced on confounding variables => no statistical adjustment required for these covariates
       - Matching by size of school additionally accounts for the effect of “cluster size”
     • Adjusted in the statistical analysis:
       - TB incidence is well known to depend strongly on age => age modeled as a time-varying variable
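Modeling age as a time-varying variable usually means splitting each child's follow-up into pieces that fall within one age band. The sketch below shows one way to do that split. The function name is hypothetical, and the cut points (7, 9, 11, 13, 15) are an assumption chosen to match the age groups reported in the baseline table (<7, 7-8, 9-10, 11-12, 13-14, >14).

```python
def split_by_age(age_at_entry, years_followed, cuts=(7, 9, 11, 13, 15)):
    """Split one child's follow-up into person-years per age band.

    Returns a list of (band_lower, band_upper, person_years); the first band
    is open below (lower = None) and the last is open above (upper = None).
    """
    age_at_exit = age_at_entry + years_followed
    edges = [None, *cuts, None]
    pieces = []
    for lower, upper in zip(edges[:-1], edges[1:]):
        start = age_at_entry if lower is None else max(age_at_entry, lower)
        stop = age_at_exit if upper is None else min(age_at_exit, upper)
        if stop > start:  # keep only bands the child actually spent time in
            pieces.append((lower, upper, stop - start))
    return pieces

# A child recruited at age 7.5 and followed for 4 years contributes to three bands.
for lower, upper, pyears in split_by_age(7.5, 4):
    print(lower, upper, pyears)
```

Each piece then becomes its own record in the rate model, with the event attached to the piece in which it occurred.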

  5. Dealing with covariates in the BCG trial
     [Diagram: each covariate is labelled with how it was handled, by design (stratification, randomization or matching) and in the analysis (subgroup analysis or statistical adjustment). The covariate names are not preserved in the transcript.]

  6. Evaluation of the random allocation procedure: individual-level covariates

     | Covariate | Total pop., intervention | Total pop., control | Total pop., total | Recruited, intervention | Recruited, control | Recruited, total |
     |---|---|---|---|---|---|---|
     | n | 180655 | 173751 | 354406 | 124340 | 115594 | 239934 |
     | City (% children in Salvador) | 58.6% | 53.0% | 55.9% | 58.8% | 49.0% | 54.1% |
     | Age (mean, sd) | 11.53 (2.16) | 11.46 (2.17) | 11.51 (2.16) | 11.53 (2.08) | 11.44 (2.10) | 11.48 (2.09) |
     | Age group < 7 | 0.0% | 0.3% | 0.2% | 0.0% | 0.3% | 0.1% |
     | Age group 7-8 | 15.6% | 16.3% | 16.0% | 14.3% | 15.6% | 14.9% |
     | Age group 9-10 | 24.9% | 24.6% | 24.7% | 25.5% | 25.3% | 25.4% |
     | Age group 11-12 | 29.2% | 29.0% | 29.1% | 31.0% | 30.6% | 30.8% |
     | Age group 13-14 | 28.0% | 28.3% | 28.1% | 28.1% | 28.1% | 28.1% |
     | Age group > 14 | 2.3% | 1.5% | 1.9% | 1.1% | 0.1% | 0.6% |
     | Gender (% males) | 49.5% | 49.5% | 49.5% | 48.2% | 48.5% | 48.4% |
     | BCG scar reading, total | 76.0% | 72.5% | 74.3% | 100.0% | 100.0% | 100% |
     | BCG scar reading, after exclusion because of age | 76.3% | 72.5% | 74.2% | | | |
     | BCG scar count: no scar | 11.7% | 10.8% | 11.3% | 16.6% | 16.0% | 16.3% |
     | BCG scar count: one scar | 58.4% | 56.6% | 57.5% | 83.4% | 84.0% | 83.7% |
     | BCG scar count: two scars | 4.3% | 3.9% | 4.1% | 0% | 0% | 0% |
     | BCG scar count: no data | 25.6% | 28.7% | 27.1% | 0% | 0% | 0% |
     | Vaccination (% vaccinated) | 66.7% | 0.5% | 34.2% | 94.6% | 0.7% | 49% |

  7. Evaluation of the random allocation procedure: cluster-level covariates

     | Covariate | Total pop., intervention | Total pop., control | Total pop., total | Recruited, intervention | Recruited, control | Recruited, total |
     |---|---|---|---|---|---|---|
     | n (children) | 180655 | 173751 | 354406 | 124340 | 115594 | 239934 |
     | Schools, count | 388 (50.8%) | 375 (49.2%) | 763 (100%) | 386 (51.3%) | 365 (48.7%) | 751 (100%) |
     | Cluster size, mean (sd) | 465 (325) | 463 (290) | 464 (308) | 322 (233) | 317 (215) | 319 (224) |
     | Cluster size, min; max | 26; 2368 | 36; 1764 | 26; 2368 | 11; 1430 | 10; 1334 | 10; 1430 |
     | Gender (% males), mean (sd) | 49.7% (7.0%) | 50.1% (5.4%) | 49.9% (6.3%) | 48.4% (7.4%) | 49.3% (5.8%) | 48.8% (6.7%) |
     | Gender (% males), min; max | 0%; 84.7% | 35.2%; 99.1% | 0%; 99.1% | 0%; 83.3% | 35%; 98.8% | 0%; 98.8% |
     | Scar reading (% yes), mean (sd) | 75.1% (12.2%) | 71.0% (17.0%) | 73.1% (14.9%) | 100% | 100% | 100% |
     | Scar reading (% yes), min; max | 0%; 95.7% | 0%; 95.1% | 0%; 95.7% | | | |
     | Scar count (% 0 or 1), mean (sd) | 69.2% (12.2%) | 65.8% (16.3%) | 67.5% (14.4%) | 100% | 100% | 100% |
     | Scar count (% 0 or 1), min; max | 0%; 90.3% | 0%; 91.3% | 0%; 91.3% | | | |

     Data available for Salvador only:

     | Socio-economic condition | Intervention | Control | Total |
     |---|---|---|---|
     | 0-25 | 2.5% | 3.1% | 2.8% |
     | 26-50 | 10.0% | 11.1% | 10.5% |
     | 51-75 | 30.3% | 29.6% | 30.0% |
     | 76-HI | 57.3% | 56.2% | 56.7% |

     Data available for Manaus only:

     | Covariate | Intervention | Control | Total |
     |---|---|---|---|
     | Incidence of TB, mean (sd) | 121.6 (91.3) | 126.3 (74.8) | 123.5 (84.8) |
     | Incidence of TB, min; max | 14.5; 618.0 | 14.5; 618 | 14.5; 618 |
     | Incidence of leprosy, mean (sd) | 8.8 (9.8) | 7.8 (7.2) | 8.4 (8.8) |
     | Incidence of leprosy, min; max | 0.3; 66.9 | 0; 66.9 | 0; 66.9 |

  8. Analytical Solutions for the BCG trial (2)
     Issue 2: Dealing with effect modification
     • Subgroup analyses conducted by:
       - Number of BCG scars (first or second dose)
       - City (Salvador and Manaus)
       - Clinical form/certainty level
     • Strong evidence of effect heterogeneity found:
       - We decided to analyze children with 1 and 0 scars separately: first-dose and second-dose effects are completely different scientific questions => no interaction model fitted!
       - All analyses were presented overall and by city and clinical form

  9. Analytical Solutions for the BCG trial (3)
     Issue 3: Adjusting the estimates for the “design effect”
     • Statistical problem: between-cluster variation (= intra-cluster correlation), induced by an unexplained dependence structure between children from the same school, usually caused by common unknown/unobserved risk factors
     • Consequence: standard statistical approaches can substantially underestimate the true variance of the effect estimators (overdispersion), so confidence intervals are too narrow!
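A rough feel for the size of the problem comes from the textbook design-effect formula for equal-sized clusters, 1 + (m - 1) × ICC. The sketch below applies it to the mean cluster size of 319 reported for recruited children on slide 7. The ICC values and function names are hypothetical, and the formula is the standard approximation, not necessarily the calculation used in the trial.

```python
import math

def design_effect(mean_cluster_size, icc):
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc

def widen_ci(estimate, naive_se, deff, z=1.96):
    """Inflate a naive standard error by sqrt(deff) and return the CI."""
    se = naive_se * math.sqrt(deff)
    return estimate - z * se, estimate + z * se

# Mean cluster size 319 is taken from slide 7; the ICC values are hypothetical.
for icc in (0.0, 0.001, 0.01):
    print(icc, round(design_effect(319, icc), 2))
```

With clusters this large, even a very small ICC inflates the variance noticeably, which is why the naive intervals can be too narrow.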

  10. Analytical Solutions for the BCG trial (4)
     Statistical approaches to deal with ICC:
     • For binary or quantitative outcomes: direct adjustment of confidence intervals is possible by estimating the intracluster (intraclass) correlation (ICC)
     • For count outcomes (Poisson-distributed data):
       - Explicit estimation of the ICC is not possible!
       - The magnitude of the design effect can be examined by comparing unadjusted and adjusted CIs
       - Novel univariate approaches directly adjust the CIs and P-values for the clustering
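For the binary or quantitative case, one common way to estimate the ICC is the one-way ANOVA estimator. The sketch below is a minimal version of that estimator with a hypothetical function name and toy data. It illustrates the general technique and is not taken from the slides or from ACLUSTER.

```python
def anova_icc(clusters):
    """ANOVA estimator of the intracluster correlation.

    clusters: list of lists of individual outcomes (0/1 or quantitative).
    """
    k = len(clusters)
    n = sum(len(c) for c in clusters)
    grand_mean = sum(sum(c) for c in clusters) / n
    means = [sum(c) / len(c) for c in clusters]
    ss_between = sum(len(c) * (m - grand_mean) ** 2 for c, m in zip(clusters, means))
    ss_within = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    msb = ss_between / (k - 1)   # mean square between clusters
    msw = ss_within / (n - k)    # mean square within clusters
    # "Average" cluster size that accounts for unequal cluster sizes.
    m0 = (n - sum(len(c) ** 2 for c in clusters) / n) / (k - 1)
    return (msb - msw) / (msb + (m0 - 1) * msw)

# Toy data: outcomes identical within each school give an ICC of 1.
print(anova_icc([[1, 1, 1], [0, 0, 0]]))
```

The estimate can be negative when children within a school are less alike than children from different schools. In practice such values are often truncated at zero.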

  11. Analytical Solutions for the BCG trial (5)
     Two basic approaches for CRTs with Poisson data:
     A) Analyses at the cluster level: “cluster summary statistic”, meta-analysis techniques. Not recommended in our trial because of the very low cluster-specific incidence, i.e. more clusters than cases!
     B) Analyses at the individual level: new approach for univariate analysis, the ratio estimator approach for overdispersed Poisson data (Rao & Scott, Stat Med 1999, implemented in the software ACLUSTER): direct adjustment of confidence intervals using a robust variance estimator

  12. Ratio estimator approach for overdispersed Poisson data
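The formulas for this slide did not survive in the transcript. As a hedged sketch of the general idea, the code below treats the arm-level rate as a ratio of cluster totals, estimates its variance from between-cluster residuals, and divides the counts by the resulting design effect before applying ordinary Poisson formulas. The function names and data are hypothetical. This is an illustration in the spirit of the ratio estimator approach, not the ACLUSTER implementation.

```python
import math

def ratio_rate(clusters):
    """clusters: list of (cases, person_years), one tuple per school.

    Returns (rate, variance, design_effect), where the variance is the
    between-cluster linearization estimate for the ratio sum(y) / sum(t).
    """
    k = len(clusters)
    y = sum(c for c, _ in clusters)
    t = sum(p for _, p in clusters)
    rate = y / t
    var = k / (k - 1) * sum((c - rate * p) ** 2 for c, p in clusters) / t ** 2
    poisson_var = rate / t  # variance of y / t if y were Poisson
    return rate, var, var / poisson_var

def adjusted_rate_ratio(arm1, arm0, z=1.96):
    """Rate ratio with a CI built from design-effect-adjusted counts."""
    r1, _, d1 = ratio_rate(arm1)
    r0, _, d0 = ratio_rate(arm0)
    y1 = sum(c for c, _ in arm1) / d1  # effective number of cases
    y0 = sum(c for c, _ in arm0) / d0
    se = math.sqrt(1 / y1 + 1 / y0)
    log_rr = math.log(r1 / r0)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)

# Hypothetical schools: (cases, person-years).
vaccinated = [(0, 300), (1, 500), (0, 200), (2, 1000)]
control = [(1, 300), (2, 500), (0, 200), (3, 1000)]
print(adjusted_rate_ratio(vaccinated, control))
```

The sketch needs at least two clusters and at least one case per arm, and it fails if the residual variance in an arm is exactly zero. It works even when most schools have no cases, which is the situation the slides describe.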

  13. Analytical Solutions for the BCG trial (6)
     Multivariate modeling: Poisson regression
     • Basic assumption: constant rate over the follow-up time
       - Can be relaxed by including time-varying variables (e.g. age)
     • Extensions for clustered data:
       - Parametric random effects or multi-level modelling:
         - intra-cluster correlation modeled by a cluster-specific random effect
         - Disadvantage: strong distributional assumptions!
         => Random effects models not recommended for this trial: violation of distributional assumptions, convergence problems, large bias in the variance estimate of the random effect!
       - Better: semi-parametric approach based on Generalized Estimating Equations (GEE):
         - calculates an adjusted variance estimator by an iterative algorithm
         - assumes a “working correlation structure”
         - Advantage: no distributional assumptions!
         - Disadvantage: very computer-intensive for large datasets because of the computational complexity: time for the BCG data: 1 hour! (1000
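To show what the basic Poisson regression does before any clustering adjustment, here is a minimal sketch that fits a log-linear rate model with a person-time offset and a single binary treatment indicator by Newton-Raphson. The function name and data are hypothetical, the standard error returned is the naive model-based one, and a real analysis would use a statistics package with GEE or robust variance options, not this hand-rolled fit.

```python
import math

def fit_poisson(rows, n_iter=25):
    """Fit log(E[cases]) = log(person_years) + b0 + b1 * treated.

    rows: list of (cases, person_years, treated) with treated in {0, 1}.
    Returns (b0, b1, naive_se_b1); needs cases in both arms.
    """
    total_cases = sum(y for y, _, _ in rows)
    total_time = sum(t for _, t, _ in rows)
    b0, b1 = math.log(total_cases / total_time), 0.0
    for _ in range(n_iter):
        # Score vector (g0, g1) and information matrix [[a, b], [b, c]].
        g0 = g1 = a = b = 0.0
        for y, t, z in rows:
            mu = t * math.exp(b0 + b1 * z)
            g0 += y - mu
            g1 += z * (y - mu)
            a += mu
            b += z * mu
        c = b  # z * z == z for a binary covariate
        det = a * c - b * b
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    naive_se_b1 = math.sqrt(a / det)  # from the inverse information matrix
    return b0, b1, naive_se_b1

# Hypothetical data: one row per school.
rows = [(0, 300, 1), (1, 500, 1), (0, 200, 1), (2, 1000, 1),
        (1, 300, 0), (2, 500, 0), (0, 200, 0), (3, 1000, 0)]
b0, b1, se = fit_poisson(rows)
print(math.exp(b1), se)
```

With only a treatment indicator, exp(b1) equals the crude rate ratio and the naive standard error reduces to sqrt(1/cases_treated + 1/cases_control), which is a convenient check on the fit.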

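  14. Results of the Poisson regression models

     | Cases | Model | VE (%) | VE 95% CI lb | VE 95% CI ub | RR | RR 95% CI lb | RR 95% CI ub | beta | ln_lb | ln_ub | SE(beta) | Wald | P-value |
     |---|---|---|---|---|---|---|---|---|---|---|---|---|---|
     | All cases, second dose | Standard Poisson | 9 | -15 | 28 | 0.91 | 0.72 | 1.15 | -0.09 | -0.32850 | 0.13976 | 0.1195 | -0.789 | 0.430 |
     | All cases, second dose | GEE Poisson | 9 | -16 | 29 | 0.91 | 0.71 | 1.16 | -0.09 | -0.33624 | 0.14762 | 0.1234 | -0.764 | 0.445 |
     | Non-pulmonary, second dose | Standard Poisson | 37 | -4 | 61 | 0.63 | 0.39 | 1.04 | -0.46 | -0.94161 | 0.03922 | 0.2502 | -1.847 | 0.065 |
     | Non-pulmonary, second dose | GEE Poisson | 37 | -3 | 61 | 0.63 | 0.39 | 1.03 | -0.46 | -0.94896 | 0.02489 | 0.2484 | -1.860 | 0.063 |
     | Pulmonary, second dose | Standard Poisson | -1 | -32 | 13 | 1.01 | 0.87 | 1.32 | 0.01 | -0.13926 | 0.27763 | 0.1064 | 0.094 | 1.000 |
     | Pulmonary, second dose | GEE Poisson | -1 | -24 | 18 | 1.01 | 0.82 | 1.24 | 0.01 | -0.19657 | 0.21647 | 0.1054 | 0.094 | 1.000 |

     Naive and robust variance estimates were very similar: no “design effect” observed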
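One simple way to quantify "very similar" is the ratio of the robust (GEE) variance to the naive (standard Poisson) variance, which acts as an empirical design effect. The sketch below computes it from the SE(beta) values in the results table. The helper name is hypothetical.

```python
# SE(beta) values copied from the results table: (standard Poisson, GEE Poisson).
se_by_outcome = {
    "All cases, second dose": (0.1195, 0.1234),
    "Non-pulmonary, second dose": (0.2502, 0.2484),
    "Pulmonary, second dose": (0.1064, 0.1054),
}

def variance_ratio(naive_se, robust_se):
    """Empirical design effect: robust variance divided by naive variance."""
    return (robust_se / naive_se) ** 2

ratios = {name: variance_ratio(*ses) for name, ses in se_by_outcome.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

All three ratios fall between roughly 0.98 and 1.07, which is consistent with the conclusion that clustering had little effect on the precision of the estimates.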

  15. Statistical software for analysing/planning CRTs
     • STATA 7/8/9, general-purpose statistical package, Stata Corporation, www.stata.com
       - GLM with GEE, random effects or robust variance estimation to adjust for clustering
     • STATA 9, MLwiN: multi-level models, www.multilevel.ioe.ac.uk
     • ACLUSTER, software for the design and analysis of cluster randomized trials, www.update-software.com/acluster
       - Easy computation of the intraclass correlation coefficient
       - Direct adjustment approaches for univariate analysis
       - Power analysis for the three types of cluster randomized study design

  16. Literature
     • Statistics in Medicine (2001); 20 (Special Issue): Design and Analysis of Cluster Randomized Trials
     • Donner A, Klar N. Design and analysis of cluster randomisation trials (2000). Arnold Publications, London.

  17. Thank you!
