Randomization: Too Important to Gamble with. A Presentation for the Delaware Chapter of the ASA, Oct 18, 2012. Dennis Sweitzer, Ph.D., Principal Biostatistician, Medidata Randomization Center of Excellence
Outline
• Randomized Controlled Trials
  • Basics
  • Balance
• Randomization methods
  • Complete Randomization
  • Strict Minimization
  • Permuted Block
  • Dynamic Allocation (Covariate-adaptive, not Response-Adaptive)
• Randomization Metrics
  • Balance
  • Predictability
  • Loss of Power / Loss of Efficiency
  • Secondary Imbalance: drop-outs
• Simulations comparing methods
  • Confounding site & treatment effects (small sites)
  • Overall performance
  • Discontinuing patients
  • Weighting stratification factors
  • Meta-Balance
Why randomize anyway? Some basic principles
Why Gold Standard? Randomized Controlled Trial • Trial: Prospective & Specific • Controlled: • Comparison with Control group • (placebo or active) • Controlled procedures ⇒ Only Test Treatment Varies • Randomization: Minimizes biases • Allocation bias • Selection bias • Permits blinding
Eliminating Bias ¿ The Fact of bias ? • (conscious, unconscious, or instinctive) ¿ The Question of bias ? • Always 2nd guessing • Critics will think of unanticipated things ¡ Solution! • Treat it as a game • 1 statistician vs N clinicians • Statistician generates a random sequence • Clinicians sequentially guess each assignment • Statistician wins if the clinicians' guesses are no better than chance (NB: 75% wrong is just as bad as 75% right)
Randomization Metrics: What do we want in a randomization sequence or system?
• Randomness (Unpredictable) ⟶ Reduce Allocation Bias (All studies) ⟶ Reduce Selection Bias (All studies) ⟶ Reduce placebo effects (Blinded studies)
• Balance (“Loss of Efficiency”) ⟶ Maximizes statistical power ⟶ Minimize Confounding ⟶ Enhance Credibility (Face Validity)
Balanced Study • Equal allocation between treatment arms • Maximizes Statistical Power (Figure: Test and Control arms in balance)
Imbalanced • Statistical power limited by smallest arm • 36-subject simulation with Complete Randomization ⟶ average loss ≈ 1 subject; 10% lose ≥ 2 subjects • Can add 2 to compensate • BUT only large imbalances have much effect on statistical power • Severe imbalances are rare in large studies • Pr{worse than 60:40 split} for: n=25 ⟶ <42%; n=100 ⟶ <4.4%; n=400 ⟶ 0.006% • Resulting in lightweight results…
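The split probabilities and the "lost subjects" idea on this slide can be checked with a few lines of exact binomial arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the effective-loss formula (n − 4·n_T·n_C/n, which simplifies to (n_T − n_C)²/n) is an assumed definition based on the variance of a difference in means, not one stated on the slide. Exact values depend on whether the 60:40 boundary itself is counted, so they need not reproduce the slide's bounds.

```python
from math import comb

def prob_split_at_least(n, num=3, den=5, inclusive=True):
    """Exact Pr{larger arm holds at least (or more than) num/den of n subjects}
    under complete 1:1 randomization. Integer arithmetic avoids rounding issues."""
    def extreme(k):
        larger = max(k, n - k)
        if inclusive:
            return larger * den >= num * n
        return larger * den > num * n
    return sum(comb(n, k) for k in range(n + 1) if extreme(k)) / 2 ** n

def effective_loss(n_test, n_control):
    """Subjects 'lost' relative to a balanced two-arm study of the same size.
    Assumed definition: n - 4*n_T*n_C/n = (n_T - n_C)**2 / n."""
    n = n_test + n_control
    return (n_test - n_control) ** 2 / n

def expected_loss_complete_randomization(n):
    """Average effective loss over all 2**n equally likely assignment sequences."""
    return sum(comb(n, k) * effective_loss(k, n - k) for k in range(n + 1)) / 2 ** n

for n in (25, 100, 400):
    print(n, prob_split_at_least(n))
print(expected_loss_complete_randomization(36))
```

Under this assumed definition the expected loss for complete randomization works out to one subject regardless of n, which agrees with the "average loss ≈ 1 subject" figure quoted for the 36-subject simulation.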
(NB: Planned Imbalance) 1:1 randomization maximizes power per patient But there are other considerations • Utility: • Need 100 patients on drug to monitor safety • Study only requires 60 (30/arm) • 2:1 randomization ⟶ 100 Test & 50 Placebo • Motivation: • Better enrollment if 75% chance of Test drug (3:1) • Ethics: • 85 Placebo + 255 Test vs. 125 Placebo + 125 Test
Imbalance • Overall balance • Only an issue for small studies • Subgroup Balance • Fixed size studies can have variable sized subgroups ⟶ Increased risk of underpowered subgroups
Effective Loss of Sample Size • More than half of N=36 studies effectively “lost” 2 to 6 subjects because of imbalances • Effective Loss = Reduction of Power expressed as Reduction in Sample Size • Simulations of 36 and 18 subjects, with males as a stratum at 33% of the population, randomized 1:1 (complete randomization) (Figure: Test vs. Placebo/Control allocation within Males and Females)
Bad Imbalance! Leads to conversations like: • “Higher estrogen levels in patients on Test Treatment??” • “ANCOVA showed no differences in estrogen levels due to treatment” • “Hmm…?” • Credibility… • Treatment imbalances within factors ⟶ spurious findings (Figure: Pla vs. Test allocation within Males and Females)
Randomization Methods (See Animated Powerpoint Slides…)
Randomization: 4 methods • Complete Randomization (classic approach) • Strict Minimization • Permuted Block (frequently used) • Dynamic Allocation (gaining in popularity)
Complete Randomization • Same probability for every assignment • Ignores treatment imbalances • No restrictions on treatment assignments • Advantages: simple; robust against selection & accidental bias; maximum unpredictability • Disadvantage: high likelihood of imbalances (smaller samples)
Minimization • Strict Minimization assigns each patient to the under-represented arm, which rebalances the arms • BUT at a cost in predictability • Random only when treatments are currently balanced
Permuted Block • Blocks of patients (1, 2, or 3 per treatment); here: 2:2 allocation • Example blocks: TTPP TPPT PPTT TPP? TPTP PTTP PTPT TPPT (the “?” closing block TPP? can only be T) • Some predictability (unless blocks are incomplete, e.g. TPP*; more strata ⟶ more incomplete blocks) • Balanced
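A permuted-block sequence of the kind shown here can be generated by shuffling one balanced block at a time. This is a minimal sketch with hypothetical names (`permuted_block_sequence`, `per_arm`); it handles a single stratum and equal allocation only.

```python
import random

def permuted_block_sequence(n_subjects, per_arm=2, arms=("T", "P"), rng=None):
    """Return n_subjects assignments built from shuffled balanced blocks.
    per_arm=2 with two arms gives the 2:2 blocks of length 4 used on the slide."""
    rng = rng or random.Random()
    sequence = []
    while len(sequence) < n_subjects:
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)          # each permutation of the block is equally likely
        sequence.extend(block)
    return sequence[:n_subjects]    # the last block may be left incomplete

print("".join(permuted_block_sequence(16, rng=random.Random(2012))))
```

Every completed block is balanced, so the final assignment in a block is always predictable to anyone who knows the block length and the earlier assignments in that block.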
Dynamic Allocation • Biases randomization toward the under-represented arm • Unpredictable • Almost Balanced
Dynamic Allocation
• Complete Randomization
  • Optimizes Unpredictability
  • Ignores Balance
• Strict Minimization
  • Optimizes Balance
  • Ignores Predictability
• Dynamic Allocation
  • 2nd Best Probability Parameter
  • Controls the Balance vs. Predictability Tradeoff
Dynamic Allocation Flexibility: 2nd Best Probability = 0 ⟶ Strict Minimization
Dynamic Allocation Flexibility: 2nd Best Probability = 0.5 ⟶ Complete Randomization (for 2 treatment arms)
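The role of the 2nd Best Probability can be sketched as a biased coin for two arms without stratification. The function names below are hypothetical, and this is an illustration of the idea rather than the presenter's implementation: when the arms are balanced the assignment is a fair coin flip, otherwise the over-represented ("2nd best") arm is chosen with probability `second_best_prob`.

```python
import random

def dynamic_allocation_step(n_test, n_control, second_best_prob, rng):
    """Assign one subject given current arm counts (two arms, no strata)."""
    if n_test == n_control:
        return "T" if rng.random() < 0.5 else "C"
    # "best" is the under-represented arm; "second" is the over-represented one.
    best, second = ("C", "T") if n_test > n_control else ("T", "C")
    return second if rng.random() < second_best_prob else best

def randomize(n_subjects, second_best_prob, seed=None):
    """Generate a full assignment sequence as a string such as 'TCCT...'."""
    rng = random.Random(seed)
    counts = {"T": 0, "C": 0}
    sequence = []
    for _ in range(n_subjects):
        arm = dynamic_allocation_step(counts["T"], counts["C"], second_best_prob, rng)
        counts[arm] += 1
        sequence.append(arm)
    return "".join(sequence)

print(randomize(20, 0.0, seed=1))   # behaves as strict minimization
print(randomize(20, 0.15, seed=1))  # intermediate tradeoff
print(randomize(20, 0.5, seed=1))   # behaves as complete randomization
```

With `second_best_prob=0` the imbalance never exceeds one subject; with `0.5` every assignment is a fair coin flip regardless of the current counts.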
Stratification Factors • Factors ≣ Main Effects • Strata ≣ 1st Order Interactions • Randomizing a 25 yo Male: To PLA ⟶ Worsens Male balance; To Test ⟶ Worsens 18-35 yo balance (Figure: Pla/Test counts in a grid of sex (Males, Females) by age (18-35 yo, 35-65 yo, >65 yo), showing balance within the 6 strata, marginal balance over both sexes and over all ages, and overall balance)
Permuted Block Stratified Randomization • Only balances within strata • Most strata will have incomplete blocks • Imbalances accumulate at margins (Figure: partially filled T/P blocks in each of the 6 sex-by-age strata, with the resulting Pla/Test totals over both sexes and over all ages)
Minimization & Marginal Balance • Only balances on margins • Useful if there are too many strata • Appropriate for a main effects analysis (i.e., no interactions) (Figure: Pla/Test counts in the sex-by-age grid, showing balance within the 6 strata, marginal balance over both sexes and over all ages, and overall balance)
Stratification & Dynamic Allocation • DA uses a weighted combination of: overall balance, marginal balances, strata balance • ⇒ Flexible (Figure: Pla/Test counts in the sex-by-age grid, showing balance within the 6 strata, marginal balance over both sexes and over all ages, and overall balance)
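One way to picture the weighted combination is a score that adds up the absolute Test-vs-Pla imbalance at each level (overall, each factor margin, and the stratum) after a hypothetical assignment. The sketch below is an assumption-laden illustration: the names, the use of absolute differences, and the weights are all hypothetical, and the slides do not specify the actual scoring rule.

```python
from collections import Counter

def cells_for(subject):
    """Map a (sex, age_group) subject to the cells whose balance it affects."""
    sex, age = subject
    return {"overall": (), "sex": (sex,), "age": (age,), "stratum": (sex, age)}

def weighted_imbalance(counts, subject, arm, weights):
    """Weighted sum of |Test - Pla| over the affected cells if `arm` were assigned."""
    score = 0.0
    for level, key in cells_for(subject).items():
        test = counts[(level, key, "Test")] + (arm == "Test")
        pla = counts[(level, key, "Pla")] + (arm == "Pla")
        score += weights[level] * abs(test - pla)
    return score

def best_arm(counts, subject, weights):
    """Arm with the lowest score (ties go to 'Test' here; a real system would randomize)."""
    return min(("Test", "Pla"), key=lambda arm: weighted_imbalance(counts, subject, arm, weights))

def record(counts, subject, arm):
    """Update every affected cell after an assignment."""
    for level, key in cells_for(subject).items():
        counts[(level, key, arm)] += 1

counts = Counter()
record(counts, ("M", "35-65"), "Pla")    # males now lean Pla
record(counts, ("F", "18-35"), "Test")   # 18-35 yo now lean Test
new_subject = ("M", "18-35")             # the "25 yo Male" situation
print(best_arm(counts, new_subject, {"overall": 0, "sex": 1, "age": 0, "stratum": 0}))
print(best_arm(counts, new_subject, {"overall": 0, "sex": 0, "age": 1, "stratum": 0}))
```

In a dynamic allocation system the best arm would then be assigned with probability 1 minus the 2nd Best Probability, so the weights decide which kind of balance matters and the probability decides how strictly it is enforced.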
Site as a Special Subgroup
Imbalance • Overall balance • Only an issue for small studies • Subgroup Balance • Fixed size studies can have variable sized subgroups ⟶ Increased risk of underpowered subgroups • Site as special case of subgroup • Small sites ⟶ Increased risk of “monotherapy” at site ⟶ Confounding site & treatment effects ⟶ Effectively non-informative/“lost” patients • Actual vs. Assumed distribution of site size
Enrollment per Center (Densities) • Data Sample • 13 Studies • 7.7 mo Average Enrollment period • 3953 Obs.Pts • 460 Listed Sites • 372 Active.Sites Size Categories:{0, 1, 2, 3, 4-7, 8-11, 12-15, 16-19, 20-29, 30-39, 40-49, 50-59, 60-79, 80-99, 100-149, 150-199, ≥200 }
Enrollment per Site (#Sites) • Data Sample • 13 Studies • 7.7 mo Average Enrollment period • 3953 Obs.Pts • 460 Listed Sites • 372 Active.Sites # Sites per Size Category {0, 1, 2, 3, 4-7, 8-11, 12-15, 16-19, 20-29, 30-39, 40-49, 50-59, 60-79, 80-99, 100-149, 150-199, ≥200 }
Site Enrollment Simulation Simulation based on Observations • 4 mo Enrollment Period • Enrollment ~ Poisson distribution μ = Obs. Pts/mo (active sites) or μ ≈ 0.5 / Enrollment period (non-active sites) • Randomize using CR, PB(2:2), or DA(0.15). • Confounded Pts ≣ Patients at centers with only one treatment ⇒ treatment & center effects are confounded
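The simulation described on this slide can be sketched roughly as follows. Everything here is hypothetical (function names, the example rates, and the use of complete randomization only); the Poisson sampler is Knuth's multiplication method, chosen because the standard library has none, and it is adequate only for small means.

```python
import math
import random

def poisson(mu, rng):
    """Draw one Poisson(mu) variate (Knuth's method; fine for small mu)."""
    limit = math.exp(-mu)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def confounded_patients(monthly_rates, months, rng):
    """Simulate enrollment per site and randomize by complete randomization.
    Returns (patients at sites that saw only one treatment, total patients)."""
    confounded = total = 0
    for rate in monthly_rates:
        n = poisson(rate * months, rng)
        arms_seen = {rng.choice("TC") for _ in range(n)}
        total += n
        if n > 0 and len(arms_seen) == 1:
            confounded += n          # site and treatment effects cannot be separated
    return confounded, total

months = 4
# Hypothetical study: 10 active sites at 1.5 pts/mo, 20 non-active sites at 0.5/period.
rates = [1.5] * 10 + [0.5 / months] * 20
print(confounded_patients(rates, months, random.Random(2012)))
```

Comparing methods would mean replacing the `rng.choice` line with a per-site permuted-block or dynamic-allocation assignment and repeating the simulation many times to obtain means and intervals.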
Results (table: mean ± SD (80% C.I.)) • Affected studies had many sites with low enrollment • Studies with fewer sites (and more pts at each) were rarely affected • Dynamic Allocation reduced confounding slightly more effectively than Permuted Block
Randomization Metrics: How do we measure “badness” of a randomization sequence or system? • Predictability • Goal: an observer can guess no better than chance ⟶ Score based on Blackwell-Hodges guessing rule • Easily calculated • Imbalance • Imbalance ⟶ reduced statistical power ⟶ “Loss of Efficiency” • Measure as effective loss in number of subjects
Blackwell-Hodges
• Use the Blackwell-Hodges guessing rule
  • Directly corresponds to the game interpretation
  • Investigator always guesses the most probable treatment assignment, based on past assignments
• “Bias factor F”: F ≣ abs(# Correct – Expected # Correct by chance alone)
  • Measures potential for selection bias
• Modifications:
  • Limits on knowledge of investigator (e.g., can only know prior treatment allocation at own site)
  • Score as a percentage, e.g., Score ≣ abs(% Correct – 50%)
Blackwell-Hodges Scoring (1): treatment sequence “TCCC”
• Initial guess ⟶ Expectation = ½
• “T” ⟶ Imbalance = +1 ⟶ Guess C ⟶ Correct
• “TC” ⟶ Imbalance = 0 ⟶ Guess either ⟶ Expectation = ½
• “TCC” ⟶ Imbalance = −1 ⟶ Guess T ⟶ Wrong
• “TCCC” ⟶ # Correct = ½ + 1 + ½ + 0 = 2
• Score = # Correct − Expected # Correct by chance (4 × ½ = 2) = 2 − 2 = 0
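The scoring walk-through above can be written as a short function. This is a sketch with hypothetical names; it assumes two arms, 1:1 allocation, and an observer who sees every prior assignment, and it credits ½ whenever the arms are balanced and the guess is a coin flip.

```python
def blackwell_hodges_correct(sequence, arms=("T", "C")):
    """Expected number of correct guesses when the observer always guesses
    the currently under-represented arm (and guesses at random when balanced)."""
    first, second = arms
    imbalance = 0        # (# first arm) - (# second arm) so far
    correct = 0.0
    for actual in sequence:
        if imbalance == 0:
            correct += 0.5
        else:
            guess = second if imbalance > 0 else first
            correct += 1.0 if guess == actual else 0.0
        imbalance += 1 if actual == first else -1
    return correct

def bias_factor(sequence, arms=("T", "C")):
    """F = abs(# correct - expected # correct by chance alone)."""
    return abs(blackwell_hodges_correct(sequence, arms) - len(sequence) / 2)

print(blackwell_hodges_correct("TCCC"), bias_factor("TCCC"))  # 2.0 0.0
print(blackwell_hodges_correct("TCCT"), bias_factor("TCCT"))  # 3.0 1.0
```

Restricting the observer's knowledge (for example, to assignments within one site) would amount to calling the function on each site's subsequence separately.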
Blackwell-Hodges Scoring (2): treatment sequence “TCCC”
• “TCCC” ⟶ # Correct = ½ + 1 + ½ + 0 = 2
• Complete Randomization ⇒ Pr{TCCC} = 1/16
• Dynamic Allocation (2nd best prob. = 0.15) ⇒ Pr{TCCC} = 0.5 × 0.85 × 0.5 × 0.15 = 0.031875
• Permuted Block (length ≤ 4) ⇒ Pr{TCCC} = 0
• Strict Minimization ⇒ Pr{TCCC} = 0
Blackwell-Hodges Scoring (3): treatment sequence “TCCT”
• # Correct = ½ + 1 + ½ + 1 = 3
• Score = 3 – 2 = 1
• Complete Randomization ⇒ Pr{TCCT} = 1/16
• Strict Minimization ⇒ Pr{TCCT} = ½ × 1 × ½ × 1 = ¼
• Permuted Block ⇒ Pr{TCCT} = 1/6 (NB: 6 permutations of TTCC)
• Dynamic Allocation (2nd best prob. = 0.15) ⇒ Pr{TCCT} = 0.5 × 0.85 × 0.5 × 0.85 = 0.180625
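The sequence probabilities quoted on the scoring slides can be reproduced with exact fractions. The sketch below uses hypothetical function names and assumes two arms with no stratification; the biased-coin function covers complete randomization (2nd best probability ½), strict minimization (0), and dynamic allocation (0.15), while the block function handles only a single complete block.

```python
from fractions import Fraction
from math import comb

def prob_biased_coin(sequence, second_best_prob, arms=("T", "C")):
    """Probability of a sequence when the over-represented arm is chosen with
    probability second_best_prob and balanced states use a fair coin."""
    first, second = arms
    p = Fraction(second_best_prob)
    prob = Fraction(1)
    imbalance = 0                      # (# first arm) - (# second arm) so far
    for actual in sequence:
        if imbalance == 0:
            prob *= Fraction(1, 2)
        else:
            best = second if imbalance > 0 else first
            prob *= (1 - p) if actual == best else p
        imbalance += 1 if actual == first else -1
    return prob

def prob_single_block(sequence, per_arm=2, arms=("T", "C")):
    """Probability of a sequence that must form exactly one permuted block."""
    balanced = (len(sequence) == 2 * per_arm
                and all(sequence.count(arm) == per_arm for arm in arms))
    return Fraction(1, comb(2 * per_arm, per_arm)) if balanced else Fraction(0)

for seq in ("TCCC", "TCCT"):
    print(seq,
          prob_biased_coin(seq, Fraction(1, 2)),          # complete randomization
          prob_biased_coin(seq, 0),                       # strict minimization
          float(prob_biased_coin(seq, Fraction(3, 20))),  # dynamic allocation, 0.15
          prob_single_block(seq))                         # permuted block 2:2
```

Passing the 2nd best probability as a `Fraction` (3/20 rather than the float 0.15) keeps the products exact, so the results match the slide values digit for digit.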
Warning! Local Predictability ONLY
• Blackwell-Hodges assesses potential selection bias, given a known imbalance!
• But which imbalance(s)?? (Overall imbalance? Within strata? Within factors?)
• Henceforth: only use imbalance within strata
  • Proxy for center
  • Assume the observer only knows the imbalance within “his center”
  • Simple & unambiguous
• Requires some caution in interpretation
Loss of Efficiency • Source: “Inference in Covariate-Adaptive allocation”, Elsa Valdés Márquez & Nick Fieller, EFSPI Adaptive Randomisation Meeting, Brussels, 7 December 2006, http://www.efspi.org/PDF/activities/international/adaptive-rando-docs/2ValdesMarquez.pdf • Loss can be expressed as an equivalent # of patients • In a 100 patient study: Loss of Efficiency = 5 ⇒ A perfectly designed study would require only 95 patients
RCT vs DOE
• Designed Experiment (DOE) ⟶ Select z and covariate values to minimize L_n
• RCT ⟶ Select only z (no control of covariates)
• X ≣ design matrix ⟶ n rows, 1 per pt ⟶ K columns, 1 per covariate
• z ≣ Treatment assignments
Loss of Efficiency (Márquez & Fieller) Dynamic Allocation Sequentially assign Z to minimize
Note on Figures • Plot B-H score vs. Loss of Efficiency • Median + 80% C.I. ⇒ 10% lower & 10% higher
Simulation Results (1)
• Simulation: 48 subjects, 2 stratification factors, 6 strata, uneven sizes
• Methods: (DA) Dynamic Allocation, (PB) Permuted Block, (CR) Completely Random; both DA & PB are stratified
• Notation: DA(2nd Best Probability), PB(Allocation Ratio)
• Simulated subjects were randomized by all 3 methods
• ⟵ Averages of Metrics; but for managing risk, need Worst Case: 80% Confidence Intervals ⟶
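Randomizations Plotted by Metrics (Figure: randomization methods plotted by metric, with points labeled PB(1:1), PB(2:2), PB(4:4), PB(8:8), DA(0), DA(0.15), DA(0.33), DA(0.5), and CR) • PB(1:1), DA(0): essentially Strict Minimization • DA(0.5) ≣ CR • PB ⟶ CR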