Randomization: Too Important to Gamble with

A Presentation for the Delaware Chapter of the ASA, Oct 18, 2012 • Dennis Sweitzer, Ph.D., Principal Biostatistician, Medidata Randomization Center of Excellence

Outline
• Randomized Controlled Trials
  • Basics
  • Balance
• Randomization methods
  • Complete Randomization
  • Strict Minimization
  • Permuted Block
  • Dynamic Allocation (Covariate-adaptive, not Response-Adaptive)
• Randomization Metrics
  • Balance
  • Predictability
  • Loss of Power / Loss of Efficiency
  • Secondary Imbalance: drop-outs
• Simulations comparing methods
  • Confounding site & treatment effects (small sites)
  • Overall performance
  • Discontinuing patients
  • Weighting stratification factors
  • Meta-Balance

Why randomize anyway? Some basic principles

Why Gold Standard? Randomized Controlled Trial • Trial: Prospective & Specific • Controlled: • Comparison with Control group • (placebo or active) • Controlled procedures ⇒ Only Test Treatment Varies • Randomization: Minimizes biases • Allocation bias • Selection bias • Permits blinding

Eliminating Bias ¿ The Fact of bias ? • (conscious, unconscious, or instinctive) ¿ The Question of bias ? • Always second-guessing • Critics will think of unanticipated things ¡ Solution! • Treat it as a game • 1 statistician vs N clinicians • Statistician generates a random sequence • Clinicians sequentially guess each assignment • Statistician wins if the clinicians' guesses are no better than chance (NB: 75% wrong is just as bad as 75% right)

Randomization Metrics What do we want in a randomization sequence or system? • Randomness: Unpredictable ⟶ Reduce Allocation Bias (All studies) ⟶ Reduce Selection Bias (All studies) ⟶ Reduce placebo effects (Blinded studies) • Balance: “Loss of Efficiency” ⟶ Maximize statistical power ⟶ Minimize Confounding ⟶ Enhance Credibility (Face Validity)

Balancing

Balanced Study • Equal allocation between treatment arms • Maximizes Statistical Power [Figure: equal-sized Test and Control arms]

Imbalanced • Statistical power limited by smallest arm • 36-subject simulation with Complete Randomization ⟶ average loss ≈ 1 subject; 10% lose ≥ 2 subjects • Can add 2 to compensate • BUT only large imbalances have much effect on statistical power • Severe imbalances are rare in large studies • Pr{worse than 60:40 split} for: n=25 ⟶ <42%; n=100 ⟶ <4.4%; n=400 ⟶ 0.006% • Resulting in light weight results….
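The "effective loss" idea can be checked by hand. The following is a minimal sketch, not the simulation used for the slides. It assumes a two-arm comparison of means with equal variances, so the variance of the treatment difference is proportional to 1/n_test + 1/n_control. Under that assumption a study split n_test : n_control is as informative as a balanced study with (n_test - n_control)²/N fewer subjects. The function names are illustrative only.

```python
from math import comb

def effective_loss(n_test: int, n_control: int) -> float:
    """Subjects effectively lost versus a perfectly balanced study of the same size."""
    n = n_test + n_control
    # Var(difference) is proportional to 1/n_test + 1/n_control; a balanced study
    # with 4*n_test*n_control/n subjects has the same variance.
    return n - 4 * n_test * n_control / n  # equals (n_test - n_control)**2 / n

def loss_distribution(n: int) -> dict[float, float]:
    """Exact distribution of the effective loss under complete 1:1 randomization."""
    dist: dict[float, float] = {}
    for k in range(n + 1):
        p = comb(n, k) / 2**n          # Pr{k of n subjects land on Test}
        loss = effective_loss(k, n - k)
        dist[loss] = dist.get(loss, 0.0) + p
    return dist

dist = loss_distribution(36)
mean_loss = sum(loss * p for loss, p in dist.items())
print(f"mean effective loss for n=36: {mean_loss:.3f}")
```

Under these assumptions the mean loss is exactly one subject for any study size, which agrees with the "average loss ≈ 1 subject" quoted for the 36-subject simulation.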

(NB: Planned Imbalance) 1:1 randomization maximizes power per patient But there are other considerations • Utility: • Need 100 patients on drug to monitor safety • Study only requires 60 (30/arm) • 2:1 randomization ⟶ 100 Test & 50 Placebo • Motivation: • Better enrollment if 75% chance of Test drug (3:1) • Ethics: • 85 Placebo + 255 Test vs. 125 Placebo + 125 Test
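One reading of the ethics example, assuming a comparison of means with equal variance σ² in both arms (an assumption, not stated on the slide): the precision of the treatment difference depends on 1/n_T + 1/n_P, and the two designs are nearly equivalent on that measure.

```latex
\operatorname{Var}(\bar{y}_T - \bar{y}_P) = \sigma^2\left(\frac{1}{n_T} + \frac{1}{n_P}\right)
\qquad
\text{3:1: } \frac{1}{255} + \frac{1}{85} = \frac{4}{255} \approx 0.0157
\qquad
\text{1:1: } \frac{1}{125} + \frac{1}{125} = \frac{2}{125} = 0.0160
```

So, under this assumption, 340 patients at 3:1 give about the same precision as 250 patients at 1:1, with 40 fewer patients on placebo.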

Imbalance • Overall balance • Only an issue for small studies • Subgroup Balance • Fixed size studies can have variable sized subgroups ⟶ Increased risk of underpowered subgroups

Effective Loss of Sample Size • More than half of N=36 studies effectively “lost” 2 to 6 subjects because of imbalances • Effective Loss = reduction of power expressed as a reduction in sample size • Simulations of 36 and 18 subjects, with males as a stratum at 33% of the population, randomized 1:1 (complete randomization) [Figure: Placebo vs Test allocation within Males and Females]

Bad Imbalance! Leads to conversations like: • “Higher estrogen levels in patients on Test Treatment??” • “ANCOVA showed no differences in estrogen levels due to treatment” • “Hmm…?” • Credibility….. • Treatment imbalances within factors ⟶ spurious findings….. [Figure: Placebo vs Test allocation within Males and Females]

Randomization Methods (See Animated Powerpoint Slides…)

Randomization: 4 methods • Complete Randomization (classic approach) • Strict Minimization • Permuted Block (frequently used) • Dynamic Allocation (gaining in popularity)

Complete Randomization • Same probability for every assignment • Ignores treatment imbalances • No restrictions on treatment assignments • Advantages: simple; robust against selection & accidental bias; maximum unpredictability • Disadvantage: high likelihood of imbalances (smaller samples)

Minimization • Strict Minimization assigns each patient to the currently under-represented arm

Minimization • Strict Minimization rebalances the Arms • BUT at a cost in predictability • Random only when treatments are currently balanced

Permuted Block • Blocks of patients (1, 2, or 3 per treatment); here: 2:2 allocation • Balanced • Some predictability (unless blocks are incomplete: more strata ⟶ more incomplete blocks) [Figure: example 2:2 blocks such as T T P P, T P P T, P P T T, T P T P; in “T P P ?” the last assignment is predictable; “T P P *” is an incomplete block]
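A minimal sketch of permuted-block allocation for two arms, assuming equal allocation within each block. The function name and the `per_arm` parameter are illustrative, not taken from any particular randomization system.

```python
import random

def permuted_block_sequence(n_subjects, per_arm=2, arms=("T", "P"), rng=None):
    """Build an allocation sequence from independently shuffled balanced blocks."""
    rng = rng or random.Random()
    sequence = []
    while len(sequence) < n_subjects:
        block = list(arms) * per_arm   # e.g. ['T', 'P', 'T', 'P'] for 2:2
        rng.shuffle(block)             # random permutation of the block
        sequence.extend(block)
    # The final block is left incomplete when n_subjects is not a multiple
    # of the block length.
    return sequence[:n_subjects]

print("".join(permuted_block_sequence(14, per_arm=2, rng=random.Random(2012))))
```

Every completed block is exactly balanced, which is also why the last assignment in a block can be guessed once the earlier ones are known.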

Dynamic Allocation • Biases randomization toward the under-represented arm • Unpredictable • Almost Balanced

Dynamic Allocation • Complete Randomization • Optimizes Unpredictability • Ignores Balance • Strict Minimization • Optimizes Balance • Ignores Predictability • Dynamic Allocation • 2nd Best Probability Parameter • Controls Balance vs. Predictability • Tradeoff

Dynamic Allocation Flexibility • 2nd Best Probability = 0 ⟶ Strict Minimization

Dynamic Allocation Flexibility • 2nd Best Probability = 0.5 ⟶ Complete Randomization (for 2 treatment arms)
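The role of the 2nd Best Probability can be sketched for two arms without stratification. This is an illustrative simplification, not a production algorithm: when the arms are tied the assignment is a fair coin flip, otherwise the over-represented ("second best") arm is chosen with probability `second_best_prob`.

```python
import random

def dynamic_allocation(n_subjects, second_best_prob, rng=None):
    """Two-arm biased-coin allocation controlled by the 2nd Best Probability."""
    rng = rng or random.Random()
    counts = {"T": 0, "P": 0}
    sequence = []
    for _ in range(n_subjects):
        if counts["T"] == counts["P"]:
            arm = rng.choice(["T", "P"])            # balanced: fair coin
        else:
            best = min(counts, key=counts.get)      # under-represented arm
            second = "P" if best == "T" else "T"
            arm = second if rng.random() < second_best_prob else best
        counts[arm] += 1
        sequence.append(arm)
    return sequence

for p in (0.0, 0.15, 0.5):
    print(p, "".join(dynamic_allocation(20, p, random.Random(7))))
```

With `second_best_prob=0.0` the imbalance never exceeds one (strict minimization), and with `0.5` every assignment is a fair coin flip (complete randomization).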

Stratification Factors

Stratification Factors • Factors ≣ Main Effects • Strata ≣ 1st Order Interactions • Randomizing a 25 yo Male: To PLA ⟶ Worsens Male balance; To Test ⟶ Worsens 18-35 yo balance [Figure: grid of 6 strata (Males, Females × 18-35 yo, 35-65 yo, >65 yo) with Pla/Test counts; marginal balance over both sexes and over all ages; overall balance. Balance within 6 strata?]

Permuted Block Stratified Randomization • Only balances within strata • Most strata will have incomplete blocks • Imbalances accumulate at margins [Figure: grid of strata (Males, Females × 18-35 yo, 35-65 yo, >65 yo) filled with partially completed permuted blocks, e.g. “T P P *”, where * marks an unfilled slot; margins over both sexes and over all ages show the accumulated Pla/Test counts]

Minimization & Marginal Balance • Only balances on margins • Useful if too many strata • Appropriate for a main effects analysis (i.e., no interactions) [Figure: grid of 6 strata (Males, Females × 18-35 yo, 35-65 yo, >65 yo) with Pla/Test counts; marginal balance over both sexes and over all ages; overall balance. Balance within 6 strata?]

Stratification & Dynamic Allocation • DA uses a weighted combination of: • Overall balance • Marginal balances • Strata balance • ⇒ Flexible [Figure: grid of 6 strata (Males, Females × 18-35 yo, 35-65 yo, >65 yo) with Pla/Test counts; marginal balance over both sexes and over all ages; overall balance. Balance within 6 strata?]
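The weighted combination can be sketched as a score computed for each candidate arm. Everything below is a hypothetical illustration: the weight names, the use of absolute count differences as the imbalance measure, and the patient representation are assumptions, not the scoring used by any specific system.

```python
import random
from collections import Counter

def imbalance_if_assigned(history, patient, arm, weights):
    """Weighted imbalance after hypothetically assigning `arm` to `patient`.

    history: list of (patient_dict, arm) pairs; patient: dict of factor -> level.
    weights: hypothetical keys "overall", "stratum", plus one key per factor.
    """
    trial = history + [(patient, arm)]

    def diff(match):
        c = Counter(a for p, a in trial if match(p))
        return abs(c["T"] - c["P"])

    overall = diff(lambda p: True)
    # One marginal term per factor, restricted to the new patient's level.
    marginal = sum(weights[f] * diff(lambda p, f=f: p[f] == patient[f]) for f in patient)
    stratum = diff(lambda p: p == patient)
    return weights["overall"] * overall + marginal + weights["stratum"] * stratum

def allocate(history, patient, weights, second_best_prob, rng):
    """Pick the lower-scoring arm, or the other one with the 2nd Best Probability."""
    scores = {arm: imbalance_if_assigned(history, patient, arm, weights) for arm in ("T", "P")}
    if scores["T"] == scores["P"]:
        return rng.choice(["T", "P"])
    best = min(scores, key=scores.get)
    second = "P" if best == "T" else "T"
    return second if rng.random() < second_best_prob else best

# The 25 yo male example: males lean Placebo, the 18-35 group leans Test.
history = [({"sex": "M", "age": "35-65"}, "P"), ({"sex": "F", "age": "18-35"}, "T")]
patient = {"sex": "M", "age": "18-35"}
weights = {"overall": 1, "stratum": 1, "sex": 2, "age": 1}  # hypothetical weights
print(allocate(history, patient, weights, 0.15, random.Random(0)))
```

With equal factor weights the two arms tie in this example. Weighting sex more heavily than age, as here, makes Test the preferred arm.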

Site as a Special Subgroup

Imbalance • Overall balance • Only an issue for small studies • Subgroup Balance • Fixed size studies can have variable sized subgroups ⟶ Increased risk of underpowered subgroups • Site as special case of subgroup • Small sites ⟶ Increased risk of “monotherapy” at site ⟶ Confounding site & treatment effects ⟶ Effectively non-informative/“lost” patients • Actual vs Assumed distribution of site size

Enrollment per Center (Densities) • Data Sample • 13 Studies • 7.7 mo Average Enrollment period • 3953 Obs.Pts • 460 Listed Sites • 372 Active.Sites Size Categories:{0, 1, 2, 3, 4-7, 8-11, 12-15, 16-19, 20-29, 30-39, 40-49, 50-59, 60-79, 80-99, 100-149, 150-199, ≥200 }

Enrollment per Site (#Sites) • Data Sample • 13 Studies • 7.7 mo Average Enrollment period • 3953 Obs.Pts • 460 Listed Sites • 372 Active.Sites # Sites per Size Category {0, 1, 2, 3, 4-7, 8-11, 12-15, 16-19, 20-29, 30-39, 40-49, 50-59, 60-79, 80-99, 100-149, 150-199, ≥200 }

Site Enrollment Simulation Simulation based on Observations • 4 mo Enrollment Period • Enrollment ~ Poisson distribution μ = Obs. Pts/mo (active sites) or μ ≈ 0.5 / Enrollment period (non-active sites) • Randomize using CR, PB(2:2), or DA(0.15). • Confounded Pts ≣ Patients at centers with only one treatment ⇒ treatment & center effects are confounded
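The structure of such a simulation can be sketched as follows. This is an assumption-laden illustration rather than the study's code: site means are made up, randomization is assumed to run independently within each site, and only CR and PB(2:2) are shown. The Poisson sampler uses Knuth's multiplication method because the Python standard library has none.

```python
import math
import random

def poisson(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def complete_randomization(n, rng):
    return [rng.choice("TP") for _ in range(n)]

def permuted_block_2_2(n, rng):
    seq = []
    while len(seq) < n:
        block = list("TTPP")
        rng.shuffle(block)
        seq.extend(block)
    return seq[:n]

def confounded_patients(site_means, allocate, rng):
    """Count patients at sites where every enrolled patient got the same arm."""
    confounded = 0
    for mu in site_means:
        n = poisson(mu, rng)             # patients enrolled at this site
        arms = allocate(n, rng)
        if n > 0 and len(set(arms)) == 1:
            confounded += n              # treatment and site effects confounded
    return confounded

# Hypothetical mix: many low-enrolling sites plus a few larger ones.
site_means = [0.5] * 30 + [6.0] * 10
for name, method in (("CR", complete_randomization), ("PB(2:2)", permuted_block_2_2)):
    print(name, confounded_patients(site_means, method, random.Random(2012)))
```

A site with a single patient is always confounded regardless of method. The methods differ only at sites with two or more patients, which is why studies with many small sites are the ones affected.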

Results [Table: mean ± SD (80% C.I.)] • Affected studies had many sites with low enrollment • Studies with fewer sites (and more pts at each) were rarely affected • Dynamic Allocation reduced confounding slightly more effectively than Permuted Block

Randomization Metrics

Randomization Metrics How do we measure “badness” of a randomization sequence or system? • Predictability • Goal: an observer can guess no better than chance ⟶ Score based on Blackwell-Hodges guessing rule • Easily calculated • Imbalance • Imbalance ⟶ reduced statistical power ⟶ “Loss of Efficiency” • Measure as effective loss in number of subjects

Blackwell-Hodges • Use Blackwell-Hodges guessing rule • Directly corresponds to game interpretation • Investigator always guesses the most probable treatment assignment, based on past assignments • “Bias factor F”: F ≣ abs(# Correct – Expected # Correct by chance alone) • Measures potential for selection bias • Modifications: • Limits on knowledge of investigator (e.g., can only know prior treatment allocation at own site) • Score as percentage, e.g., Score ≣ abs(% Correct – 50%)

Blackwell-Hodges Scoring (1) For treatment sequence “TCCC” • Initial guess ⟶ Expectation = ½ • “T” ⟶ Imbalance = +1 ⟶ Guess C ⟶ Correct • “TC” ⟶ Imbalance = 0 ⟶ Guess either ⟶ Expectation = ½ • “TCC” ⟶ Imbalance = -1 ⟶ Guess T ⟶ Wrong • “TCCC” ⟶ # Correct = ½ + 1 + ½ + 0 = 2 • Score = # Correct – Expected # Correct = 2 – 2 = 0
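The scoring steps above can be sketched in a few lines. The function name is illustrative, and the sketch assumes two arms, an observer who sees all prior assignments, and half credit whenever the arms are tied.

```python
def blackwell_hodges(sequence, arms=("T", "C")):
    """Return (# correct guesses, bias factor F) under the guessing rule."""
    a, b = arms
    counts = {a: 0, b: 0}
    correct = 0.0
    for actual in sequence:
        if counts[a] == counts[b]:
            correct += 0.5                      # tied: either guess, expect 1/2
        else:
            guess = a if counts[a] < counts[b] else b   # under-represented arm
            correct += 1.0 if guess == actual else 0.0
        counts[actual] += 1
    expected = len(sequence) / 2                # correct by chance alone
    return correct, abs(correct - expected)

print(blackwell_hodges("TCCC"))
print(blackwell_hodges("TCCT"))
```

For “TCCC” this gives 2 correct guesses and a score of 0. For “TCCT” it gives 3 correct and a score of 1.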

Blackwell-Hodges Scoring (2) For treatment sequence “TCCC” • “TCCC” ⟶ # Correct = ½ + 1 + ½ + 0 = 2 • Complete Randomization ⇒ Pr{“TCCC”} = 1/16 • Dynamic Allocation (p=0.15) ⇒ Pr{“TCCC”} = 0.5 * 0.85 * 0.5 * 0.15 = 0.031875 • Permuted Block (length ≤ 4) ⇒ Pr{“TCCC”} = 0 • Strict Minimization ⇒ Pr{“TCCC”} = 0

Blackwell-Hodges Scoring (3) • Sequence “TCCT” • # Correct= ½ + 1 + ½ + 1 = 3 • Score = 3 – 2 = 1 • Complete Randomization⇒ Pr{TCCT}= 1/16 • Strict Minimization ⇒ Pr{TCCT} = ½*1*½*1 = ¼ • Permuted Block⇒ Pr{TCCT} = 1/6 • (NB: 6 permutations of TTCC) • Dynamic Allocation (2nd best prob.=0.15) ⇒ Pr{TCCT} = 0.5 * 0.85* 0.5 * 0.85 = 0.180625
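The sequence probabilities quoted for Dynamic Allocation follow from multiplying one factor per assignment. A minimal two-arm sketch, with an illustrative function name and no stratification:

```python
def da_sequence_probability(sequence, second_best_prob, arms=("T", "C")):
    """Probability of an exact sequence under two-arm Dynamic Allocation."""
    a, b = arms
    counts = {a: 0, b: 0}
    prob = 1.0
    for actual in sequence:
        if counts[a] == counts[b]:
            prob *= 0.5                                   # tied arms: fair coin
        else:
            best = a if counts[a] < counts[b] else b      # under-represented arm
            prob *= (1 - second_best_prob) if actual == best else second_best_prob
        counts[actual] += 1
    return prob

for seq in ("TCCC", "TCCT"):
    for p in (0.0, 0.15, 0.5):
        print(seq, p, da_sequence_probability(seq, p))
```

Setting the parameter to 0 gives the Strict Minimization values and 0.5 gives the Complete Randomization value of 1/16, matching the two limiting cases of Dynamic Allocation.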

Warning! Local Predictability ONLY • Blackwell-Hodges assesses potential selection bias, given a known imbalance • But which imbalance(s)? (Overall imbalance? Within strata? Within factors?) • Henceforth: only use imbalance within strata • Proxy for center • Assume observer only knows imbalance within “his center” • Simple & unambiguous • Requires some caution in interpretation

Loss of Efficiency • Source: “Inference in Covariate-Adaptive allocation”, Elsa Valdés Márquez & Nick Fieller, EFSPI Adaptive Randomisation Meeting, Brussels, 7 December 2006, http://www.efspi.org/PDF/activities/international/adaptive-rando-docs/2ValdesMarquez.pdf • Loss can be expressed as an equivalent # of patients • In a 100-patient study: Loss of Efficiency = 5 ⇒ a perfectly designed study would require only 95

RCT vs DOE • Designed Experiment (DOE) ⟶ Select z and covariate values to minimize L_n • RCT ⟶ Select only z (no control of covariates) • X ≣ design matrix ⟶ n rows, 1 per pt ⟶ K columns, 1 per covariate • z ≣ Treatment assignments

Loss of Efficiency (Márquez & Fieller) Dynamic Allocation Sequentially assign Z to minimize

Loss of Efficiency (Márquez & Fieller)

Randomization Performance Simulations

Simulation Set up

Note on Figures • Plot B-H score vs Loss of Efficiency • Median + 80% C.I. ⇒ 10% lower & 10% higher

Simulation Results (1) • Simulation: 48 subjects, 2 stratification factors, 6 strata, uneven sizes • (DA) Dynamic Allocation, (PB) Permuted Block, (CR) Completely Random • Notation: DA(2nd Best Probability), PB(Allocation Ratio) • Both DA & PB are stratified • Simulated subjects were randomized by all 3 methods • ⟵ Averages of metrics • But for managing risk, need the worst case ⟶ 80% Confidence Intervals

Randomizations Plotted by Metrics [Figure: randomization methods plotted by B-H score and Loss of Efficiency; point labels: PB(1:1), DA(0); PB(2:2), DA(0.15); “Essentially Strict Minimization”; PB(4:4), DA(0.33); PB(8:8); DA(0.5) ≣ CR; PB ⟶ CR]
