1. Stratification: Are you a lumper or a splitter?
2. …and if you are a splitter, how should you split the data and when?
3. Stratification
A procedure in which factors known to be associated with the response (prognostic factors) are taken into account in the design (e.g., in the randomization).
Recall, permuted block randomization is used to achieve balance on the number in each treatment arm over time. Stratification is used to achieve comparability between groups with respect to important prognostic factors.
Pre-stratification refers to a stratified design; post-stratification refers to the analysis.
4. Note: This is different from stratified random sampling where the population might be divided up into strata, e.g., census tracts, and each stratum is sampled randomly for some pre-specified sample size.
5. Possible Stratification Scenarios
Pre- plus post-stratification
Pre-stratification only
Post-stratification only
Neither pre- nor post- stratification
Regression adjustment with or without stratification
6. Advantages
Prevents “accidental bias” resulting from mal-distribution of important prognostic variables
Increases precision (if stratifying variables are related to outcome)
Facilitates subgroup analysis
Results less subject to criticism
7. International Conference on Harmonization (ICH) Guideline (E-9 Document)
“Stratification by important prognostic factors measured at baseline (e.g., severity of disease, age, sex, etc.) may sometimes be valuable in order to promote balanced allocation within strata; this has greater potential benefit in small trials.”
8. Disadvantages
Primarily relate to the additional administrative burden of implementing the randomization.
May have several randomization schedules
Measurements to define stratum must be carefully made prior to randomization
9. What Stratification Does Not Do
1. Guarantee adequate power to make within-stratum comparisons
2. Eliminate the need to carry out covariate-adjusted analysis
Chance imbalances can still occur on other covariates
The analysis should be consistent with the design
10. Characteristics of Patients in Trial to Prevent Toxoplasmic Encephalitis
CD4+ count (cells/mm³) 96.1 97.4
AIDS OI (%) 35.2 22.0
Karnofsky Score 89.5 89.7
Hemoglobin (g/dl) 12.6 12.7
11. “In view of the major imbalance between the groups in presentation at baseline with AIDS defining OIs, the rigorousness of the allocation procedures need to be supported in detail if the results are to be regarded as credible.”
12. Example
How a small difference in an important prognostic variable can bias treatment differences.
13. Baseline Characteristics
Age (years) 37.8 ± 8.5 37.5 ± 7.8
CD4+ 75.1 ± 86.2 71.1 ± 84.3
Karnofsky Score 87.2 ± 11.9 85.3 ± 11.9
Prior AIDS Diagnosis (%) 64.8 66.7
14. Frequency Distribution of Karnofsky Score by Treatment Group
Karnofsky score ddI (%) ddC (%)
< 70 4.8 6.8
70 - 79 10.0 11.8
80 - 89 21.3 24.1
90 - 99 36.1 36.7
100+ 27.8 20.6
15. Death Rate by Karnofsky Level
< 70 169.8
70 - 79 84.0
80 - 89 41.0
90 - 99 31.9
100+ 18.4
16. Comparison of Unadjusted and Adjusted Relative Risk Estimates
Unadjusted 0.79 0.11
Adjusted 0.66 0.006
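The link between slides 14 and 15 can be made concrete with a rough calculation. The sketch below assumes the death rates by Karnofsky level apply equally to both arms (that is, no treatment effect at all) and asks what crude death rate each arm's Karnofsky distribution would then imply. The variable and function names are illustrative only. Under that assumption the ddC arm's expected rate comes out roughly 12% higher than the ddI arm's, purely because of the baseline imbalance.

```python
# Karnofsky distribution (%) by arm, from slide 14, in the order
# < 70, 70-79, 80-89, 90-99, 100+.
karnofsky_pct = {
    "ddI": [4.8, 10.0, 21.3, 36.1, 27.8],
    "ddC": [6.8, 11.8, 24.1, 36.7, 20.6],
}

# Death rate by Karnofsky level, from slide 15 (same level order).
# Assumption: the same rates hold in both arms (no treatment effect).
death_rate = [169.8, 84.0, 41.0, 31.9, 18.4]


def expected_rate(pct):
    """Weighted-average death rate implied by a Karnofsky distribution."""
    return sum(p / 100.0 * r for p, r in zip(pct, death_rate))


expected = {arm: expected_rate(pct) for arm, pct in karnofsky_pct.items()}
ratio_ddc_vs_ddi = expected["ddC"] / expected["ddI"]

print(expected)
print(round(ratio_ddc_vs_ddi, 3))
```

A ratio above 1 here means an unadjusted comparison would be tilted against ddC before any treatment effect is considered, which is the kind of bias the adjusted analysis is meant to remove.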
17. A major problem with this study is the adjustment for the “small differences at baseline” between didanosine and zalcitabine. While there is a “small difference” noted, the variability for each of these variables is quite large. For example, the difference in CD4 count was 4 cells/mm3 between treatment groups; however, the standard error was over 86 cells/mm3. Similarly, for Karnofsky performance status, the difference between the two groups was 2, but the standard error was 11.9. And, finally, there was no difference in the presence of AIDS-defining illness between the two groups. In short, the conclusion that should be drawn is that there is, indeed, no difference between the two groups and attempting to adjust for these small differences is inappropriate. The discussion of Results on page 23, first paragraph, should be eliminated.
18. Summary
Small differences in a very important prognostic variable (irrespective of significance) can bias treatment comparisons
Large, significant differences in unimportant variables will not bias treatment comparisons
Remember a p-value is a function of both sample size and effect size
Chance imbalances can occur with large sample sizes if there are many strata.
20. Considerations in the Decision to “Lump” or “Split”
1. Size of study
2. Homogeneity of study subjects
3. Strength of prognostic factors (between strata variability)
4. Administrative burden
5. Credibility
21. Usual Implementation
Block randomization within stratum
i.e., prepare a separate randomization schedule for each stratum, usually with relatively small block sizes
Simple randomization within strata makes no sense, because it does nothing to force balance within a stratum
Note: The aim of this method is to ensure balance within strata formed by cross-classification of all factor levels.
22. Typical Stratifying Variables
Clinical site
Baseline level for outcome of interest
Stage of disease
Combination of factors, e.g., a risk score
23. Stratification Example: TOMHS
Multi-center (4 clinical sites) trial with two other strata defined by previous use of antihypertensive treatment (Rx) (Yes/No)
4 x 2 = 8 strata and randomization schedules – aim is to achieve the desired allocation ratio across all 8 groups
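A minimal sketch of block randomization within strata, using the 4 x 2 layout of the TOMHS example. The site labels, block size of 4, number of blocks, seed and function names are all assumptions chosen for illustration; they are not taken from the trial.

```python
import random
from itertools import product


def permuted_block_schedule(n_blocks, block_size=4, arms=("A", "B"), rng=None):
    """Build one schedule made of randomly permuted, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = rng or random.Random()
    per_arm = block_size // len(arms)
    schedule = []
    for _ in range(n_blocks):
        block = list(arms) * per_arm   # equal number of each arm per block
        rng.shuffle(block)             # random order within the block
        schedule.extend(block)
    return schedule


def stratified_schedules(strata, n_blocks, block_size=4, seed=0):
    """Prepare a separate permuted-block schedule for every stratum."""
    rng = random.Random(seed)
    return {
        stratum: permuted_block_schedule(n_blocks, block_size, rng=rng)
        for stratum in strata
    }


# Hypothetical labels: 4 clinical sites x prior antihypertensive Rx (yes/no).
sites = ["site1", "site2", "site3", "site4"]
prior_rx = ["yes", "no"]
schedules = stratified_schedules(list(product(sites, prior_rx)), n_blocks=5)

for stratum, schedule in schedules.items():
    print(stratum, "".join(schedule))
```

Each new patient takes the next unused assignment from the schedule for their own stratum, so the arms can never differ by more than half a block within any stratum.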
24. Post-stratification (def.)
Classification of experimental units into strata after they have been randomized for the purpose of data analysis
e.g., stratified analysis of variance (normally distributed response), Mantel-Haenszel (binary response).
Often adjustment for baseline covariates is carried out using regression methods, e.g., linear regression or analysis of covariance (continuous), logistic regression (binary), or Cox regression (time to event)
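As an illustration of a post-stratified analysis for a binary response, the sketch below computes the Mantel-Haenszel pooled odds ratio across strata. The 2 x 2 counts are invented purely to exercise the function and do not come from any trial discussed here; the function name is likewise an assumption.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio.

    Each table is (a, b, c, d) for one stratum:
        a = events on treatment,  b = non-events on treatment,
        c = events on control,    d = non-events on control.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts for two strata (e.g., low and high baseline risk).
example_tables = [
    (10, 40, 15, 35),
    (20, 30, 28, 22),
]
print(round(mantel_haenszel_or(example_tables), 3))
```

With a single stratum the estimate reduces to the ordinary odds ratio ad/bc, which is a convenient sanity check.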
25. General Problems/Issues with Post-Stratification
Model dependence / data dredging
How were covariates (stratifying variables) selected?
How were cutpoints (metric) chosen?
Frequently covariates are not pre-specified
Partial solution: Analysis plan in the protocol that includes all covariates considered important (pre-stratification variables + others); updated analysis plan prior to unblinding the results of the study to investigators.
26. One can calculate the probability of obtaining a certain imbalance before the study begins. This can be used to decide whether to stratify the randomization.
p(t) is the prob. of randomizing t patients to group A when there are t1 patients in stratum 1. For a certain imbalance one can sum over all p(t) for t's that give that imbalance or worse.
27. Example: Na = 100, Nb = 100, t1 = 40, g = 0.16, h = 0.24
Stratum 1 Other Total
Group A 16 84 100
Group B 24 76 100
Total 40 160 200
Want the prob. of obtaining the imbalance given by g = 0.16, h = 0.24, or worse.
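The formula for p(t) is not reproduced in this section. The sketch below assumes it is the hypergeometric probability that arises when group sizes Na and Nb are fixed and the t1 stratum-1 patients are allocated at random, and it measures imbalance by the difference between g and h. Both choices, and the function names, are assumptions made for illustration.

```python
from math import comb


def p_t(t, n_a, n_b, t1):
    """Hypergeometric probability that t of the t1 stratum-1 patients land in group A."""
    total = n_a + n_b
    return comb(t1, t) * comb(total - t1, n_a - t) / comb(total, n_a)


def prob_imbalance_or_worse(t_obs, n_a, n_b, t1):
    """Sum p(t) over every t whose |g - h| is at least as large as the observed one."""
    def imbalance(t):
        # |g - h| scaled by n_a * n_b so the comparison stays in integers
        return abs(t * n_b - (t1 - t) * n_a)

    lo = max(0, t1 - n_b)   # smallest feasible t
    hi = min(t1, n_a)       # largest feasible t
    threshold = imbalance(t_obs)
    return sum(
        p_t(t, n_a, n_b, t1)
        for t in range(lo, hi + 1)
        if imbalance(t) >= threshold
    )


# Slide 27 example: 16 of the 40 stratum-1 patients are in group A (g = 0.16, h = 0.24).
prob = prob_imbalance_or_worse(t_obs=16, n_a=100, n_b=100, t1=40)
print(round(prob, 3))
```

If this probability is judged too high for an important prognostic factor, that is an argument for stratifying the randomization on it.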
28. Probability of Given Imbalance or More Extreme
.52 .48 1.0 .84 .23
.55 .45 .57 .42 .002
.60 .40 .25 .07 –
.70 .30 .01 – –
29. Estimates for the Size of Treatment Imbalance
Let B = block size; K = number of strata; and D = imbalance.
Hallstrom and Davis (Cont Clin Trials, 1988) showed that, with 2 treatments, the maximum total trial imbalance in the number of patients assigned across all strata is Max D = KB/2, with Var(D) = K(B+1)/6.
Example: Cardiac arrhythmia trial with 270 strata (site, ejection fraction, time since MI) and block size of 4.
Max D = 270 × 4/2 = 540; Var(D) = 270 × 5/6 = 225; SD(D) = 15; 2 SD = 30.
In this trial, 4200 patients were to be randomized and an imbalance of 30 with probability = 0.05 was considered acceptable.
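The arithmetic in this example can be checked with a few lines. The helper below simply evaluates the two expressions quoted from Hallstrom and Davis for a given number of strata K and block size B; the function name is illustrative.

```python
from math import sqrt


def imbalance_summary(n_strata, block_size):
    """Return (max D, Var(D), SD(D)) using Max D = KB/2 and Var(D) = K(B+1)/6."""
    max_d = n_strata * block_size / 2
    var_d = n_strata * (block_size + 1) / 6
    return max_d, var_d, sqrt(var_d)


max_d, var_d, sd_d = imbalance_summary(n_strata=270, block_size=4)
print(max_d, var_d, sd_d, 2 * sd_d)
```

Re-running it with other block sizes shows how quickly the likely imbalance grows as blocks get larger for a fixed number of strata.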
30. For small studies with a large number of strata, the use of random permuted blocks within strata can be self-defeating.
Example: A study of testicular cancer
• 2 treatments
• 3 stratifying variables
Stage: 2 levels
Histology: 3 levels
Age: 2 levels
No. of strata = 2 x 3 x 2 = 12.
31. Randomization Schedules for 12 Strata (* marks assignments made so far)
Teratocarcinoma A* A* A* B*
A* A* A* A*
B A* A* A*
A B B B
B B B B
B B B A
Embryonal carcinoma A* B B* A*
A* B B* B*
B A A B*
B A A B*
B A B A
A B A A
Choriocarcinoma B* B A* B*
B A B* B*
A A B* A
A B B* A
B B A A
A A A B
32. Marginal Totals for Strata
Stratum A B
Teratocarcinoma 10 1
Embryonal carcinoma 3 5
Choriocarcinoma 1 6
Stage I 7 1
Stage II 7 11
Age: < 15 8 6
Age: ≥ 15 6 6
TOTAL 14 12
33. Minimization
A method of adaptive stratification which balances the marginal treatment totals for each stratification variable.
Interestingly, the European Committee for Proprietary Medicinal Products (CPMP) discourages use of minimization due to concerns about analysis. They note that the methods remain “highly controversial” and are “strongly discouraged”.
34. Some Notation
Let X_ik = number of patients already assigned treatment k who share the new patient's level of prognostic factor i
k = 1, 2 (A or B) for our example
i = 1, 2, …, f indexes the prognostic factors of the new patient
X^t_ik = X_ik if t ≠ k, and X^t_ik = X_ik + 1 if t = k
X^t_ik denotes the new allocation if the new patient is assigned to treatment t.
t = 1, 2 (A, B)
35. Lack of Balance Functions
B(t) could be a function of Xik or Xtik which measures the “Lack of Balance”: 2 examples
Rule of assignment: Use the treatment with smallest B(t) with higher probability.
Note: Pocock and Simon’s approach is more general than Taves’. It allows the variation among assignments to be measured in different ways (e.g., by the range) and allows non-deterministic assignment.
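36. Characteristics of New Patient
Factor Level A B New patient
Performance status Ambulatory 30 31 x
Performance status Non-ambulatory 10 9
Age < 50 18 17 x
Age ≥ 50 22 23
Disease-free interval < 2 years 31 32
Disease-free interval ≥ 2 years 9 8 x
Dominant metastatic lesion Visceral 19 21 x
Dominant metastatic lesion Osseous 8 7
Dominant metastatic lesion Soft tissue 13 12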
37. Estimation of B(1)
i) Factor 1, Level 1
k X_1k X^1_1k
1 30 31
2 31 31
Range = 31 - 31 = 0
ii) Factor 2, Level 1
k X_2k X^1_2k
1 18 19
2 17 17
Range = 19 - 17 = 2
iii) Factor 3, Level 2
k X_3k X^1_3k
1 9 10
2 8 8
Range = 10 - 8 = 2
iv) Factor 4, Level 1
k X_4k X^1_4k
1 19 20
2 21 21
Range = 21 - 20 = 1
B(1) = 0 + 2 + 2 + 1 = 5
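The calculation above can be sketched in a few lines, taking B(t) to be the sum over factors of the range of the updated treatment totals, as in the worked example. The data structure and function name are illustrative. The value of B(2) is not given in this section; with the totals as transcribed here the same measure gives 2 + 0 + 0 + 3 = 5, so the two treatments tie.

```python
# Current totals (treatment 1, treatment 2) at the new patient's level of each
# factor, taken from the rows marked "x" on slide 36.
new_patient_totals = [
    (30, 31),  # performance status: ambulatory
    (18, 17),  # age: < 50
    (9, 8),    # disease-free interval: >= 2 years
    (19, 21),  # dominant metastatic lesion: visceral
]


def lack_of_balance(totals, t):
    """B(t): sum over factors of the range of totals if the patient gets treatment t."""
    b = 0
    for counts in totals:
        updated = list(counts)
        updated[t - 1] += 1            # add the new patient to treatment t
        b += max(updated) - min(updated)
    return b


b1 = lack_of_balance(new_patient_totals, t=1)
b2 = lack_of_balance(new_patient_totals, t=2)
print(b1, b2)
```

Other lack-of-balance measures, such as the variance of the totals, could be swapped in by changing only the line that accumulates `b`.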
39. Implementation
The marginal totals must be updated continuously to determine B(t), so this is best done at a central coordinating/statistical center
40. Flexibility in allocation: Examples
1. P = 1 if B(1) ≠ B(2)
P = 1/2 if B(1) = B(2)
Simple randomization if equal, deterministic if unequal
2. P = 2/3 if B(1) ≠ B(2)
P = 1/2 if B(1) = B(2)
P denotes Prob(groups become “more equal”), i.e., the probability of assigning the treatment with the smaller B(t)
The more P deviates from 1 when B(1) ≠ B(2), the less effective the balancing
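A minimal sketch of the assignment rule for two treatments: the treatment with the smaller B(t) is chosen with probability P, and a tie is broken by a fair coin. The function name, the default P = 2/3 and the optional `rng` argument are assumptions made for illustration.

```python
import random


def choose_treatment(b1, b2, p=2 / 3, rng=None):
    """Pick treatment 1 or 2 given the lack-of-balance values B(1) and B(2)."""
    rng = rng or random
    if b1 == b2:
        return rng.choice([1, 2])      # tie: simple randomization
    preferred = 1 if b1 < b2 else 2    # treatment that leaves the groups more equal
    other = 2 if preferred == 1 else 1
    return preferred if rng.random() < p else other


# With p = 1 the rule is deterministic whenever B(1) and B(2) differ.
print(choose_treatment(3, 5, p=1.0))
# With p = 2/3 the preferred treatment is chosen about two times in three.
rng = random.Random(1)
picks = [choose_treatment(3, 5, p=2 / 3, rng=rng) for _ in range(3000)]
print(picks.count(1) / len(picks))
```

Setting P below 1 trades some balance for less predictable assignments.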
41. Theoretical Challenge
Not true randomization; in some cases deterministic
Violation of randomization as basis for inference
If the site knows all the margins, then it can predict the next assignment
Reality: When done in a multi-center trial, with central randomization, impossible for sites to predict
Appears random to the sites
Basis for inference: We do inference all the time in non-randomized trials, doesn’t bother us then
42. Summary
Unless a very small block size is used, over-stratification is likely with use of block randomization within strata if you have many strata relative to the total sample size.
Minimization should be considered for situations where you have several important prognostic factors and a small sample size (particularly if you are concerned about using a very small block size).
Therneau suggests that as the number of distinct groups (strata) approaches N/2, adaptive methods be considered.