Stratification: Are you a lumper or a splitter

1. Stratification: Are you a lumper or a splitter?

2. ...and if you are a splitter, how should you split the data and when?

3. Stratification: A procedure in which factors known to be associated with the response (prognostic factors) are taken into account in the design (e.g., in the randomization). Recall that permuted block randomization is used to achieve balance in the number assigned to each treatment arm over time. Stratification is used to achieve comparability between groups with respect to important prognostic factors. Pre-stratification refers to a stratified design; post-stratification refers to a stratified analysis.

4. Note: This is different from stratified random sampling where the population might be divided up into strata, e.g., census tracts, and each stratum is sampled randomly for some pre-specified sample size.

5. Possible Stratification Scenarios: pre- plus post-stratification; pre-stratification only; post-stratification only; neither pre- nor post-stratification; regression adjustment with or without stratification.

6. Advantages: Prevents "accidental bias" resulting from maldistribution of important prognostic variables. Increases precision (if stratifying variables are related to outcome). Facilitates subgroup analysis. Results are less subject to criticism.

7. International Conference on Harmonization (ICH) Guideline (E-9 Document): "Stratification by important prognostic factors measured at baseline (e.g., severity of disease, age, sex, etc.) may sometimes be valuable in order to promote balanced allocation within strata; this has greater potential benefit in small trials."

8. Disadvantages: These relate primarily to the additional administrative burden of implementing the randomization. There may be several randomization schedules. Measurements that define the stratum must be made carefully prior to randomization.

9. What Stratification Does Not Do: 1. Guarantee adequate power to make within-stratum comparisons. 2. Eliminate the need to carry out covariate-adjusted analysis (chance imbalance on other covariates; analysis consistent with design).

10. Characteristics of Patients in Trial to Prevent Toxoplasmic Encephalitis (one column per treatment group)

    CD4+ count (cells/mm3)    96.1    97.4
    AIDS OI (%)               35.2    22.0
    Karnofsky Score           89.5    89.7
    Hemoglobin (g/dl)         12.6    12.7

11. "In view of the major imbalance between the groups in presentation at baseline with AIDS defining OIs, the rigorousness of the allocation procedures need to be supported in detail if the results are to be regarded as credible."

12. Example: How a small difference in an important prognostic variable can bias treatment differences.

13. Baseline Characteristics (mean and SD for each treatment group)

                                Mean    SD      Mean    SD
    Age (years)                 37.8    8.5     37.5    7.8
    CD4+                        75.1    86.2    71.1    84.3
    Karnofsky Score             87.2    11.9    85.3    11.9
    Prior AIDS Diagnosis (%)    64.8            66.7

14. Frequency Distribution (%) of Karnofsky Score by Treatment Group

    Karnofsky Score    ddI     ddC
    < 70               4.8     6.8
    70 - 79            10.0    11.8
    80 - 89            21.3    24.1
    90 - 99            36.1    36.7
    100+               27.8    20.6

15. Death Rate by Karnofsky Level

    Karnofsky Score    Death rate
    < 70               169.8
    70 - 79            84.0
    80 - 89            41.0
    90 - 99            31.9
    100+               18.4

16. Comparison of Unadjusted and Adjusted Relative Risk Estimates

                  Relative risk    p-value
    Unadjusted    0.79             0.11
    Adjusted      0.66             0.006

17. A major problem with this study is the adjustment for the "small differences at baseline" between didanosine and zalcitabine. While there is a "small difference" noted, the variability for each of these variables is quite large. For example, the difference in CD4 count was 4 cells/mm3 between treatment groups; however, the standard error was over 86 cells/mm3. Similarly, for Karnofsky performance status, the difference between the two groups was 2, but the standard error was 11.9. And, finally, there was no difference in the presence of AIDS-defining illness between the two groups. In short, the conclusion that should be drawn is that there is, indeed, no difference between the two groups and attempting to adjust for these small differences is inappropriate. The discussion of Results on page 23, first paragraph, should be eliminated.

18. Summary: Small differences in a very important prognostic variable (irrespective of significance) can bias treatment comparisons. Large, significant differences in unimportant variables will not bias treatment comparisons. Remember that a p-value is a function of both sample size and effect size. Chance imbalances can occur with large sample sizes if there are many strata.

20. Considerations in the Decision to "Lump" or "Split": 1. Size of study. 2. Homogeneity of study subjects. 3. Strength of prognostic factors (between-strata variability). 4. Administrative burden. 5. Credibility.

21. Usual Implementation: Block randomization within stratum, i.e., prepare a separate randomization schedule for each stratum, usually with relatively small block sizes. It makes no sense to use simple randomization within strata. Note: The aim of this method is to ensure balance within strata formed by cross-classification of all factor levels.
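As a minimal sketch of block randomization within strata, the Python below builds one independent permuted-block schedule for each cell of the cross-classification. The function names, the factor names and levels, the block size of 4, and the fixed seed are illustrative assumptions, not details taken from any particular trial.

```python
import random
from itertools import product

def permuted_block(block_size, arms=("A", "B"), rng=random):
    """Return one block containing each arm equally often, in random order."""
    block = list(arms) * (block_size // len(arms))
    rng.shuffle(block)
    return block

def stratified_schedules(factors, n_blocks, block_size, seed=0):
    """Build a separate schedule for every stratum (cross-classified factor levels)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    schedules = {}
    for stratum in product(*factors.values()):
        schedule = []
        for _ in range(n_blocks):
            schedule.extend(permuted_block(block_size, rng=rng))
        schedules[stratum] = schedule
    return schedules

# Hypothetical stratification factors: 4 sites x 2 levels = 8 strata
factors = {"site": [1, 2, 3, 4], "prior_rx": ["yes", "no"]}
schedules = stratified_schedules(factors, n_blocks=3, block_size=4)

for stratum, schedule in schedules.items():
    print(stratum, "".join(schedule))
```

Each stratum's schedule is balanced after every complete block, which is why small blocks are preferred when many strata will each receive only a few patients.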

22. Typical Stratifying Variables: clinical site; baseline level for the outcome of interest; stage of disease; a combination of factors, e.g., a risk score.

23. Stratification Example: TOMHS. Multi-center trial (4 clinical sites) with two other strata defined by previous use of antihypertensive treatment (Rx) (Yes/No). 4 x 2 = 8 strata and randomization schedules; the aim is to achieve the desired allocation ratio across all 8 groups.

24. Post-stratification (def.): Classification of experimental units into strata, after they have been randomized, for the purpose of data analysis, e.g., stratified analysis of variance (normally distributed response) or Mantel-Haenszel (binary response). Often adjustment for baseline covariates is carried out using regression methods, e.g., linear regression or analysis of covariance (continuous), logistic regression (binary), or Cox regression (time to event).

25. General Problems/Issues with Post-Stratification: Model dependence / data dredging. How were covariates (stratifying variables) selected? How were cutpoints (metric) chosen? Frequently covariates are not pre-specified. Partial solution: an analysis plan in the protocol that includes all covariates considered important (pre-stratification variables + others), with an updated analysis plan prepared before the results of the study are unblinded to investigators.

26. One can calculate the probability of obtaining a certain imbalance before the study begins, and use it to decide whether to stratify the randomization. p(t) is the probability that t of the t1 patients in stratum 1 are randomized to group A. For a given imbalance, sum p(t) over all values of t that give that imbalance or worse.

27. Example: Na = 100, Nb = 100, t1 = 40, g = 0.16, h = 0.24

               Stratum 1    Other    Total
    Group A    16           84       100
    Group B    24           76       100
    Total      40           160      200

    Want the probability of obtaining the imbalance given by g = 0.16, h = 0.24, or worse.
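The transcript does not preserve the formula for p(t). One standard choice, assumed here rather than taken from the slides, is the hypergeometric distribution: with the group sizes fixed, the t1 stratum-1 patients are treated as a random draw from all N patients. "Or worse" is read as two-sided, i.e., any count at least as far from its expected value as the one observed. The function names are hypothetical.

```python
from math import comb

def p_t(t, n_a, n_total, t1):
    """Hypergeometric probability that t of the t1 stratum-1 patients fall in group A."""
    return comb(t1, t) * comb(n_total - t1, n_a - t) / comb(n_total, n_a)

def prob_imbalance_or_worse(n_a, n_b, t1, observed_a):
    """Sum p(t) over every t at least as far from the expected count as observed_a."""
    n_total = n_a + n_b
    expected = t1 * n_a / n_total            # expected stratum-1 count in group A
    threshold = abs(observed_a - expected)   # size of the observed imbalance
    lowest = max(0, t1 - n_b)                # smallest feasible t
    highest = min(t1, n_a)                   # largest feasible t
    return sum(
        p_t(t, n_a, n_total, t1)
        for t in range(lowest, highest + 1)
        if abs(t - expected) >= threshold - 1e-12
    )

# Na = 100, Nb = 100, t1 = 40, with 16 of the stratum-1 patients in group A
prob = prob_imbalance_or_worse(n_a=100, n_b=100, t1=40, observed_a=16)
print(round(prob, 3))
```

If this probability is judged too high for an important prognostic factor, that is an argument for stratifying the randomization on it.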

28. Probability of Given Imbalance or More Extreme

    .52    .48    1.0    .84    .23
    .55    .45    .57    .42    .002
    .60    .40    .25    .07    -
    .70    .30    .01    -      -

29. Estimates for the Size of Treatment Imbalance: Let B = block size, K = number of strata, and D = imbalance. Hallstrom and Davis (Cont Clin Trials, 1988) showed that, with 2 treatments, the total trial imbalance D in the number of patients assigned across all strata has maximum KB/2 and variance K(B+1)/6. Example: Cardiac arrhythmia trial with 270 strata (site, ejection fraction, time since MI) and a block size of 4. Max D = 540; Var(D) = 225; SD(D) = 15; 2 SD = 30. In this trial, 4200 patients were to be randomized, and an imbalance of 30 with probability = 0.05 was considered acceptable.
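The arithmetic in the cardiac arrhythmia example can be checked with a few lines of Python. This sketch simply evaluates the two expressions quoted above (KB/2 and K(B+1)/6); the function name is hypothetical.

```python
from math import sqrt

def imbalance_summary(n_strata, block_size):
    """Maximum, variance, and SD of total imbalance D for K strata and block size B."""
    max_d = n_strata * block_size / 2          # KB/2
    var_d = n_strata * (block_size + 1) / 6    # K(B+1)/6
    return max_d, var_d, sqrt(var_d)

max_d, var_d, sd_d = imbalance_summary(n_strata=270, block_size=4)
print(max_d, var_d, sd_d, 2 * sd_d)  # 540.0 225.0 15.0 30.0
```

Because the variance grows linearly in both K and B, halving the block size or the number of strata is a direct way to shrink the likely imbalance.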

30. For small studies with a large number of strata, the use of random permuted blocks within strata can be self-defeating. Example: A study of testicular cancer with 2 treatments and 3 stratifying variables: Stage (2 levels), Histology (3 levels), Age (2 levels). No. of strata = 2 x 3 x 2 = 12.

31. Randomization Schedules for 12 Strata (* = assignment already used)

                           Stage I            Stage II
                           < 15    ≥ 15       < 15    ≥ 15
    Teratocarcinoma        A*      A*         A*      B*
                           A*      A*         A*      A*
                           B       A*         A*      A*
                           A       B          B       B
                           B       B          B       B
                           B       B          B       A

    Embryonal carcinoma    A*      B          B*      A*
                           A*      B          B*      B*
                           B       A          A       B*
                           B       A          A       B*
                           B       A          B       A
                           A       B          A       A

    Choriocarcinoma        B*      B          A*      B*
                           B       A          B*      B*
                           A       A          B*      A
                           A       B          B*      A
                           B       B          A       A
                           A       A          A       B

32. Marginal Totals for Strata

                           A     B
    Teratocarcinoma        10    1
    Embryonal carcinoma    3     5
    Choriocarcinoma        1     6
    Stage I                7     1
    Stage II               7     11
    Age: < 15              8     6
    Age: ≥ 15              6     6
    TOTAL                  14    12
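The marginal totals can be recomputed from the schedules on the previous slide. The sketch below assumes that each histology's 24 entries are read row by row across four columns ordered (Stage I, < 15), (Stage I, ≥ 15), (Stage II, < 15), (Stage II, ≥ 15), and that an asterisk marks an assignment already used. That column order is an inference: it is the arrangement under which the tallies agree with the totals listed above.

```python
from collections import Counter

# Assumed column order of the four (stage, age) strata within each histology
COLUMNS = [("I", "<15"), ("I", ">=15"), ("II", "<15"), ("II", ">=15")]

# Schedules transcribed row-wise; '*' marks an assignment that has been used
SCHEDULES = {
    "Teratocarcinoma": "A* A* A* B* A* A* A* A* B A* A* A* A B B B B B B B B B B A",
    "Embryonal carcinoma": "A* B B* A* A* B B* B* B A A B* B A A B* B A B A A B A A",
    "Choriocarcinoma": "B* B A* B* B A B* B* A A B* A A B B* A B B A A A A A B",
}

def marginal_totals(schedules, columns):
    """Count used assignments per arm for each level of each stratifying variable."""
    totals = Counter()
    for histology, text in schedules.items():
        for position, token in enumerate(text.split()):
            if not token.endswith("*"):
                continue  # assignment not yet used
            arm = token[0]
            stage, age = columns[position % len(columns)]
            totals[("histology", histology, arm)] += 1
            totals[("stage", stage, arm)] += 1
            totals[("age", age, arm)] += 1
            totals[("total", "all", arm)] += 1
    return totals

totals = marginal_totals(SCHEDULES, COLUMNS)
for key in sorted(totals):
    print(key, totals[key])
```

Every individual stratum is close to balanced, yet the histology margins are badly skewed (10 vs. 1 for teratocarcinoma), which is the sense in which blocking within many small strata can be self-defeating.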

33. Minimization: A method of adaptive stratification which balances the marginal treatment totals for each stratification variable. Interestingly, the European Committee for Proprietary Medicinal Products (CPMP) discourages use of minimization due to concerns about analysis. They note that the methods remain "highly controversial" and are "strongly discouraged".

34. Some Notation: Let $X_{ik}$ = number of patients already assigned treatment $k$ ($k = 1, 2$, i.e., A or B in our example) who share the new patient's level of prognostic factor $i$ ($i = 1, 2, \ldots, f$). Define $X^t_{ik} = X_{ik}$ if $t \neq k$ and $X^t_{ik} = X_{ik} + 1$ if $t = k$, so that $X^t_{ik}$ denotes the new allocation if the new patient is assigned to treatment $t$ ($t = 1, 2$, i.e., A or B).

35. Lack of Balance Functions: B(t) could be a function of $X_{ik}$ or $X^t_{ik}$ which measures the "lack of balance" (2 examples). Rule of assignment: Use the treatment with the smallest B(t) with higher probability. Note: Pocock and Simon's approach is more general than Taves'. It allows for variation among assignments to be considered (e.g., range) and for non-deterministic assignment.

36. Characteristics of New Patient (x = new patient's level)

    Factor                        Level             A     B     New patient
    Performance status            Ambulatory        30    31    x
                                  Non-ambulatory    10    9
    Age                           < 50              18    17    x
                                  ≥ 50              22    23
    Disease-free interval         < 2 years         31    32
                                  ≥ 2 years         9     8     x
    Dominant metastatic lesion    Visceral          19    21    x
                                  Osseous           8     7
                                  Soft tissue       13    12

37. Estimation of B(1)

    i) Factor 1, Level 1
        k    X_1k    X^1_1k    Range
        1    30      31        31 - 31 = 0
        2    31      31

    ii) Factor 2, Level 1
        k    X_2k    X^1_2k    Range
        1    18      19        19 - 17 = 2
        2    17      17

    iii) Factor 3, Level 2
        k    X_3k    X^1_3k    Range
        1    9       10        10 - 8 = 2
        2    8       8

    iv) Factor 4, Level 1
        k    X_4k    X^1_4k    Range
        1    19      20        21 - 20 = 1
        2    21      21

    B(1) = 0 + 2 + 2 + 1 = 5
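The range-based calculation can be sketched in Python for both candidate assignments. The counts are those of the new patient's own factor levels from the table of patient characteristics; the function name and the dictionary layout are illustrative assumptions, and the sum of ranges is only one possible choice of B(t).

```python
# Current counts (treatment 1, treatment 2) at the new patient's level of each factor
counts = {
    "performance status: ambulatory": (30, 31),
    "age: < 50": (18, 17),
    "disease-free interval: >= 2 years": (9, 8),
    "dominant metastatic lesion: visceral": (19, 21),
}

def lack_of_balance(counts, t):
    """B(t): sum over factors of the range of counts after adding the patient to treatment t."""
    total = 0
    for x in counts.values():
        # Add 1 to treatment t only; treatments are numbered from 1
        updated = [x_k + 1 if k == t else x_k for k, x_k in enumerate(x, start=1)]
        total += max(updated) - min(updated)
    return total

b1 = lack_of_balance(counts, t=1)
b2 = lack_of_balance(counts, t=2)
print(b1, b2)  # 5 5
```

With these counts the two assignments give the same lack of balance, so under a rule that favors the smaller B(t) this patient would be assigned by a fair coin toss.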

39. Implementation: The marginal totals must be updated continuously to determine B(t); therefore, this is best done at a central coordinating/statistical center.

40. Flexibility in Allocation: Examples. 1. P = 1 if B(1) ≠ B(2); P = 1/2 if B(1) = B(2) (simple randomization if equal, deterministic if unequal). 2. P = 2/3 if B(1) ≠ B(2); P = 1/2 if B(1) = B(2). P denotes Prob(groups become "more equal"). The more P deviates from 1 when B(1) ≠ B(2), the less effective the balancing.
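A minimal sketch of this assignment rule in Python follows. The function name `assign` and its signature are hypothetical; `p` is the probability of choosing the treatment with the smaller B(t), and ties are always broken by a fair coin.

```python
import random

def assign(b1, b2, p, rng=random):
    """Return treatment 1 or 2 given the lack-of-balance values B(1) and B(2).

    With probability p, pick the treatment with the smaller B(t);
    if B(1) == B(2), pick either treatment with probability 1/2.
    """
    if b1 == b2:
        return 1 if rng.random() < 0.5 else 2
    preferred, other = (1, 2) if b1 < b2 else (2, 1)
    return preferred if rng.random() < p else other

rng = random.Random(2024)  # seeded only so the sketch is reproducible
deterministic = [assign(3, 5, p=1.0, rng=rng) for _ in range(10)]
biased_coin = [assign(3, 5, p=2 / 3, rng=rng) for _ in range(10)]
print(deterministic)  # always treatment 1 when p = 1
print(biased_coin)    # mostly, but not always, treatment 1
```

Setting p = 1 gives the most balance but makes the next assignment predictable whenever B(1) ≠ B(2); p = 2/3 gives up some balance in exchange for unpredictability.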

41. Theoretical Challenge: Not true randomization; in some cases the assignment is deterministic. Violation of randomization as the basis for inference. If the site knows all the margins, it can predict the next assignment. Reality: In a multi-center trial with central randomization, it is impossible for sites to predict; the assignment appears random to the sites. Basis for inference: We do inference all the time in non-randomized trials, and it doesn't bother us then.

42. Summary Unless a very small block size is used, over-stratification is likely with use of block randomization within strata if you have many strata relative to the total sample size. Minimization should be considered for situations where you have several important prognostic factors and a small sample size (particularly if you are concerned about using a very small block size). Therneau suggests that as the number of distinct groups (strata) approaches N/2, adaptive methods be considered.
