Controlling False Positive Rate Due to Multiple Analyses: Unstratified vs. Stratified Logrank Test. Peiling Yang, Gang Chen, George Y.H. Chi, DBI/OB/OPaSS/CDER/FDA. The views expressed in this talk are those of the authors and may not necessarily represent those of the Food and Drug Administration.
Issues to Explore • Implication of these tests/analyses. • Eligibility of efficacy claim based on these tests/analyses. • Practicability of multiple testing/analyses.
Outline • Notations / Settings • Introduction to logrank test (unstratified, stratified) • Comparisons (hypotheses, test statistic, test procedure, inference) • Practicability of hypothesis testing • Multiple testing/analyses • Example of Drug X • Summary
Settings / Notations • 2 arms (control: j = 1; experimental: j = 2). • K strata: k = 1, …, K. • Patients randomized within strata. • t1 < t2 < … < tD: distinct death times. • dijk: # of deaths, and Yijk: # of patients at risk, at death time ti in the jth arm & kth stratum.
Settings / Notations • Hazard ratio (ctrl./exper.): constant • Across strata: c • Within stratum: ck • Non-informative censoring
Introduction: Unstratified Logrank • Wu ~ N(0,1) under the least favorable parameter configuration (c = 1) in the unstratified null hypothesis. • Reject if Wu > zα. • Type I error rate is controlled at level α.
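The slide's formula for Wu did not survive extraction. As a rough illustration only, the sketch below computes the textbook one-sided logrank statistic (observed minus expected control-arm deaths, divided by the square root of the hypergeometric variance). The function names and the input layout (`time`, `event`, `arm`) are assumptions made for this example, not the authors' notation.

```python
import numpy as np

def logrank_components(time, event, arm):
    """Return (observed - expected control deaths, variance), pooling all patients.

    arm == 1 is control and arm == 2 is experimental, as in the slides.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    arm = np.asarray(arm)
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event]):          # distinct death times t_i
        at_risk = time >= t
        y = at_risk.sum()                     # total at risk
        y1 = (at_risk & (arm == 1)).sum()     # at risk in control arm
        died = event & (time == t)
        d = died.sum()                        # total deaths at t_i
        d1 = (died & (arm == 1)).sum()        # control deaths at t_i
        o_minus_e += d1 - y1 * d / y
        if y > 1:                             # variance term is 0 when y == 1
            var += y1 * (y - y1) * d * (y - d) / (y ** 2 * (y - 1))
    return o_minus_e, var

def unstratified_logrank(time, event, arm):
    """One-sided statistic: large positive values favor the experimental arm."""
    num, var = logrank_components(time, event, arm)
    return num / np.sqrt(var)
```

With this sign convention, excess deaths in the control arm push the statistic upward, which matches the rejection rule Wu > zα.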
Introduction: Stratified Logrank • Ws ~ N(0,1) under the least favorable parameter configuration (ck = 1 for all k) in the stratified null hypothesis. • Reject if Ws > zα. • Type I error rate is controlled at level α.
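The formula for Ws is also missing from the extracted slide. A minimal sketch of the usual stratified logrank statistic follows, assuming the standard construction in which observed-minus-expected control deaths and variances are computed within each stratum and then summed. All names here (`stratum_components`, `stratified_logrank`, the `stratum` argument) are hypothetical.

```python
import numpy as np

def stratum_components(time, event, arm):
    """(observed - expected control deaths, variance) within one stratum."""
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        y = at_risk.sum()
        y1 = (at_risk & (arm == 1)).sum()
        died = event & (time == t)
        d = died.sum()
        d1 = (died & (arm == 1)).sum()
        o_minus_e += d1 - y1 * d / y
        if y > 1:
            var += y1 * (y - y1) * d * (y - d) / (y ** 2 * (y - 1))
    return o_minus_e, var

def stratified_logrank(time, event, arm, stratum):
    """Sum the per-stratum numerators and variances, then standardize."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    arm = np.asarray(arm)
    stratum = np.asarray(stratum)
    num, var = 0.0, 0.0
    for s in np.unique(stratum):
        mask = stratum == s                   # patients in stratum k
        n_k, v_k = stratum_components(time[mask], event[mask], arm[mask])
        num += n_k
        var += v_k
    return num / np.sqrt(var)
```

Risk sets are formed only within a stratum, so patients in different strata are never compared directly. This is what distinguishes the statistic from the pooled version even though both count the same control-arm deaths.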
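Comparison of Hypotheses • Different hypotheses formulations: • Unstratified: null c ≤ 1 vs. alternative c > 1. • Stratified: null ck ≤ 1 for all k vs. alternative ck > 1 for at least one k.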
Comparison of Test Statistics • Corr(Wu, Ws) = 1 because both are functions of the same r.v. d·1·. • Ws = a Wu + b for constants a and b. • Wu ~ N(0, 1), Ws ~ N(b, a²).
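The slide's expressions for a and b were lost in extraction. The following is one way to recover a linear relation of this form, assuming both statistics are written as total control-arm deaths minus an expected count, divided by a standard deviation. The symbols E_u, V_u, E_s, V_s are labels introduced here for the unstratified and stratified expected counts and variances, treated as fixed. They are not taken from the slides.

```latex
\begin{align*}
W_u &= \frac{d_{\cdot 1 \cdot} - E_u}{\sqrt{V_u}}, \qquad
W_s  = \frac{d_{\cdot 1 \cdot} - E_s}{\sqrt{V_s}} \\[4pt]
d_{\cdot 1 \cdot} &= \sqrt{V_u}\, W_u + E_u \\[4pt]
W_s &= \underbrace{\sqrt{\frac{V_u}{V_s}}}_{a}\, W_u
      + \underbrace{\frac{E_u - E_s}{\sqrt{V_s}}}_{b}
\end{align*}
```

Under these assumptions a > 0, so the two statistics are perfectly correlated, and Wu ~ N(0, 1) implies Ws ~ N(b, a²), in agreement with the bullets above.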
Comparison of Inference • Rejection of the unstratified null: • Infer overall positive treatment effect in entire population. • Rejection of the stratified null: • Can only infer positive treatment effect in "at least one stratum". • Further testing is required to identify those strata before a claim can be made, & the error rate for identifying the wrong strata also needs to be controlled.
Practicability of Hypotheses Testing • Unstratified hypotheses are tested when it is desired to infer an overall positive treatment effect in the entire population. • Stratified hypotheses are tested when it is desired to infer a positive treatment effect in certain strata. • Multiple testing of both unstratified & stratified hypotheses is acceptable when it is not clear whether the treatment is effective in the entire population or only in certain strata (but both nulls need to be prespecified in the protocol).
Multiple Testing/Analyses • Multiple testing of unstratified (use Wu) & stratified (use Ws) hypotheses. • Error to control: strong familywise error (SFE), including the following: • When c ≤ 1 & all ck ≤ 1: falsely infer c > 1 or some ck's > 1. • When c ≤ 1 & some ck's > 1: falsely infer c > 1 or the wrong ck's > 1. • Note: the parameter space "all ck ≤ 1 but c > 1" is impossible.
Multiple Testing/Analyses • Property of SFE: one FE is nested in another FE. • Parameter space: • c > 1 & at least one ck > 1 (nested FE: which ck > 1?) • c ≤ 1 & at least one ck > 1 (FE; nested FE: which ck > 1?) • c ≤ 1 & all ck ≤ 1 (FE) • c > 1 & all ck ≤ 1: impossible space.
Example -- Drug X • Ws = a Wu + b, where a = 1.039, b = 0.409. • Critical value using Ws should be adjusted to a zα + b. • False positive error rate using Ws w/o adjustment = 0.066. • Inflation over the nominal level = 0.066 - 0.025 = 0.041. • Ans.: This finding is not statistically significant.
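The arithmetic behind the Drug X numbers can be sketched as follows, assuming Wu ~ N(0, 1) at the least favorable configuration, a one-sided level of 0.025 (inferred from the inflation calculation), and a > 0. The function names are illustrative. Because a and b are rounded on the slide, the computed rate is only expected to land near the reported 0.066, not to match it exactly.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)

def unadjusted_false_positive_rate(a, b, alpha):
    """P(Ws > z_alpha) when Ws = a*Wu + b, Wu ~ N(0, 1), and a > 0."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    # Ws > z_alpha  <=>  Wu > (z_alpha - b) / a
    return 1 - STD_NORMAL.cdf((z_alpha - b) / a)

def adjusted_critical_value(a, b, alpha):
    """Critical value a*z_alpha + b that restores level alpha for Ws."""
    return a * STD_NORMAL.inv_cdf(1 - alpha) + b

# Drug X values from the slide
rate = unadjusted_false_positive_rate(a=1.039, b=0.409, alpha=0.025)
crit = adjusted_critical_value(a=1.039, b=0.409, alpha=0.025)
print(f"unadjusted false positive rate: {rate:.3f}")
print(f"adjusted critical value for Ws: {crit:.3f}")
```

Comparing Ws with the adjusted critical value is equivalent to comparing Wu with zα, which is why the adjustment brings the error rate back to the nominal level.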
Figure 1: False positive rate vs. desired level (w/o adjustment)
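The figure itself is not recoverable from the text. A curve of this kind can be tabulated with the sketch below, which assumes the Drug X constants (a = 1.039, b = 0.409) and Wu ~ N(0, 1) under the null. The grid of desired levels is arbitrary and chosen only for illustration.

```python
from statistics import NormalDist

def false_positive_curve(a, b, levels):
    """Return (desired level, unadjusted false positive rate of Ws) pairs."""
    std_normal = NormalDist()
    curve = []
    for alpha in levels:
        z_alpha = std_normal.inv_cdf(1 - alpha)
        # Ws = a*Wu + b exceeds z_alpha when Wu exceeds (z_alpha - b) / a
        actual = 1 - std_normal.cdf((z_alpha - b) / a)
        curve.append((alpha, actual))
    return curve

# Hypothetical grid of one-sided desired levels
for alpha, actual in false_positive_curve(1.039, 0.409, [0.005, 0.01, 0.025, 0.05]):
    print(f"desired {alpha:.3f} -> actual {actual:.3f}")
```

For these constants the unadjusted rate exceeds the desired level at every point on the grid, since b > 0 shifts the distribution of Ws to the right.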
Summary • Hypotheses (unstratified or stratified or both): • should reflect what is desired to claim. • need to be prespecified in the protocol. • If the stratified null is rejected, further testing is required to identify in which strata the treatment effect is positive. • Strong familywise error rate needs to be controlled regardless of single or multiple testing.