Statistical Methods for Testing Carcinogenic Potential of New Drugs in Animal Carcinogenicity Studies
Hojin Moon, Ph.D.
E-mail: HMoon@nctr.fda.gov
September 16, 2005
Collaborators
• Dr. Ralph L. Kodell – DBRA, NCTR, FDA
• Dr. Hongshik Ahn – SUNY@Stony Brook
National Center for Toxicological Research, U.S. Food and Drug Administration
Animal Carcinogenicity Study
• Studies are conducted to assess the oncogenic potential of chemicals encountered in food or drugs, for the protection of public health
• Studies often involve testing the statistical significance of a dose-response relationship among dose (treatment) groups
• Various statistical tests for a dose-response relationship are available (Ahn and Kodell, 1998)
Animal Carcinogenicity Study
• The statistical analysis of animal carcinogenicity data and the controversy over cause-of-death (COD) assignment in the Peto test are current issues in the government-regulated pharmaceutical industry (Lee et al., 2002; STP Peto Analysis Working Group, 2001, 2002; U.S. FDA, 2001)
• Town Hall meetings were held in June 2001 and June 2002 at the annual meetings of the STP to discuss issues surrounding COD assignment and the implications of using the Peto test or the alternative Poly-3 test
• The opinions of a number of statisticians are collected in Lee et al. (2002)
Dose-Related Trend Tests
• Cochran-Armitage (CA) Trend Test (Cochran, 1954; Armitage, 1955)
• Detects a linear trend across dose groups in lifetime tumor incidence rates
• Does not require COD information
• Assumes under H0 that all animals are at equal risk of developing a tumor over the duration of the study
• This assumption fails in the presence of treatment-induced mortality unrelated to the tumor of interest
• The CA test is known to be sensitive to increases in treatment lethality and to fail to control the probability of a Type I error (Bailer & Portier, 1988; Mancuso et al., 2002; Moon et al., 2003)
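Cochran-Armitage Trend Test
• The CA test uses the tumor data pooled over the study duration for each group
• $Z_{CA} = \dfrac{\sum_i d_i (y_i - E_i)}{\sqrt{\hat{p}(1-\hat{p})\left[\sum_i N_i d_i^2 - \left(\sum_i N_i d_i\right)^2 / N\right]}}$
• $y_i$: observed number of animals with tumor in group $i$
• $E_i = N_i \hat{p}$: expected number of animals with tumor in group $i$, where $N_i$ is the number of animals in group $i$, $N = \sum_i N_i$ and $\hat{p} = \sum_i y_i / N$
• $d_i$: dose level in group $i$
• Under the null hypothesis of equal tumor incidence rates among groups, $Z_{CA}$ is asymptotically standard normal
• Survival time is not used
• Some treatments shorten overall survival, which decreases the risk of tumor onset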
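The pooled-count calculation described on this slide can be sketched in a few lines. This is a minimal illustration of the standard CA trend statistic, not code from the presentation; the function name and argument names are hypothetical, and the statistic is undefined when no animal or every animal has a tumor.

```python
import math

def cochran_armitage_z(doses, n_animals, n_tumors):
    """Cochran-Armitage trend statistic from pooled lifetime tumor counts.

    doses, n_animals, n_tumors are equal-length sequences, one entry per group.
    """
    total = sum(n_animals)
    p_hat = sum(n_tumors) / total  # pooled tumor rate under H0
    # Numerator: dose-weighted sum of observed minus expected tumor counts.
    num = sum(d * (y - n * p_hat) for d, n, y in zip(doses, n_animals, n_tumors))
    # Variance of the numerator under H0 (common binomial rate p_hat).
    sum_nd = sum(n * d for d, n in zip(doses, n_animals))
    sum_nd2 = sum(n * d * d for d, n in zip(doses, n_animals))
    var = p_hat * (1.0 - p_hat) * (sum_nd2 - sum_nd ** 2 / total)
    return num / math.sqrt(var)

# Hypothetical counts for four groups of 50 animals at dose levels 0, 1, 2, 4.
z = cochran_armitage_z([0, 1, 2, 4], [50, 50, 50, 50], [2, 4, 6, 10])
```

Because only the counts per group enter the calculation, two groups with the same tumor count but very different survival contribute identically, which is the weakness the survival-adjusted tests address.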
The Poly-k Trend Test
• An appropriate alternative to the Peto-type test
• No COD required
• Adopted by NTP as its official test for carcinogenicity
• A survival-adjusted quantal-response procedure that takes into account dose-group differences in intercurrent mortality (all deaths other than those resulting from a tumor of interest)
The Poly-k Trend Test
• Bailer & Portier (1988)
• Proposed the Poly-3 test, which adjusts the CA test by using a fractional weighting scheme
• Number at risk in group $i$: $r_i = \sum_j w_{ij}$, where $w_{ij}$ is the time-at-risk weight for the $j$th animal in group $i$: $w_{ij} = 1$ if the animal dies with the tumor, and $w_{ij} = (t_{ij}/t_{\max})^k$ otherwise, with $t_{ij}$ the time of death and $t_{\max}$ the study duration
• Replace $N_i$ with $r_i$ in calculating $Z_{CA}$
• First mentioned the Poly-k test without specifying how to obtain k
• Recommended k = 3 following an evaluation of the neoplasm onset time distribution in control F344 rats and B6C3F1 mice (Portier et al., 1986)
• The Poly-k test with the correct k has operating characteristics superior to those of the Poly-3 test
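The fractional weighting can be sketched as follows. The code assumes the usual poly-k weight (1 for an animal that dies with the tumor, (t / t_max)^k otherwise) and uses the CA-form variance with the adjusted group sizes substituted for the raw ones, as in the Bailer-Portier version; it does not implement the Bieler-Williams variance. All function names are hypothetical.

```python
import math

def poly_k_at_risk(death_times, has_tumor, t_max, k=3.0):
    """Adjusted number at risk for one group: sum of time-at-risk weights."""
    return sum(1.0 if tumor else (t / t_max) ** k
               for t, tumor in zip(death_times, has_tumor))

def poly_k_trend_z(doses, groups, t_max, k=3.0):
    """Bailer-Portier style trend statistic.

    groups: one (death_times, has_tumor) pair per dose group.
    """
    r = [poly_k_at_risk(t, h, t_max, k) for t, h in groups]
    y = [sum(1 for tumor in h if tumor) for _, h in groups]
    total_r = sum(r)
    p_hat = sum(y) / total_r  # pooled survival-adjusted tumor rate
    num = sum(d * (yi - ri * p_hat) for d, ri, yi in zip(doses, r, y))
    sum_rd = sum(ri * d for d, ri in zip(doses, r))
    sum_rd2 = sum(ri * d * d for d, ri in zip(doses, r))
    var = p_hat * (1.0 - p_hat) * (sum_rd2 - sum_rd ** 2 / total_r)
    return num / math.sqrt(var)

# A tumor-free animal dying at week 52 of a 104-week study counts as
# (52/104)^3 = 0.125 of an animal at risk when k = 3.
r_example = poly_k_at_risk([104, 52, 80], [False, False, True], t_max=104)
```

When every animal survives to the terminal sacrifice, all weights equal 1 and the statistic reduces to the unadjusted CA statistic.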
The Poly-k Trend Test
• Bieler & Williams (1993)
• Further modified the CA test by adjusting the variance estimate of the test statistic using the delta method (Woodruff, 1971)
• Showed that the Bailer-Portier (BP) Poly-3 test is anticonservative for low tumor incidence rates and for high treatment toxicity
• Characteristics of the BP Poly-3 test and the Bieler-Williams (BW) Poly-3 test can be found in Chen et al. (2000)
Objectives
• The Poly-k statistic is asymptotically normal under H0 of equal tumor incidence rates among groups (Bieler & Williams, 1993)
• This result is valid only if the correct value of k is used
• Develop a bootstrap resampling method to estimate the empirical distribution of the test statistic, and the corresponding critical value of the Poly-k test, while taking into account the presence of competing risks
Generalized Poly-k Test
• Moon et al. (2003)
• Proposed a method for estimating k for data with interval sacrifices (interim sacrifices and a terminal sacrifice)
• Estimate the poly-k based empirical lifetime cumulative tumor incidence rate, which is a function of k
• Estimate the cumulative tumor incidence rate by the method of Kodell & Ahn (1997)
• Equate the two estimates and solve for k
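The "equate and solve" step can be illustrated with a simple root search. This is only a sketch of the idea, not the estimator of Moon et al. (2003): the poly-k incidence is taken as tumor count divided by the weighted number at risk, the second estimate is passed in as a plain number (`target_rate`, a hypothetical stand-in for the Kodell & Ahn estimate), and the bracket [0, 20] for k is an arbitrary assumption.

```python
def poly_k_incidence(death_times, has_tumor, t_max, k):
    """Poly-k lifetime tumor incidence: tumor count / weighted number at risk."""
    at_risk = sum(1.0 if tumor else (t / t_max) ** k
                  for t, tumor in zip(death_times, has_tumor))
    return sum(1 for tumor in has_tumor if tumor) / at_risk

def solve_k(death_times, has_tumor, t_max, target_rate,
            k_lo=0.0, k_hi=20.0, tol=1e-8):
    """Find k at which the poly-k incidence equals target_rate (bisection).

    The incidence is non-decreasing in k, because the weights of tumor-free
    early deaths shrink as k grows.
    """
    def gap(k):
        return poly_k_incidence(death_times, has_tumor, t_max, k) - target_rate

    if gap(k_lo) > 0 or gap(k_hi) < 0:
        raise ValueError("target_rate is not attainable for k in the bracket")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        if gap(mid) < 0:
            k_lo = mid
        else:
            k_hi = mid
    return 0.5 * (k_lo + k_hi)
```

If no tumor-free animal dies before the end of the study, the incidence does not depend on k and the data carry no information about it; the bracket check then either fails or the search returns an arbitrary value, so such a case should be screened out beforehand.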
Generalized Poly-k Test
• Moon et al. (2005): bootstrap-based age-adjusted Poly-k test
• Improves the Poly-k test for data with a single terminal sacrifice
• Estimating k for single-sacrifice data is more difficult than for data with interval sacrifices, because no information is available on tumor development among live animals before the termination of the experiment
• Proposes bootstrap-based age-adjusted resampling to improve the Poly-k test, a modification of the permutation method of Farrar & Crump (1990), which was used for exact statistical tests
Bootstrap Method
• Resampling from the pooled data is suitable when the competing-risks survival rate (CRSR) is the same across dose groups
• When the CRSR differs across dose groups in the original data, bootstrap samples drawn from the pooled data satisfy the null distribution of equal tumor incidence rates across groups but may not reflect the CRSR of each group
• The bootstrap method must therefore be modified to preserve the survival rates in each dose group
• Develop an age-adjusted scheme
Age-adjusted Bootstrap Scheme
• Data set X = (x1, x2, …, xn), with observed test statistic T(X)
• Age-adjusted scheme: I(i,m); i = 1, …, G; m = 1, …, Mi
• Bootstrap samples: X*1, X*2, …, X*B
• Bootstrap replicates: T(X*1), T(X*2), …, T(X*B)
• Critical value CR(X): the 100(1-α)th percentile of the bootstrap replicates; reject H0 if T(X) ≥ CR(X)
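One way to read this scheme is as a bootstrap stratified by age at death. The sketch below is an interpretation under stated assumptions, not the algorithm of Moon et al. (2005): animals are pooled across dose groups within each age interval, and each animal slot in a group is refilled by a draw with replacement from the pooled animals of the same interval, so every group keeps its own number of deaths per interval while tumor status is exchangeable across groups. The interval cut points, the tuple layout `(group, death_time, has_tumor)` and all function names are hypothetical.

```python
import math
import random
from collections import defaultdict

def interval_index(t, cuts):
    """Index of the age interval containing death time t (cuts = upper bounds)."""
    for m, upper in enumerate(cuts):
        if t <= upper:
            return m
    return len(cuts) - 1

def age_adjusted_bootstrap_sample(animals, cuts, rng):
    """Resample (death_time, has_tumor) within age intervals, keeping groups."""
    pooled = defaultdict(list)
    for _, t, tumor in animals:
        pooled[interval_index(t, cuts)].append((t, tumor))
    sample = []
    for group, t, _ in animals:
        t_star, tumor_star = rng.choice(pooled[interval_index(t, cuts)])
        sample.append((group, t_star, tumor_star))
    return sample

def bootstrap_critical_value(animals, cuts, statistic, B=1000, alpha=0.05, seed=0):
    """100(1 - alpha)th percentile of the statistic over B bootstrap samples."""
    rng = random.Random(seed)
    reps = sorted(statistic(age_adjusted_bootstrap_sample(animals, cuts, rng))
                  for _ in range(B))
    return reps[min(B - 1, math.ceil((1 - alpha) * B) - 1)]
```

In use, `statistic` would be a function that computes the Poly-k trend statistic from a list of animals, and H0 would be rejected when its value on the original data is at least the returned critical value.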
Example
• Death times (in days) in a hypothetical animal carcinogenicity data set with 4 groups
Simulation Study
• Purpose: to evaluate the improvement of the proposed test in terms of its robustness to a variety of tumor onset distributions
• Typical bioassay design, following the standard designs of NTP
• 4 dose groups (dose levels: 0, 1, 2 and 4) of 50 animals each
• Experimental duration of 2 yrs
• A single terminal sacrifice at the end of the experiment
Simulation Study
• Tumor onset distributions: Weibull with shape parameter k = 1.5, 3.0 and 6.0
• Tumor rates: .05, .15 and .30 for the control group
• Size evaluation: tumor rates are the same across dose groups
• Power evaluation: tumor rates for the highest dose group by 104 weeks are 5, 3 and 2 times the background tumor rates of .05, .15 and .30, respectively
• CRSR (from NTP feeding studies, Haseman et al., 1998): (.6, .6, .6, .6); (.6, .5, .4, .3); (.6, .6, .5, .2); (.5, .5, .5, .2); (.5, .6, .5, .4); (.5, .7, .6, .4); (.5, .7, .6, .5)
• 5000 simulated data sets; α = .05 significance level
• For each data set, 5000 bootstrap samples
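The Weibull onset times in such a design can be generated by choosing the scale so that the probability of onset by week 104 equals the target tumor rate. The sketch below shows only that step, under the assumption of a two-parameter Weibull with CDF 1 - exp(-(t/scale)^shape); it does not reproduce the presenters' simulation code, and the function names are hypothetical.

```python
import math
import random

def weibull_scale(shape, tumor_rate, horizon=104.0):
    """Scale for which P(onset <= horizon) equals tumor_rate."""
    return horizon / (-math.log(1.0 - tumor_rate)) ** (1.0 / shape)

def simulate_onset_times(n, shape, tumor_rate, horizon=104.0, seed=0):
    """Draw n tumor onset times (in weeks) from the calibrated Weibull."""
    rng = random.Random(seed)
    scale = weibull_scale(shape, tumor_rate, horizon)
    # random.weibullvariate takes the scale first, then the shape.
    return [rng.weibullvariate(scale, shape) for _ in range(n)]

# Control group of 50 animals, shape 3.0, 15% tumor rate by week 104.
onsets = simulate_onset_times(50, shape=3.0, tumor_rate=0.15)
```

An onset time beyond week 104 means the animal does not develop the tumor within the study. A full simulation would additionally draw a competing-risks death time for each animal to match the CRSR settings, which is omitted here.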
Simulation Study
• Size and power evaluation with 5000 simulated data sets, 5000 bootstrap samples for each data set and a 5% nominal significance level
Example
• The 2-yr Gavage Study of Furan
• Furan (C4H4O), a clear and colorless liquid, serves primarily as an intermediate in the synthesis and preparation of numerous organic compounds (NTP, 1993)
• Toxicology and carcinogenesis studies were conducted by administering furan in corn oil by gavage to groups of F344/N rats and B6C3F1 mice of each sex for 2 yrs
• Furan was nominated by the NCI for evaluation of carcinogenic potential because of its large production volume and use, and because of the potential for widespread human exposure to a variety of furan-containing compounds
Example
• Female F344/N rats
• Evaluation of carcinogenic potential based on incidences of cholangiocarcinoma or hepatocellular neoplasms of the liver
• Groups of 50 rats were administered 2, 4 or 8 mg furan per kg body weight in corn oil by gavage 5 days per week for 2 yrs
• Male B6C3F1 mice
• Evaluation of carcinogenic potential based on incidences of adenocarcinoma or alveolar/bronchiolar adenoma of the lung
• Groups of 50 mice received doses of 8 or 15 mg/kg furan 5 days per week for 2 yrs
Test results on the carcinogenic activity of furan in female F344/N rats, based on increased incidences of cholangiocarcinoma and hepatocellular neoplasms of the liver (reject when T(X) ≥ CR(X))
a: The BWP3 (Bieler-Williams Poly-3) test statistic obtained from the data
b: Standard normal critical value at the significance level .05
c: Critical value estimated by the 95th percentile of the bootstrap replicates of T(X) from our method
• NTP concluded that, under the conditions of these 2-yr gavage studies, there was clear evidence of carcinogenic activity of furan in female F344/N rats based on increased incidences of cholangiocarcinoma and hepatocellular neoplasms of the liver
Test results on the carcinogenic potential of furan in male B6C3F1 mice, based on incidences of adenocarcinoma and alveolar/bronchiolar adenoma of the lung (reject when T(X) ≥ CR(X))
a: The BWP3 (Bieler-Williams Poly-3) test statistic obtained from the data
b: Standard normal critical value at the significance level .05
c: Critical value estimated by the 95th percentile of the bootstrap replicates of T(X) from our method
• Our test results agree with the conclusions from NTP
Significance
• The statistical analysis of tumorigenicity data from animal bioassays remains an important regulatory issue for FDA and the pharmaceutical industry
• The present research further refines the Poly-k test to make it more broadly competitive with the Peto test
• The improved Poly-k test for dose-related trend will be robust to a variety of tumor onset distributions
• It will control the false positive rate better than the Poly-3 test, and so perform better in identifying dose-related trends
• When no information on COD or tumor lethality is available and Peto's test cannot be implemented, the improved version can be used with confidence
References
• Ahn H, Kodell RL (1998). Analysis of long-term carcinogenicity studies. In Design and Analysis of Animal Studies in Pharmaceutical Development, Chow SC, Liu JP (eds). Marcel Dekker, Inc.: New York, 259-289.
• Armitage P (1955). Tests for linear trends in proportions and frequencies. Biometrics, 11, 375-386.
• Bailer AJ, Portier CJ (1988). Effects of treatment-induced mortality and tumor-induced mortality on tests for carcinogenicity in small samples. Biometrics, 44, 417-431.
• Bieler GS, Williams RL (1993). Ratio estimates, the delta method, and quantal response tests for increased carcinogenicity. Biometrics, 49, 793-801.
• Chen JJ, Lin KK, Huque MF, Arani RB (2000). Weighted p-value for animals carcinogenicity trend test. Biometrics, 56, 586-592.
• Cochran WG (1954). Some methods for strengthening the common χ² tests. Biometrics, 10, 417-451.
• Lee PN, Fry JS, Fairweather WR, Haseman JK, Kodell RL, Chen JJ et al. (2002). Current issues: statistical methods for carcinogenicity studies. Toxicologic Pathology, 30, 403-414.
• Mancuso JY, Ahn H, Chen JJ, Mancuso JP (2002). Age-adjusted exact trend tests in the event of rare occurrences. Biometrics, 58, 403-412.
• Moon H, Ahn H, Kodell RL, Lee JJ (2003). Estimation of k for the poly-k test. Statistics in Medicine, 22, 2619-2636.
• National Toxicology Program (1993). Toxicology and carcinogenesis studies of furan in F344/N rats and B6C3F1 mice (Gavage studies). NTP Technical Report, 402, Research Triangle Park, NC.
• STP Peto Analysis Working Group (2001). The Society of Toxicological Pathology’s position on statistical methods for rodent carcinogenicity studies. Toxicologic Pathology, 29(6), 670-672.
• STP Peto Analysis Working Group (2002). The Society of Toxicological Pathology’s recommendations on rodent carcinogenicity studies. Toxicologic Pathology, 30, 415-418.
• U.S. FDA (2001). Guidance for industry: statistical aspects of the design, analysis, and interpretation of chronic rodent carcinogenicity studies of pharmaceuticals. Federal Register, 66(89), 23266-23267.
• Woodruff RS (1971). A simple method for approximating the variance of a complicated estimate. Journal of the American Statistical Association, 66, 411-414.