Statistical Issues With Linear Extrapolation of Stability Data

Jason Marlin, MS/T Statistics, Eli Lilly & Co.

Define OOT & PTE • OOT—Out of Trend: the FDA considers any atypical test result or slope to be an “out of trend.” • PTE—Project to Exceed (Lilly term): linear extrapolation of stability data that indicates a lot may exceed the regulatory limit prior to expiry
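
As a minimal sketch of the PTE idea defined above: fit a straight line to one lot's stability results and check whether the line, extended to expiry, falls outside a limit. The function names, the example data, and the use of a plain point-estimate projection (rather than a confidence bound) are assumptions for illustration, not the Lilly procedure.

```python
def fit_line(months, values):
    """Ordinary least-squares slope and intercept for one lot."""
    n = len(months)
    mean_t = sum(months) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in months)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(months, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def projects_to_exceed(months, values, expiry, lower=None, upper=None):
    """True if the fitted line, extended to expiry, is outside a limit.

    Assumption: a point-estimate projection evaluated at expiry. Because a
    line is monotonic, this also covers "prior to expiry" whenever the
    fitted value at time zero is inside the limits.
    """
    slope, intercept = fit_line(months, values)
    projected = intercept + slope * expiry
    if lower is not None and projected < lower:
        return True
    if upper is not None and projected > upper:
        return True
    return False


# Hypothetical lot (not the coded data from the slides), % of label claim
months = [0, 3, 6, 9]
values = [100.0, 99.0, 98.0, 97.0]
print(projects_to_exceed(months, values, expiry=36, lower=90.0))
```

With only a few early timepoints the projection distance is long, which is why the slope estimate dominates the outcome.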

FDA Resource for OOT and PTE • From Q1E: The Evaluation of Stability Data (June 2004) • The FDA expects industry to monitor stability batches and to react to trends or changes. • Where appropriate, determine whether batches may not stay within the regulatory specifications during shelf-life. • Inform the agency in advance if a product is expected to fail. • Modeling an appropriate fit is clearly important to this work.

Additional Resources • Published reference papers: • April 2003, Identification of Out of Trend Stability Results, by PhRMA CMC Statistics Stability Expert Teams, published in Pharmaceutical Technology • October 2005, Identification of Out-of-Trend Stability Results, Part II, by PhRMA CMC Statistics Stability Expert Teams, published in Pharmaceutical Technology • These papers establish three types of OOT data: • 1) Analytical—a single result out of trend but within specification • 2) Process Control—a succession of data points with an atypical pattern • 3) Compliance—a single result or succession of results indicates the likelihood for an Out Of Specification prior to Expiry

PTE Resource • From April 2003, Identification of Out of Trend Stability Results • “The extrapolation of OOT should be limited and scientifically justified, just as the use of extrapolation of stability data is limited in regulatory guidance (ICH, FDA).”

WIIFM • Every pharmaceutical company is required to evaluate stability data for trends and respond accordingly. • Every product has different challenges, e.g., assay (RMSE of common slopes), process mean, markets, possibly different stability timepoints, possibly different sites of manufacture, etc. • Intent: provide some measure of accounting for these differences when evaluating stability data for projection to exceed. • Not covered: a method for controlling the Type II error rate, namely failing to signal a batch that will exceed the regulatory limit prior to expiry. (A rough simulation has been performed, but further refinement is needed.)

WIIFM, continued • PTEs are often evaluated by a stability coordinator (often a non-statistician). • A false signal means wasted resources (deviation, investigation, re-assay, etc.). • Generic computer algorithms don’t account for necessary variables (how many batches are manufactured, assay, process mean, etc.). • Thus, some “decision science” is needed to allow the coordinator to determine how many stability timepoints are required to obtain a reliable projection.

Example: Actual Product (w/ coded data) • 7 stability batches, all with data through at least 50% of expiry

Example: Actual Product (w/ coded data), continued • A lot with 6 timepoints is flagged as PTE; the 6th timepoint is at 18 months. • [Figure: stability data with the linear projection extended to expiry]

Sources of OOT • Reasons for a false signal: • Non-linear data • Limited data • Process mean deviation from label claim • Assay variability • What data requirements are needed to limit the probability of a false signal? • Problem: for products with no practical change on stability and relatively high assay variability, three or four points are not sufficient to prevent false signals. This is the question the simulation addresses.

Simulation • Properties with Single-sided and Dual-sided limits are evaluated. • For single-sided limits, the mean bias and the assay standard deviation are expressed as a portion of the regulatory limit • For dual-sided limits, the mean bias and the regulatory limit are expressed in actual units as % of label claim but they can also be expressed as k-sigma units of the assay standard deviation • For the sake of simplicity, both single-sided and dual-sided properties are expressed in k-sigma units as “distance to nearest specification.” This eliminates the need for presenting two sets of results.

Simulation, continued • [Figure panels: Two-sided Limit; One-sided Limit]

Simulation, continued • Population means and assay standard deviations (assumed known) are simulated relative to target. • For single-sided and dual-sided limits, 1500 simulations are performed for 40 different combinations of true mean, regulatory limits and assay standard deviation. • Assumptions: 1) No change on stability 2) 36-month expiry 3) Typical stability timepoints (0, 3, 6, 9, 12, 18, 24, 30 months) • Note: a reduction in alpha (α), all else held constant, could affect beta (β). • **The assumption that there is no change on stability MUST be based on the science of the molecule/packaging and supported by available stability data.

Errors in PTE • Type I—designating a batch as PTE when the true property is not outside regulatory limits at expiry. • Type II—failing to detect a batch with a true property value outside regulatory limits at or before expiry.

Goal • In general, identify the minimum n (number of stability timepoints) such that no PTE is issued more than 95% of the time for properties that are practically stable. • The minimum n was also evaluated for 90% and 99%. • Allow manufacturing sites to select a minimum number of stability timepoints to achieve the desired alpha. • Allow manufacturing sites to see the impact of mean bias and assay standard deviation on the likelihood of a false PTE. • Target products are those with large assay error and with process means close to the regulatory limits (small R values).
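
The goal above can be illustrated with a small Monte Carlo sketch: simulate a stable property (no change on stability), fit a line to the first n timepoints, project to 36 months, and count how often the lot is not flagged. This is not the JMP simulation from the talk. The one-sided limit placed R assay standard deviations above the true mean, the point-estimate projection rule, and the function name `pct_not_pte` are all assumptions, so the numbers are not expected to match the slides.

```python
import random

SCHEDULE = [0, 3, 6, 9, 12, 18, 24, 30]  # months, from the stated assumptions
EXPIRY = 36                              # months


def pct_not_pte(r, n, sims=1500, seed=0):
    """Percent of simulated stable lots NOT flagged as PTE.

    r: distance from the true mean to a one-sided upper limit, in units of
       the assay standard deviation (assumed interpretation of R).
    n: number of timepoints used, taken from the start of SCHEDULE.
    """
    rng = random.Random(seed)
    times = SCHEDULE[:n]
    mean_t = sum(times) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    not_flagged = 0
    for _ in range(sims):
        # True mean 0, true slope 0, assay sd 1: everything is in sigma units.
        y = [rng.gauss(0.0, 1.0) for _ in times]
        mean_y = sum(y) / n
        slope = sum((t - mean_t) * (v - mean_y) for t, v in zip(times, y)) / sxx
        projected = mean_y + slope * (EXPIRY - mean_t)
        if projected <= r:
            not_flagged += 1
    return 100.0 * not_flagged / sims


for n in (4, 5, 6, 7):
    print(n, round(pct_not_pte(r=4.0, n=n), 1))
```

Scanning n for a fixed R, and picking the smallest n that clears the desired percentage, mirrors the selection step described in the goal.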

Brief JMP Simulation Overview

R Question: Is R=4 the same for different combinations? • R1 = (100 − 90)/2.5 = 4 • R2 = (99 − 95)/1 = 4 • Simulation shows the results are effectively the same for a given n.
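
A short sketch of the R calculation, reading R as the distance from the process mean to the nearest regulatory limit divided by the assay standard deviation. Treating the slide's numbers as (mean, limit, standard deviation) is an assumed interpretation, and the function name is hypothetical.

```python
def distance_to_nearest_spec(mean, sd, lower=None, upper=None):
    """R: distance from mean to the nearest limit, in assay-sd (k-sigma) units."""
    distances = []
    if lower is not None:
        distances.append(mean - lower)
    if upper is not None:
        distances.append(upper - mean)
    if not distances:
        raise ValueError("at least one limit is required")
    return min(distances) / sd


# The two combinations from the slide, read as mean / lower limit / sd
r1 = distance_to_nearest_spec(mean=100.0, sd=2.5, lower=90.0)
r2 = distance_to_nearest_spec(mean=99.0, sd=1.0, lower=95.0)
print(r1, r2)
```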

R, n=4 (0, 3, 6, 9, 12, 18, 24 months) • Each point represents a different combination of assay, process mean and regulatory limit. • [Figure: Bivariate Fit of % Not PTE By R, n=4; RMSE=2.79%]

R, n=5 (0, 3, 6, 9, 12, 18, 24 months) • [Figure: Bivariate Fit of % Not PTE By R, n=5; RMSE=2.16%]

R (4, 5, 6 & 7 stability timepoints)

R Summary Tables • [Tables: one for timepoints 0, 3, 6, 9, 12, 18, 24 months; one for timepoints 0, 3, 6, 12, 18, 24, 30 months]

Summary • Simulation provides a method to protect against false PTEs for products that are stable. • Where actual product stability changes occur, increasing the number of data points required for a PTE evaluation can increase the likelihood of failing to detect a PTE for a product that will have a true property value outside the regulatory specifications. • PTE evaluations need to account for assay variability, mean bias and the scientific basis of product stability.

Summary • Next steps: simulate with a linear slope to determine at what expense the reduction in false PTEs (by increasing n) is achieved. • Determine how the PTE changes as a result of true change on stability. • Initial look (slopes of 0.05%/month and 0.10%/month): for 36-month dating and a product with ±10% regulatory limits, these slopes would not result in a true property value outside the regulatory limits as long as the mean was within 2% of target.
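
The arithmetic behind the initial look can be checked directly: the worst-case offset from target at expiry is the mean bias plus the slope times 36 months. The function name is hypothetical, and the sketch assumes the slope moves the property in the same direction as the bias, which is the worst case.

```python
def offset_at_expiry(mean_bias, slope_per_month, expiry_months=36):
    """Worst-case true offset from target (% of label claim) at expiry."""
    return abs(mean_bias) + abs(slope_per_month) * expiry_months


LIMIT = 10.0  # +/-10% regulatory limits, from the slide
for slope in (0.05, 0.10):
    offset = offset_at_expiry(mean_bias=2.0, slope_per_month=slope)
    print(f"slope {slope:.2f}%/month -> offset {offset:.1f}% (limit {LIMIT:.0f}%)")
```

Both cases stay well inside ±10%, consistent with the statement on the slide.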

Talking Points • Distance of projection to expiry is very important. • Alpha should be a function of the number of lots manufactured and historical data. • In general, 4 stability timepoints are marginal for a 95% probability of no false PTE. • Beta is dependent on the assumption of stability.

Questions

One-Sided Simulation • For different true but unknown property mean values and for different true but unknown assay variabilities, simulate the probability of PTE given the seven assumed stability timepoints for two different stability evaluations: • 0, 3, 6, 9, 12, 18 and 24 months (standard validation timepoints) • 0, 3, 6, 12, 18, 24 and 30 months (alternative timepoint approach) • 1500 simulations are performed for 20 different combinations of true mean and assay standard deviation • Assay stdev levels = 10, 20, 30, 40 and 50% of regulatory limit • Mean bias levels = 0, 5, 10, 20 and 25% of regulatory limit • Regulatory level is not a factor for the one-sided simulation, as both variables are expressed in terms of a given regulatory limit (this is verified by simulation to work for different regulatory limits) • Assumptions: 1) No change on stability 2) 36-month expiry 3) Typical stability timepoints 4) A reduction in alpha (α), all else held constant, could affect beta (β).

Two-Sided Simulation • For different true, but unknown property mean values (in % of label claim but also easily expressed in k-sigma units from target) and for different true but unknown assay variabilities, simulate the probability of PTE given the seven assumed stability time-points for two different stability evaluations: • 0, 3, 6, 9, 12, 18 and 24 months (standard validation timepoints) • 0, 3, 6, 12, 18, 24 and 30 months (alternative timepoint approach) • 1500 simulations are performed for 40 different combinations of true mean, regulatory limits and assay standard deviation • Assay stdev levels=0.50, 0.75, 1.00, 1.25 and 1.50 % of target • Mean Bias Levels=0.0, 0.5, 1.0, and 1.5 % from target • Regulatory Levels=±5% from target, ±10% from target • Assumptions: 1) No Change on Stability 2) 36-month expiry 3) Typical Stability timepoints 4) A reduction in alpha (α)—all else held constant—could affect beta (β).
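
The two-sided design above can be enumerated as a full factorial grid, which gives the stated 40 combinations (5 assay levels × 4 bias levels × 2 regulatory levels). The sketch below also converts each combination to k-sigma units as R = (limit − bias) / stdev; that formula is an assumed reading of "distance to nearest specification", and the variable names are illustrative.

```python
from itertools import product

ASSAY_SD = [0.50, 0.75, 1.00, 1.25, 1.50]  # % of target
MEAN_BIAS = [0.0, 0.5, 1.0, 1.5]           # % from target
REG_LIMIT = [5.0, 10.0]                    # +/- % from target

# One entry per simulated combination, with its distance to the nearest
# limit expressed in assay-sd units (assumed definition of R).
grid = [
    {"sd": sd, "bias": bias, "limit": limit, "R": (limit - bias) / sd}
    for sd, bias, limit in product(ASSAY_SD, MEAN_BIAS, REG_LIMIT)
]

print(len(grid))
r_values = sorted(g["R"] for g in grid)
print(round(r_values[0], 2), round(r_values[-1], 2))
```

Expressing every cell as a single R value is what allows the one-sided and two-sided results to share one set of summary tables.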

Summary Table for One-sided and Two-sided

Summary graphs for R by simulation • [Figures: one for timepoints 0, 3, 6, 9, 12, 18, 24 months; one for timepoints 0, 3, 6, 12, 18, 24, 30 months]
