
Biostatistics Case Studies 2007

Biostatistics Case Studies 2007. Session 5: Demonstrating Lack of Treatment Effect: Equivalence or Non-inferiority. Peter D. Christenson, Biostatistician. http://gcrc.labiomed.org/Biostat


Presentation Transcript


  1. Biostatistics Case Studies 2007 Session 5: Demonstrating Lack of Treatment Effect: Equivalence or Non-inferiority Peter D. Christenson Biostatistician http://gcrc.labiomed.org/Biostat

  2. Terminology
     • Superiority and/or Inferiority Study:
       • Two or more treatments are assumed equal and the study is designed to find overwhelming evidence of a difference.
       • Usually, one treatment is a control, sham, or placebo.
       • Most common comparative study type.
       • It is rare to assess only one of superiority or inferiority ("one-sided" statistical tests), unless one of them is biologically impossible.

  3. Terminology
     • Equivalence Study:
       • Two treatments are assumed to differ and the study is designed to find overwhelming evidence that they are equal.
       • Usually, the quantity of interest is a measure of biological activity or potency and "treatments" are drugs or lots or batches of drugs.
       • AKA bioequivalence.
       • Sometimes used to compare clinical outcomes for two active treatments, e.g., statins or vaccines, if neither treatment can be considered standard or accepted. This usually requires large numbers of subjects.

  4. Terminology
     • Non-Inferiority Study:
       • Usually a new treatment or regimen is compared with an accepted treatment or regimen or standard of care.
       • The new treatment is assumed inferior to the standard and the study is designed to show overwhelming evidence that it is at least nearly as good, i.e., non-inferior. It may have other advantages, e.g., oral vs. injected administration.
       • A negative inferiority study fails to detect inferiority, but does not necessarily give evidence for non-inferiority.
       • The accepted treatment is usually known to be efficacious already, but an added placebo group may also be used.
       • The distinguishing feature is an attempt to prove negativity, not the one-sidedness of the inference.

  5. Case Study

  6. p_ASA+PPI = 1.5%. Demonstrate: p_clop - p_ASA+PPI ≤ 4%. N = 145/group. Power = 80% for what?

  7. Typical Analysis: Inferiority or Superiority [Not used in this paper]
     H0: p_clop - p_ASA+PPI = 0%
     H1: p_clop - p_ASA+PPI ≠ 0% (H1 → therapies differ)
     α = 0.05; Power = 80% for Δ = |p_clop - p_ASA+PPI| = ?
     [Figure: 95% CIs for p_clop - p_ASA+PPI plotted against 0, one per possible conclusion]
     • Clop inferior: CI lies entirely above 0.
     • Clop superior: CI lies entirely below 0.
     • No difference detected*: CI includes 0.
     * and 80% chance that a Δ of (?) or more would be detected.

  8. Typical Analysis: Inferiority or Superiority [Not used in this paper]
     H0: p_clop - p_ASA+PPI = 0%
     H1: p_clop - p_ASA+PPI ≠ 0% (H1 → therapies differ)
     α = 0.05; Power = 80% for Δ = |p_clop - p_ASA+PPI| = ?
     Detectable Δ = 5.5% - 1.5% = 4%
     So, N = 331/group → 80% chance that a Δ of 4% or more would be detected.
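The N = 331/group figure can be approximately reproduced with the usual normal-approximation sample-size formula for a two-sided comparison of two proportions. This is a sketch under that assumption: the slides do not say which formula or software was used, and the function name is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group N for a two-sided test of H0: p1 - p2 = 0 (normal approximation)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)          # two-sided critical value
    z_beta = nd.inv_cdf(power)                   # quantile for the desired power
    p_bar = (p1 + p2) / 2                        # pooled proportion under H0
    sd_null = sqrt(2 * p_bar * (1 - p_bar))      # SD term under H0
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2)) # SD term under H1
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p1 - p2)) ** 2
    return ceil(n)

# Slide values: 1.5% under ASA+PPI, 5.5% under clopidogrel (difference of 4%).
n_superiority = n_per_group_two_proportions(0.015, 0.055)
print(n_superiority)
```

With the slide's inputs this formula gives a per-group N that matches the 331 quoted above; other approximations (e.g., with a continuity correction) would give somewhat different values.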

  9. Typical Analysis: Inferiority or Superiority [Not used in this paper]
     H0: p_clop - p_ASA+PPI = 0%
     H1: p_clop - p_ASA+PPI ≠ 0% (H1 → therapies differ)
     α = 0.05; Power = 80% for Δ = |p_clop - p_ASA+PPI| = 4%
     Note that this could be formulated as two one-sided tests (TOST):
     • Test 1: H0: p_clop - p_ASA+PPI ≤ 0%; H1: p_clop - p_ASA+PPI > 0% (H1 → clop inferior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 4%
     • Test 2: H0: p_clop - p_ASA+PPI ≥ 0%; H1: p_clop - p_ASA+PPI < 0% (H1 → clop superior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = -4%

  10. Demonstrating Equivalence [Not used in this paper]
      H0: |p_clop - p_ASA+PPI| ≥ E%
      H1: |p_clop - p_ASA+PPI| < E% (H1 → therapies "equivalent", within E)
      Note that this could be formulated as two one-sided tests (TOST), here with E = 4%:
      • Test 1: H0: p_clop - p_ASA+PPI ≤ -4%; H1: p_clop - p_ASA+PPI > -4% (H1 → clop non-superior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 0%
      • Test 2: H0: p_clop - p_ASA+PPI ≥ 4%; H1: p_clop - p_ASA+PPI < 4% (H1 → clop non-inferior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 0%
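The two one-sided tests on this slide can be sketched with simple Wald z-tests on the difference in proportions. This is an illustration only: the event counts below are hypothetical, the function name is invented for the example, and a Wald statistic is just one of several ways to test a difference in proportions against a margin.

```python
from math import sqrt
from statistics import NormalDist

def tost_two_proportions(x1, n1, x2, n2, margin, alpha=0.025):
    """TOST for equivalence of two proportions within +/- margin (Wald z-tests).

    Returns (p_non_superior, p_non_inferior, equivalent). Assumes the
    estimated standard error is nonzero (not both groups at 0% or 100%).
    """
    nd = NormalDist()
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Test 1: H0: diff <= -margin vs H1: diff > -margin (non-superior)
    p_non_superior = 1 - nd.cdf((diff + margin) / se)
    # Test 2: H0: diff >= +margin vs H1: diff < +margin (non-inferior)
    p_non_inferior = nd.cdf((diff - margin) / se)
    equivalent = p_non_superior < alpha and p_non_inferior < alpha
    return p_non_superior, p_non_inferior, equivalent

# Hypothetical counts: 15 events out of 1000 in each group, margin E = 4%.
large_study = tost_two_proportions(15, 1000, 15, 1000, margin=0.04)
# Hypothetical small study: 1 event out of 50 in each group.
small_study = tost_two_proportions(1, 50, 1, 50, margin=0.04)
print(large_study[2], small_study[2])
```

Equivalence is claimed only when both one-sided null hypotheses are rejected, which is why identical observed rates in a small study are not enough.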

  11. Demonstrating Equivalence
      H0: |p_clop - p_ASA+PPI| ≥ 4%
      H1: |p_clop - p_ASA+PPI| < 4% (H1 → equivalence)
      α = 0.05; Power = 80% for p_clop - p_ASA+PPI = 0
      [Figure: 95% CIs for p_clop - p_ASA+PPI on an axis marked at -4, 0, and 4, one per possible conclusion]
      • Clop non-superior: CI lies entirely above -4.
      • Clop non-inferior: CI lies entirely below 4.
      • Equivalence*: CI lies entirely between -4 and 4.
      * both non-superior and non-inferior.

  12. This Paper: Inferiority and Non-Inferiority
      Apparently, two one-sided tests (TOST), but only one explicitly powered:
      • Test 1: H0: p_clop - p_ASA+PPI ≤ 0%; H1: p_clop - p_ASA+PPI > 0% (H1 → clop inferior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = ?%
      • Test 2: H0: p_clop - p_ASA+PPI ≥ 4%; H1: p_clop - p_ASA+PPI < 4% (H1 → clop non-inferior); α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 0%
      The authors chose E = 4% as the largest difference between therapies at which they would still be considered equivalent.

  13. This Paper: Inferiority and Non-Inferiority
      [Figure: 95% CIs for p_clop - p_ASA+PPI on an axis marked at -4, 0, and 4, one per possible decision]
      Decisions:
      • Clop inferior: CI lies entirely above 0.
      • Clop non-inferior: CI lies entirely below 4.
      • "Non-clinical" inferiority*: CI lies entirely between 0 and 4.
      * clop is statistically inferior, but not enough for clinical significance.
      Observed Results: p_clop = 8.6%; p_ASA+PPI = 0.7%; 95% CI for p_clop - p_ASA+PPI = 3.4 to 12.4 → Clop inferior.
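The decision rules on this slide amount to comparing the ends of the 95% CI with 0 and with the margin. A minimal sketch follows; the function name and the "inconclusive" label for a CI that spans both 0 and the margin are assumptions added for the example, not terms from the slides.

```python
def classify_ci(lower, upper, margin=4.0):
    """Classify a 95% CI for p_clop - p_ASA+PPI (in percentage points)."""
    if lower > 0 and upper < margin:
        # Statistically inferior, but by less than the clinical margin.
        return "non-clinical inferiority"
    if lower > 0:
        return "inferior"
    if upper < margin:
        return "non-inferior"
    # CI includes 0 and reaches the margin: neither claim is supported.
    return "inconclusive"

# Observed result reported on the slide: 95% CI = 3.4 to 12.4.
observed = classify_ci(3.4, 12.4)
print(observed)
```

Applied to the observed interval, the rule returns "inferior", which agrees with the conclusion shown on the slide.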

  14. Power for Test of Clopidogrel Non-Inferiority
      H0: p_clop - p_ASA+PPI ≥ 4%
      H1: p_clop - p_ASA+PPI < 4% (H1 → clop non-inferior)
      α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 0%
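The paper's N = 145/group is consistent with a standard normal-approximation sample-size formula for a one-sided non-inferiority test, assuming both arms truly have a 1.5% event rate. The sketch below rests on that assumption; the formula actually used by the investigators is not stated in the slides, and the function name is illustrative.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_noninferiority(p_std, p_new, margin, alpha=0.025, power=0.80):
    """Per-group N to reject H0: p_new - p_std >= margin (one-sided)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)   # one-sided critical value
    z_beta = nd.inv_cdf(power)
    variance = p_std * (1 - p_std) + p_new * (1 - p_new)
    # Distance between the assumed true difference and the margin.
    distance = margin - (p_new - p_std)
    return ceil((z_alpha + z_beta) ** 2 * variance / distance ** 2)

# Slide values: 1.5% in both arms (true difference 0%), margin 4%.
n_noninf = n_per_group_noninferiority(0.015, 0.015, 0.04)
print(n_noninf)
```

The required N grows quickly as the margin shrinks, since the margin enters the formula squared in the denominator.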

  15. Power for Test of Clopidogrel Inferiority
      H0: p_clop - p_ASA+PPI ≤ 0%
      H1: p_clop - p_ASA+PPI > 0% (H1 → clop inferior)
      α = 0.025; Power = 80% for p_clop - p_ASA+PPI = 7.3%
      Detectable Δ = 8.8% - 1.5% = 7.3%
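As a rough check of the detectable difference, the power of the one-sided inferiority test at N = 145/group can be computed with a normal approximation. This is a sketch under assumed methods (pooled variance under H0, unpooled under H1); the slides do not say how the 7.3% was obtained, so only approximate agreement with 80% should be expected.

```python
from math import sqrt
from statistics import NormalDist

def power_inferiority(p_std, p_new, n_per_group, alpha=0.025):
    """Approximate power to reject H0: p_new - p_std <= 0 (one-sided)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    delta = p_new - p_std
    p_bar = (p_std + p_new) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))                       # under H0
    sd_alt = sqrt(p_std * (1 - p_std) + p_new * (1 - p_new))      # under H1
    z = (delta * sqrt(n_per_group) - z_alpha * sd_null) / sd_alt
    return nd.cdf(z)

# Slide values: 1.5% vs 8.8% (difference 7.3%), N = 145 per group.
power_at_7_3 = power_inferiority(0.015, 0.088, 145)
# For comparison: the 4% difference used as the non-inferiority margin.
power_at_4 = power_inferiority(0.015, 0.055, 145)
print(round(power_at_7_3, 2), round(power_at_4, 2))
```

Under these assumptions the power at a 7.3% difference comes out close to 80%, while the power at a 4% difference is considerably lower, which is the point of the slide.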

  16. Conclusions: This Paper
      • In this paper, clop was so inferior that investigators were apparently lucky to have enough power to detect it. With this N, the CI was too wide to detect a smaller therapy difference.
      • Investigators justify testing non-inferiority of clop only (and not of Aspirin + Nexium) by the lesser desirability of combination therapy (?).
      • This would have been a good approach to size and power for a new therapy competing against a standard, had the N for detecting clop inferiority been considered as well.
      • Note that power calculations were based on crude percentages of subjects, whereas cumulative 12-month incidence was used in the analysis. I know of no power calculations for equivalence tests using survival analysis.

  17. Conclusions: General
      • "Negligibly inferior" would be a better term than non-inferior.
      • All inference can be based on confidence intervals.
      • Pre-specify the comparisons to be made. You cannot test for both non-inferiority and superiority.
      • A study can be powered for only one comparison or for several, e.g., non-inferiority and inferiority. Power can be different for different comparisons.
      • Very careful consideration must be given to the choice of the margin of equivalence (4% here). The study is worthless if others in the field would find your margin too large.

  18. FDA Guidelines
      • http://www.fda.gov/cder/guidance/4155fnl.pdf
      • FDA has at least 4 major concerns:
        • Need strong evidence that standard treatment is effective.
        • Must have an acceptable margin of equivalence that is much smaller than the effect of the standard over placebo.
        • Trial design must be very close to that which established the effectiveness of the standard treatment.
        • Study conduct must be high quality. This sounds like business-speak about "excellence", but it really refers to the fact that superiority studies are by nature conservative: e.g., non-compliance and misclassification bias the results toward no effect. In a non-inferiority study the same flaws bias results in the same direction, toward no difference, which makes it easier to falsely "prove" the aim.

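  19. Appendix: Possible Errors in Study Conclusions
      Typical study to demonstrate superiority/inferiority
      • Study claims "No Effect":
        • Truth is H0 (No Effect): Correct. The probability of this is the specificity.
        • Truth is H1 (Effect): Error (Type II).
      • Study claims "Effect":
        • Truth is H0 (No Effect): Error (Type I).
        • Truth is H1 (Effect): Correct. The probability of this is the sensitivity (power).
      • Set α = 0.05, so specificity = 95%.
      • Power: maximize; choose N for 80%.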

  20. Appendix: Graphical Representation of Power
      Typical study to demonstrate superiority/inferiority
      [Figure: sampling distributions of the effect (Group B mean - Group A mean) under H0: true effect = 0 and under HA: true effect = 3, with N = 100 per group; effect in study = 1.13. Larger Ns give narrower curves.]
      • \\\ shaded area (5%) = probability of concluding HA if H0 is true.
      • /// shaded area (41%) = probability of concluding H0 if HA is true.
      • Power = 100 - 41 = 59%.
      • Note greater power if larger N, and/or if true effect > 3, and/or less subject heterogeneity.
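The dependence of power on N, true effect, and subject heterogeneity can be sketched with a two-sample z-test approximation. The slide does not give the subject-level standard deviation, so the value sd = 10 below is purely hypothetical and the resulting power is not expected to reproduce the 59% in the figure; the function name is also illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample_z(true_effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test on a difference in means."""
    nd = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)     # SE of the difference in group means
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = true_effect / se              # true effect in SE units
    # Probability that the test statistic falls in either rejection region.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

HYPOTHETICAL_SD = 10.0  # assumption: not given on the slide

base = power_two_sample_z(3, HYPOTHETICAL_SD, 100)
larger_n = power_two_sample_z(3, HYPOTHETICAL_SD, 200)
larger_effect = power_two_sample_z(4, HYPOTHETICAL_SD, 100)
less_heterogeneity = power_two_sample_z(3, 8.0, 100)
print(base, larger_n, larger_effect, less_heterogeneity)
```

Each of the three variations raises power relative to the base case, matching the note on the slide; with a true effect of 0 the function returns α, the Type I error rate.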

  21. Appendix: Online Study Size / Power Calculator www.stat.uiowa.edu/~rlenth/Power Does NOT include tests for equivalence or non-inferiority or non-superiority
