Safety Data Mining of Hepatotoxic Signals: Using Data Mining to Systematically Identify Hepatotoxic Signals in the Pres

1. 1 Safety Data Mining of Hepatotoxic Signals: Using Data Mining to Systematically Identify Hepatotoxic Signals in the Presence of Noise

2. 2 Overview
   - What is safety data mining?
   - Why use it?
   - How it identifies "signals"
   - Interpreting the meaning of a signal
   - Does safety data mining provide new opportunities for analysis?

3. 3 What is Safety Data Mining?
   - Use of computer algorithms to systematically and objectively analyze records contained in huge drug safety databases.
   - Goal: to discover hidden, interesting patterns of unexpected adverse drug occurrences.

4. 4 Why Use Safety Data Mining?
   - Drug safety databases often have millions of reports.
   - These databases continue to grow each year.
   - How can we objectively analyze these reports in a timely manner?
   - In other words, how do we find a needle in a haystack?

5. 5 Safety Data Mining at FDA
   - Algorithm: Multi-Item Gamma Poisson Shrinker (MGPS)
   - Database: FDA's Adverse Event Reporting System (AERS)

6. 6 MGPS
   - Developed by William DuMouchel [1,2] (AT&T)
   - Through statistical modeling, identifies signals of higher-than-expected drug-event combinations
   - Handles complex stratification (>945 categories)
   - Can also handle triples, quadruples, quintuples, etc.
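The "shrinker" idea can be illustrated with a much simpler model than MGPS itself. The sketch below assumes a single Gamma(alpha, beta) prior on the true reporting ratio and a Poisson count; MGPS as described by DuMouchel uses a mixture of two gamma priors whose parameters are estimated from the data, so this is only an approximation of the principle. The function names and the prior values alpha = beta = 1 are hypothetical choices for illustration.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Shift x upward until the asymptotic expansion is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def shrunk_ratio(n: int, e: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Posterior geometric mean of the reporting ratio.

    Assumes N ~ Poisson(lambda * E) and lambda ~ Gamma(alpha, rate=beta),
    so the posterior is Gamma(alpha + N, rate=beta + E) and
    exp(E[log lambda]) = exp(digamma(alpha + N) - log(beta + E)).
    This single-gamma prior is a simplification, not the MGPS model.
    """
    return math.exp(digamma(alpha + n) - math.log(beta + e))


# One report where 0.01 were expected: the raw ratio looks alarming,
# but the shrunk estimate stays close to the prior.
raw_small = 1 / 0.01
shrunk_small = shrunk_ratio(1, 0.01)

# With many reports the data dominate and shrinkage is slight.
raw_large = 1000 / 100
shrunk_large = shrunk_ratio(1000, 100)

print(raw_small, round(shrunk_small, 2))
print(raw_large, round(shrunk_large, 2))
```

The design point is that a raw observed/expected ratio built on a handful of reports is pulled strongly toward the prior, while a ratio built on many reports is left almost unchanged.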

7. 7 Adverse Event Reporting System (AERS)
   - Created by FDA in 1968
   - Contains ~2.3 million reports
   - ~300,000 new reports submitted each year (approximately 1,000 reports per day!)
   - > 9,000 event codes (MedDRA Preferred Terms)
   - > 7,000 drug/biological products by trade names
   - > 3,000 by generic names (generic names + combination products), from health professionals, suspect products only
   - > 63,000,000 drug-event combinations possible!
   - For all possible quadruples (e.g., drug-drug-drug-event or drug-drug-event-event), 20,000,000,000,000,000 combinations are possible!


9. 9 The 4 Components of FDA's Safety Data Mining System
   - AERS database
   - MGPS data mining algorithm
   - The WebVDME data mining software: modern, easy-to-use, web-based interfaces that support data-intensive applications
   - Specialized graphic visualization tools: help organize, identify, and interpret patterns in the data

10. 10 Terminology
   - Expected count (E): computed from the proportions of reports for the drug and for the event within each stratum of the MedWatch database, then summed across all strata.
     Example: if fluoxetine appears in 37,443 reports and "Hepatic Function Abnormal" in 19,982, and the whole database contains 2.3 million reports, then the expected count for the fluoxetine-"Hepatic Function Abnormal" combination is approximately 323 reports ((37,443 x 19,982) / 2.3 million). Summing across the 945 strata gives an expected value of 622 reports.
   - EBGM (Empirical Bayes Geometric Mean): the adjusted N/E (observed/expected) ratio after modeling.
     Example: if EBGM = 3.9 for acetaminophen-hepatic failure, then this drug-event combination occurred in the data 3.9 times more frequently than expected.
   - EB05: the estimated lower (one-sided) 95% "confidence limit" for the EBGM.
     Example: if EB05 = 20, then the drug-event combination occurred AT LEAST 20 times more frequently in AERS than expected.
   - EB05 and EB95: the lower and upper bounds of the two-sided 90% confidence interval around the EBGM.
   - Data mining signal ("threshold interval"): EB05 >= 2, meaning the drug-event combination occurred AT LEAST twice as often as expected. This threshold gives assurance that potential signals are unlikely to be noise.
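The expected-count arithmetic and the EB05 >= 2 rule above can be sketched in a few lines. The function names and the two-stratum toy data are assumptions made for illustration; only the fluoxetine figures come from the slide. Note that with the rounded total of 2.3 million the product formula gives about 325 rather than 323, presumably because the slide used the exact database size.

```python
def expected_count(drug_reports: int, event_reports: int, total_reports: int) -> float:
    """Expected drug-event count if drug and event were reported independently."""
    return drug_reports * event_reports / total_reports


def stratified_expected(strata) -> float:
    """Sum the independence-based expected counts over strata.

    Each stratum is a tuple (drug_reports, event_reports, total_reports).
    """
    return sum(expected_count(d, e, t) for d, e, t in strata if t > 0)


def is_signal(eb05: float, threshold: float = 2.0) -> bool:
    """Apply the slide's signal threshold: EB05 >= 2."""
    return eb05 >= threshold


# Pooled (unstratified) figures from the slide.
pooled = expected_count(37_443, 19_982, 2_300_000)
print(round(pooled, 1))

# Hypothetical two-stratum example showing that the stratified sum
# can differ from the pooled value computed on the same totals.
toy_strata = [(100, 50, 1_000), (200, 10, 4_000)]
toy_stratified = stratified_expected(toy_strata)
toy_pooled = expected_count(300, 60, 5_000)
print(toy_stratified, toy_pooled)

print(is_signal(20.0), is_signal(1.4))
```

The toy strata show why the slide reports two different expected values (323 pooled, 622 stratified): when a drug and an event are both concentrated in the same strata, summing per-stratum expectations yields a larger baseline than the pooled calculation.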


12. 12 Reported Liver Events in AERS
   - Over 88,000 (3.7%) of the 2.3 million reports in AERS have liver events.
   - Over 140,000 (3%) of the 4.6 million unique drug-event (D-E) combinations by year that exist in these reports have liver events.

13. 13 Visualization of Data Results


25. 25 Interpreting the Meaning of Signals

26. 26 Investigating the Impact of Potential Bias on the Hepatotoxic Signals of a Given Drug
   - Combining and non-combining event terms or drug terms
   - Trade and generic names
   - Suspect drug vs. suspect + concurrent drugs
   - Serious vs. non-serious outcomes
   - Stratifying by:
     - Age, gender, year
     - Report source (health professionals vs. consumers)
     - U.S. vs. non-U.S. reports
     - Wider and narrower stratification categories (1-year vs. 5-year intervals)
   - We are not analyzing causality
   - We are not analyzing sentinel cases
   - Misclassification can occur due to coding practices
   - We are not analyzing the severity of the liver injury yet!

27. 27 Analyzing Hepatotoxicity of an Index Drug by Comparing Confidence Limits
   - Comparing non-overlapping and overlapping EBGM and (EB05, EB95) values of drug-event combinations of an index drug using:
     - "big ticket" hepatotoxins
     - drugs having the same indications
     - adverse event data from different time periods (earliest years vs. latest years)
   - Identifying a polytherapy bias adjustment factor ("innocent bystander" effect)
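A minimal sketch of the interval comparison described above, assuming each drug-event combination is summarized by its EBGM and its (EB05, EB95) bounds. The `Score` type, the function name, and all numeric values are hypothetical and are not taken from AERS.

```python
from typing import NamedTuple


class Score(NamedTuple):
    """Data mining scores for one drug-event combination."""
    drug: str
    ebgm: float
    eb05: float
    eb95: float


def intervals_overlap(a: Score, b: Score) -> bool:
    """True if the (EB05, EB95) intervals of a and b share any point."""
    return a.eb05 <= b.eb95 and b.eb05 <= a.eb95


# Hypothetical scores for the same liver event term.
index_drug = Score("index drug", ebgm=3.1, eb05=2.4, eb95=4.0)
known_hepatotoxin = Score("known hepatotoxin", ebgm=8.5, eb05=7.2, eb95=9.9)
same_indication = Score("same-indication drug", ebgm=3.6, eb05=2.9, eb95=4.5)

for other in (known_hepatotoxin, same_indication):
    status = "overlap" if intervals_overlap(index_drug, other) else "do not overlap"
    print(f"{index_drug.drug} vs. {other.drug}: intervals {status}")
```

A non-overlapping result only says that the reported frequencies differ more than chance would suggest; it does not by itself establish that one drug is more hepatotoxic than the other.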


29. 29 Can Scores Be Compared?
   - Non-overlapping (EB05, EB95) intervals can be explained by the proportional frequency of reported events being higher for one drug than for another, indicating more "lack of independence".
   - Non-overlapping (EB05, EB95) intervals can provide some information about the relative toxicity between drugs, though the exact degree is not known.
   - A drug is not proven to be more or less toxic than another simply because of EBGM scores or overlapping or non-overlapping (EB05, EB95) intervals.

30. 30 Comparing Scores is Controversial
   - Reports are spontaneous
   - Scores indicate associations, not causality
   - Confounders exist, such as "innocent bystanders"

31. 31 White paper: PhRMA-FDA Best Safety Data Mining Practices Working Group
   Membership:
   - Chair: June Almenoff (GSK), june.s.almenoff@gsk.com
   - FDA members: A. Szarfman, J. Tonning, M. Johnston, L. Furlong, R. Ouellet-Hellstrom, P. Nourjah, M. Pitts, S. Gitterman, S. Comfort, G. Rochester (CDER), R. Ball (CBER), G. Pennello (CDRH)
   - DoD member: Trinka Coster (WRAIR)
   - PhRMA companies: Allergan, Abbott, AstraZeneca, Bristol-Myers Squibb, GlaxoSmithKline, Lilly, Merck, Novartis, Schering, Pfizer, Roche, Wyeth
   Mission: To develop a consensus view of Best Safety Data Mining Practices

32. 32 Summary
   Safety data mining can:
   - Provide clues to potential drug safety issues
   - Signal important information that might be missed if a pattern is not expected
   - Identify potential "risk" factors for specific adverse events
   - Predict future trends or behaviors (e.g., of drugs in the same class)
   - Enable decision-makers to make proactive, knowledge-driven decisions
   Other advantages:
   - Does not require exposure or background data
   - Useful for generating hypotheses that can be further tested in clinical and pharmacoepidemiology studies
   Signals ≠ causality!

33. 33 "Data Mining is not intended to replace current pharmacovigilance techniques, but to enhance them." Szarfman et al., Drug Safety (2002), 25, 381

34. 34
   [1] DuMouchel W, Pregibon D. Empirical Bayes screening for multi-item associations. Proceedings of the conference on knowledge discovery and data; 2001 Aug 26-29; San Diego (CA): ACM Press: 67-76.
   [2] Szarfman A, Machado SG, O'Neill RT. Use of Screening Algorithms and Computer Systems to Efficiently Signal Higher-Than-Expected Combinations of Drugs and Events in the US FDA's Spontaneous Reports Database. Drug Safety 2002; 25:381-392.

35. 35 References
   - Levine J.G., Szarfman A. Standardized Data Structures and Visualization Tools: A way to accelerate the regulatory review of the integrated summary of safety of new drug applications. Biopharmaceutical Report, 4(3):12-7, 1996
   - Szarfman A., Talarico L., Levine J.G. Chapter 4.21. Analysis and Risk Assessment of Hematological Data from Clinical Trials. In Volume 4, Toxicology of the Hematopoietic System, In: Comprehensive Toxicology. 4:363-79, 1997. Editors-in-chief: I. Glenn Sipes, Charlene A. McQueen, A. Jay Gandolfi. Elsevier Science Inc.
   - Hand DJ. Datamining: Statistics and more? The American Statistician. 1998;2:112
   - DuMouchel W. Bayesian Data Mining in Large Frequency Tables, With an Application to the FDA Spontaneous Reporting System. The American Statistician. 53:177-90, 1999
   - O'Neill R.T., Szarfman A. Discussion: Bayesian Data Mining in Large Frequency Tables, With an Application to the FDA Spontaneous Reporting System by William DuMouchel. The American Statistician. 53:190-6, 1999
   - Louis TA, Shen W. Discussion: The American Statistician. 53:196-8, 1999
   - Madigan D. Discussion: The American Statistician. 53:198-200, 1999
   - DuMouchel W. Reply. The American Statistician. 53:201-202, 1999
   - Szarfman A. Assessing Gender Effects from a Large Spontaneous Reporting Database. Annual Industry-FDA Statistics Workshop. Abstract. October 1999
   - Szarfman A. Discussion: A report on the activities of the adverse events working groups: Focus on improving the detection of rare but serious adverse events. 1999 Proceedings of the Biopharmaceutical Section. American Statistical Association, pages 12-4
   - Szarfman A. http://www.fda.gov/cder/present/ispe-1999/default.htm
   - Szarfman A. The Application of Bayesian Data Mining and Graphic Visualization Tools to Screen FDA's Spontaneous Reporting System Database. 2000 Proceedings of the Section on Bayesian Statistical Science. American Statistical Association, pages 67-71
   - O'Neill R.T., Szarfman A. Some FDA Perspectives on Data Mining for Pediatric Safety Assessment. Workshop on Adverse Drug Events in Pediatrics. Curr Ther Res Clin Exp. 62:650-63, 2001
   - Szarfman A., Machado S.G., O'Neill R.T. Use of Screening Algorithms and Computer Systems to Efficiently Signal Higher-Than-Expected Combinations of Drugs and Events in the US FDA's Spontaneous Reports Database. Drug Safety 25:381-92, 2002
