
Issues in the Practical Application of Data Mining Techniques to Pharmacovigilance

A. Lawrence Gould, Merck Research Laboratories, May 18, 2005


Presentation Transcript


  1. Issues in the Practical Application of Data Mining Techniques to Pharmacovigilance. A. Lawrence Gould, Merck Research Laboratories, May 18, 2005

  2. Spontaneous AE Reports • Clinical trial safety information is incomplete • Few patients -- rare events likely to be missed • Not necessarily ‘real world’ • Need info from post-marketing surveillance & spontaneous reports: Pharmacovigilance • Carried out by skilled clinicians & epidemiologists • Long history of research on issue, e.g., Finney (1974, 1982), Royall (1971), Inman (1970), Napke (1970)

  3. Signal Generation: The Traditional Method [Flowchart; node labels: Patient Exposure; Comparative Data; Consult Marketing; Consult Database; Consultation; Single suspicious case or cluster; Potential Signals; Refined Signal(s); Action; Integrate Information; Identify Potential Signals; Consult Literature; Consult Programmer; Statistical Output; Background Incidence]

  4. Some Limitations of Traditional Approach • Incomplete reports of events, not reactions • How to compute effect magnitude • Many events reported, many drugs reported • Bias & noise in system • Difficult to estimate incidence because number of patients at risk and patient-years of exposure are seldom reliable • Inappropriate to consider incidence using only spontaneous reports

  5. The Pharmacovigilance Process [Diagram; labels: Traditional Methods; Data Mining; Detect Signals; Generate Hypotheses; Insight from Outliers; Public Health Impact, Benefit/Risk; Refute/Verify; Type A (Mechanism-based); Estimate Incidence; Act; Inform; Type B (Idiosyncratic); Restrict use/withdraw; Change Label]

  6. Major Uses of Data Mining • Identify subtle associations that might exist in large databases • Early identification of potential toxicities • Identify complex relationships not apparent by simple summarization • Screening tool to identify potential associations to undergo clinical/epidemiological followup

  7. More to Pharmacovigilance than Data Mining • Data mining is a refinement to discover subtleties • Still need initial case review: respond to reports involving severe, potentially life-threatening events, e.g., Stevens-Johnson syndrome, agranulocytosis, anaphylactic shock • Clinical/biological/epidemiological verification of apparent associations is essential • Need to think about most effective use of data mining in routine pharmacovigilance practice

  8. Statistical Methodology (1) • Not the key issue • Most use variations of 2-way table statistics • Basic idea: Flag when R = a/E(a) is “large” • Some possibilities • Reporting Ratio: E(a) = n_TD × n_TA / n • Proportional Reporting Ratio: E(a) = n_TD × c / n_OD • Odds Ratio: E(a) = b × c / d
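The three expected-count formulas on this slide can be sketched in a few lines of Python. The slide does not define its symbols, so this sketch assumes the usual 2×2 layout: a = reports with the target drug and target event, b = target drug with other events, c = other drugs with the target event, d = other drugs with other events, so that n_TD = a + b, n_TA = a + c, n_OD = c + d and n = a + b + c + d. The class and function names, and the counts in the example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table2x2:
    """Report counts for one drug/event pair (layout is an assumption)."""
    a: int  # target drug, target event
    b: int  # target drug, other events
    c: int  # other drugs, target event
    d: int  # other drugs, other events


def expected_counts(t: Table2x2) -> dict:
    """Return E(a) under each of the three definitions on the slide."""
    n_td = t.a + t.b              # all reports mentioning the target drug
    n_ta = t.a + t.c              # all reports mentioning the target event
    n_od = t.c + t.d              # all reports mentioning other drugs
    n = t.a + t.b + t.c + t.d     # all reports
    return {
        "RR": n_td * n_ta / n,    # Reporting Ratio
        "PRR": n_td * t.c / n_od, # Proportional Reporting Ratio
        "OR": t.b * t.c / t.d,    # Odds Ratio
    }


def ratios(t: Table2x2) -> dict:
    """Return R = a / E(a) for each definition.

    Zero cells are not handled here; a real screening tool would need
    a rule for them (for example, skipping or smoothing).
    """
    return {name: t.a / e for name, e in expected_counts(t).items()}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    table = Table2x2(a=20, b=980, c=100, d=98900)
    for name, value in ratios(table).items():
        print(f"{name}: {value:.2f}")
```

What counts as a “large” R is left open, as it is on the slide; any flagging threshold would be a separate choice.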

  9. Statistical Methodology (2) • Estimate variability in various ways, e.g., usual chi-square statistic, Bayesian & Empirical Bayesian models • Similar results for all methods if more than a few drug/event combinations reported (e.g., 10) • No non-clinical “gold standard” → can’t assess diagnostic utility of any method in usual sense • OR > PRR > RR when a > E(a), but this does not mean OR identifies real associations better than RR • RR probably most stable
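The ordering claim on this slide can be checked numerically. Under the usual 2×2 layout (an assumption, since the slides do not define a, b, c, d: a = target drug and target event, b = target drug and other events, c = other drugs and target event, d = other drugs and other events), each of the three ratios exceeds 1 exactly when a·d > b·c, and in that case OR > PRR > RR. The sketch below also computes the ordinary Pearson chi-square statistic for the table, without a continuity correction. Function names and counts are hypothetical.

```python
def pearson_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for a 2x2 table (1 degree of freedom, no correction)."""
    n = a + b + c + d
    row_target, row_other = a + b, c + d   # drug margins
    col_target, col_other = a + c, b + d   # event margins
    return n * (a * d - b * c) ** 2 / (
        row_target * row_other * col_target * col_other
    )


def reporting_measures(a: int, b: int, c: int, d: int) -> tuple:
    """Return (RR, PRR, OR), each written as a / E(a)."""
    n = a + b + c + d
    rr = a * n / ((a + b) * (a + c))       # Reporting Ratio
    prr = (a / (a + b)) / (c / (c + d))    # Proportional Reporting Ratio
    odds_ratio = (a * d) / (b * c)         # Odds Ratio
    return rr, prr, odds_ratio


if __name__ == "__main__":
    # Hypothetical counts: more target-event reports than expected (a*d > b*c).
    rr, prr, odds_ratio = reporting_measures(20, 980, 100, 98900)
    print(f"RR={rr:.2f}  PRR={prr:.2f}  OR={odds_ratio:.2f}")
    print(f"chi-square={pearson_chi_square(20, 980, 100, 98900):.1f}")
```

The ordering reverses when a·d < b·c, and all three ratios equal 1 when a·d = b·c. The size of the ratio therefore reflects which definition of E(a) was used, which is consistent with the slide's caution that a larger OR does not by itself indicate better detection.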

  10. Spontaneous Report Database Limitations • Significant underreporting (esp. OTC) -- depending on seriousness or novelty of event, newness of drug, intensity of monitoring • Different regulatory reporting requirements • Reflects only reporting practice, not incidence • Synonyms for drugs & events → sensitivity loss • Much duplication of reports • Exposure rate unknown • For any given report, there is no certainty that a suspected drug caused the reaction

  11. A Major Limitation (Often Ignored) • Accumulated reports cannot be used to calculate incidence or to estimate drug risk. Comparisons between drugs cannot be made from these data • Unfortunately, this still is done; disclaimers do not balance the effect of the misrepresentation • Easy to show differences with data mining techniques, but impossible to make valid inferences about causality, and the results may mislead

  12. Implementation Issues • Portfolio bias in company databases can lead to inaccurate estimates of relative reporting rates • Does public health benefit justify cost of following up signals detected by routine data mining methods? • Variation in tools and databases among regulators could lead to significant cost without public health benefit • Do frequency-based signal detection methods useful to regulators have business value in industry settings? • Need examples of situations where computerized approach failed to identify important issues and where signals were “created” by publicity or reporting artifacts

  13. Mining is Easy, Refining Low-grade Ore is Hard • What is data mining activity intended to accomplish -- what are the clinical/epidemiological/regulatory questions that need to be answered? • Need to address the impact of various factors, e.g., evolution of apparent association over time, association with key demographic factors such as age, sex, disease classification

  14. More Issues • Composition of database may be important: important associations of a new drug could be cloaked by events associated with an old drug with similar mechanism of action • Individual company databases tend to have comprehensive information about company products, but not general spectrum of drugs/vaccines • Databases contain reports mentioning drugs, not demonstrations of causality
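The cloaking effect in the first bullet can be illustrated with a small numeric sketch. It assumes the reporting ratio is computed as a × n / (n_drug × n_event) over all reports in the database; the drug names, event names and counts are invented for illustration and do not come from the presentation.

```python
def reporting_ratio(counts: dict, drug: str, event: str) -> float:
    """Reporting ratio a / E(a), with E(a) = n_drug * n_event / n."""
    n = sum(counts.values())
    a = counts.get((drug, event), 0)
    n_drug = sum(v for (d, _), v in counts.items() if d == drug)
    n_event = sum(v for (_, e), v in counts.items() if e == event)
    return a * n / (n_drug * n_event)


def without_drug(counts: dict, drug: str) -> dict:
    """Copy of the report counts with one drug's reports removed."""
    return {key: v for key, v in counts.items() if key[0] != drug}


# Hypothetical (drug, event) -> number of reports.
counts = {
    ("new_drug", "event_x"): 10,
    ("new_drug", "other"): 490,
    ("old_drug", "event_x"): 400,   # old drug dominates reports of event_x
    ("old_drug", "other"): 1600,
    ("rest", "event_x"): 90,
    ("rest", "other"): 17410,
}

if __name__ == "__main__":
    full = reporting_ratio(counts, "new_drug", "event_x")
    reduced = reporting_ratio(
        without_drug(counts, "old_drug"), "new_drug", "event_x"
    )
    print(f"with old drug in background:    {full:.2f}")
    print(f"without old drug in background: {reduced:.2f}")
```

With these invented counts the new drug's ratio is below 1 when the old drug's reports are part of the background and above 1 when they are removed, which is one way the composition of the database can hide an association.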

  15. Discussion • Most apparent associations represent known problems • Some reflect disease or patient population • ~ 25% may represent signals about previously unknown associations • Statistical involvement in implementation & interpretation is important • The actual false positive rate is unknown as are the legal and resource implications

  16. What Next? • PhRMA/FDA working group is considering ways to address issues – white paper will be published • May be worthwhile to construct & maintain a cleaned-up canonical database from AERS to provide a common resource for checking data mining findings based on individual company proprietary databases
