Predictive Analytics for OpenFDA & Other Sources. October 6, 2014. Data Fusion to Know Individuals. OpenFDA Queries. https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:"nonsteroidal+anti-inflammatory+drug"&count=patient.reaction.reactionmeddrapt.exact
Predictive Analytics for OpenFDA & Other Sources October 6, 2014
OpenFDA Queries
• https://api.fda.gov/drug/event.json?
• search=patient.drug.openfda.pharm_class_epc:"nonsteroidal+anti-inflammatory+drug"
• &count=patient.reaction.reactionmeddrapt.exact
The first part is the endpoint. The search parameter selects records where openfda.pharm_class_epc (pharmacologic class) contains "nonsteroidal anti-inflammatory drug". The count parameter counts the values of the field patient.reaction.reactionmeddrapt (patient reactions).
https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:%22nonsteroidal+anti-inflammatory+drug%22&count=patient.reaction.reactionmeddrapt.exact
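A minimal Python sketch of how the query string above could be assembled from its three parts. The helper name build_count_url is hypothetical (it is not part of any openFDA client library), and the sketch only builds the URL; it does not send a request.

```python
from urllib.parse import quote_plus

# Endpoint taken from the slide.
BASE = "https://api.fda.gov/drug/event.json"


def build_count_url(search_field, phrase, count_field):
    """Build a search-and-count URL (hypothetical helper, for illustration).

    The phrase is wrapped in double quotes for an exact phrase match, then
    URL-encoded: quotes become %22 and spaces become +.
    """
    encoded_phrase = quote_plus(f'"{phrase}"')
    return f"{BASE}?search={search_field}:{encoded_phrase}&count={count_field}"


url = build_count_url(
    "patient.drug.openfda.pharm_class_epc",
    "nonsteroidal anti-inflammatory drug",
    "patient.reaction.reactionmeddrapt.exact",
)
print(url)
```

The printed URL should match the encoded query shown on the slide, so it can be pasted into a browser or passed to any HTTP client.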
Important OpenFDA data types
• What the drug is supposed to fix:
  • Established Pharmacologic Class (EPC) - pharm_class_epc
• How the drug works:
  • Mechanism of Action (MOA) - pharm_class_moa
• What the drug affects:
  • Physiologic Effect (PE) - pharm_class_pe
• What is in the drug:
  • Chemical Structure (CS) - pharm_class_cs
https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:%22Serotonin+and+Norepinephrine+Reuptake+Inhibitor%22
Parts of a returned record: Safety Report ID, Adverse Reactions, Biographical Data, Drug Information
More OpenFDA data types
• How serious is the reaction: serious (1 for Yes, 2 for No)
  • "serious": "1",
  • "seriousnesscongenitalanomali": "1",
  • "seriousnessdeath": "1",
  • "seriousnessdisabling": "1",
  • "seriousnesshospitalization": "1",
  • "seriousnesslifethreatening": "1",
  • "seriousnessother": "1"
• What is the drug indicated for: drugindication
• Circumstances for taking drug: patient.drug.drugadditional
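As an illustration of how the seriousness fields listed above might be read, here is a small Python sketch. The function name seriousness_flags and the sample record are hypothetical; the sample is made-up data, not an actual openFDA response.

```python
# Field names as listed on the slide.
SERIOUSNESS_FIELDS = (
    "seriousnesscongenitalanomali",
    "seriousnessdeath",
    "seriousnessdisabling",
    "seriousnesshospitalization",
    "seriousnesslifethreatening",
    "seriousnessother",
)


def seriousness_flags(report):
    """Return the seriousness fields set to "1" in a report dict.

    A report whose "serious" field is not "1" (for example "2", meaning
    not serious) yields an empty list.
    """
    if report.get("serious") != "1":
        return []
    return [name for name in SERIOUSNESS_FIELDS if report.get(name) == "1"]


# Hypothetical record for illustration only.
sample_report = {
    "serious": "1",
    "seriousnessdeath": "1",
    "seriousnesshospitalization": "1",
}
print(seriousness_flags(sample_report))
# -> ['seriousnessdeath', 'seriousnesshospitalization']
```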
Predictions on OpenFDA Data
• Hierarchical Clustering ("unsupervised learning") on Manufacturers by Drug Class and Adverse Events
• Generates Insights and Further Questions to Explore, such as:
  • Do some adverse events dominate all others?
  • What is the role of retail distributors, as opposed to manufacturers: is it an artifact of the data, or something else they do between themselves and the patient?
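To make the clustering idea concrete, here is a toy Python sketch of agglomerative (bottom-up hierarchical) clustering with average linkage. The slides do not say which linkage or distance was used, so both are assumptions; the manufacturer labels m1..m5 and their two-dimensional count vectors are invented for illustration.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    points maps a label to a numeric vector. Closeness is average linkage:
    the mean Euclidean distance over all cross-cluster pairs of members.
    """
    clusters = [[label] for label in points]

    def linkage(a, b):
        total = sum(dist(points[x], points[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters


# Invented adverse-event counts per manufacturer (two event types each).
toy_counts = {
    "m1": (1.0, 2.0),
    "m2": (1.5, 1.8),
    "m3": (9.0, 8.0),
    "m4": (8.5, 9.1),
    "m5": (50.0, 40.0),
}
groups = agglomerate(toy_counts, k=3)
print(sorted(sorted(g) for g in groups))
# -> [['m1', 'm2'], ['m3', 'm4'], ['m5']]
```

In this toy data m5 ends up in a group of one, the same kind of outlier group the manufacturer slides describe.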
Manufacturers by All Drug Classes
• Group troubling in the large number of adverse events for the products they make; includes companies AbbVie and Pfizer
• Group distinguished by an abnormally large number of adverse events for the products they make; includes companies Mylan and Teva
• Group above average for the number of product adverse events; includes private labeling companies CVS, Kroger, Wal-Mart, Publix
• Other manufacturers not troubling in the number of adverse events
Manufacturers by All Adverse Events
• Group of one (Mylan), highly distinguished by an abnormally large number of adverse events for the products it makes
• Group above average for the number of product adverse events; includes big pharma maker Merck
• Group troubling in the large number of adverse events for the products they make; includes Teva and the grocery store Kroger
• Other manufacturers not troubling in the number of adverse events
Conditional Probability Models (Bayes) Very Helpful for Predictions
Why is Bayes So Much Better?
• Works on Conditional Probability
• Utilizes Much More of What We Already Know
[Figure: two panels by drug, "Probability of Age 18to34 | Rating" and "% Age 18to34"]
Bayes is Conditional Probability
• Intuition is "What are the chances of X given that I know Y?"
• This will always be better than flipping a coin, as in the case of gender prediction
• The probability of Female (F) for any given Drug (T) is the same as the probability of the Drug given Female, times the probability of being Female, divided by the probability of the Drug: P(F | T) = P(T | F) × P(F) / P(T).
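A worked example of the rule in the last bullet, as a small Python sketch. The function name and all three input probabilities are made up for illustration; the sketch also assumes only two gender classes, so P(T) can be obtained from the law of total probability.

```python
def posterior_female_given_drug(p_drug_given_f, p_drug_given_m, p_f):
    """Bayes' rule: P(F | T) = P(T | F) * P(F) / P(T).

    P(T) is expanded as P(T | F) * P(F) + P(T | M) * P(M), which assumes
    Female and Male are the only two classes.
    """
    p_m = 1.0 - p_f
    p_drug = p_drug_given_f * p_f + p_drug_given_m * p_m
    return p_drug_given_f * p_f / p_drug


# Invented inputs: 3% of women and 1% of men take the drug; half are women.
p = posterior_female_given_drug(p_drug_given_f=0.03, p_drug_given_m=0.01, p_f=0.5)
print(round(p, 2))  # about 0.75
```

With these invented numbers, knowing the drug moves the estimate from the 50% coin flip to roughly 75% Female, which is the intuition the slide describes.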
Simplifying the Problem Set
• Single Households (123K): Age / Gender models by Drug
• Multi-Person Households (500K):
  • Same Gender & Same Age Class (21K): nothing to predict
  • Same Gender & Diff. Age Class (44K): predict age
  • Diff. Gender & Same Age Class (303K): predict gender
  • Diff. Gender & Diff. Age Class (133K): predict both
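The four-way split of multi-person households can be written as a small decision function. This Python sketch uses a hypothetical function name; the household counts (in thousands) are the ones given on the slide.

```python
def prediction_task(same_gender, same_age_class):
    """Map a two-person household's known make-up to the prediction needed."""
    if same_gender and same_age_class:
        return "nothing to predict"
    if same_gender:
        return "predict age"
    if same_age_class:
        return "predict gender"
    return "predict both"


# Counts from the slide, in thousands of households,
# keyed by (same_gender, same_age_class).
households_k = {
    (True, True): 21,
    (True, False): 44,
    (False, True): 303,
    (False, False): 133,
}
for key, count in households_k.items():
    print(f"{count}K households: {prediction_task(*key)}")

# The four groups add up to 501K, consistent with the rounded 500K total.
print(sum(households_k.values()))
```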
2 Stage Models
• Stage 1: Single Households: Age / Gender Models by Drug
• Stage 2: Age / Gender Conditional Probability
  • Same Gender & Diff. Age Class: predict age
  • Diff. Gender & Same Age Class: predict gender
  • Diff. Gender & Diff. Age Class: predict both
Full Bayes Model
• Using all the independent variables
• Where MAX is the prediction of the Age or Gender classification given all the known conditional probabilities.
• NOTE: The MAX prediction for Age is constrained by ID. Each ID has only 2 possible Age classes, since these are known, so if the model predicts an Age class outside the boundaries of an ID, pick the next highest MAX probability for Age.
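The constraint in the NOTE can be sketched in Python as a maximum taken over the allowed classes only. The function name, the class labels other than 18to34, and the posterior values are hypothetical; how the posteriors themselves are computed is not shown here.

```python
def constrained_max(posteriors, allowed):
    """Pick the highest-probability class that is allowed for this ID.

    posteriors maps class label -> model probability; allowed is the set
    of classes known to exist for the ID. Classes outside the set are
    skipped, so the next highest probability is used instead.
    """
    ranked = sorted(posteriors, key=posteriors.get, reverse=True)
    for label in ranked:
        if label in allowed:
            return label
    raise ValueError("none of the allowed classes has a posterior")


# Invented posteriors: the unconstrained MAX would be "35to54", but this
# ID is known to contain only the two classes in allowed_for_id.
age_posteriors = {"18to34": 0.2, "35to54": 0.5, "55plus": 0.3}
allowed_for_id = {"18to34", "55plus"}
print(constrained_max(age_posteriors, allowed_for_id))  # -> 55plus
```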