
Rohit Kate

Computational Intelligence in Biomedical and Health Care Informatics HCA 590 (Topics in Health Sciences). Rohit Kate. Data Mining : Sample Medical Applications. Reading.


Presentation Transcript


  1. Computational Intelligence in Biomedical and Health Care Informatics. HCA 590 (Topics in Health Sciences). Rohit Kate. Data Mining: Sample Medical Applications

  2. Reading • Data Mining of Medical Data: Opportunities and Challenges in Mining Association Rules. Dan A. Simovici. International Academy of Life Sciences conference, Cecilienhof, Potsdam, August 2012.

  3. Data Mining Applications in Medicine • Numerous and well established • Evaluating treatment effectiveness • Health care management • Analysis of relationships between patients and providers of care • Pharmacovigilance • Fraud and abuse detection • Limitations • Limited accessibility of medical data • Technical challenges: distributed data (clinical, administrative) • Legal and social challenges: privacy concerns, data ownership • Incomplete or noisy data

  4. Adverse Drug Reactions • A serious problem • 5% of hospital admissions • 28% of emergency department visits • 5% of hospital deaths • Loss of several billion dollars each year • Why are they not detected earlier? • Although drugs are thoroughly tested before they are introduced to the market, it is not possible to predict: • Long-term effects • Effects in every type of patient • Effects in every combination with other treatments (e.g. every possible drug-drug interaction)

  5. Adverse Drug Reactions Data • Monitored internationally at multiple sites • Uppsala Monitoring Center in Sweden • Unit of the World Health Organization (WHO) • Center mines data from case safety reports • Vigibase: case safety reporting database • Data from 1978, access allowed for a fee • Food and Drug Administration (FDA) • FDA Adverse Event Reporting System (FAERS) database • Formerly AERS • Access is free • http://www.fda.gov/Drugs/GuidanceComplianceRegulatoryInformation/Surveillance/AdverseDrugEffects/default.htm • Various pharma entities maintain proprietary databases • Required by US law to record adverse drug reactions

  6. Association Rule Mining for Adverse Drug Reactions • Given a database of adverse drug reactions, the goal is to mine useful patterns from it • Which drugs and drug combinations usually lead to which single or multiple side effects? • Association rules are a good format for such patterns • Simplest association rule: Vioxx → heart attack

  7. Association Rule Mining for Adverse Drug Reactions • Harpaz et al. [2000] studied the mining of association rules for undesirable drug interactions • Based on 162,744 reports of suspected adverse drug reactions from the AERS database • Database items are naturally partitioned into two classes, drugs and symptoms, and association rules have the form X → Y, where X is a set of drugs and Y is a set of symptoms • This partition makes the mining algorithm more efficient
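The drug/symptom partition on slide 7 can be illustrated with a minimal sketch. It assumes each report is a pair of sets (drugs, symptoms); the reports, the drug names, and the `rule_counts` helper are invented for illustration and are not taken from the study. Because left-hand sides are drawn only from drugs and right-hand sides only from symptoms, far fewer candidate rules are enumerated than if every item could appear on either side.

```python
from collections import Counter
from itertools import combinations

# Toy reports, invented for illustration: (set of drugs, set of symptoms).
reports = [
    ({"drug_a", "drug_b"}, {"nausea"}),
    ({"drug_a", "drug_b"}, {"nausea", "fatigue"}),
    ({"drug_a"}, {"fatigue"}),
    ({"drug_b", "drug_c"}, {"nausea"}),
]

def rule_counts(reports, max_drugs=2, max_symptoms=1):
    """Count how often each drug set X co-occurs with each symptom set Y.

    Keys are (X, Y) pairs of sorted tuples; X contains only drugs and
    Y contains only symptoms, mirroring the partition on the slide.
    """
    counts = Counter()
    for drugs, symptoms in reports:
        for i in range(1, max_drugs + 1):
            for x in combinations(sorted(drugs), i):
                for j in range(1, max_symptoms + 1):
                    for y in combinations(sorted(symptoms), j):
                        counts[(x, y)] += 1
    return counts

counts = rule_counts(reports)
for (x, y), n in sorted(counts.items()):
    print(", ".join(x), "->", ", ".join(y), "| reports:", n)
```

This brute-force count is only practical for tiny data; a real miner such as Apriori prunes candidates using a minimum support threshold.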

  8. Association Rule Mining for Adverse Drug Reactions • The Apriori algorithm was applied along with the “relative reporting ratio” interestingness measure • 1167 association rules were automatically mined • Sample mined association rules: • metformin, metoprolol → NAUSEA 50 7.4 • cyclophosphamide, prednisone, vincristine → FEBRILE NEUTROPENIA 78 45 • cyclophosphamide, doxorubicin, prednisone, rituximab → FEBRILE NEUTROPENIA 63 59 • atorvastatin, lisinopril → DYSPNOEA 55 3.5 • omeprazole, simvastatin → DYSPNOEA 58 12 • varenicline, darvocet → ABNORMAL DREAMS, FATIGUE, INSOMNIA, MEMORY IMPAIRMENT, NAUSEA 52 2668 • Mined association rules that were already known: 67% • Mined association rules that were unknown: 33%
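The slide names the relative reporting ratio but does not give its formula. The sketch below assumes one common definition: the observed number of reports containing both the drugs and the symptoms, divided by the number expected if the two were independent. The function name, the toy reports, and the drug names are hypothetical.

```python
# Toy reports, invented for illustration: (set of drugs, set of symptoms).
reports = [
    ({"drug_a", "drug_b"}, {"nausea"}),
    ({"drug_a", "drug_b"}, {"nausea"}),
    ({"drug_a"}, {"headache"}),
    ({"drug_c"}, {"nausea"}),
    ({"drug_c"}, {"headache"}),
    ({"drug_b"}, {"headache"}),
]

def relative_reporting_ratio(reports, drugs, symptoms):
    """Return (support, ratio) for the rule drugs -> symptoms.

    Assumed definition: ratio = observed / expected, where
    expected = n_drugs * n_symptoms / n_total under independence.
    """
    n_total = len(reports)
    n_drugs = sum(1 for d, _ in reports if drugs <= d)
    n_symptoms = sum(1 for _, s in reports if symptoms <= s)
    n_both = sum(1 for d, s in reports if drugs <= d and symptoms <= s)
    if n_drugs == 0 or n_symptoms == 0:
        return 0, 0.0
    return n_both, (n_both * n_total) / (n_drugs * n_symptoms)

support, ratio = relative_reporting_ratio(reports, {"drug_a", "drug_b"}, {"nausea"})
print(support, ratio)
```

A ratio well above 1 means the combination is reported together more often than chance would suggest, which is why such a measure can rank rules beyond plain support and confidence.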

  9. Association Rule Mining for Adverse Drug Reactions • Drug combinations in the mined association rules: • Known drug-drug interactions: 4% • Drug combinations known to be given together or to treat the same indication: 78% • Drug combinations that appear to be due to confounding: 9% • Unknown drug-drug interactions: 9%

  10. Drug Resistant Bacteria • Some bacteria develop drug resistance, making infection control difficult • Brossette & Hymel [2008] and Brossette et al. [1998] applied data mining to infection control • They studied Pseudomonas aeruginosa, a bacterium notorious for drug resistance • Common cause of infections in humans • Transmitted via medical equipment

  11. Association Rule Mining for Infection Control • The data collection includes records for single Pseudomonas aeruginosa isolates with the attributes: • date reported • source of isolate (sputum, blood) • location of patient in the hospital • patient’s home zip code • resistant (R), intermediate resistance (I), or susceptible (S) for each of piperacillin, ticarcillin/clavulanate, ceftazidime, imipenem, amikacin, gentamicin, tobramycin, ciprofloxacin

  12. Association Rule Mining for Infection Control • The system was designed to detect patterns of increasing resistance to antimicrobials • Data is partitioned into time slices to determine the change in resistance of the bacterium • The variation in the confidence of a rule X → Y across time slices is computed • A substantial increase in confidence is deemed to constitute an event
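The idea on slide 12 can be sketched as follows. The confidence of X → Y within a slice is taken to be the fraction of isolates containing X that also contain Y. The slide does not say what counts as a "substantial" increase, so the `min_increase` threshold, the function names, and the toy isolates are all assumptions made for illustration.

```python
def confidence(records, lhs, rhs):
    """Fraction of records containing lhs that also contain rhs (None if lhs never occurs)."""
    covered = [r for r in records if lhs <= r]
    if not covered:
        return None
    return sum(1 for r in covered if rhs <= r) / len(covered)

def detect_events(slices, lhs, rhs, min_increase=0.10):
    """Flag consecutive slices where confidence of lhs -> rhs rises by at least min_increase.

    slices is a list of (label, records) in time order; min_increase is a
    hypothetical threshold, not a value from the original study.
    """
    events = []
    prev_label, prev_conf = None, None
    for label, records in slices:
        conf = confidence(records, lhs, rhs)
        if conf is None:
            continue  # rule not applicable in this slice
        if prev_conf is not None and conf - prev_conf >= min_increase:
            events.append((prev_label, label, prev_conf, conf))
        prev_label, prev_conf = label, conf
    return events

# Toy isolates, invented for illustration: each isolate is a set of items.
q3 = [{"R-piperacillin", "sputum"}] + [{"R-piperacillin", "blood"}] * 3 + [{"S-piperacillin", "sputum"}]
q4 = [{"R-piperacillin", "sputum"}] * 3 + [{"R-piperacillin", "blood"}]

events = detect_events([("Q3", q3), ("Q4", q4)], {"R-piperacillin"}, {"sputum"})
print(events)
```

Passing an empty set as `lhs` gives the rule form "Empty → Y", in which case the confidence is simply the share of all isolates in the slice that contain Y.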

  13. Association Rule Mining for Infection Control • The following time slices were created: • 12 one-month fragments to find short-lived patterns • 4 three-month fragments to find medium-duration patterns • 2 six-month fragments to find long-lived patterns • The Apriori algorithm was applied with minimum support 2 for items and 10 for association rules
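A minimal sketch of the slicing step, assuming the fragments are aligned to calendar months, quarters, and half-years (the slide does not state the alignment). The `partition` helper, the dates, and the record layout are hypothetical.

```python
from collections import defaultdict
from datetime import date

def slice_key(reported, months_per_slice):
    """Map a report date to (year, slice index) for slices of the given length in months."""
    return (reported.year, (reported.month - 1) // months_per_slice)

def partition(records, months_per_slice):
    """Group (date, items) records into time slices keyed by (year, slice index)."""
    slices = defaultdict(list)
    for reported, items in records:
        slices[slice_key(reported, months_per_slice)].append(items)
    return dict(slices)

# Toy data, invented for illustration: one isolate per month of a single year.
records = [(date(2020, month, 15), {"sputum"}) for month in range(1, 13)]

monthly = partition(records, 1)      # short-lived patterns
quarterly = partition(records, 3)    # medium-duration patterns
half_yearly = partition(records, 6)  # long-lived patterns
print(len(monthly), len(quarterly), len(half_yearly))
```

Each resulting slice can then be mined separately, so that rule confidences are comparable from one slice to the next.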

  14. Association Rule Mining for Infection Control • Sample association rules found: • Empty → R-ticarcillin/clavulanate, R-ceftazidime, R-piperacillin • confidence rose from 4% (Oct) to 8% (Nov) to 11% (Dec): a growing share of isolates is resistant to ticarcillin/clavulanate, ceftazidime, and piperacillin • R-ceftazidime, R-piperacillin → sputum, R-ticarcillin/clavulanate • an increase from 8% (Feb) to 32% (Aug) in the probability that the isolate is from sputum and is ticarcillin/clavulanate resistant, given that it is resistant to ceftazidime and piperacillin • R-piperacillin → sputum, R-ticarcillin/clavulanate, R-ceftazidime • an increase from 6% (Q3) to 26% (Q4) in the probability that the isolate is from sputum and is ticarcillin/clavulanate and ceftazidime resistant, given that it is piperacillin resistant • R-ticarcillin/clavulanate → sputum, R-ceftazidime, R-piperacillin • an increase from 7% (Q3) to 24% (Q4) in the probability that the isolate is from sputum and is ceftazidime and piperacillin resistant, given that it is ticarcillin/clavulanate resistant • R-ticarcillin/clavulanate, R-ceftazidime, R-piperacillin → sputum • an increase from 12% (Q3) to 42% (Q4) in the probability that the isolate is from sputum, given that it is resistant to ticarcillin/clavulanate, ceftazidime, and piperacillin

  15. Transitivity of Association Rules • For medical data mining, it is desirable for the association rules to have the transitivity property: X → Y and Y → Z should also imply X → Z • For consistency • For analyzing the rules • Popular data mining methods, such as the Apriori algorithm, do not ensure transitivity of the association rules they mine

  16. Transitivity of Association Rules • Existing methods have been modified to ensure transitivity of association rules: • Investigate X → Z if X → Y and Y → Z have a medical interpretation [Mukhopadhyay et al. 2004]; TransMiner software • Starting from X → Z, seek candidates X → Y and Y → Z [Wright et al. 2010]
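To see why mined rules need not be transitive, the sketch below proposes X → Z whenever X → Y and Y → Z were both mined, then checks the confidence of the proposed rule on the data. It is only an illustration of the idea of investigating transitive candidates; it is not the TransMiner software or the method of either cited paper, and the helper names and toy transactions are invented.

```python
def confidence(transactions, lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs (None if lhs never occurs)."""
    covered = [t for t in transactions if lhs <= t]
    if not covered:
        return None
    return sum(1 for t in covered if rhs <= t) / len(covered)

def transitive_candidates(rules):
    """Propose (X, Z) for every mined pair X -> Y, Y -> Z where X -> Z is not already a rule."""
    rule_set = set(rules)
    candidates = set()
    for x, y in rules:
        for y2, z in rules:
            if y == y2 and x != z and (x, z) not in rule_set:
                candidates.add((x, z))
    return candidates

A, B, C = frozenset({"A"}), frozenset({"B"}), frozenset({"C"})

# Toy transactions, invented for illustration.
transactions = [{"A", "B"}, {"A", "B"}, {"B", "C"}, {"B", "C"}, {"B", "C"}]
mined_rules = [(A, B), (B, C)]

candidates = transitive_candidates(mined_rules)
for x, z in candidates:
    print(set(x), "->", set(z), "confidence:", confidence(transactions, x, z))
```

In this toy data A → B and B → C both hold with reasonable confidence, yet A and C never occur together, so the candidate A → C has confidence 0. A confidence-based miner would therefore never report it, which is the gap the transitivity-aware methods on the slide address.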

  17. Data Mining in Medicine • Data mining cannot replace the human factor in medical research, but it can greatly aid it • Interaction between data mining and medical research benefits both domains • Biology and medicine suggest novel problems for data mining and machine learning • Open problems: • Mining from unstructured data (natural language text: progress reports, outpatient notes, etc.) • Evaluation of association rules
