Text-based Discovery in Biomedicine: The Architecture of the DAD-system

Marc Weeber, Henny Klein, Alan R. Aronson, Jim G. Mork, Lolkje T. W. de Jong-van den Berg, Rein Vos.

Presentation Transcript


  1. Text-based Discovery in Biomedicine: The Architecture of the DAD-system • Marc Weeber (1,2), Henny Klein (1), Alan R. Aronson (2), Jim G. Mork (2), Lolkje T. W. de Jong-van den Berg (1), Rein Vos (1,3) • (1) Department of Social Pharmacy and Pharmacoepidemiology, Groningen University Institute for Drug Exploration, The Netherlands • (2) Lister Hill National Center for Biomedical Communication, National Library of Medicine, Bethesda, MD • (3) Health Ethics and Philosophy, Faculty of Health Sciences, University of Maastricht, The Netherlands

  2. Introduction • Goal: Finding new biomedical knowledge by combining existing knowledge as represented in the medical literature • Motivation: Avoiding reinvention of the wheel; reusing specific knowledge outside its original domain of discovery

  3. Swanson • AB: Raynaud’s disease is characterized by high blood viscosity and high platelet aggregation • BC: Fish oil is known to reduce blood viscosity and platelet aggregation • AC: ? (diagram linking A, B and C)
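
The A-B-C pattern on this slide can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the tables `ab_links` and `bc_links` and the function `candidate_c_terms` are hypothetical names, filled in by hand from the two statements on the slide; they are not DAD-system data structures.

```python
# Hypothetical toy relation tables, filled in from the slide's two statements.
ab_links = {
    "Raynaud's disease": {"blood viscosity", "platelet aggregation"},
}
bc_links = {
    "blood viscosity": {"fish oil"},
    "platelet aggregation": {"fish oil"},
}


def candidate_c_terms(a_term, ab, bc, known_ac=frozenset()):
    """Return {C: set of B terms linking A to C}, skipping known A-C pairs."""
    candidates = {}
    for b_term in ab.get(a_term, set()):
        for c_term in bc.get(b_term, set()):
            if (a_term, c_term) not in known_ac:
                candidates.setdefault(c_term, set()).add(b_term)
    return candidates


hypotheses = candidate_c_terms("Raynaud's disease", ab_links, bc_links)
```

Each candidate C keeps the set of B terms that support it, so a hypothesis can be traced back to the intermediate findings that produced it.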

  4. Vos and Rikken • Drugs instead of diet factors • Intermediate (B) terms are adverse drug reactions • Drug – Adverse drug reactions – Disease: The DAD-system • Vos (1991) Drugs looking for diseases

  5. Existing Techniques • Swanson & Smalheiser: • Single words/multi-word terms • MEDLINE titles • No statistics • Gordon & Lindsay: • Single words/multi-word terms • Information Retrieval statistics • Replication of Swanson’s discoveries

  6. New Techniques • Use of UMLS concepts • PubMed • MetaMap: mapping free text (MEDLINE titles and abstracts) to concepts • Interactive web interface

  7. Two-step Approach • Open discovery (A – ? – ?): generating a hypothesis • Closed discovery (A – ? – C): testing a hypothesis

  8. Why UMLS Concepts? • Use of only biomedically relevant information • Useful transition from single-word to multi-word terms • Semantic information (semantic types) for filtering (e.g. select only Disease or Syndrome)

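  9. [Architecture diagram] • Components shown: Metathesaurus, Specialist Lexicon, Semantic Network (KS) • PubMed • MetaMap • DAD-system

  10. [Architecture diagram] • Components shown: Metathesaurus, Specialist Lexicon, Semantic Network (KS) • MySQL Database • PubMed • MetaMap • Txt2Con • Query • Show • Filter • Select • DAD-system

  11. [Architecture diagram] • Components shown: Metathesaurus, Specialist Lexicon, Semantic Network (KS) • MySQL Database • PubMed • MetaMap • Txt2Con • Query • Show • Filter • Select • DAD-system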

  12. Open Discovery (A) • Query (user input): • raynaud’s disease

  13. Open Discovery (A) • Mapping text to concept through MetaMap: • Raynaud's Disease [Disease or Syndrome]

  14. Open Discovery (A) • Synonym lookup: • Raynaud's syndrome • Raynaud's disease/phenomenon • Variant generation: • e.g. syndrome / syndromes

  15. Open Discovery (A) • PubMed query: • raynaud OR raynauds • Processing: query in titles and abstracts • Result: 1,246 MEDLINE citations

  16. Open Discovery (A) • Text to concept mapping of all citations • Sentences with Raynaud’s disease • Result: 1,278 UMLS concepts

  17. Open Discovery (A) • Select functional/physiological concepts • Semantic types in filter: • Body Location or Region • Biologic Function • Cell Function • Phenomenon or Process • Physiologic Function • Tissue
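
The semantic-type filter on this slide can be pictured as a simple set-membership test. The sketch below assumes that each mapped concept arrives as a (name, semantic types) pair; that representation, the function name `filter_by_semantic_type`, and the type assignments in the toy data (other than Raynaud's Disease, which comes from slide 13) are assumptions for illustration, not the DAD-system's actual code.

```python
# Semantic types listed on the slide for the functional/physiological filter.
FUNCTIONAL_TYPES = {
    "Body Location or Region",
    "Biologic Function",
    "Cell Function",
    "Phenomenon or Process",
    "Physiologic Function",
    "Tissue",
}


def filter_by_semantic_type(concepts, allowed_types):
    """Keep concepts that have at least one semantic type in allowed_types."""
    return [name for name, types in concepts if set(types) & allowed_types]


# Hypothetical (concept, semantic types) pairs; the type assignments are
# assumed for illustration, not looked up in the UMLS.
mapped = [
    ("Raynaud's Disease", ["Disease or Syndrome"]),
    ("Platelet Aggregation", ["Cell Function"]),
    ("Blood Viscosity", ["Physiologic Function"]),
]
b_candidates = filter_by_semantic_type(mapped, FUNCTIONAL_TYPES)
```

Swapping in a different set of allowed types gives a different filter with the same function, which is how the later dietary filter could reuse this step.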

  18. Open Discovery (A → B) • Result: 57 concepts • Frequency range: • 1–18

  19. Open Discovery (A → B) • Selected B-concepts: • Plasma Viscosity Level • Blood Viscosity • Platelet Adhesiveness • Platelet Aggregation • Effects, Blood Coagulation

  20. Open Discovery (A → B) • Variants: • plasma, plasmas • viscosity, viscous • aggregation, aggregations, aggregating • coagulation, coagulating

  21. Open Discovery (A → B) • PubMed query: • blood coagulation OR blood viscosity OR plasma viscosity OR platelet adhesiveness OR platelet aggregation • Result: 10,611 MEDLINE citations
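
A query string of this shape can be assembled by joining the selected terms with OR. The helper below is a minimal sketch: `build_or_query` is a hypothetical name, and the slides do not show how the DAD-system actually builds or submits its PubMed queries, so lowercasing and de-duplication are assumptions.

```python
def build_or_query(terms):
    """Join search terms with OR, dropping duplicates but keeping order."""
    unique_terms = list(dict.fromkeys(term.strip().lower() for term in terms))
    return " OR ".join(unique_terms)


b_terms = [
    "blood coagulation",
    "blood viscosity",
    "plasma viscosity",
    "platelet adhesiveness",
    "platelet aggregation",
]
query = build_or_query(b_terms)
```

With the five terms above, `query` reproduces the string shown on the slide; generated variants such as "viscous" or "aggregating" would simply be appended to the term list before joining.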

  22. Open Discovery (A → B) • Concepts in sentences with B-concepts: 7,702 • Concepts not in Raynaud sentences: 6,747
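
The two counts on this slide imply that 7,702 − 6,747 = 955 of the concepts found alongside B-concepts also occur in the Raynaud sentences, so they are set aside as already linked to A. The step amounts to a set difference, sketched here with hypothetical toy data and a hypothetical helper name, `novel_concepts`.

```python
def novel_concepts(b_context_concepts, a_context_concepts):
    """Concepts seen with B-concepts that never appear in A sentences."""
    return set(b_context_concepts) - set(a_context_concepts)


# Toy data (hypothetical); the real sets hold thousands of UMLS concepts.
a_context = {"Blood Viscosity", "Platelet Aggregation", "Nifedipine"}
b_context = {"Blood Viscosity", "Fish Oil", "Eicosapentaenoic Acid", "Nifedipine"}
remaining = novel_concepts(b_context, a_context)

# Slide counts: 7,702 concepts in total, 6,747 of them not in Raynaud sentences.
already_linked = 7702 - 6747
```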

  23. Open Discovery (A → B) • Filter for dietary related concepts • Semantic types in filter: • Vitamin • Lipid • Element, Ion, or Isotope

  24. Open Discovery (A → B → C) • Result: 206 concepts • Rank order on relations • Fish oil related concepts: Eicosapentaenoic Acid; Fish Oil; Fatty Acids, Omega 3; MAXEPA; Omega-3 Polyunsaturated Fatty Acid; Cod Liver Oil; Salmon Oil
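
The slide does not say how "rank order on relations" is computed. One plausible reading, shown here purely as an assumption, is to rank each C-concept by the number of distinct B-concepts it co-occurs with. The function `rank_c_concepts` and the toy data are hypothetical.

```python
from collections import defaultdict


def rank_c_concepts(b_to_c):
    """Rank C-concepts by the number of distinct B-concepts they co-occur with."""
    support = defaultdict(set)
    for b_concept, c_concepts in b_to_c.items():
        for c_concept in c_concepts:
            support[c_concept].add(b_concept)
    # Sort by descending support, then alphabetically for a stable order.
    return sorted(
        ((c, len(bs)) for c, bs in support.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical co-occurrence data, for illustration only.
b_to_c = {
    "Blood Viscosity": {"Fish Oil", "Eicosapentaenoic Acid"},
    "Platelet Aggregation": {"Fish Oil", "Eicosapentaenoic Acid", "Cod Liver Oil"},
    "Platelet Adhesiveness": {"Fish Oil"},
}
ranking = rank_c_concepts(b_to_c)
```

Under this reading, a C-concept reached through several independent B-concepts rises to the top of the list, which matches the intuition that multiple pathways make a hypothesis more interesting.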

  25. Closed Discovery (A, C) • A: Raynaud’s Disease • C: Eicosapentaenoic Acid; Fish Oil; Fatty Acids, Omega 3; MAXEPA; Omega-3 Polyunsaturated Fatty Acid; Cod Liver Oil; Salmon Oil

  26. Closed Discovery (A, C) • A: 1,246 citations, 1,278 concepts • C: 463 citations, 1,795 concepts • 479 common concepts
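
In closed discovery the candidate B-concepts are those found in both literatures, which is a set intersection. The sketch below uses hypothetical toy sets and a hypothetical helper name, `common_concepts`; on the slide the real sets overlap in 479 concepts.

```python
def common_concepts(a_concepts, c_concepts):
    """Candidate B-concepts: concepts found in both the A and the C literature."""
    return sorted(set(a_concepts) & set(c_concepts))


# Toy concept sets (hypothetical), standing in for the A-side and C-side
# concept lists extracted from the two citation sets.
a_side = {"Blood Viscosity", "Platelet Aggregation", "Vasodilatation", "Nifedipine"}
c_side = {"Blood Viscosity", "Platelet Aggregation", "Vasodilatation", "Triglycerides"}
shared = common_concepts(a_side, c_side)
```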

  27. Closed Discovery (A, C) • Functional / physiological filter • Result: 45 B-concepts

  28. Closed Discovery (A, B, C) • New concepts: • Vasodilatation • Veins, Capillaries • Dinoprostone • Fibrinolysis • Deformability • Rheology • Known concepts: • Plasma Viscosity Level • Blood Viscosity • Platelet Adhesiveness • Platelet Aggregation • Effects, Blood Coagulation

  29. Juxtaposition

  30. Success / Failure • Success: simulation of the Raynaud’s disease – fish oil and migraine – magnesium discoveries • Success: discovery of new therapeutic applications for thalidomide • Failure: ambiguous mapping (Mg = milligram / magnesium) • Limitation: association is defined only by co-occurrence
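
Association by co-occurrence can be illustrated by counting how often two concepts are mapped from the same sentence. This is a minimal sketch with hypothetical toy data and a hypothetical function name, `cooccurrence_counts`. It also shows why the Mg ambiguity matters: a sentence in which "Mg" is mapped to the unit contributes nothing to the migraine and magnesium count.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how many sentences mention each unordered pair of concepts."""
    counts = Counter()
    for concepts in sentences:
        # Deduplicate and sort so (x, y) and (y, x) are counted as one pair.
        for pair in combinations(sorted(set(concepts)), 2):
            counts[pair] += 1
    return counts


# Each inner list stands for the concepts mapped from one sentence (toy data).
sentences = [
    ["Migraine", "Magnesium"],
    ["Magnesium", "Migraine", "Vasodilatation"],
    ["Migraine", "Milligram"],  # "Mg" read as a unit rather than the element
]
counts = cooccurrence_counts(sentences)
```

A count like this records only that two concepts appear together; it says nothing about whether one increases, decreases, or is unrelated to the other.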

  31. Future • Better semantic analysis: increase(A,B) and decrease(B,C) • Better user interface • More databases, e.g. for finding the genetic bases of diseases
