Text-based Discovery in Biomedicine: The Architecture of the DAD-system. Marc Weeber1,2, Henny Klein1, Alan R. Aronson2, Jim G. Mork2, Lolkje T. W. de Jong-van den Berg1, Rein Vos1,3.
Text-based Discovery in Biomedicine: The Architecture of the DAD-system • Marc Weeber1,2, Henny Klein1, Alan R. Aronson2, Jim G. Mork2, Lolkje T. W. de Jong-van den Berg1, Rein Vos1,3 • 1 Department of Social Pharmacy and Pharmacoepidemiology, Groningen University Institute for Drug Exploration, The Netherlands • 2 Lister Hill National Center for Biomedical Communication, National Library of Medicine, Bethesda, MD • 3 Health Ethics and Philosophy, Faculty of Health Sciences, University of Maastricht, The Netherlands
Introduction • Goal: Finding new biomedical knowledge by combining existing knowledge as represented in the medical literature • Motivation: Avoiding reinvention of the wheel; reusing specific knowledge outside its original domain of discovery
Swanson • AB: Raynaud’s disease is characterized by high blood viscosity and high platelet aggregation • BC: Fish oil is known to reduce blood viscosity and platelet aggregation • [Diagram: A, B, C, with the direct A–C link marked "?"]
Vos and Rikken • Drugs instead of diet factors • Intermediate (B) terms are adverse drug reactions • Drug – Adverse drug reactions – Disease: The DAD-system • Vos (1991) Drugs looking for diseases
Existing Techniques • Swanson & Smalheiser: • Single words/multi-word terms • MEDLINE titles • No statistics • Gordon & Lindsay: • Single words/multi-word terms • Information Retrieval statistics • Replication of Swanson’s discoveries
New Techniques • Use of UMLS concepts • PubMed • MetaMap: mapping free text (MEDLINE titles and abstracts) to concepts • Interactive web interface
Two-step Approach • Open discovery, generating a hypothesis (A → ? → ?) • Closed discovery, testing a hypothesis (A → ? → C)
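The two steps can be sketched over a toy co-occurrence table. This is a minimal illustration, not the DAD-system's implementation: the `COOCCURS` data, the concept "Hypothetical Drug X", and the function names are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical toy data: concept -> concepts it co-occurs with.
# Not taken from MEDLINE; "Hypothetical Drug X" is invented.
COOCCURS = {
    "Raynaud's Disease": {"Blood Viscosity", "Platelet Aggregation"},
    "Fish Oil": {"Blood Viscosity", "Platelet Aggregation", "Triglycerides"},
    "Hypothetical Drug X": {"Platelet Aggregation"},
}

def neighbours(concept, cooccurs):
    """Concepts linked to `concept`, treating co-occurrence as symmetric."""
    linked = set(cooccurs.get(concept, set()))
    linked |= {key for key, values in cooccurs.items() if concept in values}
    return linked

def open_discovery(a, cooccurs):
    """A -> ? -> ?: propose C-concepts reachable from A only via some B."""
    b_concepts = neighbours(a, cooccurs)
    candidates = defaultdict(set)
    for b in b_concepts:
        for c in neighbours(b, cooccurs):
            if c != a and c not in b_concepts:
                candidates[c].add(b)  # remember which B links A to C
    return dict(candidates)

def closed_discovery(a, c, cooccurs):
    """A -> ? -> C: list the B-concepts shared by a given A and C."""
    return neighbours(a, cooccurs) & neighbours(c, cooccurs)

hypotheses = open_discovery("Raynaud's Disease", COOCCURS)
shared_b = closed_discovery("Raynaud's Disease", "Fish Oil", COOCCURS)
```

Open discovery returns every candidate C together with its linking B-concepts; closed discovery starts from a fixed A and C and returns only the intermediate concepts.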
Why UMLS Concepts? • Use of only biomedically relevant information • Useful transition from single-word to multi-word terms • Semantic information (semantic types) for filtering (e.g. select only Disease or Syndrome)
[Diagram: system architecture] • KS: Metathesaurus, Specialist Lexicon, Semantic Network • PubMed • MetaMap • DAD-system
[Diagram: system architecture, detailed] • KS: Metathesaurus, Specialist Lexicon, Semantic Network • PubMed • MetaMap • MySQL Database • DAD-system: Txt2Con, Query, Show, Filter, Select
Open Discovery A • Query (user input): • raynaud’s disease
Open Discovery A • Mapping text to concept through MetaMap: • Raynaud's Disease [Disease or Syndrome]
Open Discovery A • Synonym lookup: • Raynaud's syndrome • Raynaud's disease/phenomenon • Variant generation: • e.g. syndrome / syndromes
Open Discovery A • PubMed query: • raynaud OR raynauds • Processing: query in titles and abstracts • Result: 1,246 MEDLINE citations
Open Discovery A • Text to concept mapping of all citations • Sentences with Raynaud’s disease • Result: 1,278 UMLS concepts
Open Discovery A • Select functional/physiological concepts • Semantic types in filter: • Body Location or Region • Biologic Function • Cell Function • Phenomenon or Process • Physiologic Function • Tissue
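The semantic-type filter can be sketched as a set-membership test. The filter list below is the one on the slide. The input concepts and the semantic types assigned to them are illustrative assumptions, except Raynaud's Disease [Disease or Syndrome], which appears on an earlier slide.

```python
# Semantic types in the functional/physiological filter, as listed on the slide.
FUNCTIONAL_TYPES = {
    "Body Location or Region",
    "Biologic Function",
    "Cell Function",
    "Phenomenon or Process",
    "Physiologic Function",
    "Tissue",
}

def filter_by_semantic_type(concepts, allowed_types):
    """Keep concepts that carry at least one allowed semantic type."""
    return [name for name, types in concepts if allowed_types & set(types)]

# Hypothetical input: (concept name, semantic types). The type assignments
# for the last three entries are assumed for illustration only.
mapped = [
    ("Raynaud's Disease", {"Disease or Syndrome"}),
    ("Blood Viscosity", {"Physiologic Function"}),
    ("Platelet Aggregation", {"Cell Function"}),
    ("Hypothetical Drug X", {"Pharmacologic Substance"}),
]

b_candidates = filter_by_semantic_type(mapped, FUNCTIONAL_TYPES)
```

A concept may carry several semantic types, so the test is a set intersection rather than an equality check.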
Open Discovery A B • Result: 57 Concepts • Frequency range: 1–18
Open Discovery A B • Selected B-concepts: • Plasma Viscosity Level • Blood Viscosity • Platelet Adhesiveness • Platelet Aggregation • Effects, Blood Coagulation
Open Discovery A B • Variants: • plasma, plasmas • viscosity, viscous • aggregation, aggregations, aggregating • coagulation, coagulating
Open Discovery A B • PubMed query: • blood coagulation OR blood viscosity OR plasma viscosity OR platelet adhesiveness OR platelet aggregation • Result: 10,611 MEDLINE citations
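The query above is a plain OR-combination of search terms. A minimal sketch of how such a string could be assembled follows. The helper name `build_query` is hypothetical, the search terms are taken as given, and the step that derives them from concept names (for example "Plasma Viscosity Level" to "plasma viscosity") is not reproduced here.

```python
def build_query(terms):
    """Join de-duplicated, lower-cased search terms with OR, alphabetically."""
    unique_terms = sorted({term.strip().lower() for term in terms})
    return " OR ".join(unique_terms)

# Search terms as they appear in the slide's query (order does not matter).
b_terms = [
    "plasma viscosity",
    "blood viscosity",
    "platelet adhesiveness",
    "platelet aggregation",
    "blood coagulation",
]

query = build_query(b_terms)
```

With these five terms, `query` reproduces the string shown on the slide. Submitting it to PubMed is outside the scope of this sketch.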
Open Discovery A B • Concepts in sentences with B-concepts: • 7,702 • Concepts not in Raynaud sentences: • 6,747
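Removing concepts already seen with A is a set difference. Assuming the 6,747 concepts are a subset of the 7,702, the counts imply that 7,702 − 6,747 = 955 concepts also occurred in Raynaud sentences. The sketch below uses small hypothetical sets, and `novel_concepts` is an illustrative name.

```python
def novel_concepts(b_sentence_concepts, a_sentence_concepts):
    """Concepts seen with the B-concepts but never in sentences with A."""
    return set(b_sentence_concepts) - set(a_sentence_concepts)

# Hypothetical toy sets; "Vasospasm" is included only for illustration.
a_side = {"Blood Viscosity", "Platelet Aggregation", "Vasospasm"}
b_side = {"Blood Viscosity", "Platelet Aggregation", "Fish Oil", "Eicosapentaenoic Acid"}

candidates = novel_concepts(b_side, a_side)

# Arithmetic implied by the slide's counts (under the subset assumption).
already_known = 7702 - 6747
```

Only the concepts that survive this difference can be new hypotheses, since anything already co-occurring with A is known.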
Open Discovery A B • Filter for dietary related concepts • Semantic types in filter: • Vitamin • Lipid • Element, Ion, or Isotope
Open Discovery A B C • Result: 206 Concepts • Rank order on relations • Fish oil related concepts: Eicosapentaenoic Acid; Fish Oil; Fatty Acids, Omega 3; MAXEPA; Omega-3 Polyunsaturated Fatty Acid; Cod Liver Oil; Salmon Oil
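The slide does not define "rank order on relations". One plausible reading, assumed here, is to rank each candidate C by the number of distinct B-concepts that link it to A. The data and the name `rank_candidates` are hypothetical.

```python
def rank_candidates(links):
    """Order C-concepts by number of linking B-concepts, then by name."""
    return sorted(links, key=lambda c: (-len(links[c]), c))

# Hypothetical links: C-concept -> B-concepts it shares with A.
links = {
    "Fish Oil": {"Blood Viscosity", "Platelet Aggregation"},
    "Hypothetical Concept X": {"Platelet Aggregation"},
    "Eicosapentaenoic Acid": {"Blood Viscosity", "Platelet Aggregation"},
}

ranking = rank_candidates(links)
```

Ties are broken alphabetically so that the ordering is deterministic.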
Closed Discovery A C • A: Raynaud’s Disease • C: Eicosapentaenoic Acid; Fish Oil; Fatty Acids, Omega 3; MAXEPA; Omega-3 Polyunsaturated Fatty Acid; Cod Liver Oil; Salmon Oil
Closed Discovery A C • A: 1,246 citations, 1,278 concepts • C: 463 citations, 1,795 concepts • 479 common concepts
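The common concepts are the intersection of the two concept sets. A minimal sketch that also keeps the frequency on each side is given below. The frequencies, the extra concepts, and the ordering rule (smaller of the two frequencies first) are assumptions for illustration.

```python
def common_concepts(a_freq, c_freq):
    """Concepts found on both sides, with their frequency on each side."""
    shared = a_freq.keys() & c_freq.keys()
    rows = [(name, a_freq[name], c_freq[name]) for name in shared]
    # Assumed ordering: strongest support on the weaker side first.
    return sorted(rows, key=lambda row: (-min(row[1], row[2]), row[0]))

# Hypothetical frequencies: concept -> number of sentences.
a_freq = {"Blood Viscosity": 5, "Platelet Aggregation": 9, "Cold Exposure": 4}
c_freq = {"Blood Viscosity": 7, "Platelet Aggregation": 3, "Triglycerides": 12}

shared_rows = common_concepts(a_freq, c_freq)
```

Concepts that occur on only one side are dropped, which is how 1,278 and 1,795 concepts can reduce to 479 common ones.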
Closed Discovery A C • Functional/physiological filter • Result: 45 B-concepts
Closed Discovery A B C • New concepts: • Vasodilatation • Veins, Capillaries • Dinoprostone • Fibrinolysis • Deformability • Rheology • Known concepts: • Plasma Viscosity Level • Blood Viscosity • Platelet Adhesiveness • Platelet Aggregation • Effects, Blood Coagulation
Success / Failure • Successes: • Simulation of Raynaud’s disease – fish oil and migraine – magnesium • Discovery of new therapeutic applications for thalidomide • Failures: • Mapping ambiguity (Mg = milligram / magnesium) • Association defined only by co-occurrence
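Association by co-occurrence can be sketched as counting concept pairs that appear in the same sentence. The input is assumed to be sentences already mapped to concepts. The data and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences):
    """Count, per unordered concept pair, the sentences containing both."""
    counts = Counter()
    for concepts in sentences:
        # Sort so each pair has one canonical order; set() drops repeats.
        for pair in combinations(sorted(set(concepts)), 2):
            counts[pair] += 1
    return counts

# Hypothetical sentences, each given as its list of mapped concepts.
sentences = [
    ["Raynaud's Disease", "Blood Viscosity"],
    ["Raynaud's Disease", "Blood Viscosity", "Platelet Aggregation"],
    ["Fish Oil", "Blood Viscosity"],
]

counts = cooccurrence_counts(sentences)
```

A count of this kind records only that two concepts were mentioned together. It does not say whether one increases, decreases, or is unrelated to the other, which is the limitation the last bullet points to.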
Future • Better semantic analysis: increase(A,B) and decrease(B,C) • Better user interface • More databases, e.g. for finding genetic bases for diseases
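The semantic-analysis idea can be illustrated with directed relations in place of bare co-occurrence. This is a speculative sketch, not a described component of the system. The `Relation` type, its subject-first argument order, and "Hypothetical Substance X" are assumptions. The two Raynaud and fish oil relations paraphrase the Swanson slide.

```python
from typing import NamedTuple

class Relation(NamedTuple):
    predicate: str  # "increase" or "decrease"
    subject: str    # the concept doing the increasing or decreasing
    obj: str        # the affected intermediate concept

def opposing_candidates(a, relations):
    """Find concepts whose effect on an intermediate is opposite to A's."""
    a_effects = {r.obj: r.predicate for r in relations if r.subject == a}
    found = []
    for r in relations:
        if r.subject != a and r.obj in a_effects and r.predicate != a_effects[r.obj]:
            found.append((r.subject, r.obj))
    return sorted(found)

relations = [
    Relation("increase", "Raynaud's Disease", "Blood Viscosity"),
    Relation("decrease", "Fish Oil", "Blood Viscosity"),
    # Invented entry: has the same direction as A, so it should be skipped.
    Relation("increase", "Hypothetical Substance X", "Blood Viscosity"),
]

therapeutic_candidates = opposing_candidates("Raynaud's Disease", relations)
```

Unlike the co-occurrence approach, this keeps only candidates that act on the intermediate concept in the opposite direction.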