Experiences in visualizing and navigating biomedical ontologies and knowledge bases
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA
ISMB 2002, Fifth Annual Bio-Ontologies Meeting, August 8, 2002
Introduction 1
• Biomedical knowledge
  • Terminologies (names)
  • Ontologies (objects)
  • Knowledge bases (facts)
• Common features
  • Terms / Concepts
  • Inter-concept relationships
    • Hierarchical
    • Associative
Introduction 2
• Challenges
  • Volume of information
    • 10^4 to 10^6 concepts
    • 10^5 to 10^7 relationships
  • Orientation
  • Mapping to concepts
  • Visualizing concept spaces
  • Navigating concept spaces
Introduction 3
• SemNav: UMLS browser
  • Entry point: biomedical term
  • Display related concepts
  • Display properties of inter-concept relationships
  • Allow navigation among concepts
• GenNav: GO browser
  • Entry point: GO term or gene product name/symbol
  • Display related GO terms and gene products
  • Display properties of term/term and term/gene product relationships
  • Allow navigation between GO terms and gene products
Outline
• Background
  • Unified Medical Language System (UMLS)
  • Gene Ontology
• Overview of the browsers
  • SemNav
  • GenNav
• Common features
• Differences
Unified Medical Language System
• Developed at NLM since 1990
• 13th edition in 2002
• Integrates some 60 terminological resources
  • Clinical vocabularies (including specialties)
  • Core terminologies (anatomy, drugs, medical devices)
  • Administrative terminologies, standards
• Integration
  • Synonymous terms are clustered in a concept
  • Hierarchies (trees) are combined in a graph structure
Terminology integration: Terms
[Figure: synonymous terms from different source vocabularies clustered into a single concept. Terms shown: Duchenne muscular dystrophy; Duchenne’s muscular dystrophy; Duchenne de Boulogne muscular dystrophy; Duchenne type progressive muscular dystrophy; pseudohypertrophic muscular dystrophy; X-linked recessive muscular dystrophy; severe generalized familial muscular dystrophy. Sources shown: MeSH, SNOMED, CTV3, Jablonski, CRISP, DxPlain, MedDRA, LOINC, COSTAR.]
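Terminology integration: Relationships
[Figure: hierarchies from several source vocabularies (SNOMED, MeSH, AOD, Read Codes) combined into one UMLS graph. Concepts shown: Adrenal Gland Diseases; Adrenal Cortex Diseases; Hypoadrenalism; Adrenal Gland Hypofunction; Adrenal cortical hypofunction; Addison’s Disease.]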
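The two integration steps named on the UMLS slide (clustering synonymous terms into a concept, and combining source hierarchies into a graph) can be sketched as follows. This is a minimal illustration, not the UMLS build process: the concept identifier `C_DMD`, the vocabulary names `vocab_A` and `vocab_B`, and the specific parent/child edges are hypothetical placeholders; only the term strings come from the slides.

```python
from collections import defaultdict


def cluster_synonyms(term_to_concept):
    """Group terms by the concept they name (concept -> set of terms)."""
    concepts = defaultdict(set)
    for term, concept_id in term_to_concept.items():
        concepts[concept_id].add(term)
    return dict(concepts)


def merge_hierarchies(trees):
    """Combine per-vocabulary (child, parent) edge lists into one graph.

    Returns child -> {parent: set of vocabularies asserting that edge}.
    A node may end up with several parents, so the result is a graph,
    not a tree.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for vocab, edges in trees.items():
        for child, parent in edges:
            graph[child][parent].add(vocab)
    return {c: {p: set(v) for p, v in ps.items()} for c, ps in graph.items()}


# Hypothetical concept identifier; the term strings are from the slide.
synonyms = {
    "Duchenne muscular dystrophy": "C_DMD",
    "Duchenne's muscular dystrophy": "C_DMD",
    "pseudohypertrophic muscular dystrophy": "C_DMD",
}

# Hypothetical vocabularies and edges, for illustration only.
trees = {
    "vocab_A": [
        ("Addison's Disease", "Adrenal Cortex Diseases"),
        ("Adrenal Cortex Diseases", "Adrenal Gland Diseases"),
    ],
    "vocab_B": [
        ("Addison's Disease", "Adrenal Gland Hypofunction"),
        ("Adrenal Gland Hypofunction", "Adrenal Gland Diseases"),
    ],
}

concepts = cluster_synonyms(synonyms)
graph = merge_hierarchies(trees)
print(len(concepts["C_DMD"]), "terms in one concept")
print(sorted(graph["Addison's Disease"]))  # two parents after merging
```

After merging, a concept can have parents contributed by different vocabularies, which is why the combined structure is a graph rather than a tree.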
UMLS
• Two-level structure
  • Semantic Network
    • 134 Semantic Types (STs)
    • 54 types of relationships among STs
  • Metathesaurus
    • 800,000 concepts
    • ~10 M inter-concept relationships
  • Link = categorization
[Figure: Semantic Network (Semantic Type) linked by categorization to Metathesaurus (Concept).]
[Figure: Semantic Network above, Metathesaurus below. Semantic Types shown: Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Disease or Syndrome; Pharmacologic Substance; Population Group. Concepts shown: Mediastinum; Saccular Viscus; Angina Pectoris; Esophagus; Heart; Cardiotonic Agents; Left Phrenic Nerve; Tissue Donors; Heart Valves; Fetal Heart.]
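The two-level structure (concepts in the Metathesaurus, each categorized by Semantic Types from the Semantic Network) can be modelled with two small classes. This is a sketch under assumptions: the class and method names are hypothetical, and the parent/child chain among the three Semantic Types is assumed for illustration from the names on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticType:
    """A node in the upper level (the Semantic Network)."""
    name: str
    parent: "SemanticType | None" = None

    def ancestors(self):
        """Yield the more general types above this one."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


@dataclass
class Concept:
    """A node in the lower level (the Metathesaurus)."""
    name: str
    semantic_types: list = field(default_factory=list)

    def is_a(self, type_name):
        """True if the concept is categorized by the type or a descendant of it."""
        for st in self.semantic_types:
            if st.name == type_name:
                return True
            if any(a.name == type_name for a in st.ancestors()):
                return True
        return False


# Assumed hierarchy among Semantic Types named on the slide.
anatomical = SemanticType("Anatomical Structure")
fully_formed = SemanticType("Fully Formed Anatomical Structure", anatomical)
body_part = SemanticType("Body Part, Organ or Organ Component", fully_formed)

# The link between the two levels is the categorization.
heart = Concept("Heart", [body_part])

print(heart.is_a("Anatomical Structure"))
print(heart.is_a("Disease or Syndrome"))
```

Keeping the categorization as the only link between the levels means the type hierarchy can be changed without touching the concepts.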
Gene Ontology
• Developed by the GO Consortium
• Several components
  • Ontology (~11,000 concepts)
    • Molecular functions
    • Cellular components
    • Biological processes
  • Gene products (~125,000)
  • Associations between gene products and GO concepts (~357,000)
SemNav: Relationships
[Figure: SemNav display. Semantic Types shown: Biologically Active Substance; Disease or Syndrome; Amino Acid, Peptide or Protein. Concepts shown: Muscular Dystrophy, Duchenne; Dystrophin.]
Common features and differences
Mapping query terms
• Mapping terms to concepts
  • Matching criteria (exact, approximate)
  • Normalization techniques
    • work well on clinical terms
    • less applicable to gene names
• Query disambiguation
  • With semantic type in SemNav
  • With species in GenNav
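Mapping a query term to concepts through normalization, then disambiguating by semantic type, might look like the sketch below. The normalization here (lowercasing, dropping possessives and punctuation, sorting words) is a simplified stand-in chosen for illustration, not the normalization used by SemNav or the UMLS tools. The function names and the "cold" entries are hypothetical. As the slide notes, such techniques are less applicable to gene names, and this sketch makes no attempt to handle them.

```python
import re


def normalize(term):
    """Simplified normalization: case, possessives, punctuation, word order."""
    t = term.lower()
    t = re.sub(r"['’]s\b", "", t)        # drop possessive 's
    t = re.sub(r"[^a-z0-9]+", " ", t)    # punctuation to spaces
    return " ".join(sorted(t.split()))   # ignore word order


def build_index(entries):
    """entries: iterable of (term, concept, semantic_type) triples."""
    index = {}
    for term, concept, sem_type in entries:
        index.setdefault(normalize(term), []).append((concept, sem_type))
    return index


def lookup(index, query, semantic_type=None):
    """Return matching (concept, semantic_type) pairs, optionally filtered."""
    hits = index.get(normalize(query), [])
    if semantic_type is not None:
        hits = [h for h in hits if h[1] == semantic_type]
    return hits


# Illustrative entries; the ambiguous "cold" example is an assumption.
index = build_index([
    ("Muscular Dystrophy, Duchenne", "Muscular Dystrophy, Duchenne", "Disease or Syndrome"),
    ("cold", "Common Cold", "Disease or Syndrome"),
    ("cold", "Cold Temperature", "Natural Phenomenon or Process"),
])

print(lookup(index, "Duchenne's muscular dystrophy"))
print(lookup(index, "Cold"))
print(lookup(index, "Cold", semantic_type="Disease or Syndrome"))
```

An ambiguous query returns several candidates; passing a semantic type narrows them, which mirrors the disambiguation step described for SemNav. A species filter would play the same role in a GenNav-style lookup.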
Visualization
• Graph vs. trees (forest)
  • Multiple inheritance is better visualized by graphs than by trees
  • Off-the-shelf, freely available graph visualization packages exist (GraphViz)
• Need to reduce complexity
  • Transitive reduction on complex graphs
  • Feature selection
    • e.g., a given vocabulary in SemNav
    • e.g., a given species in GenNav
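Transitive reduction removes every edge that is implied by a longer path, so the drawn graph keeps the same reachability with fewer arrows. The sketch below assumes the hierarchy is a directed acyclic graph of (child, parent) edges and emits DOT text that Graphviz can render. The function names and the sample edges are illustrative assumptions, not the browsers' actual code.

```python
def transitive_reduction(edges):
    """Return the edges of a DAG that are not implied by a longer path.

    edges: set of (child, parent) pairs; the graph must be acyclic.
    """
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    def reachable_without(start, target, skip_edge):
        """Depth-first search from start to target, ignoring skip_edge."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in parents.get(node, ()):
                if (node, nxt) == skip_edge or nxt in seen:
                    continue
                if nxt == target:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    return {(c, p) for c, p in edges if not reachable_without(c, p, (c, p))}


def to_dot(edges):
    """Serialize (child, parent) edges as DOT text, parents drawn on top."""
    lines = ["digraph G {", "  rankdir=BT;"]
    for child, parent in sorted(edges):
        lines.append(f'  "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative edges; the direct link to the top node is redundant.
edges = {
    ("Addison's Disease", "Adrenal Cortex Diseases"),
    ("Adrenal Cortex Diseases", "Adrenal Gland Diseases"),
    ("Addison's Disease", "Adrenal Gland Diseases"),
}

reduced = transitive_reduction(edges)
dot = to_dot(reduced)
print(dot)
```

The DOT text can be saved to a file and rendered with the Graphviz `dot` command. Feature selection (a single vocabulary or species) would simply filter `edges` before the reduction.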
Navigation
• Tool for exploration
• Navigation among concepts (SemNav and GenNav)
• Navigation between two poles (gene products and GO concepts in GenNav)
• Self-contained (SemNav) or opened to external resources (GenNav)
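Navigation between two poles can be supported by indexing the associations in both directions, so a user can step from a gene product to its GO terms and back. The sketch below is an assumption about one possible data structure, not GenNav's implementation; the class name and the placeholder identifiers (`product_1`, `term_A`, and so on) are hypothetical.

```python
from collections import defaultdict


class AssociationIndex:
    """Two-way index over (gene product, GO term) associations."""

    def __init__(self, associations):
        self.terms_of = defaultdict(set)      # product -> GO terms
        self.products_of = defaultdict(set)   # GO term -> products
        for product, term in associations:
            self.terms_of[product].add(term)
            self.products_of[term].add(product)

    def from_product(self, product):
        """GO terms associated with a gene product."""
        return sorted(self.terms_of.get(product, ()))

    def from_term(self, term):
        """Gene products associated with a GO term."""
        return sorted(self.products_of.get(term, ()))


# Placeholder associations, for illustration only.
nav = AssociationIndex([
    ("product_1", "term_A"),
    ("product_1", "term_B"),
    ("product_2", "term_A"),
])

# Start at a gene product, move to a GO term, then to its other products.
terms = nav.from_product("product_1")
neighbours = nav.from_term(terms[0])
print(terms)
print(neighbours)
```

Each hop alternates between the two poles; navigation among GO terms themselves would use the ontology's own hierarchy rather than this index.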
Conclusions
• Most of the lessons learned while developing SemNav (for browsing general biomedical knowledge) were applicable to GenNav (for browsing molecular biology knowledge)
• The lexical techniques suitable for mapping text to clinical terminologies require adaptation to the specificity of molecular biology terminologies
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications, Bethesda, Maryland, USA
Contact: olivier@nlm.nih.gov
SemNav: http://umlsks.nlm.nih.gov* ► Resources ► Semantic Navigator (* free UMLS registration required)
GenNav: http://etbsun2.nlm.nih.gov:8000/perl/gennav.pl