Finding Bugs in People: Developing an Entomology Ontology from the UMLS

This presentation explores the use of the Unified Medical Language System (UMLS) to develop an ontology for entomology by mapping terms from the Torre-Bueno Glossary of Entomology to the UMLS Metathesaurus. The study demonstrates the potential of existing biomedical ontologies, such as the UMLS, to seed new domain-specific ontologies.


Presentation Transcript


  1. Finding Bugs in People: Developing an Entomology Ontology from the UMLS • Indra Neil Sarkar, PhD • Lewis B. & Dorothy Cullman Bioinformatics Associate, Division of Invertebrate Zoology, American Museum of Natural History • NKOS Workshop, 10 June 2005

  2. Phenotypes Structural Data Sequence Data Morphology Total Evidence Tree of Life

  3. Statements of Homology • Sequence Data • Multiple Sequence Alignments • CLUSTAL, T-COFFEE, MUSCLE • Non-sequence Data • Ontologies

  4. Ontologies • Color: Red, White, Blue • “White”, “Blanc”, “Weiss”

  5. Ogden-Richards Semiotic Triangle • Thought/Reference • Symbols: “White”, “Blanc”, “Weiss” • Referent

  6. Ontology Development • Protégé • http://protege.stanford.edu • “Frame-based”

  7. Ontology Development

  8. Ontologies in Phylogenetics • Forelimb: Foreleg, Wing, Arm • “Wing”, “Aile”, “Flügel”

  9. Ontologies in Phylogenetics • Forelimb character states: Foreleg (1), Wing (2), Arm (3) • Taxa: CAT, BAT, BIRD • Coded values (matrix layout lost in extraction): 1 1 1 1 3 2 • Each taxon also lists [Gene 1], [Gene 2], …

  10. Ontologies in Phylogenetics • Genetic Information • 99% of Earth’s biota are extinct! • Morphological Information • Fossil record • Morphological studies from extant organisms

  11. Ontologies in Phylogenetics • Ontology Development • Web Ontology Language (OWL) • Structured Descriptive Data (SDD) • Can be exported to NEXUS, DELTA, Lucid • Ontology Acquisition and Markup • Archival Resources • Natural Language Processing

  12. Unified Medical Language System (UMLS) • Metathesaurus • One Million Concepts • 100+ Biomedical Terminologies/Ontologies • Semantic Network • 135 Semantic Types • 15 Coarse Semantic Groups • SPECIALIST Lexicon • English + Biomedical Words

  13. Torre-Bueno Glossary of Entomology (TBGE) • Common Entomology Phrases • 300 Primary Sources • 15,010 Terms/Phrases

  14. TBGE to UMLS • Question 1: Is Entomology Language Different than Biomedical Language? • TBGE to SPECIALIST • Question 2: Can UMLS Be Used to Seed an Ontology for Entomology? • TBGE to UMLS Metathesaurus • Organize Results According to Semantic Network

  15. Q1: Is Entomology a Unique Language? • “Look-up” Individual Word Atoms in SPECIALIST • Complete Look-up • 48% Coverage • Partial Look-up • 66% Coverage • Not found • 34% Not covered
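
The look-up on slide 15 can be sketched as follows. This is a minimal illustration, not the author's code: the toy `LEXICON` and `TERMS` are hypothetical stand-ins for the SPECIALIST Lexicon and the TBGE entries, and the rule that a term counts as "complete" only when every word atom is found is an assumption. The slide's 66% and 34% sum to 100%, which suggests its partial figure may include complete matches; the sketch instead keeps three disjoint categories.

```python
import re

# Hypothetical stand-ins for the SPECIALIST Lexicon and TBGE entries.
LEXICON = {"anterior", "wing", "vein", "leg"}
TERMS = ["anterior wing vein", "tarsal leg", "elytron"]


def tokenize(term):
    """Split a glossary term into lowercase word atoms (ASCII letters only)."""
    return re.findall(r"[a-z]+", term.lower())


def classify(term, lexicon):
    """Return 'complete', 'partial', or 'none' for one glossary term."""
    atoms = tokenize(term)
    found = [atom in lexicon for atom in atoms]
    if atoms and all(found):
        return "complete"  # every word atom is in the lexicon
    if any(found):
        return "partial"   # some, but not all, word atoms are in the lexicon
    return "none"          # no word atom is in the lexicon


def coverage(terms, lexicon):
    """Return the fraction of terms that fall into each category."""
    counts = {"complete": 0, "partial": 0, "none": 0}
    for term in terms:
        counts[classify(term, lexicon)] += 1
    total = len(terms)
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}


if __name__ == "__main__":
    print(coverage(TERMS, LEXICON))
```

With the toy data, each of the three terms lands in a different category, so each fraction is one third.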

  16. Q2: Can UMLS Be Used to Seed Entomology Ontology? • Three-Tiered Mapping Approach • Tier 1: Direct Mapping • Exact & Normalized String Matching • Tier 2: Direct Mapping after Demodification • Remove nominal and adjectival modifiers • Exact & Normalized String Matching • Tier 3: Approximate Matching • MetaMap Application

  17. Q2: Can UMLS Be Used to Seed Entomology Ontology? • Three-Tiered Mapping Approach • Tier 1: Direct Mapping • Exact & Normalized String Matching • Tier 2: Direct Mapping after Demodification • Remove nominal and adjectival modifiers • Exact & Normalized String Matching • Tier 3: Approximate Matching • MetaMap Application
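
The three-tiered approach above can be sketched as a cascade in which each tier runs only if the previous one fails. Everything here is an assumption made for illustration: `CONCEPTS` is a toy table with made-up identifiers rather than Metathesaurus content, `normalize` is a simplified stand-in for UMLS string normalization (it does not uninflect words), demodification simply drops leading words, and tier 3 uses Python's `difflib` in place of MetaMap, which is a separate application not reproduced here.

```python
import difflib
import re

# Hypothetical concept strings with made-up identifiers (not real CUIs).
CONCEPTS = {
    "Wing": "X0000001",
    "Leg": "X0000002",
    "Antenna": "X0000003",
}


def normalize(text):
    """Lowercase, drop punctuation, and sort the words (simplified)."""
    return " ".join(sorted(re.findall(r"[a-z0-9]+", text.lower())))


# Index of normalized strings, built once for tiers 1 to 3.
NORMALIZED = {normalize(name): cid for name, cid in CONCEPTS.items()}


def direct_match(term):
    """Exact string match first, then normalized string match."""
    if term in CONCEPTS:
        return CONCEPTS[term]
    return NORMALIZED.get(normalize(term))


def demodify(term):
    """Yield shorter phrases by dropping leading words one at a time.

    Assumes the head noun comes last, so leading words are modifiers.
    """
    words = term.split()
    for start in range(1, len(words)):
        yield " ".join(words[start:])


def approximate_match(term, cutoff=0.8):
    """Fuzzy match with difflib; a stand-in for MetaMap, not MetaMap itself."""
    hits = difflib.get_close_matches(
        normalize(term), list(NORMALIZED), n=1, cutoff=cutoff
    )
    return NORMALIZED[hits[0]] if hits else None


def map_term(term):
    """Return (tier, concept id), or (None, None) if no tier matches."""
    cid = direct_match(term)
    if cid:
        return 1, cid
    for shorter in demodify(term):
        cid = direct_match(shorter)
        if cid:
            return 2, cid
    cid = approximate_match(term)
    if cid:
        return 3, cid
    return None, None


if __name__ == "__main__":
    for example in ["Wing", "anterior membranous wing", "antena", "zzz"]:
        print(example, "->", map_term(example))
```

Recording the tier alongside the concept identifier keeps the more speculative matches (tiers 2 and 3) separable for later expert review.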

  18. Q2: Can UMLS Be Used to Seed Entomology Ontology? [Chart not preserved; values recovered without labels: 20, 20, 86, 86, 37, 78, 49, 74, 23, 61, 41, 71]

  19. Q2: Can UMLS Be Used to Seed Entomology Ontology?

  20. Source Terminologies Q2: Can UMLS Be Used to Seed Entomology Ontology?

  21. TBGE-UMLS Implications • UMLS Semantic Network is a good Seed Ontology for Biological Domain Ontologies • Best Term-Concept Mappings into Anatomy
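
One way to see which part of the Semantic Network attracts the most mappings is to tally matched concepts by semantic group. The sketch below is illustrative only: `MAPPED` is made-up example data rather than results from the study, and `TYPE_TO_GROUP` is a small excerpt written from memory of the UMLS semantic groups, so it should be checked against the official type-to-group file before use.

```python
from collections import Counter

# Hypothetical mapping output: (glossary term, semantic type of matched concept).
MAPPED = [
    ("wing", "Body Part, Organ, or Organ Component"),
    ("antenna", "Body Part, Organ, or Organ Component"),
    ("molting", "Organism Function"),
    ("larva", "Eukaryote"),
]

# Illustrative excerpt of a semantic type -> semantic group table (assumed).
TYPE_TO_GROUP = {
    "Body Part, Organ, or Organ Component": "Anatomy",
    "Organism Function": "Physiology",
    "Eukaryote": "Living Beings",
}


def tally_by_group(mapped, type_to_group):
    """Count mapped terms per semantic group; unknown types go to 'Unassigned'."""
    counts = Counter()
    for _term, semantic_type in mapped:
        counts[type_to_group.get(semantic_type, "Unassigned")] += 1
    return counts


if __name__ == "__main__":
    for group, count in tally_by_group(MAPPED, TYPE_TO_GROUP).most_common():
        print(group, count)
```

In the toy data, Anatomy receives the most mappings only because the example was constructed that way; it is not evidence for the slide's finding.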

  22. Bottom-Up vs. Top-Down

  23. In Summary… • Ontologies are Needed for Phylogenetics • Existing Biomedical Ontologies Are Useful for New Domain Ontologies (especially UMLS) • Top-Down Strategy using UMLS is Tractable

  24. Phenotypes Structural Data Sequence Data Morphology End Goal SDD OWL

  25. Next Steps • Represent Seed Entomology Ontology in OWL • Link OWL Representation to SDD for use in Taxonomic Descriptions • Involve Team of Experts for Validation • Go Beyond Morphology: Location, Biodiversity Data, etc.

  26. Acknowledgements

  27. Acknowledgements • Tom Moritz, Rob DeSalle, Mark Siddall, David Figurski, Susan Perkins, Paul Planet, Gloria Coruzzi, Olivier Bodenreider, Carol Friedman, Jim Cimino, Bob Morris, Mark Musen • National Institutes of Health • National Science Foundation • American Museum of Natural History

  28. Thank you! • Indra Neil Sarkar, Cullman Bioinformatics Associate, American Museum of Natural History • sarkar@amnh.org • http://www.GenomeCurator.org/people/sarkar
