Computational Intelligence in Biomedical and Health Care Informatics • HCA 590 (Topics in Health Sciences) • Rohit Kate • Biomedical Ontologies 2
Reading • Chapter 8, Text 1 • Chapter 8, Main Text A few slides about UMLS have been adapted from http://www.nlm.nih.gov/pubs/techbull/mj07/theater_ppt/umls.ppt
Biomedical Ontologies • They provide principled organization of biomedical concepts and their relations • GALEN • SNOMED • FMA • MENELAS • GO (Gene Ontology) • …
Biomedical Terminologies • CPT (Current Procedural Terminology) • ICD (International Classification of Diseases) • HCPCS (pronounced "hick-picks"; Healthcare Common Procedure Coding System) • LOINC (Logical Observation Identifiers Names and Codes) • RxNorm (normalized names for clinical drugs) • …
Too Many Ontologies, Terminologies and Vocabularies in Biomedicine! • Question: Is there a resource that relates concepts in one to concepts in others? • Answer: Yes, UMLS.
UMLS • UMLS: Unified Medical Language System • 3 Knowledge Sources used separately or together • Metathesaurus: 1 million+ biomedical concepts from over 100 sources • Semantic Network: 135 broad categories and 54 relationships between them • SPECIALIST Lexicon + Tools: lexical information and programs for language processing
UMLS Objectives • Began in 1986 as a long-term R&D project • Overcome the variety of ways in which the same concepts are expressed in different resources, in both machine-readable and human languages • Designed as a resource that system developers can use to achieve interoperability
UMLS Uses • Information retrieval • Thesaurus construction • Natural language processing • Automated indexing • Electronic health records (EHR)
Metathesaurus • 100+ general and specialized biomedical vocabularies, terminologies and ontologies • 17 languages (63% English) • 1 million+ concepts; 6 million+ names • 100K+ relationships • Distributed in an electronic format
Metathesaurus Source Vocabularies • Vary in purpose, structure and properties • Used in clinical, research, administrative and public health reporting • Some of the sources: • Thesauri, e.g., MeSH, CRISP, NCI • Classifications, e.g., ICD-9-CM • Billing codes, e.g., CPT, ABC Codes • Ontologies, e.g., SNOMED CT
Metathesaurus Concepts • Synonymous terms clustered into a concept • Concept Unique Identifier (CUI) is assigned • Source information preserved
Example: the concept "Addison's disease" (CUI C0001403) clusters these terms:
Term | Source | Term type | Source code
Addison's disease | SNOMED CT | PT | 363732003
Addison's Disease | MedlinePlus | PT | T1233
Addison Disease | MeSH | PT | D000224
Primary Adrenal Insufficiency | MeSH | EN | D000224
Primary hypoadrenalism | MedDRA | LT | 10036696
syndrome, Addison | … | … | …
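The clustering on this slide can be sketched as a small data structure. This is a minimal illustration only, not the Metathesaurus distribution format: the `Atom` class and the `cluster_by_cui` function are hypothetical names introduced here, and the rows are the ones from the Addison's disease example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """One term as it appears in one source vocabulary (hypothetical class)."""
    cui: str        # Concept Unique Identifier assigned by the Metathesaurus
    name: str       # term string as given by the source
    source: str     # source vocabulary
    term_type: str  # term type within the source, e.g. PT
    code: str       # identifier of the term in the source


# Rows taken from the slide's Addison's disease example.
ATOMS = [
    Atom("C0001403", "Addison's disease", "SNOMED CT", "PT", "363732003"),
    Atom("C0001403", "Addison's Disease", "MedlinePlus", "PT", "T1233"),
    Atom("C0001403", "Addison Disease", "MeSH", "PT", "D000224"),
    Atom("C0001403", "Primary Adrenal Insufficiency", "MeSH", "EN", "D000224"),
    Atom("C0001403", "Primary hypoadrenalism", "MedDRA", "LT", "10036696"),
]


def cluster_by_cui(atoms):
    """Group source terms under their concept, keeping the source information."""
    concepts = defaultdict(list)
    for atom in atoms:
        concepts[atom.cui].append(atom)
    return dict(concepts)


concepts = cluster_by_cui(ATOMS)
sources = {atom.source for atom in concepts["C0001403"]}
print(len(concepts["C0001403"]), "terms from", len(sources), "sources")
```

Because each `Atom` keeps its source, term type and code, a lookup by CUI returns every synonym together with where it came from, which is the "source information preserved" point above.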
Semantic Network • 135 Semantic Types • Broad subject categories • At least one of these semantic types is assigned to every Metathesaurus concept • 54 Semantic Relationships • Useful, important links between Types • Hierarchical “isa” and other relations • Categorize the Metathesaurus • Enhance meaning of concepts
Semantic Relations Between Types • Disease or Syndrome associated_with Finding • Disease or Syndrome result_of Pathologic Function • Body Part, Organ, or Organ Component location_of Disease or Syndrome • Hormone affects Disease or Syndrome • Hormone causes Disease or Syndrome • Hormone complicates Disease or Syndrome
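One way to picture these type-level relations is as a set of (subject type, relation, object type) triples. The sketch below uses only the six relations listed on this slide; the function names `relations_between` and `is_allowed` are hypothetical, and the sketch ignores the inheritance of relations through the "isa" hierarchy.

```python
# Type-level relations copied from the slide, stored as
# (subject type, relation, object type) triples.
RELATIONS = {
    ("Disease or Syndrome", "associated_with", "Finding"),
    ("Disease or Syndrome", "result_of", "Pathologic Function"),
    ("Body Part, Organ, or Organ Component", "location_of", "Disease or Syndrome"),
    ("Hormone", "affects", "Disease or Syndrome"),
    ("Hormone", "causes", "Disease or Syndrome"),
    ("Hormone", "complicates", "Disease or Syndrome"),
}


def relations_between(subject_type, object_type):
    """Return the sorted relation names that link two semantic types."""
    return sorted(
        rel for subj, rel, obj in RELATIONS
        if subj == subject_type and obj == object_type
    )


def is_allowed(subject_type, relation, object_type):
    """Check whether a relation is listed between two semantic types."""
    return (subject_type, relation, object_type) in RELATIONS


print(relations_between("Hormone", "Disease or Syndrome"))
# ['affects', 'causes', 'complicates']
```

A check like `is_allowed` is how such a network could be used to constrain which relationships between concepts are plausible, given the semantic types of those concepts.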
UMLS Semantic Network and Metathesaurus [Figure: in the UMLS Semantic Network, Semantic Types a, b and c are linked by network relationships; in the UMLS Metathesaurus, Concept 1 and Concept 2 are linked by an inter-concept relationship; concept categorization links the concepts to semantic types.]
SPECIALIST Lexicon and Lexical Tools • English lexicon of 300K+ common words and biomedical terms • Lexical records encode information on: • Syntax • Morphology • Orthography • Used with associated lexical tools • in Metathesaurus production • in natural language processing applications
SPECIALIST Lexicon: An Example Lexical Entry
{base=disease
entry=E0023270
cat=noun
variants=reg
variants=uncount
compl=pphr(of,np|bone|)
compl=pphr(of,np|breast|)
compl=pphr(of,np|liver|)
compl=pphr(of,np|ovary|)}
• base: base form • entry: unique identifier • cat: part of speech category • variants: lexical variants • compl: prepositional phrase complements
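A record like this is easy to read into a dictionary. The following is a minimal sketch that assumes one `slot=value` pair per line, as laid out in the example; it is not the SPECIALIST lexical tools' own parser, and `parse_entry` and `complement_nouns` are hypothetical helper names.

```python
import re

# The example record from the slide, assuming one slot per line.
RECORD = """{base=disease
entry=E0023270
cat=noun
variants=reg
variants=uncount
compl=pphr(of,np|bone|)
compl=pphr(of,np|breast|)
compl=pphr(of,np|liver|)
compl=pphr(of,np|ovary|)}"""


def parse_entry(text):
    """Parse one record into a dict mapping each slot name to a list of values."""
    entry = {}
    # Drop the surrounding braces, then read one slot=value pair per line.
    for line in text.strip().strip("{}").splitlines():
        line = line.strip()
        if not line:
            continue
        slot, _, value = line.partition("=")
        entry.setdefault(slot, []).append(value)
    return entry


def complement_nouns(entry):
    """Extract the noun inside each pphr(of,np|...|) complement."""
    pattern = re.compile(r"pphr\(of,np\|([^|]+)\|\)")
    nouns = []
    for value in entry.get("compl", []):
        match = pattern.fullmatch(value)
        if match:
            nouns.append(match.group(1))
    return nouns


entry = parse_entry(RECORD)
print(entry["base"], entry["cat"], entry["variants"])
print(complement_nouns(entry))
```

Slots are stored as lists because a slot such as `variants` or `compl` can occur more than once in a single record.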
Different Representations • Often the same concept is represented differently in different medical ontologies • For example, blood is: • A tissue in GALEN, the UMLS Semantic Network and MENELAS • A body substance in FMA • A body fluid in WordNet and SNOMED CT • All are right
Representation of Blood • WordNet • Blood -> Liquid body substance -> Substance -> Entity • OpenGALEN • Blood -> Soft tissue -> Tissue -> Body substance -> Organic substance -> Substance -> Generalized substance -> Phenomenon • UMLS Semantic Network • Blood -> Tissue -> Fully-formed anatomical structure -> Anatomical structure -> Physical object -> Entity
Representation of Blood • SNOMED CT • Blood -> Blood material -> Body fluid -> Body substance -> Substance • FMA • Blood -> Body substance -> Material physical anatomical entity -> Physical anatomical entity -> Anatomical entity • MENELAS • Blood -> Body fluid -> Tissue -> Mass object -> Real object -> Physical object -> Abstract object -> Substratum -> Entity
Composite Representation of Blood [Figure: a single graph merging the representations of blood from WordNet, SNOMED CT, GALEN, UMLS, FMA and MENELAS, with each link labeled by the ontologies that assert it. Nodes shown: Substance, Body substance, Body fluid, Tissue, Anatomical structure, Fully-formed anatomical structure, Material physical, Blood and Coagulated blood.]
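The "is-a" chains for blood listed on the preceding slides can be compared programmatically. The sketch below copies the six chains as plain lists and matches labels by exact string only, which is a simplifying assumption: real alignment needs more than string equality (for instance, WordNet's "Liquid body substance" is not matched to "Body fluid" here). The function names are hypothetical.

```python
# "is-a" chains for blood copied from the slides, most specific class first.
PATHS = {
    "WordNet": ["Blood", "Liquid body substance", "Substance", "Entity"],
    "OpenGALEN": ["Blood", "Soft tissue", "Tissue", "Body substance",
                  "Organic substance", "Substance", "Generalized substance",
                  "Phenomenon"],
    "UMLS Semantic Network": ["Blood", "Tissue",
                              "Fully-formed anatomical structure",
                              "Anatomical structure", "Physical object",
                              "Entity"],
    "SNOMED CT": ["Blood", "Blood material", "Body fluid", "Body substance",
                  "Substance"],
    "FMA": ["Blood", "Body substance", "Material physical anatomical entity",
            "Physical anatomical entity", "Anatomical entity"],
    "MENELAS": ["Blood", "Body fluid", "Tissue", "Mass object", "Real object",
                "Physical object", "Abstract object", "Substratum", "Entity"],
}


def ontologies_with(label):
    """Return the ontologies whose chain for blood contains this exact label."""
    return sorted(name for name, path in PATHS.items() if label in path[1:])


def shared_ancestors(first, second):
    """Return ancestor labels of blood that two ontologies have in common."""
    return sorted(set(PATHS[first][1:]) & set(PATHS[second][1:]))


print(ontologies_with("Tissue"))
print(shared_ancestors("OpenGALEN", "SNOMED CT"))
```

Even with identical labels, the chains disagree on ordering: "Tissue" sits below "Body substance" in OpenGALEN but above "Body fluid" in MENELAS, which hints at the alignment problems discussed next.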
Issues in Aligning Biomedical Ontologies • It is not easy to align different ontologies because: • Alignment can give rise to inconsistencies • Classificatory principles are not applied consistently • Ontologies can convey different theories of a domain (e.g., Western vs. Oriental medicine)
Formal Ontologies • Built around the philosophical theories of identity, unity, rigidity and dependence, which can reduce inconsistencies in ontologies • For example, some classes represent a quality and others a process, so they are not allowed at the same place in the hierarchy • Provide a domain- and application-independent view of reality, which leads to: • Indefinite expandability: the ontology remains consistent with increasing content • Content and context independence: any kind of 'concept' can find its place • Accommodation of different levels of granularity • Help create consistent upper-level ontologies to which domain ontologies can be hooked
Conclusions • Biomedical knowledge is vast with hundreds of thousands of concepts and relations between them • Several ontologies have been constructed by different communities with somewhat different goals to represent and organize biomedical knowledge for humans as well as for computer processing • UMLS is an attempt to unify biomedical knowledge from different knowledge sources and is very widely used