Creating the CTSA Ontology Landscape: A Modular Strategy Barry Smith
For modularity to work, developers must accept some basic principles • for formulating definitions • of modularity • of user feedback for error correction and gap identification • for ensuring compatibility between modules • for using ontologies to annotate legacy data • for using ontologies to create new data • for developing user-specific views
The Modular Approach • Create a small set of plug-and-play ontologies as stable monohierarchies with a high likelihood of being reused • Create ontologies incrementally • Reuse existing ontology resources • Use these ontologies incrementally in annotating heterogeneous data • Annotating = arm's-length approach; the data and data models themselves remain as they are
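The arm's-length idea can be illustrated with a minimal Python sketch. Everything here is hypothetical: the record IDs, the field names, and the `EX:` term identifiers are placeholders, not real ontology IDs. The point is only that annotations live in a separate layer that refers to the data, while the legacy records are never modified.

```python
from dataclasses import dataclass

# Legacy records stay exactly as they are (hypothetical example data).
LEGACY_RECORDS = {
    "rec-001": {"dx_text": "liver cirrhosis", "site": "clinic A"},
    "rec-002": {"dx_text": "flu", "site": "clinic B"},
}


@dataclass(frozen=True)
class Annotation:
    record_id: str   # key into the untouched legacy data
    field_name: str  # which field of the record is being annotated
    term_id: str     # ontology term identifier (placeholder, not a real ID)


# The annotation layer is kept apart from the data it describes.
ANNOTATIONS = [
    Annotation("rec-001", "dx_text", "EX:cirrhosis"),
    Annotation("rec-002", "dx_text", "EX:influenza"),
]


def records_annotated_with(term_id, annotations=ANNOTATIONS):
    """Return the sorted IDs of records linked to a given ontology term."""
    return sorted({a.record_id for a in annotations if a.term_id == term_id})


print(records_annotated_with("EX:influenza"))
```

Because annotations are additive, a second ontology can contribute further `Annotation` entries for the same records without touching either the data or the first set of annotations.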
Benefits of Modularity • Brings a clean division of labor amongst domain experts, who can manage governance aspects pertaining to their own domains • Automatic consistency of the results of the distributed efforts – no room for contradiction • Additivity of annotations even when multiple independently developed ontologies are used • Lessons learned in developing and using one module can be used by the developers and users of later modules
Benefits of Modularity • Increased likelihood of reuse, since potential users will be aware that they are investing in the results of an authoritative coordinated approach of proven reliability • Increased value and portability of training in any given module • Incentivization of those responsible for individual modules
Benefits of Modularity • All of those involved can more easily inspect and criticize the results of others’ work • Creates a collaborative environment for ontology development • Serves as a platform for innovations which can be easily propagated throughout the whole system • Developing and using ontologies in a consistent fashion brings a number of network effects – the value of existing annotations increases as new annotations are added
You will need to embrace some strategy along these lines if you want to get funding for translational research • NIH Mandates for Sharing of Research Data • Investigators submitting an NIH application seeking $500,000 or more in any single year are expected to include a plan for data sharing (http://grants.nih.gov/grants/policy/data_sharing)
Logical standards can be only part of the solution • OWL … bring benefits primarily on the side of syntax (language) • What we need are standards on the semantics (content) side (via top-level ontologies), including standards for • top-level ontologies • common relations (part_of …) • relation of lower-level ontologies to each other and to the higher levels
BFO, DOLCE, SUMO • All exist in FOL and OWL versions • All have been tested in use • BFO: very small, truly domain-neutral • DOLCE: largely extends BFO, but built to support ‘linguistic and cognitive engineering’ • SUMO: has its own tiny mathematics, tiny physics, tiny biology (‘body-covering’, ‘fruit-Or-vegetable’), …
120+ ontology projects using BFO http://www.ifomis.org/bfo/ • Open Biomedical Ontologies Foundry • Ontology for General Medical Science • eagle-i, VIVO, CTSAconnect • AstraZeneca • Elsevier
How a common upper-level ontology can help resist ontology chaos • something to teach • training (expertise) is portable • each new ontology you confront will be more easily understood at the level of content • and more easily criticized, error-checked • provides starting-point for domain-ontology development • provides platform for tool-building and innovations • lessons learned in building and using one ontology can potentially benefit other ontologies • promotes shareability of data across disciplinary and other boundaries
OBO Foundry Modular Organization • top level: Basic Formal Ontology (BFO) • mid-level • domain level
BFO • A simple top-level ontology to support information integration in scientific research • No overlap with domain ontologies (organism, person, society, information, …) • Based on realism • No abstracta • Tested in many natural science domains
Basic Formal Ontology • Continuant • Independent Continuant (entity) • Dependent Continuant (property) • Occurrent (process, event) • property depends on bearer
depends_on • Continuant • Independent Continuant (thing) • Dependent Continuant (property) • Occurrent (process, event) • event depends on participant
Basic Formal Ontology • continuant • independent continuant: cellular component • dependent continuant: molecular function • occurrent: biological processes
roles, qualities • Continuant • Independent Continuant • Dependent Continuant: Quality, Disposition • Occurrent (process, event)
instance_of • types: Continuant, Independent Continuant (thing), Dependent Continuant (property), Occurrent (process, event) • instances (shown as dots in the original figure)
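The hierarchy on these slides can be written down as a small set of is_a links. The following Python sketch is an illustration only, not an official BFO release: the root term `entity` and the placement of `process` under `occurrent` are assumptions added for the example, and the helper names are hypothetical. It also shows what "monohierarchy" means in practice, namely that every term has exactly one parent.

```python
# (child, parent) is_a links as read off the slides; "entity" as root is an assumption.
IS_A_LINKS = [
    ("continuant", "entity"),
    ("occurrent", "entity"),
    ("independent continuant", "continuant"),
    ("dependent continuant", "continuant"),
    ("quality", "dependent continuant"),
    ("disposition", "dependent continuant"),
    ("process", "occurrent"),
]


def is_monohierarchy(links):
    """True if no term appears as a child more than once (single inheritance)."""
    children = [child for child, _ in links]
    return len(children) == len(set(children))


def ancestors(term, links):
    """Walk the is_a links upward and return the ancestors, nearest first."""
    parent_of = dict(links)
    chain, seen = [], {term}
    while term in parent_of:
        term = parent_of[term]
        if term in seen:  # guard against accidental cycles in the input
            raise ValueError(f"cycle detected at {term!r}")
        seen.add(term)
        chain.append(term)
    return chain


print(is_monohierarchy(IS_A_LINKS))
print(ancestors("disposition", IS_A_LINKS))
```

A domain ontology that plugs in under this top level only has to state where its own root terms attach, for example a disease term under `disposition`.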
Rationale of OBO Foundry coverage • axes: RELATION TO TIME, GRANULARITY
Four distinct classificatory tasks • 1. of people (patients, carriers, …) • 2. of diseases (cases, instances, problems, …) • 3. of courses of disease (symptoms, treatments …) • 4. of representations (records, observations, data, diagnoses …) • ICD confuses 1. and 2. • Most standard terminologies confuse 2. and 4.
Ontology for General Medical Science (OGMS) • person (patient, carrier, …) – independent continuant • disease (case, instance, problem, …) – specifically dependent continuant • course of disease (symptom, treatment …) – occurrent • representation (record, datum, diagnosis …) – generically dependent continuant • http://code.google.com/p/ogms/
Four distinct BFO categories • people (patients, carriers, …) – independent continuants • disease (case, instance, problem, condition …) – disposition • course of disease (symptom, episode, outbreak …) – realization of dispositions • representations (records, data, diagnoses …) – generically dependent continuants
Elucidation of Primitive Terms • ‘extended organism’ = the organism and all the material entities located within it • ‘bodily feature’ = either a physical part of the extended organism, a bodily quality, or a bodily process.
Elucidation of Primitive Terms • clinically abnormal - some bodily feature that • (1) is not part of the life plan for an organism of the relevant type (unlike loss of milk teeth, aging or pregnancy), • (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and • (3) is such that the elevated risk exceeds a certain threshold level.* *Compare: baldness
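The three conditions in this elucidation can be read as a conjunction. The Python sketch below is one hedged way to encode it; the slides give no numeric threshold, so `RISK_THRESHOLD` and every risk value are hypothetical numbers chosen only to make the example run, and `BodilyFeature` is an illustrative name.

```python
from dataclasses import dataclass

RISK_THRESHOLD = 0.1  # hypothetical value; the slides do not specify a number


@dataclass
class BodilyFeature:
    name: str
    part_of_life_plan: bool  # condition (1): expected for organisms of this type
    elevated_risk: float     # condition (2): added risk of pain, illness, death or dysfunction


def is_clinically_abnormal(feature, threshold=RISK_THRESHOLD):
    """Apply conditions (1)-(3) of the elucidation as a simple conjunction."""
    not_in_life_plan = not feature.part_of_life_plan       # (1)
    carries_risk = feature.elevated_risk > 0                # (2)
    exceeds_threshold = feature.elevated_risk > threshold   # (3)
    return not_in_life_plan and carries_risk and exceeds_threshold


# Illustrative inputs only; the risk figures are made up.
for f in [
    BodilyFeature("loss of milk teeth", part_of_life_plan=True, elevated_risk=0.0),
    BodilyFeature("baldness", part_of_life_plan=False, elevated_risk=0.01),
    BodilyFeature("necrotic liver tissue", part_of_life_plan=False, elevated_risk=0.9),
]:
    print(f.name, is_clinically_abnormal(f))
```

Under these made-up numbers, baldness fails only condition (3), which is the contrast the slide's footnote appears to point at.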
Disorder • A material entity (fiat object part) which is clinically abnormal and part of an extended organism • Compare: Downtown Santa Barbara, Mount Everest
Definitions - Foundational Terms • Pathological Process =def. – A bodily process that is clinically abnormal. • Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.
Disease Course =Def. The sum of processes through which a given disease instance is realized.
A disease is a disposition • etiological process • produces • disorder • bears • disposition • realized_in • pathological process • produces • abnormal bodily features • recognized_as • signs & symptoms • used_in • interpretive process • produces • diagnosis
Cirrhosis - environmental exposure • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out cirrhosis • suggests • Laboratory tests • produces • Test results - elevated liver enzymes in serum • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease cirrhosis • Etiological process - phenobarbital-induced hepatic cell death • produces • Disorder - necrotic liver • bears • Disposition (disease) - cirrhosis • realized_in • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death • produces • Abnormal bodily features • recognized_as • Symptoms - fatigue, anorexia • Signs - jaundice, splenomegaly
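The cirrhosis slide instantiates the general disease model as a chain of entities joined by relations. A minimal Python sketch of that chain as subject-relation-object triples follows. The node labels are shortened from the slide, the interpretive side is simplified to a single interpretive process, and the function name `follow` is hypothetical.

```python
# Triples for the cirrhosis case, shortened from the slide (simplified sketch).
TRIPLES = [
    ("etiological process: hepatic cell death", "produces", "disorder: necrotic liver"),
    ("disorder: necrotic liver", "bears", "disposition (disease): cirrhosis"),
    ("disposition (disease): cirrhosis", "realized_in",
     "pathological process: abnormal tissue repair"),
    ("pathological process: abnormal tissue repair", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "signs & symptoms"),
    ("signs & symptoms", "used_in", "interpretive process"),
    ("interpretive process", "produces", "diagnosis"),
]


def follow(start, triples):
    """Follow the single outgoing edge of each node; return (relations, final node)."""
    outgoing = {subject: (relation, obj) for subject, relation, obj in triples}
    relations, node, visited = [], start, {start}
    while node in outgoing:
        relation, node = outgoing[node]
        if node in visited:  # stop rather than loop if the input contains a cycle
            break
        visited.add(node)
        relations.append(relation)
    return relations, node


relations, end = follow("etiological process: hepatic cell death", TRIPLES)
print(" -> ".join(relations))
print(end)
```

The same list of relations should come out for the influenza and Huntington's slides if their nodes are substituted, which is one way to check that a new disease description follows the shared pattern.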
Influenza - infectious • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out influenza • suggests • Laboratory tests • produces • Test results - elevated serum antibody titers • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease flu • Etiological process - infection of airway epithelial cells with influenza virus • produces • Disorder - viable cells with influenza virus • bears • Disposition (disease) - flu • realized_in • Pathological process - acute inflammation • produces • Abnormal bodily features • recognized_as • Symptoms - weakness, dizziness • Signs - fever
Huntington’s Disease - genetic • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out Huntington’s • suggests • Laboratory tests • produces • Test results - molecular detection of the HTT gene with >39 CAG repeats • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease • Etiological process - inheritance of >39 CAG repeats in the HTT gene • produces • Disorder - chromosome 4 with abnormal mHTT • bears • Disposition (disease) - Huntington’s disease • realized_in • Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum • produces • Abnormal bodily features • recognized_as • Symptoms - anxiety, depression • Signs - difficulties in speaking and swallowing
Dispositions and Predispositions Some dispositions are predispositions to other dispositions.
HNPCC - genetic predisposition • Etiological process - inheritance of a mutant mismatch repair gene • produces • Disorder - chromosome 3 with abnormal hMLH1 • bears • Disposition (disease) - Lynch syndrome • realized_in • Pathological process - abnormal repair of DNA mismatches • produces • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2) • bears • Disposition (disease) - non-polyposis colon cancer • realized_in • Symptoms (including pain)
Arterial Aneurysm • Disposition – atherosclerosis • realized_in • Pathological process – fatty material collects within the walls of arteries • produces • Disorder – artery with weakened wall • bears • Disposition – of artery to become distended • realized_in • Pathological process – process of distending • produces • Disorder – arterial aneurysm • bears • Disposition – of artery to rupture • realized_in • Pathological process – (catastrophic event) of rupturing • produces • Disorder – ruptured artery, arterial system with dangerously low blood pressure • bears • Disposition – circulatory failure • realized_in • Pathological process – exsanguination, failure of homeostasis • produces • Death
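The aneurysm slide repeats one pattern several times: a disposition is realized_in a pathological process, which produces a disorder, which bears the next disposition. A small Python sketch, with hypothetical names `NEXT_LINK` and `link_chain`, can check that a chain follows this alternation and fill in the relations. The final "produces Death" step is left out because death is not typed as a disorder on the slide.

```python
# Entities in slide order as (kind, label); the closing "Death" step is omitted.
CHAIN = [
    ("disposition", "atherosclerosis"),
    ("pathological process", "fatty material collects within the walls of arteries"),
    ("disorder", "artery with weakened wall"),
    ("disposition", "of artery to become distended"),
    ("pathological process", "process of distending"),
    ("disorder", "arterial aneurysm"),
    ("disposition", "of artery to rupture"),
    ("pathological process", "rupturing"),
    ("disorder", "ruptured artery"),
    ("disposition", "circulatory failure"),
    ("pathological process", "exsanguination, failure of homeostasis"),
]

# For each kind: the relation leading out of it and the kind expected next.
NEXT_LINK = {
    "disposition": ("realized_in", "pathological process"),
    "pathological process": ("produces", "disorder"),
    "disorder": ("bears", "disposition"),
}


def link_chain(chain):
    """Return (subject, relation, object) triples; raise if the kinds do not alternate."""
    triples = []
    for (kind, label), (next_kind, next_label) in zip(chain, chain[1:]):
        relation, expected_kind = NEXT_LINK[kind]
        if next_kind != expected_kind:
            raise ValueError(
                f"{kind} {label!r} must be followed by a {expected_kind}, got {next_kind}"
            )
        triples.append((label, relation, next_label))
    return triples


for subject, relation, obj in link_chain(CHAIN):
    print(f"{subject} --{relation}--> {obj}")
```

Chains that begin with a disorder, as in the Alzheimer's and stroke slides, pass the same check because the pattern is cyclic.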
Systemic arterial hypertension • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out hypertension • suggests • Laboratory tests • produces • Test results - • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease hypertension • Etiological process – abnormal reabsorption of NaCl by the kidney • produces • Disorder – abnormally large scattered molecular aggregate of salt in the blood • bears • Disposition (disease) - hypertension • realized_in • Pathological process – exertion of abnormal pressure against arterial wall • produces • Abnormal bodily features • recognized_as • Symptoms - • Signs – elevated blood pressure
Type 2 Diabetes Mellitus • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out diabetes mellitus • suggests • Laboratory tests – fasting serum blood glucose, oral glucose challenge test, and/or blood hemoglobin A1c • produces • Test results - • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease type 2 diabetes mellitus • Etiological process – • produces • Disorder – abnormal pancreatic beta cells and abnormal muscle/fat cells • bears • Disposition (disease) – diabetes mellitus • realized_in • Pathological processes – diminished insulin production, diminished muscle/fat uptake of glucose • produces • Abnormal bodily features • recognized_as • Symptoms – polydipsia, polyuria, polyphagia, blurred vision • Signs – elevated blood glucose and hemoglobin A1c
Type I hypersensitivity to penicillin • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - • suggests • Laboratory tests – • produces • Test results – occasionally, skin testing • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease type I hypersensitivity to penicillin • Etiological process – sensitizing of mast cells and basophils during exposure to penicillin-class substance • produces • Disorder – mast cells and basophils with epitope-specific IgE bound to Fc epsilon receptor I • bears • Disposition (disease) – type I hypersensitivity • realized_in • Pathological process – type I hypersensitivity reaction • produces • Abnormal bodily features • recognized_as • Symptoms – pruritus, shortness of breath • Signs – rash, urticaria, anaphylaxis
Early Onset Alzheimer’s Disease • Disorder – mutations in APP, PSEN1 and PSEN2 • bears • Disposition – impaired APP processing • realized_in • Pathological process – accumulation of intra- and extracellular protein in the brain • produces • Disorder – amyloid plaque and neurofibrillary tangles • bears • Disposition – of neurons to die • realized_in • Pathological process – neuronal loss • produces • Disorder – cognitive brain regions damaged and reduced in size • bears • Disposition (disease) – Alzheimer’s dementia • realized_in • Symptoms – episodic memory loss and other cognitive domain impairment
Hemorrhagic stroke • Disorder – cerebral arterial aneurysm • bears • Disposition – of weakened artery to rupture • realized_in • Pathological process – rupturing of weakened blood vessel • produces • Disorder – intraparenchymal cerebral hemorrhage • bears • Disposition (disease) – to increased intra-cranial pressure • realized_in • Pathological process – increasing intra-cranial pressure, compression of brain structures • produces • Disorder – cerebral ischemia, cerebral neuronal death • bears • Disposition (disease) – stroke • realized_in • Symptoms – weakness/paralysis, loss of sensation, etc.
Ontology modules extending OGMS • Sleep Domain Ontology (SDO) • Infectious Disease Ontology (IDO) • Ontology of Medically Relevant Social Entities (OMRSE) • Vital Sign Ontology (VSO) • Mental Disease Ontology (MD) • Neurological Disease Ontology (ND)
Infectious Disease Ontology (IDO) • IDO Core: • General terms in the ID domain. • A hub for all IDO extensions. • IDO Extensions: • Disease specific. • Developed by subject matter experts. • Provides: • Clear, precise, and consistent natural language definitions • Computable logical representations (OWL, OBO)
How IDO evolves • CORE and SPOKES: Domain ontologies • SEMI-LATTICE: By subject matter experts in different communities of interest. • Modules shown in the diagram: IDOMAL, IDOCore, IDOHIV, IDOFLU, IDORatSa, IDORatStrep, IDOStrep, IDOSa, IDOMRSa, IDOAntibioticResistant, IDOHumanSa, IDOHumanStrep, IDOHumanBacterial
IDO Core • Contains general terms in the ID domain: • E.g., ‘colonization’, ‘pathogen’, ‘infection’ • A contract between IDO extension ontologies and the datasets that use them. • Intended to represent information along several dimensions: • biological scale (gene, cell, organ, organism, population) • discipline (clinical, immunological, microbiological) • organisms involved (host, pathogen, and vector types)
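The core-plus-extensions layout can be sketched as an import graph in which each extension pulls in the modules it builds on. In the Python sketch below, the module names come from the slides, but the specific import links and the extension-level terms are assumptions made for illustration; only the three IDO Core terms are taken from the slide. Letting `IDOMRSa` import two modules illustrates the semi-lattice idea that an extension may have more than one parent.

```python
# Which modules each module imports (links are illustrative assumptions).
IMPORTS = {
    "IDOCore": [],
    "IDOSa": ["IDOCore"],
    "IDOAntibioticResistant": ["IDOCore"],
    "IDOHumanSa": ["IDOSa"],
    "IDOMRSa": ["IDOSa", "IDOAntibioticResistant"],
}

# Terms declared locally in each module (extension terms are hypothetical).
TERMS = {
    "IDOCore": {"colonization", "pathogen", "infection"},
    "IDOSa": {"Staphylococcus aureus infection"},
    "IDOAntibioticResistant": {"antibiotic resistance"},
    "IDOMRSa": {"methicillin-resistant Staphylococcus aureus infection"},
}


def import_closure(module, imports=IMPORTS):
    """Return the module together with everything it imports, directly or indirectly."""
    seen, stack = set(), [module]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(imports[current])
    return seen


def visible_terms(module, imports=IMPORTS, terms=TERMS):
    """Union of the terms declared in the module and in all modules it imports."""
    result = set()
    for name in import_closure(module, imports):
        result |= terms.get(name, set())
    return result


print(sorted(import_closure("IDOMRSa")))
print("pathogen" in visible_terms("IDOMRSa"))
```

Because every extension reaches `IDOCore` through its imports, data annotated with an extension can still be queried using the core terms.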
Sample IDO Definitions • Host of Infectious Agent (BFO Role): A role borne by an organism in virtue of the fact that its extended organism contains an infectious agent. • Extended Organism (OGMS): An object aggregate consisting of an organism and all material entities located within the organism, overlapping the organism, or occupying sites formed in part by the organism. • Infectious Agent: A pathogen whose pathogenic disposition is an infectious disposition.
IDO and IDOSa • Scale of the infection (disorder) from Shetty, Tang, and Andrews, 2009