The Problem of Reusability of Biomedical Data: OBO Foundry & HL7 RIM • Barry Smith
DCRI Project Goal • to facilitate research in CV and TB by increasing the re-usability of data collected in the healthcare setting. http://ontology.buffalo.edu/smith
Knowledge Environments for Biomedical Research (KEBR) • NIH Conference, December 11-12, 2006 • Knowledge environments must be characterized by: • sustainability • adaptability • interoperability • evolvability
sustainability • biologists have huge amounts of data, which they need to manage and make accessible • have worked out a sustainable way of achieving this result
adaptability • best achieved through modularity, • each portion of the knowledge environment controlled by appropriate domain experts
interoperability • the modules should use a common (simple) logic and a common, thoroughly-tested (simple) ontology • unification with a light touch
evolvability • = change in light of scientific advance • knowledge environments must be tied to biological and clinical research • must be able to evolve incrementally • must ensure backwards compatibility with legacy annotations
how do we make different sorts of data combinable in ways useful to the human beings who carry out research?
how was this problem solved in the years before computers? • how did clinical researchers from different disciplines communicate? • how did they learn to communicate?
Ontology-based methodology • for clinical and translational research
through the basic biomedical sciences: anatomy, physiology, biochemistry, histology, ...
what is “metadata” ?? • data models (HL7 RIM, BRIDG, ... UML) • vs. • ontologies • How should we represent data? • vs. • How should we represent reality?
what cellular component? what molecular function? what biological process?
Gene Ontology
checking the ontology: everything can be traced back to instances in reality • serotonin is_a biogenic amine • every instance of serotonin is an instance of biogenic amine
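The instance-level reading of is_a on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type names come from the slide, while the instance identifiers and the helper names `is_a_holds` and `check_ontology` are hypothetical.

```python
# Toy data: type names are from the slide; the instance IDs are invented.
IS_A = {"serotonin": "biogenic amine"}  # child type -> asserted parent type

INSTANCES = {
    "serotonin": {"molecule_001", "molecule_002"},
    "biogenic amine": {"molecule_001", "molecule_002", "molecule_003"},
}


def is_a_holds(child, parent, instances):
    """True if every instance of `child` is also an instance of `parent`."""
    return instances[child] <= instances[parent]


def check_ontology(is_a, instances):
    """Return the is_a assertions that fail the instance-level reading."""
    return [
        (child, parent)
        for child, parent in is_a.items()
        if not is_a_holds(child, parent, instances)
    ]


# An empty list means every asserted is_a link is backed by the instances.
print(check_ontology(IS_A, INSTANCES))
```

Under this reading an assertion is only as good as the instances behind it: reversing the link (biogenic amine is_a serotonin) would fail, because `molecule_003` is not an instance of serotonin.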
Heparin therapy is_a written or spoken designation of a concept • mouse is_a common name for the species Mus musculus • virus is_a environment ontology • unclassified Lulworthiales is_a environmental samples
Logical power of the ontology • Example: ontologies facilitate grouping of annotations • brain 20 • hindbrain 15 • rhombomere 10 • Query brain without ontology: 20 • Query brain with ontology: 45
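The arithmetic behind the two query results can be reproduced with a small sketch. The annotation counts are the ones on the slide. The hierarchy (rhombomere under hindbrain, hindbrain under brain) is an assumption inferred from the total of 45, since the slide does not say which relation links the terms, and the function names are hypothetical.

```python
from collections import defaultdict

# Annotations attached directly to each term (numbers from the slide).
DIRECT_COUNTS = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# Assumed hierarchy, child -> parent; the linking relation is not given on the slide.
PARENT = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def children_of(parent_map):
    """Invert a child -> parent map into parent -> list of children."""
    children = defaultdict(list)
    for child, parent in parent_map.items():
        children[parent].append(child)
    return children


def query_without_ontology(term):
    """Count only the annotations made directly to `term`."""
    return DIRECT_COUNTS.get(term, 0)


def query_with_ontology(term):
    """Count annotations made to `term` and to all of its descendants."""
    children = children_of(PARENT)
    total = 0
    stack = [term]
    while stack:
        current = stack.pop()
        total += DIRECT_COUNTS.get(current, 0)
        stack.extend(children[current])
    return total


print(query_without_ontology("brain"))  # 20
print(query_with_ontology("brain"))     # 45
```

The extra 25 annotations come from hindbrain (15) and rhombomere (10), which a plain string match on "brain" would miss.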
Biorepository Ontology • Chemical Entities of Biological Interest (ChEBI) • Clinical Investigation Ontology (CIO) • Common Anatomy Reference Ontology (CARO) • Disease Ontology (DO) • Foundational Model of Anatomy (FMA) • Cell Ontology (CL) • Gene Ontology (GO) • Mosquito Anatomy Ontology (MAO) • Ontology for Biomedical Investigations (OBI) • Phenotypic Quality Ontology (PaTO) • Plant Ontology (PO) • Protein Ontology (PRO) • Relation Ontology (RO) • RNA Ontology (RnaO) • Sequence Ontology (SO) • Xenopus Anatomy Ontology (XAO) • Zebrafish Anatomical Ontology (ZAO)
Biorepository Ontology • Chemical Entities of Biological Interest (ChEBI) • Clinical Investigation Ontology (CIO) • Common Anatomy Reference Ontology (CARO) • Disease Ontology (DO): interoperation with SNOMED CT • Foundational Model of Anatomy (FMA) • Cell Ontology (CL) • Gene Ontology (GO) • Mosquito Anatomy Ontology (MAO) • Ontology for Biomedical Investigations (OBI) • Phenotypic Quality Ontology (PaTO): signs and symptoms • Plant Ontology (PO) • Protein Ontology (PRO) • Relation Ontology (RO) • RNA Ontology (RnaO) • Sequence Ontology (SO) • Xenopus Anatomy Ontology (XAO) • Zebrafish Anatomical Ontology (ZAO)
Building out from the original GO
BFO Top-Level Ontology Continuant Occurrent (always dependent on one or more independent continuants) Independent Continuant Dependent Continuant
= A representation of top-level types Continuant Occurrent biological process biological process Independent Continuant Dependent Continuant cell component molecular function
Top-Level Ontology Continuant Occurrent Independent Continuant Dependent Continuant Quality Function instances (in space and time)
BFO as organising structure
http://obofoundry.org • clinical medicine rooted in the basic biological sciences via high-quality controlled vocabularies
next step: create a repertoire of disease ontologies built out of OBO Foundry elements
Ontology for Acute Respiratory Distress Syndrome
what data do we have? what data do the others have? what data do we not have? Draft Ontology for Multiple Sclerosis
HL7 http://hl7-watch.blogspot.com/
Schadow • The RIM ‘defines the grammar of a language for information in healthcare’. • ‘All data is in a form in which Entities (people, places, things: NOUNS) are related in Roles (RELATORS) to other Entities, and through their participations (PREPOSITIONS) interact in Acts (VERBS).’
Problems of scope • Act = intentional action • No processes (verb items) outside Act • How can the RIM deal with disease processes, drug interactions, traffic accidents, adverse events?
Problems of scope • Entity = persons, places, organizations, material • No things (noun items) outside Entity • How can the RIM deal with wounds, fractures? • How can the RIM deal with diseases?
Mayo on ‘Act’ as “intentional action” • Is a snake bite or bee sting an intentional action? • Is a knife stabbing an intentional action? • Is a car accident an intentional action? • When a child swallows the contents of a bottle of poison is that an intentional action? http://informatics.mayo.edu/wiki/index.php/Intentionality_of_Act_and_the_Future_of_Observations
Diseases in the RIM • ... are not Acts • ... are not Entities • ... are not Roles, Participations, Role-Links ... • So what are they?
Correct Answer • Diseases fall outside the scope of the RIM. The RIM is concerned to standardize the way in which data is represented in messages, etc. It is not concerned to standardize the way in which diseases, alleles, drug interactions, etc., are represented.
The RIM’s answer • Diseases are Acts of Observation • A case of pneumonia is an Act of Observation of a case of pneumonia • A diagnosis is an Act of Observation of an Act of Observation
HL7’s Clinical Genomics Standard Specifications • an individual allele as an Act of Observation • a phenotype is an Act of Observation of an Act of Observation
What should be done? • Create a clinical ontology which allows adequate treatment of all the types of entity relevant to information exchange in biomedicine, including: • non-intentional processes, diseases, infections, biomolecules, etc.
BFO Top-Level Ontology Continuant Occurrent Independent Continuant Dependent Continuant Act Physical Event ... ... Quality Function
RIM Ontology Continuant Occurrent Entity (Intentional) Act Biomolecule Disease Drug interaction ... Person Physical Thing Organization
BFO normalized RIM Continuant Occurrent Independent Continuant Dependent Continuant Act Physical Event Everything made of molecules Condition Request Observation Drug interaction Temperature Disease
What is new Continuant Occurrent Independent Continuant Dependent Continuant Act Physical Event Everything made of molecules Condition Request Observation Drug interaction Temperature Disease
Coherent interoperation with ChEBI, PATO, SNOMED, MedDRA, etc. Continuant Occurrent Independent Continuant Dependent Continuant Act Physical Event Everything made of molecules Condition Request Observation Drug interaction Temperature Disease
what data do we have? what data do the others have? what data do we not have? Draft Ontology for Multiple Sclerosis
Methodology of cross-products • compound terms and definitions should be built out of constituent terms drawn from ontologies. E.g. • PaTO ‘increased concentration’ • FMA ‘blood’ • ChEBI term ‘glucose’ • blood glucose phenotypes. • Foundry provides rigor for post-coordination • Contributions to solving the silo problem
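The cross-product idea can be illustrated with a minimal Python sketch. The three constituent terms are the ones named on the slide. The class names, the field names (`quality`, `entity`, `location`) and the label template are hypothetical choices made for this example, not part of any OBO Foundry specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A term together with the ontology it is drawn from."""
    ontology: str
    label: str


@dataclass(frozen=True)
class CrossProduct:
    """A compound term composed from constituent terms (post-coordination)."""
    quality: Term   # e.g. a PaTO quality
    entity: Term    # e.g. a ChEBI chemical entity
    location: Term  # e.g. an FMA anatomical entity

    def label(self) -> str:
        # Hypothetical template; real definitions would be more careful.
        return f"{self.quality.label} of {self.entity.label} in {self.location.label}"

    def sources(self) -> set:
        """The ontologies the constituent terms were drawn from."""
        return {t.ontology for t in (self.quality, self.entity, self.location)}


phenotype = CrossProduct(
    quality=Term("PaTO", "increased concentration"),
    entity=Term("ChEBI", "glucose"),
    location=Term("FMA", "blood"),
)
print(phenotype.label())  # increased concentration of glucose in blood
```

Because each part keeps a pointer to its source ontology, two groups composing the same blood glucose phenotype end up with the same constituents rather than two unrelated free-text strings.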
Open questions • relations to • generating forms • controlled vocabulary for clinical care • common data elements • clinical trials • treatments • signs and symptoms • (clinical and pre-clinical manifestations)
Open questions • role of • stakeholders • professional society support • champions who will test • role of rare disease researcher communities • mandates