What is an ontology and Why should you care? Barry Smith ontology.buffalo/smith

Presentation Transcript


  1. What is an ontology and Why should you care? Barry Smith http://ontology.buffalo.edu/smith

  2. Uses of ‘ontology’ in PubMed abstracts

  3. By far the most successful: GO (Gene Ontology)

  4. You’re interested in which genes control heart muscle development • 17,536 results

  5. Microarray data shows changed expression of thousands of genes. How will you spot the patterns? [Figure: expression over time, control vs. attacked; gene groups labeled Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes, Puparial adhesion, Molting cycle, hemocyanin, Amino acid catabolism, Lipid metabolism, Peptidase activity, Protein catabolism]

  6. You’re interested in which of your hospital’s patient data is relevant to understanding how genes control heart muscle development

  7. Lab / pathology data • EHR data • Clinical trial data • Family history data • Medical imaging • Microarray data • Model organism data • Flow cytometry • Mass spec • Genotype / SNP data • How will you spot the patterns? How will you find the data you need?

  8. One strategy for bringing order into this huge conglomeration of data is through the use of Common Data Elements • Discipline-specific (cancer, NIAID, …) • Do not solve the problems of balkanization (data siloes) • Do not evolve gracefully as knowledge advances • Support data cumulation, but do not readily support data integration and computation

  9. How does the Gene Ontology work? with thanks to Jane Lomax, Gene Ontology Consortium

  10. 1. GO provides a controlled system of representations for use in annotating data • multi-species, multi-disciplinary, open source • contributing to the cumulativity of scientific results obtained by distinct research communities • compare use of kilograms, meters, seconds … in formulating experimental results

  11. Definitions

  12. Gene products involved in cardiac muscle development in humans

  13. GO provides answers to three types of questions for each gene product: • in what parts of the cell has it been identified? • exercising what types of molecular functions? • with what types of biological processes? When is a particular gene product involved • in the course of normal development? • in the process leading to abnormality? With what functions is the gene product associated in other biological processes?

  14. Some pain-related terms in GO: • GO:0048265 response to pain • GO:0019233 sensory perception of pain • GO:0048266 behavioral response to pain • GO:0019234 sensory perception of fast pain • GO:0019235 sensory perception of slow pain • GO:0051930 regulation of sensory perception of pain • GO:0050967 detection of electrical stimulus during sensory perception of pain • GO:0050968 detection of chemical stimulus involved in sensory perception of pain • GO:0050966 detection of mechanical stimulus involved in sensory perception of pain

  15. GO:0050968 detection of chemical stimulus involved in sensory perception of pain

  16. GO provides a tool for algorithmic reasoning

  17. Hierarchical view representing relations between represented types
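The algorithmic reasoning and hierarchical view mentioned on the two slides above can be illustrated with a small sketch. It assumes a hand-written set of is_a links among a few of the pain-related terms from slide 14 and a made-up annotation table (`geneA`, `geneB` and `geneC` are hypothetical). It is not an export from GO and not GO tooling. It only shows how an annotation to a specific term can be counted towards its more general ancestors.

```python
# Illustrative child -> parent is_a links among pain-related GO terms.
# ASSUMPTION: hand-written for this example, not exported from the Gene Ontology.
IS_A = {
    "GO:0019234": ["GO:0019233"],  # fast pain -> sensory perception of pain
    "GO:0019235": ["GO:0019233"],  # slow pain -> sensory perception of pain
    "GO:0048266": ["GO:0048265"],  # behavioral response to pain -> response to pain
}

# ASSUMPTION: hypothetical gene products and their direct annotations.
ANNOTATIONS = {
    "geneA": {"GO:0019234"},
    "geneB": {"GO:0019235"},
    "geneC": {"GO:0048266"},
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def annotated_to(term, annotations=ANNOTATIONS):
    """Gene products annotated to `term` directly or via a more specific term."""
    return sorted(
        gene
        for gene, terms in annotations.items()
        if any(t == term or term in ancestors(t) for t in terms)
    )


if __name__ == "__main__":
    # Query the general term; genes annotated to its subtypes are included.
    print(annotated_to("GO:0019233"))
```

Querying the general term "sensory perception of pain" returns the gene products annotated only to its fast and slow subtypes, which is the kind of inference a flat term list cannot support.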

  18. GO allows a new kind of biological research, based on analysis and comparison of the massive quantities of annotations linking GO terms to gene products

  19. One standard method: Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers; using functional information captured by GO for given gene product types, they identified 189 as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention. Science. 2006 Oct 13;314(5797):268-74.
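One common way to use GO annotations in studies of this kind is an over-representation test: given a list of genes of interest, ask whether a GO term is attached to more of them than chance would predict. The sketch below computes a one-sided hypergeometric p-value using only the Python standard library. It is an illustration of the general technique, not necessarily the procedure used in the cited paper, and the function name and the numbers in the example are assumptions.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """One-sided hypergeometric p-value P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population carrying the GO term
    sample:     size of the gene list of interest
    hits:       genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers (hypothetical): 10 genes, 5 carry the term,
    # and all 5 genes in the list of interest carry it.
    print(enrichment_p_value(10, 5, 5, 5))
```

In practice the test is repeated for every GO term and the p-values are corrected for multiple testing, a step omitted here.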

  20. Uses of GO in studies of: • Biomedical discovery acceleration, with applications to craniofacial development. PMID: 19325874 • Persistent changes in spinal cord gene expression after recovery from inflammatory hyperalgesia: a preliminary study on pain memory. PMID: 18366630 • Spinal cord transcriptional profile analysis reveals protein trafficking and RNA processing as prominent processes regulated by tactile allodynia. PMID: 17069981 • Immune system involvement in abdominal aortic aneurysms. PMID: 17634102

  21. $100 million invested in literature curation using GO • over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO • experimental results reported in 52,000 scientific journal articles manually annotated by expert biologists using GO • ontologies provide the basis for capturing biological theories in computable form

  22. GO is amazingly successful in overcoming problems of balkanization but it covers only generic biological entities of three sorts: • cellular components • molecular functions • biological processes and it does not provide representations of diseases, symptoms, …

  23. Extending the GO methodology to other domains of biology and medicine

  24. The Open Biomedical Ontologies (OBO) Foundry

  25. OBO Foundry recognized by NIH as a framework to address mandates for re-usability of data collected through Federally funded research; see NIH PAR-07-425: Data Ontologies for Biomedical Research (R01)

  26. OBO Foundry provides • tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort • an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology • automatic web-based linkage between biological knowledge resources (massive integration of databases across species and biological system)

  27. An ontology is not a terminology. Existing term lists and CDEs: • built to serve specific data-processing • in ad hoc ways. Ontologies: • designed from the start to ensure integratability and reusability of data • by incorporating a common logical structure

  28. The Open Biomedical Ontologies (OBO) Foundry

  29. What ontology can do for pain • Cleveland Clinic Semantic Database: how to mine legacy data in cardiovascular surgery • to reveal information on outcomes • to identify subjects for clinical trials • to allow virtual experimentation • Goal: to extend this approach across the entirety of medicine, starting with signs, symptoms and other basic categories

  30. Three distinct classificatory tasks: 1. of people (patients, carriers, …) 2. of diseases (cases, instances, problems, …) 3. of presentations (diagnoses, signs, observations, …) • ICD confuses 1. & 2. • HL7 and most standard terminologies confuse 2. & 3.

  31. Big Picture

  32. A disease is a disposition rooted in a physical disorder in the organism and realized in pathological processes. [Diagram: etiological process produces disorder • disorder bears disposition • disposition realized_in pathological process • pathological process produces abnormal bodily features • abnormal bodily features recognized_as signs & symptoms • signs & symptoms used_in interpretive process • interpretive process produces diagnosis]

  33. Elucidation of Primitive Terms • ‘bodily feature’ - an abbreviation for a physical component, a bodily quality, or a bodily process. • disposition - an attribute describing the propensity to initiate certain specific sorts of processes when certain conditions are satisfied. • clinically abnormal - some bodily feature that • (1) is not part of the life plan for an organism of the relevant type (unlike aging or pregnancy), • (2) is causally linked to an elevated risk either of pain or other feelings of illness, or of death or dysfunction, and • (3) is such that the elevated risk exceeds a certain threshold level.* *Compare: baldness

  34. Definitions - Foundational Terms • Disorder =def. – A causally linked combination of physical components that is clinically abnormal. • Pathological Process =def. – A bodily process that is a manifestation of a disorder and is clinically abnormal. • Disease =def. – A disposition (i) to undergo pathological processes that (ii) exists in an organism because of one or more disorders in that organism.

  35. Dispositions and Predispositions • All diseases are dispositions; not all dispositions are diseases. • A predisposition is a disposition. • Predisposition to Disease of Type X =def.– A disposition in an organism that constitutes an increased risk of the organism’s subsequently developing the disease X. • HNPCC is caused by a • disorder (mutation) in a DNA mismatch repair gene that • disposes to the acquisition of additional mutations from defective DNA repair processes, and thus is a • predisposition to the development of colon cancer.

  36. Definitions - Clinical Evaluation Terms • Sign =def. – A bodily feature of a patient that is observed in a physical examination and is deemed by the clinician to be of clinical significance. (Objectively observable features) • Symptom =def. – A bodily feature of a patient that is observed by the patient and is hypothesized by the patient to be a realization of a disease. (A restricted family of phenomena (including pain, nausea, anger, drowsiness), which are of their nature experienced in the first person)

  37. Cirrhosis - environmental exposure • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out cirrhosis • suggests • Laboratory tests • produces • Test results - elevated liver enzymes in serum • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease cirrhosis • Etiological process - phenobarbital-induced hepatic cell death • produces • Disorder - necrotic liver • bears • Disposition (disease) - cirrhosis • realized_in • Pathological process - abnormal tissue repair with cell proliferation and fibrosis that exceed a certain threshold; hypoxia-induced cell death • produces • Abnormal bodily features • recognized_as • Symptoms - fatigue, anorexia • Signs - jaundice, splenomegaly
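The cirrhosis slide reads as a chain of typed links from etiology to observable features, and such a chain can be written down as data. The sketch below records the causal half of the slide as (subject, relation, object) triples and walks it from the etiological process onward. The relation names come from the slide. The tuple encoding, the shortened node labels and the helper `follow` are assumptions made for illustration.

```python
# Causal half of the cirrhosis slide as (subject, relation, object) triples.
# ASSUMPTION: tuple encoding and shortened labels are choices made for this sketch.
CIRRHOSIS = [
    ("phenobarbital-induced hepatic cell death", "produces", "necrotic liver"),
    ("necrotic liver", "bears", "cirrhosis"),
    ("cirrhosis", "realized_in", "abnormal tissue repair"),
    ("abnormal tissue repair", "produces", "abnormal bodily features"),
    ("abnormal bodily features", "recognized_as", "symptoms and signs"),
]


def follow(start, triples):
    """Walk outgoing links from `start` and return nodes and relations in order.

    Assumes each subject has at most one outgoing link; stops on a repeat
    so that accidental cycles cannot loop forever.
    """
    outgoing = {s: (r, o) for s, r, o in triples}
    path, node, visited = [start], start, set()
    while node in outgoing and node not in visited:
        visited.add(node)
        relation, node = outgoing[node]
        path += [relation, node]
    return path


if __name__ == "__main__":
    print(" ".join(follow("phenobarbital-induced hepatic cell death", CIRRHOSIS)))
```

The influenza, Huntington's disease and HNPCC slides that follow use the same relations, so each could be encoded as another triple list and queried with the same function.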

  38. But the disorder also induces normal physiological processes (immune response) that can result in the elimination of the disorder (transient disease course). Influenza - infectious • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out influenza • suggests • Laboratory tests • produces • Test results - elevated serum antibody titers • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease flu • Etiological process - infection of airway epithelial cells with influenza virus • produces • Disorder - viable cells with influenza virus • bears • Disposition (disease) - flu • realized_in • Pathological process - acute inflammation • produces • Abnormal bodily features • recognized_as • Symptoms - weakness, dizziness • Signs - fever

  39. Huntington’s Disease - genetic • Symptoms & Signs • used_in • Interpretive process • produces • Hypothesis - rule out Huntington’s • suggests • Laboratory tests • produces • Test results - molecular detection of the HTT gene with >39 CAG repeats • used_in • Interpretive process • produces • Result - diagnosis that patient X has a disorder that bears the disease Huntington’s disease • Etiological process - inheritance of >39 CAG repeats in the HTT gene • produces • Disorder - chromosome 4 with abnormal mHTT • bears • Disposition (disease) - Huntington’s disease • realized_in • Pathological process - accumulation of mHTT protein fragments, abnormal transcription regulation, neuronal cell death in striatum • produces • Abnormal bodily features • recognized_as • Symptoms - anxiety, depression • Signs - difficulties in speaking and swallowing

  40. HNPCC - genetic predisposition • Etiological process - inheritance of a mutant mismatch repair gene • produces • Disorder - chromosome 3 with abnormal hMLH1 • bears • Disposition (disease) - Lynch syndrome • realized_in • Pathological process - abnormal repair of DNA mismatches • produces • Disorder - mutations in proto-oncogenes and tumor suppressor genes with microsatellite repeats (e.g. TGF-beta R2) • bears • Disposition (disease) - non-polyposis colon cancer • realized_in • Symptoms (including pain)

  41. Definition: Etiology • Etiological Process =def. – A process in an organism that leads to a subsequent disorder. • Example: toxic chemical exposure resulting in a mutation in the genomic DNA of a cell; infection of a human with a pathogenic virus; inheritance of two defective copies of a metabolic gene • The etiological process creates the physical basis of that disposition to pathological processes which is the disease.

  42. Definitions - Diagnosis • Clinical Picture =def. – A representation of a clinical phenotype that is inferred from the combination of laboratory, image and clinical findings about a given patient. • Diagnosis =def. – A conclusion of an interpretive process that has as input a clinical picture of a given patient and as output an assertion to the effect that the patient has a disease of such and such a type.

  43. Definitions - Qualities • Manifestation of a Disease =def. – A bodily feature of a patient that is (a) a deviation from clinical normality that exists in virtue of the realization of a disease and (b) is observable. • Observability includes observable through elicitation of response or through the use of special instruments. • Preclinical Manifestation of a Disease =def. – A manifestation of a disease that exists prior to its becoming detectable in a clinical history taking or physical examination. • Clinical Manifestation of a Disease =def. – A manifestation of a disease that is detectable in a clinical history taking or physical examination. • Phenotype =def. – A (combination of) bodily feature(s) of an organism determined by the interaction of its genetic make-up and environment. • Clinical Phenotype =def. – A clinically abnormal phenotype.

  44. [Diagram with three numbered elements: pain report • symptom (experience of pain) • tissue damage]

  45. trigeminal neuralgia: in 50% of cases drugs → no pain • trigeminal neuralgia pain gene: COMT, 16 possible polymorphisms, high pain sensitivity • end with disease-disorder-disposition-diagnosis • let’s throw a cluster analysis at this • Dominik: let’s agree on the variables (ontologically informed CDE approach) • OPPERA is being driven by cluster analysis
