Discovery Seminar 025087/UU – Spring 2008 Translational Pharmacogenomics: Discovering New Genetic Methods to Link Diagnosis and Drug Treatment Ontology: Developing a Systematic Approach to Translational Pharmacogenomic Research Data Collection April 16, 2008. Werner CEUSTERS
Discovery Seminar 025087/UU – Spring 2008. Translational Pharmacogenomics: Discovering New Genetic Methods to Link Diagnosis and Drug Treatment. Ontology: Developing a Systematic Approach to Translational Pharmacogenomic Research Data Collection. April 16, 2008. Werner CEUSTERS, Center of Excellence in Bioinformatics and Life Sciences, Ontology Research Group, University at Buffalo, NY, USA
Google ‘define: ontology’: • the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; • an explicit representation of the meaning of terms in a vocabulary, and their relationships; • a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; • specification of a conceptualisation of a knowledge domain; • a structured information model of a domain capable of supporting reasoning by human users and software agents; • a data model that represents a set of concepts within a domain and the relationships between those concepts; • …
One term, many definitions • This raises some questions: • Is it possible for a term to have so many meanings? • Can the authors of these definitions all be right at the same time? • Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?
Is it possible for a term to have so many meanings? • Merriam-Webster on ‘bank’ [Screenshot: dictionary search results for the entry term ‘bank’, showing 27 occurrence types across 3 different word types: noun, verb, part of compound term.] http://www.merriam-webster.com/dictionary
Is it possible for a term to have so many meanings? • Merriam-Webster on ‘bank’ http://www.merriam-webster.com/dictionary
Is it possible for a term to have so many meanings? • Clearly: yes ! • This phenomenon is called: Homonymy
This is called: synonymy [Figure: a single term linked to several meanings (Meaning-1 to Meaning-7), contrasted with several terms (Term-1, Term-2, Term-3) linked to a single meaning.]
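The two phenomena can be told apart mechanically once terms and meanings are kept as separate things. The following is a minimal sketch, assuming a hypothetical list of (term, meaning) pairs; the meaning labels and the helper names `homonyms` and `synonym_sets` are invented for illustration and do not come from any dictionary or terminology.

```python
from collections import defaultdict

# Hypothetical (term, meaning) pairs; the meaning labels are illustrative only.
PAIRS = [
    ("bank", "financial institution"),
    ("bank", "sloping land beside a river"),
    ("bank", "to tilt an aircraft"),
    ("ontology", "study of what exists"),
    ("ontology", "representational artifact"),
    ("liver tumor", "neoplasm of liver"),
    ("hepatic tumor", "neoplasm of liver"),
]


def build_indexes(pairs):
    """Index the pairs in both directions: term -> meanings, meaning -> terms."""
    meanings_of = defaultdict(set)
    terms_for = defaultdict(set)
    for term, meaning in pairs:
        meanings_of[term].add(meaning)
        terms_for[meaning].add(term)
    return meanings_of, terms_for


def homonyms(pairs):
    """Terms that carry more than one meaning."""
    meanings_of, _ = build_indexes(pairs)
    return {term for term, meanings in meanings_of.items() if len(meanings) > 1}


def synonym_sets(pairs):
    """Groups of different terms that share one meaning."""
    _, terms_for = build_indexes(pairs)
    return [terms for terms in terms_for.values() if len(terms) > 1]
```

With this toy data, `homonyms(PAIRS)` contains ‘bank’ and ‘ontology’, while `synonym_sets(PAIRS)` contains the single group of the two tumor terms.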
Homonymous use of the term ‘ontology’ • the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; • an explicit representation of the meaning of terms in a vocabulary, and their relationships; • a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; • specification of a conceptualisation of a knowledge domain; • a structured information model of a domain capable of supporting reasoning by human users and software agents; • a data model that represents a set of concepts within a domain and the relationships between those concepts; • …
Can the authors of these definitions all be right at the same time? • Yes, if we are dealing with a case of homonymy.
Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ? [Figure: diagram asking which ‘is a’ links hold between the ‘that’ thing being defined and a study, a representation, a vocabulary, a specification, an information model and a data model.]
Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ? Not for all ! Only for some
Homonymous use of the term ‘ontology’: at least one clear cut distinction • the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects; • an explicit representation of the meaning of terms in a vocabulary, and their relationships; • a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them; • specification of a conceptualisation of a knowledge domain; • a structured information model of a domain capable of supporting reasoning by human users and software agents; • a data model that represents a set of concepts within a domain and the relationships between those concepts; • …
‘Ontology’ as the study of what exists • Key questions: • What exists ? • How do things that exist relate to each other ? • Some hypotheses: • An external reality, time, space • Ideas, concepts • Particulars, universals, objects, processes • God • Ontologists from distinct ‘schools’ differ in opinion about the existence of some of the above: • Realism, nominalism, conceptualism, monism, …
An ontology as a representation. Key question: of what ? • Terms: WordNet, MedDRA • Concepts: the majority of ‘ontologies’. But … overwhelming lack of clarity about what ‘concepts’ are: • meaning shared in common by synonymous terms ? • idea shared in common in the minds of those who use these terms ? • unit of knowledge describing meanings ? • feature or property shared in common by entities in the world ? • Universals: Realism-based ontology
Pharmacogenomics • the branch of pharmacology which • deals with the influence of genetic variation on drug response in patients • by correlating gene expression or single-nucleotide polymorphisms with a drug's efficacy or toxicity. • By doing so, pharmacogenomics aims to develop rational means to optimise drug therapy, with respect to the patients' genotype, to ensure maximum efficacy with minimal adverse effects.
Translational research • is the movement of discoveries in basic research (the Bench) to application at the clinical level (the Bedside). • transforms scientific discoveries arising from laboratory, clinical, or population studies into clinical applications to reduce disease incidence, morbidity, and mortality.
Key challenge: understanding how disorders at the molecular level lead to disorders at the mesoscopic level
How to get there ? Current mainstream thinking [Figure: layered diagram with the levels data, information, knowledge and wisdom, placed over Reality: what is there on the side of the patient.]
Example of data generation at the bench Amit Sheth. Semantic Web Technology in Support of Bioinformatics for Glycan Expression. W3C workshop on Semantic Web for Life Sciences, October 28, 2004, Cambridge MA
Semantic annotation of Scientific Data Amit Sheth. Semantic Web Technology in Support of Bioinformatics for Glycan Expression. W3C workshop on Semantic Web for Life Sciences, October 28, 2004, Cambridge MA
Ontologies for annotating data Snomed-CT view on Serum hepatitis
Zoom on Hepatitis B with hepatitis D superinfection • Relationships to other concepts: • Causative agent Hepatitis D virus (organism) • Finding site Liver structure (body structure) • Causative agent Hepatitis B virus (organism) • Associated morphology Inflammation (morphologic abnormality) • Information about this concept: • PREFERRED_TERM Hepatitis B with hepatitis D superinfection • TERM Hepatitis B with delta agent coinfection • TERM Hepatitis B with delta agent superinfection • TERM Hepatitis B with hepatitis D superinfection • TERM Hepatitis D infection • TERM Viral hepatitis B with delta agent superinfection • TERM Viral hepatitis B with hepatitis D superinfection Comments ?
SNOMED-CT generated taxonomy (partial), nodes linked by ‘Is a’ • General finding of abdomen (finding) • Viscus structure finding (finding) • Abdominal organ finding (finding) • Disorder of abdomen (disorder) • Liver finding (finding) • Disorder of liver (disorder) • Infectious disease of abdomen (disorder) • Inflammatory disease of liver (disorder) • Viral hepatitis (disorder) • Type B viral hepatitis (disorder) • Hepatitis B with hepatitis D superinfection (disorder) Comments ?
Problem with mainstream thinking: [Figure: layered diagram with the levels data, information, knowledge and wisdom, placed over Reality: what is there on the side of the patient.] • Questions not often enough asked: • What part of our data corresponds with something out there in reality ? • What part of reality is not captured by our data, but should be because it is relevant ?
The solution: the RIGHT sort of ontology. Realism-based ontology
Realist ontology: assumes three levels of reality • The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; • Cognitive agents build up ‘in their minds’ cognitive representations of the world; • To make these representations publicly accessible in some enduring fashion, they create representational artifacts that are fixed in some medium. Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
Three levels of reality • The world exists ‘as it is’ prior to a cognitive agent’s perception thereof;
Reality (R) exists before any observation
Reality (R) exists before any observation. And also most structures in reality are there in advance. • Humans had a brain well before they knew they had one. • Trees were green before humans started to use the word “green”.
Three levels of reality • The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; • Cognitive agents build up ‘in their minds’ cognitive representations of the world;
The cognitive agent acknowledges the existence of some Portion of Reality (POR). [Figure: the agent’s cognitive representation B next to reality R.]
Some portions of reality escape his attention. [Figure: B next to R.]
Three levels of reality • The world exists ‘as it is’ prior to a cognitive agent’s perception thereof; • Cognitive agents build up ‘in their minds’ cognitive representations of the world; • To make these representations publicly accessible in some enduring fashion, they create representational artifacts that are fixed in some medium.
He represents only what he considers relevant • Both RU1B1 and RU1O1 are representational units referring to #1; • RU1O1 is NOT a representation of RU1B1; • RU1O1 is created through concretization of RU1B1 in some medium. [Figure: entity #1 in reality R, referred to by unit RU1B1 in B and by unit RU1O1 in O.]
A realism-based ontology • is a representation of some pre-existing domain of reality which • (1) reflects the properties of the objects within its domain in such a way that there obtains a systematic correlation between reality and the representation itself, • (2) is intelligible to a domain expert, • (3) is formalized in a way that allows it to support automatic information processing
Our foundations: Basic Formal Ontology • An ontology which is • Realist: there is only one reality and its constituents exist independently of our (linguistic, conceptual, theoretical, cultural) representations thereof; • Fallibilist: theories and classifications can be subject to revision; • Perspectivalist: there exists a plurality of alternative, equally legitimate perspectives on that one reality; • Adequatist: these alternative views are not reducible to any single basic view.
Basic Formal Ontology • The world consists of • entities that are • Either particulars or universals; • Either occurrents or continuants; • Either dependent or independent; and, • relationships between these entities of the form • <particular , universal> e.g. is-instance-of, • <particular , particular> e.g. is-member-of • <universal , universal> e.g. isa (is-subtype-of) Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, November 8, 2006, Baltimore MD, USA
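The three relationship forms listed above each fix what kind of entity may stand on either side. A minimal sketch of that constraint, assuming hypothetical Python classes `Particular` and `Universal` and a helper `assert_relation` (none of which belong to any BFO software), could look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Particular:
    name: str


@dataclass(frozen=True)
class Universal:
    name: str


# Signature of each relation named on the slide: (left argument kind, right argument kind).
SIGNATURES = {
    "is-instance-of": (Particular, Universal),
    "is-member-of": (Particular, Particular),
    "isa": (Universal, Universal),
}


def assert_relation(subject, relation, obj):
    """Return the triple if both arguments are of the kind the relation expects."""
    left, right = SIGNATURES[relation]
    if not (isinstance(subject, left) and isinstance(obj, right)):
        raise TypeError(
            f"{relation} expects <{left.__name__}, {right.__name__}>, "
            f"got <{type(subject).__name__}, {type(obj).__name__}>"
        )
    return (subject, relation, obj)


# Illustrative entities, chosen for this sketch only.
john = Particular("John Doe")
person = Universal("person")
organism = Universal("organism")

ok_instance = assert_relation(john, "is-instance-of", person)
ok_subtype = assert_relation(person, "isa", organism)
```

Calling `assert_relation(john, "isa", person)` raises a `TypeError`, because `isa` holds between universals only.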
Accept that everything may change: • changes in the underlying reality: • Particulars and universals come and go • changes in our (scientific) understanding: • The planet Vulcan does not exist • reassessments of what is considered to be relevant for inclusion (notion of purpose). • encoding mistakes introduced during data entry or ontology development.
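Example: continuants preserve identity while changing [Figure: along a time axis t, ‘me’ is an instance of child in 1960 and an instance of adult since 1980, while remaining an instance of human being; likewise an animal is first an instance of caterpillar and later of butterfly, while remaining a living creature.]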
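One way to keep identity and change apart in data is to index each instance-of assertion by time while the particular keeps one name throughout. The sketch below assumes year-granularity intervals and assumes that ‘child’ holds from 1960 until 1980; the slide gives only "in 1960" and "since 1980", so those bounds, like the class and function names, are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceOf:
    particular: str
    universal: str
    since: int                   # first year in which the assertion holds
    until: Optional[int] = None  # first year in which it no longer holds; None = still holds

    def holds_at(self, year: int) -> bool:
        return self.since <= year and (self.until is None or year < self.until)


# Assumed intervals for the example on the slide.
FACTS = [
    InstanceOf("me", "human being", 1960),
    InstanceOf("me", "child", 1960, 1980),
    InstanceOf("me", "adult", 1980),
]


def universals_at(particular, year, facts=FACTS):
    """Universals that the given particular instantiates in the given year."""
    return {
        f.universal
        for f in facts
        if f.particular == particular and f.holds_at(year)
    }
```

The particular ‘me’ is the same entry in 1970 and in 1990; only the set returned by `universals_at` differs.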
Are we done ? • Is an accurate coding system, classification system, terminology, ontology, …, a necessary and sufficient condition for obtaining “better” information ? Necessary: yes ! Sufficient: no !
Using codes does not prevent ambiguities as to what is described: how many disorders are listed?
PtID | Date | ObsCode | Narrative
5572 | 04/07/1990 | 79001 | Essential hypertension
0939 | 24/12/1991 | 255174002 | benign polyp of biliary tract
2309 | 21/03/1992 | 26442006 | closed fracture of shaft of femur
0939 | 20/12/1998 | 255087006 | malignant polyp of biliary tract
[Eleven further rows whose row alignment is lost; values by column. PtID: 5572, 5572, 5572, 5572, 2309, 5572, 298, 5572, 298, 5572, 47804. Date: 03/04/1993, 12/07/1990, 01/04/1997, 04/07/1990, 22/08/1993, 01/04/1997, 21/03/1992, 12/07/1990, 04/07/1990, 22/08/1993, 17/05/1993. ObsCode: 26442006, 2909872, 9001224, 26442006, 9001224, 79001, 9001224, 81134009, 58298795, 79001, 26442006. Narrative: Essential hypertension (2x), Accident in public building (supermarket) (3x), closed fracture of shaft of femur (3x), Closed fracture of radial head, Other lesion on other specified region, Fracture, closed, spiral.]
• The same type of location code used in relation to three different events might or might not refer to the same location.
• Three references of hypertension for the same patient denote three times the same disease.
• If the same fracture code is used for the same patient on different dates, then these codes might or might not refer to the same fracture.
• If two different fracture codes are used in relation to observations made on the same day for the same patient, they might refer to the same fracture.
• If two different tumor codes are used in relation to observations made on different dates for the same patient, they may still refer to the same tumor.
• The same fracture code used in relation to two different patients cannot refer to the same fracture.
Consequences • Very difficult to: • Count the number of (numerically) different diseases • Bad statistics on incidence, prevalence, ... • Bad basis for health cost containment • Relate (numerically the same or different) causal factors to disorders: • Dangerous public places (specific work floors, swimming pools), • dogs with rabies, • HIV contaminated blood from donors, • food from unhygienic source, ... • Hampers prevention • ...
Now! That should clear up a few things around here ! • Purpose: • explicit reference to the concrete individual entities relevant to the accurate description of each patient’s condition, therapies, outcomes, ... Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
Referent Tracking: Numbers instead of words! • Method: • Introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity [Figure: example identifiers 78, 235, 5678, 321, 322, 666, 427.] Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
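To see what an IUI adds over a code, the sketch below reuses four rows of the observation table shown earlier and attaches to each a hypothetical IUI for the disorder it is about. The `IUIRegistry` class, the `IUI-n` format and, above all, the decision that both polyp records of patient 0939 concern the same polyp are assumptions made for illustration; they are not taken from the cited paper.

```python
from itertools import count


class IUIRegistry:
    """Hands out a fresh Instance Unique Identifier (IUI) for each particular."""

    def __init__(self):
        self._counter = count(1)
        self.particulars = {}  # IUI -> free-text note about the particular

    def assign(self, note):
        iui = f"IUI-{next(self._counter)}"
        self.particulars[iui] = note
        return iui


registry = IUIRegistry()
hypertension = registry.assign("patient 5572's hypertension")
polyp = registry.assign("patient 0939's biliary tract polyp")
fracture = registry.assign("patient 2309's femur fracture")

# (PtID, date, ObsCode, IUI of the disorder the observation is about)
OBSERVATIONS = [
    ("5572", "04/07/1990", "79001", hypertension),
    ("0939", "24/12/1991", "255174002", polyp),
    ("2309", "21/03/1992", "26442006", fracture),
    ("0939", "20/12/1998", "255087006", polyp),  # assumed: the same polyp, now malignant
]


def count_by_code(observations):
    """Number of distinct codes: says nothing certain about how many disorders exist."""
    return len({code for _, _, code, _ in observations})


def count_by_iui(observations):
    """Number of distinct particulars actually referred to."""
    return len({iui for _, _, _, iui in observations})
```

Under these assumptions the four codes describe three disorders: `count_by_code(OBSERVATIONS)` is 4, `count_by_iui(OBSERVATIONS)` is 3.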
The principle of Referent Tracking • ‘John Doe’s liver tumor was treated with RPCI’s irradiation device’ • ‘John Smith’s liver tumor was treated with RPCI’s irradiation device’ [Figure: each particular mentioned in the two sentences gets its own identifier: person #1 and #10, liver #2 and #20, tumor #3 and #30, treating #4 and #40, while the clinic (#5) and the device (#6) are the same in both; each particular is linked to its universal by instance-of at t1 and at t2.]
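The two sentences differ in only one word, yet they involve ten distinct particulars, two of which are shared. A minimal sketch using the identifiers shown on the slide; the role names used as dictionary keys and the two helper functions are labels chosen for this sketch only.

```python
# IUIs as shown on the slide; the role names are illustrative labels.
JOHN_DOE_STATEMENT = {
    "person": "#1", "liver": "#2", "tumor": "#3",
    "treating": "#4", "clinic": "#5", "device": "#6",
}
JOHN_SMITH_STATEMENT = {
    "person": "#10", "liver": "#20", "tumor": "#30",
    "treating": "#40", "clinic": "#5", "device": "#6",
}


def shared_particulars(a, b):
    """Roles filled by numerically the same particular in both statements."""
    return {role: iui for role, iui in a.items() if b.get(role) == iui}


def distinct_particulars(*statements):
    """All particulars referred to across the given statements."""
    return {iui for statement in statements for iui in statement.values()}


shared = shared_particulars(JOHN_DOE_STATEMENT, JOHN_SMITH_STATEMENT)
everything = distinct_particulars(JOHN_DOE_STATEMENT, JOHN_SMITH_STATEMENT)
```

Here `shared` holds the clinic and the device, and `everything` has ten members; neither fact can be read off the words ‘liver tumor’ alone.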