ICBO Tutorial Introduction to Referent Tracking July 22, 2009 112 Norton Hall, UB North Campus. Werner CEUSTERS Center of Excellence in Bioinformatics and Life Sciences Ontology Research Group University at Buffalo, NY, USA. (corrected version: August 10, 2009). 1959 - 2009.
Short personal history, 1959 - 2009 [Figure: timeline marked 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006 and a ‘?’]
House keeping rules
• Feel free to ask for clarification at any time if you don’t understand something I just said (but not more than three slides earlier);
• Please do not interrupt me if you ‘just’ disagree with something I say until:
  • near the beginning of the break,
  • near the end of the tutorial;
• Everybody in the audience may sleep except those students who are here for credit:
  • I’ll test them,
  • redundancy in my slides thus serves a purpose: to help them !
Tutorial overview
• Setting the scene: a rough description of what Referent Tracking is and why it is important
• Review of the basics of BFO relevant to RT
• The crucial distinction between representations and what they represent
• Implementation of RT systems
• Examples of use
When did Weiss kill Senator Long ? [Figure: processes laid out along a time axis: Carl Weiss’ living; Weiss’ shooting of Long; bodyguards’ shooting of Weiss; Weiss’s pathological body reactions; Long’s pathological body reactions; Senator Long’s living]
What is Referent Tracking ?
• A paradigm under development since 2005,
• based on Basic Formal Ontology,
• designed to keep track of relevant portions of reality and what is believed and communicated about them,
• enabling adequate use of realism-based ontologies, terminologies, thesauri, and vocabularies,
• originally conceived to track particulars on the side of the patient and his environment denoted in his EHR,
• but since then studied in and applied to a variety of domains,
• and now evolving towards tracking absolutely everything, not only particulars, but also universals.
‘The spectrum of the Health Sciences’: turning data into knowledge ? (http://www.uvm.edu/~ccts)
Source of all data: reality !
Ultimate goal of Referent Tracking: a digital copy of the world
Requirements for this digital copy
• R1: a faithful representation of reality
• R2: … of everything that is digitally registered:
  • what is generic: scientific theories
  • what is specific: what individual entities exist and how they relate
• R3: … throughout reality’s entire history,
• R4: … which is computable in order to …
  • … allow queries over the world’s past and present,
  • … make predictions,
  • … fill in gaps,
  • … identify mistakes,
  • ...
How to do it right ? [Figure: the ‘binding’ wall] I don’t want a cartoon of the world
Distinction between Ontologies and Information Models
• Ontologies should represent only what is always true about the entities of a domain (whether or not it is known to the person who reports),
• Information models (or data structures) should represent only the artifacts in which information is recorded.
• Such information may be incomplete and error-laden, which needs to be accounted for in the information model rather than in the ontology itself.
Perfect ‘semantic’ tools are useless …
• … if data captured at the source is not of high quality
• Prevailing EHR systems don’t allow data to be stored at an acceptable quality level:
  • No formal distinction between disorders and diagnoses
  • Messy nature of the notions of ‘problem’ and ‘concern’
  • No unique identification of the entities about which data is stored
  • Unique IDs for data-elements cannot serve as unique IDs for the entities denoted by these data-elements
Terminologies for ‘unambiguous representation’ ???
PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract
Terminologies for ‘unambiguous representation’ ??? Questions raised by the same table:
• If two different fracture codes are used in relation to observations made on the same day for the same patient, do they denote the same fracture ?
• If the same fracture code is used for the same patient on different dates, can these codes denote the same fracture ?
• Can the same fracture code used in relation to two different patients denote the same fracture ?
• Can two different tumor codes used in relation to observations made on different dates for the same patient denote the same tumor ?
• Do three references to ‘hypertension’ for the same patient denote the same disease three times ?
• Can the same type of location code used in relation to three different events denote the same location ?
How will we ever know ?
The problem in a nutshell
• Generic terms used to denote specific entities do not have enough referential capacity:
  • usually enough to convey that some specific entity is denoted,
  • not enough to be clear about which one in particular.
• For many ‘important’ entities, unique identifiers are used:
  • UPS parcels
  • Patients in hospitals
  • VINs on cars
  • …
Fundamental goals of ‘our’ Referent Tracking
• explicit reference to the concrete individual entities relevant to the accurate description of some portion of reality, ...
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
Method: numbers instead of words
• Introduce an Instance Unique Identifier (IUI) for each relevant particular (individual) entity [Figure: example identifiers 78, 235, 5678, 321, 322, 666, 427]
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
Fundamental goals of ‘our’ Referent Tracking
• Use these identifiers in expressions using a language that acknowledges the structure of reality, e.g. a yellow ball:
  • #1: the ball
  • #2: #1’s yellow
  • Then not: ball(#1) and yellow(#2) and hascolor(#1, #2)
  • But: instance-of(#1, ball, since t), instance-of(#2, yellow, since t), inheres-in(#1, #2, since t)
• Strong foundations in realism-based ontology
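The yellow-ball expressions can be written down as plain data. What follows is a minimal sketch in Python, not part of the tutorial: the `Assertion` class, its field names and the two helper functions are hypothetical, and the argument order of `inheres-in` is copied from the slide as is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One time-indexed statement about particulars (hypothetical structure)."""
    relation: str  # e.g. "instance-of", "inheres-in"
    first: str     # first argument, as written on the slide
    second: str    # second argument: a universal name or another identifier
    since: str     # start of the period over which the relation holds


# "a yellow ball": #1 is the ball, #2 is #1's yellow
assertions = [
    Assertion("instance-of", "#1", "ball", "t"),
    Assertion("instance-of", "#2", "yellow", "t"),
    Assertion("inheres-in", "#1", "#2", "t"),  # argument order as on the slide
]


def particulars(items):
    """Collect every identifier starting with '#', i.e. every tracked particular."""
    found = set()
    for a in items:
        for term in (a.first, a.second):
            if term.startswith("#"):
                found.add(term)
    return sorted(found)


def universals_of(identifier, items):
    """Return the universals a given particular is asserted to instantiate."""
    return [a.second for a in items
            if a.relation == "instance-of" and a.first == identifier]
```

The point of the sketch is that the colour is itself a particular with its own identifier: `particulars(assertions)` yields two identifiers, whereas the rejected form `ball(#1) and yellow(#2) and hascolor(#1, #2)` treats `ball` and `yellow` as predicates and carries no time index.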
Codes for ‘types’ AND identifiers for instances
PtID   Date        ObsCode    IUI      Narrative
5572   04/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   04/07/1990  81134009   IUI-001  Fracture, closed, spiral
5572   12/07/1990  26442006   IUI-001  closed fracture of shaft of femur
5572   12/07/1990  9001224    IUI-007  Accident in public building (supermarket)
5572   04/07/1990  79001      IUI-005  Essential hypertension
0939   24/12/1991  255174002  IUI-004  benign polyp of biliary tract
2309   21/03/1992  26442006   IUI-002  closed fracture of shaft of femur
2309   21/03/1992  9001224    IUI-007  Accident in public building (supermarket)
47804  03/04/1993  58298795   IUI-006  Other lesion on other specified region
5572   17/05/1993  79001      IUI-005  Essential hypertension
298    22/08/1993  2909872    IUI-003  Closed fracture of radial head
298    22/08/1993  9001224    IUI-007  Accident in public building (supermarket)
5572   01/04/1997  26442006   IUI-012  closed fracture of shaft of femur
5572   01/04/1997  79001      IUI-005  Essential hypertension
0939   20/12/1998  255087006  IUI-004  malignant polyp of biliary tract
7 distinct disorders
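A minimal Python sketch of what the added IUI column buys. It assumes one particular alignment of patients, dates, codes and IUIs for the fifteen observations on this slide; treating code 9001224 as the one non-disorder entry is also an assumption of the sketch, and all variable names are hypothetical.

```python
from collections import defaultdict

# (PtID, date, ObsCode, IUI): one assumed reading of the slide's table
rows = [
    ("5572", "04/07/1990", "26442006", "IUI-001"),
    ("5572", "04/07/1990", "81134009", "IUI-001"),
    ("5572", "12/07/1990", "26442006", "IUI-001"),
    ("5572", "12/07/1990", "9001224", "IUI-007"),
    ("5572", "04/07/1990", "79001", "IUI-005"),
    ("0939", "24/12/1991", "255174002", "IUI-004"),
    ("2309", "21/03/1992", "26442006", "IUI-002"),
    ("2309", "21/03/1992", "9001224", "IUI-007"),
    ("47804", "03/04/1993", "58298795", "IUI-006"),
    ("5572", "17/05/1993", "79001", "IUI-005"),
    ("298", "22/08/1993", "2909872", "IUI-003"),
    ("298", "22/08/1993", "9001224", "IUI-007"),
    ("5572", "01/04/1997", "26442006", "IUI-012"),
    ("5572", "01/04/1997", "79001", "IUI-005"),
    ("0939", "20/12/1998", "255087006", "IUI-004"),
]

ACCIDENT_CODE = "9001224"  # assumed to be the only code that is not a disorder

codes_per_instance = defaultdict(set)  # which type codes were used for one entity
instances_per_code = defaultdict(set)  # which entities one type code was used for
for _ptid, _date, code, iui in rows:
    codes_per_instance[iui].add(code)
    instances_per_code[code].add(iui)

# Distinct disorders are counted by identifier, not by code or by row.
disorders = {iui for _p, _d, code, iui in rows if code != ACCIDENT_CODE}
```

With the identifiers in place the earlier questions become lookups: `instances_per_code["26442006"]` shows the femur-fracture code being used for three different fractures, and `codes_per_instance["IUI-004"]` shows one polyp being described with both a benign and a malignant code.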
‘Principles for Success’
• Evolutionary change
• Radical change:
  • Principle 6: Architect Information and Workflow Systems to Accommodate Disruptive Change
    • Organizations should architect health care IT for flexibility to support disruptive change rather than to optimize today’s ideas about health care.
  • Principle 7: Archive Data for Subsequent Re-interpretation
    • Vendors of health care IT should provide the capability of recording any data collected in their measured, uninterpreted, original form, archiving them as long as possible to enable subsequent retrospective views and analyses of those data. [NOTE]
William W. Stead and Herbert S. Lin, editors; Committee on Engaging the Computer Science Research Community in Health Care Informatics; National Research Council. Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions (2009)
‘Principles for Success’ (continued)
• The NOTE: ‘See, for example, Werner Ceusters and Barry Smith, “Strategies for Referent Tracking in Electronic Health Records” Journal of Biomedical Informatics 39(3):362-378, June 2006.’
William W. Stead and Herbert S. Lin, editors; Committee on Engaging the Computer Science Research Community in Health Care Informatics; National Research Council. Computational Technology for Effective Health Care: Immediate Steps and Strategic Directions (2009)
Words, words, words, …
• A paradigm under development since 2005,
• based on Basic Formal Ontology,
• designed to keep track of relevant portions of reality and what is believed and communicated about them,
• enabling adequate use of realism-based ontologies, terminologies, thesauri, and vocabularies,
• originally conceived to track particulars on the side of the patient and his environment denoted in his EHR,
• but since then studied in and applied to a variety of domains,
• and now evolving towards tracking absolutely everything, not only particulars, but also universals.
Therefore: Part 1: the Basics. No (good) Referent Tracking without (good) Realism-based Ontology
Basic axioms
• There is an external reality which is ‘objectively’ the way it is;
• That reality is accessible to us;
• We build in our brains cognitive representations of reality;
• We communicate with others about what is there, and about what we believe is there.
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA
What is there ? The parts of BFO relevant for Referent Tracking (1) [Figure: some particular, linked by instanceOf … to some universal]
The shift envisioned
• From:
  • ‘this man is a 40 year old patient with a stomach tumor’
• To (something like):
  • ‘this-1 on which depend this-2 and this-3 has this-4’, where
    • this-1 instanceOf human being …
    • this-2 instanceOf age-of-40-years …
    • this-2 qualityOf this-1 …
    • this-3 instanceOf patient-role …
    • this-3 roleOf this-1 …
    • this-4 instanceOf tumor …
    • this-4 partOf this-5 …
    • this-5 instanceOf stomach …
    • this-5 partOf this-1 …
    • …
• Reading the statements (callouts added as the slide builds):
  • ‘this-1’ to ‘this-5’ in first position: denotators for particulars
  • ‘instanceOf’, ‘qualityOf’, ‘roleOf’, ‘partOf’: denotators for appropriate relations
  • the terms in last position: denotators for universals or particulars
  • the trailing ‘…’: something I’ll come to later
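The decomposition above can be stored as triples and queried. This is a minimal Python sketch rather than an RT system: the trailing ‘…’ of each statement is left out, the helper names are hypothetical, and treating partOf as transitive is an assumption of the sketch.

```python
# (particular, relation, universal-or-particular), copied from the slide
statements = [
    ("this-1", "instanceOf", "human being"),
    ("this-2", "instanceOf", "age-of-40-years"),
    ("this-2", "qualityOf", "this-1"),
    ("this-3", "instanceOf", "patient-role"),
    ("this-3", "roleOf", "this-1"),
    ("this-4", "instanceOf", "tumor"),
    ("this-4", "partOf", "this-5"),
    ("this-5", "instanceOf", "stomach"),
    ("this-5", "partOf", "this-1"),
]


def universals_of(particular):
    """Universals the particular is stated to instantiate."""
    return [o for s, r, o in statements if s == particular and r == "instanceOf"]


def dependents_of(bearer):
    """Qualities and roles that depend on the given particular."""
    return sorted(s for s, r, o in statements
                  if r in ("qualityOf", "roleOf") and o == bearer)


def wholes_of(part):
    """Follow partOf links upward, treating partOf as transitive."""
    found = []
    frontier = [part]
    while frontier:
        current = frontier.pop()
        for s, r, o in statements:
            if s == current and r == "partOf" and o not in found:
                found.append(o)
                frontier.append(o)
    return found
```

`dependents_of("this-1")` recovers the ‘on which depend this-2 and this-3’ part of the sentence, and `wholes_of("this-4")` recovers that the tumor is part of the stomach and, through it, of the person.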
Relevance: the way RT-compatible systems ought to interact with representations of generic portions of reality [Figure: diagram with the labels ‘instance-of at t’, ‘caused by’ and ‘#105’]
What is there ? The parts of BFO relevant for Referent Tracking (1) [Figure: some particular, linked by instanceOf … to some universal, with a boundary between the two]
• entities on either side cannot ‘cross’ this boundary
• for every universal there is or has been at least one instance
• every particular is an instance of at least one universal
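The two instantiation constraints on this slide can be checked mechanically over a set of instanceOf assertions. A minimal Python sketch follows; the function name, the sample identifiers and the purported universal ‘unicorn’ are hypothetical, and time (‘is or has been’) is ignored for brevity.

```python
def check_instantiation(universals, particulars, instance_of):
    """Report violations of the two constraints.

    instance_of is an iterable of (particular, universal) pairs.
    """
    instantiated = {u for _p, u in instance_of}
    typed = {p for p, _u in instance_of}
    return {
        # every universal needs at least one instance
        "universals_without_instance": sorted(set(universals) - instantiated),
        # every particular needs at least one universal
        "particulars_without_universal": sorted(set(particulars) - typed),
    }


report = check_instantiation(
    universals={"human being", "stomach", "unicorn"},
    particulars={"this-1", "this-5", "this-9"},
    instance_of={("this-1", "human being"), ("this-5", "stomach")},
)
```

Here `report` flags ‘unicorn’ as a purported universal with no instance and ‘this-9’ as a particular not yet assigned to any universal.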
My terminology (1)
• ‘entity’:
  • denotes either a universal or a particular
• ‘instance’:
  • denotes a particular to which I refer in the context of some universal:
  • If A instanceOf B … then
    • ‘B is a universal’
    • ‘A is a particular’
    • ‘A is an instance’
    • these ‘is a’ do not denote isa !!!
My terminology (2)
• ‘entity’ and ‘instance’: as defined in My terminology (1)
• ‘denotes’: (roughly for now) a relation between an entity and a representational construct (sign, symbol, term, …) such that the latter stands for the former in descriptions about reality.
What is there ? The parts of BFO relevant for Referent Tracking (1) [Figure: some particular, linked by instanceOf … to some universal, with an added ‘?’]
What is there ? The parts of BFO relevant for Referent Tracking (2) [Figure: some continuant particular, linked by ‘instanceOf at t’ to some continuant universal; some occurrent particular, linked by ‘instanceOf’ to some occurrent universal]
The importance of temporal indexing [Figure: particulars this-4 and this-1’s stomach; universals benign tumor, malignant tumor and stomach; edges labelled ‘instanceOf at t1’ and ‘instanceOf at t2’ (twice each), ‘partOf at t1’ and ‘partOf at t2’]
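A minimal Python sketch of time-indexed statements for this slide. It assumes that this-4 is a benign tumor at t1 and a malignant tumor at t2 (the slide’s labels do not fix which edge carries which time), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedStatement:
    subject: str
    relation: str
    target: str
    time: int  # index of the time at which the relation holds


T1, T2 = 1, 2

statements = [
    # assumed assignment: benign at t1, malignant at t2
    TimedStatement("this-4", "instanceOf", "benign tumor", T1),
    TimedStatement("this-4", "instanceOf", "malignant tumor", T2),
    TimedStatement("this-1's stomach", "instanceOf", "stomach", T1),
    TimedStatement("this-1's stomach", "instanceOf", "stomach", T2),
    TimedStatement("this-4", "partOf", "this-1's stomach", T1),
    TimedStatement("this-4", "partOf", "this-1's stomach", T2),
]


def holds_at(subject, relation, time):
    """Targets related to the subject by the given relation at the given time."""
    return sorted(s.target for s in statements
                  if s.subject == subject
                  and s.relation == relation
                  and s.time == time)
```

The same particular, `this-4`, keeps its identity across both times while the universal it instantiates changes; without the time index the two instanceOf statements would simply contradict each other.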
Things do change indeed [Figure: along a time axis t, the labels child, adult, vampire, person, living creature, animal, caterpillar, butterfly]
The continuants relevant for Referent Tracking
• spatial region
• independent continuant
  • material object
  • site
• dependent continuant
  • specifically dependent continuant
  • generically dependent continuant
    • information content entity
      • …
      • terminology
      • ontology