Extending Temporal Databases to Deal with Telic/Atelic Medical Data
Paolo Terenziani (1), Richard T. Snodgrass (2), Alessio Bottrighi (1), Mauro Torchio (3), Gianpaolo Molino (3)
(1) DI, Univ. Piemonte Orientale “A. Avogadro”, Alessandria, Italy
(2) Department of Computer Science, University of Arizona, Tucson, AZ, USA
(3) Lab. Informatica Clinica, Az. Ospedaliera S. G. Battista, Torino, Italy

- The problem: an example
- The problem: a general perspective
- Need for a general (not ad-hoc) solution
- The solution (sketch) (see AIME and IEEE TKDE papers)
- Conclusions
Introduction
- Temporal information plays a basic role in medical data
- Need for suitable data models and query languages
- Lack of specific support makes managing medical temporal data quite complex
- Many approaches (mostly extensions/modifications of the relational model)
- All approaches share the same limitation: the underlying semantics is point-based, so telic (medical) data cannot be properly dealt with
The problem: an example
John had two i.v. infusions of drug Y, one starting at 10:00 and ending at 10:50, and the other from 10:51 to 11:30 (all extremes included). John had an i.v. of drug Z from 17:05 to 17:34. Mary had two i.v. infusions of Z, one from 10:40 to 10:55 and the other from 10:56 to 11:34. Ann had an i.v. from 10:53 to 11:32.
Point-based semantics
10:00 <John, Y>
10:01 <John, Y>
……
The problem: an example
NOTICE (1): All approaches in the literature adopt point-based semantics, even if they adopt different representations (e.g., TSQL2).
Point-based semantics
10:00 <John, Y>
10:01 <John, Y>
……
10:50 <John, Y>, <Mary, Z>
10:51 <John, Y>, <Mary, Z>
The problem: an example
Point-based semantics
10:00 <John, Y>
10:01 <John, Y>
……
10:50 <John, Y>, <Mary, Z>
10:51 <John, Y>, <Mary, Z>
Answers must be based on the semantics of the data (and not on the representation!)
Some pieces of information cannot be captured in any approach based on point-based semantics!
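The point-based reading of the example can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the table name `PHLEBO` is taken from a later slide, while the tuple layout, the helper `hm`, and the minute granularity are assumptions made here. Ann's drug is not stated in the example, so it is left as `None`.

```python
from collections import defaultdict

def hm(text):
    """Convert 'HH:MM' to minutes since midnight (assumed granularity)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

# (patient, drug, start, end), closed intervals as in the example
PHLEBO = [
    ("John", "Y", "10:00", "10:50"),
    ("John", "Y", "10:51", "11:30"),
    ("John", "Z", "17:05", "17:34"),
    ("Mary", "Z", "10:40", "10:55"),
    ("Mary", "Z", "10:56", "11:34"),
    ("Ann", None, "10:53", "11:32"),  # drug not given in the source
]

def point_based(rows):
    """Map every time point to the set of tuples holding at that point."""
    snapshots = defaultdict(set)
    for patient, drug, start, end in rows:
        for t in range(hm(start), hm(end) + 1):
            snapshots[t].add((patient, drug))
    return snapshots

snapshots = point_based(PHLEBO)
for label in ("10:00", "10:50", "10:51"):
    print(label, sorted(snapshots[hm(label)]))
```

The snapshots at 10:50 and 10:51 are identical, so nothing in this representation records that one of John's infusions ended and another began between them.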
The problem: an example
UPWARD INHERITANCE
(Q1) Who had one i.v. of Y lasting more than 1 hour?
Answer: {<John | {10:00, …, 11:30}>}
COUNTABILITY
(Q3) How many i.v. did John have?
Answer: 2
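A rough Python sketch of how the two readings diverge on questions like Q1 and Q3, restricted to John's Y infusions. It assumes that a point-based reading can recover only maximal periods (modelled here by the hypothetical helper `merge_adjacent`) and that durations are counted in minutes over closed intervals; neither assumption comes from the slides.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight (assumed granularity)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

# John's two Y infusions as closed intervals
john_y = [(hm("10:00"), hm("10:50")), (hm("10:51"), hm("11:30"))]

def duration(interval):
    """Length in minutes of a closed interval."""
    start, end = interval
    return end - start + 1

def merge_adjacent(intervals):
    """Assumed point-based view: only maximal periods survive."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Interval-based reading: each infusion is kept as a unit
over_one_hour_interval = [iv for iv in john_y if duration(iv) > 60]
count_interval = len(john_y)

# Point-based reading: the two infusions collapse into one period
over_one_hour_point = [iv for iv in merge_adjacent(john_y) if duration(iv) > 60]
count_point = len(merge_adjacent(john_y))

print(over_one_hour_interval, count_interval)  # [] 2
print(over_one_hour_point, count_point)        # [(600, 690)] 1
```

Under these assumptions the point-based reading reports a single infusion of 91 minutes, while the interval-based reading reports two infusions, neither longer than one hour.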
The problem: a general perspective
Not only a problem in the case of “consecutive” time intervals!
PROJECTION, e.g., Select Drug, VT from PHLEBO
GRANULARITY CHANGES, e.g., Scale up to hours
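The granularity case can be illustrated with a small Python sketch. It assumes that scaling up to hours maps each minute to the hour containing it; the functions `scale_interval_to_hours` and `scale_points_to_hours` are hypothetical names introduced here, not operators from the papers.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight (assumed granularity)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

john_y = [(hm("10:00"), hm("10:50")), (hm("10:51"), hm("11:30"))]

def scale_interval_to_hours(interval):
    """Scale one closed interval, keeping it as a single unit."""
    start, end = interval
    return (start // 60, end // 60)

def scale_points_to_hours(intervals):
    """Scale the point-based view: only the set of covered hours remains."""
    return sorted({t // 60 for start, end in intervals
                   for t in range(start, end + 1)})

print([scale_interval_to_hours(iv) for iv in john_y])  # [(10, 10), (10, 11)]
print(scale_points_to_hours(john_y))                   # [10, 11]
```

Scaling interval by interval still yields two occurrences, whereas scaling the point set leaves only the hours 10 and 11, with no trace of how many infusions took place.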
The problem: a general perspective
Telic vs atelic facts
Aristotle:
- Telic facts (countable, no upward inheritance, …), e.g., i.v. infusion
- Atelic facts (not countable, upward inheritance, …), e.g., “patient X having a temperature > 38”
Linguistics (e.g., [Bennett & Partee, 78])
Cognitive Science (e.g., [Bloom et al., 80])
… and, more recently, in AI
NOTICE: Classical TDB approaches (and point-based semantics) perfectly cope with atelic facts!
Need for a general (not ad-hoc) solution
e.g., “Why not just add an additional (surrogate) attribute to keep all occurrences of telic tables separate?”
Analogous to the question: “Why not just add an additional attribute to relational tables to deal with validity times?”
… about 15 years of TDB research responding to the second question! Basically:
- time has a special status, and deeply impacts the semantics of the other attributes
- thus, it needs a specialised treatment (e.g., definition of the temporal algebraic operators)
- ad-hoc solutions are difficult, likely to be erroneous, not economical, not compatible
Need for a general (not ad-hoc) solution
A further problem: aktionsart coercions
E.g., progressive forms coerce telic statements into atelic ones:
“John had an i.v. infusion starting at 10:00 and ending at 10:50”
“John was having an i.v. infusion at 10:30”
What is the impact on TDBs?
Ex.1 “Who was having an i.v. while John was having an i.v.?”
Ex.2 “Who had a (complete) i.v. while having an i.v.?”
In short: general problems need to be solved once and for all in a general (not ad-hoc) way
Our solution: Point-based + Interval-based Semantics
Interval-based semantics for telic facts (e.g., i.v. infusion)
[10:00 – 10:50] <John, Y>
[10:40 – 10:55] <Mary, Z>
[10:51 – 11:30] <John, Y>
……
Point-based semantics for atelic facts (e.g., temperature > 38)
10:00 <John>
10:01 <John>
10:02 <John>, <Mary>
……
Coercion functions to switch from interval to point semantics, and vice versa
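One possible shape for the two coercion functions, sketched in Python. The names `to_atelic` and `to_telic` are hypothetical, and the choice of one interval per maximal run of consecutive points in `to_telic` is an assumption made for this illustration; the formal definitions are in the TKDE paper and may differ.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight (assumed granularity)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

def to_atelic(intervals):
    """Interval-to-point coercion: keep only the set of covered minutes."""
    return frozenset(t for start, end in intervals
                     for t in range(start, end + 1))

def to_telic(points):
    """Point-to-interval coercion (assumed): one interval per maximal run."""
    intervals = []
    for t in sorted(points):
        if intervals and t == intervals[-1][1] + 1:
            intervals[-1] = (intervals[-1][0], t)
        else:
            intervals.append((t, t))
    return intervals

# Mary's two Z infusions from the example
mary_z = [(hm("10:40"), hm("10:55")), (hm("10:56"), hm("11:34"))]
round_trip = to_telic(to_atelic(mary_z))
print(round_trip)  # [(640, 694)]
```

The round trip is not the identity: going through the point view merges Mary's two infusions into one period from 10:40 to 11:34, which is why the coercions have to be applied explicitly and deliberately.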
Our solution: Data model: telic + atelic tables
- Telic tables for telic facts (e.g., i.v. infusion)
- Atelic tables for atelic facts (e.g., temperature > 38)
- Also non-temporal (snapshot) tables
Our solution: Three-sorted algebra
- Telic algebraic operators
- Atelic algebraic operators
- Non-temporal (snapshot) operators
- Coercion functions
[Terenziani & Snodgrass, 04] IEEE Transactions on Knowledge and Data Engineering, 16(4), 540-551, April 2004.
Our solution: Query language
Extension to TSQL2: AIME’05 paper
Ex. Who had one (complete) i.v. while John was having a Y i.v.?
TELIC SELECT P2.P_CODE
FROM T_PHLEBO (ATELIC PERIOD) AS P, T_PHLEBO (PERIOD) AS P2
WHERE P.P_CODE='John' AND P.Drug='Y' AND P CONTAINS P2
Limited extensions to the language are sufficient (“substantial” extensions in the semantics)
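To see why the ATELIC coercion matters in the example query, here is a hedged Python sketch that evaluates it over the infusions from the running example. It is not the TSQL2 extension itself: it assumes that `P CONTAINS P2` over an atelic period amounts to containment in a maximal period of John's Y infusions, and the helpers `maximal_periods` and `contains` are hypothetical. Ann's drug is not given in the source and is left as `None`.

```python
def hm(text):
    """Convert 'HH:MM' to minutes since midnight (assumed granularity)."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)

# (P_CODE, Drug, start, end), closed intervals
T_PHLEBO = [
    ("John", "Y", hm("10:00"), hm("10:50")),
    ("John", "Y", hm("10:51"), hm("11:30")),
    ("John", "Z", hm("17:05"), hm("17:34")),
    ("Mary", "Z", hm("10:40"), hm("10:55")),
    ("Mary", "Z", hm("10:56"), hm("11:34")),
    ("Ann", None, hm("10:53"), hm("11:32")),  # drug not given in the source
]

def maximal_periods(intervals):
    """Assumed effect of the atelic coercion: merge adjacent intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def contains(outer, inner):
    """True when the closed interval `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def who_had_complete_iv(outer_periods):
    """Patients with a whole infusion inside one of the given periods."""
    return sorted({p for p, _, s, e in T_PHLEBO
                   for outer in outer_periods if contains(outer, (s, e))})

john_y = [(s, e) for p, d, s, e in T_PHLEBO if p == "John" and d == "Y"]

print(who_had_complete_iv(maximal_periods(john_y)))  # ['John', 'Mary']
print(who_had_complete_iv(john_y))                   # ['John']
```

With the coercion, Mary's 10:40 to 10:55 infusion falls inside the period during which John was having Y; without it, neither of John's individual infusions contains Mary's, so she is missed. John appears in both results because his own infusions trivially qualify.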
Our solution: Query language
TELIC SELECT P2.P_CODE
FROM T_PHLEBO (ATELIC PERIOD) AS P, T_PHLEBO (PERIOD) AS P2
WHERE P.P_CODE='John' AND P.Drug='Y' AND P CONTAINS P2
[Figure: operator tree for the query. Labels: T_PHLEBO; ATELIC(T_PHLEBO); P; P.P_CODE='John', …; P'; P2; CONTAINS; R1; TELIC]
Principled and general solution, and increased expressiveness without much user effort (only limited extensions to the query language)!