The Telic\Atelic Distinction in Temporal Databases Paolo Terenziani Institute of Computer Science, DISIT, Univ. Piemonte Orientale “A. Avogadro”, Viale Teresa Michel 11, Alessandria, Italy terenz@mfn.unipmn.it • Acknowledgements: • R.T. Snodgrass • A. Bottrighi, V. Khatri, G. Molino, S. Ram, M.Torchio • L. Lesmo, P. Torasso ER 2012 – ECDM-NoCoDA Workshop – October 15th, Florence
The Telic\Atelic Distinction in Temporal Databases • Summary: • Telic\atelic dichotomy (Linguistics) • (Data) semantics of relational temporal databases • The impact of the dichotomy on TDBs • New data model (semantics) • New query language (semantics: relational algebra) • Extensions to SQL & implementations • Conceptual modeling • Conclusions & open issues
The Telic\Atelic Distinction Aristotle’s “Categories” Telic facts: facts with a goal or culmination (e.g., John built a house) Atelic facts: facts without goal\culmination (e.g., John slept) (in Greek, “telos”=“goal”, “a” as a prefix indicates negation) • Studied\used in • Philosophy • Linguistics • Cognitive Science • Logics, Artificial Intelligence • …
The Telic\Atelic Distinction Deeply rooted in the Western culture E.g., from Cognitive studies: the aktionsart distinctions (and, in particular, the telic/atelic distinction) play a fundamental role in the acquisition of verbal paradigms by children: - [Bloom et al. 1980]: English, - [Bronckart & Sinclair, 1973]: French, - [Aksu, 1978]: Turkish.
The Telic\Atelic Distinction WHY SHOULD WE CARE? Moens and Steedman: “Effective exchange of information between people and machines is easier if the data structures that are used to organize the information in the machine correspond in a natural way to the conceptual structures people use to organize the same information” … BUT ALSO “MORE PRACTICAL” REASONS (current TDBs cannot cope correctly with telic facts!!)
The Telic\Atelic Distinction in Linguistics Linguistic sentences can be classified into different aktionsart classes according to their linguistic behavior and to their semantic properties E.g., [Vendler, 1967] Activities (e.g., “John slept”) Accomplishments (e.g., “John built a house”) Achievements (e.g., “John reached the top of the mountain”) States (e.g., “John had fever”)
The Telic\Atelic Distinction in Linguistics Different linguistic behavior E.g., progressive form Activities (e.g., “John was sleeping”) Accomplishments (e.g., “John was building a house”) Achievements (e.g., “John was reaching the top”) States (e.g., “John was having fever”)
The Telic\Atelic Distinction in Linguistics Semantic properties E.g., [Dowty, 1986] states vs. accomplishments: (1) A sentence φ is stative if it follows from the truth of φ at an interval I that φ is true at all subintervals of I (e.g., if John was asleep from 1:00 to 2:00 PM, then he was asleep at all subintervals of this interval: be asleep is a stative). (2) A sentence φ is an accomplishment/achievement (or kinesis) if it follows from the truth of φ at an interval I that φ is false at all subintervals of I (e.g., if John built a house in exactly the interval from September 1 until June 1, then it is false that he built a house in any subinterval of this interval: building a house is an accomplishment/achievement)
The Telic\Atelic Distinction in Linguistics Semantic properties Property 1 of states (and activities): downward inheritance. Property 2 of states (and activities): upward inheritance. E.g., if John was asleep from 1:00 to 2:00 and from 2:00 to 3:00, then he was asleep from 1:00 to 3:00 NOTICE: neither Property 1 nor Property 2 holds for TELIC sentences (accomplishments)
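The two properties can be illustrated with a minimal sketch, assuming an atelic fact is modelled as the set of integer time points at which it is true. The function name `holds_over` and the integer encoding of the hours are illustrative assumptions, not part of the talk.

```python
def holds_over(points, start, end):
    """True if the fact holds at every time point of the closed interval [start, end]."""
    return all(t in points for t in range(start, end + 1))

# Atelic fact "John was asleep", stored as the set of time points at which it is true.
# It is asserted over [1, 2] and over [2, 3] (hours simplified to integer points).
asleep = set(range(1, 3)) | set(range(2, 4))

# Downward inheritance: truth over [1, 3] carries over to every subinterval.
downward = all(
    holds_over(asleep, s, e) for s in range(1, 4) for e in range(s, 4)
)

# Upward inheritance: truth over [1, 2] and over [2, 3] yields truth over [1, 3].
upward = holds_over(asleep, 1, 3)
```

Both properties come for free here because the fact is nothing more than a set of points; that is exactly what fails for telic facts.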
The Telic\Atelic Distinction in Linguistics Language is FLEXIBLE “Basic” sentences can be classified as activities, accomplishments, achievements, and states Languages provide linguistic tools to switch from one class to the other [Verkuyl], [Moens & Steedman] e.g., when applied to an accomplishment, the progressive form converts it into an activity, since it strips out its culmination (i) “John built a house” is TELIC (ii) “John was building a house” is ATELIC (indeed, the culmination is not implied by (ii))
The Telic\Atelic Distinction IMPACT ON TEMPORAL RELATIONAL DBs Definition (Telic\Atelic facts). Atelic facts (data) are facts (data) for which both downward and upward inheritance hold; Telic facts are facts for which neither downward nor upward inheritance holds.
The Telic\Atelic Distinction IMPACT ON TEMPORAL RELATIONAL DBs • three distinctions: • Representation versus semantics of the language (concrete vs. abstract databases [Chomicki, 1994]) • Data language versus query language. • Data semantics versus query semantics
DATA SEMANTICS: BCDM (Bitemporal Conceptual Data Model [Jensen & Snodgrass, 96]) • Temporal Domains • Time is linear and totally ordered • Chronons are the basic time unit • Time domains are isomorphic to subsets of the domain of Natural numbers DVT = {t1, t2, …, tk} (valid time) DTT = {t’1, t’2, …, t’h} ∪ {UC} (transaction time) DTT × DVT (bitemporal chronons)
BCDM • Data • Attribute names: DA = {A1, A2, …, An} • Attribute domains: DD = {D1, D2, …, Dn} • Schema of a bitemporal relation: • R = (Ai1, Ai2, …, Aij | T) • Domain of a bitemporal relation: • Di1 × Di2 × … × Dij × DTT × DVT • Tuple of a relation r(R): • x = (a1, a2, …, aj | tB), where tB is a set of bitemporal chronons
BCDM Example. Relation Employee with Schema: (name, salary, T) “Andrea was earning 60K at valid times 10, 11, 12. Such a tuple has been inserted into Employee at time 12, and is current now (say now=13)” (Andrea, 60k | {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), …}) [Figure: the tuple's bitemporal chronons, valid time (VT) 10 to 12 plotted against transaction time (TT) 12 to 13]
BCDM Example. Relation Employee with Schema: (name, salary, T) “Andrea was earning 60K at valid times 10, 11, 12. Such a tuple has been inserted into Employee at time 12, and is current now (say now=13)” (Andrea, 60k | {(12,10), (12,11), (12,12), (13,10), (13,11), (13,12), (UC,10), (UC,11), (UC,12)}) [Figure: the tuple's bitemporal chronons, valid time (VT) 10 to 12 plotted against transaction time (TT) 12, 13, UC]
BCDM Bitemporal relation: set of bitemporal tuples. Constraint: Value-equivalent tuples are not allowed. (Bitemporal) DB: set of (bitemporal) relations
BCDM Data Semantics (another viewpoint) (12,10) {<Andrea,60K>} (12,11) {<Andrea, 60K>} (12,12) {<Andrea, 60K>, <John,50K>} (12,13) {<John,50K>} (13,10) {<Andrea,60K>} (13,11) {<Andrea, 60K>} …….. (UC,12) {<Andrea, 60K>}
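The two viewpoints can be related by a small sketch, assuming a relation is modelled as a dictionary from explicit attribute values to a set of (transaction time, valid time) chronons. The name `snapshots`, the string marker for UC, and the chronons given to John (only the two visible on the slide) are assumptions made for illustration.

```python
from collections import defaultdict

UC = "UC"  # special transaction-time value "until changed"

# Each fact is paired with its bitemporal element: a set of (TT, VT) chronons.
employee = {
    ("Andrea", "60K"): {(tt, vt) for tt in (12, 13, UC) for vt in (10, 11, 12)},
    # Assumed chronons for John: only those shown on the slide.
    ("John", "50K"): {(12, 12), (12, 13)},
}

def snapshots(relation):
    """Invert the relation: bitemporal chronon -> set of facts true at that chronon."""
    view = defaultdict(set)
    for fact, chronons in relation.items():
        for chronon in chronons:
            view[chronon].add(fact)
    return dict(view)

view = snapshots(employee)
```

Looking up `view[(12, 12)]` gives both facts, while `view[(UC, 12)]` gives only Andrea's, mirroring the listing above.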
BCDM HENCEFORTH: focus on VALID TIME 10 {<Andrea,60K>} 11 {<Andrea,60K>} 12 {<Andrea,60K>, <John,50K>} 13 {<John,50K>}
BCDM PROPERTIES Consistent extension (of “classical” SQL DB) A temporal DB is a set of “classical” DBs, one for each bitemporal chronon Uniqueness of representation (from the constraint about value-equivalent tuples)
BCDM & al. POINT-BASED Data Semantics A temporal database is a function from time points to a standard (non-temporal) database “SNAPSHOT semantics” Artificial Intelligence \ logics: the truth of facts is evaluated at points in time
BCDM QUERY semantics Algebraic Operators (E.g. Union)
r1 ∪^B r2 = {z | (∃x∈r1 ∃y∈r2 (x[A]=y[A]=z[A] ∧ z[T]=x[T] ∪ y[T])) ∨ (∃x∈r1 (x[A]=z[A] ∧ ¬(∃y∈r2 (y[A]=z[A])) ∧ z[T]=x[T])) ∨ (∃y∈r2 (y[A]=z[A] ∧ ¬(∃x∈r1 (x[A]=z[A])) ∧ z[T]=y[T]))}
• No value-equivalent tuple generated • (uniqueness of representation!) • Coalescing!
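The three cases of the union definition (fact in both relations, only in the first, only in the second) can be sketched as follows, assuming a relation is a dictionary from explicit attribute values to a temporal element (a set of time points). The name `bcdm_union` and the sample data are illustrative assumptions.

```python
def bcdm_union(r1, r2):
    """Union of two temporal relations given as {fact: set of time points}."""
    result = {}
    for fact in r1.keys() | r2.keys():
        # Value-equivalent tuples are merged into one tuple whose temporal
        # element is the union of the two; a fact present in only one
        # relation keeps its temporal element unchanged.
        result[fact] = r1.get(fact, set()) | r2.get(fact, set())
    return result

r1 = {("Andrea", "60K"): {10, 11}}
r2 = {("Andrea", "60K"): {12, 13}, ("John", "50K"): {12, 13}}
u = bcdm_union(r1, r2)
```

Because the result is keyed by the explicit attribute values, no value-equivalent tuples can appear in it, and Andrea's time points are coalesced into a single temporal element.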
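BCDM Reducibility [Commutative diagram: from a temporal relation r^T, applying the temporal operator op^T and then the timeslice ρ_t^T gives the same result as applying ρ_t^T first and then the non-temporal operator op] op(ρ_t^T(r^T)) = ρ_t^T(op^T(r^T))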
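Reducibility can be checked on a toy case. The sketch below assumes valid time only, relations modelled as {fact: set of time points}, and union as the operator; `timeslice` and `temporal_union` are hypothetical helper names.

```python
def timeslice(relation, t):
    """Snapshot at time t: the set of facts whose temporal element contains t."""
    return {fact for fact, points in relation.items() if t in points}

def temporal_union(r1, r2):
    """Temporal union: merge temporal elements of value-equivalent facts."""
    return {f: r1.get(f, set()) | r2.get(f, set()) for f in r1.keys() | r2.keys()}

r1 = {("Andrea", "60K"): {10, 11}}
r2 = {("Andrea", "60K"): {12}, ("John", "50K"): {12, 13}}

# Slicing the temporal union at t must equal the plain set union of the slices.
reducible = all(
    timeslice(temporal_union(r1, r2), t) == timeslice(r1, t) | timeslice(r2, t)
    for t in range(9, 15)
)
```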
An example of implementation: TSQL2 (Snodgrass et al., 1995) Temporal attribute T → four temporal attributes (TTS, TTE, VTS, VTE) Attribute value: a timestamp or UC Bitemporal tuple: A1, …, An | TTS, TTE, VTS, VTE Bitemporal relation: set of bitemporal tuples
An example of implementation: TSQL2 (Snodgrass et al., 1995) TSQL2 SEMANTICS: BCDM 10 {<Andrea,60K>} 11 {<Andrea,60K>} 12 {<Andrea,60K>, <John,50K>} 13 {<John,50K>}
BCDM (TSQL2 & al.) vs. TELIC\ATELIC TSQL2 SEMANTICS 10 {<Andrea,60K>} 11 {<Andrea,60K>} 12 {<Andrea,60K>} Downward inheritance: ([10,12]) → (10) ∧ (11) ∧ (12)
BCDM (TSQL2 & al.) vs. TELIC\ATELIC TSQL2 SEMANTICS 10 {<Andrea,60K>} 11 {<Andrea,60K>} 12 {<Andrea,60K>} 13 {<Andrea,60K>} Upward inheritance: ([10,11]) ∧ ([12,13]) → ([10,11] ∪ [12,13])
BCDM (TSQL2 & al.) vs. TELIC\ATELIC Current temporal databases adopt BCDM semantics Naturally support Upward and Downward inheritance Naturally support ATELIC facts (data) WHAT ABOUT TELIC FACTS (DATA) ?
TELIC FACTS & BCDM (point-based) semantics E.g. Sue had an administration of 500 mg of cyclophosphamide (a cancer drug) starting at 1 and ending at 3 (inclusive) 1 {<Sue, cyclophosphamide, 500>} 2 {<Sue, cyclophosphamide, 500>} 3 {<Sue, cyclophosphamide, 500>} Downward inheritance is (erroneously) enforced! How many administrations? What is their duration? How many mg. administered?
TELIC FACTS & BCDM (point-based) semantics E.g. Sue had - an administration of 500 mg of cyclophosphamide starting at 1 and ending at 3 (inclusive), and - an administration of 500 mg of cyclophosphamide at 4 1 {<Sue, cyclophosphamide, 500>} 2 {<Sue, cyclophosphamide, 500>} 3 {<Sue, cyclophosphamide, 500>} 4 {<Sue, cyclophosphamide, 500>} Upward inheritance is (erroneously) enforced! How many administrations? What is their duration? How many mg. administered?
TELIC FACTS & BCDM (point-based) semantics 1 {<Sue, cyclophosphamide, 500>} 2 {<Sue, cyclophosphamide, 500>} 3 {<Sue, cyclophosphamide, 500>} 4 {<Sue, cyclophosphamide, 500>} As is well known in Linguistics (and Artificial Intelligence, Logics) POINT-BASED SEMANTICS IS NOT EXPRESSIVE ENOUGH TO COPE WITH TELIC FACTS (DATA)
SYNTAX vs. SEMANTICS of data NOTICE: this discussion is INDEPENDENT OF THE IMPLEMENTATION (representation syntax) For instance, TSQL2 uses time intervals in the representation
SYNTAX vs. SEMANTICS of data But they are merely a compact representation for a set of points, since the semantics is 1 {<Sue, cyclophosphamide, 500>} 2 {<Sue, cyclophosphamide, 500>} 3 {<Sue, cyclophosphamide, 500>} 4 {<Sue, cyclophosphamide, 500>}
SYNTAX vs. SEMANTICS of data Needless to say … Answers to queries must be provided on the basis of the SEMANTICS of data (independently of the representation syntax) 1 {<Sue, cyclophosphamide, 500>} 2 {<Sue, cyclophosphamide, 500>} 3 {<Sue, cyclophosphamide, 500>} 4 {<Sue, cyclophosphamide, 500>} How many administrations? What is their duration? How many mg. administered?
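Why the three questions cannot be answered is easy to see in a minimal sketch, assuming interval-stamped rows of the form (fact, start, end) that are interpreted point by point. The function `point_semantics` is a hypothetical helper, not TSQL2 code.

```python
def point_semantics(rows):
    """Map interval-stamped rows (fact, start, end) to {time point: set of facts}."""
    sem = {}
    for fact, start, end in rows:
        for t in range(start, end + 1):
            sem.setdefault(t, set()).add(fact)
    return sem

fact = ("Sue", "cyclophosphamide", 500)
two_rows = [(fact, 1, 3), (fact, 4, 4)]  # two administrations: [1,3] and [4,4]
one_row = [(fact, 1, 4)]                 # one administration: [1,4]

# Under point-based semantics the two databases are indistinguishable.
same = point_semantics(two_rows) == point_semantics(one_row)
```

Since both representations denote the same function from points to facts, no query evaluated on the semantics can tell one administration from two, whatever the stored intervals look like.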
TELIC FACTS & BCDM (point-based) semantics As predicted, e.g., by the Linguistic literature POINT-BASED (snapshot, BCDM, …) SEMANTICS IS NOT ADEQUATE (expressive enough) TO COPE WITH TELIC FACTS INTERVAL-BASED SEMANTICS IS NEEDED !!
SEMANTICS for TELIC FACTS E.g. Sue had - an administration of 500 mg of cyclophosphamide starting at 1 and ending at 3 (inclusive), and - an administration of 500 mg of cyclophosphamide at 4 [1,3] {<Sue, cyclophosphamide, 500>} [4,4] {<Sue, cyclophosphamide, 500>} INTERVAL-BASED SEMANTICS: function from INTERVALS to facts
SEMANTICS for TELIC FACTS: INTERVAL-BASED SEMANTICS INTERVALS ARE PRIMITIVE AND ATOMIC NOTIONS (i.e., NOT A NOTATION FOR A SET OF TIME POINTS !!!)
SEMANTICS for TELIC FACTS: INTERVAL-BASED SEMANTICS DiSIT has - A 50K contract with IBM from 1 to 12 - A 50K contract with IBM from 7 to 18 [1,12] {<DiSIT,IBM,50K>} [7,18] {<DiSIT,IBM,50K>}
INTERVAL-BASED SEMANTICS vs. TELIC\ATELIC [1,3] {<Sue, cyclophosphamide, 500>} [4,4] {<Sue, cyclophosphamide, 500>} Does not imply (mean) that Sue had a (complete) cyclophosphamide administration, e.g., at 2 The administration was exactly from 1 to 3 (and nowhere else) NO downward inheritance: ([1,3]) ↛(1) ([1,3]) ↛(2) ([1,3]) ↛(3)
INTERVAL-BASED SEMANTICS vs. TELIC\ATELIC [1,3] {<Sue, cyclophosphamide, 500>} [4,4] {<Sue, cyclophosphamide, 500>} Does not imply (mean) that Sue had a (complete) cyclophosphamide administration from 1 to 4 She had two distinct administrations! NO upward inheritance: ([1,3]) ∧ ([4,4]) ↛ ([1,4])
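A minimal sketch of interval-based semantics, assuming a telic relation is modelled as a dictionary whose keys are atomic (start, end) pairs. The helper `occurs_exactly` and the variable names are illustrative assumptions.

```python
fact = ("Sue", "cyclophosphamide", 500)

# Interval-based semantics: a function from intervals (atomic keys) to sets of facts.
telic = {(1, 3): {fact}, (4, 4): {fact}}

def occurs_exactly(relation, interval, f):
    """True only if f is recorded for exactly this interval; nothing is inherited."""
    return f in relation.get(interval, set())

# The three questions from the earlier slides now have well-defined answers.
administrations = sorted(iv for iv, facts in telic.items() if fact in facts)
count = len(administrations)                                      # how many?
durations = [end - start + 1 for start, end in administrations]  # how long each?
total_mg = sum(f[2] for facts in telic.values() for f in facts)  # how many mg?
```

Here `occurs_exactly(telic, (2, 2), fact)` and `occurs_exactly(telic, (1, 4), fact)` are both false, which is the absence of downward and upward inheritance stated above.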
(DATA) SEMANTICS FOR TEMPORAL DATABASES Both ATELIC and TELIC facts exist [Aristotle] … POINT-BASED semantics is needed for ATELIC facts INTERVAL-BASED semantics is needed for TELIC facts Two-sorted data model, where - Atelic relations have a POINT-BASED semantics - Telic relations have an INTERVAL-BASED semantics (independently of the chosen implementation\representation) P. Terenziani, Proc. TIME’00, pp. 191-199, 2000. P. Terenziani, R.T. Snodgrass, IEEE TKDE 16(5), pp. 540-551, 2004.
QUERY SEMANTICS (ALGEBRA) • Queries must operate on: • Atelic relations • Telic relations • Telic and Atelic relations together • From Natural Language: • Flexibility is needed (conversion from\to telic\atelic)
ATELIC ALGEBRA e.g., BCDM
r1 ∪^A r2 = {z | (∃x∈r1 ∃y∈r2 (x[A]=y[A]=z[A] ∧ z[T]=x[T] ∪ y[T])) ∨ (∃x∈r1 (x[A]=z[A] ∧ ¬(∃y∈r2 (y[A]=z[A])) ∧ z[T]=x[T])) ∨ (∃y∈r2 (y[A]=z[A] ∧ ¬(∃x∈r1 (x[A]=z[A])) ∧ z[T]=y[T]))}
Union between atelic relations Standard union (x[T] ∪ y[T]) between two sets of time points
ATELIC ALGEBRA UNION supports upward inheritance (COALESCING)
{10 {<Andrea,60K>}, 11 {<Andrea,60K>}} ∪^A {12 {<Andrea,60K>}, 13 {<Andrea,60K>}} = {10 {<Andrea,60K>}, 11 {<Andrea,60K>}, 12 {<Andrea,60K>}, 13 {<Andrea,60K>}}
TELIC ALGEBRA Terenziani & Snodgrass, TKDE 2004
r1 ∪^T r2 = {z | (∃x∈r1 ∃y∈r2 (x[A]=y[A]=z[A] ∧ z[T]=x[T] ∪ y[T])) ∨ (∃x∈r1 (x[A]=z[A] ∧ ¬(∃y∈r2 (y[A]=z[A])) ∧ z[T]=x[T])) ∨ (∃y∈r2 (y[A]=z[A] ∧ ¬(∃x∈r1 (x[A]=z[A])) ∧ z[T]=y[T]))}
Union between telic relations Standard union (x[T] ∪ y[T]) between two sets of time intervals E.g., {[10,15], [20,25]} ∪ {[5,30], [20,40]} = {[5,30], [10,15], [20,25], [20,40]}
TELIC ALGEBRA TELIC UNION does not support upward inheritance
{[1,3] {<Sue, cycloph, 500>}} ∪^T {[4,4] {<Sue, cycloph, 500>}} = {[1,3] {<Sue, cycloph, 500>}, [4,4] {<Sue, cycloph, 500>}}
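Telic union can be sketched with the same code shape as atelic union; only the elements of the temporal sets change, from time points to atomic intervals. The sketch assumes a relation modelled as {fact: set of (start, end) pairs}; `telic_union` is a hypothetical name.

```python
def telic_union(r1, r2):
    """Union of telic relations given as {fact: set of (start, end) intervals}."""
    # Intervals are atomic set elements, so adjacent or overlapping intervals
    # are never merged: there is no coalescing and no upward inheritance.
    return {f: r1.get(f, set()) | r2.get(f, set()) for f in r1.keys() | r2.keys()}

fact = ("Sue", "cycloph", 500)
result = telic_union({fact: {(1, 3)}}, {fact: {(4, 4)}})

# The interval-set example from the slide, written with plain Python sets.
example = {(10, 15), (20, 25)} | {(5, 30), (20, 40)}
```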
TELIC ALGEBRA In principle: a polymorphic adaptation of a “consensus” atelic algebra (e.g., BCDM algebra) where set operators on temporal elements operate on sets of time intervals instead of on sets of time points However, both commonsense and semantic restrictions might\should be considered in the definition of both atelic and telic operators
ATELIC\TELIC ALGEBRA Ex.1 TELIC CARTESIAN PRODUCT Atelic Cartesian Product involves the INTERSECTION of temporal elements (i.e., of sets of time points - semantic level) Its polymorphic TELIC adaptation would involve the INTERSECTION of sets of time intervals, giving in output only common time intervals e.g., {[10,15], [20,30]} ∩ {[10,15], [18,40]} = {[10,15]} Is such an operation useful \ commonsense \ meaningless for users?
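The difference between the two intersections can be made concrete with a small sketch, assuming intervals are (start, end) pairs over integer chronons. The helper `points` is an illustrative assumption.

```python
a = {(10, 15), (20, 30)}
b = {(10, 15), (18, 40)}

# Telic reading: intervals are atomic, so only identical intervals are common.
telic_common = a & b

def points(intervals):
    """Expand closed intervals into the set of time points they cover."""
    return {t for start, end in intervals for t in range(start, end + 1)}

# Atelic reading: intersect the underlying sets of time points instead.
atelic_common = points(a) & points(b)
```

The telic result is just {(10, 15)}, while the atelic one also contains the points 20 to 30, which is what makes the usefulness of the telic operator an open question.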
ATELIC\TELIC ALGEBRA Ex.2 TEMPORAL SELECTION (e.g., duration) All existing temporal algebras are atelic. Indeed, most of them provide temporal selection (e.g., query about duration) But is duration something that can be evaluated “point-by-point” (snapshot-by-snapshot, i.e., in an atelic context) ? Indeed, duration regards the duration of minimal intervals covering convex sets of points! Thus, duration is about TIME INTERVALS, and thus regards the TELIC context.
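A sketch of why duration is not a snapshot-by-snapshot notion: to answer a duration query over atelic data, the points must first be grouped into intervals covering maximal runs of consecutive points. The function `maximal_intervals` and the sample points are illustrative assumptions.

```python
def maximal_intervals(points):
    """Group integer time points into intervals covering maximal consecutive runs."""
    intervals = []
    for t in sorted(points):
        if intervals and t == intervals[-1][1] + 1:
            intervals[-1][1] = t      # extend the current run
        else:
            intervals.append([t, t])  # start a new run
    return [tuple(iv) for iv in intervals]

# Atelic data: a fact true at these time points (hypothetical sample).
fact_points = {1, 2, 3, 4, 8, 9}

runs = maximal_intervals(fact_points)
durations = [end - start + 1 for start, end in runs]
```

No single snapshot carries the value 4 or 2; the durations exist only once intervals have been built, which is a conversion from the atelic to the telic view.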
ATELIC\TELIC ALGEBRA • Indeed, a lot of confusion in the literature, since • The atelic view is chosen (nice properties!) • However, telic operators are useful, and thus “improperly” supported Part of the confusion is probably due to failing to recognize the distinction between syntax (representation) and semantics: Many approaches adopt intervals in the representation, probably not considering the fact that this does not mean supporting intervals in their semantics