150 likes | 271 Views
CS240A: Databases and Knowledge Bases Time Ontology and Representations. Notes From Chapter 5 of Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari. Morgan Kaufmann, 1997. Carlo Zaniolo Department of Computer Science
E N D
CS240A: Databases and Knowledge BasesTime Ontology and Representations Notes From Chapter 5 of Advanced Database Systems by Zaniolo, Ceri, Faloutsos, Snodgrass, Subrahmanian and Zicari. Morgan Kaufmann, 1997 Carlo Zaniolo Department of Computer Science University of California, Los Angeles
Time Properties • Structure • Boundedness • Density • Time Data Types • Time and Facts
Time Structure • Linear Time • Branching Time • Directed Acyclic Graph • Periodic/Cyclic Time—e.g. the week. • Boundedness of Time • Boundedness • Unbounded • Time origin exists (bounded from the left): from now on • Bounded time (bounds on both ends)—the end of time.
Time Density • Discrete: • Time line is isomorphic to the integers • Timeline composed of a sequence of nondecomposable time periods of some fixed, minimal duration, termed chronons. • Between each pair of chronons is a finite number of other chronons. • Dense: (difficult to implement) • Time line is isomorphic to the rational numbers. • Between any two chronons is an infinite number of other instants. • Continuous: (very difficult to implement) • Time line is isomorphic to the real numbers. • Between each pair of instants is a infinite number of other instants. • A bounded discrete representation of time is the simplest option used in SQL-2 and most temporal DBs.
Internal Representation of Time • UNIX time is specified as the number of seconds since January 1, 1970. (Unix Epoch) • Windows NT time is specified as the number of 100 nanosecond intervals since January 1, 1601. • There are 134,774 days (or 11,644,473,600 seconds) between these dates. • Nt2unix Divide by 10,000,000 and subtract 11,644,473,600. • unix2nt:Add 11,644,473,600 and multiply by 10,000,000. • SQL standards are independent from OS and based on: • Date • Time • Timestamp: Date || Time • Support for leap seconds, Julian calendar (almost)
Date and Time Data Types in SQL-2 • Date: four digits for the year and two for month and day. Multiple formats allowed: • E.g., 2001-12-08 or 12/08/2001 or 12.08.2001 • ISO, USA, EUR, JIS representations supported---DBA selects which one is used in specific system. • Time: 2 digits for hour, 2 for minutes, and 2 for seconds, plus optional fractional digits (system dependent). E.g., 13:50:00, 13:50, 1:50 PM denote the same time. • These are external representations: internal representation is unique, independent of external ones.
Seconds The Maximum Second [Trudy Pelzer, DBAzine, 2005-04-18 09:41 PM] The largest second in a minute is given as 59, since the seconds in a minute are generally numbered from 0 to 59. Although that might seem obvious and uncontroversial, it's not 100% true: • First exception: DB2 happens to allow a time of '24:00:00' for midnight. ('24:00:01' is not legal.) This is an extension that is probably best avoided, because it's not compatible with other DBMSs, and because it's hard to distinguish '1999-01-01 24:00:00' from '1999-01-02 00:00:00'. Nevertheless, it might be necessary to correspond to legal documents that refer to "midnight on January 1" because in such documents midnight means the end of the day, not the beginning. • Second exception: At irregular intervals, but always as the last second at the end of a calendar quarter, the International Earth Rotation Service (IERS) declares a leap second. (June 30th and December 31st are the preferred dates; since the system was introduced in 1972, only dates in June and December have been used.) That means the day has (24 x 60 x 60 + 1) 86,401 seconds instead of the usual mere 86,400 – which is why counting seconds between two wide-apart timestamps is almost certainly unreliable. The SQL Standard actually allows for two leap seconds, so the last second on June 30 1972, for example, could have been '1972-06-30 23:59:61'. But in fact, it wasn't — the IERS has never declared a double leap second. SQL Server Magazine Leap Second for 2005: The year 2005 will be one second longer than a usual non-leap year. For the first time since 1998, the IERS has announced that the earth's rotation is running slow relative to atomic time standards, and the last minute of December 31 will contain 61 seconds in order to get things back on track: ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat. • Hmmm, interesting. But how does this impact SQL Server? • All versions of SQL Server don’t support valid DATETIME values such as '2005-12-31T23:59:60.500' that occur during leap seconds. In addition, SQL Server's DATEDIFF function does not take into account leap seconds. Be prepared!
Time Datatype in SQL-2 • TIMESTAMP: date+time with six fractional digits for the second field. E.g., 2001-01-05-13.01.59.000000 • TIME(STAMP) WITH ZONE: offset according to UTC (universel temps coordonné) • INTERVAL: I.e. a time span. In DB2 is called a labeled duration. E.g. , 10 DAYS • Time expressions. Using the labeled duration in arithmetic: orderdate + 10 DAYS < CURRENT DATE --- this is valid, but CURRENT DATE - orderdate > 10 DAYS ---this is not valid. • CAST expressions: E.g. CAST(2 DAYS AS HOURS)returns 48 HOURS
Various Temporal Typesused in temporal DBs • A time instant is a time point on the time line. • An event is an instantaneous fact, i.e, something occurring at an instant. The event occurrence time of an event is the instant at which the event occurs in the real world. • An instant set is a set of instants. • A time period is the set of time instants between two instants (start time and end time).
Periods versus Time Intervals • Periods are frequently called (time) intervals. • but this conflicts with the SQL data type INTERVAL and we will try to avoid it. • In SQL:1999, a time interval is a directed duration of time. A duration is an amount of time with a known length, but no specific starting or ending instants.Also called a span. • A positive interval denotes forward motion of time; a negative interval denotes backwards motion of time.
Valid Time and Transaction Time • Valid Time of a fact: when the fact is true in the modeled reality • Transaction Time of a fact: when it was recorded in the database. • Thus we have four different kinds of tables: • Snapshot • Valid-time • Transaction-time • Bitemporal
Example: Tom's Employment History • On January 1, 1984, Tom joined the faculty as an Instructor. • On December 1, 1984, Tom completed his doctorate, and so was promoted to Assistant Professor effective retroactively on July 1, 1984. • On March 1, 1989, Tom was promoted to Associate Professor, effective July 1, 1989 (proactive update).
Queries and Updates • A transaction time table is append only it keeps the history of the updates made on the database. • Transaction time tables supports rollback/flashback queries, such as: • On October 1, what rank was our database showing for Tom? • A valid time table can be updated: e.g., Tom’s past record is changed once his rank is changed retroactively. • Valid time tables support historical queries, such as: • What was Tom’s rank on October 1 (according to our current database)?
Bitemporal Tables • Bitemporal Tables are appendonly and supports queries of both kinds (rollback& historical) such as: • On October 1, 1984, what did we think Tom's rank was at that date? • TSQL2: SELECT Rank FROM Faculty AS F WHERE Name = 'Tom‘ AND VALID(F) OVERLAPS DATE '19841001‘ AND TRANSACTION(F) OVERLAPS DATE '19841001'
Desiderata for a Temporal Data Modeland Query Language • Capture the semantics of timevarying information • Retain simplicity of the relational model: Strict superset of the relational model • Present all the information concerning an object in a coherent fashion • Ensure ease of implementation • Ensure high performance Regrettably SQL-2 falls way short of these desiderata, and extending SQL to achieve them has also proven to be very difficult!