640 likes | 824 Views
Bitemporal Student Information System. Chris Hance Advisor: Dr. Farimah Fleschute. About Chris. Sr. Programmer for Polk County School Board, since April 2006 Microsoft Certified Database Administrator Former USF IT Program Tutor Absolutely no blog, MySpace, or Facebook. What is this?.
E N D
Bitemporal Student Information System Chris Hance Advisor: Dr. Farimah Fleschute
About Chris • Sr. Programmer for Polk County School Board, since April 2006 • Microsoft Certified Database Administrator • Former USF IT Program Tutor • Absolutely no blog, MySpace, or Facebook
What is this? • The School Board maintains a Student Information System. • Need student data for multiple school years, and records of who made changes for auditing. • Dr. Richard Snodgrass’ Developing Time-Oriented Database Applications in SQL appeared to be a solution.
What is this? • Project: Implement a Student Information System database, using Snodgrass’ Bitemporal model. • Data elements intentionally reduced. Entities need to exist, but can be sparse. • Notes and code available at:http://incautio.us/seniorproject/
Why might I care? • “Increasingly, businesses need real-time access to historical data.” • “For those IT managers who think that modeling and database development are commodity skills, the management of temporal data is the strongest disproof possible.” • -Tom Johnston, Randall Weis, DM Review Magazine
Topics • Bitemporal Definition • Domain Background (Student Data) • Bitemporal Implementation • Primary Key Constraints • Foreign Key Constraints • Conclusions
Bitemporal Definition • Most databases are “Snapshot” • Values right now
“Actual” Time • When a change occurred • A student’s grade and/or school may change every year:
“Actual” Time • Snapshots don’t include history. • Can’t ask questions like:“What schools did this student attend?”“Were there any gaps in enrollment?”“Was the student retained (held back)?” • Actual-time history answers these questions.
“Record” Time • When a change occurred • A person cannot have multiple dates of birth, but the system may have the wrong date. • What if we made a decision, or printed a report based on invalid data?
“Record” Time • Allows auditing:Q: “Why did we enroll a four-year-old in Kindergarten?”A: “Because we had the month & day of birth switched.”
“Record” Time • What’s “forever”? • don’t use today for RecordEnd. • have to update it every day. • don’t pick a value in the near future. • Nulls make comparisons difficult. • Most DBs support 12/31/9999; close enough.
Bitemporal Tables Two “dimensions” of time: • When did it really change? • When did we record the change? • Mistyped grade level on 8/1/2005 • Corrected on 9/15/2005 • Two Record time rows for the same Actual time
Bitemporal Database • Every entity models Actual and/or Record time. • Some entities may not seem to require both. • even name and SSN can change. • Marriage, legal name change, citizenship. • Assumed all entities used both temporal dimensions for this project.
Topics • Bitemporal Definition • Domain Background (Student Data) • Bitemporal Implementation • Primary Key Constraints • Foreign Key Constraints • Conclusions
Domain Background • Tables to represent • student enrollment • schedules • attendance • Grades were omitted to reduce complexity.
School • Names change occasionally, and schools open and close. • Identifier • Current Name • Date Opened • Date Closed
Calendar / Term • Determine days and hours of operation. Some schools have multiple calendars (e.g. year-round). Calendars may be used by multiple schools. • Identifier • School Year • Start Date • End Date
SchoolCalendar • Many-to-many relationship between school and calendar. • Identifier • School Calendar ID • School ID
Student • A person enrolled in a school and/or scheduled course. • Student ID • First Name • Last Name • Date of Birth
SchoolEnrollment • A student registers at a particular school, and must be assigned to a calendar. • Identifier • School Calendar ID • Student ID • Start Date • End Date
Course • State defines a list of courses and titles. • Identifier (Local) • Name • Description • DOE Course Number
AdoptedCourse • A school includes a state-defined course in the school’s course catalog. • Elementary schools shouldn’t see Precalculus when selecting courses to schedule. • Identifier • School ID • Course ID
ScheduledCourse • One section of a course. • Identifier • School Calendar (or Term) ID • Adopted Course ID • Start Date / Time • End Date / Time • Seats
CourseEnrollment • Student enrollment in a Scheduled Course (section). A student may enroll late or drop early. • Identifier • Scheduled Course ID • Student ID • Start Date • End Date
CourseAttendance • Records of student absences and tardies in a scheduled course section. • Course Enrollment ID • Date • Attendance Code • Note
Domain Summary • Representative of a Student Information System (SIS). • Often used synthetic keys for convenience in joins. • Bitemporal joins are difficult enough • a common pattern reduces error • Not a perfect model. • would be a project in itself.
Topics • Bitemporal Definition • Domain Background (Student Data) • Bitemporal Implementation • Primary Key Constraints • Foreign Key Constraints • Conclusions
Bitemporal Implementation • First, add 4 columns to every table: • RecordBegin • RecordEnd • ActualBegin • ActualEnd • ActualBegin & ActualEnd may replace an existing StartDate & EndDate pair.
Record Begin / End • RecordBegin: date on which a bitemporal row was entered in the database. • Usually GetDate() / CURRENT_DATE. • RecordEnd: date on which a bitemporal row is corrected or deleted. • Starts at 12/31/9999. • Replaced with GetDate() on correction / deletion.
Actual Begin / End • ActualBegin: date on which a bitemporal row is effective. • Date of birth, date a school opened, date of legal name change for new name. • ActualEnd: date on which a bitemporal row is no longer effective. • Death, date school closed, date of legal name change for old name. • Still correct for the ActualBegin to ActualEnd range.
Bitemporal Implementation • Either Record or Actual time may overlap • If both overlap, we don’t know which record is correct and effective. • Was this student in 10th or 9th grade from 8/1/2005 to 6/5/2006? • If you ask after 9/15/2005, it’s not clear.
Bitemporal Implementation Also: • A student cannot be enrolled before Date of Birth or after Death. • A course cannot be scheduled after a school is closed, or without a calendar. • These are Integrity Constraints. • Primary Key • Foreign Key
Topics • Bitemporal Definition • Domain Background (Student Data) • Bitemporal Implementation • Primary Key Constraints • Foreign Key Constraints • Conclusions
Bitemporal Primary Keys • Built-in primary keys only check if each column/attribute matches another row. • Can’t use StudentID, because there can be multiple records for different Record time ranges. • Can’t use StudentID and all dates, because we could add 9/1/2008 to 12/31/9999, then add 9/2/2008 to 12/31/9999 for the same ID. • Per-column uniqueness not violated • Temporal uniqueness definitely violated
Overlapping Time s1.Begin s1.End s1 Overlap s2 s2.Begin s2.End s1.Begin < s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End < s2.End s1.Begin > s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End > s2.End s1.Begin s1.End s1 Overlap s2 s2.Begin s2.End
Overlapping Time (2) s1.Begin < s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End > s2.End s1.Begin s1.End s1 Overlap s2 s2.Begin s2.End s1.Begin > s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End < s2.End s1.Begin s1.End s1 Overlap s2 s2.Begin s2.End
Overlapping Time (3) s1.Begin < s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End < s2.End s1.Begin < s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End > s2.End s1.Begin > s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End > s2.End s1.Begin > s2.Begins1.Begin < s2.Ends1.End > s2.Begins1.End < s2.End • Overlap occurs when: • s1.Begin < s2.End • s1.End > s2.Begin, rewrite as s2.Begin < s1.End • Comparing s1.Begin to s2.Begin, or s1.End to s2.End is irrelevant.
Bitemporal Primary Keys • Violated if two rows in the same table match on Identifier, and overlap on both time dimensions: Select nullFrom Student s1Join Student s2 On s1.StudentID = s2.StudentID And s1.RecordBegin < s2.RecordEnd And s2.RecordBegin < s1.RecordEnd And s1.ActualBegin < s2.ActualEnd And s2.ActualBegin < s1.ActualEnd • Exception: the “two” records are really the same record, since we’re joining the table to itself.
Bitemporal Primary Keys If Exists (Select nullFrom Student s1Join Student s2 On s1.StudentID = s2.StudentID And s1.RecordBegin < s2.RecordEnd And s2.RecordBegin < s1.RecordEnd And s1.ActualBegin < s2.ActualEnd And s2.ActualBegin < s1.ActualEnd And Not ( s1.RecordBegin = s2.RecordBegin And s1.RecordEnd = s2.RecordEnd And s1.ActualBegin = s2.ActualBegin And s1.ActualEnd = s2.ActualEnd ) )
Triggers as Primary Keys • If a match exists to the query, then the data is already in the database. • How do we know which one is the original, and which to delete? • Much easier to use a database trigger. • Triggers are queries that run before, after, or instead of an insert, update, and/or delete. • After is easiest to implement, and can roll back (undo) the change.
Triggers as Primary Keys CREATE TRIGGER [dbo].[trStudentTemporalUniqueStudentID]ON [dbo].[Student]AFTER INSERT, UPDATEASBegin If Exists ( <Query goes here> ) Begin RAISERROR('Row overlaps an existing Student row for the same Record and Actual time period.',16,1); /*Note: this rolls back ALL open transactions. If stored procs are the only access mechanism, we may want to raise the error and let the stored proc manage transaction state*/ Rollback Transaction EndEnd
What about equality? • Trigger doesn’t stop duplicate inserts. • Can’t tell whether it’s a duplicate or the same record. • Just use a normal unique index. • The database can prevent exact duplicates. Alter Table StudentAdd Constraint ixStudentTemporalUniqueUnique Clustered (RecordEnd Desc, ActualEnd Desc, StudentID, RecordBegin, ActualBegin, Sequence)
Topics • Bitemporal Definition • Domain Background (Student Data) • Bitemporal Implementation • Primary Key Constraints • Foreign Key Constraints • Conclusions
Bitemporal Foreign Keys Nontemporal Foreign Keys • Every row in the child table must have exactly one corresponding record in the parent table. Bitemporal Foreign Keys • For each instant in Record and Actual time of the child table, exactly one corresponding row must exist in the parent table.
Bitemporal Foreign Keys Nontemporal: • An enrollment requires a student. Bitemporal: • A “current” Record-time enrollment (now to 12/31/9999) for a given Actual-time range (8/1/2008 to 6/5/2009) must have student: • Record-time rows from Now to 12/31/9999. • Actual-time rows for every record time in that range, covering the range from 8/1/2008 to 6/5/2009,
Nontemporal Foreign Key Example Student Enrollment
Bitemporal Foreign Key Example Student Enrollment
Bitemporal Foreign Keys • Now we see what they prevent. • Like Primary Keys, we can’t use the built-in version. • What if we try triggers again? • Let’s use school as the example…
Bitemporal Foreign Key Example Student Enrollment • We find out the user mistyped the date of birth. • Student.ActualBegin should be 1/6/2005.