80 likes | 240 Views
Crime Ref. Ward code. Name. Population. date. Ward location. Deprivation. 1. WARD. N. CRIME. N. In district. N. 1. 1. 1. N. DISTRICT. Grid ref. Date charged. Name. X. Y. Perpetrated by. Pub in ward. Lives in. Home office code. 1. N. CRIME TYPE. PUBLIC HOUSE. N.
E N D
Crime Ref. Ward code Name Population date Ward location Deprivation 1 WARD N CRIME N In district N 1 1 1 N DISTRICT Grid ref. Date charged Name X Y Perpetrated by Pub in ward Lives in Home office code 1 N CRIME TYPE PUBLIC HOUSE N Gender Pub. Id. M Ethnicity OFFENDER HO_code Crime descriptor DoB Grid ref. Pub. name Offender id. Home Grid ref. X Y X Y FINAL ERM FOR CRIME DATABASE (CHEN NOTATION)
DDI 3: From ERM to relational database schema • Reminder of RDBMS concepts first: • Primary key (PK) – uniquely identifies each row in table • Foreign key (FK) – a column that provides a link to the PK of another table, FK-PK matches mediate “joins” • Attributes (columns) must be single-valued (i.e. no lists of values in cells – e.g. keywords) and atomic (i.e. not complex/compound) • Rules for translating ERM to RDBMS: • Entities tables • Attributes columns • 1 column for each component of complex attribute • Multi-valued attributes need table of their own • 1-to-1 and 1-to-many: add FKs to existing tables • Many to many relationships: need table of their own with two FKs linking the two entity types together
Join attributes are indicated, i.e. FK/PK links TABLES FOR THE CRIME DATABASE (“database schema”) DERIVED FROM THE ERM NB: No table needed for DISTRICT as no specific data is required
Normalization – slightly simplified • Functional dependency (FD): • An attribute (X, the determinant ) functionally determines another (Y, the dependent attribute ) if the X value necessarily “fixes” the value of the Y attribute • E.g. HO_code -> crime_descriptor, Offender_ID -> DoB • NB: vice versa generally not the case • Relations (tables) should always be in something called Boyce-Codd Normal Form (BCNF) • Informally, this means that there should be no FDs amongst the non-key attributes • i.e. all non-key attributes should be functionally dependent only on the primary key
Normalization cont’d • For example, if we had the following table for CRIMES: • Crime ref., HO_code, crime_descriptor, date etc. • This would not be in BCNF, because HO_code-> crime_descriptor • This is problematic in two ways: • Information is redundantly repeated (i.e. the crime_descriptor) wasting space • Updating is complicated. If, for instance, one of the crime_descriptors were reworded, every mention of it would have to be changed, rather than a single instance as in the normalized version • Doing an ERM should mean your database schema is in good shape. It is especially helpful in spotting many-to-many relationships, which always require a table of their own • But (here is the moral): • Always check your tables and if an FD is found amongst the non-key attributes, extract the FD and put it in a table of its own • Think of this as a health check!
Integrity rules • Entity integrity: • Every entity must have a primary key • Referential integrity: • There must be a matching value for every value of a foreign key in the relation “pointed at” by the FK • NB. The FK can have a NULL value • Both of these integrity rules will be automatically enforced by the RDBMS, e.g. • Inserting a record with a NULL PK will not be allowed, nor a duplicate value • Inserting a record with a value for a FK, for which no matching PK value can be found, will not be allowed
CREATING TABLES IN MS ACCESS The following SQL commands set up two of the tables we need for the CRIME database (NB. SQL is the general purpose language for manipulating data in a RDBMS, including creating tables and adding data) : CREATE TABLE CRIME_TYPE ( HO_CODE varchar(4) PRIMARY KEY, CRIME_DESCRIPTOR varchar(100)) CREATE TABLE CRIME( CRIME_REF varchar(8) PRIMARY KEY, HO_code varchar(4), FOREIGN KEY (HO_code) REFERENCES CRIME_TYPE(HO_code), CRIME_DATE date, X int, Y int, WARD_CODE varchar(8)) NB2 – WARD_CODE not yet designated as a FK, this can be done later with an ALTER TABLE command NB3 – there is actually a “wizard” for creating tables in Access, which you will use in the lab class NB – the CRIME_TYPE table has to be created first. Why?
PS. Multi-valued attributes Title ISBN • Key word is multi-valued – an arbitrarily ordered list of one to several words. Need two tables: • BOOK: ISBN, Author, Title • KEYWORD: ISBN, keyword • The latter has a separate entry for each keyword for a given book BOOK Author Key word