320 likes | 481 Views
ZEIT2301- Database Design Normalisation. School of Engineering and Information Technology UNSW@ADFA Dr Kathryn Merrick Bldg 16, Rm 212 (Thursdays and Fridays only) k.merrick@adfa.edu.au. Topic 08: Database Normalisation. Designing a ‘good’ relational database Normal forms
E N D
ZEIT2301- Database Design Normalisation School of Engineering and Information Technology UNSW@ADFA Dr Kathryn Merrick Bldg 16, Rm 212 (Thursdays and Fridays only) k.merrick@adfa.edu.au
Topic 08: Database Normalisation • Designing a ‘good’ relational database • Normal forms • First normal form • Second normal form • Third normal form
A Motivating Example • Suppose we want to develop a database of bike statistics for a program that permits users to find out whether or not each bike can stoppie.
A file for such data might be… This column contains two types of data (bike type and number of passengers) This column contains duplicate (redundant) data Wheelbase Can stoppie? Bike description Harley 1.588 0.724 0.9 false Harley one pax 1.588 0.775 0.9 false Honda 1.458 0.831 0.9 true Honda one pax 1.458 0.881 0.9 true Centre of mass height Coefficient of friction Data in this column is not even bike dependent
Relations Relations are data tables with rows and columns Column Row
Attributes Attributes are named columns of relations Out bike example has five attributes:
Domains The domain of an attribute is the set of allowable values for that attribute Attribute domains in our bike example:
Records (Tuples) A record or tuple is one row of data in a relation Our bike example has four records
Relational Databases A relational database is a collection of normalised relations Normalisation is a technique for producing a set of tables that conform to desirable redundancy and integrity constraints There are three common normal forms: First normal form Second normal form Third normal form
First Normal Form (1NF) A table is in 1NF if: The intersection of every column and record contains only one value and It has a primary key attribute that uniquely identifies every record
Keys (Revision) Superkey: a column or set of columns that uniquely identifies a record in a relation Candidate key: a superkey with the minimum number of columns Primary key: the candidate key selected for identification purposes Foreign key: a column or set of columns in one table that matches a candidate key of another table
Decomposition to 1NF Remove multi-valued attributes Add a primary key 12
Second Normal Form (2NF) A table is in 2NF if: It is in 1NF and The values in each non-primary-key column depend on value in all primary key columns (ie: not a subset of the primary keys)
Decomposition for 2NF Remove non-primary key attributes that are not fully functionally dependant on the primary key Place them in a new relation with the part of the primary key on which they are functionally dependant (i.e. their determinant) Consider replacing compound primary keys with non-compound keys 15
Full Functional Dependency Examine the non key attributes in “creditRecord”: address employer interestRate limit From the FDs given, the attributes address, employer and interestRate are not dependent on the whole primary key The attribute limitis fully functionally dependent on the primary key creditRecord(customer, creditCard, address, employer, limit, interestRate) 16
Third Normal Form (3NF) A table is in 3NF if: It is in 1NF and It is in 2NF and The values in each non-primary-key column depend on values in only the primary key columns
Transitive Functional Dependency Cnsider a relation with attributes A, B, and C. If B is functional dependent on A (A B), and C is functional dependent on B (B C), then C is transitively dependent on A. A B, B C If any non-key attribute is transitively dependent on the primary key, the relation is not in 3NF. 19
“Codd’s Law of Normalization” Thou shalt depend upon the key (1NF), the whole key (2NF), and nothing but the key (3NF)! 21
Normalization Exercise: 1NF? member(memberNo, name, homeCity, hobby, sport, sportHQ) 22
Normalization Exercise: 1NF member(memberNo, name, homeCity,hobby, sport, sportHQ) memberHobby(memberNo, hobby) Both tables are in 1NF: All attributes are single valued and depend on PK Session 1, 2009 23
Normalization Exercise: 2NF? member(memberNo, name, homeCity, sport, sportHQ) Do all non-key attributes depend on the whole PK? FDs: memberNo name, homeCity, sport, sportHQ sport sportHQ 24
Normalization Exercise: 2NF member(memberNo, name, homeCity, sport, sportHQ) • Do all non-key attributes depend on the whole PK? • PK is not composite. • All attributes depend on the PK. • Table is in 2NF FDs: memberNo name, homeCity, sport, sportHQ sport sportHQ Session 1, 2009 25
Normalization Exercise: 3NF? member(memberNo, name, homeCity, sport, sportHQ) Are there any transitive dependencies between non-key attributes? FDs: memberNo name, homeCity, sport, sportHQ sport sportHQ 26
Normalization Exercise: 3NF member(memberNo, name, homeCity, sport) sport(sport, sportHQ) Decompose based on transitive dependencies. NB. Maintain a link between tables FDs: memberNo name, homeCity, sport, sportHQ sport sportHQ Session 1, 2009 27
Normalization Exercise: 3NF • sport table: • 1NF (all attributes single valued and dependant on PK) • 2NF (attribute depends on the whole key) • 3NF (no transitive dependencies between non-key attributes) sport(sport, sportHQ) FDs: memberNo name, homeCity, sport, sportHQ sport sportHQ Session 1, 2009 28
Normalization Exercise: 2NF 3NF memberHobby(memberNo, hobby) • memberHobby table: • 1NF: All attributes are single-valued • 2NF: Table is all key. No non-key attributes so table is in 2NF • 3NF: No non-key attributes so no transitive dependencies between them! 29
Summary After today’s lecture you should be able to: Design a normalised relational database in First normal form Second normal form Third normal form