260 likes | 444 Views
Normalisation Introduction. Outline. motivation: database design – validation redundancy / update anomalies basis: functional dependencies (FDs) definitions examples concepts and terminology semantic assumtpions (more) advanced theoretical issues (in brief) normal form: illustration
E N D
Outline • motivation: • database design – validation • redundancy / update anomalies • basis: functional dependencies (FDs) • definitions • examples • concepts and terminology • semantic assumtpions • (more) advanced theoretical issues (in brief) • normal form: illustration • definition • example
Database Design • relational model • how do we know whether a relational model is good or not? • how do we know whether a relation is well designed or not? • normal forms • a (semi-)formal way of validating a relational model, from the point of view of reducing the redundancy of data
Redundancy Student-Modules
Redundancy Student-Modules
Redundancy • a relation contains redundant data if it stores the same information more than once • a relational model may have redundancy and at the same time have no redundant relations • how? give an example • redundant data may cause update anomalies and may lead to inconsistencies • normalisation deals with redundant data at the level of individual relations
Update anomalies - insertion • insert the fact that 50012 takes “Networks - Introduction”; the name of the student and the name of the personal tutor have to be entered as well; this is prone to errors inconsistent data • the structure of the relation does not prevent such errors from happening • can you identify other kinds of update anomalies on this relation?
Update anomalies - deletion • delete the fact that 41002 takes “HCI”, in the original table; relevant information will be also deleted - about “T.A Flo” and about “HCI” • the structure of the relation does not prevent such errors from happening
Update anomalies - modification • it is possible to modify an attribute and to bring the relation in an inconsistent state; e.g. it is possible (e.g. by mistake) to modify the value of “Database Systems” to “1/2cu” in just some rows; such situations must be avoided • the structure of the relation does not prevent such errors from happening
Update anomalies • update anomalies • may lead to inconsistent data • are caused by redundancy • normal forms • are a “measure” of the amount of redundancy in a relation • are defined on the basis of a simpler concept: functional dependencies • normalisation • a way of transforming relations to eliminate redundancies • no data should be lost/changed through normalisation
Functional dependency (FD) R - relation, X and Y - subsets of attributes of R X Y iff in every possible legal value of R each X-value has a single Y-value associated
Examples (S_id, S_name, P_tutor, Module, Val, Res) S_id S_name S_id P_Tutor S_id S_id (S_id, S_name) P_tutor (S_id, S_name, P_tutor) P_tutor Module Val (S_id, Module) Res (S_id, S_name, P_tutor, Module, Val) Res
Concepts • FD is a semantic concept • you must understand the meaning of the attributes • determinant / dependent • trivial / non-trivial • left-irreducible • yes: (S_id, S_name) P_tutor • no: (S_id, Module) Res • closure • irreducible set
Semantic assumptions • FDs are “deduced” from the semantic assumptions (that define the application) • (patient, symptom, doctor, practice, diagnosis) • a patient is seen only by one doctor • patient doctor • a patient, for a given symptom, is seen by only one doctor • patient, symptom doctor • a doctor gives only one diagnosis for a symptom of one patient • patient, symptom, doctor diagnosis
Operations with FDs • inference rules • augmentation: if AB then ACBC • transitivity : if A B and BC then AC • decomposition: if ABC then AB and AC • union: if AB and AC then ABC • composition: if AB and CD then ACBD
Functional diagram S_name S_id S_id Res P_tutor Module Module City
FDs and Keys • define a candidate key (CK) in terms of FDs • how is a FD expressed in a relation?
Closure • all FDs that can be derived from a given set S • notation S+ • Armstrong’s inference rules • for a partial set refer to slide “Operations with FDs”
Irreducible set • S1 covers S2 iff S2+ S1+ • S is irreducible iff • RightHandSide of every FD is non-composite • all FDs in S are left-irreducible • no FD ca be discarded from S without changing S+ • a database that enforces S enforces, in fact, S+ • the irreducible set of S is S’ iff • S’ - irreducible • S’+ = S+ • more efficient to work with the irreducible set
1NF – First Normal Form • not based on FDs • a relation is in 1NF if and only if all the domains of its attributes contain only scalar values • the relational model can only contain relations in 1NF
2NF – Second Normal Form • a relation (with just one CK) is in 2NF if and only if it is in 1NF and there is no FD from a subset of attributes of the PK to a non-key attribute
2NF – Examples • not 2NF • (S_id, S_name, S_add, M_id, M_name, M_type, M_val, Result) • why? • 2NF • (S_id, S_name, S_add) • (M_id, M_name, M_type, M_val) • (S_id, M_id, Result)
3NF – Third Normal Form • a relation (with just one CK) is in 3NF if and only if it is in 2NF and there is no FD between non-key attributes
3NF - Examples • not 3NF • (M_id, M_name, M_type, M_val) • why? • 3NF • (M_id, M_name, M_type) • (M_type, M_val)
Normalisation • the process of transforming a relation with redundancies into an “equivalent” set of relations that have less redundancies • equivalent – non-loss decomposition
Conclusion • redundancy • update anomalies • normal forms – solution • functional dependencies • normal forms – simple definitions and examples