590 likes | 690 Views
DBMAN 1. Expectations Data modeling basics Normalization Correct decomposition. DBMAN 1. Expectations Data modeling basics Normalization Correct decomposition. Expectations. Three test s during the semester ( Oracle dialect) 1. Basic SQL command s 2. DML, DDL 3. PL/SQL
E N D
DBMAN1. Expectations Data modeling basics Normalization Correct decomposition OE NIK 2013
DBMAN1. Expectations Data modeling basics Normalization Correct decomposition OE NIK 2013
Expectations • Three testsduring the semester(Oracle dialect) • 1. Basic SQL commands 2. DML, DDL 3. PL/SQL • Create database that can be used for demonstration of SQL commands (choose a topic): • Create ER diagram to plan the data (at least three inter-connected entities) • Create the table structure diagram • Verify that we made good tables (BCNF normal form + correct decomposition) • Create SQL file that creates the database • Create test queries OE NIK 2013
Expectations • Test queries • Use DDL constraints (NOT NULL, PK, FK) • 3 simple single-table select • 3 simple single-table GROUP BY • 5 complex multi-table select • 5 complex subquery select • Must use at least 5 different row-level functions and all grouping aggregate functions • Demonstrate DML operations: 2 insert, 2 update, 2 delete statements – 3 of those must use sub-queries OE NIK 2013
Expectations • Documentation • Task description • Diagrams • For every SQL command: id number of query, task description, SQL command, output results • Deadline: last Tuesday of the semester OE NIK 2013
DBMAN1. Expectations Data modeling basics(http://en.wikipedia.org/wiki/Entity–relationship_modelhttp://okaram.spsu.edu/~curri/classes/12/spring-12/DB/StudyNotes/Ch6-FromERToTables/RelationalModelER2Tables.pdf) Normalization Correct decomposition OE NIK 2013
Logical data modelling • Hierarchical model tree • Sets graph • Relational data model Relations between tables (~Excel worksheets) • Object-oriented data model Encapsulation of data + commands (exists as a principle, can be well approached using ORM systems, but the storage is still relational) OE NIK 2013
Relational data model • Columns: data fields (field, attribute) • Rows: records • Several RELATED tables: database • - the order of rows doesn't matter - the order of columns doesn't matter- there can't be two completely identical rows (entities must be different) OE NIK 2013
Relations • One-to-one connection (1:1) • Quite rare (people + entrepreneurs – dependencies???) OE NIK 2013
Relations keys One-to-many relation (1:N) OE NIK 2013
Keys • Primary key, foreign key • Simple key, complex key • When we use complex keys, then the key must not be split up (Complex keys shouldn't be used this semester) • Relations between keys = relations between tables = database • We'll discuss the actual SQL implementation of keys later OE NIK 2013
Relations Many-to-many relation (M:N) OE NIK 2013
Elements of an ER diagram • Relation + Entities • Attribute • Relation-attribute • Primary key OE NIK 2013
Programmers, Projects, Modules • Entities ER Diagram Relations FieldsorEntities Fields ER Diagram Relations • Fields of Coders? • Fields of Modules? • Fields of Projects? • Relations • 1 module 1 project, 1 project several modules1:N • 1 coder several modules, one module several codersN:M extra attributes for the relation? OE NIK 2013
From ER to Tables (examples) OE NIK 2013
From ER to Tables (examples) OE NIK 2013
From ER to Tables (examples) OE NIK 2013
Table structure diagram OE NIK 2013
Order number Item number Item name Amount Dead-line Address Name 991201 0001 Barna kenyér 25 991201 Bpest. Barna u.1 Julius Meinl 991201 0001 Barna kenyér 35 991201 Bpest. Barna u.1 Julius Meinl 991202 0002 Fehér kenyér 24 991201 Bpest Fehér u.2 Penny Market Example for a typical excel spreadsheet • REDUNDANCY, INCONSISTENCY • AIM: decrease these factors by introducing new tables based on the relations between the fields OE NIK 2013
DBMAN1. Expectations Data modeling basics Normalization (http://en.wikipedia.org/wiki/Database_normalization) Correct decomposition OE NIK 2013
Normalization • The method described here has a practical approach, and uses tables to show how the tables are split up • Usually normalization is done BEFORE we have any data, so BEFORE we have tables • In addition, it is basically a mathematical process to find DEPENDENCIES amongst the attributes in a relation:Frelation_name: {key1, key2} {field1, field2, field3} • The same relation can be written as ER diagram too: 1 square + circles + underline in two circles OE NIK 2013
Base model • List of available/possible data fields that we have at our disposal – without any system or principles, just "look around and list what we have" • For example: we want to store the orders in our food shop • Item number, Item name, Price, VAT category, VAT percentage, Order number, Amount, Shipping address, Color, Weight, Country, Packaging mode, etc etc etc OE NIK 2013
0NF • We eliminate the unnecessary data we only keep the data we really want to store let's try and store these data in a table • Item number, Item name, Price, VAT category, VAT percentage, Order number, Amount, Shipping address, • Color, Weight, Country, Packaging mode OE NIK 2013
0NF Forders: {Item number, Item name, Price, VAT category, VAT percentage, Amount, {Order number, Shipping address}} OE NIK 2013
1NF „In every row of the relation, there must be only atomic attributes.” • The table must not have merged rows/columns • We eliminate the merged cells or • We eliminate the complex data attributes OE NIK 2013
1NF Anomalies: modification, deletion, insertion !!! OE NIK 2013
2NF Functional Dependency (FD):We speak about an FD if the value of an attribute is directly and clearly determined by the value of another attribute. It is always a one-way relationship. For example:By knowing the passport number, we can determine all other data of a person (name, mother's name, birthdate, birthplace, etc.), so it determines all data for the entity person. In the lack of such a number, we should use alternate combination of attributes (e.g. Full name + Birthdate + Mother's maiden name) OE NIK 2013
2NF „Every secondary attribute must functionally depend on the key of the relation” • Expectation: in a table, a field must be either a key field or a field that functionally depends on the key field(s) • We must eliminate the multiple FDs: we have to restructure the tables, and create several tables based on the FDs that we find • To do that, we have to write up the FDs OE NIK 2013
2NF • Fitems: {Item number} {Item name, Price, VAT category, VAT percentage} • Forders: {Order number} {Address} • Amount??? • ForderItems: {Item number, Order number} {Amount} • FD = Table • Primary key, Foreign key, Simple key, Complex key !!! OE NIK 2013
2NF Keys??? OE NIK 2013
3NF • Transitive Dependency: A non-key attribute is transitively dependent on the entity's key if it is also functionally dependent on another attribute in the relation „secondary attributes must NOT have transitive dependencies on the relation's key(s)” • Every field must depend on the key field, AND ONLY on the key field – no other dependencies are allowed in a table! • Currently: Fvat: {VAT category} {VAT percentage} • New FD = new table new FK in the items table OE NIK 2013
3NF Anomalies ??? OE NIK 2013
Benefits • Almost no redundancy, absolutely no unnecessary redundancy • The overall size is usually smaller • No anomalies, no data loss • Consistent and easy to expand or modify • Every field must depend on the key field AND only the key field data modification is easier OE NIK 2013
To summarize 0NF: if there are no unnecessary data 1NF: if 0NF and there are no complex attributes 2NF: if 1NF and every field is functionally dependent on the key(s) 3NF: if 2NF and there is no field that is transitively dependent on the key(s) [or: if 2NF and every field is functionally dependent on the key(s), BUT NOT ON OTHER FIELDS] BCNF: if 3NF and there is no inner FD between the any of the complex keys' fields 4NF, 5NF, DKNF ......... OE NIK 2013
To summarize • If we have a decomposition, we have to verify if it is BCNF or not. • A decomposition is BCNF if • Every relation has PK field(s) • There is no FD that crosses tables • Inside a table, the only FD is the Frelation_name: {key1, key2} {field1, field2, field3} • There is no dependency inside any of the PK fields (thus, it's good if we don't use complex keys) OE NIK 2013
DBMAN1. Expectations Data modeling basics Normalization (http://en.wikipedia.org/wiki/Database_normalization) Correct decomposition OE NIK 2013
Armstrong axioms • A functional dependency: • Is reflexive: The simple key functionally depends on itself. Any subset of a complex key functionally depends on the complex key (if , then ) • Is transitive: If and then • Is expandable: If , then • After splitting up the relation into smaller tables, we have to use these axioms to verify if our decomposition is usable/correct OE NIK 2013
Correct decomposition • Two aspects: • Dependency preserving decomposition : we had the original relation, and then we wrote up the FDs. After the decomposition, can we still access all the original FDs? • Lossless decomposition: we had the original relation with the original fields. After the decomposition, using the FDs that we wrote up, can we reconstruct the original relation? • 3NF and BCNF are ALWAYS lossless decompositions! • The 3NF is always dependency preserving, there are some cases where it is impossible with the BCNF! OE NIK 2013
Dependency preserving decomposition • Given an R relation and an F set of depencenies, we say that a decomposition of R into R1, R2, …Rk is dependency preserving if by projecting the original functional dependencies to R1, R2, …Rk we can get back the original set F • Given: • R=(A, B, C) , F={AB, BC} • Decomposition: R1=(A, C) R2=(B, C) • Projected FDs in R1: F1={AC} (transitive) • Projected FDs in R2: F2= {BC} • F1 union F2 = {BC, AC}. The original AB is lost OE NIK 2013
Dependency preserving decomposition • Given: • R=(A, B, C) , F={AB, BC} • Decomposition: R1’=(A, B) R2’=(B, C) • Projected FDs in R1’ : F1={AB} • Projected FDs in R2’ : F2= {BC} • F1 union F2: {AB, BC} , this is correct OE NIK 2013
Dependency preserving decomposition • R={Item number, Item name, Price, VAT category, VAT percentage, Amount, Order number, Shipping address} • F={ Fitems: {Item number} {Item name, Price, VAT category, VAT percentage}, Forders: {Order number} {Address}, ForderItems: {Item number, Order number} {Amount}, Ftax: {VAT category} {VAT percentage} } • Decomposition: R1’=(Item number, Item name, Price, VAT category) R2’=(Order number, Shipping address) R3’=(Item number, Order number, Amount) R4’=(VAT category, VAT percentage) • Projected FDs are the same as the original FDs, since we used the FD=table approach no FD is lost OE NIK 2013
Lossless decomposition • In a lossy decomposition, the original data is lost BAD structure, a usable decomposition must always be lossless! • We use a simple algorithm, do determine if I can get back from the small tables to the original big table • Basically, we check which other fields can be "attached" to our existing fields using the FDs that we have written up OE NIK 2013
EXAMPLE R1 (F1, F2, F3, F5) R2 (F1, F3, F4) R3 (F4, F5) FD1: {F1} {F3 F5} FD2: {F5} {F1 F4} FD3: {F3 F4} {F2} • Starting point: a 3x5 table with unique values b(i,j) (rows=relations, columns=attributes) • Then modify the rows so that they represent the relations: if field j is present in relation i, the b(i,j) is replaced with a(j) OE NIK 2013
EXAMPLE R1 (F1, F2, F3, F5) R2 (F1, F3, F4) R3 (F4, F5) OE NIK 2013
EXAMPLE FD1: {F1} {F3 F5} OE NIK 2013
EXAMPLE FD2: {F5} {F1 F4} OE NIK 2013
EXAMPLE FD3: {F3 F4} {F2} OE NIK 2013
Lossless? • Decision: If there is no FD to apply, then we check if there is a row in a table with nothing but aj values or not • If there is a row like that, then the decomposition is lossless. If not, then the decomposition is lossy • We don't even have to loop through all FDs: we can stop if we reach to a point where we see that a good row is created with nothing but aj values OE NIK 2013