1 / 27

Designing Relational Databases

Explore the problems faced in designing relational databases, such as repetition of information and inability to represent certain information. Find solutions including decomposition, lossless join, and normalization.

wgriffin
Download Presentation

Designing Relational Databases

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Designing Relational Databases B.Ramamurthy B.ramamurthy

  2. Problems with Simple Relations • Repetition of information • Inability to represent certain information • Loss of information from decomposing relations B.ramamurthy

  3. Example • Consider a lending schema lending-schema (bname, bcity, assets, cname,loanNo, amt) when you want to insert a new loan do you have insert assets too? Ex: insert (Perry, Horseneck, 170000, Adams, l-13, 1500) B.ramamurthy

  4. Example (contd.) How will you represent branch details if there was no loan in the branch. We will need null values. Null values are difficult to handle. B.ramamurthy

  5. Solution : alternative design • Decomposition • Decompose lending-schema into branch-customer-schema(bcs) and customer-loan-schema(cls) • bcs(bname, bcity, assets, cname) • cls(cname, loanNo, amt) • Is this decomposition correct? Lossless? B.ramamurthy

  6. Decomposition • bcs natural-join cls should result in our original lending but does it? • For example, we cannot answer the question “Find all branches that have made loan < 100” on “lending-schema” will result in Mianus and Round Hill. • But the same query on bcs joined with cls results in Mianus, Round Hill and Downtown. • Some information was lost in decomposition that leads us to the incorrect answer. B.ramamurthy

  7. How can we make decomposition lossless? • Impose set of constraints (including functional dependencies). • For example • bname --> {bcity, assets} would have taken care of the problem above. B.ramamurthy

  8. Formally Stating.. • Let C be a set of constraints. • A decomposition {R1, R2, .. Rn} of a relation scheme R is a lossless join for if all relations r on schema R that are legal under C, r = R1 ( r) join R2 ( r ) … join Rn ( r ) B.ramamurthy

  9. Normalization using Functional Dependencies • Using functional dependencies we can define several “normal forms” that represent good database deign. • There are several normal forms : 1NF, 2NF, etc. • We will cover 3NF and Boyce-Codd Normal Form (BCNF) B.ramamurthy

  10. Desirable Properties • Lossless-join decomposition • Dependency preservation • Lack of redundancy B.ramamurthy

  11. BCNF • A relation schema R is in BCNF with respect to a set of functional dependencies if for all functional dependencies in F+ of the form a-->b at least one of the following holds, • a-->b is a trivial functional dependency (that is b is subset of a) • a is superkey for schema R B.ramamurthy

  12. Example customer-scheme(cname,cstreet,ccity) cname-->cstreet,ccity branch-scheme(banme, assets, bcity) bname-->asset,bcity loan-info-schema(bname, cname, loanNo, amt) loanNo-->amt,bname Is not BCNF B.ramamurthy

  13. Solution • Decompose loan-info-scheme into • loan-schema (bname, loanNo, amt) • Borrower-schema (cnmae, loanNo) B.ramamurthy

  14. BCNF Decomposition result = {R} done = false Compute F+ while not done if there is a schema Ri that is not BCNF let a --> b be the non-trivial fn.dep in Ri s.t. a --> Ri is not in F+ and ab =0 result = (result -Ri)  (Ri-b)  (a,b) else done=true; B.ramamurthy

  15. Lossless Decomposition • If R is split into R1 and R2, for the decomposition to be loss less the at least one of the two should hold: R1  R2 --> R1 R1  R2 --> R2 B.ramamurthy

  16. Dependency Preservation • When a relational schema D defined by functional dependency F is decomposed into Ri {i = 1,2,3 ..n}, each functional dependency should be testable by at least one of Ri. • Formally, let F+ be the closure F and let F’+ be the closure of dependencies covered by Ri. F+ == F+’ for dependency preservation. B.ramamurthy

  17. 3NF : Third Normal Form • BCNF is quite stringent that sometimes it may not be possible to cover all the functional dependencies after decomposition. • 3NF solves this problem by relaxing the BCNF requirements as follows: B.ramamurthy

  18. 3NF : Definition • A relation schema R is in BCNF with respect to a set of functional dependencies if for all functional dependencies in F+ of the form a-->b at least one of the following holds, • a-->b is a trivial functional dependency (that is b is subset of a) • a is superkey for schema R • Each attribute A in b-->a is contained in a candidate key for R. B.ramamurthy

  19. 3NF : Algorithm Let Fc be the canonical cover of F. j = 0; for each dependency a-->b if none of schemes in Ri contains ab then j = j+1; Rj = ab; if none of the schemas contains a candidate key for R then j = j+ 1; Rj = any candidate key for R; return (Rj) B.ramamurthy

  20. Examples : 7.2, 7.3, 7.8 B.ramamurthy

  21. Functional Dependencies (review) • A functional dependency is a relationship between attributes • Examples: • Student ID determines Major: StuID => Major • StuID => (Name, Major) • (StuID, Course) => Grade • Attribute(s) to the left of arrow called determinant(s) B.ramamurthy

  22. Anomalies in a Relation (review) • An anomaly is a weakness in the way a relation is set up • Assume a single table representing student-courses (!) • Consider the following table: • IdNameMajorCourse-IdCourse-Desc 45 Meyer CS CSE421 Operating Systems 56 Beaver CE CSE462 Database Concepts 23 Teller CE CSE506 Architecture • This table has all three anomalies: insert, update, and delete B.ramamurthy

  23. Insert, Update and Delete Anomalies (review) • Insertanomaly is caused when a new course needs to be inserted that is not registered by a student • Updateanomaly caused when Major is updated in one row only for Id 45. • Deleteanomaly caused when Id 56 removed from the table; we lose a CSE course description • Anomalies are avoided by splitting the offending relation into multiple relations (decomposition) B.ramamurthy

  24. Fourth Normal Form (4NF) • A multi-valued dependency exists when • there are at least three attributes A, B, and C in a relation and • for each value of A there is a well-defined set of values for B, and a well-defined set of values for C, • but the set of values of B is independent of set C • A relation is in 4NF if it is already in 3NF and has no multi-valued dependencies • Every possible combination of the two multi-valued attributes have to be stored in the database thus leading to redundancy and consequent anomalies B.ramamurthy

  25. Fourth Normal Form (4NF) - Example • Course-IdInstructor TextbookMGS404 Clay HansenMGS404 Clay KroenkeMGS404 Drake HansenMGS404 Drake Kroenke • By placing the multi-valued attributes in tables by themselves, we can convert the above to 4NF • Change to:COURSE-INST (Course-Id, Instructor)COURSE-TEXT (Course-Id, Textbook) B.ramamurthy

  26. Normal Forms 5NF 4NF BCNF 3NF 2NF 1NF B.ramamurthy

  27. Homework #3 :Due date 4/11 in class : Hardcopy 1. Consider a scheme R (A,B,C,D,E) and the two set of functional dependencies. Are the two sets F1 and F2 equivalent? F1 : A-->B, AB--> C, D--> AC, D --> E F2 : A--> BC , D--> AE 2. Consider the schema R (A,B,C,D,E,F). AB--> C, C --> A, BC -->D, AC D--> B BE-->C, CE-->FA, CF-->BD, D-->EF a. Find the closure. b. Find the set candidate keys. Show all the steps. B.ramamurthy

More Related