1 / 48

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design. Functional Dependency Theory Decomposition Using Functional Dependencies Normal Forms Boyce-Codd Normal Form Database-Design Goals. The Banking Schema. branch = ( branch_name , branch_city , assets )

meris
Download Presentation

Chapter 7: Relational Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 7: Relational Database Design

  2. Chapter 7: Relational Database Design • Functional Dependency Theory • Decomposition Using Functional Dependencies • Normal Forms • Boyce-Codd Normal Form • Database-Design Goals

  3. The Banking Schema • branch = (branch_name, branch_city, assets) • customer = (customer_id, customer_name, customer_street, customer_city) • loan = (loan_number, amount) • account = (account_number, balance) • employee = (employee_id. employee_name, telephone_number, start_date) • dependent_name = (employee_id, dname) • account_branch = (account_number, branch_name) • loan_branch = (loan_number, branch_name) • borrower = (customer_id, loan_number) • depositor = (customer_id, account_number) • cust_banker = (customer_id, employee_id, type) • works_for = (worker_employee_id, manager_employee_id) • payment = (loan_number, payment_number, payment_date, payment_amount) • savings_account = (account_number, interest_rate) • checking_account = (account_number, overdraft_amount)

  4. Functional Dependencies • Let R be a relation schema   R and   R • The functional dependency  holds onR if and only if for any legal relations r(R), whenever any two tuples t1and t2 of r agree on the attributes , they also agree on the attributes . That is, t1[] = t2 []  t1[ ] = t2 [ ] • Example: Consider r(A,B ) with the following instance of r. • On this instance, AB does NOT hold, but BA does hold. • 4 • 1 5 • 3 7

  5. Functional Dependencies • Constraints on the set of legal relations. • Require that the value for a certain set of attributes determines uniquely the value for another set of attributes. • A functional dependency is a generalization of the notion of a key. • Ex) A  C C  A : not satisfied AB  D • Trivial functional dependency if it is satisfied by all relations. Ex) A → A, AB → A In general, α → β is trivial if β ⊆ α.

  6. Functional Dependencies • K is a superkey for relation schema R if and only if K R • K is a candidate key for R if and only if • K R, and • for no   K,  R • Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema: bor_loan = (customer_id, loan_number, amount ). We expect this functional dependency to hold: loan_numberamount but would not expect the following to hold: amount customer_id

  7. Use of Functional Dependencies • We use functional dependencies to: • test relations to see if they are legal under a given set of functional dependencies. • If a relation r is legal under a set F of functional dependencies, we say that rsatisfies F. • specify constraints on the set of legal relations • We say that Fholds onR if all legal relations on R satisfy the set of functional dependencies F.

  8. Functional Dependencies • A functional dependency is trivial if it is satisfied by all instances of a relation • Example: • customer_name, loan_number customer_name • customer_name customer_name • In general,   is trivial if   

  9. Closure of a Set of Functional Dependencies • Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F. • For example: If AB and BC, then we can infer that AC • The set of all functional dependencies logically implied by F is the closure of F. • We denote the closure of F by F+. • F+ is a superset of F.

  10. Functional-Dependency Theory • We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies.

  11. Closure of a Set of Functional Dependencies • Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F. • For example: If AB and BC, then we can infer that A C • The set of all functional dependencies logically implied by F is the closure of F. • We denote the closure of F by F+. • We can find all ofF+by applying Armstrong’s Axioms: • if   , then   (reflexivity) • if  , then    (augmentation) • if  , and   , then   (transitivity) • These rules are • sound (generate only functional dependencies that actually hold) and • complete (generate all functional dependencies that hold).

  12. Example • R = (A, B, C, G, H, I)F = { A BA CCG HCG IB H} • some members of F+ • A H • by transitivity from A B and B H • AG I • by augmenting A C with G, to get AG CG and then transitivity with CG I • CG HI • by augmenting CG I to infer CG  CGI, and augmenting of CG H to inferCGI HI, and then transitivity

  13. Closure of Functional Dependencies • We can further simplify manual computation of F+ by using the following additional rules. • If   holds and  holds, then    holds (union) • If    holds, then   holds and  holds (decomposition) • If   holds and   holds, then   holds (pseudotransitivity) The above rules can be inferred from Armstrong’s axioms.

  14. Closure of Functional Dependencies • EX) R = (A, B, C, G, H, I ) F = {A B, A C, CG H, CG I, B H} • Some members of F+: - A H (by transitivity rule) - AG I (by pseudotransitivity rule) - CG HI (by union rule)

  15. Combine Schemas? • Suppose we combine borrowerand loan to get bor_loan = (customer_id, loan_number, amount ) • Result is possible repetition of information (L-100 in example below) • Amount: repetition; Bad!!

  16. A Combined Schema Without Repetition • Consider combining loan_branch and loan loan_amt_br = (loan_number, amount, branch_name) • No repetition (as suggested by example below): Good!!

  17. What About Smaller Schemas? • Suppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan? • Write a rule “if there were a schema (loan_number, amount), then loan_number would be a candidate key” • Denote as a functional dependency: loan_numberamount • In bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan. • Not all decompositions are good. Suppose we decompose employee into employee1 = (employee_id, employee_name) employee2 = (employee_name, telephone_number, start_date) • The next slide shows how we lose information -- we cannot reconstruct the original employee relation -- and so, this is a lossy decomposition.

  18. A Lossy Decomposition

  19. Decomposition • Let R be a relation schema. A set of relation schemas {R1,R2, …Rn} is a decomposition of R if R = R1 R2  …Rn • Let r be a relation an schema R, and let ri = Ri(r) for r=1~n. That is, {r1,r2, …rn} is the database that results from decompositing R into {R1,R2, …Rn}. • It is always the case that r  r1 r2 … rn In general, r  r1 r2 … rn • A relation is legal if it satisfies all rules, or constraints, that we impose an our DB.

  20. Decomposition • Let C represent a set of constraints on database. A decomposition {R1,R2, …Rn} of a relation schema R is a lossless-join decomposition for R if, for all relations r on schema R that are legal under C, r = R1(r) R2(r) … Rn(r)

  21. Lossless-join Decomposition • For the case of R = (R1, R2), we require that for all possible relations r on schema R r = R1(r ) R2(r ) • A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+: • R1 R2R1 • R1 R2R2

  22. Example • R = (A, B, C)F = {A B, B C) • Can be decomposed in two different ways • R1 = (A, B), R2 = (B, C) • Lossless-join decomposition: R1  R2 = {B}and B BC • Dependency preserving • R1 = (A, B), R2 = (A, C) • Lossless-join decomposition: R1  R2 = {A}and A  AB • Not dependency preserving (cannot check B C without computing R1 R2)

  23. 정규화(Normalization) • 정규화(Normalization) - 데이터베이스의 릴레이션이나 튜플을 불필요한 중복 없이, 또한 수정할 때 의도하지 않았던 불필요한 사항이 추가, 삭제, 변경되는 일이 없도록 재구성하는 과정을 공식화하는 방법. • 정규형(Normal Form) - Codd's Normalization definitions • 1NF -> 2NF -> 3NF(BCNF(Boyce/Codd) • 4NF 5NF(PJ/NF)

  24. Universe of relations ( normalized and unnormalized ) 1NF relations ( normalized relations ) 2NF relations 3NF relations BCNF 4NF relations PJ/NF ( 5NF) relations 정규형(Normal Form)의 예

  25. Fully Functional Dependency(FFD) • Fully Functional Dependency(FFD) - If and only if Attr. Y is functional dependent on Attr. X and is not functional dependent on any proper subset of X.(그림 3참조)

  26. Tables

  27. PNAME COLOR S# QTY P# WEIGHT P# CITY S# STATUS SNAME CITY 그림 3: 릴레이션 S, P, SP와 그들의 functional dependences

  28. S# STATUS QTY P# CITY 제 1,2,3 정규형(Normal Forms) • The 1STNormal Form(1STNF) - if and only if all underlying domains contain atomic values only (any normalized relation) 그림 4: FIRST 릴레이션과 Functional Dependences

  29. 1,2,3 Normal Forms • 그림 4의 FIRST( S#, STATUS, CITY, P#, QTY ) - Redundancy(중복): 모든 attributes in FIRST → redundancies => update anomalies 발생 • Anomalies의 예 - Insert: (S5,Athens) insert시 primary key가 존재하지 않음. insert 불가능 - Delete: delete a tuple with (S3,P2) → (S3, 10, Paris) also lost - Update: S1 move from London to Amsterdam => find every tuples related with S1 and change London to Amsterdam. → redundancies => update anomalies 발생

  30. 1,2,3 Normal Forms • Solution of the above anomalies - FIRST -> SECOND( S#, STATUS, CITY)와 SP(S#, P#, QTY)로 분리: 그림 5. - Insert: Primary Key S# 통해 (S5, Athens)의 insert가능 - Delete:(S3, P2, 200) 삭제, (S3, 10, Paris) 는 SECOND내 존재 - Update : (S1, London->Amsterdam) 가능 => 따라서 그림 5는 : overcomes all the above problems : non-fully functional dependencies를 제거

  31. STATUS S# CITY a ) S# QTY P# b ) 1,2,3 Normal Forms 그림5: SECOND, SP릴레이션과 그들의 Functional Dependences

  32. 1,2,3 Normal Forms • The 2NDNormal Form - 2NDNF: if and only if 1STNF, every nonkey attribute is fully dependent on Primary key - SECOND: primary key ( S#)와 SP: primary key (S#, P#)는 모두 2NF이다. - FIRST : 1NF, not 2NF → An equivalent collection of 2NFrelation (by suitable projection) → Reduction processing: by projection - SECOND, SP는 문제가 아직도 발생함

  33. STATUS S# CITY 1,2,3 Normal Forms 그림 6: SECOND릴레이션과 Transitive dependency S#와 STATUS는 FD이지만, S#와 STATUS는 CITY를 통하여 Transitive Dependency이다. - Anomalies in a relation SECOND의 예 (caused by dependence transitive ) → Insert ( null, 50, ROME ): primary Key ? → Delete ( S5, Athens ) : S#를 Athens에대한 status 정보(30)까지 손실됨

  34. 1,2,3 Normal Forms →Update : Update the status for London from 20 to 30 => inconsistent result 초래 • Solution of anomalies - SECOND → SC(S#, CITY) → CS(CITY, STATUS) - overcome all above problems - eliminate the transitive dependence 그림 7: SC와 CS 릴레이션들

  35. 1,2,3 Normal Forms • The 3RDNormal Form - 3RDNF : if and only if 2NDNF, every nonkey attributes is nontransitivelydependent on the primary key. - SC, CS: 3NF( S# 와 CITY : primary keys ) SECOND: 3NF아님.

  36. Goal — Devise a Theory for the Following • Decide whether a particular relation R is in “good” form. • In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that • each relation is in good form • the decomposition is a lossless-join decomposition • Our theory is based on: • functional dependencies

  37. Dependency Preservation Dependency Preservation • Let F be a set of functional dependency on R and let R1,R2, …Rn be a decomposition of R. The restriction of F to Ri is the set Fi of all functional dependency in F+ that include only attributes of Ri • Ask whether testing only the restrictions is sufficient? • Let F  = F1 F2 …  Fn In general, F F However, even if F F , it may be that F+= F + • We say that a decomposition having the property F+=F + is a dependency preserving decomposition

  38. BCNF not BCNF Boyce-Codd Normal Form • A relation schema R is in BCNF if for all     F+, at least one of the following holds:   is a trivial functional dependency(that is,   )   is a superkey for R • Ex) customer(c-name, c-street, c-city) c-name  c-street c-city Branch(b­name, b­city, assets) b-name  b-city assets Loan­info(b­name, c­name, l­no, amount) l-no  amount b-name

  39. Boyce-Codd Normal Form Loan(b­name, l­no, amount) Borrower(c-name, l-no) l-no  amount b-name • It is a lossless-join decomposition R1 R2 = {l-no}  b­name amount  F l­no  b­name l-no amount  F+ • It is a BCNF l-no: candidate key forLoan

  40. Boyce-Codd Normal Form • Not every BCNF decomposition is dependency preserving • Counter-example: Banker= (branch­name, c­name, banker-name) banker­name  branch­name branch­name c­name  banker­name - Banker is not in BCNF(∵banker-name is not a superkey) - BCNF decomposition: Banker-branch = (banker­name, branch­name) Customer-banker = (c­name, banker­name) It is lossless-join decomposition

  41. Boyce-Codd Normal Form But not dependency preserving Fbanker-branch = {banker­name  branch­name} Fcutomer-banker =  F = Fbanker-branchFcutomer-banker F = {banker­name  branch­name, branch­name c­name  banker­name} ∴ F+F+ , so not dependency preserving

  42. Boyce-Codd Normal Form Example schema not in BCNF: bor_loan = ( customer_id, loan_number, amount ) because loan_numberamount holds on bor_loan but loan_number is not a superkey

  43. Decomposing a Schema into BCNF • Suppose we have a schema R and a non-trivial dependency causes a violation of BCNF. We decompose R into: • (U ) • ( R - (  -  ) ) • In our example, •  = loan_number •  = amount and bor_loan is replaced by • (U ) = ( loan_number, amount ) • ( R - (  -  ) ) = ( customer_id, loan_number )

  44. Goals of Normalization • Let R be a relation scheme with a set F of functional dependencies. • Decide whether a relation scheme R is in “good” form. • In the case that a relation scheme R is not in “good” form, decompose it into a set of relation scheme {R1, R2, ..., Rn} such that • each relation scheme is in good form • the decomposition is a lossless-join decomposition • Preferably, the decomposition should be dependency preserving.

  45. Example • R = (A, B, C )F = {AB B  C}Key = {A} • R is not in BCNF (B  C but B is not superkey) • Decomposition R1 = (A, B), R2 = (B, C) • R1and R2 in BCNF • Lossless-join decomposition • Dependency preserving

  46. Comparison of BCNF and 3NF • It is always possible to decompose a relation into a set of relations that are in 3NF such that: • the decomposition is lossless • the dependencies are preserved • It is always possible to decompose a relation into a set of relations that are in BCNF such that: • the decomposition is lossless • it may not be possible to preserve dependencies.

  47. Design Goals • Goal for a relational database design is: • BCNF. • Lossless join. • Dependency preservation. • If we cannot achieve this, we accept • 3NF. • Lossless join. • Dependency preservation.

  48. End of Chapter 7

More Related