1 / 31

Database Design and Normal Forms

Database Design and Normal Forms. why normal forms? - format standardization (1NF) - reduction/elimination of redundancies (2NF, 3NF, ...) theoretical tool for improving/maintaining database design quality in practice, however: redundancy vs. efficiency

Download Presentation

Database Design and Normal Forms

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Design and Normal Forms • why normal forms? • - format standardization (1NF) • - reduction/elimination of redundancies (2NF, 3NF, ...) • theoretical tool for improving/maintaining database design quality • in practice, however: redundancy vs. efficiency • - redundant data may lead to inconsistencies after updates • - but useful for efficiency reasons • (shorter response times) • tradeoff problem: to be decided case by case O. Günther: Database Management Systems

  2. 1st Normal Form (1NF) • all attributes have to be atomic • no „repeating groups“ • important foundation of the relation model • but: may lead to increased redundancy • Ex.: relation Supplies (a) not in 1NF (b) in 1NF repeating groups O. Günther: Database Management Systems

  3. 1NF + for all attributes A and attribute sets X in relation R: • X  A in R X is no real subset of at least one key of R • AND  OR • A not in X A is key attribute (i.e., it belongs to at least one key of R) • note: if R has only one key, this is equivalent to: • 1 NF + each non-key attribute is fully functionally dependent on the key, • i.e., it can not be inferred from part of the key • trivially true for one-column keys • Ex.: relation Supplies • - Supplies (Name, Product, Price) is in 2NF if and only if Price • depends on both Name and Product (free pricing) • - with fixed prices (e.g. books in Germany), Supplies is no longer in 2NF • - possibly decomposition into Supplies’ (Name, Product) and • Costs (Product, Price) 2nd Normal Form (2NF) O. Günther: Database Management Systems

  4. 2NF + for all attributes A and attribute sets X in relation R: • X is a key of R • X  A in R OR • AND  X contains a key of R • A not in X OR • A is a key attribute • note: if there is only one key, this is equivalent to: • 2NF + non-key attributes are mutually independent • sufficient (but not necessary) condition:: • if an FD in the minimal cover contains all attributes of R then R is in 3NF • Ex.: relation Customers (Name, Address, Balance) • - all attributes atomic  1NF • - keys have only one column  2NF • - Address and Balance are mutually independent  3NF 3rd Normal Form (3NF) O. Günther: Database Management Systems

  5. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: • key attributes of R: • 1NF • 2NF • 3 NF O. Günther: Database Management Systems

  6. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: CS, ZS • key attributes of R: • 1NF • 2NF • 3 NF O. Günther: Database Management Systems

  7. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: CS, ZS • key attributes of R: C, S, Z • 1NF • 2NF • 3 NF O. Günther: Database Management Systems

  8. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: CS, ZS • key attributes of R: C, S, Z • 1NF: no problem • 2NF • 3 NF O. Günther: Database Management Systems

  9. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: CS, ZS • key attributes of R: C, S, Z • 1NF: no problem • 2NF: o.k. because Z and C are key attributes • 3 NF O. Günther: Database Management Systems

  10. 3NF - An Example • relation R = (C, S, Z) • functional dependencies: F = { CS Z, Z  C} • R in 3NF? • keys of R: CS, ZS • key attributes of R: C, S, Z • 1NF: no problem • 2NF: o.k. because Z and C are key attributes • 3 NF: o.k. for the same reason O. Günther: Database Management Systems

  11. Decompositon into 3NF • given: relation R, set of FD's F • find: decomposition of R into a set of 3NF relations Ri • algorithm: • IF R in 3NF • THEN stop • ELSE • compute minimal cover F of F; • create a separate relation Ri = A for each attribute Athat does not • occur in any FD in F; • create a relation Ri = XA for each FD X A in F; • if the key K of R does not occur in any relation Ri, create one • more relation Ri = K. • decomposition fulfills • - lossless join • - preservation of dependencies O. Günther: Database Management Systems

  12. Decomposition into 3NF - Example Attributes: L ... Lecture R ... Room I ... Instructor S ... Student T ... Time G ... Grade Relational Schema: R= (L, I, T, R, S, G) Functional Dependencies: O. Günther: Database Management Systems

  13. Decomposition into 3NF - Example (cont.) Attributes: L ... Lecture R ... Room I ... Instructor S ... Student T ... Time G ... Grade Relational Schema: R= (L, I, T, R, S, G) Functional Dependencies: F = { L  I , TR  L, TI  R, LS  G, TS  R, TRI  LR} O. Günther: Database Management Systems

  14. Decomposition into 3NF - Example (cont.) • Keys: • Key Attributes: O. Günther: Database Management Systems

  15. Decomposition into 3NF - Example (cont.) • Keys: ST • Key Attributes: S, T O. Günther: Database Management Systems

  16. Decomposition into 3NF - Example (cont.) • F = { L  I , • TR  L, • TI  R, • LS  G, • TS  R, • TRI  LR} • Minimal Cover • Decomposition into Ri O. Günther: Database Management Systems

  17. Decomposition into 3NF - Example (cont.) • F = { L  I , • TR  L, • TI  R, • LS  G, • TS  R, • TRI  LR} • Minimal Cover • F = {L  I , • TR  L, • TI  R, • LS  G, • TS  R} • Decomposition into Ri O. Günther: Database Management Systems

  18. Indices • data structures (often tree structures) that serve to accelerate database searches • frequent synonyms: index structures, access methods • Ex.: Supplies (Name, Product, Price) O. Günther: Database Management Systems

  19. Indices (cont.) • Name and Product are the indexed columns • Index on Name is primary index • - indexed column is part of the primary key • - relation is sorted by increasing primary key • - well suited for processing range queries (Ex.: Find all suppliers • whose name starts with B, C or D) • all other indices: secondary indices • tradeoff: queries vs. updates • - indices accelerate many queries ... • - ... but slow down updates O. Günther: Database Management Systems

  20. Dense vs. Sparse Indices • relations are stored in blocks (pages) on the magnetic disk • crucial cost factor: how many blocks to I have to transfer from disk • to main memory in order to answer the query? • non-dense (or sparse) index: one index entry per block • - for a primary index it suffices to store the smallest key value per block • - index supports the system when looking for the relevant block(s) • - inside each block: local search (cf. telephone directory) • - useful for large relations because very compact • - only possible for columns according to which the relation • has been sorted (cf. phone directory) • - therefore: at most one sparse index per relation • dense index: one index entry per tuple O. Günther: Database Management Systems

  21. How Does a Disk Access Work? Disk Drive Read block Main Memory Write block O. Günther: Database Management Systems

  22. Indexon Name (sparse) Index on Product (dense) Dense vs. sparse indices: An Example Price Oysters Peanuts Oysters Peanuts Lettuce Lettuce O. Günther: Database Management Systems

  23. Layered Indices • large relations  large indices • indexing a larger index leads to a smaller index etc. • tree structure • root fits on one page (= one block) Index (often dense) File (Relation) O. Günther: Database Management Systems

  24. B+ Tree • tree structure as described above A O. Günther: Database Management Systems

  25. B+ Tree(cont.) • B+ trees are balanced (i.e., all leaves are on the same level) • lowest level (leaves): dense, otherwise : sparse • each node fits on one page (  N entries) • N = page size / space requirements per entry (Ex. above: N = 3) • minimal page utilization (guaranteed): N/2 entries O. Günther: Database Management Systems

  26. B+ Tree (cont.) • each node has between N/2 and N entries • problems: overflow, underflow • Ex.: N = 3 A O. Günther: Database Management Systems

  27. B+ Tree (cont.) O. Günther: Database Management Systems

  28. B+Baum (cont.) O. Günther: Database Management Systems

  29. Hashing - An Alternative to Indices • hash function h: • data value  storage address • Ex.: storage address = data value MOD p • (p typically a prime number) • Ex.: p = 13 Hash Field O. Günther: Database Management Systems

  30. Hashing (cont.): Storage Structure • only one hash field per relation! • advantage: very fast access • disadvantage: • - relation dispersed across the disk • - collisions O. Günther: Database Management Systems

  31. Hashing (cont.): Collision Chains O. Günther: Database Management Systems

More Related