1 / 16

Decomposition and Higher Forms of Normalization

Decomposition and Higher Forms of Normalization. Chapter 15.8 (skim). "Lossless" Joins. The main idea: if you decompose a relation schema, then join the parts of an instance via a natural join, you might get more rows than you started with, i.e., spurious tuples This is bad!

vhorton
Download Presentation

Decomposition and Higher Forms of Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Decomposition and Higher Forms of Normalization Chapter 15.8 (skim)

  2. "Lossless" Joins • The main idea: if you decompose a relation schema, then join the parts of an instance via a natural join, you might get more rows than you started with, i.e., spurious tuples • This is bad! • Called a "lossy join". • Goal: decompositions which produce only "lossless" joins • "non-additive" join is more descriptive

  3. Preserving FDs • What if, when a relation is decomposed, the X of an XY ends up only in one of the new relations and the Y ends up only in another? • Such a decomposition is not “dependency-preserving.” • Goal: Always have FD-preserving decompositions

  4. Decomposing R Into BCNF • Pick an X-> A which violates BCNF, where A is a single attribute. • Decompose so that R1={X,A} is in one relation, R2=R-A is in the other. • Repeat if necessary on R2. • Guaranteed lossless... but may lose dependencies • SBD with SB->D, D->B

  5. Decomposing R into 3NF • The algorithm is much more complicated • 1. Get a “minimal cover” of FDs • 2. Find a lossless-join decomposition of R (which might miss dependencies) • 3. Add additional relations to the decomposition to cover any missing FDs of the cover • Result will be lossless, will be dependency-preserving 3NF; might not be BCNF

  6. Fact of life... Finding a decomposition which is both lossless and dependency-preserving is not always possible.

  7. Contracts schema CSJDPQV Contracts (cid, supplierid, projectid, deptid, partid, qty, value) cid is a key; JP -> C; SD -> P What are candidate keys? BCNF, 3NF decompositions? What if J->S is added?

  8. Multivalued Dependencies (MVDs) • XY means that given X, there is a unique set of possible Y values (which do not depend on other attributes of the relation) • PARENTNAMECHILDNAME • An FD is also a MVD • MVD problems arise if there are two independent 1:N relationships in a relation.

  9. Textbook Example (3-way pun) • course, teacher, book • Each teacher teaches a set of courses • teacher  courses • Each course has a recommended set of books (independent of teacher) • course  book • No FDs!

  10. MVD Theory • A whole theory exists, analogous to FD theory • Armstrong’s axioms plus 5 additional rules

  11. Fourth Normal Form • A relation R is in 4NF if for every nontrivial XY, X is a superkey of R. • Decomposition into 4NF: If there is a non-trivial XY, form one relation with only X and Y, and another with R-Y. • In other words, allow only one MVD per relation • This will be lossless, but not necessarily FD-preserving. • Achieving 4NF is a trade-off

  12. Fifth Normal Form • Sometimes a relation cannot be losslessly decomposed into two relations, but can be into three or more. • 5NF captures the idea that a relation scheme must have some particular lossless decomposition ("join dependency"). • Finding actual 5NF cases is difficult.

  13. Normalization Summary • 1NF: usually part of the woodwork • even so, know how to decompose • 2NF: usually skipped • but lots of defs. that make great exam Q's! • 3NF: a biggie • Always aim for this • BCNF and 4NF: tradeoffs start here • in re: d-preserving and losslessness • 5NF: You can say you've heard of it...

  14. Caveat • Normalization is not the be-all and end-all of DB design • Example: suppose attributes A and B are always used together, but normalization theory says they should be in different tables. • Decomposition might produce unacceptable performance loss (extra disk reads) • Plus -- there are constraints other than FDs and MVDs

  15. Looking Ahead: One Query Is Usually Not Enough • A meaningful unit of DB work might involve doing one or more queries, checking constraints, performing calculations, interacting with user, etc. • Such a unit is termed a "transaction" • "Transaction Processing" will be our next topic

  16. Current Trends • Object DBs and Object-Relational DB’s • May permit complex attributes • 1st normal form unnecessary • Data Warehouses • Huge historical databases, seldom or never updated after creation • Joins expensive or impractical • Argues against normalization • Every-day Relational DBs • Aim for BCNF, settle for 3NF

More Related