Relation Decomposition
Given a relation R with attributes $A_1, A_2, \ldots, A_n$, create two relations R1 and R2 with attributes $B_1, B_2, \ldots, B_m$ and $C_1, C_2, \ldots, C_l$ such that:
$\{B_1, B_2, \ldots, B_m\} \cup \{C_1, C_2, \ldots, C_l\} = \{A_1, A_2, \ldots, A_n\}$
Can we always decompose a relation with no dire consequences? When we see a BCNF violation, how do we decompose?
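A minimal Python sketch of the definition above, assuming a relation is modeled as a list of dicts and that R1 and R2 are obtained by projecting R onto the two attribute sets. The helper `project` and the sample data are illustrative and not from the slides.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

# Hypothetical relation R(A1, A2, A3)
R = [
    {"A1": 1, "A2": "x", "A3": 10},
    {"A1": 1, "A2": "x", "A3": 20},
    {"A1": 2, "A2": "y", "A3": 10},
]
B = ["A1", "A2"]  # attributes of R1
C = ["A1", "A3"]  # attributes of R2

# The two attribute sets together must cover all attributes of R.
covers_all = set(B) | set(C) == {"A1", "A2", "A3"}

R1 = project(R, B)
R2 = project(R, C)
```

Sharing `A1` between the two attribute lists is a choice made for this example; the definition itself only requires that the union cover all of R's attributes.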
Wild Decompositions
A relation with attributes (Name, Address, Move-Date) is decomposed into (Name, Address) and (Name, Move-Date). What’s wrong?
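One way to see what goes wrong is to rejoin the two pieces. The sketch below uses made-up rows (the slide gives no data) and hypothetical helpers `project` and `natural_join`; it shows that the join can contain tuples that were never in the original relation.

```python
def project(rel, schema, attrs):
    """Project a set of tuples with the given schema onto attrs."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in rel}

def natural_join(r1, s1, r2, s2):
    """Natural join of two sets of tuples; returns (tuples, schema)."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    out = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                out.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return out, s1 + extra

schema = ["Name", "Address", "MoveDate"]
# Hypothetical data: Fred lived at two addresses.
R = {
    ("Fred", "12 Oak St", "1995-03-01"),
    ("Fred", "9 Elm Ave", "1997-08-15"),
}
R1 = project(R, schema, ["Name", "Address"])
R2 = project(R, schema, ["Name", "MoveDate"])

joined, joined_schema = natural_join(R1, ["Name", "Address"], R2, ["Name", "MoveDate"])
spurious = joined - R  # tuples that pair an address with the wrong move date
```

With this data the join pairs every address with every move date, so the decomposition loses the information about which date belongs to which address.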
Not This Example, Again...
SSN         Name  Phone Number
123-321-99  Fred  (201) 555-1234
123-321-99  Fred  (206) 572-4312
909-438-44  Joe   (908) 464-0028
909-438-44  Joe   (212) 555-4000
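More Careful Strategy
Given a dependency that violates the BCNF condition:
$A_1, A_2, \ldots, A_m \to B_1, B_2, \ldots, B_n$
Decompose the relation into two relations R1 and R2 that overlap on the A’s: one holds the A’s together with the B’s, the other holds the A’s together with all the other attributes (Others). Continue until there are no BCNF violations left.
Find a 2-attribute relation that is not in BCNF.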
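The strategy above can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the slides' own code: FDs are pairs of frozensets, violations are found by brute force over attribute subsets (exponential, fine for small schemas), and the B's are taken to be everything the A's determine. The FD `SSN -> Name` for the SSN/Name/Phone example is assumed.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds (a list of (lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_violation(schema, fds):
    """Return (X, X+ restricted to schema) for some X violating BCNF, else None."""
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = frozenset(combo)
            x_plus = closure(x, fds) & schema
            # Nontrivial (x_plus != x) and X is not a super-key (x_plus != schema).
            if x_plus != x and x_plus != schema:
                return x, x_plus
    return None

def bcnf_decompose(schema, fds):
    """Split recursively until no piece has a BCNF violation."""
    schema = frozenset(schema)
    violation = find_violation(schema, fds)
    if violation is None:
        return [schema]
    x, x_plus = violation
    with_bs = x_plus                     # the A's together with the B's
    with_others = x | (schema - x_plus)  # the A's together with the others
    return bcnf_decompose(with_bs, fds) + bcnf_decompose(with_others, fds)

# Assumed FD for the example relation (SSN, Name, Phone): SSN -> Name
fds = [(frozenset({"SSN"}), frozenset({"Name"}))]
result = bcnf_decompose({"SSN", "Name", "Phone"}, fds)
```

Here `result` contains the two schemas {SSN, Name} and {SSN, Phone}. Each recursive call works on a strictly smaller schema, so the procedure terminates.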
Example Decomposition
Relation: (Name, Social-security-number, Age, Eye Color, Phone Number)
Functional dependencies: Name + Social-security-number -> Age, Eye Color
What if we also had an attribute Draft-worthy, and the FD: Age -> Draft-worthy?
A Problem
Relation: (Unit, Company, Product)
FD’s: unit -> company; company, product -> unit
Since unit -> company holds and unit alone is not a super-key, there is a BCNF violation, and we decompose into (Unit, Company) and (Unit, Product).
So What’s the Problem?
Unit       Company
Galaga99   UW
NullValue  UW

Unit       Product
Galaga99   databases
NullValue  databases

No problem so far. All local FD’s are satisfied. Let’s put all the data back into a single table again:

Unit       Company  Product
Galaga99   UW       databases
NullValue  UW       databases

Violates the dependency: company, product -> unit!
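The same check can be run mechanically. In this sketch the helpers `natural_join` and `satisfies` are hypothetical names; the rows are the ones shown on the slide.

```python
def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for t1 in r1:
        for t2 in r2:
            common = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in common):
                out.append({**t1, **t2})
    return out

def satisfies(rows, lhs, rhs):
    """True if the FD lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two rows agreeing on lhs must agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True

unit_company = [
    {"unit": "Galaga99", "company": "UW"},
    {"unit": "NullValue", "company": "UW"},
]
unit_product = [
    {"unit": "Galaga99", "product": "databases"},
    {"unit": "NullValue", "product": "databases"},
]

local_ok = satisfies(unit_company, ["unit"], ["company"])
joined = natural_join(unit_company, unit_product)
global_ok = satisfies(joined, ["company", "product"], ["unit"])
```

`local_ok` is true while `global_ok` is false: the FD company, product -> unit spans both tables, so neither table can enforce it alone.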
Solution: 3rd Normal Form
A simple condition for removing anomalies from relations:
A relation R is in 3rd normal form if and only if: whenever there is a nontrivial dependency $A_1, A_2, \ldots, A_n \to B$ for R, it is the case that $\{A_1, A_2, \ldots, A_n\}$ is a super-key for R, or B is part of a key.
What happened to first and second normal forms? Will we have more normal forms?
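As a rough illustration, the condition can be tested on the Unit/Company/Product relation from the earlier slides. The sketch assumes FDs are pairs of Python sets, finds candidate keys by brute force, and checks only the FDs passed in; the function names are invented for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if closure(x, fds) >= schema and not any(k <= x for k in keys):
                keys.append(x)
    return keys

def is_bcnf(schema, fds):
    """Every nontrivial FD must have a super-key on the left."""
    schema = set(schema)
    return all(rhs <= lhs or closure(lhs, fds) >= schema for lhs, rhs in fds)

def is_3nf(schema, fds):
    """Left side is a super-key, or every new right-side attribute is in some key."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue
        if not (rhs - lhs) <= prime:
            return False
    return True

schema = {"unit", "company", "product"}
fds = [({"unit"}, {"company"}), ({"company", "product"}, {"unit"})]
keys = candidate_keys(schema, fds)
```

For this schema the candidate keys are {company, product} and {product, unit}, so company is part of a key: the relation fails the BCNF test but passes the 3rd normal form test.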
Multi-valued Dependencies
Name  SSN         Phone Number    Course
Fred  123-321-99  (206) 572-4312  CSE-444
Fred  123-321-99  (206) 572-4312  CSE-341
Fred  123-321-99  (206) 432-8954  CSE-444
Fred  123-321-99  (206) 432-8954  CSE-341

The multi-valued dependencies are:
Name, SSN ->> Phone Number
Name, SSN ->> Course
4th Normal form: replace FD by MVD.
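A small Python check, using the rows from the slide, that splitting along the two multi-valued dependencies loses nothing: joining the two projections on (Name, SSN) gives back exactly the original four tuples. The variable names are illustrative.

```python
R = {
    ("Fred", "123-321-99", "(206) 572-4312", "CSE-444"),
    ("Fred", "123-321-99", "(206) 572-4312", "CSE-341"),
    ("Fred", "123-321-99", "(206) 432-8954", "CSE-444"),
    ("Fred", "123-321-99", "(206) 432-8954", "CSE-341"),
}

# Project onto (Name, SSN, Phone) and (Name, SSN, Course).
phones = {(name, ssn, phone) for (name, ssn, phone, course) in R}
courses = {(name, ssn, course) for (name, ssn, phone, course) in R}

# Natural join of the two projections on (Name, SSN).
rejoined = {
    (name, ssn, phone, course)
    for (name, ssn, phone) in phones
    for (name2, ssn2, course) in courses
    if (name, ssn) == (name2, ssn2)
}
```

The single four-row table stores every phone/course combination, while the two projections store each phone and each course once.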
Summary of This Section
• Functional dependencies: basic definition.
• Properties of functional dependencies.
• Entailment and equivalence of FD’s:
  • Compare what FD’s say about sets of databases.
• Algorithm for finding the closure of a set of attributes A:
  • Start with A, and add attributes implied by the FDs.
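The closure algorithm in the last bullet can be sketched as a fixed-point loop. The representation (FDs as pairs of Python sets) and the sample FDs are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already implied, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Hypothetical FDs: A -> B, B -> C, D -> E
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"D"}, {"E"})]
a_plus = closure({"A"}, fds)
```

Starting from {A}, the loop adds B (from A -> B) and then C (from B -> C); D -> E never fires because D is not in the set.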
Summary Continued
• Algorithm for entailment: does the set of FDs S entail the dependency A -> B?
  • Compute the closure of A. If it includes the attribute B, return Yes. Otherwise, return No.
• Algorithm for entailment of a set of FDs: does S entail S1?
  • Apply the previous algorithm for each FD in S1. Return Yes only if it’s Yes for all.
• Algorithm for equivalence: apply entailment in both directions.
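The three algorithms above build directly on attribute closure. A minimal sketch, assuming FDs are pairs of Python sets and allowing a set of attributes on the right-hand side (a small generalization of the single attribute B in the bullet); the function names and sample FDs are illustrative.

```python
def closure(attrs, fds):
    """Closure of attrs under fds (a list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def entails(fds, lhs, rhs):
    """Does fds entail lhs -> rhs? Check that rhs lies in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

def entails_all(fds, others):
    """Does fds entail every FD in others?"""
    return all(entails(fds, lhs, rhs) for lhs, rhs in others)

def equivalent(s, s1):
    """Two FD sets are equivalent if each entails the other."""
    return entails_all(s, s1) and entails_all(s1, s)

# Hypothetical FD sets
S = [({"A"}, {"B"}), ({"B"}, {"C"})]
S1 = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"})]
S2 = [({"A"}, {"C"})]
```

`S` and `S1` are equivalent because A -> C already follows from the other two FDs; `S` entails `S2`, but `S2` does not entail A -> B, so those two sets are not equivalent.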
More Summary
• Decompose a relation by identifying BCNF violations.
• You can weaken BCNF by considering 3rd normal form.
• If multi-valued dependencies are present, consider 4th normal form.
• Given a relational schema and a set of dependencies, we now know how to put it into normalized form.
• Do we always want to do that?
Summary Beyond This Section
• We built a conceptual model (in E/R or ODL):
  • We chose entities, relationships.
  • We articulated the constraints on the domain.
• We translated it into a relational schema in the proper way.
• What do we do with the data now that it’s in tables?
  • How do we query it?
  • How do we combine data from multiple tables?
Querying the Database
• How do we specify what we want from our database?
  Example: find all the employees who earn more than $50,000 and pay taxes in New Jersey.
• We design high-level query languages:
  • SQL (used everywhere)
  • Datalog (used by database theoreticians, their students, friends and family)
• Relational algebra: a basic set of operations on relations that provides the basic principles.
Relational Algebra at a Glance
• Operators: sets as input, new set as output
• Basic set operators:
  • union, intersection, difference, but no complement.
• Selection: σ
• Projection: π
• Cartesian product: ×
• Joins (natural, equi-join, theta join, semi-join)
• Renaming: ρ
Set Operations
• Binary operations
• Result is a table (set) with the same attributes
• Watch out for naming of attributes in the resulting relation.
• Union: all tuples in R1 or R2
• Intersection: all tuples in R1 and R2
• Difference: all tuples in R1 and not in R2
• No complement. Why?
• Bags later.
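With relations modeled as Python sets of tuples over the same attributes, the three operators map directly onto the built-in set operators. The relations below are made-up examples with attributes (name, age).

```python
# Two hypothetical relations over the same attributes (name, age)
R1 = {("Fred", 30), ("Joe", 25)}
R2 = {("Joe", 25), ("Sue", 41)}

union = R1 | R2         # all tuples in R1 or R2
intersection = R1 & R2  # all tuples in both R1 and R2
difference = R1 - R2    # all tuples in R1 and not in R2
```

Because these are sets, the tuple ("Joe", 25) appears only once in the union.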
Selection
• Produces the subset of the tuples in a relation that satisfy a given condition
• Unary operation: returns a set with the same attributes, but ‘selects’ rows
• Use and, or, not, >, <, … to build the condition
• Example: find all employees with salary more than $40,000.
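A minimal sketch of selection in Python. The relation name `Employee`, its attributes, and its rows are assumptions for illustration; in relational algebra the example query could be written as σ with the condition salary > 40000 applied to Employee.

```python
def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate; attributes are unchanged."""
    return [row for row in rows if predicate(row)]

# Hypothetical Employee relation
Employee = [
    {"ssn": "111-11-1111", "name": "Ann", "salary": 38000},
    {"ssn": "222-22-2222", "name": "Bob", "salary": 52000},
    {"ssn": "333-33-3333", "name": "Cy", "salary": 40000},
]

high_paid = select(Employee, lambda e: e["salary"] > 40000)
```

Conditions built with `and`, `or`, and `not` go inside the predicate, for example `lambda e: e["salary"] > 40000 and e["name"] != "Bob"`.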
Projection
• Unary operation, selects columns
• Eliminates duplicate tuples
• Example: project social-security number and names.
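A short sketch of projection, including the duplicate elimination mentioned above. The `Employee` rows and attribute names are invented for this example.

```python
def project(rows, attrs):
    """Projection: keep only attrs and eliminate duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

# Hypothetical Employee relation: Ann appears once per department.
Employee = [
    {"ssn": "111-11-1111", "name": "Ann", "dept": "Sales"},
    {"ssn": "111-11-1111", "name": "Ann", "dept": "Support"},
    {"ssn": "222-22-2222", "name": "Bob", "dept": "Sales"},
]

ssn_and_name = project(Employee, ["ssn", "name"])
```

Three input rows become two output rows, because the two rows for Ann agree on both projected attributes.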
Cartesian Product
• Binary operation
• Result of R1 × R2 is the set of tuples combining any element of R1 with any element of R2
• Schema is the union of Schema(R1) and Schema(R2)
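A sketch of the Cartesian product over lists of dicts. It assumes the two relations have disjoint attribute names and raises an error otherwise; how to handle a clash (for example by renaming first) is left open here. The relations are made up.

```python
from itertools import product

def cartesian(r1, r2):
    """Cartesian product: pair every tuple of r1 with every tuple of r2."""
    if r1 and r2 and set(r1[0]) & set(r2[0]):
        raise ValueError("attribute names clash; rename before taking the product")
    return [{**t1, **t2} for t1, t2 in product(r1, r2)]

# Hypothetical relations with disjoint attribute names
Employee = [{"name": "Ann"}, {"name": "Bob"}]
Dept = [{"dept": "Sales"}, {"dept": "Support"}, {"dept": "R&D"}]

pairs = cartesian(Employee, Dept)
```

The result has 2 × 3 = 6 tuples, each with the attributes of both inputs.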
Join (Natural)
• Most important, expensive and exciting.
• Combines two relations, selecting only related tuples
• Equivalent to a cross product followed by selection
• Resulting schema has all attributes of the two relations, but one copy of the join condition attributes
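The "cross product followed by selection" view can be written out directly. In this sketch (hypothetical relations and helper name) the selection keeps pairs that agree on all shared attributes, and merging the two dicts leaves one copy of those attributes.

```python
from itertools import product

def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for t1, t2 in product(r1, r2):                   # cross product
        common = set(t1) & set(t2)
        if all(t1[a] == t2[a] for a in common):      # selection on shared attributes
            out.append({**t1, **t2})                 # one copy of the shared attributes
    return out

# Hypothetical relations sharing the attribute "ssn"
Employee = [
    {"ssn": "111-11-1111", "name": "Ann"},
    {"ssn": "222-22-2222", "name": "Bob"},
]
Dependent = [
    {"ssn": "111-11-1111", "dname": "Kim"},
    {"ssn": "111-11-1111", "dname": "Lee"},
]

joined = natural_join(Employee, Dependent)
```

Bob has no matching tuple in `Dependent`, so he does not appear in the result. The nested loop over all pairs is the simplest way to express the definition, not an efficient join method.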
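Other Joins and Renaming
• Theta join: the join involves a predicate θ
  • $R \bowtie_{\theta} S$
• Semi-join: keeps the tuples of one relation that join with some tuple of the other; the result has only the first relation’s attributes.
• Renaming: gives a relation or its attributes new names.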
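A compact sketch of the three operators over lists of dicts. The function names, relations, and attribute names are assumptions for illustration, and `theta_join` assumes the two relations have disjoint attribute names.

```python
def theta_join(r1, r2, theta):
    """Theta join: pairs of tuples that satisfy the predicate theta(t1, t2)."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if theta(t1, t2)]

def semijoin(r1, r2):
    """Semi-join: tuples of r1 that agree with some tuple of r2 on shared attributes."""
    return [
        t1 for t1 in r1
        if any(all(t1[a] == t2[a] for a in set(t1) & set(t2)) for t2 in r2)
    ]

def rename(rows, mapping):
    """Renaming: replace attribute names according to mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]

# Hypothetical relations
Item = [{"item": "gizmo", "price": 20}, {"item": "phone", "price": 50}]
Budget = [{"buyer": "Joe", "limit": 30}]
Employee = [{"ssn": "1", "name": "Ann"}, {"ssn": "2", "name": "Bob"}]
Dependent = [{"ssn": "1", "dname": "Kim"}]

affordable = theta_join(Item, Budget, lambda i, b: i["price"] <= b["limit"])
with_dependents = semijoin(Employee, Dependent)
renamed = rename(Dependent, {"dname": "name"})
```

Renaming is what makes a later natural join or set operation line up on the intended attributes.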
Complex Queries
Product (name, price, category, maker)
Purchase (buyer, seller, store, product)
Company (name, stock price, country)
Person (name, phone number, city)
Find phone numbers of people who bought gizmos from Fred.
Find telephony products that somebody bought.
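One way the two queries might be evaluated, sketched with Python comprehensions over made-up rows. The sketch assumes that Purchase.product and Purchase.buyer hold the names of a Product and a Person, which the schema suggests but does not state.

```python
# Hypothetical sample data for the schema above
Product = [
    {"name": "gizmo", "price": 19.99, "category": "gadgets", "maker": "GizmoWorks"},
    {"name": "cordless phone", "price": 49.00, "category": "telephony", "maker": "PhoneCo"},
    {"name": "fax machine", "price": 99.00, "category": "telephony", "maker": "PhoneCo"},
]
Purchase = [
    {"buyer": "Joe", "seller": "Fred", "store": "Wiz", "product": "gizmo"},
    {"buyer": "Sue", "seller": "Ann", "store": "Wiz", "product": "gizmo"},
    {"buyer": "Sue", "seller": "Fred", "store": "Crazy", "product": "cordless phone"},
]
Person = [
    {"name": "Joe", "phone number": "555-0101", "city": "Seattle"},
    {"name": "Sue", "phone number": "555-0102", "city": "Portland"},
]

# Select purchases of gizmos sold by Fred, join with Person on buyer = name,
# then project the phone number.
phones = {
    person["phone number"]
    for purchase in Purchase
    for person in Person
    if purchase["product"] == "gizmo"
    and purchase["seller"] == "Fred"
    and purchase["buyer"] == person["name"]
}

# Select telephony products, join with Purchase on name = product,
# then project the product name.
bought_telephony = {
    product["name"]
    for product in Product
    for purchase in Purchase
    if product["category"] == "telephony" and purchase["product"] == product["name"]
}
```

Each comprehension follows the same select, join, project pattern as the corresponding relational algebra expression.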
Exercises
Product (name, price, category, maker)
Purchase (buyer, seller, store, product)
Company (name, stock price, country)
Person (name, phone number, city)
Ex #1: Find people who bought telephony products.
Ex #2: Find names of people who bought American products.
Ex #3: Find names of people who bought American products and did not buy French products.
Ex #4: Find names of people who bought American products and live in Seattle.
Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock price is more than $50.
Operations on Bags (and why we care)
Basic operations:
• Projection
• Selection
• Union
• Intersection
• Set difference
• Cartesian product
• Join (natural join, theta join)
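The slide does not spell out how these operations behave on bags. The sketch below assumes the usual bag semantics (union adds multiplicities, intersection takes the minimum, difference subtracts) and models a bag with Python's `collections.Counter`; the data is made up.

```python
from collections import Counter

# Bags as Counters: tuple -> number of occurrences
R1 = Counter({("a",): 2, ("b",): 1})
R2 = Counter({("a",): 1, ("c",): 3})

bag_union = R1 + R2         # multiplicities add
bag_intersection = R1 & R2  # minimum of the multiplicities
bag_difference = R1 - R2    # subtract, dropping tuples that reach zero

# Projection on a bag keeps duplicates instead of eliminating them.
enrolled = [("Fred", "CSE-444"), ("Fred", "CSE-341")]
names = Counter(name for name, _course in enrolled)
```

Projecting the two enrollment rows onto the name yields Fred twice, where the set version would yield him once. Skipping duplicate elimination is one practical reason to care about bags.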