780 likes | 1.03k Views
CSE544 Data Modeling, Conceptual Design, XML. Wednesday, April 5, 2006. Outline. ER diagrams (Chapter 2) Conceptual Design (Chapter 19) XML (Web). Database Design. Data Modeling. SQL Tables. Refinement. Files. E/R diagrams. Relations. name. buys. Person. Product. price. name.
E N D
CSE544Data Modeling,Conceptual Design, XML Wednesday, April 5, 2006
Outline • ER diagrams (Chapter 2) • Conceptual Design (Chapter 19) • XML (Web)
Database Design Data Modeling SQL Tables Refinement Files E/R diagrams Relations
name buys Person Product price name ssn Relational Schema Design Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies
Entity / Relationship Diagrams Attributes address Product Entity sets buys Relationships
name category name price makes Company Product stockprice buys employs Person name ssn address
Keys in E/R Diagrams • Every entity set must have a key name category price Product
1 2 3 1 2 3 1 2 3 a b c d a b c d a b c d Multiplicity of E/R Relations • one-one: • many-one • many-many
name category name price makes Company Product stockprice What doesthis say ? buys employs Person name ssn address
Product Purchase Store Person Multi-way Relationships
Invoice VideoStore Rental Movie Person Arrows in Multiway Relationships Q: what does the arrow mean ? A: if I know the store, person, invoice, I know the movie too
Arrows in Multiway Relationships Q: what do these arrow mean ? Invoice VideoStore Rental Movie Person A: store, person, invoice determines movie and store, invoice, movie determines person
Invoice VideoStore Rental Movie Person Arrows in Multiway Relationships Q: how do I say: “invoice determines store” ? A: no good way; best approximation: Incomplete (why ?)
Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase Store buyer salesperson Person
Attributes on Relationships date Product Purchase Store Person
Converting Multi-way Relationships to Binary ProductOf date Product Purchase StoreOf Store BuyerOf Person Need arrows here ! Which direction ?
From E/R Diagramsto Relational Schema • Entity set relation • Relationship relation
Entity Set to Relation name category price Product Product(name, category, price) name category price gizmo gadgets $19.99
Relationships to Relations price name category Start Year name makes Company Product Stock price Makes(product-name, product-category, company-name, year) (watch out for attribute name conflicts)
Relationships to Relations price name category Start Year name makes Company Product Stock price No need for Makes. Modify Product: Product(name, category, price, startYear, companyName)
Multi-way Relationships to Relations address name Product Purchase Store price name Purchase(prodName,stName,ssn) Person ssn name
Design Principles What’s wrong? Purchase Product Person President Country Person Moral: be faithful!
Design Principles:What’s Wrong? date Product Purchase Store Moral: pick the right kind of entities. personAddr personName
Design Principles:What’s Wrong? date Dates Product Purchase Store Moral: don’t complicate life more than it already is. Person
Subclasses name category price Product isa isa Software Product Educational Product platforms Age Group
name category price Product isa isa Software Product Educational Product platforms Age Group Product Subclasses to Relations Sw.Product Ed.Product Notice: subclass = subset Alternative: disjoint classes (Java, C++)
FurniturePiece Company Person Modeling UnionTypes With Subclasses Each piece of furniture is owned either by a person, or by a company
Person FurniturePiece Company ownedByPerson ownedByPerson Modeling Union Types with Subclasses Solution 1. Acceptable, imperfect (What’s wrong ?)
Modeling Union Types with Subclasses Solution 2: better Company Owner isa isa ownedBy Person FurniturePiece
Referential Integrity Constraints makes Product Company Each product made by at most one company. Some products made by no company makes Product Company Each product made by exactly one company.
Other Constraints makes <100 Product Company What does this mean ?
Weak Entity Sets Entity sets are weak when their key comes from other classes to which they are related. affiliation Team University sport number name University(name) Team(universityName, number, sport)
R1 R2 R3 Weak Entity Sets E3 E1 A3 A1 A2 B3 E2 What are the keys ? E4 A4
Schema Refinement • For the relational model • Relation: R(A1, A2, …, Am) • Schema: relation name, attribute names • Instance: a mathematical m-ary relation • Database: R1, R2, …, Rn • Schema • Instance • Schema refinement = normalization
First Normal Form (1NF) • A database schema is in First Normal Form if all tables are flat Student Student Takes Course May needto add keys
More Normal Forms • Based on Functional Dependencies • 2nd Normal Form (obsolete) • 3rd Normal Form • Boyce Codd Normal Form (BCNF) • Based on Multivalued Dependencies • 4th Normal Form • Based on Join Dependencies • 5th Normal Form Discussnext
Data Anomalies Recall set attributes (persons with several phones): SSN Name, City but not SSN PhoneNumber • Anomalies: • Redundancy = repeat data • Update anomalies = Fred moves to “Bellevue” • Deletion anomalies = Joe deletes his phone number: what is his city ?
Relation Decomposition Break the relation into two: • Anomalies have gone: • No more repeated data • Easy to move Fred to “Bellevue” (how ?) • Easy to delete all Joe’s phone number (how ?)
Decompositions in General R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) R1 = projection of R on A1, ..., An, B1, ..., Bm R2 = projection of R on A1, ..., An, C1, ..., Cp
Problems With Decomposition • Can we get the data back correctly ? • Lossless decomposition • Discuss next • Can we recover the FD’s on the ‘big’ table from the FD’s on the small tables ? • Dependency-preserving decomposition • Figure out yourself, or read 19.5.2
Lossless Decomposition • Sometimes it is correct:
Lossy Decomposition • Sometimes it is not: What’swrong ??
Decompositions in General R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) Theorem If A1, ..., An B1, ..., Bm Then the decomposition is lossless Note: don’t need necessarily A1, ..., An C1, ..., Cp Example: name price, hence the first decomposition is lossless
Functional Dependencies • A form of constraint • hence, part of the schema • Finding them is part of the database design
Meaning: If two tuples agree on the attributes A1, A2, …, An then they must also agree on the attributes B1, B2, …, Bm Functional Dependency: A1, A2, …, An B1, B2, …, Bm Functional Dependencies
Functional Dependencies Definition: A1, ..., An B1, ..., Bm holds in R if: t, t’ R, (t.A1=t’.A1 ... t.An=t’.An t.B1=t’.B1 ... t.Bm=t’.Bm ) R t if t, t’ agree here then t, t’ agree here t’
Examples EmpID Name Phone Position • EmpID Name, Phone, Position • Position Phone • but Phone Position E0045 Smith 1234 Clerk E1847 John 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer
Example Product(name, category, color, department, price) Consider these FDs: namecolor categorydepartment color, categoryprice What do they say ?
Example • FD’s are constraints: • On some instances they hold • On others they don’t namecolor categorydepartment color, categoryprice Does this instance satisfy all the FDs ?
Example namecolor categorydepartment color, categoryprice What about this one ?