320 likes | 448 Views
C20.0046: Database Management Systems Lecture #2. Matthew P. Johnson Stern School of Business, NYU Spring, 2004. Relational schema. Relational DB. E/R design. DB development path. the World. Admin: use stern.nyu.edu addresses. Agenda. Last time: intro, RDBMS, ACID test
E N D
C20.0046: Database Management SystemsLecture #2 Matthew P. Johnson Stern School of Business, NYU Spring, 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004
Relational schema Relational DB E/R design DB development path the World Admin: use stern.nyu.edu addresses M.P. Johnson, DBMS, Stern/NYU, Sp2004
Agenda • Last time: intro, RDBMS, ACID test • This time: E/R model • Identify entity sets, relations and attributes • One-one, one-many, many-many relations • Simple ER diagrams to model a situation • 3-way relationships; Converting to binary • Entities with multiple roles • Subclasses • Design issues • Principles of faithfulness & simplicity in ER diagrams • Redundancy • Whether an element should be an attribute or entity set • Replacing a relationships with entity sets M.P. Johnson, DBMS, Stern/NYU, Sp2004
Entity Relationship (E/R) Model • A popular data model – useful to database designers • Graphical representation of miniworld • Helps design the database, not implement it • E/R design is translated to a relational design • relational design then implemented in an RDBMS • Elements of model • Entities • Entity Sets • Attributes • Relationships (!= relations!) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Elements of E/R Model: Entity Sets • Entity: like an object • e.g. President Bush • Particular instance of a concept • Entity set: set of one sort of entities or a concept • e.g. World leaders • Generally, same set of attributes • Represented by a rectangle • A “good” entity set – you decide • Common properties • Correspond to class of phys. or bus. objects • (People, products, accounts, grades, etc.) World Leader M.P. Johnson, DBMS, Stern/NYU, Sp2004
Elements of E/R Model: Attributes • Properties of entities in entity set • Like fields in a struct • Like columns in a table/spreadsheet • Like data members in an object • Values in some domain (e.g., ints, strings) • Represented by ovals: • Assumed atomic • But could have limited structure • Ints, strings, etc. ID Name Student M.P. Johnson, DBMS, Stern/NYU, Sp2004
Elements of E/R Model: Relationships • Connect two or more entity sets • e.g. students enroll in courses • Binary relationships: connect two entity sets – most common • Multiway relationships: connect several entity sets • Represented by diamonds Students Enroll Courses M.P. Johnson, DBMS, Stern/NYU, Sp2004
Elms of E/R Model: Rel’ships (cont’d) • Students Enroll in courses • Courses are Held in rooms • The E/R data model: Students Enroll Courses ID Name Held Rooms M.P. Johnson, DBMS, Stern/NYU, Sp2004
A little set theory • A mathematical set a collection of members • A set is defined by its members • “Are you in or are you out?” • No other structure, no order, no duplicates allowed • Sets specified by listing: • {1, 2, 3, …} = N • {1, 2, George Bush} (not useful in DBMS) • Or by “set-builder” notation: • { x in N: 2 divides x} = ? • { x in Presidents | reelected(x)} = ? • {2x: x in N} = ? M.P. Johnson, DBMS, Stern/NYU, Sp2004
A little set theory • One set can be a subset of another (which is a superset) • ReelectedPresidents is a subset of Presidents • Also, RP is a proper subset of Pres – some lost reelection • Given two sets X and Y, the cross product or Cartesian product is X x Y = {(x,y): x in X, y in Y} = the set of all ordered pairs in which the first comes from X and the second comes from Y • Important: (x,y) != {x,y} • In an order pair or tuple • Order matters • Duplicates are allowed M.P. Johnson, DBMS, Stern/NYU, Sp2004
A little set theory • Mathematically, a relation(ship) between X and Y is just a subset of X x Y = all those pairs (x,y) s.t. x is related to y • Example: owner-of O on People, Cats • O(MPJ, Izzy) holds • The equals relation E on N, N: • E(3,3) holds because 3 = 3 • E(3,4) does not hold • E is still a set: E = {(1,1), (2,2), (3,3), …} • Father of relation F on People, People: • F(GHWB, GWB) holds • F(GWB, GHWB) does not hold • Relations aren’t necessarily symmetric M.P. Johnson, DBMS, Stern/NYU, Sp2004
Many-many Many-one One-one Multiplicity of Relationships Representation of relationships • No arrow: many-to-many • Sharp arrow: many-to-one • Rounded arrow: “exactly one” • “key constraint” • One-one: M.P. Johnson, DBMS, Stern/NYU, Sp2004
Students Enrolls Courses Student Live Residence hall Student Live Residence hall Multiplicity of Relationships Many to one: a student living in a residence hall Many-to-many: Many to exactly one: a student must live in a residence hall M.P. Johnson, DBMS, Stern/NYU, Sp2004
Multiplicity, set-theoretically • Assume no vars below are equal • Many-one means: • if (x1, y1) in R then (x2,y1) cannot be in R • One-many means: • if (x1,y1) in R then (x1,y2) cannot be in R • One-one means: • if (x1,y1) in R, then neither (x2,y1) nor (x1,y2) can be in R • Notice: one-one is stronger than many-one • One-one implies both many-one and one-many M.P. Johnson, DBMS, Stern/NYU, Sp2004
E/R Diagram Name Name Enrolls Students Courses ID ID Assisting TA Name ID M.P. Johnson, DBMS, Stern/NYU, Sp2004
E/R Diagrams • Works if each TA is a TA of all students. Connection student-TA is only through the course. • But what if students were divided into sections, each section with a separate TA? • Then, a student in C20.0046 would be related to only one of the TA's for C20.0046. Which one? • 3-way relationship is helpful here. M.P. Johnson, DBMS, Stern/NYU, Sp2004
Courses Enrolls Students Enrolls entries: TAs Students Courses TAs Carol C20.0046 Donald Richard C20.0046 Paul Al C20.0046 Colin … … … Multiway Relationships NB: Enrolls determines TA: (student, course) at most one TA M.P. Johnson, DBMS, Stern/NYU, Sp2004
Courses Course-of Students Student-of Enrolls TAs TA- of Converting multiway relships to binary • Some models (e.g. ODL) limit relationships to binary • Multiway relationship – equivalent collection of binary, many to one relationships • Replace relationship with connecting entity set NB: Enrolls has no attributes! M.P. Johnson, DBMS, Stern/NYU, Sp2004
date VideoStore Rental Movie Customer Second multiway e.g.: renting movies • Scenario: a Customer Rents a Movie from a VideoStore on a certain date • Q: Which entity does date belong to? • A: To the fact of the renting • Relationships can have attributes • always (implicitly) many-one M.P. Johnson, DBMS, Stern/NYU, Sp2004
Second multiway e.g.: renting movies • But they don’t have to • Relationship attributes can be replaced with (trivial) new entities date Date VideoStore Rental Movie Customer M.P. Johnson, DBMS, Stern/NYU, Sp2004
date VideoStore Rental Movie Customer Second multiway e.g.: renting movies • Where can we draw arrows? • (store, video, customer) date ? • Date is a relship att, implicitly determinied • (store, video, date) customer ? • (store, date, customer) video ? • (video, date, customer) store ? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Arrow-drawing • Q: Why does it matter? • Round arrow benefit: • Obvious: One item takes less space than many • Less obvious: easier to access one item x than set of one item {x} • In programming: an int v. a linked list with just one int • Regular arraw benefit: • Mapping to a set of either one elm or none seems bad • But not implemented this way • Always one element, but that value may be NULL • Lesson: it pays to identify your relship’s multiplicity M.P. Johnson, DBMS, Stern/NYU, Sp2004
date VideoStore Rental Movie MovieOf date Movie Customer Rental StoreOf Store BuyerOf Customer Second multiway e.g.: renting movies • Convert to binary? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Roles in relationships • Entity set appears more than once in a relship • Generally distinct entities • Each appearance is in a different role • Edges labeled by roles Successor Pre-req Course Prereq M.P. Johnson, DBMS, Stern/NYU, Sp2004
Subclasses in the E/R model • Some entities are special cases of other • Conversely: some are generalizations • Humans are specialized mammals • Grad students are specialized students • And, in turn, specialized mammals • NB: These aren’t examples but subclasses • Subclass A isa B • Represented by a triangle • Always one-to-one, though arrows omitted • Root is more general • Multiple inheritance is allowed! • A single entity may consist of all components (sets of fields) in aribtrary ESs and their ancestors M.P. Johnson, DBMS, Stern/NYU, Sp2004
TX Chainsaw Massacre Lion King Roger Rabbit Subclasses • “Lion King”: atts of Movies; relship Voices title length year stars Movies isa isa Weapon Voices Murder-Mysteries Cartoons • “Roger Rabbit”: atts of Movies; relship Voices; att weapon Component M.P. Johnson, DBMS, Stern/NYU, Sp2004
E/R inheritance v. OO inheritance • In a OOP class hierarchy, children also inherit “attributes” from parents • But an object is an instance of one class • In E/R, an entity may be composed of components from multiple, not-directly-related ESs • Roger Rabbit is composed of components from Cartoons, Murder Mysteries, and Movies • We could create a Cartoon Murder Mysteries ES if there were any atts specific to them • So the real difference: In E/R, can have implicit multiple inheritance between any set of IS-A-connected nodes (sharing a root) M.P. Johnson, DBMS, Stern/NYU, Sp2004
Design Principles • Faithfulness • Avoiding redundancy • Simplicity • Choice of relationships • Picking elements M.P. Johnson, DBMS, Stern/NYU, Sp2004
Faithfulness • Is the relationship many-many or many-one? • Are the attributes appropriate? • Are the relationships applicable to the entities? • Examples • Courses & instructors • maybe many-one, maybe many-many • Bosses & subordinates • maybe one-many, maybe many-many M.P. Johnson, DBMS, Stern/NYU, Sp2004
Owned-by Owns Movies Ownings Studios Simplicity • Einstein: Theories as simple as possible, but not simpler. • Use as few elements as possible • Minimum required relations • No unnecessary attributes (will you be using this attribute?) • Eliminate “spinning wheels” • Example: how can we simplify this? M.P. Johnson, DBMS, Stern/NYU, Sp2004
Name Own Studios Movies Length Address StudioName Name Avoiding redundancy • Say everything exactly once • Minimize database storage requirements • More important: prevent possible update errors • simplest but not only e.g.: modify data one place but not the other – more later • Example: Spot the redundancy M.P. Johnson, DBMS, Stern/NYU, Sp2004
Next time • We’ll finish E/R models and begin the relational model • Read rest of chapter 2, 3.1, 3.2 • Info on project likely posted over weekend M.P. Johnson, DBMS, Stern/NYU, Sp2004