540 likes | 643 Views
CMSC424: Database Design. Instructor: Amol Deshpande amol@cs.umd.edu. Data Modeling. Goals: Conceptual representation of the data “Reality” meets “bits and bytes” Must make sense, and be usable by other people Today: Entity-relationship Model Relational Model.
E N D
CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu CMSC424, Spring 2005
Data Modeling • Goals: • Conceptual representation of the data • “Reality” meets “bits and bytes” • Must make sense, and be usable by other people • Today: • Entity-relationship Model • Relational Model CMSC424, Spring 2005
Motivation • You’ve just been hired by Bank of America as their DBA for their online banking web site. • You are asked to create a database that monitors: • customers • accounts • loans • branches • transactions, … • Now what??!!! CMSC424, Spring 2005
Three Levels of Modeling info Physical DB design Conceptual DB design Logical DB design Database Design Steps Entity-relationship Model Typically used for conceptual database design Conceptual Data Model Logical Data Model Relational Model Typically used for logical database design Physical Data Model CMSC424, Spring 2005
Entity-Relationship Model • Two key concepts • Entities: • An object that exists and is distinguishable from other objects • Examples: Bob Smith, BofA, CMSC424 • Have attributes (people have names and addresses) • Form entity sets with other entities of the same type that share the same properties • Set of all people, set of all classes • Entity sets may overlap • Customers and Employees CMSC424, Spring 2005
Entity-Relationship Model • Two key concepts • Relationships: • Relate 2 or more entities • E.g. Bob Smith has account at College Park Branch • Form relationship sets with other relationships of the same type that share the same properties • Customers have accounts at Branches • Can have attributes: • has account at may have an attribute start-date • Can involve more than 2 entities • Employee works at Branch at Job CMSC424, Spring 2005
Rectangles: entity sets Diamonds: relationship sets Ellipses: attributes access-date cust-name number cust-id has customer account cust-street cust-city balance ER Diagram: Starting Example CMSC424, Spring 2005
Rest of the class • Details of the ER Model • How to represent various types of constraints/semantic information etc. • Design issues • A detailed example CMSC424, Spring 2005
Next: Relationship Cardinalities • We may know: One customer can only open one account OR One customer can open multiple accounts • Representing this is important • Why ? • Better manipulation of data • Can enforce such a constraint • Remember: If not represented in conceptual model, the domain knowledge may be lost CMSC424, Spring 2005
Mapping Cardinalities • Express the number of entities to which another entity can be associated via a relationship set • Most useful in describing binary relationship sets CMSC424, Spring 2005
Mapping Cardinalities • One-to-One • One-to-Many • Many-to-One • Many-to-Many has customer account has customer account has customer account has customer account CMSC424, Spring 2005
Mapping Cardinalities • Express the number of entities to which another entity can be associated via a relationship set • Most useful in describing binary relationship sets • N-ary relationships ? CMSC424, Spring 2005
Next: Types of Attributes • Simple vs Composite • Single value per attribute ? • Single-valued vs Multi-valued • E.g. Phone numbers are multi-valued • Derived • If date-of-birth is present, age can be derived • Can help in avoiding redundancy, enforcing constraints etc… CMSC424, Spring 2005
access-date cust-name number cust-id has customer account cust-street cust-city balance Types of Attributes CMSC424, Spring 2005
multi-valued (double ellipse) derived (dashed ellipse) access-date number has account balance Types of Attributes age cust-name cust-id customer date-of-birth cust-street phone no. cust-city CMSC424, Spring 2005
access-date number has account balance Types of Attributes age cust-name cust-id customer date-of-birth cust-street phone no. cust-city month day year Composite Attribute CMSC424, Spring 2005
Next: Keys • Key = set of attributes identifying individual entities or relationships CMSC424, Spring 2005
Entity Keys Possible Keys: {cust-id} {cust-name, cust-city, cust-street} {cust-id, age} cust-name ?? Probably not. Domain knowledge dependent !! date-of-birth cust-name cust-id customer age cust-street phone no. cust-city CMSC424, Spring 2005
Entity Keys • Superkey • any attribute set that can distinguish entities • Candidate key • a minimal superkey • Can’t remove any attribute and preserve key-ness • {cust-id, age} not a superkey • {cust-name, cust-city, cust-street} is • assuming cust-name is not unique • Primary key • Candidate key chosen as the key by DBA • Underlined in the ER Diagram CMSC424, Spring 2005
{cust-id} is a natural primary key Typically, SSN forms a good primary key Try to use a candidate key that rarely changes e.g. something involving address not a great idea Entity Keys date-of-birth cust-name cust-id customer age cust-street phone no. cust-city CMSC424, Spring 2005
access-date number cust-id has customer account Relationship Set Keys • What attributes are needed to represent a relationship completely and uniquely ? • Union of primary keys of the entities involved, and relationship attributes • {cust-id, access-date, account number} describes a relationship completely CMSC424, Spring 2005
access-date number cust-id has customer account Relationship Set Keys • Is {cust-id, access-date, account number} a candidate key ? • No. Attribute access-date can be removed from this set without losing key-ness • In fact, union of primary keys of associated entities is always a superkey CMSC424, Spring 2005
Relationship Set Keys • Is {cust-id, account-number} a candidate key ? • Depends access-date number cust-id has customer account CMSC424, Spring 2005
Relationship Set Keys • Is {cust-id, account-number} a candidate key ? • Depends access-date number cust-id has customer account • If one-to-one relationship, either {cust-id} or {account-number} sufficient • Since a given customer can only have one account, she can only participate in one relationship • Ditto account CMSC424, Spring 2005
Relationship Set Keys • Is {cust-id, account-number} a candidate key ? • Depends access-date number cust-id has customer account • If one-to-many relationship (as shown), {account-number} is a candidate key • A given customer can have many accounts, but at most one account holder per account allowed CMSC424, Spring 2005
Relationship Set Keys • General rule for binary relationships • one-to-one: primary key of either entity set • one-to-many: primary key of the entity set on the many side • many-to-many: union of primary keys of the associate entity sets • n-ary relationships • More complicated rules CMSC424, Spring 2005
Next: Data Constraints • Representing semantic data constraints • We already saw constraints on relationship cardinalities CMSC424, Spring 2005
Participation Constraint • Given an entity set E, and a relationship R it participates in: • If every entity in E participates in at least one relationship in R, it is total participation • partial otherwise CMSC424, Spring 2005
Participation Constraint access-date cust-name number cust-id has customer account cust-street cust-city balance Total participation CMSC424, Spring 2005
Cardinality Constraints How many relationships can an entity participate in ? access-date number cust-id has customer account 0..* 1..1 Minimum - 0 Maximum – no limit Minimum - 1 Maximum - 1 CMSC424, Spring 2005
Next: Recursive Relationships • Sometimes a relationship associates an entity set to itself CMSC424, Spring 2005
Must be declared with roles Recursive Relationships emp-name emp-id works-for manager employee worker emp-street emp-city CMSC424, Spring 2005
Next: Weak Entity Sets • An entity set without enough attributes to have a primary key • E.g. Transaction Entity • Attributes: • transaction-number, transaction-date, transaction-amount, transaction-type • transaction-number: may not be unique across accounts CMSC424, Spring 2005
Weak Entity Sets • A weak entity set must be associated with an identifying or owner entity set • Account is the owner entity set for Transaction CMSC424, Spring 2005
Weak Entity Sets Still need to be able to distinguish between different weak entities associated with the same strong entity number trans-date trans-number has Transaction account trans-type balance trans-amt CMSC424, Spring 2005
Weak Entity Sets Discriminator: A set of attributes that can be used for that number trans-date trans-number has Transaction account trans-type balance trans-amt CMSC424, Spring 2005
Weak Entity Sets • Primary key: • Primary key of the associated strong entity + discriminator attribute set • For Transaction: • {account-number, transaction-number} CMSC424, Spring 2005
Next: Specialization • Consider entity person: • Attributes: name, street, city • Further classification: • customer • Additional attributes: customer-id, credit-rating • employee • Additional attributes: employee-id, salary • Note similarities to object-oriented programming CMSC424, Spring 2005
Specialization: Example CMSC424, Spring 2005
Finally: Aggregation • No relationships between relationships • E.g.: Associate account officers with has account relationship set has customer account ? account officer employee CMSC424, Spring 2005
Finally: Aggregation • Associate an account officer with each account ? • What if different customers for the same account can have different account officers ? has customer account ? account officer employee CMSC424, Spring 2005
Finally: Aggregation • Solution: Aggregation has customer account account officer employee CMSC424, Spring 2005
More… • Read Chapter 2 for: • Specialization/Aggregation details • Different types of specialization’s etc • Generalization: opposite of specialization • Lower- and higher-level entities • Attribute inheritance • … CMSC424, Spring 2005
An Example: Employees can have multiple phones phone_loc phone_no E/R Data ModelDesign Issue #1: Entity Sets vs. Attributes (b) (a) Uses Employee vs Phone Employee loc no • To resolve, determine how phones are used • 1. Can many employees share a phone? • (If yes, then (b)) • 2. Can employees have multiple phones? • (if yes, then (b), or (a) with multivalued attributes) • 3. Else • (a), perhaps with composite attributes Employee phone loc no CMSC424, Spring 2005
bcity name name amt ssn ssn bname lno E/R Data ModelDesign Issue #2: Entity Sets vs. Relationship Sets • An Example: How to model bank loans Loans Borrows Branch Customer Customer Loan vs (a) amt lno (b) • To resolve, determine how loans are issued • 1. Can there be more than one customer per loan? • If yes, then (a). Otherwise, loan info must be replicated for each customer (wasteful, potential update anomalies) • 2. Is loan a noun or a verb? • Both, but more of a noun to a bank. (hence (a) probably more appropriate) CMSC424, Spring 2005
E/R Data ModelDesign Issue #3: N-ary vs Binary Relationship Sets • An Example: Works_At Ternary: Works_at Employee Branch Choose n-ary when possible! (Avoids redundancy, update anomalies) Dept (Joe, Moody, Acct) Î Works_At vs Binary: WAB WAE Branch WA Employee WAD (Joe, w3) Î WAE (Moody, w3) Î WAB (Acct, w3) Î WAD Dept CMSC424, Spring 2005
Example Design • We will model a university database • Main entities: • Professor • Projects • Departments • Graduate students • etc… CMSC424, Spring 2005
proj-number SSN dept-no name sponsor name professor project dept office start area • SSN homepage budget rank name grad age degree CMSC424, Spring 2005
proj-number SSN sponsor name project professor start area budget rank • SSN dept-no name name grad dept age office degree homepage CMSC424, Spring 2005
PI Co-PI dept-no SSN proj-number Appt Chair RA Supervises sponsor name name project dept professor office start area • SSN budget rank homepage Time (%) name grad Major age advisee advisor degree Mentor CMSC424, Spring 2005