400 likes | 608 Views
Database Systems I Week 2 The Entity-Relationship Model. Syllabus. Class Time and Location : Tue 14:30-16:20 AQ3005 Thu 14:30-15:20 AQ3003 Course Website: http ://www.cs.sfu.ca/CC/354/rfrank / Instructor : Richard Frank, PhD Email: rfrank@sfu.ca Office Hours Location: TASC 9205
E N D
Syllabus • Class Time and Location: • Tue 14:30-16:20 AQ3005 • Thu 14:30-15:20 AQ3003 • Course Website: • http://www.cs.sfu.ca/CC/354/rfrank/ • Instructor: Richard Frank, PhD • Email:rfrank@sfu.ca • Office Hours • Location: TASC 9205 • Time: Tuesday, 1:30pm-2:30pm • TA:Ankit Gupta • Email:aga53@sfu.ca • Office Hours • Location: ASB9838_TA_1 • Time: Monday, 10am-11:30am
Admin • Assignment #2 changed • A2Q1 moved to A3Q1
Overview of Database Development Requirements Analysis / Ideas High-Level Database Design Conceptual Database Design / Relational Database Schema Physical Database Design / Relational DBMS Similar to software development
Overview of Database Development • Requirements Analysis • What data are to be stored in the enterprise? • What are the required applications? • What are the most important operations? • High-level database design • What are the entities and relationships in the enterprise? • What information about these entities and relationships should we store in the database? • What are the integrity constraints or business rules that hold?
Overview of Database Development • Conceptual database design • What data model to implement for the DBS?E.g., relational data model • Map the high-level design (e.g., ER diagram) to a (conceptual) database schema of the chosen data model. • Physical database design • What DBMS to use? • What are the typical workloads of the DBS? • Build indexes to support efficient query processing. • What redesign of the conceptual database schema is necessary from the point of view of efficient implementation?
Entity-Relationship Model • Short: ER model. • A lot of similarities with other modeling languages such as UML. • Concepts • Entities / Entity sets, • Attributes, • Relationships/ Relationship sets, and • Constraints. • Offers more modeling concepts than the relational data model (which only offers relations). • Closer to the way in which people think.
Entity-Relationship Diagrams • An Entity-Relationship diagram (ER diagram) is a graph with nodes representing entity sets, attributes and relationship sets. • Entity sets denoted by rectangles. • Attributes denoted by ovals. • Relationship sets denoted by diamonds. • Edges (lines) connect entity sets to their attributes and relationship sets to their entity sets. since name dname ssn budget lot did Works_In Employees Departments
Entities and Entity Sets • Entity: Real-world object distinguishable from other objects • e.g. employee Miller. • Entity can be physical or abstract object. • An entity is associated with the attributes describing its properties. • Attribute values are atomic • e.g. strings, integer or real numbers. • Contain a single piece of information • Full name? • Age? • Entity set: A collection of similar entities. • E.g., all employees.
name ssn age Employees Entities and Entity Sets • All entities in an entity set have the same set of attributes. (At least, for the moment!) • Each entity set has a key, i.e. a minimal set of attributes to uniquely identify an entity of this set. Key attributes are underlined. • Each attribute has a domain, i.e. a set of all possible attribute values.
Employees Entities and Entity Sets • A key must be unique across all possible (not just the current) entities of its set. • A key can consist of more than one attribute. • There can be more than one key for a given entity set, but we choose one (primary key) for the ER diagram. lastname birthdate firstname salary
Relationships and Relationship Sets • Relationship: Association among two or more entities. • E.g., Miller works in Pharmacy department. • Relationship set: Collection of similar relationships among two or more entity sets. name dname ssn budget age did Works_In Employees Departments
name ssn age Employees super-visor subor-dinate Reports_To Relationships and Relationship Sets • An n-ary relationship set R relates n entity sets E1 ... En. • Each relationship in R involves entities e1Î E1, ..., en Î En. • Binary relationship sets most common. • Same entity set can participate in different relationship sets, or in different “roles” in same set.
Relationships and Relationship Sets • Entity • object that is distinguishable from other objects • Ex: your home address, CMPT 354 • Entity Set • All home addresses • Collection of CMPT courses • Each entitysethas 1-to-many entities • Each entitycan belong to multiple entity sets • Relationship • Joe lives at 45 Main St. • Mary lives at 89 Wood Ave. • Relationship Set • Person lives at home address
Relationships and Relationship Sets • Relationship sets can also have attributes. • Useful for properties that cannot reasonably be associated with one of the participating entity sets. since name dname ssn budget age did Works_In Employees Departments
Instances of an ER Diagram • Entity set contains a set of entities. Each entity has one value for each of its attributes. • No duplicate instances • (not a technical limit) • What to do?? Employees
Instances of an ER Diagram • Relationship set contains a set of relationships, each relating a set of entities, one from each of the participating entity sets. • Components are entities, not attribute values. • No duplicates • (not a technical limit) Works_In
name ssn age Employees Relationships and Relationship Sets • Multiway relationship sets (n > 2) are used whenever binary relationships cannot capture the application semantics. description tid Works_For Tasks Projects pid pbudget Infrequent.
name ssn age Employees Relationships and Relationship Sets description tid Works_For Tasks Projects Works_For pid pbudget
since name dname ssn age did budget Employees Manages Departments Key Constraints • A key constraint on a relationship set specifies that the marked entity set participates in at most one relationship of this relationship set. • Entity set is marked with an arrow. Key constraint
since name dname ssn age did budget since name dname Employees Manages Departments ssn budget age did Works_In Employees Departments Multiplicity of Relationships • An employee can work in many departments; a dept can have many employees. • Each dept has at most one manager, who may manage several (many) departments. one many
Multiplicity of Relationships • The different types of (binary) relationships from a multiplicity point of view: • One to one • One to many • Many to one • Many to many one-to-one one-to-many many-to-one many-to-many
since since name name dname dname ssn did did budget budget age Departments Employees Manages Works_In since Participation Constraints • A participation constraint on a relationship set specifies that the marked entity set participates in at least one relationship of this relationship set. • Entity set is marked with a bold line. Participation constraint
name cost name age ssn age Policy Dependents Employees Weak Entities • A weak entity exists only in the context of another (owner) entity. • The weak entitycan be identified uniquely only by considering the primary key of the owner and its own partial key. • Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities). • Weak entity set must have total participation in this supporting relationship set. • Ex: If there is no employee, there cannot be a dependent.
Subclasses • Sometimes, an entity set contains some entities that do share many, but not all properties with the entity set hierarchies. • A ISA B: every A entity is also considered to be a B entity. A specializes B, B generalizes A. • A is called subclass, B is called superclass. • A subclass inherits the attributes of a superclass, may define additional attributes. Employees ISA Contract_Emps Hourly_Emps
name ssn age Employees hours_worked hourly_wages ISA contractid Contract_Emps Hourly_Emps Subclasses • Hourly_Emps and Contract_Emps inherit the ssn (key!), name and age attributes from Employees. • They define additional attributes hourly_wages, hours_worked and contractid, resp.
Subclasses • Covering constraints: Does every Employees entity have to be either an Hourly_Empsor a Contract_Empsentity? NO. Unless Hourly_Emps ANDContract_Emps COVER Employees • Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Empsentity? YES. Hourly_EmpsOVERLAPSContract_Emps
Subclasses • There are several good reasons for using ISA relationships and subclasses: • Do not have to redefine all the attributes. • Can add descriptive attributes specific to a subclass. • To identify entitity sets that participate in a relationship set as precisely as possible. • ISA relationships form a tree structure (taxonomy) with one entity set serving as root.
Design Principles • Faithfulness • Design must be faithful to the specification / reality. • Relevant aspects of reality must be represented in the model. • Avoiding redundancy • Redundant representation blows up ER diagram and makes it harder to understand. • Redundant representation wastes storage. • Redundancy may lead to inconsistencies in the database.
Design Principles • Keep it simple • The simpler, the easier to understand for some (external) reader of the ER diagrams. • Avoid introducing more elements than necessary. • If possible, prefer attributes over entity sets and relationship sets. • Formulate constraints as far as possible • A lot of data semantics can (and should) be captured. • But some constraints cannot be captured in ER diagrams.
High-Level Design With ER Model • Major design choices • Should a concept be modeled as an entity or an attribute? a relationship? • What relationships to use: binary or ternary? • Should address be an attribute of Employees or an entity (connected to Employees by a relationship)? • Depends upon the use we want to make of address information, and the semantics of the data: • If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
Entity vs. Attribute • Works_In2 does not allow an employee to work in the same department for two or more periods (why?). • We want to record several values of the descriptive attributes for each instance of this relationship.
since dbudget name dname ssn lot did budget Departments Employees Manages2 Entity vs. Relationship • This ER diagram o.k. if a manager gets a separate discretionary budget for each dept. • But what if a manager gets a discretionary budget that covers all managed depts? • Redundancy of dbudget, which is stored for each dept managed by the manager. • Misleading: suggests dbudgettied to managed dept.
Entity vs. Relationship • What about this diagram? • Employees who are not managers will have dbudget=null? • The following ER diagram is more appropriate and avoids the above problems! • Each manager now has a budget.
name ssn lot Employees Policies policyid cost Binary vs. Ternary Relationships • ER diagram says • Employee can own several policies • Each policy can be owned by several employees • Each dependent can be covered by several policies • If each policy is owned by just one employee: • Key constraint on Policies would mean policy can only cover 1 dependent! (only 1 combination of Employees and Policies can be in Covers) • Bad design! pname age Dependents Covers
name pname age ssn lot Dependents Employees Purchaser Beneficiary Policies policyid cost Binary vs. Ternary Relationships • This diagram is a better design. • Policy can only exist for employees. Dependents only exist if they are covered by a policy.
Binary vs. Ternary Relationships • Previous example illustrated a case when two binary relationships were better than one ternary relationship. • An example in the other direction: • a ternary relation Contracts relates entity sets Parts, Departments and Suppliers, and has descriptive attribute qty. No combination of binary relationships is an adequate substitute: • S “can-supply” P, D “needs” P, and D “deals-with” S does not imply that D has agreed to buy P from S. • How do we record qty?