530 likes | 665 Views
CAS CS 460 Entity-Relationship Model. Review. Course Agenda Application/User-Oriented Database Modeling Data Retrieval Design Considerations Database System Internals Storage and Indexing Query Processing, Optimization Transaction Processing Recovery Management Advanced Topics
E N D
Review • Course Agenda • Application/User-Oriented • Database Modeling • Data Retrieval • Design Considerations • Database System Internals • Storage and Indexing • Query Processing, Optimization • Transaction Processing • Recovery Management • Advanced Topics • Data Mining, XML DB in Oracle
Review • Course Agenda • Application/User-Oriented • Database Modeling – ER Model, Relational Model • Data Retrieval -- Relational Alg, Rel. Cal,… SQL,.. • Design Considerations – Normal Forms,… • Database System Internals • Storage and Indexing • Query Processing, Optimization • Transaction Processing • Recovery Management • Advanced Topics • Data Mining, XML DB in Oracle
Databases Model the Real World • “Data Model” translates real world things into structures computers can store • Many models: • Relational, E-R, O-O, Network, Hierarchical, etc. • Relational (more next time) • Rows & Columns • Keys & Foreign Keys to link Relations Students Enrolled
Problems with Relational Model With complicated schemas, it may be hard for a person to understand the structure from the data definition. CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT) CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2)) Enrolled Students
name since dname ssn lot budget did Employee Works_In Department One Solution: The E-R Model • Instead of relations, it has: • Entities and Relationships • These are described with diagrams • both structure, notation more obvious to humans
Steps in Database Design • Requirements Analysis • user needs; what must database do? • Conceptual Design • high level descr (often done w/ER model) • Logical Design • translate ER into DBMS data model • Schema Refinement • consistency, normalization • Physical Design • indexes, disk layout • Security Design • who accesses what, and how
Example: DBA for Bank of America • Requirements Specification • Determine the requirements of clients ( Database to store information about customers, accounts, loans, branches, transactions, …) • Conceptual Design • Express client requirements in terms of E/R model. • Confirm with clients that requirements are correct. • Specify required data operations • Logical Design • Convert E/R model to relational, object-based, XML-based,… • Physical Design • Specify file organizations, build indexes
Modeling • A database can be modeled as: • a collection of entities, • relationship among entities. • An entityis an object that exists and is distinguishable from other objects. • Example: specific employee, company, customer, loan • Entities have attributes • Example: employees have names and addresses • An entity set is a set of entities of the same type that share the same properties. • Example: set of all employees, companies, customers, loans
Entity Sets customer and loan Cst_id Cst_name Cst_street Cst_city Loan_id, Amount
Relationship Sets • A relationship is an association among several entities Example:HayesdepositorA-102customer entity relationship set account entity • A relationship set is a mathematical relation among n 2 entities, each taken from entity sets {(e1, e2, … en) | e1 E1, e2 E2, …, en En}where (e1, e2, …, en) is a relationship • Example: (Hayes, A-102) depositor
Degree of a Relationship Set • Refers to number of entity sets that participate in a relationship set. • Relationship sets that involve two entity sets are binary (or degree two). Generally, most relationship sets in a database system are binary. • Relationship sets may involve more than two entity sets. • Relationships between more than two entity sets are rare. Most relationships are binary. (More on this later.) • Example: Suppose employees of a bank may have jobs (responsibilities) at multiple branches, with different jobs at different branches. Then there is a ternary relationship set between entity sets employee, job, and branch
Attributes • An entity is represented by a set of attributes, that is descriptive properties possessed by all members of an entity set. • Domain – the set of permitted values for each attribute • Attribute types: • Simple and composite attributes. • Single-valued and multi-valued attributes • Example: multivalued attribute: phone_numbers • Derived attributes • Can be computed from other attributes • Example: age, given date_of_birth Example: customer = (customer_id, customer_name, customer_street, customer_city ) loan = (loan_number, amount )
Relationship Sets (Cont.) • An attribute can also be property of a relationship set. • For instance, the depositor relationship set between entity sets customer and account may have the attribute access-date
Employee Entity Set Works_For Relationship Set Attribute phone E/R Data ModelAnother Example Works_At Branch Employee essn phone bname bcity seniority since ename children Lots of notation to come.
E/R Data ModelTypes of Attributes Works_At Branch Employee essn phone bname bcity seniority since ename children Default ename Multivalued children Derived seniority
E/R Data ModelTypes of relationships Works_At Branch Employee essn phone bname bcity seniority since ename children Many-to-Many (n:m) Works_At
Roles • Entity sets of a relationship need not be distinct • The labels “manager” and “worker” are called roles; they specify how employee entities interact via the works_for relationship set. • Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles. • Role labels are optional, and are used to clarify semantics of the relationship
E/R Data ModelRecursive relationships Works_At Branch Employee essn phone bname bcity seniority since ename children Recursive relationships: Must be declared with roles Works_For manager worker Many-to-1 Relationship manager Employee Works_For worker
Employee phone no loc E/R Data ModelDesign Issue #1: Entity Sets vs. Attributes • An Example: Employees can have multiple phones (b) (a) Uses Employee vs Phone Employee phone_loc phone_no no loc • To resolve, determine how phones are used • 1. Can many employees share a phone? • (If yes, then (b)) • 2. Can employees have multiple phones? • (if yes, then (b), or (a) with multivalued attributes) • 3. Else • (a), perhaps with composite attributes
E/R Data ModelDesign Issue #2: Entity Sets vs. Relationship Sets • An Example: How to model bank loans Loans Borrows Branch Customer Loan Customer vs bcity bname cssn cname amt lno cssn cname (a) amt lno (b) • To resolve, determine how loans are issued • 1. Can there be more than one customer per loan? • If yes, then (a). Otherwise in (b), loan info must be replicated for each customer (wasteful storage , potential update anomalies) • 2. Is loan a noun or a verb? • Both, but more of a noun to a bank. (hence (a) probably more appropriate)
E/R Data ModelDesign Issue #3: Relationship Cardinalities • An Example: Borrows ? ? Customer Loan • Variations on Borrows: • 1. Can a customer hold multiple loans? • 2. Can a loan be jointly held by more than 1 customer?
E/R Data ModelDesign Issue #3: Relationship Cardinalities Borrows ? ? Customer Loan • Cardinalities of Borrows: Borrows Borrows Borrows Borrows
E/R Data ModelDesign Issue #3: Relationship Cardinalities (cont) • In general... 1:1 n:1 1:n n:m
E/R Data ModelDesign Issue #4: N-ary vs Binary Relationship Sets • An Example: Works_At Works_at Employee Branch Ternary: Choose n-ary when possible! (Avoids redundancy, update anomalies) Dept (Joe, Moody, Acct) Î Works_At vs Binary: WAB WAE Branch WA Employee WAD (Joe, w3) Î WAE (Moody, w3) Î WAB (Acct, w3) Î WAD Dept
E/R Data ModelKeys • Key = set of attributes identifying individual entities or relationships Employee essn ephone eaddress ename • A. Superkey: • any attribute set that distinguishes identities • i.e., uniquely identify a member of the entity set • e.g., {essn}, {essn, ename, eaddress} • B. Candidate Key: • “minimal superkey” (can’t remove attributes and preserve “keyness”) • e.g., {essn}, {ename, eaddress} • C. Primary Key: • candidate key chosen as the key by a DBA • e.g., {essn} (denoted by underline)
Example • Superkeys: uniquely identify a member of the entity set • - {ESSN}, {EName, EAddress}, {E-Id}, {ESSN, E-Id} • Candidate keys: {ESSN}, {EName, EAddress}, {E-Id} • - why? No subset is a super key • - E.g., {EName}, {EAddress} are not super keys hence • {EName, EAddress} which is a super key is also a candidate key. • {ESSN, E-id}: Superkey but not a Candidate key
E/R Data ModelRelationship Set Keys Works_At Employee Branch ... ... essn ename bname bcity since • Q: What attributes needed to represent relationships in Works_At? b1 e1 e2 b2 e3 A: {essn, bname, since}
E/R Data ModelRelationship Set Keys (cont.) Works_At Employee Branch ... ... essn ename bname bcity since • Q: What are the candidate keys of Works_At? b1 e1 e2 b2 e3 A: {essn}
E/R Data ModelRelationship Set Keys (cont.) Works_At Employee Branch ... ... essn ename bname bcity since • Q: What are the candidate keys if Works_At is...? a. 1:n A: {bname} b. n:m A: {essn, bname} Assumption: employees have <= 1 record per branch c. 1:1 A: {essn}, {bname}
E/R Data ModelRelationship Set Keys (cont.) • General Rules for Relationship Set Keys R E1 E2 ... ... P (E1) P (E2) • If R is: Candidate Keys P (E1) or P (E2) P (E2) P (E1) P (E1) È P (E2) R 1:1 1:n n:1 n:m
Loan_Pmt E/R Data ModelExistence Dependencies and Weak Entity Sets • Idea: • Existence of one entity depends on another • Example: Loans and Loan Payments Payment Loan lno pno pamt lamt pdate Weak Entity Set Identifying Relationship Total Participation
Loan_Pmt E/R Data ModelExistence Dependencies and Weak Entity Sets Payment Loan lno pno pamt lamt pdate Weak Entity Sets • existence of payments depends upon loans • have no superkeys: different payment records (for different loans) can be identical • instead of keys, discriminators: discriminate between payments for given loan (e.g., pno)
Loan_Pmt E/R Data ModelExistence Dependencies and Weak Entity Sets Payment Loan lno pno pamt lamt pdate Identifying Relationships • We say: • Loan is dominant in Loan_Pmt • Payment is subordinate in Loan_Pmt • Payment is existence dependent on Loan
Loan_Pmt E/R Data ModelExistence Dependencies and Weak Entity Sets Payment Loan lno pno pamt lamt pdate • Total Participation • All elements of Payment appear in Loan_Pmt
R E/R Data ModelExistence Dependencies and Weak Entity Sets E2 E1 ... atta1 attam attb1 ... attbn • Q. Is {attb1, …, attbn} a superkey of E2? A: No • Q. Name a candidate key of E2 A: {atta1, attb1}
E/R Data ModelExtensions to the Model: Specialization and Generalization • An Example: • Customers can have checking and savings accts • Checking ~ Savings (many of the same attributes) • Old Way: Has1 Customer Savings Acct acct_no interest balance Has2 Checking Acct acct_no overdraft balance
Isa E/R Data ModelExtensions to the Model: Specialization and Generalization • An Example: • Customers can have checking and savings accts • Checking ~ Savings (many of the same attributes) • New Way: balance acct_no Has superclass Account Customer Savings Acct Checking Acct overdraft interest subclasses
Isa Child Adult Senior E/R Data ModelExtensions to the Model: Specialization and Generalization • Subclass Distinctions: • 1. User-Defined vs. Condition-Defined • User: Membership in subclasses explicitly determined (e.g., Employee, Manager < Person) • Condition: Membership predicate associated with subclasses - e.g: Person age³65 age³18 age < 18
Isa Child Adult Senior E/R Data ModelExtensions to the Model: Specialization and Generalization • Subclass Distinctions: • 2. Overlapping vs. Disjoint • Overlapping: Entities can belong to >1 entity set (e.g., Adult, Senior) • Disjoint: Entities belong to exactly 1 entity set(e.g., Child) Person age³65 age³18 age < 18
Isa Child Adult Senior E/R Data ModelExtensions to the Model: Specialization and Generalization • Subclass Distinctions: • 3. Total vs. Partial Membership • Total: Every entity of superclass belongs to a subclass e.g., • Partial: Some entities of superclass do not belong to any subclass (e.g., if Adults condition is age³21 ) Person age³65 age³18 age < 18
Loan E/R Data ModelExtensions to the Model: Aggregation • E/R: No relationships between relationships • E.g.: Associate loan officers with Borrows relationship set Borrows Customer ? Loan_Officer Employee • Associate Loan Officer with Loan? • What if we want a loan officer for every (customer, loan) pair?
Loan E/R Data ModelExtensions to the Model: Aggregation • E/R: No relationships between relationships • E.g.: Associate loan officers with Borrows relationship set Borrows Customer Loan_Officer Employee • Associate Loan Officer with Borrows? • Must First Aggregate
Other Similar Models: UML • UML: Unified Modeling Language • UML has many components to graphically model different aspects of an entire software system • UML Class Diagrams correspond to E-R Diagram, but several differences.
UML Class Diagrams (Cont.) • Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses in E-R diagrams. • Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line. • The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set. • The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set. • Non-binary relationships drawn using diamonds, just as in ER diagrams
UML Class Diagram Notation (Cont.) overlapping disjoint *Note reversal of position in cardinality constraint depiction *Generalization can use merged or separate arrows independent of disjoint/overlapping
Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in. Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams. The constraint 0..* on the E2side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many to one from E2 to E1. Single values, such as 1 or * may be written on edges; The single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*. UML Class Diagrams (Contd.)