1 / 46

CS263 Lec. 4: Enhanced E-R Models

CS263 Lec. 4: Enhanced E-R Models. Basic ER model introduced in the mid 1970s Since then business relationships and business data have become more complex Enhanced Entity Relationship (EER) model refers to the extension of the original ER model

buck
Download Presentation

CS263 Lec. 4: Enhanced E-R Models

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS263 Lec. 4: Enhanced E-R Models Basic ER model introduced in the mid 1970s Since then business relationships and business data have become more complex Enhanced Entity Relationship (EER) model refers to the extension of the original ER model Supertype/Subtype Relationships - model a general entity type (supertype) then subdivide it into several specialised entity types (subtypes) Each subtype inherits from its supertype and may have special attributes of its own

  2. Representing supertypes and subtypes • Supertype connected with a line to a circle, in turn connected by a line to each subtype (see Fig.) • ‘U’ shaped symbol on each line connecting a subtype to the circle indicates subtype is a subset of supertype, the direction of the relationship • Attributes shared by all entities associated with the supertype, whilst attributes that are unique to a particular subtype are associated with that subtype

  3. Basic notation for supertype/subtype relationships

  4. E.G. Organisation has 3 basic types of employees: Hourly: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate Salaried: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary, Stock_Option Contract consultants: Employee_Number, Employee_Name, Address, Date_Hired, Contract_Number, Billing_Rate Many attributes same across 3 types - could define supertype called EMPLOYEE, subtypes for HOURLY_EMPLOYEE, SALARIED_EMPLOYEE and CONSULTANT (see Fig)

  5. Employee supertype with three subtypes All employee subtypes will have emp_no., name, address, and date-hired Each employee subtype will also have its own attributes

  6. When to use supertype/subtype relations Should consider using subtypes when either (or both) of the following conditions are present: 1. There are attributes that apply to some (but not all) of the instances of an entity type 2. The instances of a subtype participate in a relationship unique to that subtype Both are true in the following Fig., where PATIENT has two subtypes: OUTPATIENT and RESIDENT PATIENT (the primary key is PATIENT_ID) All patients have an Admit_Date and a Patient_Name

  7. Patient example Every patient is cared for by a RESPONSIBLE_PHYSICIAN who develops a treatment plan for the patient Each subtype also has unique attributes. Outpatients have a Checkback_Date, whilst residents have a Date_Discharged and a unique relationship that assigns each patient to a bed (this is a mandatory relationship, and each bed may or may not be assigned to a patient)

  8. Supertype/subtype relationships of patients Both outpatients and resident patients are cared for by a responsible physician Only resident patients are assigned to a bed

  9. Generalization = process of defining more general entity type from set of more specialized entity types (BOTTOM-UP) • Specialization = process of defining one or more subtypes of supertype, and forming supertype/subtype relationships (TOP-DOWN) • Figs. Show Generalisation where we have 3 different entity types: CAR, TRUCK and MOTORCYCLE • Second Fig. The more general entity type VEHICLE is shown • MOTORCYCLE not shown as does not satisfy conditions for a subtype – only attributes of MOTORCYCLE are those common to all vehicles - no attributes specific to motorcycles. Further, MOTORCYCLE does not have a relationship to another entity type

  10. Example of generalization Three entity types: CAR, TRUCK, and MOTORCYCLE All these types of vehicles have common attributes

  11. Generalization to VEHICLE supertype So we put the shared attributes in a supertype Note: no subtype for motorcycle, since it has no unique attributes

  12. Specialisation: entity type PART has identifier Part_No and attributes Description, Unit_price, Location, Qty_On_Hand, Routing_Number and Supplier Some parts internally Manufactured Parts, others externally Purchased Parts (some obtained from both) Some attributes apply to all parts, others such as Routing_Number depend on the source as apply only to Manufactured Parts - suggests PART should be specialised into subtypes MANUFACTURED_PART and PURCHASED_PART Relationship ‘Supplies’ created between PURCHASED_PART and SUPPLIER - allows users to easily associate purchased parts with their suppliers Attribute Unit_Price associated with the relationship ‘Supplies’ so price may vary between suppliers

  13. Example of specialization Entity type PART Applies only to purchased parts Only applies to manufactured parts

  14. Specialization to MANUFACTURED PART and PURCHASED PART Note: multivalued attribute was replaced by a relationship to another entity Created 2 subtypes

  15. Completeness Constraints: Whether an instance of a supertype must also be a member of at least one subtype. Has two possible rules: • 1:Total Specialization Rule: Yes (double line) In following Fig. A PATIENT must either be an OUTPATIENT or a RESIDENT PATIENT. Total specialisation is indicated by the double line extending from the PATIENT identity type to the circle • 2: Partial Specialization Rule: No (single line) An entity instance of the supertype is allowed not to belong to any subtype. • In the following Fig. If a VEHICLE is a car it will appear as an instance of CAR, and if a truck as an instance of TRUCK. • However, if the vehicle is a motorcycle it cannot appear as an instance of any subtype. This example of partial specialisation is specified by the single line from the VEHICLE supertype to the circle

  16. Examples of completeness constraints A patient must be either an outpatient or a resident patient Total specialization rule

  17. Partial specialization rule A vehicle could be a car, a truck, or neither

  18. Disjointness Constraints: Whether supertype may simultaneously be member of two (or more) subtypes. 2 possibilities • Disjoint Rule: Supertype can be only ONE of subtypes. Fig. shows PATIENT must be either an outpatient or a resident patient but not be both – specified by the letter ‘d’ • Subclass of patient may change over time, but at any given time patient is of only one type • Overlap Rule: Supertype can simultaneously be member of more than one subtype. Some PARTs both Manufactured and Purchased. • Considering Part Number 4000, at given time may be 250 of this part, of which 100 manufactured and 150 purchased. Overlap specified by placing an ‘o’ in the circle • Double line suggests any part must be either a purchased part or a manufactured part, or may simultaneously be both

  19. Examples of disjointness constraints A patient can either be outpatient or resident, but not both Disjoint rule

  20. A part may be both purchased and manufactured Overlap rule

  21. Subtype Discriminators: Attribute supertype whose values determine the target subtype(s) • Disjoint subtypes: simple attribute with alternative values indicating possible subtypes. In the Fig. new attribute ‘Employee_Type’ added to supertype to serve as subtype discriminator. 3 values ‘H’ = Hourly, ‘S’ = Salaried, ‘C’ = Consultant - assigned correct value when new employee added. • Expression “Employee_Type = “ (the LHS of a condition statement) placed next to line leading from supertype to circle, with the value of the attribute that selects the appropriate subtype placed adjacent to line leading to that subtype

  22. A simple attribute with different possible values indicating the subtype Subtype discriminator (disjoint rule)

  23. Subtype discriminators • Overlapping – composite attribute whose subparts pertain to different subtypes. Each subpart contains boolean value indicating whether or not instance belongs to associated subtype. • In the Fig. a new attribute Part_Type added to PART - composite attribute with components ‘Manufactured?’ and ‘Purchased?’ Each of attributes is a boolean variable, can be combined as: Manufactures only = YN, Purchased only = NY, Purchased and manufactured = YY

  24. A composite attribute with sub-attributes indicating “yes” or “no” to determine whether it is of each subtype Subtype discriminator (overlap rule)

  25. Defining supertype/subtype hierarchies: possible for any of the subtypes to have other subtypes defined on it In University most general entity type would be PERSON (root) - attributes are SSN (Social Security Number – Identifier), Name, Address, Gender and Date_Of_Birth Next we define all major subtypes of the root, here there are 3, EMPLOYEE, STUDENT and ALUMNUS (already graduated) Person may belong to more than one subtype (such as ALUMNUS and EMPLOYEE) so overlap rule is used. Overlap allows for any overlap (3-way) so if certain combinations are not allowed a more refined supertype/subtype hierarchy would have to be developed to eliminate these

  26. The next step is to evaluate whether any of the subtypes already defined qualify for further specialisation. Here employee has two subtypes, FACULTY and STAFF Because there may be types of employee other than Faculty and Staff, the partial specialisation rule is indicated. However, such an employee cannot be both Faculty and Staff at the same time, so the disjoint rule is indicated in the circle Two subtypes are defined for student: GRADUATE_STUDENT and UNDERGRADUATE STUDENT Notice that total specialisation and the disjoint rule are specified

  27. Example of supertype/subtype hierarchy

  28. Entity clustering: EER diagrams difficult to read when there are too many entities and relationships • Entity cluster: set of one or more entity types and relationships grouped into a single abstract entity type • Entity-clustering - hierarchical decomposition of a macro-level view of the data model into finer and finer views, eventually resulting in the full, detailed data model. • First of Figs. shows completed data model with dashed lines drawn around possible entity clusters, second shows • final result of transforming this into an EER diagram of only entity clusters and relationships. Formed: • 1. By abstracting a supertype and its subtypes (see CUSTOMER) • 2. By combining directly related entity types and their relationships (SELLING_UNIT, ITEM, MATERIAL, MANUFACTURING)

  29. Possible entity clusters for Pine Valley Furniture (PVF) Related groups of entities could become clusters

  30. EER diagram of PVF entity clusters More readable, isn’t it?

  31. SQL: The SELECT Statement: Is used for queries on single or multiple tables. It has the following clauses: • SELECT - Lists the columns (and expressions involving columns) from base tables or views to be projected into the table that will be the result of the command • FROM - Identifies the table(s) or view(s) from which columns will be chosen to appear in the result table, and includes the tables or views needed to join tables to process the query • WHERE - Includes the conditions for row selection within a single table or view, and the conditions between tables or views for joining • The first two are required, and the third is necessary when only certain table rows are to be retrieved, or multiple tables are to be joined • Following Fig. Shows how we can display product name and quantity from the PRODUCT view for all products costing less than $275

  32. SELECT Example • Find products with standard price less than $275 • SELECT PRODUCT_NAME, STANDARD_PRICE • FROM PRODUCT_V • WHERE STANDARD_PRICE < 275 Comparison Operators in SQL

  33. Distinct and * If the user does not want to see duplicate rows in the result, SELECT DISTINCT can be used, so SELECT DISTINCT PRODUCT_NAME would display a results table without duplicate rows SELECT * (where * is a wildcard to indicate all columns) displays all columns from all the tables or views in the FROM clause

  34. Using expressions These are mathematical manipulations of the data in the table Some are stored functions, such as SUM or AVG e.g. what is the average standard price for each product in inventory (to be put in the result AVERAGE)? SELECT AVG (STANDARD_PRICE, AS AVERAGE FROM PRODUCT_V;

  35. Functions such as COUNT, MIN, MAX, SUM and AVG of specified columns in the column list of a SELECT command may be used to specify that the resulting answer table is to contain aggregated data instead of row-level data. • Using any of these aggregate functions will give a one-row answer. e.g. using the COUNT aggregate function to find totals, asking how many different items were ordered on order number 1004 • SELECT COUNT(*) FROM ORDER_LINE_V • WHERE ORDER_ID = 1004; Note: with aggregate functions you can’t have single-valued columns included in the SELECT clause

  36. Using functions It is easy to confuse the functions COUNT(*) and COUNT COUNT(*) counts all rows selected by a query, regardless of whether any of the rows contain null values COUNT tallies only those rows that contain a value, it ignores null values SUM and AVG can only be used with numeric columns COUNT, COUNT(*), MIN and MAX can be used with any data type. e.g. using MIN on a text column will find the lowest value in the column - the one which is closest to the beginning of the alphabet

  37. Wildcards can also be used in the WHERE clause if an exact match is not possible Here the keyword LIKE is paired with wildcard characters and usually a string containing the characters that are known to be the desired matches The wildcard character ‘%’ is used to represent any collection of characters e.g. using LIKE ‘%DESK’ when searching PRODUCT_DESCRIPTION will find all the different kinds of desks The underscore ,_, is used to represent exactly once character So using LIKE ‘_-drawer’ when searching PRODUCT_NAME will find any products with specified drawers, such as ‘3-drawer’, ‘5-drawer’ etc.

  38. Using Boolean operators • AND, OR, and NOT Operators for customizing conditions in WHERE clause. NOT is evaluated first, then AND, then OR The following query lists product name, finish and unit price for all desks and all tables that cost more than $300 in the PRODUCT view: • SELECT PRODUCT_DESCRIPTION, PRODUCT_FINISH, STANDARD_PRICE • FROM PRODUCT_V • WHERE (PRODUCT_DESCRIPTION LIKE ‘%Desk’ • OR PRODUCT_DESCRIPTION LIKE ‘%Table’) • AND UNIT_PRICE > 300;

  39. In and not in lists To match a list of values, use IN e.g. list all customers who live in the warmer American states: SELECT CUSTOMER_NAME, CITY, STATE FROM CUSTOMER_V WHERE STATE IN (‘FL’, ’TX’, ‘CA’, ‘HI’);

  40. Sorting results: ORDER BY- sorts the final result rows in ascending or descending order • e.g. list customer, city and state for all customers in the CUSTOMER view whose address is in Florida, Texas, California or Hawaii. List the customers alphabetically by state, and alphabetically by customer within each state: i.e. sort the results first by STATE, and within a state by CUSTOMER_NAME: • SELECT CUSTOMER_NAME, CITY, STATE • FROM CUSTOMER_V • WHERE STATE IN (‘FL’, ‘TX’, ‘CA’, ‘HI’) • ORDERBY STATE, CUSTOMER_NAME;

  41. Sorting Results: Group by: groups rows in intermediate results table, values in those rows same for one or more columns • Useful paired with aggregates such as SUM or COUNT • Divides table into subsets (by groups). Then an aggregate function can be used to provide summary information • Scalar aggregate: a single value returned from SQL query using an aggregate function • Vector aggregate: multiple values returned from SQL query with aggregate function (via GROUP BY) • e.g. count number of customers with addresses in each state SELECT STATE, COUNT(STATE) FROM CUSTOMER_V GROUP BY STATE;

  42. GROUP BY It is also possible to nest groups within groups, the same logic is used as when sorting multiple items e.g. count the number of customers with addresses in each city to which we deliver. List the cities by state: SELECT STATE, CITY, COUNT(CITY) FROM CUSTOMER_V GROUP BY STATE, CITY; In general, each column referenced in the SELECT statement must be referenced in the GROUP BY clause, unless the column is an argument for an aggregate function included in the SELECT clause

  43. HAVING • The HAVING clause acts like a WHERE clause, but it identifies groups that meet a criterion, rather than rows. Therefore we usually see a HAVING clause following a GROUP BY • e.g. find only states with more than one customer: SELECT STATE, COUNT(STATE) FROM CUSTOMER_V GROUP BY STATE HAVING COUNT(STATE) > 1; Using WHERE here would not work, because WHERE does not allow aggregates - also WHERE qualifies rows, whereas HAVING qualifies groups

  44. Using all 6 clauses e.g. list the product finish and average standard price for each finish for selected finishes where the average standard price is less than $750: SELECT PRODUCT_FINISH, AVG(STANDARD_PRICE) FROM PRODUCT_V WHERE PRODUCT_FINISH IN(‘Cherry’, ‘Natural Ash’, ‘Natural Maple’, ‘White Ash’) GROUP BY PRODUCT_FINISH HAVING AVG(STANDARD_PRICE) < 750 ORDER BY PRODUCT_FINISH;

  45. Using all 6 clauses The following Fig. Shows the order in which SQL processes the clauses of a statement Arrows indicate the paths that may or may not be followed Only the SELECT and FROM clauses are mandatory The processing order is different from the syntax order As each clause is processed an intermediate results table is produced that will be used for the next clause (users do not see the intermediate tables, only the final results)

  46. SQL statement processing order

More Related