410 likes | 429 Views
Entity Relationship Approach. Top-down approach to data modeling Uses diagrams Normalization - confirms technical soundness Entity Relationship - reflects business requirements. Using Diagrams. Define a procedure for representing existing models
E N D
Entity Relationship Approach • Top-down approach to data modeling • Uses diagrams • Normalization - confirms technical soundness • Entity Relationship - reflects business requirements
Using Diagrams • Define a procedure for representing existing models • Develop diagrams directly from business requirements • Business-oriented terminology based on... • Entities (things of interest to business) • Relationships between entities
Hospital Example • Graphic 2-13 • Format does not encourage a quick appreciation of the main concepts and rules • e.g. Each operation can be performed by only one surgeon but this is not immediately apparent • This is a very simple model, as we progress to more complex models the problem of presentation becomes more serious
Data Model Diagram • Sometimes called data structure diagram, they are based on two symbols • A "box" represents a table (a rounded rectangle is usually used) • A line drawn between two boxes represents a foreign key pointing back to the table that has the primary key
Existing Tables • Simply draw a box for every table in the model with the name of it inside • Graphic 3-2
Existing Relationships • Draw a line between the two boxes and indicate the direction of the link by putting a "crows foot" at the foreign key end • Think of the crows foot as an arrow pointing back to the table with the primary key • Graphic 3-3
Interpreting the Diagram • The model specifies a Surgeon table (data about surgeons) • The model specifies an Operation table (data about operations) • Each Operation can be associated with only one surgeon • Each Surgeon can be associated with many Operations
Summarizing Relationships • This represents the relationship between tables (implied by the primary and foreign keys) without having to list any column names at all
Asking Questions • We could now ask the business specialist • Is it true that each operation is performed by one surgeon only? • Can we count on this in the future? • We can make changes now (in the model) while the cost is still low • Question: If more than one Surgeon is allowed, how would we handle it?
Another Solution • We could choose to track only the surgeon who managed the operation • Record decisions like this in the diagram • Avoids the question being revisited • Specifies precisely what data will be held • Cannot answer question - "In how many operations did surgeon number 12 at hospital number 18 participate?"
Annotated Relationship • Graphic 3-4 • As well as annotating the diagram, we should change the column name from Surgeon Number to Managing Surgeon Number
Optionality • What if there are no surgeons required for an operation? • e.g. a small cut with antibiotics given • Operation table may have a NULL value for Surgeon Number • To show the difference between optional or mandatory relationships use the convention shown in Graphic 3-5
Exercise • Draw the relationships for the rest of the model
First Pass • Graphic 3-6 • Checking the diagram will often reveal unsound assumptions and misunderstandings • Conversely, it may increase confidence in the model for both user and designer
Operation and Operation Type • Are we sure that each operation can be of only one type? • How would we represent a gall bladder removal and appendectomy operation? • 1. Allow only "simple" operations - need a new table for repeating group • 2. Allow complex operation types such as "Combined Gall Bladder removal and Appendectomy"
Both will Work • Option 2 will be easier to implement • Option 1 will be more elegant • Can ask "list all operations that involved appendectomies"
Redundant Lines • Note the Lines connecting Hospital, Operation, and Surgeon tables • line from Hospital to Surgeon • line from Surgeon to Operation • line from Operation to Hospital • Does this line add anything to our knowledge of the business rules?
Rules for Removing Lines • Graphic 3-7 • If A derives B and B derives C than an A derives C connecting line is not needed • If B is optional, than we cannot remove the A derives C line • If A derives C provides different information than A derives B we can not remove A derives C
Exercise • Remove the remaining Redundant linesfrom Graphic 3-7
Solution • Graphic 3-8
Top-Down Approach • Why Normalize at all? Why not just use this Top-down E-R approach? • In practice this is what is most often done, with Normalization being a final check • Using E-R first... • we can start with "What data do we need to keep information about • no need to start with a single overly complex table
Terminology • The Relational Models (used with Normalization) were built on three basic concepts: tables, columns, and keys • The E-R Models: Entities, Attributes, and Relationships • Easier to say, "The relationship between a hospital and surgeon", then "the existence of the primary key of Hospital as a foreign key in the Surgeon table"
Entities • An entity is the "real world" class of things that a table represents • entities: St. John's Hospital • entity type: Hospital • In practice: Entity means Entity type and Entity Instance (row) for Entity
Entities - continued • Some things will need to be represented by more than one entities • e.g. Invoices would be represented by two entities: Invoice Header and Invoice Item • e.g. Quarterly Profit would be derived from sales and expense figures from other entities
Multiple "Things" as One Entity • Example... • Preferred Customer • Corporate Customer • Becomes a Customer Type with a Customers Table
Entity Naming • The name of an entity must be in the singular. (Exactly the opposite of what it says in the SQL book!) • e.g. Account instead of Accounts • Three reasons: • Consistency - standards • Communication - an 'entity is something we need to keep information about • Compatibility with Relationship Name - (we'll look at this later)
Relationships • The lines between the boxes are the relationships between the entities • We name relationships in both directions • "Each company may issue one or more shares" • "Each share must be issued by one company" • Graphic 3-9 show notation were using • Graphic 3-10 shows alternative diagramming notations
Exercise • Draw the E-R model that would represent the relationship between a Manager and Departments within his company. A Manager may be in charge of more than one department
Solution and More Examples • Graphic 3-11 • Graphic 3-12
Suggested Diagramming Tips • Orient your diagram so that the Crows feet are nearer to the bottom of the page • Place Crow's feet on the right • Eliminate crossing lines where possible (but clarity is most important • Duplicate Entities to avoid long relationship lines • Can use a dotted-line to connect the same entity
Many-to-Many Relationships • Many-to-Many Relationships are frequently modeled • Graphic 3-13 • How would we implement the relationship using foreign keys?
We Can't • We can't hold the key to Qualification in the Employee table because an Employee could have several qualifications • Like wise the Qualification table would need to support multiple Employee Keys • A Normalized model cannot represent many-to-many relationships with foreign keys, yet such relationships certainly exist
Representing Many-to-Many • Can't use foreign keys • Can use a table
Normalized Many-to-Many • Graphic 3-16 • Whenever we encounter a many-to-many relationship between two entities we represent it with a new entity linked to the two original entities. • The Primary keys of each original table become together become the Primary Key of the new entity ("the resolution entity")
Choice of Representation • Conceptual vs. Physical Model • Conversion is not totally mechanical • New Non-key attributes in the resolution entity? • Different Name for the resolution entity? • e.g. instead of Employee-Qualification table we might use "Certifications" table
One-to-One Relationships • Should not automatically combine the entities into a single entity • When to split • distinct real-world concepts (e.g. Person-Passport) • Separating attribute groups (e.g. Detail vs. default) • Transferables (e.g. Part type -stored in-Bin)
Self-Referencing Relationships • Graphic 3-18 • Each employee may manage one or more employees • Each employee may be managed by one employee • carry the foreign key in the same table as the Primary key • e.g. Manager Id -> Employee Id
Attributes • Sometimes show a few attributes to clarify meaning (e.g. Primary, Foreign keys) • Don't show all the attributes, because we want the big picture • Keep in separate lists for each entity
More Normalization • If in the process of listing attributes, we find repeating groups or lookup tables • Normalize the design • Update the E-R model
Summary • Data models can be presented diagrammatically by using a box to represent each table and a line for each foreign key relationship • This provides a language for "top-down" data models; prior to developing attributes • Many-to-Many relationships are resolved with a "resolution entity"
Last Slide - Entity Relationship • Assignment #10 due next week