540 likes | 835 Views
Relational database design. Lecture 1- 2 Relational model revisited Analysis patterns Design aspects of the transformation conceptual schema – relational database schema . Maria Bergholtz SU/KTH. Relational database design. Design- specify form and structure.
E N D
Relational database design Lecture 1- 2 Relational model revisited Analysis patterns Design aspects of the transformation conceptual schema – relational database schema Maria Bergholtz SU/KTH
Relational database design Design- specify form and structure physical design logical design
Physical design A phonebook contains a data set, where each element is defined by name, address and telephone number. The catalogue contains a number of A4-pages where each page contains three fields. Each data element spans one or several lines. The data elements are sorted on names ascending. Describes how data is represented, not what data means.
Logical design A telephone catalogue for a certain area contains the following types of information: For all addresses in the area that have a telephone registered; names of one or several persons are listed together with address information (street, postal code and postal area) and telephone number (numbers). Describes what data means, not how it is represented.
Relational databases A relational database is a database that is viewed by a user as a collection of tables – regardless of how the information in the database is physically stored.
Relational schema (aka table definition) e Domain e.g. the domain of attribute BUDGET is an integer larger than zero. Grade – number of attributes
Relation (aka ’the rows’ (defined on the relational schema)) Project PRNR START READY PLACE BUDGET Prnr1 940201 960131 Goteborg 90 Relation Tuple Prnr2 930126 951231 Falun 45 (row) Prnr3 951115 970630 Stockholm 220 Prnr4 920101 960701 Lund 550 - Cardinality Number of tuples (rows) Cell
Properties of relations The values in each cell are atomic The values in each column (attribute) have the same data type Each row (tuple) is unique The order between the attributes are/are NOT significant The order of the rows are/are NOT significant The name of each attribute (column) is unique within a relation
Relational database design Physical design Logical design Logical data model Secondary storage structures for relational databases Transformation from logical data model into relational schema (synthetic data- base design) Optimization of relational Normalization databases (Analytical database design)
Models Model: A mapping from certain parts of some part of reality onto agreed upon structures. Model example – a geographical MAP Models simplify Models distort – Greenland vs Africa Models focus – topographical, political, …
Modeling continued…. Validation of data as early in systems development as possible! The system shall not have to control that data is not correct it shall exclude/reduce the possibility that it is not correct!
Graphical notation for conceptual schemas: UML class schemas (parts of…) association (relation) Class MONKEY 1..* BANANA 0..1 eats Name : String Cardinality constraint (for the attribute association))
Multiplicity – an exercise has_mother PERSON has_child
Data abstractions: Inheritance Animal Mammals Birds Carnivores Penguins Herbivores Rodents Hoofed animals Dachshunds Gnus Rabbits
Conceptual model of an hierarchical structure: ANIMAL isa isa MAMM- BIRDS ALS isa isa isa CARNI- VORES HERBI- PENGUINS VORES isa isa isa HOOFED ANIMALS DACHS- HOUNDS RODENTS isa isa RABBIT GNUS
Inheritance PERSON ANIMAL Name: String Name: String 1..1 UNIQUE 1..1 UNIQUE isa isa isa isa MAN DOG WOMAN CAT Tax: Integer Military service: 1..1 String: 1..1 DOG and CAT are mutually exclusive MAN och WOMAN are mutually but not exhaustive with respect to exclusive and exhaustive with regard ANIMAL to PERSON A subset-hierarchy contains sub- and super types. The subtypes form a subset of the super type. If the subtypes exhaust the super type they are called exhaustive. If one and the same instance may not belong to more than one subtype then the subtypes are said to be mutually independent with respect to the super type.
Custno: Integer Accountno: Integer 1..1 UNIQUE 1..1 UNIQUE Name : String 1..1 Established: Lone_amount : Integer String 1..1 1..1 Inheritance Customer Account has 1 1..* Person Bank account Salary: Integer 1..1 1..* Company Mortgage employees 0..1 What would this schema look like if ’isa-relationships’ had not been used?
0..* 0..1 Account Customer has Custno: Integer Accountno: Integer 1..1 UNIQUE 1..1 1 1..* Name : String 1..1 Saldo: Integer 0..1 Established: Salary: String 0..1 Integer 0..1
Reification cured ILLNESS ILLNESS 1 0..* 1..* CURE cures Procent: Float 1..1 0..* 1..* TREATMENT TREAT- 1 MENT curer Association “cures” is M:M. If it is necessary to define properties regarding association “cures” the association must be reified, i.e. turned into a class. Any remaining M:M associations may be left as, they are in the conceptual schema.
Recursive structures is reified into SPECIFICATION PRODUCT Procent : Float 1..1 0..* 0..* 0..* 0..* consists_ part_of of consists_of 1 1 1 PRODUCT A special case of reification!
SPECIES titel SpeciesName: String 1..1 UNIQUE Mean weight: Float 1..1 1 of_type 0..* ANIMAL Weight: Float 1..1 Name: String 1..1 Template-Copy structures (power types) Certain objects are regarded as templates for other objects, the ’copies’. A template (aka power type) describes fundamental features that span all copies even if they are not direct properties of the copies. Copies, on the other hand, have properties that are not relevant for the template. Compies are often concrete while templates often are abstract. SPECIES is a template that hold category-information (for instance mean-weight) about the copies, the ANIMALs. ANIMAL have individial properties like weight, name etc.
PARTY PROPOSED_ACTION IMPLEMENTED_ACTION ACTION 0..* 0..* 1..1 0..1 Location Time ref. COMPLETED_ACTION ABANDONED_ACTION 1..1 0..1 Analysis patterns: Actions
1..1 1..1 BOOKING DELIVERY RETURN CANCEL_BOOKING 1..1 0..1 0..* Return time. 0..* Actions: example plus combination with otherpatterns and abstractions PRODUCT TYPE PRODUCT CAR TYPE Colours: 1..* Type: 1..1 if_type CAR Regno 1..1. Colour: 1..1 1..1 0..* PARTY PERSON 1..1 ABANDONED_ACTION 0..* ACTION CAR RENTAL 1..1 0..1 Location Time ref. Penalty PLANNED_ACTION IMPLEMENTED_ACTION Alternative? COMPLETED_ACTION
SPECIAL RA Analysis patterns: Resource allocation [Fowler97] 1 RESOURCE TYPE RESOURCE ALLOCATION Quantity 0..* GENERAL RA CONSUMABLE TYPE ASSET TYPE Price_per_hour Price_per_unit 1 1 1 1 0..* 0..* 0..* 0..* TEMPORAL RESOURCE CONSUMABLE HOLDING 0..* ASSET Number_of_units From Id 1 To 0..* 0..* 1 1
Combination of several patterns and abstractions available resource 1 1 <<Resource Type>> <<Resource Allocation>> RESOURCE TYPE RESOURCE ALLOCATION Type_Name : String 1..1 Quantity : Integer 1..1 0..* 0..* <<Special RA>> <<General RA>> <<Consumable Type>> <<Asset Type>> BOOKING RESOURCE NEED CONSUMABLE RESOURCE ASSET TYPE 0..* 0..* TYPE Price_per_hour : 1..1 Int Price_per_unit : 1..1 Real 0..* 0..* 1 1 1 1 of_type of_type 0..* 0..* 0..* 0..* <<Temporal Resource>> <<Asset>> <<Consumable>> <<Holding>> TIME BOOKING 1 1 0..* 0..* ASSET CONSUMPTION CONSUMABLE uses From : 1..1 RESOURCE Id : 1..1 To : 1..1 No_of_units : 1..1 0..* 0..* 1 1 consumes <<Party>> <<Implemented Action>> 1 1 <<Action Type>> AUDIANCE LECTURE 1 1 1..* 1..* LECTURE TYPE Comapany : 1..1 From : 1..1 0..* 0..* 1 1 Name_of_type : 1..1 1 1 Adress : 1..1 To : 1..1 1 1 <<Completed Action>> RESULT 0..1 0..1 Invoice_date : 1..1 Invoice_payed : 0..1
Exercise reification CLUB PERSON is_member_of Club_name: String Name: String 1 0 UNIQUE, TOTAL UNIQUE, TOTAL Extend the conceptual schema so that it is possible to represent that a certain PERSON entered a certain club at a certain occasion!
Logical database design Transformation from conceptual model into relational database schema ( Synthetical database design )
Rules and events in a RDBMS CONCEPTUAL SCHEMA RELATIONAL MODEL CONCEPT CONCEPT Single valued attribute Domain definition and/or key definition Total attribute "NOT NULL" Derivation rule View Consistency rule Key column definition or domain- definition or trigger or control code in application code or … Event Transaction
Single class – single valued attributes PERSON Personno: String 1..1 UNIQUE Name: String 1..1 Person Key attribute Personno Name shadowed. 610234-7411 Blad Per 561224-5689 Löw Eva 231003-4433 Gren Ove
Single class - multi valued attribute PERSON Personno: String 1..1 UNIQUE Name: String 1..1 Title: String 0..* Person Title Personno Name Personno Title 610234-7411 Blad Per 561224-5689 Civ.ing. 561224-5689 Löw Eva 231003-4433 Fil.dr 231003-4433 Gren Ove 231003-4433 Civ.ing
Keys? Person Title Personno Name Personnr Title 610234-7411 Blad Per 561224-5689 Civ.ing. 561224-5689 Löw Eva 231003-4433 Fil.dr 231003-4433 Gren Ove 231003-4433 Tekn.dr A key is a number of attributes (could be one) that uniquely identifies a row. The on/Those attribute (-s) chosen as identifier is called the primary key. Other possible identifiers are, after the choice of primary key, called alternative keys. A foreign key is one/several attributes in a table that refers to the primary key of another table. “Personno” in table TITLE is part of the primary key for that table but also serves as a foreign key towards table PERSON. Foreign key attributes/columns are the glue that relates tables to each other. All values of a foreign key must correspond to the values of The primary key of another table, or else be NULL. The two last terms Is generally referred to as referential integrity.
Entity integrity No part of the primary key may ever be equal to NULL (the primary key identifies a row and must hence always be present). Referential integrity All values of columns that serve as foreign key columns against another table must correspond to existing primary key values for this table. Or else the values of the foreign key columns must be NULL. This means, among other things, that the foreign key columns must have the same domain as the primary key columns that they are referring to.
Partial associations (as well as attributes) give rise to NULL value problems. If the risk of NULL values is not ‘modeled away’, it is preferable to take care of (= remove) the risk of NULL values at the transformation from conceptual schema into relational data- base schema. Partial associations and attributes are avoided if ‘isa’- relationships are used in modeling. Transformation cont.
NULL What does NULL mean? Value missing? Value exists but is not known? (now...) Value not valid in a certain context? (compare to “template - copies”)
NULL NULL may cause problems in joins. EMPLOYEES DEPARTMENT Pnr Anst.nr Adress Dept.no Departm. Dept.no 11111 1A Byv. 3 3 Research 3 22222 3B Solsv. 6 5 Sales 5 33333 1AA Byv. 5 3 Admin. 1 44444 2B Byv.7 1 55555 1X Solv. 7 NULL 66666 3Y Byv. 11 NULL A join of EMPLOYEES with DEPARTMETN will discard the two last employees. Depending on the circumstances this may be valid or not. Pnr Anst.nr Adress Dept.no Departm. Dept.no 11111 1A Byv. 3 3 Research 3 22222 3B Solsv. 6 5 Sales. 5 33333 1AA Byv. 5 3 Research 3 44444 2B Byv.7 1 Admin 1
The referring attribute is placed so that it gets as few NULL- values as possible Here it is assumed that there exists more PERSONs without CARs than vice versa. CAR PERSON Personno: String 1..1 UNIQUE Regno: String 1..1 UNIQUE owns 0..1 0..1 Type: String 1..1 Name: String 1..1 Person Car Personno Name Regno owned_by Type 610234-7411 Blad Per AIB 436 610234-7411 Volvo 561224-5689 Löw Eva BPL 845 Fiat 561224-5689 231003-4433 Gren Ove FJK 359 NULL Saab
No risk of NULL values. May result in a ineffective solution due to slow joins. If NULL is totally unacceptable: OBS! “owned_by” could equally well have been chosen as primary key! (since we have a 1:1- relation- ship between CAR and PERSON) Owner- ship Person Car Car / Person Car Person owned_by Regno Regno Type Personno Name AIB 436 Volvo 610234-7411 AIB 436 610234-7411 Blad Per BPL 845 Fiat 561224-5689 BPL 845 561224-5689 Löw Eva 231003-4433 Gren Ove
Associations where the cardinality is * in one of the roles Main rule: Foreign key on the ’many’- side!
CAR PERSON Personno: String 1..1 UNIQUE Regno: String 1..1 UNIQUE 0..1 owns 0..* Type: String 1..1 Name: String 1..1 Person Car Personno Name Regno owned_by Type 610234-7411 Blad Per AIB 436 610234-7411 Volvo 561224-5689 Löw Eva BPL 845 610234-7411 Fiat 231003-4433 Gren Ove FJK 359 NULL Saab
M:M- associations must be resolved with a relational table (compare to reification). Positive side effect: NULL values are eliminated. In a conceptual model M:M associations may be present in their own right (as long as the association does not possess properties of its own). In the transformation to relational database schema, each M:M associations is turned into a new table that contains the respective identifiers from each of the classes (tables) that where connected via the association. M:M (“many to many”, “*” : “*”) must be resolved
CAR PERSON Personno: String 1..1 UNIQUE Regno: String 1..1 UNIQUE owns 0..* 0..* Type: String 1..1 Name: String 1..1 0 0 OWNER- 1 1 CAR PERSON SHIP Person Car / Person Car Owned_by Regno Regno Type Personno Name 610234-7411 AIB 436 AIB 436 Volvo 610234-7411 Blad Per 561224-5689 AIB 436 BPL 845 Fiat 561224-5689 Löw Eva 561224-5689 BPL 845 FJK 359 Saab 231003-4433 Gren Ove 561224-5689 JTL 739 JTL 739 BMW
Transforming isa-relationships PERSON Personno :String 1..1 UNIQUE Name: String 1..1 CONSULTANT EMPLOYEE Department: String 1..1 Project: String 1..1 Employee Person Consultant Personno Name Personno Dept. Personno Project 610234-7411 Dam 610234-7411 Blad Per 231003-4433 PR 561224-5689 Herr 561224-5689 Löw Eva 231003-4433 Gren Ove
Complex identifiers COURSE TYPE COURSE of Coursecode: String 1..1 Responsible: String 1..1 0..* 1..1 UNIQUE Start: String 1..1 Name: String 1..1 End: String 1..1 What identifies COURSE TYPE? Coursecode! (UNIQUE and total) No one of the attributes of COURSE is unique and total. Even if we combine all attributes it is not sure that this identifies a COURSE ( a teacher may be responsible for several courses that begin and end the same dates). If we, in turn, choose Start, End and Coursecode (i.e. COURSE:s foreign key against COURSE TYPE), we get an identifier!
COURSE TYPE COURSE of Coursecode: String 1..1 Responsible: String 1..1 0..* 1..1 UNIQUE Start: String 1..1 Name: String 1..1 End: String 1..1 COURSE TYPE Name Coursecode COURSE Coursecode Start End Responsible
Roomno is not enough to identify table ROOM (= it is not possible to choose Roomno as primary key)! Exercise: Transformation of complex identifiers HOTEL PERSON ROOM 1..1 0..* 1..1 0..* Hname: String 1..1 Personno: String 1..1 Rumno: String lives_in UNIQUE part_of UNIK 1..1 Type: String 1..1 Name: String 1..1 ROOM PERSON HOTEL Personno Name Rumno HName Rumno HName HName Typ 650321 Anna 2A Astoria 2A Astoria Astoria 5* 111111 Lisa 3B Ritz 3B Ritz Ritz 3* 222222 Eva 4S Astoria 4S Astoria Plaza 5* 2A Plaza 444444 Pelle 2A Plaza
Surrogate keys User identified keys may be insufficient in a number of ways, e.g. : 1. They may be subject to change. A departments number/name may change if/when a company is reorganized. 2. Different user attributes may be used to identify the same entity/class/table E.g. CUSTOMERS CUSTOMER PERSON may be homonyms, i.e. refer to the. same concept but under different names – schema integration problems. Surrogate keys are system generated unique and total attributes that shall serve as identifiers.
Surrogate keys USERS MAY NOT BE AWARE THAT SURROGATE KEYS ARE USED, SINCE THESE MAY BE USED SYSTEM INTERNALLY ONLY. THERE IS STILL NEED FOR IDENTIFICATION OF ‘REAL LIFE OBJECTS’ VIA USER DEFINED ATTRIBUTES, HENCE EVEN IF THE PRIMARY KEY IS A SURROGATE, ALTERNATIVE KEYS FORMED OF USER DEFINED ATTRIBUTES ARE NEEDED TO BE ANALYSED AND DEFINED! INTERNALLY THE SURROGATE WILL BE USED AS SINGLE VALUED PRIMARY KEYS AND ALSO IN REFERENCES AS FOREIGN KEYS.
HOTEL HName Typ Astoria 5* Ritz 3* Plaza 5* Surrogate keys cont. HOTEL PERSON ROOM 1..1 0..* 1..1 0..* Hname: String 1..1 Personno: String 1..1 Roomno: String 1..1 bor_på UNIQUE ingår_i UNIQUE RoomID: Int 1..1 Type: String 1..1 Namn: String 1..1 UNIQUE! RUM PERSON Personno Name RoomID Rumno HName RoomID 650321 Anna 55 2A Astoria 55 111111 Lisa 55 3B Ritz 33 222222 Eva 67 4S Astoria 67 2A Plaza 16 444444 Pelle 16
Information loss? Exercise: Transform the following two conceptual schemas into two relational database schemas: PERSON PERSON Personno: String 1..1 UNIQUE Personno: String 1..1 UNIQUE 1..1 married_to ISA 1..1 EMPLOYEE EMPLOYEE Empno: String 1..1 UNIQUE Empno: String 1..1 UNIQUE