280 likes | 430 Views
Relational Data Model. Data Model Structures and access techniques provided by a particular database management system (DBMS) are called its data model DBMS became popular from 1970 to 1980 Each model had its advantages and its disadvantages
E N D
Data Model • Structures and access techniques provided by a particular database management system (DBMS) are called its data model • DBMS became popular from 1970 to 1980 • Each model had its advantages and its disadvantages • In many ways the relational data model represented an attempt to simplify the earlier data models
File Management Systems • Before the existence of DBMS, the data were stored in separate files. • There was no link from one file to another • If the structure of the data changed (ex: adding more fields), programs that were using the file had to change • Problems became more severe when the number of the programs using the files increased over time
EmpId EmpName Address City ………………. EmpId EmpName Address City ………………. EmpId EmpName Address City ………………. EmpId HoursWorked Wage ………………. • This is an example of payroll application using a file management System. (FD is the file descriptor in the picture) Employee Update Program FD Employee Master File Employee Report Program FD Check-writing Program FD FD Timecard File
Hierarchical Database • One of the most important application for the earliest database management system was production of manufacturing companies such as car companies • Car companies wanted to produce 10000 units of one model, 5000 of another model and so on. • They needed a database to collect all of this information and required tools to retrieve the information as fast as possible. • Information had to be organized in a form of parent and child structure: • a car is composed of a motor, body , etc. • Similarly each subpart were composed of other subparts and so on. Ex: an engine has spark plugs, wires, etc..
Car1 Car3 Car2 Body Body Body Engine Engine Engine Transmission Transmission Transmission …... …... …... right Door Left Door Hood Roof right Door Roof Left Door Left Door right Door Roof Hood Hood …… …… …… Handle Handle Handle Lock Window Window Window Lock Lock ….. ….. ….. ….. ….. ….. ….. ….. …. ….
Hierarchical model uses upside down trees. • A tree represents parent/child relationships • For example, a car consists of body, engine, transmission, etc. • The root of the tree and all of its associated children would make one record (specifying a particular item: i.e a car) • Pointers were used to link a parent to its children or a child to another child • In general, to access the data, a program would • Find a particular car • Move down to the first child • Move up to its parent • Move sideways to the next child • Retrieving the data in a hierarchical database required navigating through the records, moving up, down, and sideways one record at the time • The most popular hierarchical database was Information Management System (IMS) introduced in 1968
Advantages of Hierarchical database: • Simple Structure: • Organization of IMS database was easy to understand • Parent/Child Organization: • IMS was a good model for representing parent/child relationship (A is part of B) or (A is owned by B) • Performance: • IMS stored parent/child relationships as physical pointers from one data record to another • This would speed up the movement from one record to another • IMS is still widely used. It is still considered as a good DBMS in places with lots of transactions processing (ex: banks)
Network Databases • Hierarchical database could not answer the demand of some business oriented environment. • For example, in an order processing company, a single order might participate in more than one parent/child relationship. • For instance, a particular order should be linked to • The customer who placed it • The sales person who took it • The product ordered • This could not be done by IMS • To deal with these situations, network data model was developed: children could have more than one parent
Customers Products Salespeople • . • . • . • . • . • . • . • . • . Car Car Car Car Car Car Acme Mfg Size 4 Widget Bill Adams • . • . • . Car Car #112963 Orders • Example of parent/child relationship in network database models
In 1971, the conference on the systems languages published an official standard for network databases which became known as CODASYL model • A programmers would access the network database as follows: • Find a specific parent record by key (ex: customer number) • Move down to the first child in a particular set (the first order placed by this customer • Move sideways from one child to the next in the set (the next order placed by this customer) • Move up from a child to its parent in another set ( the salesperson who took the order)
Advantages of Network database • Flexibility: • Multiple parent/child relationships allowed a network database to represent data that did not have a simple hierarchical structure • Standardization: • The CODASYL standard boosted the popularity of the network model, and minicomputer vendors such as Digital Equipment Corporation and Data General implemented network Databases • Performance: • Despite their greater complexity, network databases boasted performance approaching that of hierarchical databases.
Disadvantages of Hierarchical and Network models • They have rigid structure: • The structure of the records had to be known in advance. • Changing the database structure required rebuilding the entire database • Querying the database was not always easy. Retrieving simple information form the database could cause programmer to write lots of code • Some of this code was quite complicated
Relational Data Model • Disadvantages of hierarchical and network models led to creation of Relational Data Model by Dr. Codd in 1970 • In this course, we teach Relational data model and use Structured Query Language (SQL) used to manipulate the data in the database. • Definition: • A relational database is a database where all data visible to users is organized strictly as tables of data values and where all database operations work on these tables • In this model information is stored in a database as simple row/column tables of data • Next slide shows an example of tables in a relational database
Orders OrderNum OrderDate Cust Rep MFR Product QTY Amount 112961 17-DEC-89 2117 106 REI 2A44L 7 31500 113012 11-JAN-90 2111 105 ACI 41003 35 3745 112989 03-JAN-90 2101 106 FEA 114 6 1458 113051 10-FEB-90 2118 108 QSA K47 4 1420 112968 12-OCT-89 2102 101 ACI 41004 34 3978 113036 30-JAN-90 2107 110 ACI 4100Z 9 22500 113045 02-FEB-90 2112 108 REI 2A44R 10 45000 112963 17-DEC-89 2103 105 ACI 41004 28 3276 113013 14-JAN-90 2118 108 BIC 41003 1 652 113058 23-FEB-90 2108 109 FEA 112 10 1480 112997 08-JAN-90 2124 107 BIC 41003 1 652 112983 27-DEC-89 2103 105 ACI 41004 6 702 113024 20-JAN-90 2114 108 QSA XK47 20 7100 113062 24-FEB-90 2124 107 FEA 114 10 2430 112979 12-OCT-89 2114 102 ACI 4100Z 6 15000 …… SalesReps Customers • Salesrep Name Age RepOffice Title HireDate Manager Quota Sales • 105 Bill Adams 37 13 Sales Rep 12-FEB-88 104 350000 367911 • 109 Mary Jones 31 11 Sales Rep 12-OCT-89 106 300000 392725 • 102 Sue Smith 48 21 Sales Rep 10-DEC-86 108 350000 474050 • 106 Sam Clark 52 11 VP Sales 14-JUN-88 275000 299912 • 104 Bob Smith 33 12 Sales Mgr 19-MAY-87 106 200000 142594 • 101 Dan Roberts 45 12 Sales Rep 20-OCT-86 104 300000 305673 • 110 Tom Synder 41 Sales Rep 13-JAN-90 101 75985 • 108 Larry Fitch 62 21 Sales Mgr 12-OCT-89 106 350000 361865 • 103 Paul Cruz 29 12 Sales Rep 01-MAR-87 104 275000 286775 • Nacy Angelli 49 22 Sales Rep 14-NOV-88 108 300000 186042 • …. • …. CustNum Company CustRep Credit_limit 2111 JCP Inc. 103 50000 2102 First Corp. 101 65000 2103 Acme Mfg. 105 50000 2123 Carter and Sons 102 40000 2107 Ace International 110 35000 2115 Smithson Corp. 101 20000 2101 Jones Mfg. 106 65000 2112 Zetacorp 108 50000 2121 QMA Assoc. 103 45000 2114 Orion Corp. 102 20000 …… Products MfrId ProductId Description Price QtyOnHand REI 2A45C RATCHET LINK 79 210 ACI 4100Y WIDGET REMOVER 2750 25 QSA XK47 REDUCER 355 38 BIC 41672 PLATE 180 0 IMM 779C 900-LB BRACE 1875 9 ACI 41003 SIZE 3 WIDGET 107 207 ACI 41004 SIZE 4 WIDGET 117 139 BIC 41003 HANDLE 652 3 IMM 887P BRACE PIN 250 24 QSA XK48 REDUCER 134 203 REI 2A44L LEFT HINGE 4500 12 …. Offices Office City Region Mgr Target Sales 22 Denver Western 108 300000 186042 11 New York Eastern 106 575000 692637 12 Chicago Eastern 104 800000 735042 13 Atlanta Eastern 105 350000 367911 21 Los Angeles Western 108 725000 835915
Table • Definition: • A table is a rectangular object with rows and columns • For example in the office table: • Each row of the office table represents a single physical entity • Each column of the offices table represents one item of data that is stored in the database for each office: • Ex: City column represents the location of the office • An alternative term for column is attribute • Each row of the table contains exactly one data value in each column
In each column of a table, all of the data values in that column have the same type. For example: • City column values are words • Sales values are money type • Mgr values are integer • Each column in a table has a column name which is written as a heading at the top of the column • Column names must be unique in a table • The columns of a table have a left-right order. That is defined when the table is first created. • The order of the column has no effect when any action is done against the table
Each table must have at least one column • Almost all commercial DBMS products impose maximum of 255 columns per table • A table can have zero or more rows • A table with zero rows is called an empty table • Order of the rows is not important in a table. • Most relational DBMSs either do not impose any limit on the number of rows or their limit is a very large number • A common limit is approximately 2 billion rows
Primary Key • Definition: • A primary key is a column or combination of two or more columns that uniquely identifies each row of a table. • Since the order of rows in a table is irrelevant, rows cannot be identified based on their positions in a table • Ex: row 1, row 2, row 20 • In a well-designed relational database each table has a primary key. • If the primary key contains two or more columns, it is called a composite primary key
Example of primary key • Consider the Offices table Office City Region Mgr Target Sales 22 Denver Western 108 300000 186042 11 New York Eastern 106 575000 692637 12 Chicago Eastern 104 800000 735042 13 Atlanta Eastern 105 350000 367911 21 Los Angeles Western 108 725000 835915 • “Office” column (attribute) can be a good choice for the primary key because each office has a different office id • However, city is not a good choice because more than one office may be located in the same city.
Consider the Products table MfrId ProductId Description Price QtyOnHand REI 2A45C RATCHET LINK 79 210 ACI 4100Y WIDGET REMOVER 27.50 25 QSA XK47 REDUCER 355 38 BIC 41672 PLATE 180 0 IMM 779C 900-LB BRACE 1875 9 ACI 41003 SIZE 3 WIDGET 107 207 ACI 41004 SIZE 4 WIDGET 117 139 BIC 41003 HANDLE 652 3 IMM 887P BRACE PIN 250 24 QSA XK48 REDUCER 134 203 REI 2A44L LEFT HINGE 4500 12 …. • What is a good primary key for this table?
In his case, MrfId by itself, is not a good choice to be a primary key because more than one manufacturer may produce more than one product • Further, ProductId by itself is not a good choice either because the same product can be produced by more than one manufacturer. • However, combination of both is unique in every row. • This is an example of composite primary key. • A table with a primary key is called a relation. A relation is a table in which no duplicate rows can exist.
SalesReps Salesrep Name Age RepOffice Title HireDate Manager Quota Sales 105 Bill Adams 37 13 Sales Rep 12-FEB-88 104 350000 367911 109 Mary Jones 31 11 Sales Rep 12-OCT-89 106 300000 392725 102 Sue Smith 48 21 Sales Rep 10-DEC-86 108 350000 474050 106 Sam Clark 52 11 VP Sales 14-JUN-88 275000 299912 …. …. Offices Office City Regin Mgr Target Sales 22 Denver Western 108 300000 186042 11 New York Eastern 106 575000 692637 12 Chicago Eastern 104 800000 735042 13 Atlanta Eastern 105 350000 367911 21 Los Angeles Western 108 725000 835915 • Relationship • In contrast with the older data models (hierarchical and network models) relational data models do not include explicit parent/child pointer connection • Then how does the parent/child model is represented in the relational data model? • Consider the following two tables
SalesReps Salesrep Name Age RepOffice Title HireDate Manager Quota Sales 105 Bill Adams 37 13 Sales Rep 12-FEB-88 104 350000 367911 109 Mary Jones 31 11 Sales Rep 12-OCT-89 106 300000 392725 102 Sue Smith 48 21 Sales Rep 10-DEC-86 108 350000 474050 106 Sam Clark 52 11 VP Sales 14-JUN-88 275000 299912 …. …. Offices Office City Regin Mgr Target Sales 22 Denver Western 108 300000 186042 11 New York Eastern 106 575000 692637 12 Chicago Eastern 104 800000 735042 13 Atlanta Eastern 105 350000 367911 21 Los Angeles Western 108 725000 835915 • The parent is the offices table • The child is the salesreps table because the salesreps works in an office • Relationships are created by having the same data in two or more tables
Note that the RepOffice column salesreps table contains the office number of the sales office where each sales person works • The values of the RepOffice column is the set of office numbers found in the office column of the offices table • We will see how this restriction is imposed when we discuss about creating tables later in the course • For example, it is possible to find the sales office where “Mary Jones” is working by finding the value of Mary Jones RepOffice (11) and finding the corresponding row offices table • So, the parent/child relationship between two tables A and B is not represented by explicit pointers but by common data values stored in the two tables • Programmers must specify this relationship when they create the tables
SalesReps Salesrep Name Age RepOffice Title HireDate Manager Quota Sales 105 Bill Adams 37 13 Sales Rep 12-FEB-88 104 350000 367911 109 Mary Jones 31 11 Sales Rep 12-OCT-89 106 300000 392725 102 Sue Smith 48 21 Sales Rep 10-DEC-86 108 350000 474050 106 Sam Clark 52 11 VP Sales 14-JUN-88 275000 299912 …. …. Offices Office City Regin Mgr Target Sales 22 Denver Western 108 300000 186042 11 New York Eastern 106 575000 692637 12 Chicago Eastern 104 800000 735042 13 Atlanta Eastern 105 350000 367911 21 Los Angeles Western 108 725000 835915 • Foreign Key • Definition: • Foreign key is a column (or combination or two or more columns) whose value matches the primary key of another table • Together, primary key and the foreign key make the parent/child relationship in relational data models Primary key Foreign key
Orders OrderNum OrderDate Cust Rep MFR Product QTY Amount 11296 17-DEC-89 2117 106 REI 2A44L 7 31500 113012 11-JAN-90 2111 105 ACI 41003 35 3745 112989 03-JAN-90 2101 106 FEA 114 6 1458 113051 10-FEB-90 2118 108 QSA K47 4 1420 …. SalesReps Salesrep Name 105 Bill Adams 109 Mary Jones 102 Sue Smith 106 Sam Clark …. Customers CustNum 2111 ….. 2102 …. 2103 …. ………. ……. Products MfrId ProductId REI 2A45C ….. ACI 4100Y ….. QSA XK47 …… …. ….. Example of Foreign Key