210 likes | 242 Views
Learn the theory of relational math and the advantages of using a relational database model which simplifies data organization, reduces duplicate data, and ensures data integrity. Explore definitions, key attributes, and normalization in this comprehensive guide.
E N D
Slides4 The Relational Database Model:
The Relational Database Model • Based on the theory of relational math (set theory) • It is an “automatic transmission” database (with embedded relationships between tables) which replaces the “standard transmission” database (which employs flat-file techniques with explicit pointers between files and records) • Flat-files (collections of similar records) are being replaced by collections of interrelated files • Allows data to be broken down into logical, smaller, more manageable units - simplifies the organization of complex sets of data
Why A Relational Model? • Duplicate data reduced - less input, maintenance, storage, and improved data integrity • Data independence: Data can be thought of as being stored in tables regardless of how physically stored. • Application independence: Databases defined independently from the systems and programs that will use them - allows users to create ad hoc queries, rather than only receive pre-specified reports • A change in the database does not require rewriting all the application program codes. Ability to share same data across multiple applications and systems. • It has the ability to maintain several tables of related information that can be accessed by several different users in many different ways - a single query can retrieve data from more than one table.
Some Definitions... Data: Raw facts about the organization and its business transactions that are of interest to the end user Database: A computer structure that houses a collection of data Relational database: Stores information about instances ofentities (a specific sales event, salesperson), attributes of those entities (invoice no., salesperson ID) , and the relationships among these entities (each sale can only have one salesperson) - perceived by user to be a collection of two-dimensional tables RDBMS: Software that manages a relational database, controls access, and allows users to retrieve requested data through a standard data-access language, SQL.
Entity-type: Something of significance about which you want to store data in a database, e.g., customers, employees, suppliers, inventory items (note: this is a data modeling term – an entity becomes a table in a RDBMS) Table: An entity-type (e.g., customer) and its attributes Attribute: A property or characteristic of an entity. A column in a relational database table, e.g., customer name, reference #, address, zip ((note: this is a data modeling term – an attribute becomes a column in a RDBMS Row (tuple, record): A record of data in a database table - a single occurrence or entity instance Value: Data in a “cell” – the intersection between row and column in a database table
Types of Attributes Key (identifier in data modeling): Attribute, or combination of attributes, that determines the values of other attributes in each row Composite Key: Multiple-attribute keys; may be further subdivided, e.g., phone may be area code and number - can be a primary key Candidate Key (CK): Attribute (or a minimum combination of attributes) that uniquely identifies each row in a given table - there can be more than one CK (employee entity type: SSN; assigned ID#) Primary Key (PK) ( a unique identifier in data modeling): A CK selected to uniquely identify all other attributes in a given row; cannot be Null Foreign Key (FK): ( a relationship in data modeling): Attribute (combination of attributes) whose value(s) must match the Primary Key in another table in the same database, or whose value(s) must be Null Non-key Attribute: Attribute that is not part of a key
Attributes With A Null Value • Null Value: An unknown attribute value (e.g., salesperson not yet allocated to a customer) - it is not a zero. It is an optional attribute. • Inclusion of nulls in a table is important - they provide a consistent way to distinguish between valid data such as a 0 and missing data, e.g., an account payable with 0 is good to see; one with an unknown balance can indicate a significant problem • In most cases, nulls appear as blanks on a query’s result table on a screen
Relationships • Data modeling term that indicates an association between tables: How the things of significance are related (A FK must match to an existing PK, or else be NULL) • This controlled redundancy allows linking of tables (hence “relational”) • Entity-Relationship Diagram (ERD): A data model (at the conceptual level) that shows the relationships enforcing business rules between entities (tables) in a database environment (Fig. 5.4)
Business Rules • Narrative descriptions of policies, procedures, or principles in an organization • Examples: • A pilot cannot be on duty for more than 10 hours in a 24-hour period • A professor must teach at least three classes in a semester • A class may not have fewer than 10 enrollments
Normalization • Process of taking a “raw” database and breaking it into logical units called tables, by following theoretical rules called normal forms • The intent is to create a degree of controlled redundancy that allows two or more tables to be joined, by matching a FK in one table to a PK in another table • Referential integrity (constraint created upon table creation) is enforced when every non-null FK value must match an existing PK value (if there is a FK, there has to be a PK for that FK in another table) • Normalization has six nested normal forms • Generally a well-formed business database will be normalized through 3rd normal form (3NF)
Benefits of Normalization • Greater overall database organization • Minimize data redundancies • Data consistency within the database • A more flexible database design • Data can be used more productively • A better handle on database security Disadvantage of Normalization • Reduced database performance because database must locate requested tables and join data - requires additional processing logic
Normal Forms • Normalization through a series of stages called NORMAL FORMS • Each NF depends on normalization steps taken in the previous NF • First Normal Form - 1NF • Second Normal Form - 2NF • Third Normal Form - 3NF
1NF • First normal form rules: • All key attributes must be defined • There must be no repeating groups (values), i.e., each row/column intersection can have only one value • All attributes must be functionally dependent on the PK, or part of the PK - e.g., SSN determines DOB, but DOB cannot determine SSN Hint: Put all attributes in a two-dimensional flat table, with no repeating values
General Journal Entry:Traditional View - Unnormalized Assume that the transaction # will reset to 1 at the beginning of the next fiscal year
2NF • Second Normal Form Rules: • Table is in 1NF; and • Table includes no partial dependencies; that is, no attribute is dependent on only portion of the primary key – must be dependent on entire PK Hint: Examine non-key attributes to determine whether any are dependent on only portion of a composite PK - this would violate 2NF If a table only has one attribute as a PK, then it is in 2NF.
3NF • Third Normal Form Rules: • Table is in 2NF and • There are no transitive dependencies Hint: You will violate 3NF if you can deduce the value of a non-key attribute by knowing the value of another non-key attribute