880 likes | 889 Views
Explore relational model concepts, formal definitions, and schema in database management systems. Learn key attributes and domains for effective data organization.
E N D
Chapters 5-8Relational Data Models, Relational Constraints, and Relational Algebra • Flat file: A two dimensional array of attributes or data items ProductX 1 Bellaire 5 ProductY 2 Sugarland 5 ProductZ 3 Houston 5 Computerization 10 Stafford 4 Reorganization 20 Houston 1 Newbenefits 30 Stafford 4 • Database Management Systems (DBMS): A generalized software system that is used to create, manage, and protect data bases Chapters 5-8
Attribute: A name characteristic or property of an entity = column header Entity: A “thing” in the real world with an independent existence physical existence: person, student, car Chapters 5-8
Domain - The valid set of atomic value for an attribute in a relation e.g. SSN set of 9 digits GPA: 0<= GPA <= 4.0 Atomic - each value in the domain is indivisible Name (Fname, Minit, Lname) – not atomic Fname -- atomic Minit -- atomic Lname -- atomic Chapters 5-8
Relational Model Concepts • A Relation is a mathematical concept based on the ideas of sets • The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in the following paper: • "A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970 • The above paper caused a major revolution in the field of database management and earned Dr. Codd the coveted ACM Turing Award Chapters 5-8
Informal Definitions • Informally, a relation looks like a table of values. • A relation typically contains a set of rows. • The data elements in each row represent certain facts that correspond to a real-world entity or relationship • In the formal model, rows are called tuples • Each column has a column header that gives an indication of the meaning of the data items in that column • In the formal model, the column header is called an attribute name (or just attribute) Chapters 5-8
Formal Definitions - Schema • The Schema (or description) of a Relation: • Denoted by R(A1, A2, .....An) • R is the name of the relation • The attributes of the relation are A1, A2, ..., An • Example: CUSTOMER (Cust-id, Cust-name, Address, Phone#) • CUSTOMER is the relation name • Defined over the four attributes: Cust-id, Cust-name, Address, Phone# • Each attribute has a domain or a set of valid values. • For example, the domain of Cust-id is 6 digit numbers. Chapters 5-8
Formal Definitions - Tuple • A tuple is an ordered set of values (enclosed in angled brackets ‘< … >’) • Each value is derived from an appropriate domain. • A row in the CUSTOMER relation is a 4-tuple and would consist of four values, for example: • <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"> • This is called a 4-tuple as it has 4 values • A tuple (row) in the CUSTOMER relation. • A relation is a set of such tuples (rows) Chapters 5-8
Formal Definitions - Domain • A domain has a logical definition: • Example: “USA_phone_numbers” are the set of 10 digit phone numbers valid in the U.S. • A domain also has a data-type or a format defined for it. • The USA_phone_numbers may have a format: (ddd)ddd-dddd where each d is a decimal digit. • Dates have various formats such as year, month, date formatted as yyyy-mm-dd, or as dd mm,yyyy etc. • The attribute name designates the role played by a domain in a relation: • Used to interpret the meaning of the data elements corresponding to that attribute • Example: The domain Date may be used to define two attributes named “Invoice-date” and “Payment-date” with different meanings Chapters 5-8
Formal Definitions - State • The relation state is a subset of the Cartesian product of the domains of its attributes • each domain contains the set of all possible values the attribute can take. • Example: attribute Cust-name is defined over the domain of character strings of maximum length 25 • dom(Cust-name) is varchar(25) • The role these strings play in the CUSTOMER relation is that of the name of a customer. Chapters 5-8
Formal Definitions - Summary • Formally, • Given R(A1, A2, .........., An) • r(R) dom (A1) X dom (A2) X ....X dom(An) • R(A1, A2, …, An) is the schema of the relation • R is the name of the relation • A1, A2, …, An are the attributes of the relation • r(R): a specific state (or "value" or “population”) of relation R – this is a set of tuples (rows) • r(R) = {t1, t2, …, tn} where each ti is an n-tuple • ti = <v1, v2, …, vn> where each vjelement-ofdom(Aj) Chapters 5-8
Formal Definitions - Example • Let R(A1, A2) be a relation schema: • Let dom(A1) = {0,1} • Let dom(A2) = {a,b,c} • Then: dom(A1) X dom(A2) is all possible combinations: {<0,a> , <0,b> , <0,c>, <1,a>, <1,b>, <1,c> } • The relation state r(R) dom(A1) X dom(A2) • For example: r(R) could be {<0,a> , <0,b> , <1,c> } • this is one possible state (or “population” or “extension”) r of the relation R, defined over A1 and A2. • It has three 2-tuples: <0,a> , <0,b> , <1,c> Chapters 5-8
Definition Summary Chapters 5-8
Super key: an attribute or a set of attributes that identifies an entity uniquely (may not be minimal set) SSN SSN, NAME SSN, NAME, MAJOR Chapters 5-8
Candidate key: a super key such that no proper subset of its attributes is itself a super key. So candidate keys must have a minimal identifier. STUID SSNPrimary key: the candidate key that is chosenOR the candidate key that is used to identify tuples in a relation -- unique, must existAlternate key:A candidate key in a relation that is not selectede.g. if primary key is SSN then STUID is a alternate key Chapters 5-8
Concatenated (composite) key: A primary key that is comprised of two or more attributes or data items G RADE_REPORT(STUID, COURSE#, GRADE) Chapters 5-8
Foreign key: A non-key attribute in one relation that appears as the primary key (or part of the key) in another relationEMPLOYEE(SSN, FNAME, MINIT, DNO) DEPARTMENT(DNUMBER, DNAME, MANAGER) Chapters 5-8
Secondary key: a field that can have duplicate values, and that can be used as search path by the users Chapters 5-8
Referential Integrity Constraints for COMPANY database Chapters 5-8
Relational Algebra Overview • Relational algebra is the basic set of operations for the relational model • These operations enable a user to specify basic retrieval requests (or queries) • The result of an operation is a new relation, which may have been formed from one or more input relations • This property makes the algebra “closed” (all objects in relational algebra are relations) Chapters 5-8
Relational Algebra Overview (continued) • The algebra operations thus produce new relations • These can be further manipulated using operations of the same algebra • A sequence of relational algebra operations forms a relational algebra expression • The result of a relational algebra expression is also a relation that represents the result of a database query (or retrieval request) Chapters 5-8
Relational Algebra Overview • Relational Algebra consists of several groups of operations • Unary Relational Operations • SELECT (symbol: (sigma)) • PROJECT (symbol: (pi)) • Relational Algebra Operations From Set Theory • UNION ( ), INTERSECTION ( ), DIFFERENCE (or MINUS, – ) • CARTESIAN PRODUCT ( x ) • Binary Relational Operations • JOIN (several variations of JOIN exist) • DIVISION • Additional Relational Operations • OUTER JOINS, OUTER UNION • AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX) Chapters 5-8
Unary Relational Operations: SELECT • The SELECT operation (denoted by (sigma)) is used to select a subset of the tuples from a relation based on a selection condition. • The selection condition acts as a filter • Keeps only those tuples that satisfy the qualifying condition • Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out) • Examples: • Select the EMPLOYEE tuples whose department number is 4: DNO = 4 (EMPLOYEE) • Select the employee tuples whose salary is greater than $30,000: SALARY > 30,000 (EMPLOYEE) Chapters 5-8
Unary Relational Operations: SELECT • In general, the select operation is denoted by <selection condition>(R) where • the symbol (sigma) is used to denote the select operator • the selection condition is a Boolean (conditional) expression specified on the attributes of relation R • tuples that make the condition true are selected • appear in the result of the operation • tuples that make the condition false are filtered out • discarded from the result of the operation Chapters 5-8
Unary Relational Operations: SELECT (contd.) • SELECT Operation Properties • The SELECT operation <selection condition>(R) produces a relation S that has the same schema (same attributes) as R • SELECT is commutative: • <condition1>(< condition2> (R)) = <condition2> (< condition1> (R)) • Because of commutativity property, a cascade (sequence) of SELECT operations may be applied in any order: • <cond1>(<cond2> (<cond3> (R)) = <cond2> (<cond3> (<cond1> ( R))) • A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions: • <cond1>(< cond2> (<cond3>(R)) = <cond1> AND < cond2> AND < cond3>(R))) • The number of tuples in the result of a SELECT is less than (or equal to) the number of tuples in the input relation R Chapters 5-8
Select Works on single table and takes rows that meet a specified condition, copy them into a new table • (Table name) Condition(s) SQL (Structured Query language) SELECT * FROM (table name) WHERE condition 1 AND condition 2 AND condition 3… Chapters 5-8
Condition(s) Chapters 5-8 Table
Find employees who work for department number 5. employee DNO = 5 SQL: SELECT * FROM employee WHERE dno = 5; Chapters 5-8
Query tree DNO=5 Chapters 5-8 Employee
s(DNO=4 ANDSALARY>25000) OR (DNO=5 ANDSALARY>30000)(EMPLOYEE) --------------------------------------------------------- s<cond1>(s<cond2>(R)) = s<cond2>(s<cond1>(R)) s<cond1>(s<cond2>(. . .(s<condn> (R)) . . .)) = s<cond1> AND <cond2> AND . . . AND <condn>(R) Chapters 5-8
Project Operates on a single table, produces a vertical subset of the table, extract the values of specified columns eliminate duplicate rows place the value in a new table (table name) column1, column2, column3, … Chapters 5-8
SQL: SELECT column1, column2, column3, … FROM (table name) Chapters 5-8
column(s) Chapters 5-8 Table
E.g. Show the names of all employees employee fname, minit, lname SELECT fname, minit, lname FROM employee; Chapters 5-8
fname,minit,lname Chapters 5-8 Employee
Select & project Show the names of all employees who work for department number 5 ( employee) fname, minit, lname dno = 5 SELECT fname, minit, lname FROM employee WHERE dno = 5; Chapters 5-8
fname,minit,lname DNO = 5 Chapters 5-8 Employee
Examples of applying SELECT and PROJECT operations Chapters 5-8
PRODUCT(or Cartesian product) R1 x R2 R1 X R2 is a table where width is the width of R1 plus the width of R2 and whose columns are the columns of R1 followed by the columns of R2 If R1 has X rows and M columns R2 has Y rows and N columns R1 X R2 = X * Y rows and M + N columns Chapters 5-8
Cartesian Product Chapters 5-8
Query Tree for Cartesian Product X Chapters 5-8 Table1 Table2
Example of Query Tree Chapters 5-8
Theta Join The result of performing a SELECT operation using a comparison operator theta (=,<, <=, >, <=, <>) on the product Chapters 5-8
Theta Join (>) Chapters 5-8
Theta Join (ID>STUID) Chapters 5-8