2. Relational Model

2.Relational Model

Attribute Types • Each attribute of a relation has a name • The set of allowed values for each attribute is called the domain of the attribute • Attribute values are (normally) required to be atomic; that is, indivisible • E.g. the value of an attribute can be an account number, but cannot be a set of account numbers • Domain is said to be atomic if all its members are atomic • The special value null is a member of every domain • The null value causes complications in the definition of many operations

Relation Schema • Formally, given domains D1, D2, …. Dn a relation r is a subset of D1 x D2 x … x DnThus, a relation is a set of n-tuples (a1, a2, …, an) where each ai Di • Schema of a relation consists of • attribute definitions • name • type/domain • integrity constraints

Database schema • The database schema , it is a logical design of the database, and the database instance, which is a snapshot of the database at a given instant in time. • Lower case is written for relation and upper case is written for relation schema. • We use Account_schema to denote the relation schema for relation account. • Account_schema=(account_number,branch_name,balance) • We denote the face that account is a relation on Account_schema • account(Account_schema)

Relation Instance • The current values (relation instance) of a relation are specified by a table • An element t of r is a tuple, represented by a row in a table • Order of tuples is irrelevant (tuples may be stored in an arbitrary order) attributes (or columns) customer_name customer_street customer_city Jones Smith Curry Lindsay Main North North Park Harrison Rye Rye Pittsfield tuples (or rows) customer

Example of the relation instance • Consider the branch relation and the schema of that relation is: • Branch_schema(Branch_name, Branch_city,assests)

Why Split Information Across Relations? • Storing all information as a single relation such as bank(account_number, balance, customer_name, ..)results in • repetition of information • e.g.,if two customers own an account (What gets repeated?) • the need for null values • e.g., to represent a customer without an account • Normalization theory (Chapter) deals with how to design relational schemas

Keys • Let K  R • K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R) • by “possible r ” we mean a relation r that could exist in the enterprise we are modeling. • Example: {customer_name, customer_street} and {customer_name} are both superkeys of Customer, if no two customers can possibly have the same name • In real life, an attribute such as customer_id would be used instead of customer_name to uniquely identify customers, but we omit it to keep our examples small, and instead assume customer names are unique.

Keys (Cont.) • K is a candidate key if K is minimalExample: {customer_name} is a candidate key for Customer, since it is a superkey and no subset of it is a superkey. • Primary key: a candidate key chosen as the principal means of identifying tuples within a relation • Should choose an attribute whose value never, or very rarely, changes. • E.g. email address is unique, but may change

Super Keys : Super key stands for superset of a key.A Super Key is a set of one or more attributes that are taken collectively and can identify all other attributes uniquely.Candidate KeysCandidate Keys are super keys for which no proper subset is a super key. In other words candidate keys are minimal super keys.

Primary Key:It is a candidate key that is chosen by the database designer to identify entities with in an entity set. Primary key is the minimal super keys. In the ER diagram primary key is represented by underlining the primary key attribute. Ideally a primary key is composed of only a single attribute. But it is possible to have a primary key composed of more than one attribute.Composite KeyComposite key consists of more than one attributes.

Example: Consider a Relation or Table R1. Let A,B,C,D,E are the attributes of this relation. R(A,B,C,D,E)A→BCDE This means the attribute 'A' uniquely determines the other attributes B,C,D,E.BC→ADE This means the attributes 'BC' jointly determines all the other attributes A,D,E in the relation. Primary Key :ACandidate Keys :A, BCSuper Keys : A,BC,ABC,AD ABC,AD are not Candidate Keys since both are not minimal super keys.

Schema diagram

Query Languages • Language in which user requests information from the database. • Categories of languages • Procedural • Non-procedural, or declarative • “Pure” languages: • Relational algebra • Tuple relational calculus • Domain relational calculus • Pure languages form underlying basis of query languages that people use.

Many to many relationships • In general, need a new table Enrolled(cid, grade, studid) Studid is foreign key that references sid in Student table Student Foreign key Enrolled

Relational Algebra • Collection of operators for specifying queries • Query describes step-by-step procedure for computing answer (i.e., operational) • Each operator accepts one or two relations as input and returns a relation as output • Relational algebra expression composed of multiple operators

Relational Algebra Summary • Algebras are useful to manipulate data types (relations in this case) • Set-oriented • Brings some clarity to what needs to be done • Opportunities for optimization • May have different expressions that do same thing • We will see examples of algebras for other types of data in this course

Select,project & rename are unary operations as they operate on 1 relation.

Relational Algebra • Basic operations: • Selection ( ) return rows that meet some condition. • Projection ( ) selects columns from relation. • Cross-product( ) Allows us to combine two relations. • Set-difference ( ) Tuples in reln. 1, but not in reln. 2. • Union( ) Tuples in reln. 1 and in reln. 2. • Additional operations: • Set Intersection, natural join, division, assignment

The select,project,and rename are called unary operations because they operate on one relation.

select operation () Select operation select the tuples from the table with satisfy given condition or predicate condition(Relation) select <attributes> from <table name> where <condition>; select * from customer where id=1234;

Selection Select students with gpa higher than 3.3 from S1: σgpa>3.3(S1) S1

The select operation The select operation selects tuples that satisfy given predicate.It is denoted by (σ). • Thus , to select those tuples of the loan relation where the branch is”Perryridge”we write, • σ branch_name=“Perryridge”(loan)

We can find all tuples in which the amount lent is more than $1200 by writing • σ amount>1200(loan) • In general we allow comparisions using <,<=,>,>= in the selection predicate,further we can combine several predicates by using the connections and(^ ),or(v) and not(-) • σbranch_name=“perryridge” ^amount>1200(loan)

project operation () Selects the particular attributes or columns from the table attributes (Relation) select <attributes> from <table name>; select id, name from customer;

Projection Project name and gpa of all students in S1: name, gpa(S1) S1

Projection operation • Suppose we want to list out all loan numbers and the amount of the loans,but donot care about the branch name. the project operations allows us to produce this relation. Since a relation is a set , any duplicate rows are eliminated. • Projection is denoted by the upper case π

Write a query to list all loan numbers and the amount of loan asπloan_number,amount(loan)

Composition of Relational Operations Composition provide the information of the table with specified columns with satisfy conditions or predictions id,name (account>1000 (customer)) select <attributes> from <table name> where <predictions>; select id,name from customer where account>1000;

Combine Selection and Projection • Project name and gpa of students in S1 with gpa higher than 3.3: name,gpa(σgpa>3.3(S1))

Composition of relational operations • “Find those customers who live in ahmedabad” • Πcustomer_name (σ customer_city=“ahmedabad”(customer))

Set Operations • These are binary operations; that is, each is applied to two sets. • Binary operations form mathematical set theory: UNION: R1UR2, INTERSECTION: R1nR2, SET DIFFERENCE: R1-R2, CARTESIAN PRODUCT: R1XR2. Rename:  • When these operations are adapted to relational databases, the two relations on which any of the above three operations are applied must have the same type of tuples; this condition is called Union Compatible if they have the same degree n. • This means that the two relations have the same number of attributes and that each pair of corresponding attributes have the same domain.

UNION: • The result of this operation, denoted by R  S is a relation that includes all tuples that are either in R or in S or in both R and S. Duplicate tuples are eliminated. INTERSECTION: • The result of this operation, denoted by R n S, is a relation that includes all tuples that are in both R and S. SET DIFFERENCE: • The result of this operation denoted by R – S, is a relation that includes all tuples that are in R but not in S. CARTESIAN PRODUCT: • It is also known as CROSS PRODUCT or CROSS JOIN denoted by X, which is also a binary set operation, but the relations on which it is applied do not have to be union compatible. This operation is used to combine tuples from two relations in a combinatorial fashion.

Union Operation () Union of the two entity set id,name (borrower)  id,name (depositor) (select id,name from borrower) union (select id,name from depositor); Option: union all Conditions: • The relations r and s must be of the same arity • The domains of the ith attribute of r and the ith attribute of s must be the same for all i

Union Operation – Example A B A B • Relations r, s:    1 2 1   2 3 s r A B     1 2 1 3 • r  s:

Set Intersection Operation () Intersection of the two entity set id,name (borrower)  id,name (depositor) (select id,name from borrower) intersection (select id,name from depositor); Option: intersection all Conditions: • The relations r and s must be of the same arity • The domains of the ith attribute of r and the ith attribute of s must be the same for all i

Example: Intersection S1 S2 S1  S2 =

Intersect Operation – Example Relations r, s: A B A B    1 2 1   2 3 s r A B  2 • r n s:

Set Difference Operation (-) Give tuples which are in relation r but not in s id,name (borrower) – id,name (depositor) (select id,name from borrower) except (select id,name from depositor); Option: except all Conditions: • The relations r and s must be of the same arity • The domains of the ith attribute of r and the ith attribute of s must be the same for all i

Set Difference Operation – Example • Relations r, s: A B A B    1 2 1   2 3 s r A B • r – s:   1 1

The set –difference operation • It is denoted by -, allows us to find tuples that are in one relation but not in another. • We can find the customer who have an account but not a loan by: • Πcustomer_name (depositor) -Πcustomer_name (borrower)

Rename Operation () ->Rename either the relation or attributes or both x (E) Returns the result of expression E under the name ‘x’ x(A1,A2,…,An) (E) With the attributes rename to A1, A2, …, An e.g. find largest balance in account entity set balance (account) – account,balance (account.balance < d.balance (account x d(account))) (select balance from account) except (select account,balance from account,accountas d where account.balance < d.balance)

Formal Definition of Relational Algebra For E1 and E2 Relational-Algebra Expressions:

Cartesian-Product Operation () Allow us to combine information from any two relations Cartesian product of relations r1 & r2 is r1  r2 customer  loan (customer.id, customer.name, customer.street, customer.city, loan.number, loan.amount) select id,name,number,amount from customer,loan If relations r1 has n rows and r2 has m rows than r1 x r2 is size of n x m.

Cartesian - Product Operation – Example • Relations r, s: A B C D E   1 2     10 10 20 10 a a b b r s A B C D E • r xs:         1 1 1 1 2 2 2 2         10 10 20 10 10 10 20 10 a a b b a a b b

Cartesian product operation • It is denoted by a cross(× ) allows us to combine information from any two relations. We write as r1×r2. • For example relation schema for r=borrower×loan is • (borrower.customer_name, borrower.loan_num, loan. loan_number, loan.branch_name, loan.amount) • We shall drop the relation-name prefix.This simplification doesnot lead to any ambiguity. • We can write relation schema for r as (customer_name, borrower.loan_num,loan.loan_number,branch_name,amount)

The loan relation The borrower relation

Borrower * loan

2. Relational Model

2. Relational Model

Presentation Transcript

Relational Model

Relational Model

Unit 2 The Relational Model

Chapter 2: Relational Model

Lecture 2: Relational Model

Chapter 2 The relational model

Chapter 2: Relational Model

Relational Model

Relational Model

Relational Model

Relational Model

Relational Model

Relational Model

Chapter 2: Relational Model

Chapter 2: Relational Model

Ch.2 Relational Data Model

Unit 2 The Relational Model

Chapter 2: Relational Model

Relational Data Model - 2

Chapter 2: Relational Model