490 likes | 523 Views
Learn about relational model, algebra operators, query languages like SQL, and their application in converting ER diagrams to tables. Understand data retrieval operations and relational algebra basics.
E N D
Relational Algebra Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu
Announcements • Project-Phase 1 is due NOW !!! • Project-Phase 2 is out today (Nov. 4) and due on Nov. 11 • Submission Guidelines (to make it easier for grading) • Submit single file (word or pdf) • Make sure your name (or username) is specified • Groups submit single copy of the project phases • Pickup your graded hardcopy submission from TAs during their office hours
Relational Model (Recap) • Relations (Tables) + Attributes (Columns) • Integrity constraints • Create “Students” relation • Create “Courses” relation CREATE TABLE Students (sid: CHAR(20) Primary Key, name: CHAR(20), login: CHAR(10), age: INTEGER, gpa: REAL); CREATE TABLE Courses (cid: Varchar(20) Primary Key, name: string, maxCredits : integer, graduateFlag: boolean); • Create “Enrolled” relation CREATE TABLE Enrolled (sid: CHAR(20) Foreign Key References (Students.sid), cid: Varchar(20), enrollDate: date, grade: CHAR(2), Constraints fk_cidForeign Key cid References (Courses.cid));
What about Converting this ERD to Relational Model status date
Query Language • Query Language • Define data retrieval operations for relational model • Express easy access to large data sets in high-level language, not complex application programs • Categories of languages • Procedural: What you want and how to get it • Non-procedural, or declarative: What you want (without how) • SQL: High-level language for relational algebra. • Relational Algebra : Operator semantics based on set or bag theory • Relational algebra form underlying basis (and optimization rules) for SQL
Relational Algebra • Basic operators • Set Operations (Union: ∪, Intersection: ∩ ,difference: – ) • Select: σ • Project: π • Cartesian product: x • rename: ρ • More advanced operators, e.g., grouping and joins • The operators take one or two relations as inputs and produce a new relation as an output • One input unary operator, two inputs binary operator
Relational Algebra • Allows to build expressions using composition of the available operators • For example, arithmetic expressions are expressions of operators • (w + t) / ((x + y) * 3) • In relational algebra, instead of variables we have relations
Set Operators • Union, Intersection, Difference • Defined only for union compatiblerelations • Relations are union compatible iff • they have same sets of attributes (schema), and • the same types (domains) of attributes • Example : Union compatible or not? • Student (sNumber, sName) • Course (cNumber, cName) Not compatible
Union: • Consider two relations R and S that are union-compatible R S R S
Union: • Notation: R ∪ S • Defined as: • R ∪S = {t | t∈R or t∈S} • For R ∪S to be valid, they have to be union-compatible
Difference: - • R – S are the tuples that appear in R and not in S • Defined as: • R – S = {t | t ∈R and t∈ S} R S R – S
Intersection: ∩ • Consider two Relations R and S that are union-compatible R S R∩S
Intersection: ∩ • Notation: R ∩ S • Defined as: • R ∩ S = { t | t ∈ r and t ∈ s } • Note: R ∩ S = R– (R–S) S R
Selection: σ • Select: σc (R): • c is a condition on R’s attributes • Select subset of tuples from R that satisfy selection condition c σ(C ≥ 6) (R) R
Selection: σ • Notation: σc(R) • cis called the selection predicate • Defined as: • σc(R) = {t | t ∈ R and c(t) is true} • cis a formula in propositional calculus consisting of terms connected by : • ∧ (and), ∨ (or), ¬ (not) • Each term is one of: • <attribute> op <attribute> | <attribute> op <constant> • op is one of: =,≠,>,≥.<.≤ • Example of selection: • σ branch_name=“Perryridge” ^ balance>1000(account)
Selection: Example R σ ((A=B) ^ (D>5))(R)
Project: π • πA1, A2, …, An (R), with A1, A2, …, An attributes AR • returns all tuples in R, but only columns A1, A2, …, An • A1, A2, …, An are called Projection List πA, C (R) R
Cross Product (Cartesian Product): X R X S S R
Cross Product (Cartesian Product): X • Notation R x S • Defined as: • R x S = {t q | t ∈ r and q ∈ s} • Assume that attributes are all unique, otherwise renaming must be used
Renaming: ρ • ρS (R) changes relation name from R to S • ρS(A1, A2, …, An) (R) renames also attributes of R to A1, A2, …, An ρS(X, C, D) (R) ρS (R) R S S
Composition of Operations • Can build expressions using multiple operations • Example: σA=C(R x S) R X S S R σA=C(R x S)
Banking Example • branch (branch_name, branch_city, assets) • customer (customer_name, customer_street, customer_city) • account (account_number, branch_name, balance) • loan (loan_number, branch_name, amount) • depositor (customer_name, account_number) • borrower (customer_name, loan_number)
Natural Join: R ⋈ S • Consider relations • R with attributes AR, and • S with attributes AS. • Let A = AR∩ AS = {A1, A2, …, An} – The common attributes • In English • Natural join R ⋈ Sis a Cartesian Product R X S with equality predicates on the common attributes (Set A)
Natural Join: R ⋈ S • R ⋈ S can be defined as : πAR – A, A, AS - A (σR.A1 = S.A1 AND R.A2 = S.A2 AND … R.An = S.An (R X S)) Project the union of all attributes Equality on common attributes Cartesian Product
Natural Join: R ⋈ S: Example S R R ⋈ S
Theta Join: R ⋈C S • Theta Join is cross product, with condition C • It is defined as : R ⋈C S = (σC (R X S)) Theta join can express both Cartesian Product & Natural Join R S R ⋈R.A>=S.CS
Outer Join • An extension of the join operation that avoids loss of information • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result • Uses null values to fill in empty attributes with no matching • Types of outer join between R and S • Left outer (R o⋈ S): preserve all tuples from the left relation R • Right outer (R ⋈o S): preserve all tuples from the right relation S • Full outer (R ⋈ S): preserve all tuples from both relations o
Left Outer Join (R o⋈ S): Example R S R ⋈ S (R o⋈ S)
Right Outer Join (R ⋈o S): Example R S R ⋈ S (R ⋈o S)
Full Outer Join (R ⋈ S): Example o R S R ⋈ S (R ⋈ S) o
Assignment Operator: • The assignment operation (←) provides a convenient way to express complex queries on multiple line • Write query as a sequence of line consisting of: • Series of assignments • Result expression containing the final answer • Assignment must always be made to a temporary relation variable • May use a variable multiple times in subsequent expressions • Example: • R1 (σ ((A=B) ^ (D>5))(R – S)) ∩ W • R2 R1 ⋈(R.A = T.C)T • Result R1 U R2
Duplicate Elimination: (R) • Delete all duplicate records • Convert a bag to a set R (R)
Extended Projection: πL (R) • Standard project • L contains only column names of R • Extended projection • L may contain expressions and assignment operators π C, VA, X C*3+D(R)
Grouping & Aggregation operator: • Aggregation function takes a collection of values and returns a single value as a result • avg: average value • min: minimum value • max: maximum value • sum: sum of values • count: number of values • Grouing & Aggregate operation in relational algebra • g1,g2, …gm,F1(A1), F2(A2), …Fn(An) (R) • R is a relation or any relational-algebra expression • g1, g2, …gm is a list of attributes on which to group (can be empty) • Each Fiis an aggregate function applied on attribute Ai within each group
Grouping & Aggregation Operator: Example S R • branch_name.sum(balance)(S) • sum(c)(R)
Summary of Relational-Algebra Operators • Set operators • Union, Intersection, Difference • Selection & Projection & Extended Projection • Joins • Natural, Theta, Outer join • Rename & Assignment • Duplicate elimination • Grouping & Aggregation
Example Queries Find customer names having loans with sum > 20,000 • πcustomer_name (σsum > 20,000 (customer_name, sum sum(amount)(loan ⋈ borrower)))
Example Queries Find the branch name with the largest number of accounts • R1 branch_name. countAccounts count(account_number)(account) • R2 Max max(countAccounts)(R1) • Result πbranch_name(R1 ⋈countAccounts = Max R2)
Example Queries Find customers having account balance below 100 or above 10,000 • πcustomer_name (depositor ⋈ • πaccount_number(σbalance <100 OR balance > 10,000(account)))
Example Queries Find customers having account balance below 100 and loans above 10,000 • R1 πcustomer_name (depositor ⋈ πaccount_number(σbalance <100 (account))) • R2 πcustomer_name (borrower ⋈ πloan_number(σamount >10,000 (loan))) • Result R1 ∩ R2
Example Queries Find account numbers and balances for customers having loans > 10,000 • πaccount_number, balance • ( (depositor ⋈ account) ⋈ • (πcustomer_name (borrower ⋈ (σamount >10,000 (loan)))) • )
Reversed Queries (what does it do)? • πcustomer_name(customer) - • (πcustomer_name(borrower) U πcustomer_name(depositer)) Find customers who neither have accounts nor loans
Reversed Queries (what does it do)? • R1 (MaxLoan max(amount)(σbranch_name= “ABC”(loan))) • Result πcustomer_name(borrower ⋈ (R1 ⋈MaxLoan=amount^branch_name= “ABC”loan)) Find customer name with the largest loan from a branch “ABC”