520 likes | 533 Views
Explore the fundamentals of relational algebra and query languages, including SQL and relational model operations. Learn about set operators, selection, projection, Cartesian product, and more with examples. Enhance your understanding of database querying.
E N D
Relational Algebra Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu
What we learned so far… • Entity-Relationship Model • Converting it to relational model • Relational Model • Creating table • Defining constraints • Inserting, deleting, and updating data We learn next how to query the data
Query Language • Query Language • Define data retrieval operations for relational model • Express easy access to large data sets in high-level language, not complex application programs • Categories of languages • Procedural: What you want and how to get it • Non-procedural, or declarative: What you want (without how) • SQL: High-level language for relational algebra. • Relational Algebra: Operator semantics based on set or bag theory • Relational algebra form underlying basis (and optimization rules) for SQL
Relational Algebra • Basic operators • Set Operations (Union: ∪, Intersection: ∩ ,difference: – ) • Select: σ • Project: π • Cartesian product: x • rename: ρ • More advanced operators, e.g., grouping and joins • The operators take one or two relations as inputs and produce a new relation as an output • One input unary operator, two inputs binary operator
Relational Algebra • Allows to build expressions using composition of the available operators • For example, arithmetic expressions are expressions of operators • (w + t) / ((x + y) * 3) • In relational algebra, instead of variableswe have relations
Set Operators • Union, Intersection, Difference • Defined only for union compatiblerelations • Relations are union compatible iff • they have same sets of attributes (schema), and • the same types (domains) of attributes • Example : Union compatible or not? • Student (sNumber, sName) • Course (cNumber, cName) Not compatible
Union over sets: • Consider two relations R and S that are union-compatible R S R S
Union over sets: • Notation: R ∪ S • Defined as: • R ∪S = {t | t∈R or t∈S} • For R ∪S to be valid, they have to be union-compatible
Difference over sets: - • R – S are the tuples that appear in R and not in S • Defined as: • R – S = {t | t ∈R and t∈ S} R S R – S
Intersection over sets: ∩ • Consider two Relations R and S that are union-compatible R S R∩S
Intersection: ∩ • Notation: R ∩ S • Defined as: • R ∩ S = { t | t ∈ r and t ∈ s } • Note: R ∩ S = R– (R–S) S R
Selection: σ • Select: σc (R): • c is a condition on R’s attributes • Select subset of tuples from R that satisfy selection condition c σ(C ≥ 6) (R) R
Selection: σ • Notation: σc(R) • cis called the selection predicate • Defined as: • σc(R) = {t | t ∈ R and c(t) is true} • cis a formula in propositional calculus consisting of terms connected by : • ∧ (and), ∨ (or), ¬ (not) • Each term is one of: • <attribute> op <attribute> | <attribute> op <constant> • op is one of: =,≠,>,≥.<.≤ • Example of selection: • σ branch_name=“Perryridge” ^ balance>1000(account)
Selection: Example R σ ((A=B) ^ (D>5))(R)
Project: π • πA1, A2, …, An (R), with A1, A2, …, An attributes AR • returns all tuples in R, but only columns A1, A2, …, An • A1, A2, …, An are called Projection List πA, C (R) R
Cross Product (Cartesian Product): X R X S S R
Cross Product (Cartesian Product): X • Notation R x S • Defined as: • R x S = {t q | t ∈ r and q ∈ s} • Assume that attributes are all unique, otherwise renaming must be used
Renaming: ρ • ρS (R) changes relation name from R to S • ρS(A1, A2, …, An) (R) renames also attributes of R to A1, A2, …, An ρS(B->X, C, D) (R) ρS (R) R S S
Composition of Operations • Can build expressions using multiple operations • Example: σA=C(R x S) R X S S R σA=C(R x S)
Banking Example • branch (branch_name, branch_city, assets) • customer (customer_name, customer_street, customer_city) • account (account_number, branch_name, balance) • loan (loan_number, branch_name, amount) • depositor (customer_name, account_number) • borrower (customer_name, loan_number)
What We Covered So Far… • Set Operations (Union: ∪, Intersection: ∩ ,difference: – ) • Select: σ • Project: π • Cartesian product: x • rename: ρ
Example Queries Find customers who have neither accounts nor loans • πcustomer_name(customer) - • (πcustomer_name(borrower) U πcustomer_name(depositer))
Natural Join: R ⋈ S(Join on the common attributes) • Consider relations • R with attributes AR, and • S with attributes AS. • Let A = AR∩ AS = {A1, A2, …, An} – The common attributes • In English • Natural join R ⋈ Sis a Cartesian Product R X S with equality predicates on the common attributes (Set A)
Natural Join: R ⋈ S • R ⋈ S can be defined as : πAR – A, A, AS - A (σR.A1 = S.A1 AND R.A2 = S.A2 AND … R.An = S.An (R X S)) Project the union of all attributes Equality on common attributes Cartesian Product Common attributes appear once in the result
Natural Join: R ⋈ S: Example S R R ⋈ S
Theta Join: R ⋈C S • Theta Join is cross product, with condition C • It is defined as : R ⋈C S = (σC (R X S)) Theta join can express both Cartesian Product & Natural Join R S R ⋈R.A>=S.CS
Example Queries Find customer names having account balance below 100 or above 10,000 • πcustomer_name (depositor ⋈ • πaccount_number(σbalance <100 OR balance > 10,000(account)))
Assignment Operator: • The assignment operation (←) provides a convenient way to express complex queries on multiple line • Write query as a sequence of line consisting of: • Series of assignments • Result expression containing the final answer • Assignment must always be made to a temporary relation variable • May use a variable multiple times in subsequent expressions • Example: • R1 (σ ((A=B) ^ (D>5))(R – S)) ∩ W • R2 R1 ⋈(R.A = T.C)T • Result R1 U R2
Example Queries Find customers having account balance below 100 and loans above 10,000 • R1 πcustomer_name (depositor ⋈ πaccount_number(σbalance <100 (account))) • R2 πcustomer_name (borrower ⋈ πloan_number(σamount >10,000 (loan))) • Result R1 ∩ R2
Outer Join • An extension of the join operation that avoids loss of information • Computes the join and then adds tuples form one relation that does not match tuples in the other relation to the result • Uses null values to fill in empty attributes with no matching • Types of outer join between R and S • Left outer (R o⋈ S): preserve all tuples from the left relation R • Right outer (R ⋈o S): preserve all tuples from the right relation S • Full outer (R ⋈ S): preserve all tuples from both relations o
Left Outer Join (R o⋈ S): Example R S R ⋈ S (R o⋈ S)
Right Outer Join (R ⋈o S): Example R S R ⋈ S (R ⋈o S)
Full Outer Join (R ⋈ S): Example o R S R ⋈ S (R ⋈ S) o
Duplicate Elimination: (R) • Delete all duplicate records • Convert a bag to a set R (R)
Extended Projection: πL (R) • Standard project • L contains only column names of R • Extended projection • L may contain expressions and assignment operators π C, VA, X C*3+D(R)
Grouping & Aggregation operator: • Aggregation function takes a collection of values and returns a single value as a result • avg: average value • min: minimum value • max: maximum value • sum: sum of values • count: number of values • Grouing & Aggregate operation in relational algebra • g1,g2, …gm,F1(A1), F2(A2), …Fn(An) (R) • R is a relation or any relational-algebra expression • g1, g2, …gm is a list of attributes on which to group (can be empty) • Each Fiis an aggregate function applied on attribute Ai within each group
Grouping & Aggregation Operator: Example S R • branch_name,sum(balance)(S) • sum(c)(R)
Summary of Relational-Algebra Operators • Set operators • Union, Intersection, Difference • Selection & Projection & Extended Projection • Joins • Natural, Theta, Outer join • Rename & Assignment • Duplicate elimination • Grouping & Aggregation
Example Queries Find customer names having loans with sum > 20,000 • πcustomer_name (σsum > 20,000 (customer_name, sum sum(amount)(loan ⋈ borrower)))
Example Queries Find the branch name with the largest number of accounts • R1 branch_name, countAccounts count(account_number)(account) • R2 Max max(countAccounts)(R1) • Result πbranch_name(R1 ⋈countAccounts = Max R2)
Example Queries Find account numbers and balances for customers having loans > 10,000 • πaccount_number, balance • ( (depositor ⋈ account) ⋈ • (πcustomer_name (borrower ⋈ (σamount >10,000 (loan)))) • )
Reversed Queries (what does it do)? • πcustomer_name(customer) - πcustomer_name(borrower) Find customers who did not take loans
Reversed Queries (what does it do)? • R1 (MaxLoan max(amount)(σbranch_name= “ABC”(loan))) • Result πcustomer_name(borrower ⋈ (R1 ⋈MaxLoan=amount^branch_name= “ABC”loan)) Find customer name with the largest loan from a branch “ABC”