420 likes | 433 Views
Relational Algebra. Manipulating Databases. To access information in a database we use a query Ex: How many customers have the first name = `John’? Good query writing follows a formal model called relational algebra. Relational Algebra.
E N D
Manipulating Databases • To access information in a database we use a queryEx: How many customers have the first name = `John’? • Good query writing follows a formal model called relational algebra
Relational Algebra • Relational algebra: a collection of mathematical operations that manipulate tables • Familiarity with relational algebra helps understanding the logic behind complex queries and ease the way for writing them
Tables and Queries • Recall: a table is a set of rows/records having the same number and types of attributes • When you send a query to the database, it • Finds the appropriate rows of information in the stored tables • Performs the requested operations on the data • Represents the results in a new temporary table • Delivers the table of results to the user
Example • Ex: How many customers have the first name = `John’? The database creates a table containing all customers whose first name is `John’ and returns the table to the user
Basic types of queries • There are 4 basic types of queries • A projection operation produces a result table with • Only some of the columns of its input table. • A selection operation produces a result table with • All of the columns of the input table • Only those rows of its input table that satisfy some criteria. • A join or product operation produces a result table by • Combining the columns of two input tables. • A set operation produces a result table by • Combining rows from one or the other of its input tables
Projection operation: • A projection query selects some of the columns of the input table • project T onto (attribute1, attribute2, …) • Relational algebra form: • attribute1, attribute2,...(T)
Example • firstName,lastName(Customer)
Example … • Notice that the result table has fewer rows • Duplicate rows have been removed because the attributes do not contain a key
Storing the temporary results • We can store the result of a query in a table T’ as follows: T’ attribute1, attribute2,...(T) This will create a table T’ with attributes: attribute1, attribute2, … containing the result of the query
Selection queries: • A selection query selects rows that match a selection criteria from a table • Relational algebra form • <condition>(T) • Each row is checked to see if it satisfies the condition and selected accordingly
Example • lastName=‘Doe’(Customer)
Complex selection criteria • The selection criterion can be any boolean expression containing operators like: and, or, =, , <, >, , , etc …
Example T ssn=’376-77-0099’ and date < ’01-mar-2002’(TimeCard)
Product queries: X • The product query takes two tables and produce a table which is the cross product of the two, i.e., combines every row of one table with every row of other table • R(A1, A2 , … , An) × S(B1, B2 , … , Bm)= Q(A1, A2 , … , An, B1, B2 , … , Bm) • Relational algebra form: • R S
Example Employee TimeCard
Product queries … • If two attributes in two tables T and R have the same name, we prefix them with the relation name: T.<attribute> Ex: Employee.ssn, TimeCard.ssn • Remark. Many of the resulting rows in the previous example don’t make sense
Join queries: ⋈ • In the previous table we are only interested in the rows that match: rows with Employee.ssn = TimeCard.ssn We are interested in the query: Employee.ssn=TimeCrad.ssn(Employee x TimeCard)
Join operations … • A join query is a cross product with a restriction on the result rows • The join condition determines which rows match • Only matching rows are in the result table • Typical join condition is equality of attributes • It is called equi-join • Relational algebra form: • R ⋈<condition>S
Example Some rows from the tableEmployee⋈Employee.ssn=TimeCard.ssnTimeCard
Natural join: * • Frequently, when doing an equi-join, the attributes have the same name • A natural join is an equi-join with an equality condition on the common attributes: Employee⋈ssnTimeCard Employee * TimeCard • In natural join the common attributes appear once
Queries with multiple joins • Consider the Video-Rental schema, and suppose we want to retrieve for every currently-rented video, the renter’s account number, video number, rental date, due date, title of the movie, and cost
Solution accountId, videoId, dateRented, dateDue, title, cost ((Rental ⋈videoId Video) ⋈movieId Movie)
Combining operations • Suppose we want to find the following info. For customer with account = 113, find all the videos that he is renting: For each video, find the video number, the title of the movie, and the due date
Solution videoId, title, dateDue ((accountId=113(Rental) ⋈videoId Video) ⋈movieId Movie) Or: T1= accountId=113(Rental) T2=T1 ⋈videoId Video T3 = T2 ⋈movieId Movie T4 = videoId, title, dateDue ( T3)
More examples • List all comedy movies that were rented on December 21, 2001. For every movie list the customer’s name, movie title, and date returned
Solution T1= daterented=‘December 21 2001’(PreviousRental) T2=T1 ⋈videoId Video T3= genre=‘comedy’ (Movie) T4= T2 ⋈movieId T3 T5= T4 ⋈accountId Customer T6 = firstName, lastName, title, dateReturned ( T5)
Set operations • Set operations include: Union, intersection, and difference • Relational algebra form: , , • Set operations can be applied to any tables with the same shape (compatible) • The same order and type of attributes • Attribute names do not have to agree
Set operations • If R and S are two compatible tables: R S is the table that contains the set of rows that are either in R or in S R S is the table that contains the set of rows that are both in R and S R - S is the table that contains the set of rows that are in R but not in S
Example of • Retrieve all the videos that are currently or were previously rented EverRented= Rental PreviousRental
Example of • Retrieve the video id of all the videos that are currently rented and have been rented at least once before Veterans=videoId, ( Rental) videoId, ( PreviousRental)
Example of - • Retrieve the video id of all the videos that are currently rented and have never been rented before FirstTime=videoId, ( Rental) -videoId, ( PreviousRental)
Aggregate functions • Not all queries can be expressed using the basic operations described previously. • What if we want to compute the average salary of all employees?
Aggregate functions • What if we want to count the number of employees in each department? • For such queries, we use aggregate functions. • Relational algebra form • <grouping attributes><function list>(T)
Aggregate functions … • The function list includes: average, sum, count, maximum, minimum • The result of the query will be a table containing the results • The attributes consist of the grouping attributes + function parameters
Examples Ex1: compute the average salary of all the employees • Average(salary)(Employee) The resulting table contains one attribute: Average_Salary and one value Ex2: compute the number of employees in each department • DNO Count(ssn)(Employee) The resulting table contains two attributes: DNO and Count_ssn. There is a row for every dept. containing the DNO value and the number of employees
Renaming attributes • It is sometimes convenient to rename the attributes in the resulting relation: R(DEPTNUM, NUM_EMPL) DNO Count(ssn) (Employee)
Recursive operations • Compute all the employees supervised by ``Pinochio’’ • Compute all the emplyees supervised by ``Pinochio’’ at level two • Compute all the employees supervisod by ``Pinochio’’ at any level!!!
Answers A1: Pinochio_ssn <- ssn (fname=`pinochio’(Employee)) Result1ssn (Pinochio ⋈ssn=superssn Employee) A2: Result2 ssn (Result1 ⋈ssn=superssn Employee) Result Result1 Result2 A3: is not supported by standard relational algebra
Outer Join • Left Outer Join Ex: list the employee names and also the name of the department they manage in case it exists • Right Outer Join • Full Outer Join
Examples from (Emp-Dept-Proj schema) • List everybody who makes more than $30000. • List names of everybody working for the research department. • List employees with a dependent. • List employees that have a daughter. • List employees without dependents. • List employees working on a project in Houston. • List all supervisors. • List names of all managers. • List names of managers with at least one dependent
Examples from (Emp-Dept-Proj schema …) • For every project located in ‘Chicago’, list the project number, the controlling department number, the department manager’s last name, address, and birthdate. • Make a list of project numbers for projects involving an employee whose first name is `Pinochio’ either as a worker on the project, or as a manager of the department that controls the project. • Find the names of all employees who are directly supervised by `Isaac Newton’ • For each department, retrieve the department name and average salary of its employees. • Retrieve the average salary of all female employees • For each project, list the project name and the total number of hours spent on the project.