280 likes | 296 Views
Learn about relational algebra, a mathematical language for manipulating relations in databases. Explore operators like Selection and Projection, and understand set operations like Union and Intersection. Study examples and practical applications in database querying.
E N D
Relational Algebra (CB Chapter 5.1) CPSC 356 Database Ellen Walker Hiram College
Relational Algebra • Mathematical language for operating on relations • Each operator takes one or more relations as input, and produces one relation as output • Relational algebra operators can be implemented as functions in a programming language • A relational algebra expression indicates which operators to use, in which order
Selection (condition ) • The output relation has all rows of the input relation that satisfy the condition • (Title=“prof”) Phonebook
Valid Conditions • Basic condition • Dept = ‘CS’ attribute compare to const • First < Last attribute compare to attribute • Combination of conditions using AND, OR, or NOT • Note: All attributes must come from the relation in the selection
Pseudocode for Selection Select (input relation, condition, &output rel) Do for every tuple in the input relation If the tuple satisfies the condition Copy the tuple to the output relation Time = O(number of tuples in input relation)
Projection (attributes) • Create a new relation with only the listed attributes in it. • last, phone PhoneBook
Relations Have No Duplicates • dept PhoneBook • Only 4 rows, even though PhoneBook had 5!
Pseudocode for Projection Project (input relation, attribute-list, &output rel) Do for every tuple in the input relation Do for every attribute in the input relation If the attribute is in the attribute-list copy the value to the output relation Remove duplicates in the output relation Time = O(number of tuples in relation + time for duplicate-removal)
Combining Select & Project • first, last ( title=prof PhoneBook )
Remember: A Relation is a Set • A set is an unordered collection of unique elements Set S = {1,2,3} “1 is an element of S” {a,b,c} = {a,c,b} • A subset of a set is another set whose elements all come from the original set. {a,b} is a subset of {a,c,b} {1,2,3} is a subset of {1,2,3} {1,2,4} is not a subset of {1,2,3} {} (the empty set) is a subset of every set!
Basic Set Operations • Union: the set of all elements in either or both original sets • {1,2} union {2,3} = {1,2,3} • Intersection: the set of all elements in both original sets (only) • {1,2} intersect {2,3} = {2} • Set Difference: the set of all elements in the first but not the second set • {1,2} – {2,3} = {1}
Applying to Relations • Relations must be “comparable” • Same set of attributes in each relation! • Union = all tuples • Intersection = all matching tuples • Set Difference = all tuples from first but not second
Basic Operation Examples • R1 = dept PhoneBook • R2 = name as “dept” Dept • Rename attribute to be same • R1 R2 = R2 (in this case) • R1 R2 = R1 (in this case) • R1 – R2 = { } (empty set) • R2 – R1 = R1 R2
Another Set Operation • Cartesian product: a set of ordered pairs, where each contains one element from each original set {1,2,3} x {a, b} = {(1,a), (1,b), (2,a), (2,b), (3,a), (3,b)} • For Relations: create a new relation with every combination of tuples
Cartesian Product (X) • dept, last PhoneBook X Dept Etc….
Pseudocode for Cartesian Product Product (relation1, relation2, &output rel) Do for every tuple in relation1 Do for every tuple in relation2 Build a row with all attributes from both relations Add it to the output relation Time = O(number of tuples in relation1 * number of tuples in relation 2) This is the most expensive operation in relational algebra!
Join Combines X and Select • Theta Join: • any condition R1 X R2 • Equijoin: • equality condition R1 X R2 • Natural Join: • equality condition R1 X R2 • Project to remove one copy of each equal attribute • Left or Right Outer Join: • Include all tuples from (left or right) side, even if they don’t have a match
Naïve Pseudocode for Join • Join (rel1, rel2, condition, output rel) • product (rel1, rel2, tmp) • select (tmp, condition, output rel) • Time: Same as Cartesian Product • To keep time down, keep the size of the relations down -- we’ll look at this later!
Let’s Try Some Examples: • What are the first and last names of all professors who don’t work in Hinsdale? • What are the telephone extensions of people who work in Hinsdale? • Which buildings contain people whose phone numbers are between 5000 and 5200? • List the Dept. Name, Building Name, and phone numbers for All departments (even those without phone numbers).
First and last names of all professors who don’t work in Hinsdale • Select “all professors” • Title=“Prof” (Phonebook) • Select “Departments not in Hinsdale” • Bldg != “Hinsdale” (Dept) • Connect these relations where depts match • (Title=“Prof” (Phonebook)) |X| Dept=Name (Bldg != “Hinsdale” (Dept)) • One project to get the final result • first,last ((Title=“Prof” (Phonebook)) |X| Dept=Name (Bldg != “Hinsdale” (Dept)))
Telephone extensions of people who work in Hinsdale • Select “Departments in Hinsdale” and project to just Dept to make the table smaller • Name (Bldg = “Hinsdale” (Dept)) • Join with PhoneBook to get only those in the right departments • Name (Bldg = “Hinsdale” (Dept)) |X| Dept=Name PhoneBook) • Project to get just the extensions • Phone (Name (Bldg = “Hinsdale” (Dept)) |X| Dept=Name PhoneBook))
Buildings with phone numbers between 5000 and 5300 • Select to get phone numbers from 5000 to 5300 and Project to have only the dept attribute • dept (5000<=Phone && 5300>=Phone (PhoneBooks)) • Join with Dept to associate department names with buildings • (dept (5000<=Phone && 5300>=Phone (PhoneBooks)) |X| dept=name (Dept)) • Project to get just the building names • bldg ((dept (5000<=Phone && 5300>=Phone (PhoneBooks)) |X| dept=name (Dept)))
Dept. Name, Building, and phone numbers for all departments • Join to include info from *ALL* departments. This requires an outer join • Dept X| name=dept PhoneBook • Project to get the right attributes • name, bldg,phone Dept X| name=dept PhoneBook
Aggregation • Operations that allow you to combine all the values in a table (column) in some way: • COUNT • SUM • AVG • MIN • MAX • Examples: • How many CPSC majors are there? • What is the average GPA of CPSC majors?
Grouping • Aggregate all elements in one column based on values in another column • Aggregation operators (COUNT, SUM, AVG, MIN, MAX) • Format: • group-by-attribute, FOPER attribute (table) • Note: F is a backward cursive F in the book.
Grouping Examples • List average GPAs by major majorFAVG GPA(Student) • What is the average GPA of CPSC students? σmajor=CPSC (majorFAVG GPA(Student)) • List number of faculty in each department • deptFCOUNT id(Faculty)