1 / 45

RELATIONAL ALGEBRA

RELATIONAL ALGEBRA. Chapter-4. FUNCTIONAL DEPENDENCIES (FD). A functional dependency (FD) is a constraint between two sets of attributes. A functional dependency denoted by X  Y (read as "Y is functionally dependent on X").

Download Presentation

RELATIONAL ALGEBRA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. RELATIONAL ALGEBRA Chapter-4

  2. FUNCTIONAL DEPENDENCIES (FD) • A functional dependency (FD) is a constraint between two sets of attributes. • A functional dependency denoted by X  Y (read as "Y is functionally dependent on X"). • The left-hand side of the FD is some times called as the determinant and the right-hand side is called dependent. • Example (1) SSN  Name (2) PNumber  {PName, PLocation} (3) {SSN, PNumber}  Hours

  3. Example – 2 SupplierId  City {SupplierId, ProdId}  City

  4. FD Diagrams FD1 FD2 FD1: SSN  {Name, BDate, Addr, DNumber} FD2:DNumber  {DName, DMgrSSN}

  5. Closure of a set of FD’s • Given some FD’s, new FD’s can often be inferred. • The set of all FD’s that are implied by a given set F of FD’s is called the closure of F and is denoted by F +. • Example: Let F = {AB  C, C  B} be a set of FD’s satisfied by R (A, B, C). Then, F + = {A  A, B  B, AB  AC, AB  BC, AB  ABC, etc.}

  6. Armstrong's Axioms • IR-1: Reflexivity: If X  Y, then X  Y. (if X is a superset of Y or Y is a subset of X) • IR-2: Augmentation: If X  Y, then XZ  YZ. • IR-3: Transitivity: If X  Y and Y  Z, then X  Z. • IR-4: Decomposition: If X  YZ, then X  Y and X  Z. • IR-5:Union:If X  Y and X  Z, then X YZ. • IR-6:Pseudo-Transitivity:

  7. Example Supposing we are given a relation R {A, B, C, D, E, F} with a set of FDs as shown below: A  BC, B  E , CD  EF Show that the FD AD  F holds for R and is a member of the closure. (1) A  BC & CD  EF {Given} (2) A  C & A  B {Decomposition of (1)} (3) AD CD {Augmentation of (2) by adding D} (4) AD  CD & CD  EF AD  EF {Transitivity of (3) and (1)} (5) AD  EADF{Decomposition of (5)}

  8. Attribute closure, F+ • To compute F + , start with FD’s in F; Repeatedly apply IR-1 to IR-6 until no new FD can be derived • Armstrong's Axioms do not produce any incorrect FD’s that are added to F +. However, finding F + is too expensive; the complexity grows exponentially • The solution is to find the attribute closure of X, denoted as X +

  9. Algorithm to findX+ Algorithm Attribute_Closure() { X + = X; Repeat { for each FD A  B in F do if A X +thenX + B // i. e. if A is in X +, then add B to X + until no change; // until no more attributes are added to X + } }

  10. Example Consider R (A, B, C) and a set of FDs F = {AB  C, C  B} Using the Algorithm, we calculate the following closure sets with respect to F:   A+ = {A}, (Reflexivity) B+ = {B}, (Reflexivity) C+ = {C, B} (CB) because of FD-2 {AB}+ = {ABC} because of FD-1 add attribute C {AC}+ = {ACB} because of AC  AB (IR-2) Augmentation attribute B {BC}+ = {BC} nothing can be added {ABC}+ = {ABC} nothing can be added

  11. Decompositions • The basic idea in decomposition is to split a relation into smaller relation schemas. • We address the problem of redundancy to a large extent. • We must ensure that when a relation is decomposed, the integrity constraints are maintained.

  12. Example Branch Loan

  13. Lossy and Lossless Decomposition • Lossless decomposition: A relation R is said to be a lossless decomposition into R1 and R2 iff. the natural join of these two relations gives back the original relation R. • Lossy decomposition: The natural join of R1 and R2 does not provide the original relation R, then it is said to be a lossy decomposition. • Spurious tuples: The effect of lossy decomposition is that when R1 and R2 are joined, some extra tuples will creep in. These extra tuples are called as spurious tuples.

  14. Example Rows shown with blue color background are called spurious tuples.

  15. Definition: Let R be a relation schema. A decomposition of R, denoted by D = {R1, R2, ..., Rn}, is a set of relation schemas such that R = R1  R2  ...  Rn. If {R1, R2, ..., Rn} is a decomposition of R and r is an instance of R, then r R1(r) R2(r) . . . Rn(r)

  16. Lossless Join Decomposition • Theorem: Let F be a set of FDs over R, and D = {R1, R2} be a decomposition of R. D is a lossless-join decomposition if and only if • R1  R2  R1 - R2 is in F +; or • R1  R2  R2 - R1 is in F +. • The above FDs imply that the attributes common to R1 and R2 must contain a key for either R1 or R2.

  17. Example CollegeCode  Rank CollegeCode  City CollegeRank Lossy: R1 = {CollegeCode, Rank} R2 = {Rank, City} Lossless: R1 = {CollegeCode, Rank} R2 = {CollegeCode, City}

  18. Candidate Keys • Finding candidate keys Step-1: Draw the dependency graph of F. Each vertex corresponds to an attribute. Edges can be defined as follows: A  B becomes A B A  BC becomes A B C AB  C becomes A B C Step-2: Identify the set of vertices Vni that have no incoming edges. Step-3: Identify the set of vertices Voi that have only incoming edges. Step-4: A candidate key is a set of attributes that • contains all attributes in Vni. • contains no attributes in Voi. • has no subset that is already a candidate key.

  19. Example - 1 • Consider R (A, B, C, G, H, I), and F = {A  BC, CG  HI, B  H}. B H C A I G No Incoming Edges Vni = {A, G} Only Incoming EdgesVoi = {H, I} (AG)+ = {A, B, C, G, H, I}, AG is the only candidate key

  20. Example - 2 Consider R (A, B, C, D, E, H), and F = {A  B, AB  E, BH  C, C  D, D  A} A B C D E H Vni = {H} and Voi = {E} Candidate keys = AH, BH, CH, and DH

  21. Introduction • What is relational algebra? Relational algebra is a collection of operations that are used to manipulate the entire set of relations. The output of any relational algebra operation is always a relation. • What are the Operations? • Set operations like union, intersection, difference, and Cartesian product. • Selection, Projection, and Joining.

  22. Sample Relations Employee Department Project

  23. The SELECT operation () • The select operation retrieves a subset of tuples in a relation that satisfy a selection condition. <selection-condition> (Relation) • Query-1: Find the employees whose salary is greater than 10,000 rupees. Salary > 10000 (Employee) • Query-2: Find the employees who work for department 3 and whose salary is greater than 30,000 rupees. Emp3 DNo = 3 (Employee) Result Salary > 30000 (Emp3) Or, Result  DNo = 3 and Salary > 30000 (Employee)

  24. One of the interesting properties of the selection operation is that it is commutative. Therefore, all the expressions shown below are equivalent, condition-1 (condition-2 (R)) (condition-1 AND condition-2) (R) condition-2 (condition-1 (R))

  25. The Projection Operation () • The projection operation is used to select only few columns from a relation. <attributes> (Relation) • Query-5: List the name and salary of all the employees. Name, Salary (Employee) • Query-6: Print the project name and their locations. PName, PLocation (Project) • Query -7: Retrieve the Name and Salary of all employees working for department 1. Result  Name, Salary (DNo = 1 (Employee))

  26. Query-8: Find the name, address, and salary of the employees who earn more than 25000 rupees. Name, Salary, Addr (Salary > 25000 (Employee)) • Query-9: List the name and the location of the projects not controlled by department 2. PName, PLocation (DNum ≠ 2) (Project))

  27. Union () • The result of this operation, i.e. R  S, is a relation that includes all tuples that are either in R or in S or in both. Duplicate tuples will not appear in the output. • Union Compatibility: Let R and S are two relations with attributes (A1, A2, …, An) and (B1, B2, ….., Bn) respectively. If R and S are to be unioned, it should satisfy the following two rules: Rule 1: The relations R and S must have the same degree. That is, the number of attributes of R and S must be same. Rule 2: The domain of the ith attribute of R and the domain of the ith attribute of S must be same. dom(Ai) = dom(Bi), where, 1 in.

  28. Example • Query-10: Retrieve the SSN of all employees who either work in department 3 or directly supervise an employee who works in 3. Emp_Dept3 SSN (DNo = 3 (Employee)) Supervisors SuperSSN (DNo = 3 (Employee)) Result  Emp_Dept3  Supervisors Emp_Dept3 Supervisors 2222 4444 4444 null 5555 Result 2222 4444 5555 null

  29. Intersection (∩) • The expression R ∩ S returns all tuples that appear in both the relations R and S. • Query-12: Find the fist name and the last name of people who are teachers as well as students. Student Teacher

  30. Difference • This operation, written as R – S (set difference) returns all tuples that are in R but not in S. • Query-13: Find the students who are not teachers. Result  Student – Teacher {Susan, Joseph, Andy} • Query-14: Find the teachers who are not students. Result  Teacher – Student {Nandagopal}

  31. Cartesian Product (×) • The Cartesian product or cross-product is a binary operation that is used to combine two relations. Assuming R and S as relations with n and m attributes respectively, the Cartesian product, R × S can be written as, R (A1, A2, …, An) × S (B1, B2, …, Bm) The result of the above set operation is, Q (A1, A2, …, An, B1, B2, …, Bm) Where, Degree (Q) = n + m count(Q) = Number of tuples in R * Number of tuples in S.

  32. Query-15: Find for each female employee, all the names of her dependents. Dependent ESSN DependentName Sex BDate Relationship 1111 Pradeep M 05-Jul-63 Brother 3333 Sonali F 15-Aug-85 Daughter 3333 Rahul M 01-Jan-80 Son 4444 Miruthula F 02-Aug-83 Spouse FemaleEmp SSN, Name (Sex = ‘F’ (Employee)) FemaleDeps  FemaleEmp × Dependent Result Name, DependentName (SSN = ESSN (FemaleDeps))

  33. Join Operations • Natural join (*) • Equijoin ( ) • Theta join () • Outer join ( )

  34. Natural Join (*) • When we omit the condition during joining of two relations, then it is called as natural join (*). When we use the word join, it invariably means natural join. We shall formally define natural join as: • Let R and S are the two relations with the attributes as shown below: R(X1, X2, .., Xm, Y1, Y2, …, Yn) and S(Y1, Y2, …, Yn , Z1, Z2, …, ZP) Example: Employee (SSN, Name, Addr, DNo) Department (DNo, DName)

  35. Theta Join () • Let R and S are two relations. Consider an attribute x in R, andan attribute y in S. The theta join of these two relations can be written as, R × S where x  y where  indicates a valid relational operator.

  36. Equijoin ( ) • The most widely used join operation is equijoin. As discussed in the previous subsection, when  is =, this type of -join (a special case) is called equijoin. We shall use the notation to denote equijoin. The general form for this kind of join is, R <Join_condition> S <Join_condition> must always have = operator

  37. Query-16: Retrieve the name and salary of the manager of each department. Managers  Department MgrSSN = SSN Employee Gnames Name, Salary (Managers) • Query-17: Find which project(s) and location(s) Pooja is working on. EmpPooja Name = 'Pooja' (Employee) PoojaPrj  EmpPooja SSN = ESSN WorksOn Result Name, PName, PLocation (PoojaPrj WrksOn.PNo = Project.PNumber Project)

  38. Outer Joins • Dangling tuples in Join • Usually, only a subset of tuples of each relation will actually participate in a join, i.e. only tuples that match with the joining attributes. Tuples of a relation not participating in a join are called dangling tuples. • How do we keep dangling tuples in the result of a join? Use null values to indicate a no-join situation. There are three types of outer joins - left outer join - right outer join and - outer join (full outer join).

  39. Left outer-join R1 R2 is similar to a natural join but keep all dangling tuples of R1. • Right Outer-Join R1 R2 is similar to a natural join but keep all dangling tuples of R2. • Outer Join (full outer-join) R1 R2 is similar to a natural join but keep all dangling tuples of both R1 and R2. The advantages of outer join is to take the union of tuples from two relations that are not union compatible.

  40. Example Students Awards Dangling tuples Students Awards

  41. Division • Consider two relations R and S. Assume that R has only two attributes r1 and r2 and S has only one attribute r2 with the same domain as in R. This is to ensure that the degree of the numerator is more than the degree of the denominator. Now we shall define R / S as, • For each r1 value in R, consider the set of r2 values that appear in tuples of R with that r2 value. If this set contains S, the r1 is in the result of R / S. • One possible restriction in this definition is that every attribute in S should be in R. Alternatively, we can define R / S as, For, R(a1, a2, …., an, b1, b2, …., bn) / S(b1, b2, …., bn) and T = a1, a2, …., an (R), return the subset of T, say W, such that every tuple in W × S is in R. W is the largest subset of T, such that, (W × S)  R

  42. Example Electives3 Electives1 Electives2 Options/Electives3 Options/Electives1 Options/Electives2

  43. GROUPING AND AGGREGATE FUNCTIONS • We shall define Function formally using the symbol  (pronounced as script F ) as, grouping_attributesaggregate_functions (Relation) • The steps involved in the evaluation of this function is, • Partition the relation into groups. • Apply aggregate function to each group. • Output group and aggregate values, one tuple per group.

  44. Query-18: Retrieve department number, number of employees and their average salary. Result (DNo, N, AvgSal) DNoCount(SSN), Avg(Salary) (Employee) • What happens when grouping attribute is dropped? Count(SSN), Avg(Salary) (Employee)

  45. End of Chapter - 4

More Related