1 / 56

CIS 671 Introduction to Database Systems II Introduction

CIS 671 Introduction to Database Systems II Introduction. S. Parthasarathy. CIS 671 Course Outline. Review of basic database material (CIS 670) Object-Oriented and Object Relational Database Systems Data Warehousing and Data Mining Active Databases Introduction to Storage Structures

loismurray
Download Presentation

CIS 671 Introduction to Database Systems II Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CIS 671Introduction to Database Systems II Introduction S. Parthasarathy Query Languages

  2. CIS 671 Course Outline • Review of basic database material (CIS 670) • Object-Oriented and Object Relational Database Systems • Data Warehousing and Data Mining • Active Databases • Introduction to Storage Structures • Emerging Technologies Query Languages

  3. Relations • Relation schema R or R(A1, A2, …, An) • fixed set of attributes A1, A2, …, An • each attribute Aj corresponds to exactly one of the underlying domains Dj (j = 1, 2, …, n), not necessarily distinct. Example: EMP( EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION) note: EMPNO and MGR take values from the same domain, the set of valid employee numbers N, the number of attributes in the relation scheme is called the degree of the relation. Degree 1 unary degree 3 ternary degree 2 binary degree n n-ary Query Languages

  4. Keys • A relation is a set of tuples. • All elements of a set are distinct. • Hence all tuples must be distinct. • There may also be a subset of the attributes with the property that values must be distinct. Such a set is called a superkey. • SK a set of attributes • t1 and t2 tuples • t1 [SK]  t2 [SK] Guaranteed by the “real world”. Query Languages

  5. Candidate Key -a minimal superkey • A set of attributes, CK is a superkey, • but no proper subset is a superkey. Example EMPNO NAME ??? Primary Key - one arbitrarily chosen candidate key. Example EMPNO Query Languages

  6. Theoretical Properties of Relations • Tuples are unordered. • There are no tuple duplicates. • Attributes are unordered • All attribute values are atomic. • Implementations vary in terms of how closely they follow the above properties. Query Languages

  7. Relational Database Schema S • Set of Relation Schemas S = {R1, R2, …, Rp} • Set of integrity constraints, IC • Integrity constraints will be discussed next. Query Languages

  8. Relational Database (Instance) DB of Schema S • Set of relation instances, DB • DB = {r1, r2, …, rp} • Each rk is an instance of relation RK • The rk’s satisfy the integrity constraints, IC Query Languages

  9. Database Example - Schema EMP( EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION) EMPNO from domain EmployeeNumbers ... Integrity constraints to be added DEPT( DNO, DNAME, LOC) DNO from domain DepartmentNumber Integrity constraints to be added Query Languages

  10. Database Example- Database (instance) EMP( EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION) 1234 Smith D2 J1 4567 30K 10 2222 Jones D2 J2 4567 25K 4567 Blue D1 J4 9999 40K 9999 Green D3 J5 100K DEPT( DNO, DNAME, LOC) D1 xxx yyy D2 aaa yyy D3 bbb zzz Query Languages

  11. Relational Integrity Constraints • Not included in Codd’s original definition of relations. • Not supported in initial commercial relational database systems. • Now supported in all major relational database systems. Query Languages

  12. Integrity Rules:Applicable to All Databases • Key constraint • Entity integrity • Referential integrity Query Languages

  13. The Three Integrity Rules • Key constraint Candidate keys are specified for each relation schema. Candidate key values must be unique for every tuple in any relation instance of that schema. Query Languages

  14. The Three Integrity Rules, Continued • Entity integrity: • No attribute participating in the primary key is allowed to accept null values. Justification: we must be able to identify each tuple uniquely. Query Languages

  15. Foreign KeysEMP( EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION)DEPT( DNO, DNAME, LOC) DNO in EMP - should be allowed only if that DNO appears as primary key in relation DEPT. MGR in EMP - should be allowed only if that MGR appears as primary key in relation EMP. The attributes DNO and MGR called foreign keys in relation EMP. Query Languages

  16. The Three Integrity Rules, Continued • Referential integrity: • If relation schema R1 includes a foreign key, FK, matching the primary key of relation schema R2, then a value of FK in a tuple t1 of R1 must either • Be equal to the value of PK in some tuple t2 of R2, i.e. T1 [FK] = t2 [PK], or • Be wholly null (i.e. Each attribute value participating in that FK value must be null). Note: R1 and R2need not be distinct. Justification: if some tuple t1 references some tuple t2 , then tuple t2 must exist. Query Languages

  17. Implications of the Three Integrity Rules • System could reject the operation. • System could accept the operation, but perform some additional operations so as to reach a new legal state. What should happen if an operation on the database is about to cause the violation of one of the integrity rules? i.e. about to put the database into an “illegal” state. Query Languages

  18. Query Languages Review Srinivasan Parthasarathy Query Languages

  19. SQL - Parts of the Language • Data Definition Language (DDL) • create table • create index • Data Manipulation Language (DML) • select (retrieve) • update • insert • delete Query Languages

  20. Select - Basic Form(select from where) • Cartesian product followed by select and project. • select project-list • from Cartesian-product-list • where select-condition(s) • Abstract example: Given tables R(A,B) and S(B,C) • select R.A, R.B, S.C • from R, S • where R.A > 10 BUT - Duplicates NOT eliminated. Bag vs. Set. Query Languages

  21. CASE STUDY EXAMPLE EMP( EMPNO, … , DNO, JOB, . . .) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician Query Languages

  22. Select as a JOIN • Cartesian product followed by • select (“join” & “select” conditions) and project. • select project-list • from Cartesian-product-list • where join-condition • and select-condition • Abstract example: Given tables R(A,B) and S(B,C) • select R.A, R.B, S.C • from R, S • where R.B = S.B /* join condition */ • and R.A > 10 /* “select” condition */ How is this related to these relational operators? Select (), Project (),Join: Natural (*) Query Languages

  23. Using EMP and DEPTFrom Relational Algebra to SQL EMP( EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION)DEPT( DNO, DNAME, LOC) • List the names, employee numbers, department numbers and locations for all clerks. • select NAME, EMPNO, E.DNO, LOC • from EMP E, DEPT D • where E.DNO = D.DNO /* join condition */ • and JOB = ‘Clerk’ /* “select” condition */ • . Query Languages

  24. Duplicates in project - must use explicit distinct List the different department numbers in theEMP table (eliminate duplicates). select distinctDNO from EMP • Specify sort order List employee number, name, and salary of employees in department 50. select EMPNO, NAME, SAL from EMP where DNO = 50 order byEMPNO Query Languages

  25. Union List the numbers of those departments which have an employee named ‘Smith’ or are located in ‘Columbus’. select DNO from EMP where ENAME = ‘Smith’ union select DNO from DEPT where LOC = ‘Columbus’ Duplicates ARE eliminated by default. union all - leaves duplicates Query Languages

  26. Functions and Groups • List the departments (DNO) and the average salary of each. select DNO, avg(SAL) from EMP E, DEPT D where E.DNO = D.DNO group by DNO • List the departments (DNO, DNAME) in which the average employee salary < $25,000. select DNO, DNAME from EMP E, DEPT D where E.DNO = D.DNO group by DNO, DNAME havingavg(SAL) < 25000 Query Languages

  27. Nested Select: No analog in Relational Algebra • List names of employees in departments 25, 47 and 53. select NAME from EMP where DNO in (25, 47, 53) • List names of employees who work in departments in Ann Arbor. select NAME from EMP where DNO in ( select DNO from DEPT where LOC = ‘Ann Arbor’ ) Query Languages

  28. Null Values • All of the following conditions are always false. • null > 25 null < 25 null = 25 null <> 25 • null >= 25 null <= 25 null = null null <> null • However we can use the following: • select NAME • from EMP • where SAL < 35000 • or SAL is null Query Languages

  29. Q. Find the numbers of those departments that have employees who can do some job that is done by an employee in department D3. Answer: D1, D2, D3 exists, select DNO from DEPT where exists ( select * from EMP ED3 where ED3.DNO = ‘D3’ and exists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO)) The order of the two “selects” does not matter. Query Languages

  30. Ok, lets see why this works Since there is an external reference to DEPT within the second Nested select we will execute the nested select for each DEPT Tuple. Assume the DEPT table contains only three tuples corresponding to D1, D2 and D3 in that order. The first tuple we will evaluate is the DEPT.DNO = D1 The first nested select will simply highlight the tuples in department D3. The second nested select will point to tuples related to the tuple currently pointed to within the department table.

  31. EY.EMPNO, EY.DNO, EY.JOB,..) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician ED3.EMPNO, ED3.DNO, ED3.JOB,..) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician select DNO from DEPT where exists ( select * from EMP ED3 where ED3.DNO = ‘D3’ and exists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO))

  32. Q. Find the numbers of those departments that have employees who can do allthe jobs that are done by an employee in department D3. Answer: D1, but not D2 EMP( EMPNO, … , DNO, JOB, . . .) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician for all, Query Languages

  33. Q. Find the numbers of those departments that have employees who can do allthe jobs that are done by an employee in department D3. Answer: D1,D3, but not D2 select DNO from DEPT where for all ( select * from EMP ED3 where ED3.DNO = ‘D3’ and exists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO)) However no for all exists in SQL. Query Languages

  34. Q. Find the numbers of those departments that have employees who can do allthe jobs that are done by an employee in department D3. Answer: D1,D3, but not D2 However no for all exists in SQL. Use two not exists. (see page 207 for a good list of mathematical logic operations/tricks) select DNO from DEPT where not exists ( select * from EMP ED3 where ED3.DNO = ‘D3’ and notexists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO)) Query Languages

  35. Ok, lets see why this works Since there is an external reference to DEPT within the second Nested select we will execute the nested select for each DEPT Tuple. Assume the DEPT table contains only three tuples corresponding to D1, D2 and D3 in that order. The first tuple we will evaluate is the DEPT.DNO = D1 The first nested select will simply highlight the tuples in department D3. The second nested select will point to tuples related to the tuple currently pointed to within the department table.

  36. ED3.EMPNO, ED3.DNO, ED3.JOB,..) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician EY.EMPNO, EY.DNO, EY.JOB,..) 100 D3 electrician 200 D3 plumber 300 D3 electrician 400 D1 electrician 500 D1 plumber 600 D1 carpenter 700 D2 electrician 800 D2 carpenter 900 D2 electrician select DNO from DEPT where not exists ( select * from EMP ED3 where ED3.DNO = ‘D3’ and notexists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO))

  37. Q12. Find the numbers of those departments that have employees who can do allthe jobs that are done by an employee in department D3. Answer: D1, but not D2 select DNO from DEPT where not exists ( select * from EMP ED3 where ED3.DNO = ‘D3’ and notexists ( select * from EMP EY where EY.JOB = ED3.JOB and EY.DNO = DEPT.DNO) ) and DNO <> ‘D3’ Eliminate department D3. Query Languages

  38. 980 900950 700800600 850 100 200 300400 500 SQL:1999 (SQL 3)Recursive Closure Q 16. List all the superiors of EMPNO 500. 600,950, 980 Q 17. List all those supervised by EMPNO 900. 700, 800, 100, 200, 300, 400 How to express these queries? Query Languages

  39. Q 16. Given EMP (EMPNO, MGR, ...),list all the superiors of EMPNO 500.Generate SUPERIORS (EMPNO, MGR) with recursive SUPERIORS(EMPNO, MGR) as (select EMPNO, MGR from EMP where EMPNO = 500 union all select SUPERIORS.EMPNO, EMP.MGR from SUPERIORS, EMP where EMP.EMPNO = SUPERIOR.MGR) select MGR from SUPERIORS Initial table The recursion Just the superiors600 of500. 950 980 Query Languages

  40. Q 16. List all the superiors of EMPNO 500. SUPERIORS (EMPNO, MGR) 500 600 Initial table EMP( EMPNO, MGR,...) 100 700 200 700 300 800 400 800 500 600 600 950 700 900 800 900 850 950 900 980 950 980 980 980 500 950 500 980 First addition Second addition Query Languages

  41. Entity-Relationship (ER) Model(Peter P.-S. Chen) Review CIS 671 S. Parthasarathy Query Languages

  42. Helpful for conceptualizing the Real World • Entity: a thing that exists • e.g. person, automobile, department, employee • Entity Set: a group of similar entities • e.g. all persons, all automobiles, all employees • Relationship: association between entities • e.g. a person is assigned to a department • Relationship Set : set of similar relationships • Attribute: property of an entity or relationship • e.g. person - name, address • Domain: set of values allowed for an attribute Query Languages

  43. Example Employees E#, ENAME, ADDRESS Departments D#, DNAME Projects P#, PNAME Constraints (cardinality) 1. Employees may be assigned to only 1department at a time. 2. Employees may be assigned to severalprojects at once, each with an associated %time. Constraints (participation) 3. Employees must be assigned to a department. 4. Employees need not be assigned to any projects. Query Languages

  44. D# DNAME Is In E# ENAME ADDRESS P# %TIME PNAME Complete Picture Participation Department 1 N Cardinality Employee N Assigned To M Project Query Languages

  45. Is In %TIME Example: Relationships and Attributes 1. Cardinality N 1 Employee Department 1. Employees may be assigned to only 1department at a time. 2. N M Project Employee Assigned To %TIME 2. Employees may be assigned to severalprojects at once, each with an associated %time. Query Languages

  46. Is In %TIME %TIME Example: Relationships and Attributes Participation N 1 Employee Department 3. Total • 3. Employees must be assigned to a department. • Total: Each entity must be included at least once in the relationship. N M Project Employee Assigned To 4.Partial • 4. Employees need not be assigned to any projects. • Partial: Each entity instance need not be included at least once in the relationship. Query Languages

  47. LName LName FName FName E# Entity-Relationship Enhancements: Attributes 1. Simple (atomic) vs. Composite Attributes • Simple • Composite Employee Employee Employee Name Query Languages

  48. Has Major Entity-Relationship Enhancements: Attributes 2. Single-valued vs. Multi-valuedAttributes • Multi-valued • Multi-valued as Entity Major Student Major N M Student Major_Program Query Languages

  49. Member Of AvgSal Salary Entity-Relationship Enhancements: Attributes 3. Derived Attributes - Include in Department the average salary of the employees in the department. Employee Department Query Languages

  50. Dependent Dependents Of Entity-Relationship Enhancements: Entities • Weak Entity Type, Identifying Relationship Type, Partial Key • E.g. Represent all the dependents of each employee given his or her name. No ID number. Problem: Names are not unique across employees. 1 N Employee Name SSN Name Query Languages

More Related