1 / 53

MIS415 Module 2 Query Languages SQL

Structure of this quarter. Database Fundamentals. Relational Model. Normalization. ConceptualModeling. Query Languages. AdvancedSQL. Transaction Management. Java DB Applications

domani
Download Presentation

MIS415 Module 2 Query Languages SQL

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. MIS415 Module 2 Query Languages –SQL Arijit Sengupta

    2. Structure of this quarter

    3. Today’s Buzzwords Query Languages Formal Query Languages Procedural and Declarative Languages Relational Algebra Relational Calculus SQL Aggregate Functions Nested Queries

    4. Objectives At the end of the lecture, you should Get a formal as well as practical perspective on query languages Have a background on query language basics (how they came about) Be able to write simple SQL queries from the specification Be able to look at SQL queries and understand what it is supposed to do Be able to write complex SQL queries involving nesting Execute queries on a database system

    5. Set Theory Basics A set: a collection of distinct items with no particular order Set description: { b | b is a Database Book} {c | c is a city with a population of over a million} {x | 1 < x < 10 and x is a natural number} Most basic set operation: Membership: x ? S (read as x belongs to S if x is in the set S)

    6. Other Set Operations Addition, deletion (note that adding an existing item in the set does not change it) Set mathematics: Union R ? S = { x | x ? R or x ? S} Intersection R ? S = { x | x ?R and x ? S} Set Difference R – S = { x | x ? R and x ? S} Cross-product R x S = { <x,y> | x ? R and y ? S} You can combine set operations much like arithmetic operations: R – (S ? T) Usually no well-defined precedence

    7. Relational Query Languages Query languages: Allow manipulation and retrieval of data from a database. Relational model supports simple, powerful QLs: Strong formal foundation based on logic. Allows for much optimization. Query Languages != programming languages! QLs not expected to be “Turing complete”. QLs not intended to be used for complex calculations. QLs support easy, efficient access to large data sets.

    8. Formal Relational Query Languages Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation: Relational Algebra: More operational, very useful for representing execution plans. Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)

    9. Structured Query Language Need for SQL Operations on Data Types Definition Manipulation Operations on Sets Declarative (calculus) vs. Procedural (algebra) Evolution of SQL SEQUEL ..SQL_92 .. SQL_93 SQL Dialects Does SQL treat Relations as ‘Sets’?

    10. Preliminaries A query is applied to relation instances, and the result of a query is also a relation instance. Schemas of input relations for a query are fixed (but query will run regardless of instance!) The schema for the result of a given query is also fixed! Determined by definition of query language constructs. Positional vs. named-field notation: Positional notation easier for formal definitions, named-field notation more readable. Both used in SQL

    11. Example Instances Students, Registers, Courses relations for our examples.

    12. Basic operations: Selection ( ) Selects a subset of rows from relation. Projection ( ) Deletes unwanted columns from relation. Cross-product ( ) Allows us to combine two relations. Set-difference ( ) Tuples in reln. 1, but not in reln. 2. Union ( ) Tuples in reln. 1 and in reln. 2. Additional operations: Intersection, join, division, renaming: Not essential, but (very!) useful. Since each operation returns a relation, operations can be composed! (Algebra is “closed”.) Relational Algebra

    13. Projection Deletes attributes that are not in projection list. Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation. Projection operator has to eliminate duplicates! (Why??) Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

    14. Vertical Slices Projection Specifying Elements

    15. Does SQL treat Relations as ‘Sets’? What are the different salaries we pay to our employees? select salary from EMPLOYEE; OR is the following better? select DISTINCT salary from EMPLOYEE;

    16. Selection Selects rows that satisfy selection condition. No duplicates in result! (Why?) Schema of result identical to schema of (only) input relation. Result relation can be the input for another relational algebra operation! (Operator composition.)

    17. Horizontal Slices Restriction Specifying Conditions

    18. Specifying Conditions

    19. Pattern Matching

    20. Missing or Incomplete Information

    21. Horizontal and Vertical Query: List all student ID, names and addresses who have GPA > 3.0 and date of birth before Jan 1, 1980. select StudentID, Name, Address from STUDENT where GPA > 3.0 and DOB < ‘1-Jan-80’ order by Name DESC; Algebra: ? StudentID,name, address (? GPA > 3.0 and DOB < ‘1-Jan-80’ (STUDENT)) Calculus: {t.StudentID, t.name, t.address | t ? Student ?t.GPA > 3.0 ? t.DOB < ‘1-Jan-80’} Order by sorts result in descending (DESC) order. Note: The default order is ascending (ASC) as in: order by Name;

    22. Union, Intersection, Set-Difference All of these operations take two input relations, which must be union-compatible: Same number of fields. `Corresponding’ fields have the same type. What is the schema of result?

    23. Union

    24. Union Compatibility

    25. Union, Intersect, Minus

    26. Cross-Product Each row of S1 is paired with each row of R1. Result schema has one field per field of S1 and R1, with field names `inherited’ if possible. Conflict: Both S1 and R1 have a field called sid.

    27. Joins Condition Join: Result schema same as that of cross-product. Fewer tuples than cross-product, might be able to compute more efficiently Sometimes called a theta-join.

    28. Joins Equi-Join: A special case of condition join where the condition c contains only equalities. Result schema similar to cross-product, but only one copy of fields for which equality is specified. Natural Join: Equijoin on all common fields.

    29. Find names of students who have taken course #103

    30. Connecting/Linking Relations

    31. Join

    32. Join Forms General Join Forms Equijoin Operator Dependent Natural Join Outer Join Left Right Full

    33. Find names of students who have taken a CIS course Information about departments only available in Courses; so need an extra join:

    34. Find students who have taken an MIS or a CS course Can identify all MIS or CS courses, then find students who have taken one of these courses:

    35. Find students who have taken a CIS and an ECI Course Previous approach won’t work! Must identify students who have taken CIS courses, students who have taken ECI courses, then find the intersection (note that sid is a key for Students):

    36. Relational Calculus Comes in two flavours: Tuple relational calculus (TRC) and Domain relational calculus (DRC). Calculus has variables, constants, comparison ops, logical connectives and quantifiers. TRC: Variables range over (i.e., get bound to) tuples. DRC: Variables range over domain elements (= field values). Both TRC and DRC are simple subsets of first-order logic. Expressions in the calculus are called formulas. An answer tuple is essentially an assignment of constants to variables that make the formula evaluate to true.

    37. Find students with GPA > 3.7 who have taken a CIS Course

    38. Find students who have taken all CIS courses

    39. Monotonic and Non-Monotonic Queries Monotonic queries: queries for which the size of the results either increase or stay the same as the size of the inputs increase. The result size never decreases Non-monotonic queries: queries for which it is possible that the size of the result will DECREASE when the size of the input increases Examples of each? Which of the algebra operations is non-monotonic? What does this signify?

    40. Summaries and Aggregates

    41. Aggregate Functions

    42. Grouping Results Obtained

    43. Selections on Groups

    44. Grouping Results after Join

    45. Nesting Queries SELECT attribute(s) FROM relation(S) WHERE attr [not] {in | comparison operator | exists } ( query statement(s) ); List names of students who are taking “BA201” select Name from Student where StudentID in ( select StudentID from REGISTRATION where course#=‘BA201’);

    46. Sub Queries

    47. Sub Queries

    48. Relational Views Relations derived from other relations. Views have no stored tuples. Are useful to provide multiple user views.

    49. View Creation Create View view-name [ ( attr [ , attr ] ...) ] AS subquery [ with check option ] ; DROP VIEW view-name; Create a view containing the student ID, Name, Age and GPA for those who are qualified to take 300 level courses, i.e., GPA >=2.0.

    50. View Options With Check Option enforces the query condition for insertion or update To enforce the GPA >=2.0 condition on all new student tuples inserted into the view A view may be derived from multiple base relations Create a view that includes student IDs, student names and their instructors’ names for all CIS 300 students.

    51. View Retrieval Queries on views are the same as that on base relations. Queries on views are expanded into queries on their base relations. select Name, Instructor-Name from CIS300-Student where Name = Instructor-Name;

    52. View: Update Update on a view actually changes its base relation(s)! update Qualified-Student set GPA = GPA-0.1 where StudentID = ‘s3’; insert into Qualified-Student values ( ‘s9’, ‘Lisa’, 4.0 ) insert into Qualified-Student values ( ‘s10’, ‘Peter’, 1.7 ) Why are some views not updateable? What type of views are updateable?

    53. Non-monotonic queries – again! Need to use either MINUS or NOT EXISTS! Find courses where no student has gpa over 3.5 Find students who have taken all courses that Joe has taken How would you solve these?

    54. Summary SQL is a low-complexity, declarative query language The good thing about being declarative is that internally the query can be changed automatically for optimization Good thing about being low-complexity? No SQL query ever goes into an infinite loop No SQL query will ever take indefinite amount of space to get the solution Can be used for highly complex problems!

More Related