300 likes | 495 Views
From the Calculus to the Structured Query Language. Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems September 19, 2007. Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan. Administrivia.
E N D
From the Calculus to the Structured Query Language Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems September 19, 2007 Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan
Administrivia • Recall: no class next Monday 9/24 – special TA office hours instead • SQL discussion continues 9/26 • Preparation for Homework 2 (handed out next week) • To test your SQL queries, we have Oracle set up on eniac.seas.upenn.edu • Go to: www.seas.upenn.edu/~zives/cis550/oracle-faq.htmlClick on “create Oracle account” linkEnter your login info so you’ll get an Oracle account
Recall Last Time • Which students have taken more than one course from the same professor? {<name> | sid,cid,fid,cid2 . (<sid,name> ϵ STUDENTS ^<sid,_,cid> ϵ Takes ^ <fid,cid> ϵ Teaches ^ <sid,_,cid2> ϵ Takes ^ <fid,cid2> ϵ Teaches ^ cid cid2)} OR {<name> | sid,cid,fid . (<sid,name> ϵ STUDENTS ^<sid,_,cid> ϵ Takes ^ <fid,cid> ϵ Teaches ^ cid2 (<sid,_,cid2> ϵ Takes ^ <fid,cid2> ϵ Teaches ^ cid cid2))}
Algebra vs. Calculus • We’ve claimed thatthe calculus (when safe)and the algebra areequivalent • Thus (core) SQL => calculus algebramakes sense • Let’s look moreclosely at this… STUDENT COURSE Takes Calculus SELECT * FROM STUDENT, Takes, COURSE WHERE STUDENT.sid = Takes.sID AND Takes.cID = cid
Translating from RA to DRC • Core of relational algebra: , , , x, - • We need to work our way through the structure of an RA expression, translating each possible form. • Let TR[e] be the translation of RA expression e into DRC. • Relation names: For the RA expression R, the DRC expression is {<x1,x2, …, xn>| <x1,x2, …, xn> R}
Selection: TR[R] • Suppose we have (e’), where e’ is another RA expression that translates as: TR[e’]= {<x1,x2, …, xn>| p} • Then the translation ofc(e’) is {<x1,x2, …, xn>| p’}where’ is obtained fromby replacing each attribute with the corresponding variable • Example: TR[#1=#2 #4>2.5R] (if R has arity 4) is {<x1,x2, x3, x4>| < x1,x2, x3, x4> R x1=x2 x4>2.5}
Projection: TR[i1,…,im(e)] • If TR[e]= {<x1,x2, …, xn>| p} thenTR[i1,i2,…,im(e)]= {<x i1,x i2, …, x im>| xj1,xj2, …, xjk.p}, where xj1,xj2, …, xjk are variablesin x1,x2, …, xn that are not inx i1,x i2, …, x im • Example: With R as before,#1,#3 (R)={<x1,x3>| x2,x4. <x1,x2, x3,x4> R}
Union: TR[R1 R2] • R1 and R2must have the same arity • For e1 e2, where e1, e2 are algebra expressions TR[e1]={<x1,…,xn>|p} and TR[e2]={<y1,…yn>|q} • Relabel the variables in the second: TR[e2]={< x1,…,xn>|q’} • This may involve relabeling bound variables in q to avoid clashes TR[e1e2]={<x1,…,xn>|pq’}. • Example: TR[R1 R2] = {< x1,x2, x3,x4>| <x1,x2, x3,x4>R1 <x1,x2, x3,x4>R2
Other Binary Operators • Difference: The same conditions hold as for union If TR[e1]={<x1,…,xn>|p} and TR[e2]={< x1,…,xn>|q} Then TR[e1-e2]= {<x1,…,xn>|pq} • Product: If TR[e1]={<x1,…,xn>|p} and TR[e2]={< y1,…,ym>|q} Then TR[e1 e2]= {<x1,…,xn, y1,…,ym >| pq} • Example:TR[RS]= {<x1,…,xn, y1,…,ym >| <x1,…,xn> R <y1,…,ym > S }
What about the Tuple Relational Calculus? • We’ve been looking at the Domain Relational Calculus • The Tuple Relational Calculus is nearly the same, but variables are at the level of a tuple, not an attribute • {Q | 9 S COURSES, 9 T 2 Takes (S.cid = T.cid Æ Q.cid = S.cid Æ Q.exp-grade = T.exp-grade)}
Tuple Relational Calculus (in More Detail) Queries of form: {T | p} Predicate: boolean expression over Tx attribs • Expressions: TxR TX.a op TY.b TX.a op constconst op TX.a T.a = Tx.a where op is , , , , , Tx,… are tuple variables, Tx.a, … are attributes • Complex expressions: e1e2, e1e2, e, and e1e2 • Universal and existential quantifiers predicate
Domain Relational Calculusto Tuple Relational Calculus • {<subj> | 9 cid, sem, cid, sid (<cid, subj, sem> 2 COURSE Æ <sid, “C”, cid> 2 Takes} • {<cid> | 9 s1, s2 (<cid, s1, s2> 2 COURSE Æ9 cid2, s3, s4 (<cid2, s3, s4> 2 COURSE Æ (cid > cid2)))}
Mini-Quiz on the Relational Calculus How do you write: • TRC: Which faculty teach every course?
Limitations of the Relational Algebra / Calculus Can’t do: • Aggregate operations (sum, count) • Recursive queries (arbitrary # of joins) • Complex (non-tabular) structures • Most of these are expressible in SQL, OQL, XQuery – using other special operators • Sometimes we even need the power of a Turing-complete programming language
Summary • Can translate relational algebra into relational calculus • DRC and TRC are slightly different syntaxes but equivalent • Given syntactic restrictions that guarantee safety of DRC query, can translate back to relational algebra • These are the principles behind initial development of relational databases • SQL is close to calculus; query plan is close to algebra • Great example of theory leading to practice!
Basic SQL: A Friendly FaceOver the Tuple Relational Calculus SELECT [DISTINCT]{T1.attrib, …, T2.attrib}FROM {relation} T1, {relation} T2, …WHERE {predicates} Let’s do some examples, which will leverage your knowledge of the relational calculus… • Faculty ids • Course IDs for courses with students expecting a “C” • Courses taken by Jill select-list from-list qualification
Our Example Data Instance STUDENT COURSE Takes PROFESSOR Teaches
Some Nice Features • SELECT * • All STUDENTs • AS • As a “range variable” (tuple variable): optional • As an attribute rename operator • Example: • Which students (names) have taken more than one course from the same professor?
Expressions in SQL • Can do computation over scalars (int, real or string) in the select-list or the qualification • Show all student IDs decremented by 1 • Strings: • Fixed (CHAR(x)) or variable length (VARCHAR(x)) • Use single quotes: ’A string’ • Special comparison operator: LIKE • Not equal: <> • Typecasting: • CAST(S.sid AS VARCHAR(255))
Set Operations • Set operations default to set semantics, not bag semantics: (SELECT … FROM … WHERE …){op}(SELECT … FROM … WHERE …) • Where op is one of: • UNION • INTERSECT, MINUS/EXCEPT(many DBs don’t support these last ones!) • Bag semantics: ALL
Exercise • Find all students who have taken DB but not AI • Hint: use EXCEPT
Nested Queries in SQL • Simplest: IN/NOTIN • Example: Students who have taken subjects that have (at any point) been taught by Martin
Correlated Subqueries • Most common: EXISTS/NOT EXISTS • Find all students who have taken DB but not AI
Universal and Existential Quantification • Generally used with subqueries: • {op}ANY,{op}ALL • Find the students with the best expected grades
Table Expressions • Can substitute a subquery for any relation in the FROM clause: SELECT S.sidFROM (SELECT sid FROM STUDENT WHERE sid = 5) SWHERE S.sid = 4 Notice that we can actually simplify this query! What is this equivalent to?
Aggregation • GROUP BY SELECT{group-attribs}, {aggregate-operator}(attrib)FROM{relation} T1, {relation} T2, …WHERE {predicates}GROUP BY {group-list} • Aggregate operators • AVG, COUNT, SUM, MAX, MIN • DISTINCT keyword for AVG, COUNT, SUM
Some Examples • Number of students in each course offering • Number of different grades expected for each course offering • Number of (distinct) students taking AI courses
What If You Want to Only ShowSome Groups? • The HAVING clause lets you do a selection based on an aggregate (there must be 1 value per group): SELECT C.subj, COUNT(S.sid)FROM STUDENT S, Takes T, COURSE CWHERE S.sid = T.sid AND T.cid = C.cidGROUP BY subjHAVING COUNT(S.sid) > 5 • Exercise: For each subject taught by at least two professors, list the minimum expected grade
Aggregation and Table Expressions • Sometimes need to compute results over the results of a previous aggregation:SELECT subj, AVG(size)FROM ( SELECT C.cid AS id, C.subj AS subj, COUNT(S.sid) AS size FROM STUDENT S, Takes T, COURSE C WHERE S.sid = T.sid AND T.cid = C.cid GROUP BY cid, subj)GROUP BY subj
Something to Ponder • Tables are great, but… • Not everyone is uniform – I may have a cell phone but not a fax • We may simply be missing certain information • We may be unsure about values • How do we handle these things?