360 likes | 628 Views
C20.0046: Database Management Systems Lecture #6. M.P. Johnson Stern School of Business, NYU Spring, 2008. Agenda. Receive proj1 Basic SQL RA…. Recap: You are here. First part of course is done: conceptual foundations You now know: E/R Model Relational Model Relational Algebra
E N D
C20.0046: Database Management SystemsLecture #6 M.P. Johnson Stern School of Business, NYU Spring, 2008 M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Agenda • Receive proj1 • Basic SQL • RA… M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Recap: You are here • First part of course is done: conceptual foundations • You now know: • E/R Model • Relational Model • Relational Algebra • You now know how to: • Capture part of world as an E/R model • Convert E/R models to relational models • Convert relational models to good (normal) forms • Next: • Create, update, query tables with R.A/SQL • Write SQL/DB-connected applications M.P. Johnson, DBMS, Stern/NYU, Spring 2008
3-minute Normalization Review • Q: What’s required for BCNF? • Q: How do we fix a non-BCNF relation? • Q: If AsBs violates BCNF, what do we do? • Q: Can BCNF decomposition ever be lossy? • Q: How do we combine two relations? • Q: Can BCNF decomp. lose FDs? • Q: Why would you ever use 3NF? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Normalization example: bookstore M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Normalization example: bookstore • Orders tbl: key = ordernum,isbn • Orders tbl FDs: M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Next topic: SQL • Standard language for querying and manipulating data Structured Query Language • Many standards: ANSI SQL, SQL92/SQL2, SQL3/SQL99 • Originally: Structured English Query Language (SEQUEL) • Vendors support various subsets/extensions • We’ll do Oracle/MySQL/generic • “No one ever got fired for buying Oracle.” • Basic form (many more bells and whistles in addition): SELECT attributes FROM relations (possibly multiple, joined) WHERE conditions (selections) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Data Types in SQL • Characters: • CHAR(20) -- fixed length • VARCHAR(40) -- variable length • Numbers: • BIGINT, INT, SMALLINT, TINYINT • REAL, FLOAT -- differ in precision • MONEY • Times and dates: • DATE • DATETIME -- SQL Server M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Table name “Tables” Attribute names Product Tuples or rows M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Simple SQL Query Product SELECT *FROM ProductWHERE category='Gadgets' “selection” M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Simple SQL Query Product SELECT PName, Price, ManufacturerFROM ProductWHERE Price > 100 “selection” and “projection” M.P. Johnson, DBMS, Stern/NYU, Spring 2008
A Notation for SQL Queries Input Relation Product(PName, Price, Category, Manfacturer) SELECT Name, Price, ManufacturerFROM ProductWHERE Price > 100 (PName, Price, Manfacturer) Output Relation M.P. Johnson, DBMS, Stern/NYU, Spring 2008
The WHERE clause • Contains a boolean expression • Teach literal is a test: x = y, x < y, x <= y, etc. • For numbers, they have the usual meanings • For CHARs/VARCHARs: lexicographic ordering • Expected conversion between CHAR and VARCHAR • For dates and times, what you expect M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Complex RA Expressions • Schema: Movies (Title, year, length, inColor, studioName, Prdcr#) • Q: How long was Star Wars (1977)? • Strategy: find the row with Star Wars; then project the length field M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Combining operations • Query: Which Fox moves were >= 100 minutes long? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Operators • Cross product again • “Cartesian Product” • Each tuple in R1 combines w/each tuple in R2 • Algebraic notation: R1 R2 • Not actual SQL! • If R1, R2 fields overlap, include both and disambiguate: R1.A, R2.A • Q: Where does the name come from? • Q: If R1 has n1 rows and R2 has n2, how large is R1 x R2? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Cartesian product example Hillary-addresses Hillary-jobs Hillary-addresses x Hillary-jobs M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Operators • Natural join: our join up to now • merging shared attributes • Algebraic notation: R1 R2 • SQL query: • a = shared fields SELECT *FROM R1,R2WHERE R1.a = R2.a M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Natural join example Addresses Jobs Addresses Jobs M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Natural Join • R S • R S= ? • Unpaired tuples called dangling M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Natural Join • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R S ? • Given R(A, B, C), S(D, E), what is R S? • Given R(A, B), S(A, B), what is R S? M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Join on arbitrary test U V “Theta-join” U V A<D M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Next (parallel) topic: relational algebra • Projection • Selection • Cartesian Product • Joins: natural joins, theta joins • Set operations: union, intersection, difference • Combining operations to form queries • Dependent and independent operations M.P. Johnson, DBMS, Stern/NYU, Spring 2008
What is relational algebra? • An algebra for relations • “High-school” algebra: an algebra for numbers • Algebra = formalism for constructing expressions • Operations • Operands: Variables, Constants, expressions • Expressions: • Vars & constants • Operators applied to expressions • They evaluate to values M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Why do we care about relational algebra? • The exprs are the form that questions about the data take • The relations these exprs cash out to are the answers to our questions • RA ~ more succinct rep. of many SQL queries • DBMS parse SQL into something like RA • First proofs of concept for RDBMS/RA: • System R at IBM • Ingress at Berkeley • “Modern” implementation of RA: SQL • Both state of the art, mid-70s M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Relation operators • Basic operators: • Selection: s • Projection: P • Cartesian Product: • Other set-theoretic ops: • Union: • Intersection: • Difference: - • Additional operators: • Joins (natural, equijoin, theta join, semijoin) • Renaming: r • Grouping… M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Selection op • Selects all tuples satisfying a condition • Notation: sc(R) • Examples • ssalary > 100000(Employee) • sname = “Smith”(Employee) • The condition c can have • comparison ops:=, <, , >,, <> • boolean ops: and, or M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Theater N’hood Title Sunshine Village Annie Hall Sunshine Village Bad Edu. Theater N’hood Title Film Forum Village Masc. Fem. Sunshine Village Annie Hall Sunshine Village Bad Edu. Selection example • Select the movies at Angelica: • sTheater=“Sunshine”(Showings) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Projection op • Keep only certain columns • Projection: op we used for decomposition • Eliminates other columns, then removes duplicates • Notation: PA1,…,An(R) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Join op • Corresponds to SQL query doing cross & equality test • Specifically: R1 R2 = Pevery att once(sshared atts =(R1 R2)) • I.e., first compute the cross product R1 x R2 • Next, select the rows in which shared fields agree • Finally, project onto the union of R1 and R2’s fields (remove duplicates) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Rename op • Changes the schema, not the instance • Notation: rB1,…,Bn(R) • r is spelled “rho”, pronounced “row” • Example: • Employee(ssn,name) • rE2(social, name)(Employee) • Or just: rE(Employee) M.P. Johnson, DBMS, Stern/NYU, Spring 2008
RA SQL • SQL SELECT RA Projection P • SQL WHERE RA Selection s • SQL FROM RA Join/cross • Comma-separated list… • SQL renaming RA rho r • More ops later • Keep RA in the back of your mind… M.P. Johnson, DBMS, Stern/NYU, Spring 2008
Review • Examples from sqlzoo.net SELECTL FROM R1, …, Rn WHERE C PL(sC(R1 x … Rn) M.P. Johnson, DBMS, Stern/NYU, Spring 2008