180 likes | 294 Views
CS609 Introduction. Databases. Current state? Future?. Relational Model (1970 E.F. Codd) - Chapters 2-3. A data model: types relationships constraints basic operations relational model - files + mathematical foundation (set theory)
E N D
CS609 Introduction
Databases • Current state? • Future?
Relational Model (1970 E.F. Codd) - Chapters 2-3 • A data model: • types relationships constraints basic operations • relational model - files + mathematical foundation (set theory) Relation - Table Rows - Tuples Attributes - Columns
Relational Rules 1. 1NF 2. Access rows by content only (can't say 3rd row), unique rows 3. Integrity constraints • key constraints (all tuples distinct) • entity constraints - no primary key null • domain constraint - defines domain values • referential constraint - if a tuple refers to another tuple, it must be an existing tuples • semantic constraint - defines rules for relationships
Notation • Relational algebra (algebraic notation) and relational calculus - logical notation • used to demonstrate potential for query language of relational DB • algebra and calculus equivalent in expressive power
Operations • Set theoretic ops: • union, intersection, difference, cartesian product • Native relational ops: • project, select, division, join - equi, theta, natural join • R / S - result is those rows t such that for every row s in S, if concatenate t and s, can be found in R • Other ops • outer joins - left, right, aggregate (set) functions, grouping
Operations cont’d • Precedence of relational algebra ops: (in order of highest) • Project • Select • cartesian product • join & divide • Minus • Union • intersection
Complete Set • Can express any query (except aggregate functions) with the complete set • Select s • Project p • Cartesian product X • Union U • Difference - R/S = R - ((R X S) - R) given R(A1...An,B1...Bm) S(B1...Bm) (pA1...AnR) - pA1...An((pA1...AnR X S) - R) R ^ S = (RUS)-((R-S) U (S-R))
Query Language • SQL or SEQUEL - (Structured English Query Language) • High-level DB language created at IBM in 1970's DB system products using SQL released in early 80's • Provides DML and DDL • Standard SQL • SQL-86, SQL-89 - minimal standard, still incompatibility product features • ANSI/ISO - (SQL-92) SQL2 - tried to avoid incompatibility problems • X/Open standard - extensions plus portability • SQL-3 • Current standard SQL:1999, SQL-99 • Future standard SQL:2003, SQL:200n
SQL • Subselect - formal definition: Select expr {, expr} From tablename [alias] {, tablename [alias]} [Where search_condition] [Group By col {, col}] [Having search_condition] • Full Select - formal definition: Subselect {Union [all] subselect} [Order By result_col [asc | desc] {, result_col [asc | desc]}] http://download-west.oracle.com/docs/cd/A91202_01/901_doc/server.901/a90125/statements_103.htm#2080289
Sample SQL queries • List the ssn for each female employee and pname of projects she works_on. • List the ssn and salary for all employees who do not have a son as a dependent. • List lname, fname of employees who have more than 2 dependents. • List lname, pname, plocation for all projects worked on by femail employees with > 2 dependents. • List employee ssn and salary (with 10% increase) for all employees who do not work on project # 30. • Give every employee a 10% raise.
Power/Weakness of SQL • SQL is a non-procedural language • user specifies what is desired rather than how (e.g. no need to write loop statement to traverse tuples) • no implied ordering • query optimizer decides how • desired result described in one statement • Avoids procedural complexities • A language with the power of relational algebra defined as relationally complete (Codd)
Weakness of SQL • 1) Too many equivalent forms • e.g. in, =any, exists not in, <> all • 2) No non-procedural language can have Turing power (computational complete), e.g. perform any computational procedure that can be specified in algorithmic terms • Not true of SQL only relationally complete
Weakness • 3) Some capabilities missing • Selects cannot be arbitrarily nested like relational algebra • loss of expressive power of a relational algebra expression • 4) Cannot nest set functions • e.g. Average of total dollar sales • 5) No statistical functions (besides avg) • e.g. variance, mode
Weakness • 6) Can find reports SQL can't create - need embedded SQL • e.g. categorize total sales based on size of sale • 7) No Transitive closure - arbitrary number • managers of managers • 8) Limited by Boolean conditions • Cannot retrieve 5 out of 6 • Cannot list 20 agents with largest sales • No distance to keywords • Are 1-8 still true of ORACLE?
Strengths • While selects cannot be arbitrarily nested - have added power in search (where) condition • Other strengths of SQL?