370 likes | 465 Views
Extending Relational Databases. Toon Calders t.calders@tue.nl. Last Lectures. Relational query languages are limited Transitive closure cannot be expressed Gaifman-locality Also w.r.t. data storage relational model is limited Simple data No set types, arrays, …. This lecture.
E N D
Extending Relational Databases Toon Calders t.calders@tue.nl
Last Lectures • Relational query languages are limited • Transitive closure cannot be expressed • Gaifman-locality • Also w.r.t. data storage relational model is limited • Simple data • No set types, arrays, …
This lecture • Extending SQL • Recursion • Nested relations • Object-oriented and object-relational database systems
This lecture • Extending SQL • Recursion • Nested relations • Object-oriented and object-relational database systems
Recursion in SQL • SQL:1999 permits recursive view definition • Example: find all employee-manager pairs, where the employee reports to the manager directly or indirectly (that is manager’s manager, manager’s manager’s manager, etc.) This example view, empl, is called the transitive closure of the manager relation
Recursion in SQL with recursive empl (employee_name, manager_name ) as ( select employee_name, manager_name from manager union select manager.employee_name,empl.manager_name from manager, empl where manager.manager_name = empl.employee_name ) select * from empl
This lecture • Extending SQL • Recursion • Nested relations • Object-oriented and object-relational database systems
Nested Relations • Very simple example: • Class book • set of authors • title • set of keywords Extremely simple to model in any programming language Hard in relational database!
Nested Relations Either we ignore the multivalued dependencies This table is in 3NF, BCNF
Nested Relations Or we go to 4NF
Nested Relations • 4NF design requires users to include joins in their queries. • 1NF relational view • eliminates the need for users to perform joins, • but loses the one-to-one correspondence between tuples and objects. • has a large amount of redundancy
Nested Relational Algebra • Types: • Set of constants C = {c1,c2, …} • If T1, …, Tk are types, then also {(T1,…,Tk)} • Domain: • Dom(C) = {c1,c2, …} • Dom ( {(T1,…,Tk)} ) = P( { (x1,…,xk) | xiTi, 1 ≤ i ≤ k} )
Nested Relational Algebra • Example: • C = {1,2,3,…} • Type { (C, { ( C, { (C) } ) } ) } has domain: • {(C)} { {}, {(1)}, {(2)}, {(1),(2)}, ... } Set of natural numbers • { ( C, { (C) } ) } Set of pairs (c,S), c is a number, S a set of numbers • All Set of pairs (c,T), T is set of pairs (c,S)
Nested Relational Algebra • Nested relation of type (T1,…,Tn) • Subset of Dom( {(T1,…,Tn)} ). • Nested relational algebra • The usual operators , , , -, , • Also Nest and Unnest: • U$i (R): remove nesting from ith column of R • N$i1, …$ik (R): nest columns $i1 … $ik
Nested Relational Algebra R = { ( a, b ), ( a, c ), ( d, b ), ( d, c ) } N$2 R ={ ( a, {b,c} ), ( d, {b,c} ) } U$2 (N$2 R) produces again original R
Flat-Flat Theorem • Let Q be a nested-relational algebra expression • Q takes a non-nested relation as input • Q produced a non-nested relation as output • Type (C,…,C) (C,…,C) • Then: • Q can be rewritten as a normal relational algebra expression; (i.e., one without nesting) • Result is actually stronger: • Nested d deep nested d’ deep: no need for intermediate results having depth > d, d’
Flat-Flat Theorem • Importance? • Can be used by query optimizers No need to introduce intermediate nesting Standard techniques can be used
Nesting in SQL • Nesting is the opposite of unnesting, creating a collection-valued attribute • Many commercial database systems support nesting in one way or another • NOTE: SQL:1999 does not support nesting • Nesting can be done in a manner similar to aggregation, but using the function set() in place of an aggregation operation, to create a set
Nesting in SQL select title, author, pub_name, pub_branch, set(keyword) as keyword-listfrom flat-booksgroupby title, author, pub_name, pub_branch select title, set(author) as author-list, pub_name, pub_branch,set(keyword) as keyword-listfrom flat-booksgroupby title, pub_name, pub_branch,
Another approach to creating nested relations is to use subqueries in the select clause. select title, ( selectauthorfromflat-books as MwhereM.title=O.title) as author-set,pub-name, pub-branch, (selectkeywordfromflat-books as NwhereN.title = O.title) as keyword-setfrom flat-books as O Nesting (Cont.)
This lecture • Extensions to SQL • Recursion • Enumeration types • Nested relations • Object-oriented and object-relational database systems
Motivation for OO Databases • Many applications require the storage and manipulation of complex data • design databases • geometric databases • … • Object-Oriented programming languages manipulate complex objects • classes, methods, inheritance, polymorphism
Motivation for OO Databases • In many applications persistency of the data is nevertheless required • protection against system failure • consistency of the data • Mapping: object in OO language tuples of atomic values in relational database is often problematic • Impedance mismatch
Motivation for OO Databases • Need for object-oriented features in databases • Object-Oriented Databases • Object-Relational Databases • Need for more powerful data manipulation • Extend expressive power of SQL; include functions, recursion
Motivation for OO Databases • Often we want to manipulate the data in the database itself • no data transmission costs • client has limited computation power • Expressive power of SQL is quite limited • allows for efficient query optimization • disallows complex data operations
Motivation for OO Databases • Whole spectrum: Relational Object Persistent OO Database Relational Programming Database Language
Complex Types and SQL:1999 • Extensions to SQL to support complex types include: • Collection and large object types • Nested relations are an example of collection types • Structured types • Nested record structures like composite attributes • Inheritance • Object orientation • Including object identifiers and references
Structured Types and Inheritance in SQL • Structured types can be declared and used in SQL create type Name as (firstname varchar(20),lastname varchar(20)) final create type Address as (street varchar(20),city varchar(20),zipcode varchar(20)) not final • Note: final and not final indicate whether subtypes can be created
Structured Types and Inheritance in SQL Structured types can be used to create tables with composite attributes create table customer ( name Name, address Address, dateOfBirth date ) Dot notation used to reference components: Name.firstname
Methods • Can add a method declaration with a structured type. method ageOnDate (onDate date) returns interval year • Method body is given separately. create instance method ageOnDate (onDate date) returns interval year for CustomerType begin return onDate - self.dateOfBirth; end
Suppose that we have the following type definition for people: create type Person(name varchar(20),address varchar(20)) Using inheritance to define the student and teacher types create type Student under Person(degree varchar(20),department varchar(20)) create type Teacherunder Person (salary integer,department varchar(20)) Inheritance
Object-Identity and Reference Types • Define a type Department with a field name and a field head which is a reference to the type Person, with table people as scope: create type Department (name varchar (20),head ref (Person) scope people) • We can then create a table departments as follows create table departments of Department
Persistent Programming Languages • Languages extended with constructs to handle persistent data • Programmer can manipulate persistent data directly • no need to fetch it into memory and store it back to disk (unlike embedded SQL) • Persistent objects: • by class - explicit declaration of persistence • by creation - special syntax to create persistent objects • by marking - make objects persistent after creation • by reachability - object is persistent if it is declared explicitly to be so or is reachable from a persistent object
Object Identity and Pointers • Degrees of permanence of object identity • Intraprocedure: only during execution of a single procedure • Intraprogram: only during execution of a single program or query • Interprogram: across program executions, but not if data-storage format on disk changes • Persistent: interprogram, plus persistent across data reorganizations
Object Identity and Pointers • Persistent versions of C++ and Java have been implemented • C++ • ODMG C++ • ObjectStore • Java • Java Database Objects (JDO)
Comparison of O-O and O-R Databases • Relational systems • simple data types, powerful query languages, high protection. • Persistent-programming-language-based OODBs • complex data types, integration with programming language, high performance. • Object-relational systems • complex data types, powerful query languages, high protection. • Note: Many real systems blur these boundaries • E.g. persistent programming language built as a wrapper on a relational database offers first two benefits, but may have poor performance.
Self-Study • Read Chapter 9 of the book “Database System Concepts, 5th edition”. • Corresponds roughly to Chapters 8 and 9 of the 4th edition, except the parts on ODMG (sections 8.5.1 and 8.5.2) which are not present in the 5th edition anymore. • During the lecture of next Friday 20/02/09 there will be time for questions and answers regarding the topics covered in this book chapter.