360 likes | 432 Views
Data models. Relational, object, and semistructured. Types of database applications. Queries No queries. Payroll. GIS. Word processing. CAD/CAM. Simple data Complex data. An example schema. COMPANY. head_office. address. PERSON. ceo. departments. ADDRESS. ISA. office.
E N D
Data models Relational, object, and semistructured
Types of database applications Queries No queries Payroll GIS Word processing CAD/CAM Simple data Complex data
An example schema COMPANY head_office address PERSON ceo departments ADDRESS ISA office boss DEPARTMENT EMPLOYEE EMPLOYEEs
A relational schema • COMPANY(CID, Name, Street, City, CEO) • DEPARTMENT(CID, DeptName, Street, City, Boss) • PERSON(SS#, Name, Street, City) • EMPLOYEE(SS#, Salary) • EMPLOYMENT(CID, DeptName, Emp) COMPANY.CEO << EMPLOYEE.SS# DEPARTMENT. CID << COMPANY.CID EMPLOYEE.SS# << PERSON.SS# EMPLOYMENT.(CID, DeptName) << DEPARTMENT.(CID, DeptName) EMPLOYMENT.Emp << EMPLOYEE.SS#
Problems with the schema • Composite attributes do not exist Head office cannot be described directly but is decomposed into Street, City • Set valued attributes do not existThe fact that people are employed at departments is described in an extra table
Problems with the schema • Generalisation does not exist The generalisation relationship between EMPLOYEE and PERSON is not explicit • Artificial identifiers requiredCID is an artificial identifier for COMPANY
An SQL query SELECT A.SS#, B.SS# FROM EMPLOYEE AS A, EMPLOYEE AS B, EMPLOYMENT AS EMPL1, EMPLOYMENT AS EMPL2, PERSON AS P1, PERSON AS P2 DEPARTMENT AS D1, DEPARTMENT AS D2 WHERE A.SS# = P1.SS# AND B.SS# = P2.SS# AND P1.Street = P2.Street AND P1.City = P2.City AND A.Salary > 20000 AND B.Salary > 20000 AND EMPL1.SS# = A.SS# AND EMPL2.SS# = B.SS# AND EMPL1.CID = D1.CID AND EMPL2.CID = D2.CID AND EMPL1.DeptName = D1.DeptName AND EMPL2.DeptName = D2.DeptName AND D1.City = London AND D2.City = London Are there employees who live at the same address, work at departments in London and earn more than 20000?
An object database schema ADDRESS Street: String City: String PERSON SS#: String Name: String Address: Address EMPLOYEE ISA PERSON Salary: Integer Boss: Employee Departments: {Department} COMPANY Name: String Head office: Address Departments: {Department} CEO: Employee DEPARTMENT Name: String Office: Address Boss: Employee Employees: {Employee}
An OO query Are there employees who live at the same address, work at departments in London and earn more than 20000? SELECT e1.ss#, e2.ss# FROM e1, e2 in EMPLOYEE; d1, d2 in DEPARTMENT WHERE e1 in d1.Employees AND e2 in d2.Employees AND e1.Address = e2.Address AND e1.Salary > 20000 AND e2.Salary > 20000 AND f1.Office.City = London AND f2.Office.City = London
OODB models Fundamentals • Complex types • Types and classes • Object identity • Inheritance
Complex types • Base typesStringIntegerBoolean • ConstructorsTupleSetBagListArray
Complex types - an example • COORDINATEX: IntegerY: IntegerZ: Integer • PARKING SPOTPosition: COORDINATEOccupied: Boolean • CAR PARKSpots: ARRAY OF PARKING SPOTPersonnel: SET OF EMPLOYEE The type constructors can be applied to any type - cf. the relational model
Relation schemes and relations PERSON Relation scheme SS# Name Age Salary 650101-2288 Eva Svensson 33 25000 750203-3133 Per Jonsson 23 20000 500107-5532 Sven Olsson 47 25000 800515-0044 Pia Eriksson 17 18000 Relation
Types and classes • A type is a time independent description of a set of (base or complex) values • A class has a time dependent extension that is a set of object identifiers
Object identity • Every object is identified by an object identifier • The object identifier does not change during the life span of the object • Two different objects with different object identifiers may have the same values OID: 1293301 Name: Simba Age: 11 Weight: 260 OID: 1088732 Name: Simba Age: 11 Weight: 260
Object identity The relational model Lion(Name, Age, Weight) (Simba, 11, 260) OO-model LION Name: String Age: Integer Weight: Integer 1088732 OID: 1088732 Name: Simba Age: 11 Weight: 260
Object identity Why object identifiers are better than keys: • Key attributes not stable • Key values not stable • Object sharing Object identifiers can be used in relational databases but must be maintained by the users
Inheritance Employees are persons The relational model PERSON(SS#, Name, City) ANSTÄLLD(SS#, Salary) EMPLOYEE.SS3 << PERSON.SS# OO model PERSON SS#: String NamE: String CITY: String EMPLOYEE ISA PERSON Salary: Integer
Inheritance in the relational model PERSON(SS#, Name, City) EMPLOYEE(SS#, Spouse#, Salary) EMPLOYEE.Spouse# << PERSON.SS# PERSON(SS#, Name, City) EMPLOYEE(Emp#, SS#, Salary) EMPLOYEE.SS# << PERSON.SS# PERSON PERSON ISA spouse EMPLOYEE EMPLOYEE
Inheritance in OO databases PERSON SS#: String Name: String City: String EMPLOYEE Emp#: String Salary: Integer Spouse: PERSON PERSON SS#: String Name: String City: String EMPLOYEE ISA PERSON Emp#: String Salary: Integer PERSON PERSON ISA spouse EMPLOYEE EMPLOYEE
Extended base types create table slides ( id int, date date, caption document, picture photo_CD_image); create table landmarks ( name varchar(30), location point); Find sunsets within a 20 kilometers radius around Sacramento. select id from slides P, landmarks L S where sunset (P.picture) and contains (P.caption, L.name) and L.location |20| S.location and S.name = 'Sacramento';
Types of DBMSs Queries No queries Relational OO/OR File systems OO Simple data Complex data
Why do we like types? • Types facilitate understanding • Types enable compact representations • Types enable query optimisation • Types facilitate consistency enforcement
Background assumptions fortyped data • Data stable over time • Organisational body to control data • Exercise: Give an example of a context where these assumptions do not hold
Semistructured data Semistructured data is schemaless and self describing The data and the description of the data are integrated
Value Label name tel email “John” 112233 “john@123.edu” Label-value pairs • {name: “John”, tel: 112233, email: “john@123.edu”}
name tel email 112233 “john@123.edu” first last “John” “Smith” Nested label-value pairs {name: {first: “John”, last: “Smith”}, tel: 112233, email: “john@123.edu”}
Duplicate labels {name: {first: “John”, last: “Smith”}, tel: 112233, tel: 445566, tel: 778899, email: “john@123.edu”}
Representing variations {person: {name: “John”, tel: 112233, email: “john@123.edu”}, person: {name: “John”, email: “john@123.edu”}, person: {name: “John”, age: 33, email: “john@123.edu”}, person: {name: “John”, tel: 112233, tel: 332211}}
Representing relational data CAR PERSON Regno Manuf Weight Ssno City Weight AA11 Volvo 1100 1122 London 80 CC33 Volvo 900 2233 Paris 70 EE55 Saab 1000 3344 Berlin 80 {CAR: {row: {Regno: “AA11”, Manuf: “Volvo”, Weight: 1100}, {row: {Regno: “CC33”, Manuf: “Volvo”, Weight: 900}, {row: {Regno: “EE55”, Manuf: “Saab”, Weight: 1000}}, PERSON: {row: {Ssno: “1122”, City: “London”, Weight: 80}, {row: {Ssno: “2233”, City: “Paris”, Weight: 70}, {row: {Ssno: “3344”, City: “Berlin”, Weight: 80}}}
An object graph This graph represents two people, Eva and Abel, where Abel is the child of Eva. &o1 and &o2 are object identifiers denoting the people Eva and Abel. person person child &o1 &o2 name age name age “Eva” 40 “Abel” 20
person person child &o1 &o2 name age name age “Eva” 40 “Abel” 20 Representing objects {person: &o1{name: “Eva”, age: 40, child: &o2}, person: &o2{name: “Abel”, age: 20}} An object identifier, such as &o1, before a structure, binds the object identifier to the identity of that structure. The object identifier can then be used to refer to the structure.
An exercise a a a a &o1 &o2 &o1 &o2 b c b c &o3 &o3 &o4 d d d “hello” “hello” “hello”
Course goals • The course will familiarise the student with: • database models and query languages with respect to expressiveness and usability • theory and principles of object data bases • semistructured data and its applications • application areas for object and relational databases • interactive and embedded query languages • non-traditional data types and their management in databases
Lectures 1. Introduction to data models 2. Query languages for relational databases 3. Models and query languages for object databases 4. Models and query languages for semistructured data, XML 5. Embedded query languages