700 likes | 824 Views
SFDV3002. Chapter 1: Relational Database Management Systems (RDBMS) Spring 2012. Overview of Chapter 1. What is a DBMS? Functions of a DBMS The Relational Model of Data Structural aspects of the relational model Relational operators Data integrity. References.
E N D
SFDV3002 Chapter 1: Relational Database Management Systems (RDBMS) Spring 2012 SFDV3002
Overview of Chapter 1 • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
References • Kifer (Database Systems: An Application-Oriented Approach), chapters 1–2 • Silberschatz (Database System Concepts) chapters 1, 3 • Date (An Introduction to Database Systems, 8th ed.) chapters 1–3, 5–7, 9 • D’Orazio & Happel (Practical Data Modelling for Database Design) SFDV3002
Database Management Systems 1.1 What are they, why are they needed and what do they do? SFDV3002
We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
What is a DBMS? (Figure 1–1) • Software that manages a database. • Database: an organised collection of (inter)related data. SFDV3002
Drawbacks of file-based database systems • Duplication of data ⇒ inconsistency. • Applications dependent on data structure and vice versa. • No standard data model/format. • Integrity control separated from data. SFDV3002
Drawbacks of file-based database systems • Failures ⇒ partial updates ⇒ inconsistency. • Single vs. multi-user. • Batch vs. online. SFDV3002
Why is a DBMS better? • Implements standard logical data model. • Shields applications from physical data representation. • Single, integrated database (including integrity) ⇒ reduced duplication. • Concurrent access control (locking). • Failure recovery (transactions). • Application neutral ⇒ more flexible. SFDV3002
Database schema • Definition of database’s structure (“class”). • Processed to create actual database (“instance”). CREATE TABLE Customer ( ID INTEGER PRIMARY KEY, Name VARCHAR(50), Address VARCHAR(100) ); CREATE TABLE Employee ( ... SFDV3002
Conceptual level (e.g., ERD) Logical data independence Logical level (e.g., relational, SQL) Physical data independence Physical level (blocks, files, pointers) Levels of abstraction(Date §2.2) Changes at lower levels should affect higher levels as little as possible (data independence). users applications DBMS SFDV3002
Mainstream DBMS types • Hierarchical, network, relational, object. • Hierarchical & network dominated 1970’s. • Relational (RDBMS) dominant since 1980’s. • Pure object DBMS (ODBMS) never really took off. • Oracle10g is object/relational (ORDBMS). SFDV3002
Relational DBMS • Most common type of commercial DBMS. • Relational model compliance varies (!) • all SQL-based • “SQL DBMS” better? • Constant development since 1975, very mature. • Ideal for typical business data sets. SOURCE: Stair & Reynolds, Fundamentalsof Information Systems, 2003 SFDV3002
Other DBMS types • “Flat-file” (dBASE, FileMaker): common on PCs, scale badly. • Logic-based: experimental. • RM technically logic + set theory • Temporal (time): gradually becoming mainstream (data warehouses). • Multi-dimensional: decision support. • Native XML databases. etc… SFDV3002
Some names: experimental • University Ingres (relational, 1975). • System R (IBM relational prototype, 1975). • Postgres (object/relational, mid-1980’s). • ROCK & ROLL (logic/object, 1995). • Tamino, Xindice, TextML (native XML, 2001). SFDV3002
Some names: commercial RDBMS Oracle (1978–79) Rdb (DEC, 1984 ⇒ Oracle 1995) SQL/DS (IBM, 1981) DB2 (IBM, 1983) Sybase (1986) SQLAnywhere SQL/Server (Sybase/Microsoft) Ingres (1981) Interbase (Borland/open) Informix (⇒ IBM 2001) MySQL (open source) PostgreSQL (open source) ODBMS ObjectStore Objectivity Versant Poet FastObjects ORDBMS Oracle8 (1998), 8i (1999), 9i (2001), 10g (2004) Illustra (1995 ⇒ Informix) SFDV3002
We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
Functions of a full DBMS: “must haves”(✓ = covered in SFDV3002) • Defining database structure ✓ • Defining integrity constraints ✓ • Catalog/data dictionary ✓ • Data manipulation ✓ • Security management ✓ • Transaction support ✓ • Concurrent access control ✓ • Recovery services • Physical database tuning ✓ • Application programming interfaces ✓ We will explain some of them: SFDV3002
Defining database structure(SFDV3002 §2.2) • Data definition language (DDL)(⇒ logical schema). • Tables, domains, attributes, table links (FKs), views, etc. • Possibly user-defined/complex data types. • Object DBMSs add object behaviour, inheritance, etc. • Examples: SQL (RDBMS), ODL (ODMG), graphical (Access). SFDV3002
Defining integrity constraints(SFDV3002 §2.2, 4.2) • Integrity rules define how data are kept correct and consistent. • Oracle10g supports arbitrarily complex integrity constraints (PL/SQL). SFDV3002
Data manipulation(SFDV3002 §1.2, 2.3) • Data manipulation language (DML). • Four basic operations: • create, read, update and delete (CRUD). • SQL: INSERT, SELECT, UPDATE, DELETE. • ODMG: • OQL for reading only (also methods) • create via object construction (e.g., Java new) • update and delete via methods SFDV3002
Physical database tuning(SFDV3002 ch. 5) • Large, complex corporate databases: • hundreds of gigabytes/terabytes/… • hundreds of relations & relationships • tens/hundreds/… simultaneous users • Disk at least 100,000 times slower than RAM. • Need to tune physical storage and access. SFDV3002
Data communications services(Figure 1–5) • Communication with client hardware and/or software. • Communication with other databases. • Internet access to databases (e.g., web services). • Essential in today’s network environments. SFDV3002
Distributed computing(Kifer chs. 16, 24) • Spreading data and/or processing across several machines on a network. • Client/server distributes application processing (e.g., client application accesses database server). • Distributed database distributes data and data management (e.g., multinational corporation with many sites). SFDV3002
XML support(Kifer ch. 15) • Extensible Markup Language. • Hierarchically structured data in text form. • Excellent for data interchange. • Essential for web services. SFDV3002
Summary of Section 1.1 • DBMS: software for managing (large, multi-user) databases. • Many types, different data models; we focus on relational (RDBMS). • Primary functions of a DBMS—expected of all large commercial DBMS products. Next: Representing data at the logical level using the Relational Model of Data. SFDV3002
The Relational Model of Data 1.2 Representing dataat the logical level SFDV3002
We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
The need for (logical) data models(Kifer §3.1) • Codd: “…users…must be protected from having to know how the data is [sic] organized in the machine (the internal representation)”. • Shield users & applications from physical data representation (files, records, bits, bytes, etc.) ⇒ physical data independence. • Make general query languages possible. • Relational Model of Data most widespread. SFDV3002
True data models have three parts(“Data models in database management”, ACM SIGMOD Record 11(2), E. F. Codd, 1981) Structure defines how data are structured, e.g., files, relations, objects. Integrity rules define how data are kept correct and consistent. Operators define how data are manipulated. SFDV3002
The Relational (logical) Model of Data(Kifer §3.2; “A relational model of data for large shared data banks”, CACM 13(6), E. F. Codd, 1970) • Mathematical base (set theory + first order predicate logic). • Formal & precise. • Independent of physical implementation. • Relations, attributes, tuples, domains. • Modern DBMSs mostly relational. SFDV3002
**Insert details of your ten favourite songs into this table. The speaker icons can link to actual audio samples from the song if you wish.** Example of a relation SFDV3002
We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
Structural elements of the relational model(Kifer §2.2, 3.2.1; D’Orazio & Happel, pp. 63–64, Figure 3.4 & Table 3.1) • Degree of a relation = number of attributes (cf. degree of ERD relationship = number of associated entities). Table ≡ Relation Row ≡Tuple Attribute name (role) Attribute type (domain) Column ≡ SFDV3002
A closer look at domains(Date ch. 5) Codd: “in effect, a pool of values” from which attribute values may be drawn. • simple domains, whose elements are atomic values • non-simple domains, whose elements are relations and therefore non-atomic Date: user-defined data types of arbitrary complexity. • e.g., student numbers, part weights, part quantities, order dates, customer photos, order documents, … • atomic if internal structure invisible at logical level (strict “encapsulation”); e.g., dates SFDV3002
A closer look at attributes Part x is a sub-component of part y, and n units of x are needed to assemble one unit of y. • Each attribute: • has a unique name or role • has a defined domain • belongs to a particular relation • Attribute names may be re-used across relations: • different meanings or semantics • qualify with relation name, e.g., Component.Quantity vs. Product.Quantity SFDV3002
Relation variables vs. values(Kifer pp. 35–38; Date §6.3–6.5) Relation variable = definition • Product(productno, type, description, quantity) • Static* Relation value = occurrence/instance • Dynamic SFDV3002
Properties of a relation(Date §6.4) • Ordering of attributes & tuples irrelevant. • No identical tuples. • All attributes atomic (single-valued) = first normal form. (see SFDV3003) **substitute typical Omani names in this relation if you wish** non-atomic attribute SFDV3002
Keys(Date §9.10) • Combine one or more attributes to uniquely identify a tuple. • “Invented” keys are called surrogates. • Keys with >1 attribute are composite. • Relational keys: Candidate (CK): unique, stable, minimal Primary (PK): “most important” CK Alternate (AK): CKs that aren’t the PK Foreign (FK): PK embedded in another relation; used to implement relationships (see slides 74–75) SFDV3002
Attribute(s) of relation B Candidate key(s) (Alternate key(s)) Primary key Relational keys(Kifer pp. 41–43) Attribute(s) of relation A Candidate key(s) (Alternate key(s)) Primary key Foreign key(s) SFDV3002
Examples of relational keys Composite PK FK Non-composite PK SFDV3002
We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002
Relational operators • Recall the four basic data manipulation operations (CRUD): • Create • Read • Update • Delete • Relational model supports these through: • relational algebra (mainly read) • relational calculus SFDV3002
Relational algebra(Kifer §5.1; Date ch. 7; Codd 1970, 1972; Figure 1–6) • Relational “arithmetic”. • Codd defined eight operators: • set operators: product (×), union (∪), intersection (∩), difference (–) • relational operators: restrict (σ), project (π), join (⋈), divide (/) • All operators produce relations. • Expressions describe how to build result. • Mainly relevant to data retrieval. SFDV3002
Rel(http://dbappbuilder.sourceforge.net/Rel.html) • Simple open source RDBMS. • Based on Date & Darwen’s Third Manifesto. • Query language is Tutorial D (no SQL). • Supports most “standard” operators. SFDV3002
Primary algebraic operators(Figure 1.7) • πx,y,zA ⇒ filter A by attributes x, y, z.(Rel: A {x, y, z}) • σconditionB ⇒ filter B by tuples (condition).(Rel: B WHERE (condition)) • A×B ⇒ combine every tuple of relation A with every tuple of relation B.(Rel: no explicit operator) • A ⋈ B ⇒ combine tuples of A and B that share a common value for some attribute(s). (Rel: A JOIN B) • A ← B ⇒ A is assigned the value of B.(Rel: A := B) SFDV3002
**substitute your own paper details if you wish ** Project (π) operator(Kifer pp. 131–133) Paper πPaper_code,Sem,Dept_code(Paper) Paper { Paper_code, Sem, Dept_code } πDescription,Pts(Paper) Paper { Description, Pts } SFDV3002
Restrict (σ) operator(Kifer pp. 129–131) Paper σDept_code=‘INFO’(Paper) Paper WHERE (Dept_code = 'INFO') σDept_code=‘ACCT’(Paper) Paper WHERE (Dept_code = 'ACCT') SFDV3002
Product (×) operator(Kifer pp. 135–137) Paper Department × = Paper × Department Paper × Department No explicit TIMES operator in Rel; this should work: Paper JOIN (Department RENAME Dept_code AS DCODE) SFDV3002
Joins(Kifer pp. 137–143) • Combine tuples of two relations “horizontally” based on join condition. • Many forms: • basic: natural join, theta-join (θ), union join • variants: inner vs. outer ⇒ inner natural join, outer theta-join, … • Theta-joins use a comparison operator θ: • =, <, >, <=, >=, <> • equijoin, greater-than join, less-than-or-equal-to join, … SFDV3002