1 / 70

SFDV3002

SFDV3002. Chapter 1: Relational Database Management Systems (RDBMS) Spring 2012. Overview of Chapter 1. What is a DBMS? Functions of a DBMS The Relational Model of Data Structural aspects of the relational model Relational operators Data integrity. References.

melia
Download Presentation

SFDV3002

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. SFDV3002 Chapter 1: Relational Database Management Systems (RDBMS) Spring 2012 SFDV3002

  2. Overview of Chapter 1 • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  3. References • Kifer (Database Systems: An Application-Oriented Approach), chapters 1–2 • Silberschatz (Database System Concepts) chapters 1, 3 • Date (An Introduction to Database Systems, 8th ed.) chapters 1–3, 5–7, 9 • D’Orazio & Happel (Practical Data Modelling for Database Design) SFDV3002

  4. Database Management Systems 1.1 What are they, why are they needed and what do they do? SFDV3002

  5. We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  6. What is a DBMS? (Figure 1–1) • Software that manages a database. • Database: an organised collection of (inter)related data. SFDV3002

  7. Drawbacks of file-based database systems • Duplication of data ⇒ inconsistency. • Applications dependent on data structure and vice versa. • No standard data model/format. • Integrity control separated from data. SFDV3002

  8. Drawbacks of file-based database systems • Failures ⇒ partial updates ⇒ inconsistency. • Single vs. multi-user. • Batch vs. online. SFDV3002

  9. Why is a DBMS better? • Implements standard logical data model. • Shields applications from physical data representation. • Single, integrated database (including integrity) ⇒ reduced duplication. • Concurrent access control (locking). • Failure recovery (transactions). • Application neutral ⇒ more flexible. SFDV3002

  10. Database schema • Definition of database’s structure (“class”). • Processed to create actual database (“instance”). CREATE TABLE Customer ( ID INTEGER PRIMARY KEY, Name VARCHAR(50), Address VARCHAR(100) ); CREATE TABLE Employee ( ... SFDV3002

  11. Conceptual level (e.g., ERD) Logical data independence Logical level (e.g., relational, SQL) Physical data independence Physical level (blocks, files, pointers) Levels of abstraction(Date §2.2) Changes at lower levels should affect higher levels as little as possible (data independence). users applications DBMS SFDV3002

  12. Mainstream DBMS types • Hierarchical, network, relational, object. • Hierarchical & network dominated 1970’s. • Relational (RDBMS) dominant since 1980’s. • Pure object DBMS (ODBMS) never really took off. • Oracle10g is object/relational (ORDBMS). SFDV3002

  13. Relational DBMS • Most common type of commercial DBMS. • Relational model compliance varies (!) • all SQL-based • “SQL DBMS” better? • Constant development since 1975, very mature. • Ideal for typical business data sets. SOURCE: Stair & Reynolds, Fundamentalsof Information Systems, 2003 SFDV3002

  14. Other DBMS types • “Flat-file” (dBASE, FileMaker): common on PCs, scale badly. • Logic-based: experimental. • RM technically logic + set theory • Temporal (time): gradually becoming mainstream (data warehouses). • Multi-dimensional: decision support. • Native XML databases. etc… SFDV3002

  15. Some names: experimental • University Ingres (relational, 1975). • System R (IBM relational prototype, 1975). • Postgres (object/relational, mid-1980’s). • ROCK & ROLL (logic/object, 1995). • Tamino, Xindice, TextML (native XML, 2001). SFDV3002

  16. Some names: commercial RDBMS Oracle (1978–79) Rdb (DEC, 1984 ⇒ Oracle 1995) SQL/DS (IBM, 1981) DB2 (IBM, 1983) Sybase (1986) SQLAnywhere SQL/Server (Sybase/Microsoft) Ingres (1981) Interbase (Borland/open) Informix (⇒ IBM 2001) MySQL (open source) PostgreSQL (open source) ODBMS ObjectStore Objectivity Versant Poet FastObjects ORDBMS Oracle8 (1998), 8i (1999), 9i (2001), 10g (2004) Illustra (1995 ⇒ Informix) SFDV3002

  17. We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  18. Functions of a full DBMS: “must haves”(✓ = covered in SFDV3002) • Defining database structure ✓ • Defining integrity constraints ✓ • Catalog/data dictionary ✓ • Data manipulation ✓ • Security management ✓ • Transaction support ✓ • Concurrent access control ✓ • Recovery services • Physical database tuning ✓ • Application programming interfaces ✓ We will explain some of them: SFDV3002

  19. Defining database structure(SFDV3002 §2.2) • Data definition language (DDL)(⇒ logical schema). • Tables, domains, attributes, table links (FKs), views, etc. • Possibly user-defined/complex data types. • Object DBMSs add object behaviour, inheritance, etc. • Examples: SQL (RDBMS), ODL (ODMG), graphical (Access). SFDV3002

  20. Defining integrity constraints(SFDV3002 §2.2, 4.2) • Integrity rules define how data are kept correct and consistent. • Oracle10g supports arbitrarily complex integrity constraints (PL/SQL). SFDV3002

  21. Data manipulation(SFDV3002 §1.2, 2.3) • Data manipulation language (DML). • Four basic operations: • create, read, update and delete (CRUD). • SQL: INSERT, SELECT, UPDATE, DELETE. • ODMG: • OQL for reading only (also methods) • create via object construction (e.g., Java new) • update and delete via methods SFDV3002

  22. Physical database tuning(SFDV3002 ch. 5) • Large, complex corporate databases: • hundreds of gigabytes/terabytes/… • hundreds of relations & relationships • tens/hundreds/… simultaneous users • Disk at least 100,000 times slower than RAM. • Need to tune physical storage and access. SFDV3002

  23. Data communications services(Figure 1–5) • Communication with client hardware and/or software. • Communication with other databases. • Internet access to databases (e.g., web services). • Essential in today’s network environments. SFDV3002

  24. Distributed computing(Kifer chs. 16, 24) • Spreading data and/or processing across several machines on a network. • Client/server distributes application processing (e.g., client application accesses database server). • Distributed database distributes data and data management (e.g., multinational corporation with many sites). SFDV3002

  25. XML support(Kifer ch. 15) • Extensible Markup Language. • Hierarchically structured data in text form. • Excellent for data interchange. • Essential for web services. SFDV3002

  26. Summary of Section 1.1 • DBMS: software for managing (large, multi-user) databases. • Many types, different data models; we focus on relational (RDBMS). • Primary functions of a DBMS—expected of all large commercial DBMS products. Next: Representing data at the logical level using the Relational Model of Data. SFDV3002

  27. The Relational Model of Data 1.2 Representing dataat the logical level SFDV3002

  28. We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  29. The need for (logical) data models(Kifer §3.1) • Codd: “…users…must be protected from having to know how the data is [sic] organized in the machine (the internal representation)”. • Shield users & applications from physical data representation (files, records, bits, bytes, etc.) ⇒ physical data independence. • Make general query languages possible. • Relational Model of Data most widespread. SFDV3002

  30. True data models have three parts(“Data models in database management”, ACM SIGMOD Record 11(2), E. F. Codd, 1981) Structure defines how data are structured, e.g., files, relations, objects. Integrity rules define how data are kept correct and consistent. Operators define how data are manipulated. SFDV3002

  31. The Relational (logical) Model of Data(Kifer §3.2; “A relational model of data for large shared data banks”, CACM 13(6), E. F. Codd, 1970) • Mathematical base (set theory + first order predicate logic). • Formal & precise. • Independent of physical implementation. • Relations, attributes, tuples, domains. • Modern DBMSs mostly relational. SFDV3002

  32. **Insert details of your ten favourite songs into this table. The speaker icons can link to actual audio samples from the song if you wish.** Example of a relation SFDV3002

  33. We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  34. Structural elements of the relational model(Kifer §2.2, 3.2.1; D’Orazio & Happel, pp. 63–64, Figure 3.4 & Table 3.1) • Degree of a relation = number of attributes (cf. degree of ERD relationship = number of associated entities). Table ≡ Relation Row ≡Tuple Attribute name (role) Attribute type (domain) Column ≡ SFDV3002

  35. A closer look at domains(Date ch. 5) Codd: “in effect, a pool of values” from which attribute values may be drawn. • simple domains, whose elements are atomic values • non-simple domains, whose elements are relations and therefore non-atomic Date: user-defined data types of arbitrary complexity. • e.g., student numbers, part weights, part quantities, order dates, customer photos, order documents, … • atomic if internal structure invisible at logical level (strict “encapsulation”); e.g., dates SFDV3002

  36. A closer look at attributes Part x is a sub-component of part y, and n units of x are needed to assemble one unit of y. • Each attribute: • has a unique name or role • has a defined domain • belongs to a particular relation • Attribute names may be re-used across relations: • different meanings or semantics • qualify with relation name, e.g., Component.Quantity vs. Product.Quantity SFDV3002

  37. Relation variables vs. values(Kifer pp. 35–38; Date §6.3–6.5) Relation variable = definition • Product(productno, type, description, quantity) • Static* Relation value = occurrence/instance • Dynamic SFDV3002

  38. Properties of a relation(Date §6.4) • Ordering of attributes & tuples irrelevant. • No identical tuples. • All attributes atomic (single-valued) = first normal form. (see SFDV3003) **substitute typical Omani names in this relation if you wish** non-atomic attribute SFDV3002

  39. Keys(Date §9.10) • Combine one or more attributes to uniquely identify a tuple. • “Invented” keys are called surrogates. • Keys with >1 attribute are composite. • Relational keys: Candidate (CK): unique, stable, minimal Primary (PK): “most important” CK Alternate (AK): CKs that aren’t the PK Foreign (FK): PK embedded in another relation; used to implement relationships (see slides 74–75) SFDV3002

  40. Attribute(s) of relation B Candidate key(s) (Alternate key(s)) Primary key Relational keys(Kifer pp. 41–43) Attribute(s) of relation A Candidate key(s) (Alternate key(s)) Primary key Foreign key(s) SFDV3002

  41. Examples of relational keys Composite PK FK Non-composite PK SFDV3002

  42. We are here… • What is a DBMS? • Functions of a DBMS • The Relational Model of Data • Structural aspects of the relational model • Relational operators • Data integrity SFDV3002

  43. Relational operators • Recall the four basic data manipulation operations (CRUD): • Create • Read • Update • Delete • Relational model supports these through: • relational algebra (mainly read) • relational calculus SFDV3002

  44. Relational algebra(Kifer §5.1; Date ch. 7; Codd 1970, 1972; Figure 1–6) • Relational “arithmetic”. • Codd defined eight operators: • set operators: product (×), union (∪), intersection (∩), difference (–) • relational operators: restrict (σ), project (π), join (⋈), divide (/) • All operators produce relations. • Expressions describe how to build result. • Mainly relevant to data retrieval. SFDV3002

  45. Rel(http://dbappbuilder.sourceforge.net/Rel.html) • Simple open source RDBMS. • Based on Date & Darwen’s Third Manifesto. • Query language is Tutorial D (no SQL). • Supports most “standard” operators. SFDV3002

  46. Primary algebraic operators(Figure 1.7) • πx,y,zA ⇒ filter A by attributes x, y, z.(Rel: A {x, y, z}) • σconditionB ⇒ filter B by tuples (condition).(Rel: B WHERE (condition)) • A×B ⇒ combine every tuple of relation A with every tuple of relation B.(Rel: no explicit operator) • A ⋈ B ⇒ combine tuples of A and B that share a common value for some attribute(s). (Rel: A JOIN B) • A ← B ⇒ A is assigned the value of B.(Rel: A := B) SFDV3002

  47. **substitute your own paper details if you wish ** Project (π) operator(Kifer pp. 131–133) Paper πPaper_code,Sem,Dept_code(Paper) Paper { Paper_code, Sem, Dept_code } πDescription,Pts(Paper) Paper { Description, Pts } SFDV3002

  48. Restrict (σ) operator(Kifer pp. 129–131) Paper σDept_code=‘INFO’(Paper) Paper WHERE (Dept_code = 'INFO') σDept_code=‘ACCT’(Paper) Paper WHERE (Dept_code = 'ACCT') SFDV3002

  49. Product (×) operator(Kifer pp. 135–137) Paper Department × = Paper × Department Paper × Department No explicit TIMES operator in Rel; this should work: Paper JOIN (Department RENAME Dept_code AS DCODE) SFDV3002

  50. Joins(Kifer pp. 137–143) • Combine tuples of two relations “horizontally” based on join condition. • Many forms: • basic: natural join, theta-join (θ), union join • variants: inner vs. outer ⇒ inner natural join, outer theta-join, … • Theta-joins use a comparison operator θ: • =, <, >, <=, >=, <> • equijoin, greater-than join, less-than-or-equal-to join, … SFDV3002

More Related