340 likes | 841 Views
Principles of Database Management Systems. ( Tietokannanhallintajärjestelmät ) Pekka Kilpeläinen Fall 2001. Credits. Based on Stanford CS 245 lecture notes by original authors Hector Garcia-Molina, Jeff Ullman and Jennifer Widom
E N D
Principles of Database Management Systems (Tietokannanhallintajärjestelmät) Pekka Kilpeläinen Fall 2001 Notes 1: Introduction
Credits • Based on Stanford CS 245 lecture notes by original authors Hector Garcia-Molina, Jeff Ullman and Jennifer Widom • Responsibility of any errors due to modifications belongs to Pekka Kilpeläinen Notes 1: Introduction
Relations Statements Results Isn’t Implementing a Database System Simple? Notes 1: Introduction
Introducing the MEGATRON 3000 Database Management System • The latest from Megatron Labs • Incorporates latest relational technology • UNIX/Linux compatible Notes 1: Introduction
! ! First sign non-disclosure agreement Megatron 3000 Implementation Details Notes 1: Introduction
Megatron 3000 Implementation Details • Relations stored in files (ASCII) e.g., relation R(A,B,C) is in /usr/db/R Smith # 123 # CS Jones # 522 # EE . . . Notes 1: Introduction
domains/types relations/tables attributes/columns Megatron 3000 Implementation Details • Schema file (ASCII) in /usr/db/schema: R1 # A # INT # B # STR … R2 # C # STR # A # INT … . . . Notes 1: Introduction
Megatron 3000Sample Sessions % MEGATRON3000 Welcome to MEGATRON 3000! & & . . . quit % Notes 1: Introduction
columns/attributes rows/tuples Megatron 3000Sample Sessions & select * from R ; Relation R ABC Smith 123 CS Jones 522 EE ... & Notes 1: Introduction
Megatron 3000Sample Sessions & select A,B from R,S where R.B = S.B and S.C > 100; AB Smith 123 Jones 522 & Notes 1: Introduction
Megatron 3000Sample Sessions & select * from R | LPR ; & Result sent to LPR (printer). Notes 1: Introduction
Megatron 3000Sample Sessions & select * from R where R.A < 100 | T ; & New relation T created. Notes 1: Introduction
Megatron 3000 • To execute “select * from R where condition”: (1) Read dictionary to get attributes of R (2) Check validity of condition (3) Display attributes of R as the header (4) Read file R; for each line: (a) Check condition (b) If TRUE, display Notes 1: Introduction
Megatron 3000 • To execute “select * from R where condition | T”: (1) Process select as before (2) Write results to new file T (3) Append new line to usr/db/schema Notes 1: Introduction
Megatron 3000 • To execute “select A,B from R,S where condition”: (1) Read dictionary to get attributes of R and S (2) Read file R: for each line r: (a) Read file S: for each line s: (i) Create join tuple r&s (ii) Check condition (iii) If TRUE, display r&s[A,B] Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • No GUI Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • Tuple layout on disk E.g., - Change a string from ‘Cat’ to ‘Cats’ and we have to rewrite the end of the file - Updates are expensive - ASCII storage is expensive; E.g., MAXINT = 231-1=2147483647 takes 4 B; string “2147483647” takes 10 B Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • Search expensive; no indexes e.g., - Cannot find tuples with given key quickly - Always have to read the full relation Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • Brute force query processing e.g., select * from R,S where R.A = S.A and S.B > 1000 - Do selection using S.B > 1000 first? - More efficient join? Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • No concurrency control: • simultaneously working processes (transactions) could cause inconsistent database state Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • No reliability • In case of error, say, power failure - Can lose data - Can leave operations half done Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • No security • File system security is coarse • Unable to restrict access, say, to some fields of relations Notes 1: Introduction
What’s wrong with the Megatron 3000 DBMS? • No application program interface (API) e.g., How can a payroll program get at the data? Notes 1: Introduction
Course Overview • Physical data storage Blocks on disks, records in blocks, fields in records • Indexing & Hashing B-Trees, hashing,… • Query Processing Methods to execute SQL queries efficiently • Crash Recovery Failures, stable storage, logging policies, ... Notes 1: Introduction
Course Overview • Concurrency Control Correctness, serializability, locks,… • Information integration • (if time permits) Notes 1: Introduction
Simplified DBMS structure User/ Application Storage manager Query processor Transaction processor Buffers Permanent storage Indexes User Data System Data Notes 1: Introduction
Why study DBMS implementation techniques? • Computer scientists’ core knowledge • Techniques applicable in implementing DBMS-like systems • Understanding of DBMS internals necessary for database administrators • NB: This course is not about designing DB-based applications or about using some specific database systems Notes 1: Introduction
Administration • Course homepage: http://www.cs.uku.fi/~kilpelai/DBMS01/ • assignments, announcements, notes • Lecturer: Pekka.Kilpelainen@cs.uku.fi • Assistant: Tarja.Lohioja@cs.uku.fi Notes 1: Introduction
Details • LECTURES: Oct. 29 - Dec. 17, Microteknia MT2 • TEXTBOOK: Garcia-Molina, Ullman, Widom; "DATABASE SYSTEM IMPLEMENTATION" • ASSIGNMENTS: Seven written homework assignments; solutions discussed in exercise sessions. No programming. • GRADING: (32*Exam/MaxExam + 12*HomeWork/MaxHomeWork - 8)/3 • WEB SITE: Assignments & notes will be posted on our Web site athttp://www.cs.uku.fi/~kilpelai/DBMS01 • Plase check it periodically for last minute announcements. Notes 1: Introduction
Reading assignment • Refresh your memory about basics of the relational model and SQL • from your earlier course notes • from some textbook • E.g. Garcia-Molina, Ullman & Widom, pp. 14-20 Notes 1: Introduction