170 likes | 430 Views
Introduction. CMPT 454. Course Website. http://www.cs.sfu.ca/CourseCentral/454/johnwill/. Assessment. Assignments – 30% Midterm exam in class – 25% Final exam on the 12 th of April– 45%. CMPT 454. In CMPT 354 we learned about database design, creation, and use
E N D
Introduction CMPT 454
Course Website • http://www.cs.sfu.ca/CourseCentral/454/johnwill/
Assessment • Assignments – 30% • Midterm exam in class – 25% • Final exam on the 12thof April– 45%
CMPT 454 • In CMPT 354 we learned about database design, creation, and use • ER model and relational model • Relational algebra and SQL • Implementation of database applications • In CMPT 454 we learn about the construction of a database management system • How is data stored and accessed? • How are SQL queries processed? • What is a transaction, and how is it processed? • How do multiple users use the same database • What happens if there is a system failure?
A Simple DBMS • One file for each table • Separate records by newline characters • Separate fields in records by some special character • e.g. file customer might store • Kent#123#journalist • Banner#322#unemployed • Store the database schema in a special file • e.g. the Customer and Account schema • Customer#name#STR#id#INT#job#STR • Account#acc_id#INT#id#INT#balance#FLOAT
A Simple Query Processor SELECT * FROM Customer WHERE job = 'journalist' • Read the file schema to find the attributes of the Customer table and their types • Check that the condition, c, is semantically valid for Customer • Create a new file (T) for the query results • Read the Customer file, and for each line (i.e. record) • Check c • If c is true write the line to T • Add a line for T to the file schema
Handling Joins SELECT balance FROM Customer C, Account A WHERE C.name = 'Jones' AND C.id = A.id • Simple join algorithm: FOR each record c in Customer FOR each record a in Account IF c and a satisfy the WHERE condition THEN print the balance field from Account
Simple DBMS – Problems • Changing one customer's occupation requires that the entire file be rewritten • Searching for one record means reading the entire file • There is no efficient method for processing queries • In the join query every customer was matched to every account, even customers who were not journalists • There is no provision for multiple users reading or writing a file at the same time • What happens if two users tried to change an account balance at the same time? • If the system (or disk) crashes, data may be lost
Memory • Memory hierarchy • Main memory vs. secondary storage • Volatile and non-volatile storage • Speed of access to different level of memory • Cost metric for DB access
Hard Disks • Mechanics of disks • Access characteristics • Organizing data on disk • Algorithms for disk access • Disk failures • Improving access and reliability • RAID
Files and Records • Arranging records on a block • Fixed length records • Variable length records • Representing addresses and pointers • BLOBs
Indexing • Index structures • B trees • Hash tables • Multidimensional indexes • Bitmap indexes
Query Execution • Operators • Scanning tables • Measuring query execution • Algorithms • One pass algorithms • Nested loop joins • Two pass algorithms • Sorting • Hashing • Index based algorithms
Query Compiler • Parsing • Algebraic laws • Logical query plans • Estimating operation cost • Cost based selection • Join order
Recovery • Transactions • Undo logging • Redo logging • Undo/Redo logging • Media failures
Concurrency • Serial and serializable schedules • Conflict serializability • Locking • Two phase locking • Locking scheduler • Lock modes • Architecture • Concurrency control
Transaction Management • Serializability and Recoverability • Deadlocks • Long transactions