1 / 37

Database Systems

Database Systems. Marcus Kaiser School of Computing Science Newcastle University. Recap: Data Inconsistency when a Computer Fails. A bank wishes to move £500 from account 12 to account 17 The sequence of actions is: 1. Reduce the Balance of account 12 by £500

Download Presentation

Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Systems Marcus Kaiser School of Computing Science Newcastle University

  2. Recap: Data Inconsistency when a Computer Fails • A bank wishes to move £500 from account 12 to account 17 • The sequence of actions is: 1. Reduce the Balance of account 12 by £500 2. Increase the Balance of account 17 by £500 What if the computer crashes after 1 and before 2 ?

  3. Transactions • A Transaction is a logical unit of work • A Transaction can consist of a sequence of database operations • e.g. 1. Reduce the Balance of account 12 by £500 2. Increase the Balance of account 17 by £500 • A Transaction either executes in its entirety or is totally cancelled

  4. Transactions (examples) • A Transaction either executes in its entirety or is totally cancelled • e.g. Start Transaction 1. Reduce the Balance of account 12 by £500 2. Increase the Balance of account 17 by £500 End Transaction • A. Both steps of the transaction complete successfully • The database has been changed • B. The computer crashes after Step 1. • the DBMS restores the database to the state it was in before the Transaction began • How does it do this ?......

  5. Logging • The DBMS keeps a log (on Disk) in which it records: • transactions starts • db updates (old and new values) • transaction ends • e.g. Start Transaction 1. Reduce the Balance of account 12 by £500 2. Increase the Balance of account 17 by £500 End Transaction • if all goes well, the log entries are: Start Transaction Update (Account: 12 , 1000 -> 12 , 500) Update (Account: 17 , 2000 -> 17 , 2500) End Transaction

  6. Logging • e.g. Start Transaction 1. Reduce the Balance of account 12 by £500 2. Increase the Balance of account 17 by £500 End Transaction If the computer crashes after Step 1, the log entries are: Start Transaction Update (Account: 12 , 1000 -> 12 , 500) • When the computer comes up again, the DBMS can undo all updates made by incomplete transactions

  7. Id Surname Initial Title Job Id Wage 12 Smith A Mr Designer 1 12 15000 75 75 Smith K Ms Designer 2 17000 34 Implementer 16000 34 Brown M Mrs Logging Example • A personnel department keep a database with 2 Tables: Employee: Payroll: Mrs Brown is promoted to Manager, and her salary increased to 20000 • The transaction is: • Start Transaction • 1. Update the Job of Employee Id 34 to Manager • 2. Update the Wage of Id 34 to 20000 • End Transaction

  8. Avoiding Data Loss when a Disk Fails The Log can also allow us to Recover from Disk Failure • The database is regularly copied onto tape (archiving) • nightly is common • The log is stored on a different disk to the database

  9. Actions on Disk Failure: If a Database Disk fails: 1. Replace the Disk 2. Copy the last database archive back onto the disk 3. Process all log entries made after the last archive if the log entry is an update for a completed transaction then do it

  10. Sue @ Newcastle University ATM 11.00 Check Balance 11.01 ATM says there’s £200 11.02 Ask for £200 11.03 ATM Finds £200 in account 11.04 ATM Gives £200 11.05 ATM Stores £0 in Balance Jim @ Metro Centre ATM 11.00 Check Balance 11.01 ATM says there’s £200 11.02 Ask for £200 11.03 ATM Finds £200 in account 11.04 ATM Gives £200 11.05 ATM Stores £0 in Balance Recap: Simultaneous Access to the Data • Sometimes problems can occur when a file is being updated if there is more than one user. • e.g. Sue and Jim have a joint bank account. • they go shopping separately and both run out of money at the same time • they both head for the nearest ATM.......

  11. Atomic Transactions • Transactions can be made atomic • An atomic action has exclusive access to the data • Changing the balance on an account is atomic • Either Sue or Jim will get exclusive access to the balance change • The other will need to wait until the atomic transaction is finished • At which stage they will see the new balance

  12. Keys Primary Keys, Foreign Keys and Candidate Keys

  13. Keys • Keys are a subset of the fields of a table which uniquely define a record • Primary Key – the key within a table • Foreign key – a primary key in another table • Candidate Key – one of the possible options for primary key Primary Key

  14. Database Normalization Helping to identify good designs

  15. Database Normal forms • Normal forms are a set of requirements on a database • They won’t tell you you’ve got a good design • Just tell you your design isn’t bad • There are ~7 Normal Forms each becoming progressively more restrictive • Though the first three are the ones that are used most often • When you’ve designed your tables you can use them to check you’ve not made a bad design

  16. First Normal Form 1NF • A table is in 1NF if: 1. There's no top-to-bottom ordering to the rows. 2. There's no left-to-right ordering to the columns. 3. There are no duplicate rows. 4. Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else). 5. All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].

  17. Examples of NOT 1NF M2 M2 M3

  18. Fixing 1NF problems • In general splitting a table into separate tables can fix 1NF problems

  19. Second Normal Form 2NF • A table is in 2NF if • It is in 1NF and • Any attribute in the table depends on the whole of the candidate key and not just part of it.

  20. Examples of NOT 2NF CK CK

  21. Fixing 2NF problems • Again in general splitting the table up will solve 2NF problems

  22. Third Normal Form 3NF • A Table is in 3NF if • It is in 2NF and • Any attribute is only dependent on the candidate key and nothing else

  23. Examples of NOT 3NF CK

  24. Fixing 3NF problems • Again splitting into separate tables can fix 3NF

  25. Mnemonic for normal forms • Data should depend on the key (1NF) no duplicate entries the whole key (2NF) not only part of the candidate key and nothing but the key (3NF) no dependency on other attribute

  26. More SQL

  27. Running Example • A Company keeps records for boat hires • Each boat is crewed by a sailor • Boats can be reserved • Three tables: Sailors, Boats and Reservations Sailor Boat Reservation

  28. Rough Matching • If you don’t know exactly what you’re looking for in a string you can use LIKE • _ - Matches with exactly one unknown character • % - Matches with 0 or more unknown characters SELECT Age FROM Sailor WHERE Sailor.Sname LIKE ‘Ne_o’ • Would match ‘Nemo’, ‘Neto’, ‘Nebo’, …. SELECT Age FROM Sailor WHERE Sailor.Sname LIKE ‘Ne%o’ • Would match ‘Neo’, ‘Nemo’, ‘Nemo von bo’, …

  29. Mathematical Operations • You can use mathematical operations within the SELECT statement • You’ve already seen this as MIN, MAX, AVG • You can also have +, -, * , /, % • What is the sailors rating per year? SELECT Rating / Age FROM Sailor

  30. Union, Intersect, Except • SQL provides set-manipulation constructs: • UNION () • INTERSECT () • EXCEPT () • By default, duplicates are eliminated in results • To retain duplicates, use UNION ALL, INTERSECT ALL, EXCEPT ALL

  31. Find Those who’ve hired a red OR Green boat SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.colour=‘red’ OR B.colour=‘green’) • OR SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘red’ UNION SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘green’ • Why?...

  32. Find Those who’ve hired a red AND Green boat SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.colour=‘red’ AND B.colour=‘green’) • OR SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘red’ INTERSECT SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘green’

  33. Find Those who’ve hired a red boat but not a Green boat SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND (B.colour=‘red’ AND B.colour!=‘green’) • OR SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘red’ EXCEPT SELECT S.sid FROM Sailors AS S, Boats AS B, Reservations AS R WHERE S.sid=R.sid AND R.bid=B.bid AND B.colour=‘green’

  34. GROUP BY • So far, we’ve applied aggregate operators to all (qualifying) records. Sometimes, we want to apply them to each of several groups of records. • Consider: Find the age of the youngest sailor for each rating level. • How many rating levels are there? What are the rating values for these levels? In general, we don’t know! • Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!): SELECT MIN (S.age) FROM Sailors S WHERE S.rating = i For i = 1, 2, ... , 10:

  35. GROUP BY • To write such queries, we need GROUP BY clause, a major extension to the basic SQL query form. • E.g. Find the age of the youngest sailor for each rating level - can be expressed as follows SELECT S.rating, MIN (S.age) FROM Sailors AS S GROUP BY S.rating • S.rating after ‘GROUP BY’ is called a grouping-list

  36. HAVING • What if we’re only interested in some of the groups? • We can restrict this with HAVING • For sailors over 18 what is the highest rating for each age? SELECT S.age, MAX(S.rating) FROM Sailors AS S GROUP BY S.age HAVING S.age > 18

  37. Summary • Databases are the primary way in which information is managed in organisations • They offer a range of valuable functions • querying, security, transactions… • Database design is important • Normal forms (1NF, 2NF, 3NF):data should depend on the key, the whole key, and nothing but the key • You should now be able to: • explain the main functions of databases • identify opportunities to exploit them to meet business needs • design databases • perform queries against an SQL database

More Related