320 likes | 464 Views
CS370 Database Management Systems. Joe Meehean. Persistent Data. What if we want our data to last? b eyond single run of program b eyond life of machine b eyond life of building b eyond life of company. Persistent Data. What do we want from a persistent data store?. Persistent Data.
E N D
CS370 Database Management Systems Joe Meehean
Persistent Data • What if we want our data to last? • beyond single run of program • beyond life of machine • beyond life of building • beyond life of company
Persistent Data • What do we want from a persistentdata store?
Persistent Data • What do we want from a persistentdata store? • minimize data size • fast • multiple users (at the same time) • flexible • low maintenance • easy to use
Class Registration System • Things • ? • Actions • ? • Limitations • ?
Class Registration System • Things • Students • Faculty • Courses • Offerings • Actions • Courses are offered • Students enroll in courses • Faculty teach courses • Limitations • Class size limits • Faculty cannot teach two courses at the same time • Students cannot take two courses at the same time • Prerequisites
DISCUSSION BREAK!!! • Course registration system • how do we store this stuff? • how do we access it?
DISCUSSION BREAK!!! • What if we need to add a new field to the student record? • What if a second system wants to share our data? • meal plan wants to share student records • How do we manage a single student record for all campus data? • payroll • student loans • parking
Relational Databases • Solves many of these problems • persistent • shared • separates data access from data storage • allows data reuse • data management can be decomposed • Focus of this class • database creation and management • writing applications that use DBs
Vocabulary • DataBase Management System (DBMS) • program or program suite • create, access, and store databases • Oracle, MS Access, SQL Server, DB2 • Relational DataBase (RDB) • business or application data • organized using the entity-relationship model • stored in DBMS • student records, payroll, bank records
When to use Relational DBs • Well structured data • course registration • payroll • Need transactions (consistency) • ATM • class enrollment • payroll deductions • Needs to be moderately fast
When NOT to use Relational DBs • Need blazing speed • databases can be slow (tens of milliseconds) • Need to support thousands of users • databases do not scale well • Unstructured data • more on this later • No common data access patterns
Why learn Databases at all? • You will write software that uses them • New persistent storage response to relational databases • understand their shortcomings • understand advantages of new techniques • New persistent storage usessame techniques • kept what they needed, threw the rest away
TANGENT!!! • People to develop good relationships with • Database Administrators • they are experts • they will help you to avoid doing something stupid • they can make accessing your data easy or hard
TANGENT!!! • People to develop good relationships with • System Administrators and Administrative Assistants • everything you do depends on something they did for you • they are often overworked • every time they do something for you,it is a favor. Say thank you.
Entity Relationship Model • RDBs built on simple data model • Entities • things, stuff, concepts • e.g., students, course, offerings, grades • Relationships • connections between entities • e.g., students take courses, courses have offerings, • Similar to object-oriented programming model
ER Model in RDBs • Entities translated into tables • Relationships connect tables
How to store table data? • Disk Geometry • seek time: move disk arm (8ms) • rotational Latency: data to spin under disk head (2-4ms) • data transfer: read data from disk (negible) • sequential reads and writes are faster than random R/W • Place data that will be used together close together
How to store table data? • Assume we are building a database management system • Need to write the student structs to disk • But what is the best way? struct student{ int id; char* f_name; char* l_name; char* major; float gpa; }
DISCUSSION BREAK!!! • We are building a RDB to store thousands of students • Organize file layout of student structs • How do we find student info quickly? • struct student fetchStudent(intstudent_id){…}
DISCUSSION BREAK!!! • Organize file layout of student structs • One line per student? • One line per data field? 1354, Phil, Park, CS, 2015 3549, Terry, Berry, Math, 2013 5467, Samantha, Small, Econ, 2012 1354, 3549, 5467 Phil, Terry, Samantha Park, Berry, Small CS, Math, Econ 2015, 2013, 2012
DISCUSSION BREAK!!! • Organize file layout of student structs • What if we want to calculate average GPA? • What if we want a sorted list of last names?
DISCUSSION BREAK!!! • Assume we built a library to find and read students struct student fetchStudent(intstudent_id){…} • Registrar uses the library • What if the dining hall wants to add a account balance field? • How can we prevent changing the interface due to changes in data fields?
Three Schema Architecture • Internal Schema • How the data is stored on disk • Conceptual Schema • Entities and relationships • External Schema • Application specific views of the data
Three Schema Architecture 1354, 3549, 5467 Phil, Terry, Samantha Park, Berry, Small CS, Math, Econ 2015, 2013, 2012 $23.45, $0.53, $45.16
Three Schema Architecture • Database management system maps from one schema to the other • e.g., removes account balance from registrar view of data • Provides data independence • changes in one schema do not affect structure of others • may change performance • may add functionality
Three Schema Architecture Application 1 Application 2 Application 3 External Schema 1 External Schema 2 External Schema 3 Conceptual Schema Internal Schema
Accessing and updating data • Applications care about data not how its stored (external schema) • should not need to know internal schema • Operations on data require complex procedures • iteration (loops) • lookups • coalescing and shifting data
Accessing and updating data • Declarative data language • non-procedural access • describe what data we need • not how to get it • Structured Query Language (SQL) • create database • store data • access data • aggregate data statistics
End result of success of RDBs • Application developers need to know… • entity-relationship data model • three schema architecture • SQL • Advantages of relational databases • DBMSs are well tested • DBMSs come with experts • RDBs allow unplanned combination of data • standard across all companies
Course Roadmap Application Using RDBs External Schema Relational Database Design Conceptual Schema Database Management System Design Internal Schema