This resource covers the fundamentals of database systems, including data models, levels of abstraction, data independence, query optimization, transaction management, and more. It explains the importance and benefits of using a Database Management System (DBMS) for efficient data handling and security. The content explores the history, evolution, and relevance of databases in modern computing.
Introduction to Database Systems, Chapter 1 • Instructor: Xintao Wu • Ramakrishnan & Gehrke
http://www.sigmod.org/record/issues/0606/index.html
History • 1960s: C. Bachman (GE), network data model • Late 1960s: IBM IMS, hierarchical data model • 1970: E. Codd, relational model • 1980s: SQL, IBM System R, transactions (J. Gray) • Late 1980s-1990s: DB2, Oracle, Informix, Sybase • 1990s-2010s: data warehouses, the internet • 2010s onward: Big Data, NoSQL/NewSQL (M. Stonebraker) • Turing award and Turing test?
What Is a DBMS? • A database is a very large, integrated collection of data. • It models a real-world enterprise: • Entities (e.g., students, courses) • Relationships (e.g., Madonna is taking ITCS6160) • A Database Management System (DBMS) is a software package designed to maintain and utilize databases.
Why Use a DBMS? • Data independence and efficient access. • Reduced application development time. • Data integrity and security. • Uniform data administration. • Concurrent access, recovery from crashes.
Why Study Databases? • Shift from computation to information • at the “low end”: scramble to webspace • at the “high end”: scientific applications • Datasets increasing in diversity and volume. • Digital libraries, interactive video, Human Genome project, EOS project • ... need for DBMS exploding • DBMS encompasses most of CS • OS, languages, theory, “A”I, multimedia, logic
Data Models • A data model is a collection of concepts for describing data. • A schema is a description of a particular collection of data, using the given data model. • The relational model of data is the most widely used model today. • Main concept: relation, basically a table with rows and columns. • Every relation has a schema, which describes the columns, or fields.
Levels of Abstraction • Many views, single conceptual (logical) schema and physical schema. • Views describe how users see the data. • Conceptual schema defines logical structure. • Physical schema describes the files and indexes used. • Schemas are defined using DDL; data is modified/queried using DML. [Figure: View 1, View 2, and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema.]
Example: University Database • Conceptual schema: • Students(sid: string, name: string, login: string, age: integer, gpa: real) • Courses(cid: string, cname: string, credits: integer) • Enrolled(sid: string, cid: string, grade: string) • Physical schema: • Relations stored as unordered files. • Index on first column of Students. • External Schema (View): • Course_info(cid: string, enrollment: integer)
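The three levels in this example can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite type names stand in for the slide's string/integer/real, the index name and all inserted rows are made up, and Course_info is assumed to count enrollments per course (the slide gives only its column names).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: conceptual schema (tables), a physical choice (index), external schema (view).
conn.executescript("""
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Physical schema: index on the first column of Students (index name is hypothetical).
CREATE INDEX students_sid_idx ON Students (sid);

-- External schema: assumed here to be the number of enrolled students per course.
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# DML: hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "ITCS6160", "A"), ("s2", "ITCS6160", "B"), ("s1", "ITCS6114", "A")],
)

# Users of the view see only (cid, enrollment), not the underlying tables.
rows = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(rows)
```

The CREATE statements play the role of DDL and the INSERT and SELECT statements the role of DML.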
Data Independence • Applications insulated from how data is structured and stored. • Logical data independence: Protection from changes in logical structure of data. • Physical data independence: Protection from changes in physical structure of data. • One of the most important benefits of using a DBMS!
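Physical data independence can be illustrated with a small sqlite3 sketch: the same query text returns the same answer before and after an index is added. The table contents, the query, and the index name are hypothetical and chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("s1", "Ann", 3.4), ("s2", "Bob", 3.8), ("s3", "Cy", 2.9)],  # made-up rows
)

# The application only ever issues this query text.
QUERY = "SELECT name FROM Students WHERE gpa > 3.0 ORDER BY name"

before = conn.execute(QUERY).fetchall()

# Change the physical schema: add an index. The application code is untouched.
conn.execute("CREATE INDEX students_gpa_idx ON Students (gpa)")

after = conn.execute(QUERY).fetchall()
print(before == after)
```

The index may change how the DBMS evaluates the query, but not what the query means or what it returns.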
Structure of a DBMS • A typical DBMS has a layered architecture. • The figure does not show the concurrency control and recovery components. • This is one of several possible architectures; each system has its own variations. [Figure: layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]
Transaction Management: ACID properties • Atomicity: All actions in the Xact happen, or none happen. • Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent. • Isolation: Execution of one Xact is isolated from that of other Xacts. • Durability: If a Xact commits, its effects persist. • The Recovery Manager guarantees Atomicity & Durability.
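Atomicity can be seen in a small sqlite3 sketch. The Accounts table, the CHECK constraint, the balances, and the transfer helper are all hypothetical; the point is only that when the second update of a transfer fails, the first update is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE Accounts (name TEXT PRIMARY KEY, "
    "balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM Accounts"))


failed = transfer(conn, "B", "A", 100)  # B only has 50: the debit violates CHECK
after_failed = balances(conn)           # the credit to A was rolled back too
ok = transfer(conn, "B", "A", 30)
after_ok = balances(conn)
print(failed, after_failed, ok, after_ok)
```

The credit is deliberately issued before the debit so that a failure leaves a partial effect to undo.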
Motivation of concurrency control • Consistency • Isolation • Example • Two parallel transactions T1 and T2 • Serial execution • Execution with interleaving actions • Example shown on board
Example • Consider two transactions (Xacts): T1: BEGIN A=A+100, B=B-100 END T2: BEGIN A=1.06*A, B=1.06*B END • Intuitively, the first transaction is transferring $100 from B’s account to A’s account. The second is crediting both accounts with a 6% interest payment. • There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.
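The two serial orders can be worked through numerically. The sketch below assumes, purely for illustration, that both accounts start at $1000; Decimal is used so that the 6% interest is computed exactly.

```python
from decimal import Decimal

RATE = Decimal("1.06")


def t1(s):
    """T1: transfer $100 from B to A."""
    return {"A": s["A"] + 100, "B": s["B"] - 100}


def t2(s):
    """T2: credit both accounts with 6% interest."""
    return {"A": RATE * s["A"], "B": RATE * s["B"]}


start = {"A": Decimal(1000), "B": Decimal(1000)}  # assumed starting balances

t1_then_t2 = t2(t1(start))  # A = 1166, B = 954
t2_then_t1 = t1(t2(start))  # A = 1160, B = 960

print(t1_then_t2, t2_then_t1)
```

The two orders give different balances, and both are acceptable outcomes. In each case the bank pays $120 of interest in total (6% of $2000).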
Example (Contd.) • Consider a possible interleaving (schedule), with actions listed in time order: T1: A=A+100; T2: A=1.06*A; T1: B=B-100; T2: B=1.06*B • This is OK. But what about: T1: A=A+100; T2: A=1.06*A, B=1.06*B; T1: B=B-100 • The DBMS’s view of the second schedule: T1: R(A), W(A); T2: R(A), W(A), R(B), W(B); T1: R(B), W(B)
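The difference between the two schedules depends on the order in which the actions interleave. The sketch below assumes the following orders: in the first schedule T1 and T2 alternate (T1 on A, T2 on A, T1 on B, T2 on B); in the second, T2 runs entirely between T1's two updates. Starting balances of $1000 are also an assumption. Each schedule is compared with the two serial executions.

```python
from decimal import Decimal

RATE = Decimal("1.06")

# What each transaction does to each account.
ACTIONS = {
    ("T1", "A"): lambda v: v + 100,
    ("T1", "B"): lambda v: v - 100,
    ("T2", "A"): lambda v: RATE * v,
    ("T2", "B"): lambda v: RATE * v,
}


def run(schedule, start):
    """Apply the (transaction, account) steps of a schedule in order."""
    state = dict(start)
    for txn, item in schedule:
        state[item] = ACTIONS[(txn, item)](state[item])
    return state


start = {"A": Decimal(1000), "B": Decimal(1000)}  # assumed starting balances

serial_t1_t2 = [("T1", "A"), ("T1", "B"), ("T2", "A"), ("T2", "B")]
serial_t2_t1 = [("T2", "A"), ("T2", "B"), ("T1", "A"), ("T1", "B")]
schedule_ok = [("T1", "A"), ("T2", "A"), ("T1", "B"), ("T2", "B")]
schedule_bad = [("T1", "A"), ("T2", "A"), ("T2", "B"), ("T1", "B")]

serial_results = [run(serial_t1_t2, start), run(serial_t2_t1, start)]


def matches_serial(schedule):
    """True if the schedule ends in the same state as some serial order."""
    return run(schedule, start) in serial_results


print(matches_serial(schedule_ok), matches_serial(schedule_bad))
print(run(schedule_bad, start))
```

In the second schedule A receives interest on the transferred $100 while B also receives interest on it, so the bank pays $126 instead of $120. Note that this check compares final states for one starting state only; it is not a general serializability test.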
Motivation of recovery management • Atomicity: • Transactions may abort (“Rollback”). • Durability: • What if DBMS stops running? (Causes?) • Desired behavior after system restarts: • T1, T2 & T3 should be durable. • T4 & T5 should be aborted (effects not seen). [Figure: timeline of transactions T1 to T5 with the crash marked.]
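The desired behavior after restart can be sketched as a classification over a log. The log format and the order of events below are hypothetical; the sketch assumes T1, T2, and T3 had committed before the crash and T4 and T5 had not. A real recovery manager does far more than this (it must also redo and undo the actual changes).

```python
def classify(log):
    """Split logged transactions into committed ones and unfinished ones."""
    started = []
    committed = set()
    for txn, event in log:
        if event == "begin":
            started.append(txn)
        elif event == "commit":
            committed.add(txn)
    durable = [t for t in started if t in committed]      # effects must persist
    aborted = [t for t in started if t not in committed]  # effects must be undone
    return durable, aborted


# Hypothetical log contents at the moment of the crash.
log = [
    ("T1", "begin"), ("T2", "begin"), ("T1", "commit"),
    ("T3", "begin"), ("T4", "begin"), ("T2", "commit"),
    ("T3", "commit"), ("T5", "begin"),
]

durable, aborted = classify(log)
print(durable, aborted)
```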
Handling the Buffer Pool • Force every write to disk? • Poor response time. • But provides durability. • Steal buffer-pool frames from uncommitted Xacts? • If not, poor throughput. • If so, how can we ensure atomicity?
         | No Steal | Steal
Force    | Trivial  |
No Force |          | Desired
Databases make these folks happy ... • End users and DBMS vendors • DB application programmers • E.g. smart webmasters • Database administrator (DBA) • Designs logical/physical schemas • Handles security and authorization • Data availability, crash recovery • Database tuning as needs evolve • Must understand how a DBMS works!
Summary • A DBMS is used to maintain and query large datasets. • Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. • Levels of abstraction give data independence. • A DBMS typically has a layered architecture. • DBAs hold responsible jobs and are well-paid! • DBMS R&D is one of the broadest, most exciting areas in CS.