320 likes | 424 Views
Lecture 3: System R. Aug. 31, 2007 ChengXiang Zhai. Most slides are adapted from Kevin Chang’s lecture slides. System R. System R: 1974-1978 IBM San Jose Labs, lots of PhD researchers Gray: coming from OS, first CS PhD of Berkeley lots of influence in RSS
E N D
Lecture 3: System R Aug. 31, 2007 ChengXiang Zhai Most slides are adapted from Kevin Chang’s lecture slides
System R • System R: 1974-1978 • IBM San Jose Labs, lots of PhD researchers • Gray: coming from OS, first CS PhD of Berkeley • lots of influence in RSS • ACM SIGMOD Innovation Award 1992 • Turing Award 1998 • Won Kim is UIUC alum • Dissertation: Query Processing for Relational Database Systems
INGRES • INGRES: 1973-1977 • U.C. Berkeley faculty & graduate students • Mike Stonebraker (then an asst. prof) • ACM SIGMOD Innovation Award 1991 • Eugene Wong • Postgres PostgreSQL
System R and INGRES: Gray Jim Gray: (see System R 25th Reunion page) Hostility developed between the San Jose IBM group and the Berkeley group because they were working on very, very similar things and had very, very similar ideas… As a consequence we came to the conclusion that the best thing was not to talk to each other. The Berkeley folks thought the IBM guys were ripping off ideas from the INGRES project. We had a strained relationship.
Joint ACM Software System Award 1988 System R: Donald Chamberlin, James Gray, Raymond Lorie, Gianfranco Putzolu, Patrici Selinger, Irving Traiger INGRES: Gerald Held, Michael Stonebraker, Eugene Wong Citation The INGRES and System R systems demonstrated that a practical and efficient database management system (DBMS) could be implemented based on the relational data model. These systems were full-function DBMS's that supported non-procedural query languages (QUEL and SQL), automatic query optimization, alternative storage structures, transactions, crash recovery, views, integrity, and protection. They have revolutionized the database system industry by showing how data stored in a computer can be conveniently accessed by end users and while at the same time it can be used by production application programs. http://awards.acm.org/software_system/
Contributions of System R? • Bringing theory to practice • nice theory implemented into practical system • High level query language (SQL) • Codd’s relational algebra/calculus were criticized as too mathematical • “System” research in action • macro: design a complete system architecture • micro: identify key problems and provide solutions • Defining database landscape • industry product spec. and research directions
Complete System Study • Phase 0: 1974, 1975 • initial single-user prototype • try out ideas and find issues • felt a good idea to plan to throw away ver. 1.0 • Phase 1: 1976, 1977 • full-function, multi-user prototype • Phase 2: 1978, 1979 • evaluation and feedback • lots of good lessons learned • Very similar process took place in INGRES
System Modules Identified • view management • query parser/rewriter • query optimizer • query executor • data storage • access methods • buffer manager • lock manager • log/recovery system
System R Architecture • RDS/RSS divide remains in many systems • RDS: query processing (logical) • view, query parser, rewriter, optimizer, executor • RSS: storage/access methods (physical) • storage, access methods, buffer manager, lock, log/recovery
Views • View defined as a query • another consistent use of SQL (no separate DDL) • Query on views: • query rewriter to flatten view (unfold def.) • form a composite query tree • View transparency • Almost any queries on any views • Not fully transparent though: • update: only for single-relation views • no right meanings in some cases • many-to-one nature of view def. (ambiguity) • even none-to-one (some view state has no correspondence)
SQL as Query Language • High-level declarative, English-based language • declarative language: what not how • well-founded/simple semantics based on relational algebra • small set of well-understood operators, so optimizer knows: • how operators can be interchanged/transformed • what equivalent implementations are for each op • Consistent for different functionalities • data definition: e.g., table creation, view definition • data manipulation: e.g., queries, updates • Uniform for different usage scenarios • embedding from different host languages (canned queries) • ad hoc user queries from command lines • Unexpected benefit: Standardized DB interface (mid 80’s)
What makes SQL possible? Query: parse, access-path selection, code gen, execute • Cost-based access-path selection (optimization) • Pre-compilation for canned queries • remove preprocessing/optimization from run time • data indexes and statistics may change: • reoptimize and recompile by observing dependencies • alternative approaches? • caching of recent used query plans • trigger to invalidate cached plans on relevant events • e.g., on rebuilding system statistics, on index creation • in contrast, interpreted QUEL in INGRES admitted mistake
Query Optimizer: Cost Based • Cost-based optimizer: set up paradigm • largely unchanged since • Cost model: • C = weighted-sum(CPU time, #IO) • CPU time modeled as number of RSS calls
Query Optimizer: Access Path Selection • Access path selection based on expected costs • select people where job = programmer and city = champaign • path 1: job index --> check city; path 2: city index --> check job • more paths? • Data independence: what are hidden from users? • Cost estimation based on • index selectivity • job (=programmer) more selective or city (=chamapign)? • index clustering • records of same/neighboring key are packed physically together • minimize #IO to fetch records of same key or a range • a relation can typically has at most one clustering index. Why?
Query Optimizer: Join Strategies To evaluate R.a = S.a: • Nested-loop join: for each tuple r in R: use index, fetch S tuples s s.t. s.a=r.a • Q: B-tree or hashing index better? • Sort-merge join: sort R, sort S merge tuples in R and S in order • Q: Use B-tree index to speed up? Hashing? • All joins two-way; n-ary joins as binary trees • prune away lots of alternative plans for n-ary joins
Storage: Phase 0 • Tuples have TID, containing page number • direct access by TID to fetch the page • Tuples contain pointers to values in “domain” • pros: • clean and original “domain” notion in Codd’s model • ultimate data normalization • cons: inefficient • idea picked up by some web search engines to store cached documents • Design of Database Systems … --> 0AF1 00A0 A0B8 B001 …
Access Methods: Phase 0 • Inverted Index (an inversion is also an index) • domain value --> list of TIDs • similar idea used in text information retrieval for word to document ID mapping • Q: find documents with “cat” and “dog” • inverted index: cat --> d1, d2, d3 dog --> d1, d3, d5, d6 • A: d1, d3 • How is this different from, say, B-trees?
Access Methods: Phase 1 • B-trees • Hashing: rejected • place records to buckets based on hashing values • reason of rejection: cannot support range queries • Advantages? • to reach a record, you…
Lock Manager: Influential • Multiple granularity (hierarchy) of locks • records, relations, entire database • Intention locks for traversing data hierarchies • lock table with intention before locking tuples • acquiring locks from top down • Deadlock handling: detection • We will study this in more detail later (Gray)
Lock Manager: Predicate-Lock Abandoned • Lock all the tuples satisfying a predicate • e.g., lock “student.dept = CS” • Problems: • hard to determine if locks conflict • “dept = CS” vs. “GPA > 3.0”? • involving semantics • data snapshot also matters • locked set can be changing during locking
Lock Manager: Convoy Problem Lock resource may interfere with OS resources • Transaction T1 holds lock while to be dispatched by OS • Transaction T2, …, Tn all wait; convoy formed • T1 dispatched, and then release lock • say 10K instructions for dispatching • lock granted to T2, which is waiting to be dispatched by OS • T1 soon need the lock again, go back to end of convoy • say only executed 1k instructions between locking • T1 is now wasting its CPU time slice • Most cycles are for dispatching only! • T1: 10K dispatching, 1K execution, waiting in convoy
Log/Recovery: Failures • Transaction failures • i.e., transaction rollback if cannot be committed • System failures • all data updates in main memory buffer lost • Media failures • data on disk lost
Log/Recovery: Before Logging • “Before” (undo) logging: record old value • after crash, bring DB back to consistency • undo all uncommitted transactions • problems: • data must be flushed to disk before transaction commit • after crash, cannot bring to the most recent status
Log/Recovery: After Logging • “After” (redo) logging: record new value • after crash, bring DB forward to consistency • redo all committed transactions • problem: • must hold data in buffer until commit made • long history of “committed transactions”; slow in recovery
Log/Recovery: Before/After Logging • Combined before/after logging; to recover: • checkpoints to archive current healthy state • redo all committed transactions • undo all uncommitted transactions • Dual logs to prevent log failure • Shadow pages not a good idea • tend to lose physical clustering • too much overhead in maintaining • alternative: in-place updates; separate logs #
System R Influences: Other Ideas • catalog as relations-- metadata managed as data • security management (grant/revoke) • integrity constraints, triggers
System R Influence • Ellison's Oracle beats IBM to market by reading white papers. • IBM releases multiple RDBMSs, settles down to DB2. • Gray (System R), Jerry Held (Ingres) and others join Tandem (Non-Stop SQL), Kapali Eswaran starts EsVal, which begets HP Allbase and Cullinet • Relational Technology Inc (Ingres Corp), Britton-Lee/Sybase, Wang PACE grow out of Ingres group • CA releases CA-Universe, a commercialization of INGRES
What You Should Know • The main challenges the system R builders had to solve • Challenges related to the implementation of the relational model (storage, indexing, query optimization, query language) • Additional challenges (locking system, log/recovery system) • The main contributions of system R in implementing a relational data model (mostly corresponding to the challenges above)
Carry Away Messages • System development is as valuable as (or more valuable than?) theory formulation • Often, new problems/challenges would be discovered through system building • A system’s value depends on • Utility of the system (new/better functions that people care about) • Technical contributions (solutions to particular challenges) • System R has high values in both aspects!
Questions to Think About • Examples of ambitious systems? • How do you build a fuzzy entity-relation graph search/mining engine? • How do you build an XML search system? • How do you build a community information management system? • …