Rethinking Database System Architecture: Towards a Self-Tuning RISC-style Database System
Surajit Chaudhuri, Microsoft Research, Redmond, USA
Gerhard Weikum, University of the Saarland, Saarbruecken, Germany
Conclusion
• Problem: DBMS technology is packaged monolithically, with too many features and too much complexity
• Solution: RISC-style simplification and componentization
  • break up the DBMS into layered packages with narrow APIs and self-tuning capabilities
  • compose appropriate packages into a broader range of IT applications
Think globally, fix locally
Outline
• Analysis
• Role Models for New Departure
• Proposal
Passing of a Dream
• Old World: DBMS at center of the universe
• New World: multi-tier architecture with many custom "data managers"
[Figure labels: dot com, inventory, payroll, <?XML?>, Web server, DBMS, order entry, ERP, Mining]
Why Did This Happen?
• Universality of DBMS was a leap of faith
• SQL is unnatural and complex
• Yet another failed example of the transparency trap
• Featurism has turned into a curse
• Excessive bundling
• Performance is unpredictable
• (Auto-)tuning is a nightmare
• Unacceptable gain/pain ratio (GPR) for application system architects
Example of Poor GPR: DBMS Query Processor
• Yet another indexing smart added
• Yet another join method added
• Yet another transformation rule added
• Optimizer designers will admit:
  • It is unpredictable
  • Hard to abstract principles
• ERP/Mining/etc. attempt to outsmart the QP
• Turning into black magic
• Cannot educate next generation of engineers
Role Models for New Departure
• Ex. 1: Aircraft with many subsystems (engine, fuselage, electrical control, etc.)
• Ex. 2: RISC hardware
• No single engineer understands the entire system
• Local theories for individual subsystems and reasonable understanding of interactions
• Few points of interaction, with stable and narrow interfaces
• Built-in system support for debugging subcomponents (incl. performance)
RISC Philosophy for DBMS
• DBMS technology must be packaged as components with simplified functionality
• Enforce:
  • Layered approach
  • Strong limits on interaction (narrow APIs)
  • Multiple consumers for a component
• Components must have manageable complexity to be desirable for their potential consumers
• Encapsulation must include predictable performance and self-tuning
Why Predictability is Crucial
From best-effort to guaranteed performance
"Our ability to analyze and predict the performance of the enormously complex software systems ... are painfully inadequate" (PITAC Report)
• Downtime is very expensive ($100K/min)
• Very slow servers are like unavailable servers
• Tuning for peak load requires predictability of the function that maps workload and configuration to performance
• Self-tuning requires mathematical models
• Feasible at component scale
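To make "self-tuning requires mathematical models" concrete, here is a minimal sketch of one possible model. It assumes a single component behaves like an M/M/1 queue, which is an illustrative choice and not something the slides specify. The function names `mm1_response_time` and `max_arrival_rate` are hypothetical.

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: R = 1 / (mu - lambda).

    Assumption: the component is modeled as a single-server queue with
    Poisson arrivals and exponential service times.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def max_arrival_rate(service_rate: float, response_time_goal: float) -> float:
    """Largest arrival rate that still meets a mean response-time goal.

    Solves 1 / (mu - lambda) <= goal for lambda, clamped at zero.
    """
    if response_time_goal <= 0:
        raise ValueError("response-time goal must be positive")
    return max(0.0, service_rate - 1.0 / response_time_goal)


if __name__ == "__main__":
    # A component that serves 100 requests/s under a load of 80 requests/s.
    print(mm1_response_time(80.0, 100.0))
    # Peak load it can accept while keeping mean response time at 50 ms.
    print(max_arrival_rate(100.0, 0.05))
```

A model this simple is only plausible for a small component with few internal alternatives, which is the point of the last bullet: prediction is feasible at component scale, not for a monolithic system.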
Check Availability (Look-Up Will Take 8-25 Seconds)
Internal Server Error. Our system administrator has been notified. Please try later again.
RISC-style Engine (Components)
• Design principles for components:
  • include only functionality that is self-tuning
  • apply Occam's razor for internal alternatives
• Level 1 (base layer): SPJ only
  • only B-trees, with automatic index selection built-in
  • API includes prioritization & exec. time prediction
• Level 2: Support for aggregation
  • Uses level 1 with narrow API
  • Self-tuning for aggregation considerations
• Level 3: Full-fledged SQL
• Layering sacrifices performance for manageability
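The layering above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' design: tables are in-memory lists of dicts, a hash join stands in for B-tree access, and the class and method names (`Level1Engine`, `Level2Engine`, `spj`, `predict_rows_scanned`, `sum_by`) are hypothetical. What it shows is that Level 2 reaches the data only through the narrow Level 1 API.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


class Level1Engine:
    """Hypothetical base layer: select-project-join (SPJ) only."""

    def __init__(self) -> None:
        self._tables: Dict[str, List[Row]] = {}

    def load(self, name: str, rows: Iterable[Row]) -> None:
        self._tables[name] = list(rows)

    def predict_rows_scanned(self, table: str) -> int:
        """Crude stand-in for execution-time prediction: rows a scan touches."""
        return len(self._tables[table])

    def spj(
        self,
        table: str,
        predicate: Callable[[Row], bool],
        columns: List[str],
        join_with: Optional[str] = None,
        join_key: Optional[str] = None,
    ) -> List[Row]:
        rows = self._tables[table]
        if join_with is not None:
            # Single join method (equality hash join): no internal alternatives.
            index: Dict[object, List[Row]] = defaultdict(list)
            for right in self._tables[join_with]:
                index[right[join_key]].append(right)
            rows = [
                {**left, **right}
                for left in rows
                for right in index.get(left[join_key], [])
            ]
        # Select, then project.
        return [{c: r[c] for c in columns} for r in rows if predicate(r)]


class Level2Engine:
    """Hypothetical aggregation layer built only on the Level 1 API."""

    def __init__(self, base: Level1Engine) -> None:
        self._base = base

    def sum_by(self, table: str, group_col: str, value_col: str) -> Dict[object, float]:
        totals: Dict[object, float] = defaultdict(float)
        for row in self._base.spj(table, lambda r: True, [group_col, value_col]):
            totals[row[group_col]] += row[value_col]
        return dict(totals)
```

Because `Level2Engine` never touches `_tables` directly, it gives up optimizations that a monolithic engine could apply, such as aggregating during the scan. That is the trade the last bullet names: performance for manageability.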
RISC in the Large
• Composition principles for IT solutions in the large:
  • Choose least-complexity components
  • IT solution can rely on predictable/guaranteed performance of components
• Use level 1 engine (SPJ, or merely record and index managers) for MP3 repository, simple E-service etc.
• Use level 2 engine (SPJ + aggregation) for OLAP or ERP
• Use level 3 engine (SQL) for full-fledged DW, legacy apps
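The "choose least-complexity components" rule can be read as a simple selection procedure. The sketch below is one hypothetical way to express it; the feature names and the `LEVEL_FEATURES` table are illustrative assumptions that loosely mirror the three levels on the slides.

```python
from typing import Dict, Set

# Assumed feature sets per engine level (illustrative, not from the slides).
LEVEL_FEATURES: Dict[int, Set[str]] = {
    1: {"select", "project", "join"},
    2: {"select", "project", "join", "aggregation"},
    3: {"select", "project", "join", "aggregation", "full_sql"},
}


def least_complex_level(required: Set[str]) -> int:
    """Return the lowest engine level whose features cover the requirements."""
    for level in sorted(LEVEL_FEATURES):
        if required <= LEVEL_FEATURES[level]:
            return level
    raise ValueError(f"no engine level supports: {sorted(required)}")


if __name__ == "__main__":
    print(least_complex_level({"select"}))               # simple repository
    print(least_complex_level({"join", "aggregation"}))  # OLAP-style workload
    print(least_complex_level({"full_sql"}))             # legacy application
```

Picking the lowest sufficient level keeps the application on the component with the fewest internal alternatives, and so the one whose performance is easiest to predict.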
Implications of RISC Approach
• Need for Universal Glue for components
• COM/Universal Runtime and EJB
• Simplicity is key
• Eliminate all second-order optimizations
• Restrict alternatives
• Not yet another join method or transformation rule
• Don't abuse extensibility!
Road Map
• Demonstrate "plug and play" light-weight data servers for various scenarios (API and guaranteed performance):
  • MP3 repositories
  • OLAP server
  • Metadata manager
• Open source "bazaar"?
Potential Caveats and Rebuttals
• We've been down this road before! But we now have a better understanding of the appropriate components and APIs.
• We will lose performance! But we win in terms of predictability and overall GPR.
• There is no business incentive! As industries mature, predictability and manageability do matter for long-term benefit.