1 / 41

데이터베이스의 끝없는 도전 - Towards Self-Tuning Databases -

월간 마이크로소프트웨어 창간 20 주년 기념 세미나. 데이터베이스 ( 가제 ). 데이터베이스의 끝없는 도전 - Towards Self-Tuning Databases -. 이 상 원 wonlee@ece.skku.ac.kr http://vldb.skku.ac.kr 성균관대 & ㈜ 엑셈. Contents. A Quick Overview of Database History Why Self-Tuning Databases? Database Tuning in the 21 st century

amal
Download Presentation

데이터베이스의 끝없는 도전 - Towards Self-Tuning Databases -

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 월간 마이크로소프트웨어 창간 20 주년 기념 세미나 데이터베이스(가제) 데이터베이스의 끝없는 도전 - Towards Self-Tuning Databases - 이 상 원 wonlee@ece.skku.ac.kr http://vldb.skku.ac.kr 성균관대 & ㈜엑셈

  2. Contents • A Quick Overview of Database History • Why Self-Tuning Databases? • Database Tuning in the 21st century • Towards Self-Tuning Databases

  3. A Quick Overview of Database History • File systems(50s) • Hierarchy/Network databases(60s) • Relational databases(70s) • Object databases(80s) • OR databases(90s) • XOR databases(00s)

  4. A Quick Overview of Database History(2) • The state-of-the-arts functionalities of RDBMSs • A giantic black-hole of application intelligence • Developer  DB2, Oracle, SQL Server ROLAP (Cube) Active DB (Trigger) Deductive DB (Recursion) MOLAP EXCEL Spatial DB Data Mining Temporal DB • Procedure • C/C++, Java, .. XML/Stream/ Bioinformatics/ Appl. Server/ Web Service .. Multimedia File System e-mail Queue

  5. A Quick Overview of Database History(3) • The state-of-the-arts functionalities of RDBMSs • Rapidly growing internal optimization/tuning knobs • DBA   DB2, Oracle, SQL Server MVs Buffer Mgmt Partitioning Extensibility Bitmap Index Cost-Based Optimization Automatic Disk Mgmt ……….

  6. A Quick Overview of Database History(4) • Achievements of RDBMSs • Relational model and the “Esperanto” SQL  “productivity” • Query optimization  “can beat all but the best programmers” • Transaction management  “ACID” • E. F. Codd gone! but Relational forever!! • A vision for database community "My prediction is that eventually all storage systems will evolve to be database systems."(Jim Gray, 2002)

  7. A Quick Overview of Database History(5) DB = IT Core Platform in the 21st Century??? • ERP/CRM/SEM/BSC • DW/OLAP/Data Mining • Web Log Analysis • GIS/XML/Mobile • Bio-informatics • .... • Ubiquitous Databases

  8. A Quick Overview of Database History(6) • One big challenge for RDBMS • Simplicity is a big issue. (Jim Gray, 2003) • Commercial DBMSs are exponentially complicated • Even 10-year veteran Oracle DBAs do not know 50% of all the new features of Oracle’s latest release • Worsely, it would take more than 1 week for them to understand the performance behavior of a single new feature • The gap between human DB learning curve and new features becomes larger  “Crossing this chasm” seems to be impossible!! • It’s time for self-tuning intelligence!

  9. What is DB Tuning? • Car tuning • Goals? • What? • How?

  10. Why DB Tuning in the 21st Century? • DB performance influences all aspects of your business • DB performance management is one of the most important success factors in almost every IT projects • “미래 웹경쟁력은 데이터처리능력에”(Scott McNealy, Sun CEO, 전자신문) • DB tuning is not a technical issue, but a business issue! • Improve your productivity • Increase your revenue/profit • Increase your ROI(Return on Investment)

  11. Why DB Tuning in the 21st Century?(2) • More data - doubling every 9 month • More users - from CEO to sales representatives • More complex query - e.g. OLAP/data mining query • Faster responses • e.g. web response time impacts revenue! • e.g. real time personalization • Data growth vs. computer speedup * Moore’s Law -- # of transistors/chip doubles every 18 months (1965)

  12. Why DB Tuning in the 21st Century?(4) • Disk sales doubling every 9 months • Greg Papadopoulos, Sun CTO • Also from Winter VLDB survey (http://www.wintercorp.com) • Time to process all your data doubles every 18 months! • Without DB tuning, you need double-up investment on DB server and software every 6 month! • Maximize your DB performance!! (Let’s DBMax) • Maximize your PRODUCTIVITY • Maximize your REVENUE/PROFIT • Maximize your ROI

  13. Why Self-Tuning Databases? • War of TCO(Total Cost of Ownership) in 21st century • Major costs of IT infra.: H/W, S/W, Human Cost • Moore’s law: H/W, S/W cost ↓ • BUT! Human cost ↑ • cf. 1950s: telephone industry • manual switching by operators  automatic switching system

  14. Why Self-Tuning Databases?(2) Blackbox Database Tuning

  15. Self-Tuning: Targets UserProcesses Big Picture of a Simplied DBMS SQL Server Processes Oracle Instance SQL Exec. Engine Optimizer • SGA Redo LogBuffer Shared Pool Database Buffer Cache SQL Exec. Memory Oracle Database Parameter File Control Files Datafiles Redo Log Files

  16. Self-Tuning: Targets(2) • SQLs • Memory • Index/Materialized Views • Emerging fields

  17. SQL Self-Tuning: Optimizer • SQL  best execution plan • access method + join order + join method • Rule-based optimizer vs. Cost-based optimizer • The Selinger-style optmizer • P. Selinger et al, “Access path selection in a relational database management systems,” SIGMOD 79 • SYSTEM/R • A “MUST” reading for every SQL guys • cost estimation • dynamic programming

  18. SQL Self-Tuning(2): Cost-Based Optimzation 4. (Estimated) Best Execution Plan for SQL: Query Optimizer 1. SQL 2. Plan Generator P1, P2, .. , Pn Pi estimated cost RDBMS Instance Runtime Memory Environment: Buffer Cache Size, Sort/Hash Memory etc. for each plan Estimated Cost Cost Model Estimated Cardinality Combined Selectivity Selectivity & Cardinality Estimation Formula Real Data Data Dictionary Single Selectivity 3. Cost Estimator

  19. SQL Self-Tuning(3): Cost-Based Optimzation • Cost-based optimization technique itself is a self-tuning process, but it lacks 5% • 80% of SQL: Best plan • 15%: top 3 plan • 5%: not so good • Fallacies of CBO • Uniform distribution • Attribute Independence • Join Independence

  20. SQL Self-Tuning(4): Cost-Based Optimzation • How to fix the fallacies of CBO? • Various approaches: who? when? based on what? • Manual tuning • LEO • SOS [13] • Dynamic reoptimization • Parameteric query optimization • ….

  21. SQL Self-Tuning(5): IBM DB2 LEO • LEO( LEarning Optimizer ) • DB part of IBM SMART: Self-Managing And Resource Tuning (See IBM’s IBM Autonomic Computing )

  22. SQL Self-Tuning(6): IBM DB2 LEO Negative feed back!!! • eeed SELECT * FROM X,Y,Z WHERE X.PRICE>=100 AND Z.CITY=‘Denver’ AND Y.MONTH=‘Dec’ AND X.ID=Y.ID AND Y.NR=Z.NR GROUP BY A

  23. SQL Self-Tuning(7): Oracle’s v$sql_plan_statistics • Runtime statistics monitor(Since Oracle9iR2) • You can compare optimizer’s estimation with real statistics • Then, you can find out where is the source of errors? ** Someday, Oracle and MS will come with LEO-like features • SKKU and XM are developing SOS tool • “Save Our SQL” • How to tune SQL (semi-)automatically? • Find the negative feedback • Drill down to the causes • Analyze the effect • Hint the SQL

  24. Self-Tuning: Targets • SQLs • Memory • Index/Materialized Views • Emerging fields

  25. Memory Self-Tuning Response Time M1-pass = sqrt(I x S x 2) Moptimal = I . Sort . Memory 1-pass optimal M-pass • Complex queries use memory intensive operators • Sorting, hashing, bitmap index manipulation etc. • The amount of available memory greatly affects their performance • However, there is only a finited amount of memory, shared by all concurrent users

  26. Memory Self-Tuning(2): Oracle SQL Memory Mgmt. Sort Area, Hash Area, Bitmap Index • Pre-Oracle9i • Fixed PGA size • Sort(Hash)_Area_Size • Oracle9i [8] • PGA_Aggregate_Target • Dynamically allocated according to the needs Oracle Memory Model

  27. Memory Self-Tuning(3): Oracle SQL Memory Mgmt. • Manual vs. Automatic: Response time

  28. Memory Self-Tuning(4): Oracle SQL Memory Mgmt. • Manual vs. Automatic: Total Memory Usage Number of Users

  29. Memory Self-Tuning(5): Oracle SQL Memory Mgmt. Active work areas profiles WP4 Local Memory Manager Sn Sn Sn Sn … Memory Usage Statistics WPn SQL Statements WP1 pga_aggregate_target WP2 SQL Memory Target WP3 Memory Bound Computation Memory Target Computation Memory Bound SMM Daemon • DBA specifies PGA_AGGREGATE_TARGET • Oracle divides PGA memory for max. performance and optimal memory usage

  30. Self-Tuning: Targets • SQLs • Memory • Index/Materialized View(MV)s • Emerging fields

  31. Index/MV Self-Tuning • Selection of Index/MVs: “SQL Workloads + Available Disk Space  Best Index/MV sets” • Workload: queries + updates • Constraints: upper bound on storage space for indexes • Search: pick a configuration that is of “lowest” cost for the given database and workload • Industry cases • MS Autoadmin • IBM Design Advisor • Oracle OEM Index/Summary Advisor

  32. Index/MV Self-Tuning(2): MS SQL Server • AutoAdmin: Self-Tuning and -Administering Databases[12] “We achieve this by enabling databases to track the usage of their systems and to gracefully adapt to application requirements. Thus, instead of applications having to track and tune databases, databases actively auto-tunes itself to be responsive to application needs.” • Workload driven physical database design • To reduce TCO(Total Cost of Ownership)

  33. Index/MV Self-Tuning(3): MS SQL Server Workload Syntactic structure selection Microsoft SQL Server Candidate Index Selection Candidate Materialized View Selection Configuration Simulation and Cost Estimation Module Configuration Enumeration Final Recommendation Architecture of Index/MV Selection Tools Issue - search space • Large search space for indexes • Many columns to choose from • Kinds of indexes • Explosive search space for MVs • Query optimizers use physical design in novel ways • Physical design choices interact

  34. Self-Tuning: Targets • SQLs • Memory • Index/Materialized View(MV)s • Emerging areas for self-tuning

  35. Emerging Areas for Self-Tuning UserProcesses Big Picture of a Simplied DBMS SQL Server Processes Oracle Instance SQL Exec. Engine Optimizer • SGA Redo LogBuffer Shared Pool Database Buffer Cache SQL Exec. Memory Oracle Database Parameter File Control Files Datafiles Redo Log Files

  36. Emerging Areas for Self-Tuning(2) • Automatic configuration of init. parameter[5] • e.g. IBM DB2 V7.2 “Configuration Advisor” • For 36 parameters, 98.4% of DB2 performance experts • Automating physical database layout[9] • “How database objects such as tables and indexes are assigned to disk drives can impact the I/O performance of the system” (MS SQL Server) • Automatic partitioning of tables under parallel databases (DB2)

  37. Emerging Areas for Self-Tuning(3) • Accurate storage device access cost estimation[11] • CPU speed, disk I/O time, network, RAID rebuild • Inaccurate estimation cause (seriously) suboptimal query plan • Why wrong estimation? • storage or network may undergo heavy load or failure • older/newer than the h/w the optimizer assume • “We can achieve noticeable performance improvements by providing query optimizer with accurate and timely information about the current status of their storage devices”

  38. Emerging Areas for Self-Tuning(4) • System tuning[1] • Most urgent, but only ad-hoc or rule-of-thumb approaches with some monitoring tools: Bizmax/MaxGuage, Orange/Chakra, Spotlight, Patrol, OEM • Only the black box consulting • Highly dependent on consultant or DBA experience • Not scientic/engineering realm!! • We need to develop mathematical model for the interactions among various system tuning knobs; Or, we need to construct and share the rule/knowledge base of the system tuning experts Monitoring Drill-down to Bottleneck Root Cause Analysis Tuning

  39. Conclusion • Next decade will be self-tuning intelligence era • Better than average DBAs • BUT, You may be still the MVP for IT ROI only if • “Crossing the chasm between you and your DBMS technology” • Know your data • Know your application workloads • Know your DBMS intelligence Let’sDBMax!

  40. References • Gerhard Weikum et al., “Self-Tuning Database Technology and Information Services: from Wishful Thinking to Viable Engineering” VLDB 2002, Ten-Year Award • Gerhard Weikum et al, “The COMFORT Automatic Tuning Project”, Information Systems 19(5), 1994 • Gerhard Weikum et al, “Special Issues on Self-Tuning Database and Applications,” Data Engineering Bulletine, Jun., 1999 • IBM’s Autonomic Computing Perspectives: http://www.resarch.ibm.com/autonomic/manifesto/autonomic_computing.pdf • Guy Lohman et al, “Toward Automomic computing with DB2 Universal Database,” SIGMOD Record, Sep., 2002 • MS AutoAdmin Homepage: http://www.research.microsoft.com/ • Oracle Manageability: http://technet.oracle.com/products/manageability/content.html

  41. References(2) • Benoit Dageville and Mohamed Zait, “SQL Memory Management in Oracle9i,” VLDB 2002 • Jun Rao, Chun Zhang, Guy Lohman et al., “Automating Physical Database Design in a Parallel Database,” SIGMOD 2002 • S. Agrawal and Surajit Chaudhuri et al., “Automating Layout of Relational Databases,” ICDE 2003 • F. R. Reiss and T. Kanungo, “A Characterization of the Sensitivity of Query Optimization to Storage Access Cost Parameters,” SIGMOD 2003 • S. Agrawal and Surajit et al., “Automated Selection of Materialized Views and Indexes for SQL Databases,” VLDB 2000 • Michael Stillger, Guy M. Lohman et al., “LEO - DB2's LEarning Optimizer,” VLDB 2001

More Related