1 / 19

Database Performance: Facts and Myths

Database Performance: Facts and Myths. Akhtar Ali School of Computing, Engineering and Information Sciences May 2006. Agenda. The evolution of database technology Motivations Some myths Our approach to database performance study Experiments and results Conclusions.

ide
Download Presentation

Database Performance: Facts and Myths

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Performance: Facts and Myths Akhtar Ali School of Computing, Engineering and Information Sciences May 2006

  2. Agenda • The evolution of database technology • Motivations • Some myths • Our approach to database performance study • Experiments and results • Conclusions

  3. The Evolution of Database Technology • DBMS: Database Management System • HDBMS: Hierarchical DBMS • NDBMS: Network DBMS • RDBMS: Relational DBMS • SQL: Structured Query Language • OODBMS: Object-Oriented DBMS • ORDBMS:Object-Relational DBMS • OQL: Object Query Language • SQL3: SQL standard, 1999 – includes object-relational features.

  4. Motivations • There is a current interest in comparing and contrasting different database: • Systems • Models and • Design considerations • Several benchmarks have been developed: • TPC [1989], OO7 [1993], BUCKY[1997], BORD[2000] • Support for systems comparison rather models/design • In order to choose which database system to use, we need to be aware of (beside other issues) performance based comparison across the board. • Relational vs. Object-Relational vs. Object-Oriented

  5. Some Myths and Expectations • OO is the superior database technology. • Relational technology is simple but inefficient when compared to OO/OR. • OR being a hybrid of relational and OO technologies should outperform relational and leverage OO technologies. • OR database systems: The Next Great Wave! • OR technology is the database silver bullet. • OO database systems are slow. • OO database systems do not support queries.

  6. Our Approach • Comparison based on different data models and design alternatives rather than system A vs. B: • Database systems often have different configurations, algorithms, buffering techniques, block size etc. • Comparing system A against B involves several variables. • Difficult to isolate the impact of how the database is structured. • Using uniform experimental environment: • Oracle 9i for relational vs. object-relational (OR) • Lambda-DB for relational vs. object-oriented (OO) • OR vs. OO (not yet)

  7. Relational vs. Object-Relational DBsStudy # 1 [Golhar and Ali, 2003] • Used the OO7 benchmark [Carey, M., et al. 1993], which was developed primarily for comparing OODBMSs. • Implemented the OO7 database schema using Oracle 9i: • Relational SQL (tables) • Object-Relational SQL (object types, object tables, references etc) • Implemented queries/traversals from the OO7 benchmark on both databases (DBs). • Compared and contrasted elapsed time of executing queries/traversals against both DBs. • Results are consistent with BUCKY [Carey, M., et al. 1997], which was implemented over Informix Universal Server. • But against our expectations: • Relational database outperformed its object-relational counterpart.

  8. OO7 Benchmark Schema

  9. Results (I) • Traversals • T1:Traverse the assembly hierarchy. As each base assembly is visited, visit each of its referenced unshared composite parts. As each composite part is visited, perform a depth first search on its graph of atomic parts. Return a count of the number of atomic parts visited when done. • Read-only traversal (T1, T6): T6 is a variant of T1. • Simple update traversal (T2A, T2B, T2C) • 1 or 20 or 4 x 20 objects are updates per iteration • Index update traversal (T3A, T3B, T3C) • Queries • Single level query (Q5): Base Assemblies • Ad-hoc join query (Q8): Documents and Atomic Parts

  10. Results (I) …

  11. Relational vs. Object-Relational DBsStudy # 2 [Keivani and Ali, 2006] • Study #1 used an OO benchmark for Relational vs. OR comparison. • Although the results of study #1 were consistent with the results of BUCKY [Carey, M., et al. 1997], we were rather surprised to see OR not performing as expected. • We felt that a study based on an OR benchmark on a current ORDBMS is badly needed. • Also there is a need to investigate why OR technology is not yet able to deliver on its promises. • So the BUCKY Benchmark [Carey, M., et al. 1997] has been used in Study #2, which was developed specifically for OR databases.

  12. Relational vs. Object-Relational DBsStudy # 2 [Keivani and Ali, 2006] … • Implemented the BUCKY database schema using Oracle 9i. • The OR version of BUCKY database has been implemented in two different ways due to the limitations of Oracle and in the absence of an OR-based database design method. • Implemented queries/traversals from the BUCKY Benchmark on both DBs. • Compared and contrasted elapsed time of executing queries against both DBs. • Execution plans of how Oracle performs queries have been studied. • The outcome is not much far from BUCKY [Carey, M., et al. 1997], which was implemented over Informix Universal Server. • The OR technology has made some improvements since 1997, however, it has not yet delivered on the promises. • Overall, the relational version of the BUCKY database outperforms the OR versions.

  13. Results (II) – Calculated OR efficiency index per query

  14. Relational vs. Object-Oriented DBs [Ali, 2003/4] • Based on the assumption: • An Object DBMS can implement a relational design (analogous to writing C style programmes using C++ compiler) • Implemented the OO7 database schema using Lambda-DB, an OODBMS: • Realizing a relational design using ODL and OQL • Mapping the UML schema directly using ODL and OQL • Implemented few traversals from the OO7 benchmark on both DBs. • Compared and contrasted elapsed time of executing traversals against both DBs. • Results are according to our expectations: • OO version of the database outperforms the relational one by 3 to 45 times.

  15. Results (III) • To our astonishment – T1 took 15 days, 15 hours and 34 minutes to complete over the relational version of the database: • Of course we did not run T1 for 15 days. • Actually we aborted the program several times and executed it for a smaller dataset. • Then estimated the elapsed time for the entire dataset. • We realized that indexes were not created on foreign keys: • The same query took about 10 minutes. • This re-confirms the importance of basic query optimization principles e.g adequate use of index

  16. Results (III) …

  17. RDBs have overcome some of the performance issues by clever optimization techniques. ORDBs are predominantly relying on the relational core. For example, Oracle implements nested tables with explicit storage outside the containing objects. Requires expensive unnesting and pointer de-referencing. ORDBs may need to stand on their own footing rather than on their predecessors - RDBs. An implementation of ORDBMS from scratch may reveal a clearer picture whether OR is better alternative to relational and OO. There is lack of database design methodology based on OR technology. All major relational products come with OR features. So the way forward might be to use OR features only when there are real benefits for the database application otherwise keep using the relational core. Our study # 2 has thrown some light on the circumstances in which the use of OR features might be beneficial. Conclusions (I): Relational databases (RDBs) vs. OR databases (ORDBs)

  18. Why the performance of OR technology is not according to our expectations? Conclusions (I): Relational databases (RDBs) vs. OR databases (ORDBs) …

  19. Indexes on foreign keys make a big difference in RDBs. One of the main reason behind OODBs performance is that relationships are kept persistent within the object structures. Objects are stored as if they were in an object-network rather than scattered into several tables (as in the case of RDBs). RDBs have to re-construct the object-network in order to support the application’s demands. An OODB may not perform as expected if only an OODBMS is used but the design of the database is mainly relational i.e. an OO syntax alone does not yield gains in performance. Conclusions (II): Relational vs. OO databases (OODBs)

More Related