230 likes | 241 Views
This study evaluates the performance of TPC-D benchmark and datawarehouses using Tetris algorithm and UB-Tree clustering methods. Tests were conducted on Compaq Proliant 5000 with Oracle RDBMS.
E N D
MISTRALPerformance of TPC-D Benchmark and Datawarehouses Prof. R. Bayer, Ph.D. Dr. Volker Markl Dept. of Computer Science, Technical University Munich and Bavarian Research Center for Knowledgebased Systems (FORWISS)
Test Bed for Performance Measurements • Hardware • Compaq Proliant 5000 • 4 Pentium II 200 MHz • 512 MB RAM • hard disk: 7 * 4 GB = 28 GB • Operating System • Windows NT 4.0 • RDBMS • Oracle 8 • 8kB pages • Access Methods • Tetris Algorithm for UB-Trees • Oracle IOT (clustering B*-Tree) • Oracle FTS (full table scan)
Shipping Priority Query (Q3) SELECT L_ORDERKEY, SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE, O_ORDERDATE, O_SHIPPRIORITYFROM CUSTOMER, ORDER, LINEITEMWHERE C_MKTSEGMENT = 'FOOD' AND C_CUSTKEY = O_CUSTKEY AND L_ORDERKEY = O_ORDERKEY AND O_ORDERDATE < DATE 1.5.98 AND L_SHIPDATE > DATE 1.6.98GROUP BY L_ORDERKEY, O_ORDERDATE, O_SHIPPRIORITYORDER BY REVENUE DESC, O_ORDERDATE
Forecasting Revenue Change Query (Q6) SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUEFROM LINEITEMWHERE L_SHIPDATE >= [date] AND L_SHIPDATE <= [date] + INVERVAL 1 YEAR ANDL_DISCOUNT BETWEEN [discount] -0.01 AND [discount] + 0.01 AND L_QUANTITY < [quantity]
Performance Measurements GFK • DBMS • TransBase (covering, clustering compound B*-Trees) • UB/API on top of TransBase (UB-Tree, two ESQL Statements are optimized and processed per UB-Tree page access) • TransBase Hypercube (UB-Tree inside the DBMS Kernel) • Database • real world data warehouse from GFK • 3D Snowflake Schema • Time (3 years = 18 MP) • Segment (10500 outlets) • Product (~ 500000 items in 604 product groups) • 42 Mio fact tuples (~ 4 GB fact table size) • Computer • Sun ULTRA 1 Workstation (64 MB Main Memory)
Indexes • MHC to encode hierarchies: • TIME_CS (5 bits) • SEGMENT_CS (24 bits) • PRODUCT_CS (29 bits) • Compound on (PRODUCT_CS, TIME_CS, SEGMENT_CS) or(TIME_CS, SEGMENT_CS, PRODUCT_CS) • UB-Tree (UB/API) on{TIME_CS, PRODUCT_CS, SEGMENT_CS}
GFK Datawarehouse Reports selectivity << 1%
Clustering of UB-Trees Ø = 0.85 s / dc clustering factor
Summary UB-Tree • Excellent performance on large real DBs, > factor 10 • Very low storage requirement • 1st answer extremely fast, interactive use!! • Response time proportional to size of answer • Wide applicability: all DBs are multidimensional!! • Easy integration into DBMS, simple DDL extension • Very useful as middleware • Patent applications