From A to E: Analyzing TPC’s OLTP Benchmarks
The obsolete, the ubiquitous, the unknown
Pınar Tözün, Ippokratis Pandis*, Cansu Kaynak, Djordje Jevdjic, Anastasia Ailamaki
École Polytechnique Fédérale de Lausanne, *IBM Almaden Research Center
OLTP Benchmarks of TPC
[Timeline, 1985–2015: TPC-A (1989) and TPC-B (1990), banking; TPC-C (1992), wholesale supplier; TPC-E (2007), brokerage house]
• TPC-A, TPC-B: Obsolete
• TPC-C: Ubiquitous – most common
  • Allows fair product comparisons
  • Drives innovations for better performance
• TPC-E: Unknown – results from one DBMS vendor
How is TPC-E different?
• Hardware – micro-architectural behavior: under-utilization due to instruction stalls; fewer cache misses and higher IPC
• Storage manager – where does time go? Harder to partition requests; logical lock contention
• Workload – characteristics/statistics: more page re-use; complex schema & transactions; longer held locks
Outline
• Preview
• Setup & Methodology
• Micro-architectural behavior
• Within the storage manager
• Conclusions
Methodology
• Shore-MT*: scalable open-source storage manager
• Shore-Kits*: application layer for Shore-MT; workloads: TPC-B, TPC-C, TPC-E, ++
• Micro-architectural analysis: Xeon X5660 with VTune, Niagara T2 with cputrack; measured at peak throughput
• Storage manager profiling: Niagara T2 with dtrace
*https://sites.google.com/site/shoremt
Outline
• Preview
• Setup & Methodology
• Micro-architectural behavior
• Within the storage manager
• Conclusions
IPC on Fat & Lean Cores
[Charts: IPC of each workload on Intel Xeon X5660 and Sun Niagara T2, with the maximum IPC of each core marked]
• OLTP utilizes lean cores better
• TPC-E has higher IPC
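For readers less familiar with the metric: IPC (instructions per cycle) is retired instructions divided by elapsed core cycles over the measurement window. A minimal C++ sketch of that computation, with illustrative counter values rather than the paper's measurements:

```cpp
// IPC = retired instructions / elapsed core cycles over the measurement window.
// Counter values below are illustrative, not measurements from the paper.
#include <cstdint>
#include <iostream>

struct CounterSample {
    uint64_t retired_instructions;
    uint64_t cpu_cycles;
};

double ipc(const CounterSample& start, const CounterSample& end) {
    const uint64_t instr  = end.retired_instructions - start.retired_instructions;
    const uint64_t cycles = end.cpu_cycles - start.cpu_cycles;
    return cycles == 0 ? 0.0 : static_cast<double>(instr) / cycles;
}

int main() {
    CounterSample start{0, 0};
    CounterSample end{5'000'000, 10'000'000};
    std::cout << "IPC = " << ipc(start, end) << '\n';  // prints IPC = 0.5
}
```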
Execution Cycles and Stalls
[Chart: execution cycle breakdown on Intel Xeon X5660]
• More than half of execution time goes to stalls
• Instruction stalls are the main problem
Cache Misses
[Charts: cache misses on Intel Xeon X5660 (32KB L1-I & 32KB L1-D) and Sun Niagara T2 (16KB L1-I & 8KB L1-D)]
• L1-I misses dominate
• TPC-E has a lower data miss ratio (MPKI)
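MPKI (misses per kilo-instruction) normalizes miss counts by the amount of work done, which is what makes the comparison across benchmarks fair. A small sketch of the computation, with made-up counter totals:

```cpp
// MPKI = 1000 * cache misses / retired instructions.
// The counter totals in main() are made up for illustration.
#include <cstdint>
#include <iostream>

double mpki(uint64_t misses, uint64_t retired_instructions) {
    return retired_instructions == 0
               ? 0.0
               : 1000.0 * static_cast<double>(misses) / retired_instructions;
}

int main() {
    std::cout << "L1-I MPKI = " << mpki(80'000, 1'000'000) << '\n';  // 80
    std::cout << "L1-D MPKI = " << mpki(15'000, 1'000'000) << '\n';  // 15
}
```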
Why does TPC-E have a lower miss ratio?
[Chart: averages per transaction]
• More scans in TPC-E
• Increased page reuse
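One way to quantify page reuse is the ratio of total page accesses to distinct pages touched within a transaction; range scans revisit the same pages and push this ratio up. A hypothetical sketch of such a counter (not the paper's actual instrumentation):

```cpp
// Page reuse ratio = total page accesses / distinct pages touched.
// The access trace is hypothetical; real numbers come from storage manager
// instrumentation, not from this function.
#include <cstdint>
#include <iostream>
#include <unordered_set>
#include <vector>

double page_reuse(const std::vector<uint64_t>& accessed_page_ids) {
    if (accessed_page_ids.empty()) return 0.0;
    const std::unordered_set<uint64_t> distinct(accessed_page_ids.begin(),
                                                accessed_page_ids.end());
    return static_cast<double>(accessed_page_ids.size()) / distinct.size();
}

int main() {
    // A scan that revisits pages 1, 2, and 3 yields a reuse ratio of 2.0.
    std::cout << page_reuse({1, 2, 3, 1, 2, 3}) << '\n';
}
```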
Outline
• Preview
• Setup & Methodology
• Micro-architectural behavior
• Within the storage manager
• Conclusions
From A to E: Schema
[Schema overview for TPC-B, TPC-C, and TPC-E: fixed, scaling, and growing tables; scaling units are branch (TPC-B), warehouse (TPC-C), and customer (TPC-E)]
• Increasing schema complexity
From A to E: Transactions
• More complexity & variety in transaction mix
• Harder to partition
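To illustrate why partitioning gets harder, here is a hedged sketch (not Shore-Kits code): a TPC-C request carries a single home-warehouse id that can serve as a routing key, whereas a TPC-E request may name a customer, an account, and a security, so no single field maps it cleanly to one partition. The request structs below are hypothetical simplifications.

```cpp
// Sketch only: hash-routing requests to logical partitions.
// The request structs are hypothetical simplifications of the benchmarks'
// transaction inputs, not Shore-Kits types.
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iostream>

struct TpccRequest {
    int32_t warehouse_id;   // every TPC-C transaction has a home warehouse
};

struct TpceRequest {        // e.g., Trade-Order touches several entities
    int64_t customer_id;
    int64_t account_id;
    int64_t security_id;
};

// TPC-C: one obvious routing key.
std::size_t route(const TpccRequest& r, std::size_t num_partitions) {
    return std::hash<int32_t>{}(r.warehouse_id) % num_partitions;
}

// TPC-E: whichever key we pick, the other entities may live elsewhere,
// so transactions frequently cross partition boundaries.
std::size_t route(const TpceRequest& r, std::size_t num_partitions) {
    return std::hash<int64_t>{}(r.customer_id) % num_partitions;
}

int main() {
    std::cout << "TPC-C partition: " << route(TpccRequest{42}, 4) << '\n';
    std::cout << "TPC-E partition: " << route(TpceRequest{7, 7000, 500}, 4) << '\n';
}
```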
Within the Storage Manager
[Charts: time breakdown inside the storage manager on Sun Niagara T2, 64 HW contexts; configurations: SF 64 – 0.6GB (spread), SF 64 – 8.2GB (spread), SF 1 – 20GB (no-spread)]
• Lock manager is the main bottleneck for TPC-E
Inside the Lock Manager
[Charts: lock manager breakdown for the same configurations: SF 64 – 0.6GB (spread), SF 64 – 8.2GB (spread), SF 1 – 20GB (no-spread)]
• Logical contention even for a large DB
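The contention here is logical, not physical: even when the data set is large, many concurrent transactions request locks on the same few hot records, so they queue on the same lock-table entries. A minimal sketch of that effect, assuming a simplified hash-based lock table rather than Shore-MT's actual lock manager:

```cpp
// Simplified illustration, not Shore-MT's lock manager: a lock table keyed by
// record id. Database size is irrelevant; if most requests target the same few
// record ids (hotspots), they all serialize on the same entries.
#include <cstdint>
#include <iostream>
#include <mutex>
#include <unordered_map>

class LockTable {
public:
    // Returns the logical lock guarding a record; a hot record hands back the
    // same mutex to every caller, which is where the queuing happens.
    std::mutex& lock_for(uint64_t record_id) {
        std::lock_guard<std::mutex> guard(table_latch_);
        return locks_[record_id];
    }

private:
    std::mutex table_latch_;                          // protects the map itself
    std::unordered_map<uint64_t, std::mutex> locks_;  // one lock per record id
};

int main() {
    LockTable table;
    std::lock_guard<std::mutex> held(table.lock_for(42));  // a hypothetical hot record
    std::cout << "holding logical lock on record 42\n";
}
```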
Conclusions
• Modern hardware is still highly under-utilized
• TPC-E: fewer misses, less stall time, higher IPC
• OLTP utilizes less aggressive cores better
• Instruction footprint is too large to fit in L1-I
• Spread instructions, (software-guided) prefetching (see the sketch after this list)
• Code/compiler optimizations
• Logical lock contention due to hotspots
• Increased complexity in schema and transactions
• TPC-E: harder to physically partition
• Logical partitioning, OCC
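On the prefetching point above: compilers such as GCC and Clang expose __builtin_prefetch, which lets code hint upcoming addresses into the cache. A hedged sketch of the idea on a hypothetical linked structure (the data layout is not Shore-MT's):

```cpp
// Software-guided prefetching sketch. The Record chain is hypothetical; the
// point is only the pattern: issue the prefetch one step ahead of the access.
#include <iostream>

struct Record {
    long payload;
    Record* next;
};

long sum_chain(const Record* head) {
    long total = 0;
    for (const Record* r = head; r != nullptr; r = r->next) {
        if (r->next != nullptr) {
            // GCC/Clang builtin: hint the next node into cache (read, moderate locality).
            __builtin_prefetch(r->next, /*rw=*/0, /*locality=*/2);
        }
        total += r->payload;
    }
    return total;
}

int main() {
    Record b{2, nullptr};
    Record a{1, &b};
    std::cout << sum_chain(&a) << '\n';  // prints 3
}
```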
The obsolete: TPC-B. The ubiquitous: TPC-C. The unexplored: TPC-E.
Also starring: Shore-MT, Xeon X5660, Niagara T2