An Analysis of Database Workload Performance on Simultaneous Multithreaded Processors
Jack L. Lo, Luiz André Barroso, Susan Eggers, Kourosh Gharachorloo, Henry Levy, Sujay Parekh
Motivation
• DBMS and scientific workloads behave differently
• DBMS workloads are intrinsically multithreaded
• DBMS workloads are memory intensive, which leads to low processor utilization
• Cache sharing under SMT could degrade memory performance
Objectives
• Characterize the memory-system behavior of database systems
• Evaluate the negative effects of cache sharing introduced by SMT, and try to eliminate them
• Evaluate SMT performance on DBMS workloads
Methodology
• SMT model
  • Based on an out-of-order, superscalar architecture
  • Each cycle, 8 instructions can be fetched from up to 2 of the 8 hardware contexts
  • Functional units: 6 integer, 4 floating point
  • 128 KB instruction cache + 128 KB data cache, 16 MB L2 cache
• Workloads
  • Oracle DBMS and Digital UNIX
  • On-line transaction processing (OLTP)
  • Decision support system (DSS)
Database Workload Characterization
• Three memory segments are accessed by the dominant processes:
  • Instruction text
  • Program Global Area (PGA)
  • Shared Global Area (SGA)
    • SGA buffer cache
    • SGA other
Memory Behavior
• High instruction miss rate for OLTP
• Large memory footprint
• High instruction/data reuse
• Replacement is too frequent
Multi-Thread Cache Interference
• Two types of interference
  • Destructive interference
    • One thread's data replaces another thread's data
    • More conflict misses
  • Constructive interference
    • Data loaded by one thread is used by another simultaneously scheduled thread
    • Fewer misses
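The two kinds of interference can be illustrated with a toy model. The sketch below is not the simulator used in the study: it assumes a tiny direct-mapped cache (4 lines of 64 bytes, both hypothetical values) shared by two threads whose accesses alternate, and the helper names `count_misses` and `interleave` are made up for this example.

```python
def count_misses(accesses, num_lines=4, line_size=64):
    """Count misses in a direct-mapped cache shared by all threads.

    `accesses` is a list of byte addresses in issue order.
    The cache geometry is a toy assumption, not the studied machine.
    """
    lines = [None] * num_lines          # block number held by each line
    misses = 0
    for addr in accesses:
        block = addr // line_size
        index = block % num_lines
        if lines[index] != block:       # empty line or a different block
            misses += 1
            lines[index] = block
    return misses


def interleave(a, b):
    """Alternate the accesses of two threads, as if co-scheduled."""
    return [addr for pair in zip(a, b) for addr in pair]


# Destructive: different blocks that map to the same cache line.
thread_a = [0, 0, 0, 0]                 # block 0 -> line 0
thread_b = [256, 256, 256, 256]         # block 4 -> line 0
destructive = count_misses(interleave(thread_a, thread_b))

# Baseline: the same two threads, each with the cache to itself.
alone = count_misses(thread_a) + count_misses(thread_b)

# Constructive: both threads read the same shared block.
shared = [0, 0, 0, 0]
constructive = count_misses(interleave(shared, shared))

print(destructive, alone, constructive)
```

In this toy setup the co-scheduled threads evict each other on every access, while the shared block is fetched once and then hits for both threads.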
Identifying the Source of Misses
• PGA misses are the dominant factor
  • Caused by destructive interference
Page-mapping Policies
• Affect L2 cache conflicts
• Two policies
  • Page coloring
    • Exploits spatial locality
  • Bin hopping
    • Exploits temporal locality
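The difference between the two policies can be sketched as follows. This is a simplified illustration, not an operating-system implementation: `NUM_BINS` is a hypothetical number of page-sized cache bins, and the sketch assumes page coloring picks the bin from the virtual page number while bin hopping hands out bins round-robin in page-fault order.

```python
from itertools import count

NUM_BINS = 4  # hypothetical number of page-sized bins in the L2 cache


def page_coloring(fault_order):
    """Map each virtual page to bin (virtual page number mod NUM_BINS).

    Pages that are adjacent in the address space land in different bins.
    """
    return {vpn: vpn % NUM_BINS for vpn in fault_order}


def bin_hopping(fault_order):
    """Assign bins round-robin in the order pages are first touched.

    Pages that are touched close together in time land in different bins.
    """
    mapping = {}
    next_bin = count()
    for vpn in fault_order:
        if vpn not in mapping:
            mapping[vpn] = next(next_bin) % NUM_BINS
    return mapping


# Pages touched back to back but spaced NUM_BINS apart in the address space.
strided = [0, 4, 8, 12]
coloring_bins = sorted(page_coloring(strided).values())
hopping_bins = sorted(bin_hopping(strided).values())

# Pages that are contiguous in the address space.
contiguous = [0, 1, 2, 3]
contiguous_bins = sorted(page_coloring(contiguous).values())

print(coloring_bins, hopping_bins, contiguous_bins)
```

With the strided pattern, coloring places every page in the same bin while bin hopping spreads them out; with contiguous pages, coloring already spreads them across all bins.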
Application-Level Offsetting
• Affects L1 cache conflicts
• Offsets the conflicting structures of different processes
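A rough sketch of why offsetting helps, under stated assumptions: the 128 KB data cache size comes from the methodology slide, but the 64-byte line size, the direct-mapped virtual indexing, the base address `PGA_BASE`, the 8 KB `OFFSET_STEP`, and the 4 KB hot region are all hypothetical values chosen only for illustration.

```python
L1_SIZE = 128 * 1024              # 128 KB data cache (from the methodology slide)
LINE_SIZE = 64                    # assumed line size
NUM_SETS = L1_SIZE // LINE_SIZE   # treated as direct-mapped for simplicity

PGA_BASE = 0x4000_0000            # hypothetical base shared by all processes
OFFSET_STEP = 8 * 1024            # hypothetical per-process offset


def l1_index(addr):
    """Cache set selected by a (virtual) address."""
    return (addr // LINE_SIZE) % NUM_SETS


def hot_indices(process_id, use_offset, hot_bytes=4096):
    """Sets touched by the hot part of one process's private structure."""
    base = PGA_BASE + (process_id * OFFSET_STEP if use_offset else 0)
    return {l1_index(base + off) for off in range(0, hot_bytes, LINE_SIZE)}


def overlapping_sets(num_processes, use_offset):
    """Count L1 sets claimed by more than one process's hot region."""
    seen, shared = set(), set()
    for pid in range(num_processes):
        idx = hot_indices(pid, use_offset)
        shared |= seen & idx      # sets already claimed by an earlier process
        seen |= idx
    return len(shared)


without_offset = overlapping_sets(8, use_offset=False)
with_offset = overlapping_sets(8, use_offset=True)
print(without_offset, with_offset)
```

Without the offset, all eight hot regions compete for the same 64 sets; with it, each region maps to its own part of the cache and no set is contested.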
SMT Performance on DBMS Workloads
• SMT is highly effective at tolerating the high miss rates
Conclusions
• While database workloads have large footprints, there is substantial reuse that results in a small, cacheable "critical" working set
• Additional data cache conflicts caused by SMT can be nearly eliminated
• SMT's latency tolerance is highly effective for database applications