Memory System Characterization of Commercial Workloads
L.A. Barroso, K. Gharachorloo and E. Bugnion
Western Research Laboratory, Digital Equipment Corporation
Presented by: Eric Carty-Fickes
Introduction
• commercial workloads are a larger market than engineering workloads
• but most studies were still using scientific benchmarks (in 1998)
• commercial benchmarks are difficult to create
  • large, expensive, proprietary, changing
• the paper uses commercial workloads to study current trends
Database Workloads
• the first two workloads (OLTP and DSS) run on an Oracle DB server
• OLTP
  • small read/write queries on part of the DB
  • models banking requests in dedicated mode
  • more kernel time; hides I/O
• DSS (decision support systems)
  • long read-only queries on much of the DB
  • models a wholesaler's SQL queries
  • fewer context switches
Database Workloads
• Web Index Search
  • doesn't require a DB server
  • multiple threads hide misses
  • read-only requests and cached recent searches
Test Systems
• 4-processor AlphaServer 4100 and 8-processor AlphaServer 8400 for hardware testing
• IPROBE tool for event counting
• DCPI for profiling
• ATOM for studying Oracle
• SimOS for testing architectural changes
  • models the Alpha 21164
  • simplified, but still with some detail
Aspects of Testing
• 3 issues: memory size, I/O bandwidth, runtime
  • scale down the DB
  • change block buffer cache sizes
• OLTP and DSS: need to warm up the SGA (Oracle's System Global Area) before testing; need to scale the DB so that it is memory resident
• Web Index: no scaling, same system
Hardware Results
• OLTP: higher CPI, maybe due to TPC-B
  • long secondary cache latency
  • lots of primary cache misses, especially in the Icache
  • dirty miss latency is significant; lots of communication
• DSS: lower CPI means this configuration works
  • only suggestion is larger first-level caches
• AltaVista: uses the 8400, just like the original
  • good CPI, well-written code
  • first-level caches are important
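The CPI comparisons above come from hardware event counts. As a rough illustration of how such metrics are derived, here is a minimal sketch; the function names and all counter values are hypothetical and are not taken from the paper or from the IPROBE tool.

```python
def cpi(cycles, instructions):
    """Cycles per instruction from two raw event counts."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def misses_per_kilo_instruction(misses, instructions):
    """Cache misses per 1000 instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions


def miss_stall_cpi(misses, instructions, miss_latency_cycles):
    """Upper bound on the CPI contribution of one miss type,
    assuming every miss stalls for the full latency (no overlap)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * miss_latency_cycles / instructions


if __name__ == "__main__":
    # Hypothetical counter readings, for illustration only.
    cycles, instructions, bcache_misses = 7_000_000, 1_000_000, 50_000
    print("CPI:", cpi(cycles, instructions))
    print("misses per 1000 instructions:",
          misses_per_kilo_instruction(bcache_misses, instructions))
    print("stall CPI at 20-cycle latency:",
          miss_stall_cpi(bcache_misses, instructions, 20))
```

Splitting CPI into per-miss-type stall components in this way is one simple means of seeing why a long secondary cache latency or a high dirty miss latency can dominate a workload such as OLTP.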
Simulator Results
• simulator behaves like the hardware; some cache and consistency differences lead to different timing
• close cycle counts and miss rates
• OLTP: test associativity and Bcache size
• idle time increases when servers can't hide I/O
• lots of cache intricacies…
• bigger caches = fewer replacement and instruction misses; more important for OLTP than DSS
• bigger lines = more true sharing, fewer cold misses
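To make the cache size, associativity, and line size experiments concrete, the following is a minimal sketch of a single-processor set-associative cache model with LRU replacement. It is not SimOS and does not reflect the paper's methodology: the class name, parameters, and the tiny address traces are all hypothetical, and because there is only one processor it cannot show sharing (communication) misses, only cold and replacement misses.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy LRU set-associative cache that classifies misses as
    cold (first touch of a line) or replacement (line was evicted)."""

    def __init__(self, size_bytes, line_bytes, assoc):
        if size_bytes % (line_bytes * assoc) != 0:
            raise ValueError("size must be a multiple of line_bytes * assoc")
        self.line_bytes = line_bytes
        self.assoc = assoc
        self.num_sets = size_bytes // (line_bytes * assoc)
        # One OrderedDict per set; insertion order tracks LRU order.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.seen_lines = set()
        self.hits = 0
        self.cold_misses = 0
        self.replacement_misses = 0

    def access(self, addr):
        line = addr // self.line_bytes
        ways = self.sets[line % self.num_sets]
        if line in ways:
            ways.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return "hit"
        if line in self.seen_lines:
            self.replacement_misses += 1
            kind = "replacement"
        else:
            self.seen_lines.add(line)
            self.cold_misses += 1
            kind = "cold"
        if len(ways) >= self.assoc:
            ways.popitem(last=False)  # evict the least recently used line
        ways[line] = True
        return kind


def miss_rate(cache, trace):
    """Run a list of byte addresses through the cache; return misses / accesses."""
    for addr in trace:
        cache.access(addr)
    misses = cache.cold_misses + cache.replacement_misses
    return misses / (cache.hits + misses)


if __name__ == "__main__":
    conflict_trace = [0, 128, 0, 128]  # two lines that collide when direct mapped
    for assoc in (1, 2):
        cache = SetAssociativeCache(size_bytes=128, line_bytes=64, assoc=assoc)
        print("assoc", assoc, "miss rate", miss_rate(cache, conflict_trace))

    sequential_trace = [0, 32, 64, 96]  # larger lines turn cold misses into hits
    for line_bytes in (32, 64):
        cache = SetAssociativeCache(size_bytes=256, line_bytes=line_bytes, assoc=1)
        miss_rate(cache, sequential_trace)
        print("line", line_bytes, "cold misses", cache.cold_misses)
```

The two toy traces only demonstrate the direction of the effects listed above (higher associativity and larger caches remove replacement misses, larger lines remove cold misses); real workloads need full traces or a full-system simulator.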
Conclusions
• scaled OLTP and DSS give a decent estimate of real performance
• fairly narrow range of architectural issues explored
• more processes per processor = less exposed I/O latency, fewer dirty misses
• simulators gloss over important details for ease of use (timing, OS, etc.)
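The point about more processes per processor hiding I/O can be illustrated with a standard textbook approximation that is not taken from the paper: if each process independently waits on I/O for a fraction p of the time, the processor is idle only when all n processes are waiting at once, giving a utilization of 1 - p^n. The function name and the 80% wait fraction below are assumptions chosen for illustration.

```python
def cpu_utilization(io_wait_fraction, processes):
    """Textbook multiprogramming model: utilization = 1 - p**n.

    Assumes processes wait on I/O independently, which real
    database server processes only approximate.
    """
    if not 0.0 <= io_wait_fraction <= 1.0:
        raise ValueError("io_wait_fraction must be in [0, 1]")
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return 1.0 - io_wait_fraction ** processes


if __name__ == "__main__":
    # Hypothetical: each server process waits on I/O 80% of the time.
    for n in (1, 2, 4, 8):
        print(n, "processes per processor ->",
              round(cpu_utilization(0.8, n), 3))
```

Under this model, idle time shrinks quickly as processes are added, which matches the direction of the idle-time observation in the simulator results; it says nothing about the cache effects of the extra context switches.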
Questions
• Can you get enough information by scaling down the DB and playing tricks with block buffer sizes?