Simulating a $2M Commercial Server on a $2K PC
A.R. Alameldeen, M.M.K. Martin, C.J. Mauer, K.E. Moore, M. Xu, D.J. Sorin, M.D. Hill, D.A. Wood
Presented by: Derek Hower
Contributions
• Develop a cost- and time-efficient simulation methodology for multiprocessor systems
  • Tuned and scaled benchmarks
  • Dealing with variability
  • Extended timing simulator
Workload Tweaking
• Wisconsin Commercial Workload Suite
  • OLTP: On-Line Transaction Processing
  • SPECjbb: Java Middleware
  • Apache: Static Web Server
  • Slashcode: Dynamic Web Server
• Scaled to reduce memory and disk usage
• Tuned on an actual multiprocessor server to discover bottlenecks
Case Study: OLTP
• Based on TPC-C v3.0, using IBM DB2 V7.2 EEE
• Scaled to 3 sales districts per warehouse, 30 customers per district, and 100 items per warehouse
  • Compared to 10, 30,000, and 100,000 required by TPC-C
• Set up on a Sun E5000
• Disk images were then moved to the simulator
Case Study: OLTP (cont.)
• Initial scaling
  • Reduced the entire simulation to fit in 1 GB of memory (ten 100 MB warehouses)
• Kernel/device tuning
  • Changed limits on semaphore usage, threads, locks, etc.
  • Database separated from the kernel and spread out over 5 physical disks
• Reducing contention
  • Increased the number of warehouses while keeping the database size constant
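The warehouse trade-off above is simple arithmetic: ten warehouses of 100 MB give a database of roughly 1 GB, and adding warehouses at a constant database size shrinks each one. A minimal sketch follows; the function name and the larger warehouse counts are illustrative assumptions, not figures from the slides.

```python
def per_warehouse_mb(total_db_mb, warehouses):
    """Size of each warehouse when the total database size is held constant."""
    if warehouses <= 0:
        raise ValueError("warehouses must be positive")
    return total_db_mb / warehouses


if __name__ == "__main__":
    total_db_mb = 10 * 100  # ten 100 MB warehouses, as in the initial scaling
    for n in (10, 100, 1000):  # 100 and 1000 are hypothetical counts
        print(n, "warehouses ->", per_warehouse_mb(total_db_mb, n), "MB each")
```

More, smaller warehouses spread the same number of users over more rows, which is one plausible reading of why this step reduces contention.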
Case Study: OLTP (cont.)
• Additional concurrency
  • Added more users
Simulation
• Shorten simulations as much as possible while still maintaining accuracy
• Start with warm workloads using snapshots
• Fix the simulation length by number of transactions completed
• Account for variability by introducing random memory access delays and by averaging multiple simulation runs
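The variability bullet can be illustrated with a small sketch. It does not use Simics: `run_once` is a hypothetical toy stand-in for one simulation run that executes a fixed number of transactions and adds a small random delay to every memory access. All names and latency figures are assumptions chosen for illustration. The point is only that each perturbed run yields a slightly different cycles-per-transaction value, so the reported result is the mean over several runs together with its spread.

```python
import random
import statistics


def run_once(seed, transactions=200, accesses_per_txn=50,
             base_latency=80, max_perturbation=4):
    """Toy stand-in for one simulation run (hypothetical, not Simics).

    Runs a fixed number of transactions and returns cycles per transaction.
    Each memory access costs base_latency plus a small random delay.
    """
    rng = random.Random(seed)  # one seed per run makes each run repeatable
    total_cycles = 0
    for _ in range(transactions * accesses_per_txn):
        total_cycles += base_latency + rng.randint(0, max_perturbation)
    return total_cycles / transactions


def summarize(results):
    """Return (mean, sample standard deviation) over several runs."""
    return statistics.mean(results), statistics.stdev(results)


if __name__ == "__main__":
    runs = [run_once(seed) for seed in range(10)]
    mean, stdev = summarize(runs)
    print(f"cycles/transaction: mean={mean:.1f}, stdev={stdev:.1f}")
```

In a real full-system simulation the small delays can change scheduling and lock-acquisition order, so the spread between runs can be far larger than in this toy model, which only sums latencies.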
Timing
• Added processor and memory timing models to Simics
• Timing-first simulation
• Memory model:
  • Cache coherence
  • Cache latencies and bandwidth
  • Memory
  • Interconnection network
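Timing-first simulation can be sketched as follows: a timing simulator executes each instruction itself, and a functional simulator then executes the same instruction and is treated as the reference. When the two disagree, the timing simulator reloads its architectural state from the functional one. The toy below is an illustration under stated assumptions, not the Simics interface: the three-instruction ISA, the names, and the deliberately unimplemented `mul` in the timing model are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Architectural state of the toy machine: four registers."""
    regs: list = field(default_factory=lambda: [0] * 4)


def functional_step(state, instr):
    """Reference (functional) execution; always correct in this toy."""
    op, dst, a, b = instr
    if op == "li":        # load immediate: 'a' is the value, 'b' is unused
        state.regs[dst] = a
    elif op == "add":
        state.regs[dst] = state.regs[a] + state.regs[b]
    elif op == "mul":
        state.regs[dst] = state.regs[a] * state.regs[b]


def timing_step(state, instr):
    """Timing-model execution; returns the cycles charged."""
    op, dst, a, b = instr
    if op == "li":
        state.regs[dst] = a
        return 1
    if op == "add":
        state.regs[dst] = state.regs[a] + state.regs[b]
        return 1
    return 3  # 'mul' is timed but not computed (hypothetical gap)


def run(program):
    """Run timing first, check against the functional model, repair on mismatch."""
    timing, functional = State(), State()
    cycles = mismatches = 0
    for instr in program:
        cycles += timing_step(timing, instr)
        functional_step(functional, instr)
        if timing.regs != functional.regs:
            timing.regs = list(functional.regs)  # reload from the reference
            mismatches += 1
    return timing.regs, cycles, mismatches


if __name__ == "__main__":
    program = [("li", 0, 6, 0), ("li", 1, 7, 0),
               ("mul", 2, 0, 1), ("add", 3, 2, 0)]
    print(run(program))
```

The design choice this illustrates is that the timing model does not have to implement every instruction correctly: gaps cost a state reload rather than a wrong result.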
Evaluation
• Simulated a system using the Bandwidth Adaptive Snooping Hybrid (BASH) coherence protocol
Thoughts
• Validation
  • Mentioned briefly, but the issue was skirted
  • Can we trust the data?
• Is there a loss of generality when scaling and tuning workloads?