
Pre-Silicon Simulation of Multi-Core Benchmarks

Shubu Mukherjee, Principal Engineer; Director, SPEARS Group, Intel Corporation. Panel in Symposium on Workload Characterization, Sep 27, 2007.


Presentation Transcript


  1. Pre-Silicon Simulation of Multi-Core Benchmarks
     Shubu Mukherjee, Principal Engineer; Director, SPEARS Group, Intel Corporation
     Panel in Symposium on Workload Characterization, Sep 27, 2007

  2. Detailed Model Good for Core Analysis
     • A single-core simulation model executes ~12 milliseconds of a real machine’s execution
       • Assumes core model speed = 1 KIPS (kilo simulated instructions per second)
       • Assumes each simulation run takes about 10 hours
     [Figure labels: Socket, Core, Uncore]
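The ~12 ms figure follows from the two stated assumptions if the real machine is taken to retire about 3 billion instructions per second (for example, 3 GHz at one instruction per cycle). That native rate is an assumption made here for illustration, not a number from the slide. A minimal sketch of the arithmetic:

```python
# Simulated instructions = simulator speed x wall-clock run time.
SIM_SPEED_IPS = 1_000          # 1 KIPS, from the slide
RUN_SECONDS = 10 * 3600        # 10-hour simulation run, from the slide
NATIVE_IPS = 3_000_000_000     # ASSUMPTION: ~3 GHz at IPC 1, not from the slide


def covered_real_time(sim_ips, run_seconds, native_ips):
    """Seconds of real-machine execution covered by one simulation run."""
    simulated_instructions = sim_ips * run_seconds
    return simulated_instructions / native_ips


if __name__ == "__main__":
    ms = covered_real_time(SIM_SPEED_IPS, RUN_SECONDS, NATIVE_IPS) * 1e3
    print(f"{ms:.1f} ms of real execution per run")
```

Under these numbers a run simulates 36 million instructions. A faster or slower native machine changes the covered time proportionally.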

  3. Four-Socket Platform Model Too Slow
     • A 1-socket simulation model executes ~1-3 milliseconds of a real machine’s execution
     • A 4-socket simulation model executes only 100s of microseconds of a real machine’s execution (recall that disk latency is in milliseconds)
     Need at least a 10x Boost in Platform Performance Model Speed

  4. What Does a 10x Speed Improvement Give Us?
     • Improved Accuracy
       • Via greater coverage of benchmark slices
       • Better glassjaw analysis
     • Faster Turnaround
       • Improved latency
       • Faster debugging
     • Improved Benchmarking
       • Greater coverage of benchmarks
       • Enables multithreaded (cooperative) benchmarks

  5. Approaches to Boost Simulation Speed (one key charter for SPEARS)
     • Improve Basic Infrastructure
     • Create Faster Core Models That are Less Accurate
     • Go Parallel in a Modular Fashion
     • Use Accelerators, such as FPGAs

  6. What’s Novel Here?
     • Parallel Simulation is an Old Technology
       • Distributed, discrete-event simulation, Fujimoto, 1990
       • Wisconsin Wind Tunnel I + II, Reinhardt, et al. 1992 & Mukherjee, et al. 1997
       • Customized for specific applications (e.g., shared memory)
     So, What Are the Challenges?
     • Starting point is several million lines of non-parallel C++ code (!)
     • This is production software, so it must be stable (unlike “research” software)
     • Parallel infrastructure must be modular, built once, and used repeatedly without changing any architecture model code
     • Deal with new problems: load imbalance at multiple levels
     Current Status: Created infrastructure, Work-In-Progress
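Load imbalance matters because, in any scheme where simulation threads wait for one another at synchronization points, each interval takes as long as its slowest thread. A minimal sketch of that bound, ignoring synchronization overhead; the per-thread work figures are made-up illustrative values, not measurements from the talk:

```python
def speedup_bound(work_per_thread):
    """Upper bound on speedup when every thread waits for the slowest one.

    Serial time is the sum of all work; parallel time is the largest
    single share. Synchronization overhead is ignored.
    """
    return sum(work_per_thread) / max(work_per_thread)


if __name__ == "__main__":
    balanced = [10, 10, 10, 10]    # hypothetical work units per socket thread
    imbalanced = [25, 5, 5, 5]     # same total work, one busy socket
    print(speedup_bound(balanced))
    print(speedup_bound(imbalanced))
```

With four evenly loaded threads the bound is 4x; with one thread carrying most of the work it falls to 1.6x, even though the total work is unchanged.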

  7. Speedup of the Pthread-per-Socket Model (on Clovertowns)
     • Speedup scales linearly with problem size
     • A lot more room for improvement exists
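One way to picture a thread-per-socket organization is sketched below: each socket model runs unmodified in its own thread, and a barrier keeps the threads within one quantum of simulated time of each other. This is an illustration only, not the SPEARS infrastructure. `SocketModel`, `run_parallel`, and the quantum size are hypothetical names and values, and Python threads stand in for pthreads (CPython's global interpreter lock means this shows the structure, not the speedup).

```python
import threading


class SocketModel:
    """Hypothetical stand-in for an existing, single-threaded socket model."""

    def __init__(self, socket_id):
        self.socket_id = socket_id
        self.cycle = 0

    def step(self, cycles):
        # A real model would simulate the cores and uncore here.
        self.cycle += cycles


def run_parallel(models, total_cycles, quantum):
    """Advance every model in its own thread, syncing after each quantum."""
    barrier = threading.Barrier(len(models))

    def worker(model):
        done = 0
        while done < total_cycles:
            step = min(quantum, total_cycles - done)
            model.step(step)
            done += step
            barrier.wait()  # no thread runs ahead by more than one quantum

    threads = [threading.Thread(target=worker, args=(m,)) for m in models]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    sockets = [SocketModel(i) for i in range(4)]
    run_parallel(sockets, total_cycles=1000, quantum=100)
    print([s.cycle for s in sockets])
```

The harness never touches the inside of `SocketModel`, which mirrors the stated goal of adding parallelism without changing architecture model code. A larger quantum means fewer barrier waits but looser coupling between sockets.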
