Fast Cache and Bus Power Estimation for Parameterized System-on-a-Chip Design

This paper presents a fast cache and bus power estimation approach for parameterized system-on-a-chip (SoC) design, enabling efficient exploration of power/performance tradeoffs. The approach first collects intermediate data through simulation, then uses equations to predict cache power rapidly, and couples this with a fast bus estimation technique. In the reported experiments it is more than 100x faster than full simulation, with an average power error of about 2%.

Presentation Transcript


  1. Fast Cache and Bus Power Estimation for Parameterized System-on-a-Chip Design
     Tony D. Givargis & Frank Vahid, Department of Computer Science, University of California, Riverside, CA 92521, {givargis,vahid}@cs.ucr.edu
     Jörg Henkel, C&C Research Laboratories, NEC USA, 4 Independence Way, Princeton, NJ 08540, henkel@ccrl.nj.nec.com
     This research was supported in part by a DAC scholarship and an NSF grant.
     University of California, Riverside & NEC USA

  2. Introduction
     • Systems-on-a-chip (SoC) era
       • increased chip capacity
       • parameterizable core-based system design
     • Large power/performance tradeoffs are possible just by varying bus/cache parameter values [givargis99]
     • But simulation-based cache/bus power evaluation is slow

  3. Introduction
     • We present a two-step approach for fast cache power evaluation
       • collect intermediate data using simulation
       • use equations to rapidly predict power
     • Couple with a fast bus estimation approach
     • Our approach
       • is orders of magnitude faster than simulation
       • yields good accuracy

  4. Target Architecture
     [Block diagram with components: CPU, I-Cache, D-Cache, Memory, Bus A, Bus B, Bridge, Peripheral Bus, Peripheral 1, Peripheral 2, ..., Peripheral n]

  5. Focus on Cache/Bus Parameters
     [Block diagram with components: CPU, I-Cache, D-Cache, Memory, Bus A, Bus B, Bridge, Peripheral Bus, Peripheral 1, Peripheral 2, ..., Peripheral n]
     [Chart: power dissipation breakdown in a digital camera example]

  6. Cache Parameters
     [Block diagram with components: CPU, I-Cache, D-Cache, Memory, Bus A, Bus B, Bridge, Peripheral Bus, Peripheral 1, Peripheral 2, ..., Peripheral n]

  7. Cache Parameters
     [Diagram: address divided into Tag, Index, and Offset fields; two V/T/D entry columns, each with a comparator (==); a Mux producing Data]
     • Line Size
     • Associativity
     • Cache Size
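
The relationship between the three cache parameters on this slide and the Tag/Index/Offset address fields can be sketched in a few lines of Python. This is only an illustration: the function name `split_address` and the example address are hypothetical, and the sketch assumes byte addressing with power-of-two sizes, which the slide does not state.

```python
def split_address(addr, cache_size, line_size, assoc):
    """Split a byte address into (tag, index, offset).

    Assumption (not from the slides): cache_size, line_size, and assoc are
    powers of two, sizes are in bytes, and cache_size >= line_size * assoc.
    """
    num_sets = cache_size // (line_size * assoc)
    offset_bits = line_size.bit_length() - 1   # log2(line_size)
    index_bits = num_sets.bit_length() - 1     # log2(num_sets)
    offset = addr & (line_size - 1)            # byte within the line
    index = (addr >> offset_bits) & (num_sets - 1)  # which set
    tag = addr >> (offset_bits + index_bits)   # remaining high-order bits
    return tag, index, offset


if __name__ == "__main__":
    # 1 KB cache, 16-byte lines, 2-way: 32 sets, 4 offset bits, 5 index bits.
    print(split_address(0x1234, cache_size=1024, line_size=16, assoc=2))
```

Doubling the associativity at a fixed cache size halves the number of sets, so one bit moves from the index field to the tag; doubling the line size moves one bit from the index field to the offset.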

  8. Bus Parameters
     [Block diagram with components: CPU, I-Cache, D-Cache, Memory, Bus A, Bus B, Bridge, Peripheral Bus, Peripheral 1, Peripheral 2, ..., Peripheral n]

  9. Bus Parameters
     • Change Bus Width [givargis98]
     [Diagram: two Bus A/B configurations, each with Mux and Demux blocks, labeled C1 and C2, with C1 < C2]

  10. Bus Parameters
      • Change Data Representation (Bus Invert) [Stan95]
      • Reduce Bus Switching
      [Diagram: two Bus A/B configurations, each with Encoder and Decoder blocks; the second adds an invert_ctr line]

  11. Bus Parameters
      • Binary Encoding: 0 1 0 0 1 0 1 1 followed by 1 0 0 1 0 1 1 0, Hamming Dist = 6
      • Bus-Invert Encoding: 0 1 0 0 1 0 1 1 (inverted_ctr = 0) followed by 0 1 1 0 1 0 0 1 (inverted_ctr = 1), Hamming Dist = 3
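
The slide's example can be checked with a short Python sketch of bus-invert coding. The function names are hypothetical, and the decision rule used here (send whichever form toggles fewer lines, counting the invert control line itself) is one common way to state the scheme from [Stan95]. It is not taken from the slides. The sketch also assumes the first word goes out unmodified with the control line at 0.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_step(prev_lines, prev_inv, word, width):
    """Choose the (data lines, invert bit) to drive for the next word."""
    mask = (1 << width) - 1
    inverted = ~word & mask
    # Cost of each option = data-line toggles + invert-line toggle.
    plain_cost = hamming(prev_lines, word) + (prev_inv != 0)
    inv_cost = hamming(prev_lines, inverted) + (prev_inv != 1)
    if inv_cost < plain_cost:
        return inverted, 1
    return word, 0


def count_transitions(words, width, use_bus_invert):
    """Total line toggles needed to send words[1:] after words[0]."""
    lines, inv, total = words[0], 0, 0
    for word in words[1:]:
        if use_bus_invert:
            new_lines, new_inv = bus_invert_step(lines, inv, word, width)
        else:
            new_lines, new_inv = word, 0
        total += hamming(lines, new_lines) + (inv != new_inv)
        lines, inv = new_lines, new_inv
    return total


if __name__ == "__main__":
    words = [0b01001011, 0b10010110]  # the two words from the slide
    print(count_transitions(words, 8, use_bus_invert=False))  # binary
    print(count_transitions(words, 8, use_bus_invert=True))   # bus-invert
```

For the two words on the slide, binary encoding toggles 6 lines, while bus-invert sends the complement 0 1 1 0 1 0 0 1 and toggles 2 data lines plus the control line, for 3 in total.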

  12. Related Work
      • Important to explore various cache and bus parameters for best performance and power [Wilton96][Li98][givargis99]
        • large number of cache/bus configurations
        • need to estimate power/performance in constant time
      • Trace stripping [Wolf99], configuration ordering, single-pass simulation [Kirovski]

  13. Approach Overview
      • Given a trace of memory references
      • Cache parameters
        • Size (S)
        • Line/block size (L)
        • Associativity (A)
      • Compute # of misses (N)
      [Plot: # of misses (N) versus Size (S)]
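
To make the inputs and output of this step concrete, here is a minimal Python sketch that counts misses N for one (S, L, A) configuration by direct simulation of a trace. It is a stand-in for the kind of cache simulation the approach is meant to avoid repeating, not the authors' tool (the experiments use Dinero). The function name `count_misses` and the LRU replacement policy are assumptions, since the slides do not name a policy.

```python
from collections import OrderedDict


def count_misses(trace, cache_size, line_size, assoc):
    """Count misses for a trace of byte addresses.

    Assumptions (not from the slides): LRU replacement, power-of-two
    parameters, and every reference treated alike (no read/write split).
    """
    num_sets = cache_size // (line_size * assoc)
    # One OrderedDict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_sets
        tag = block // num_sets
        ways = sets[index]
        if tag in ways:
            ways.move_to_end(tag)          # hit: mark most recently used
        else:
            misses += 1
            if len(ways) >= assoc:
                ways.popitem(last=False)   # evict least recently used
            ways[tag] = True
    return misses


if __name__ == "__main__":
    trace = [0, 4, 8, 16, 0, 64, 128, 0]
    for assoc in (2, 4):
        print(assoc, count_misses(trace, cache_size=64, line_size=8, assoc=assoc))
```

Running one such simulation per configuration is what makes exhaustive exploration slow, which is the motivation for simulating only a few configurations and predicting the rest with equations.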

  14. Approach Overview
      • Capture improvements obtainable by:
        • changing line-size at small/large values of cache-size
        • changing associativity at small/large values of cache-size

  15. Approach Overview
      • Bus equation:
        • m items/second (denotes the traffic N on the bus)
        • n bits/item
        • k-bit-wide bus
        • binary encoding
        • random data assumption

  16. Approach Overview
      • Bus equation:
        • m items/second (denotes the traffic N on the bus)
        • n bits/item
        • k-bit-wide bus
        • bus-invert encoding
        • random data assumption
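
The equations themselves did not survive in this transcript. As a rough illustration of what the listed quantities allow, the following Python sketch estimates bus line transitions per second under the random data assumption. It is a standard textbook-style derivation, not the authors' equation. It assumes each n-bit item takes ceil(n/k) transfers on the k-bit bus and that successive bus words are independent and uniformly random. The function names are hypothetical; m, n, and k follow the slide.

```python
from math import comb


def transfers_per_second(m, n, k):
    """Bus transfers per second: each n-bit item needs ceil(n/k) transfers."""
    return m * ((n + k - 1) // k)


def binary_transitions_per_second(m, n, k):
    """Binary encoding: each of the k lines toggles with probability 1/2."""
    return transfers_per_second(m, n, k) * k / 2


def bus_invert_transitions_per_transfer(k):
    """Expected toggles per transfer over k data lines plus one invert line.

    With k + 1 lines in total, the encoder can always choose between d and
    (k + 1 - d) toggles, so the expected cost is E[min(d, k + 1 - d)].
    """
    lines = k + 1
    total = sum(min(i, lines - i) * comb(lines, i) for i in range(lines + 1))
    return total / 2 ** lines


def bus_invert_transitions_per_second(m, n, k):
    return transfers_per_second(m, n, k) * bus_invert_transitions_per_transfer(k)


if __name__ == "__main__":
    m, n, k = 1000, 32, 8   # hypothetical traffic: 1000 32-bit items/s, 8-bit bus
    print(binary_transitions_per_second(m, n, k))
    print(bus_invert_transitions_per_second(m, n, k))
```

For an 8-bit bus this gives about 3.27 expected transitions per transfer with bus-invert versus 4 with binary encoding. Turning a transition count into power would additionally require the line capacitance and supply voltage, which are not given in this transcript.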

  17. Experiments
      [Block diagram with components: CPU, I-Cache, D-Cache, Memory, Bus A, Bus B, Bridge, Peripheral Bus, Peripheral 1, Peripheral 2, ..., Peripheral n]
      • Cache parameters
        • size: 128, 256, 512, 1k, 2k, 4k, 8k, 16k, 32k
        • assoc: 2, 4, 8
        • line: 8, 16, 32
      • Bus parameters
        • width: 4, 8, 16, 32
        • code: binary/bus-invert
      • Analyzed 45K sets exhaustively
      • 3d-Image
      • CKey
      • MPEG
      • Diesel
      • 5kB to 230kB of C code

  18. Experiment Setup
      [Flow diagram with blocks: C Program, ISS, Trace Generator, Cache Simulator, Bus Simulator; labeled outputs: Performance, CPU Power, Memory Power, I/D Cache Power, combined (+) into Power]
      • Dinero [Edler, Hill]
      • CPU power [Tiwari96]

  19. Experiment Results
      • Diesel application’s performance
      • Blue (light-gray) is obtained using full simulation
      • Red (dark-gray) is obtained using our equations
      • 4% error, 320x faster

  20. Experiment Results
      • Diesel application’s energy consumption
      • Blue (light-gray) is obtained using full simulation
      • Red (dark-gray) is obtained using our equations
      • 2% error, 420x faster

  21. Experiment Results
      • CKey application’s performance
      • Blue (light-gray) is obtained using full simulation
      • Red (dark-gray) is obtained using our equations
      • 8% error, 125x faster

  22. Experiment Results
      • CKey application’s energy consumption
      • Blue (light-gray) is obtained using full simulation
      • Red (dark-gray) is obtained using our equations
      • 3% error, 125x faster

  23. Experiment Results
      [Charts: Time (hours); Power Error (%)]
      • 125 - 400x speedup
      • 1-18% absolute error (power & performance)
      • 2% average power error

  24. Conclusion
      • Presented a technique for rapidly estimating the power and performance of cache and bus sub-systems
        • orders of magnitude faster than exhaustive simulation
        • yields good accuracy
      • Enables exploration of parameters in parameterized system-on-a-chip architectures
