
Cores vs. Caches

Cores vs. Caches. CS 838 Project, Matt Ramsay & Chris Feucht. Motivation: As feature sizes shrink, additional hardware can be placed on chip. Various trade-offs result. Among these, for a CMP, is how many cores and how much cache to place on each chip.


Presentation Transcript


  1. Cores vs. Caches CS 838 Project Matt Ramsay & Chris Feucht

  2. Motivation • As feature sizes shrink, additional hardware can be placed on chip • Various trade-offs result • Among these, for a CMP, is how many cores and how much cache to place on each chip • Our project results suggest an optimal configuration for a 16-processor system running web-based applications

  3. Outline • Motivation • Experiments Performed • Simulator Environment • Results • Project Shortcomings • Future Work • Conclusions & Summary

  4. Experiments • Intended experiments were not performed due to simulator limitations • Intended experiments: each core equivalent to 0.5 MB of L2 cache • Ran apache_8, oltp_2, zeus_8
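The trade-off behind the intended experiments can be sketched as follows. This is a minimal illustration, not the authors' setup: the slides give only the 0.5 MB-per-core equivalence and the 16-processor total. The per-chip area budget (`chip_budget_mb`), the power-of-two core counts, and the names `Config` and `enumerate_configs` are assumptions introduced here. The 10 MB budget in the usage example is an illustrative value picked so that a 16-core chip is left with 2 MB of L2.

```python
from dataclasses import dataclass
from typing import List

CORE_AREA_IN_MB_L2 = 0.5  # from the slides: one core is equivalent to 0.5 MB of L2
TOTAL_PROCESSORS = 16     # from the slides: 16-processor system


@dataclass(frozen=True)
class Config:
    cores_per_chip: int
    chips: int
    l2_per_chip_mb: float
    total_l2_mb: float


def enumerate_configs(chip_budget_mb: float) -> List[Config]:
    """List core/cache splits for a fixed per-chip area budget.

    chip_budget_mb is a HYPOTHETICAL budget, expressed in MB-of-L2
    equivalents; the slides do not state one. Core counts are restricted
    to powers of two (also an assumption) so they divide 16 evenly.
    """
    configs = []
    cores = 1
    while cores <= TOTAL_PROCESSORS:
        # Area not spent on cores is spent on L2 cache.
        l2_per_chip = chip_budget_mb - cores * CORE_AREA_IN_MB_L2
        if l2_per_chip > 0:
            chips = TOTAL_PROCESSORS // cores
            configs.append(Config(cores, chips, l2_per_chip, l2_per_chip * chips))
        cores *= 2
    return configs


if __name__ == "__main__":
    # Illustrative budget only: 16 cores (8 MB equivalent) plus 2 MB of L2.
    for cfg in enumerate_configs(10.0):
        print(cfg)
```

Under this assumed budget, moving processors onto fewer chips shrinks the system-wide L2 quickly, because each added core displaces cache on its chip and there are fewer chips contributing cache.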

  5. Simulator Environment • All nodes include 32 KB, 2-way L1 I & D caches • Each node has its own L2 bank, regardless of L2 size or associativity • All other Ruby and Opal settings left at default

  6. Results - Apache

  7. Results - OLTP

  8. Results - Zeus

  9. Project Shortcomings & Future Work • Longer runs needed for convincing data • Test different numbers of processors per system • Add an L3 cache to the memory hierarchy

  10. Conclusions • CPI (and thus IPC) changes little in a 16-processor system as the number of cores per chip varies • This holds despite rapid growth in system-wide L2 cache as chips are added • Best performance per cost is with all 16 processors on one chip • Even with 2 MB total L2 • Would be helped by an off-chip L3

  11. Project Summary [figure; labels: "We look here!", "50 miles"]
