
Chip Multi Processors are becoming mainstream; cache data access is a major bottleneck in CMPs

Presentation Transcript


  1. Software Systems Lab, Department of Electrical Engineering, Technion. Introduction
     • Chip Multi Processors (CMPs) are becoming mainstream
     • Cache data access is a major bottleneck in CMPs
     • NUCA (Non Uniform Cache Access): the physical distances on a chip are becoming relevant
     • The Nahalal architecture improves the hit rate and shortens data access time
     • How can we improve Nahalal's performance without increasing the cache sizes?
     • How can we calculate the working set?

  2. Nahalal Architecture
     • Proposes a layout with one shared cache that is physically close to all the CPUs
     • A small set of cache lines accounts for a significant portion of memory accesses (the 80/20 rule)

  3. Dynamic cache allocation
     • There is a trade-off between a cache's size and its hit time
     • By allocating the cache sizes dynamically we can overcome this trade-off
     • Theoretical potential speed-up compared to the original Nahalal
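To make the idea of dynamic allocation concrete, here is a minimal sketch of one possible policy. It is an illustration only, not necessarily the policy used in the presentation: it assumes the cache is divided into equal units (called "ways" here) and hands them out in proportion to each CPU's estimated working set size. The function name `allocate_ways` and its parameters are hypothetical.

```python
def allocate_ways(working_sets, total_ways):
    """Split total_ways among CPUs in proportion to their working set sizes.

    Hypothetical policy for illustration: a CPU with an empty working set
    (for example, an idle CPU) receives no capacity.
    """
    total = sum(working_sets)
    if total == 0:
        return [0] * len(working_sets)

    # Ideal (fractional) share of each CPU.
    shares = [ws * total_ways / total for ws in working_sets]

    # Round down first, then give the leftover ways to the CPUs with the
    # largest fractional remainders (largest-remainder method).
    alloc = [int(s) for s in shares]
    leftover = total_ways - sum(alloc)
    by_remainder = sorted(
        range(len(shares)), key=lambda i: shares[i] - alloc[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


# Two busy CPUs and two idle ones sharing 16 ways.
print(allocate_ways([300, 100, 0, 0], 16))
```

With two of the four CPUs idle, all of the capacity goes to the two active CPUs, which is the situation in which a dynamic scheme has the most room to help.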

  4. Working set calculation
     • The working set signature is an n-bit vector formed by mapping working set elements into n buckets using a randomizing hash function.
     • The bit vector is cleared at the beginning of every interval.
     • Given the fraction of the signature that is filled, the working set size can be estimated from a closed-form relation [equation not preserved in the transcript].
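The relation itself is not reproduced in the transcript. As a hedged sketch, assume the hash spreads elements uniformly over the n buckets: after K distinct elements the expected filled fraction is f = 1 - (1 - 1/n)^K, which inverts to K = log(1 - f) / log(1 - 1/n). The code below builds a signature for one interval and applies that estimate. The choice of BLAKE2b as the randomizing hash, the 64-byte line size, and the function names are assumptions made for illustration.

```python
import hashlib
import math


def build_signature(addresses, n_bits, line_bytes=64):
    """Build an n_bits-wide working set signature for one interval."""
    signature = [False] * n_bits  # cleared at the start of every interval
    for addr in addresses:
        line = addr // line_bytes  # working set element: one cache line
        digest = hashlib.blake2b(line.to_bytes(8, "little"), digest_size=8).digest()
        bucket = int.from_bytes(digest, "little") % n_bits
        signature[bucket] = True
    return signature


def estimate_working_set(signature):
    """Estimate the number of distinct elements from the filled fraction."""
    n = len(signature)
    f = sum(signature) / n
    if f >= 1.0:
        return math.inf  # a saturated signature carries no size information
    return math.log(1.0 - f) / math.log(1.0 - 1.0 / n)


# 200 distinct cache lines, each touched five times, hashed into 1024 bits.
trace = [line * 64 for line in range(200)] * 5
sig = build_signature(trace, n_bits=1024)
print(round(estimate_working_set(sig)))
```

Repeated accesses to the same line set the same bit, so the estimate tracks the number of distinct lines rather than the number of accesses. The estimate degrades as the signature fills up, which is why n should be chosen well above the expected working set size.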

  5. JSO algorithm

  6. Results and Conclusions
     • It is possible to improve cache performance by dynamically allocating the shared cache
     • The working set signature is sufficiently accurate to estimate the real working set size
     • Dynamic cache allocation is most efficient when some CPUs are idle
     [Figure: miss rate (smaller is better)]
