
But... That’s three wishes in one!!!


Presentation Transcript


  1. Mommy, mommy! I want a hardware cache with few conflicts and low power consumption that is easy to implement! But... That’s three wishes in one!!! Uppsala Architecture Research Team

  2. Refinement and Evaluation of the Elbow Cache, or: The Little Cache That Could. Mathias Spjuth, Uppsala Architecture Research Team

  3. Conflicts in a 2-way Set Associative Cache (diagram): blocks A-H in the address space map into the cache's sets; as the reference stream A-B-C-D-E-F-G-H is played out, several blocks index the same set and evict each other.
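
The animated build-up above is easier to follow in code. Below is a minimal Python sketch (my own illustration, with made-up addresses, 4 sets and LRU within each set, none of which come from the slides): all eight blocks happen to index only two of the sets, so early blocks are evicted and a later re-reference to A misses even though the cache could hold all eight blocks.

```python
from collections import OrderedDict

NUM_SETS, WAYS = 4, 2

def set_index(addr):
    return addr % NUM_SETS              # conventional modulo indexing

cache = [OrderedDict() for _ in range(NUM_SETS)]  # per-set blocks, in LRU order

def access(name, addr):
    s = cache[set_index(addr)]
    if addr in s:                       # hit: refresh the LRU position
        s.move_to_end(addr)
        return "hit"
    if len(s) >= WAYS:                  # set full: evict the least recently used block
        s.popitem(last=False)
    s[addr] = name
    return "miss"

# Blocks A..H with addresses that all fall into sets 0 and 1.
refs = [("A", 0), ("B", 1), ("C", 4), ("D", 5),
        ("E", 8), ("F", 9), ("G", 12), ("H", 13)]
for name, addr in refs:
    access(name, addr)                  # cold misses; A..D get pushed out by E..H

print(access("A", 0))                   # "miss": A was lost to a conflict
```

Sets 2 and 3 never receive a block, which is exactly the waste that the skewed and elbow organisations on the following slides try to avoid.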

  4. Conflicts (cont.) The traditional way of reducing conflicts is to use set-associative caches. ++ Lower miss rate (than direct-mapped) -- Slower access -- More complexity (uses more chip area) -- Higher power consumption

  5. 2-way Skewed Associative Cache (diagram): the same address space blocks A-H are now indexed into Cache Bank 1 and Cache Bank 2 by different functions as the reference stream A-B-C-D-E-F-G-H is played out.

  6. 2-way Skewed Associative Cache (diagram, cont.): after the full reference stream A-B-C-D-E-F-G-H, all eight blocks have found a place across the two banks. No conflicts!

  7. Skewed associative caches Use different hashing (skewing) functions for indexing each cache bank. ++ Lower miss rate (than set-assoc.) ++ More predictable -- Slightly slower (hashing) -- ”Cannot” use LRU replacement -- ”Cannot” use VI-PT (virtually indexed, physically tagged) lookup
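
The slides do not give the skewing functions themselves, so the sketch below is only illustrative: a common XOR-based scheme mixes a few tag bits into the index, with a different mix per bank, so blocks that collide in one bank are usually spread out in the other.

```python
NUM_SETS = 4                    # sets per bank (illustrative size)
INDEX_BITS = 2                  # log2(NUM_SETS)

def index_bank1(block_addr):
    return block_addr % NUM_SETS                    # plain modulo indexing

def index_bank2(block_addr):
    low = block_addr % NUM_SETS                     # the usual index bits
    high = (block_addr >> INDEX_BITS) % NUM_SETS    # a few bits from the tag
    return low ^ high                               # mix tag bits into the index

# These four addresses all collide in bank 1 (index 0) but spread out in bank 2.
for addr in (0, 4, 8, 12):
    print(addr, index_bank1(addr), index_bank2(addr))
```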

  8. Elbow Cache Improves the performance of a skewed associative cache by reallocating blocks within the cache. By doing so we get a broader choice of which block to evict as the victim. Uses timestamps as the replacement metric.

  9. Finding the victim Two methods: • Look-ahead: Consider all possible placements before the first reallocation is made. • Feedback: Only consider the immediate placements, then iterate.

  10. 2-way Elbow Lookahead Cache (diagram): after the reference stream A-B-C-D-E-F-G-H-X, the new block X is accommodated by reallocating resident blocks along the replacement paths F-B-A and E-D-H.

  11. 2-way Elbow Feedback Cache (diagram): the same reference stream A-B-C-D-E-F-G-H-X, but each displaced block passes through a temporary register before being reallocated into the other bank.

  12. Finding the victim (cont.) Look-ahead: ++ Closest to optimal -- Difficult to implement (>1 transformation) Feedback: ++ Easy to implement (feed the victim back to the write buffer) -- Needs extra space in the write buffer
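
To make slides 9-12 concrete, here is a rough Python sketch of the feedback flavour (my own simplification, with assumed hash functions and a non-wrapping counter, not the authors' hardware): the incoming block takes one of its two candidate slots, and each displaced block is fed back toward its slot in the other bank, for at most seven moves; whatever is still displaced at the end is evicted.

```python
MAX_STEPS = 7                        # slide 17: up to 7 transformations per replacement
NUM_SETS = 4
bank = [dict(), dict()]              # per bank: set index -> (tag, timestamp)
clock = 0                            # global allocation counter (wrap-around ignored here)

def index(b, tag):
    # Assumed skewing functions, one per bank (not the authors' exact hashes).
    return tag % NUM_SETS if b == 0 else (tag ^ (tag >> 2)) % NUM_SETS

def allocate(tag):
    """Insert `tag` (assumed not already cached); return the evicted block, if any."""
    global clock
    incoming = (tag, clock)
    clock += 1
    # Initial choice between the two candidate slots: prefer a free one,
    # otherwise displace the block with the oldest timestamp.
    slots = [(b, index(b, tag)) for b in (0, 1)]
    free = [(b, i) for b, i in slots if i not in bank[b]]
    if free:
        b, i = free[0]
    else:
        b, i = min(slots, key=lambda s: bank[s[0]][s[1]][1])
    for _ in range(MAX_STEPS + 1):
        displaced = bank[b].get(i)
        bank[b][i] = incoming
        if displaced is None:
            return None              # placed without evicting anything
        incoming = displaced         # feed the displaced block back ...
        b = 1 - b                    # ... toward its slot in the other bank
        i = index(b, incoming[0])
    return incoming                  # still displaced after the step limit: evict it
```

The look-ahead flavour would instead inspect all of these candidate positions first (at most one transformation and four possible victims, as slide 16 states) and pick the oldest block on any path before moving anything.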

  13. Replacement Metrics • Enhanced-Not-Recently-Used (NRUE): • The best replacement policy known so far for skewed caches. • Each block carries two extra bits, a recently-used and a very-recently-used bit, that are set on access to the block. • These bits are cleared at regular intervals; the very-recently-used bit is cleared more often. • First, try to find a victim with no bit set. • Then one with only the recently-used bit set. • Then fall back to random replacement.
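
A small sketch of how I read that selection order (illustrative only; the periodic clearing of the bits is left out):

```python
import random

def nrue_victim(candidates):
    """candidates: list of (block, recently_used, very_recently_used) tuples."""
    no_bits = [c for c in candidates if not c[1] and not c[2]]
    if no_bits:                       # 1) prefer a block with no usage bit set
        return random.choice(no_bits)
    ru_only = [c for c in candidates if c[1] and not c[2]]
    if ru_only:                       # 2) then one with only the recently-used bit set
        return random.choice(ru_only)
    return random.choice(candidates)  # 3) otherwise fall back to random replacement

print(nrue_victim([("A", True, True), ("B", True, False), ("C", True, True)]))
# -> ('B', True, False): the only candidate not used very recently
```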

  14. Timestamps A global counter is incremented on every cache allocation, and each block stores the counter value from when it was allocated (its timestamp TA, TB, ...) alongside its tag and data. The age of block A is then: Dist(A) = Tcurr - TA if Tcurr >= TA, and Dist(A) = Tmax - TA + Tcurr if Tcurr < TA (the counter has wrapped).
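
In code the two-case formula is just subtraction modulo the counter range. A minimal sketch (my own, using the 5-bit timestamp width from slide 19; reading Tmax as the counter range rather than its maximum value only shifts the wrapped distance by one and does not change which block looks older):

```python
T_BITS = 5
T_MAX = 1 << T_BITS                  # the counter counts 0 .. T_MAX-1, then wraps

def dist(t_curr, t_a):
    if t_curr >= t_a:
        return t_curr - t_a          # no wrap since block A was allocated
    return T_MAX - t_a + t_curr      # the counter wrapped past its maximum

print(dist(10, 3))   # 7: block A was allocated 7 allocations ago
print(dist(2, 29))   # 5: the counter wrapped; A is 5 allocations old
```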

  15. Timestamps (diagram): timestamps placed on a number line from 0 to Tmax [ticks]. If Dist(A) < Dist(B), then B is older than A; if Dist(A) > Dist(B), then A is older than B.

  16. Implementation Lookahead: • At most one transformation (4 possible victims) each replacement. • Do the transformation and load the new data at the same time.

  17. Implementation Feedback: • Up to 7 transformations (max. 8 possible victims) each replacement. • Temporary victims are moved to the write buffer before reallocation. • Extra control field in the write buffer.

  18. Feedback (diagram): datapath of the feedback implementation. Bank I and Bank II each store Data+Tag plus a timestamp (TmSt) per block; on a replacement, displaced blocks are read out into the write buffer, which holds their timestamps and a step/control field, and are then either reallocated into the other bank or written back to memory.

  19. Test Configurations • Set associative: 2-way, 4-way, 8-way, 16-way • Fully associative cache • Skewed associative, LRU • Skewed associative, NRUE • Skewed associative, 5-bit timestamp • Elbow cache, 1-step lookahead, 5-bit timestamp • Elbow cache, 7-step feedback, 5-bit timestamp

  20. Test Configurations (2) General configuration: • 8 KB, 16 KB, 32 KB cache size • L1 data cache with 32-byte block size • Write back, no-allocate-on-write, and an infinite write buffer (all writes ignored) Miss Rate Reduction (MRR): MRR = (MRref - MR) / MRref
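
With made-up numbers: if the reference configuration misses 10% of accesses (MRref = 0.10) and the evaluated cache misses 7% (MR = 0.07), then MRR = (0.10 - 0.07) / 0.10 = 0.30, i.e. a 30% miss rate reduction.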

  21. (Results: miss rate reduction for the tested configurations)

  22. (Results: miss rate reduction for the tested configurations, cont.)

  23. Conclusions • For a 2-way skewed cache, timestamp replacement gives almost the same performance as LRU. • Timestamps are useful. • A 2-way elbow cache has roughly the same performance as an 8-way set associative cache of the same size.

  24. Conclusions (2) • The lookahead design is slightly better than the feedback design. • There are drawbacks with all skewed caches (skewing delays, VI-PT). • If these problems can be solved, the elbow cache is a good alternative to set associative caches.

  25. Future Work Power awareness: How does an elbow cache stand up against traditional set associative caches when power consumption is considered?

  26. Links UART web: www.it.uu.se/research/group/uart/

  27. ? Uppsala Architecture Research Team
