
Cache Design

These lecture slides cover the basic cache algorithm, cache search methods, and replacement strategies for a CPU data cache. They also explain direct-mapped, fully associative, and set-associative caches, along with write policies and dirty bits for write-back caches.


Presentation Transcript


  1. Cache Design (Handouts: Lecture Slides)

  2. Basic Cache Algorithm
  ON REFERENCE TO Mem[X]: look for X among the cache tags...
  • HIT: X = TAG(i), for some cache line i
    • READ: return DATA(i)
    • WRITE: change DATA(i); start write to Mem[X]
  • MISS: X not found in the TAG of any cache line
    • REPLACEMENT SELECTION: select some line k to hold Mem[X]
    • READ: read Mem[X]; set TAG(k) = X, DATA(k) = Mem[X]
    • WRITE: start write to Mem[X]; set TAG(k) = X, DATA(k) = new Mem[X]
  [Figure: CPU exchanging addresses and data with a tag/data cache backed by MAIN MEMORY holding Mem[A], Mem[B]; hit ratio a, miss ratio (1 - a)]
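The hit/miss flow above can be sketched as a toy Python model (not from the slides; the random replacement choice and write-through memory update are illustrative assumptions):

```python
import random

class Cache:
    """Toy model of the basic cache algorithm: tag search, then hit or miss + replace."""
    def __init__(self, num_lines, memory):
        self.memory = memory                  # backing main memory: address -> value
        self.tags = [None] * num_lines        # TAG(i) for each cache line
        self.data = [None] * num_lines        # DATA(i) for each cache line

    def read(self, x):
        if x in self.tags:                    # HIT: X = TAG(i) for some line i
            return self.data[self.tags.index(x)]
        k = random.randrange(len(self.tags))  # MISS: replacement selection (random here)
        self.tags[k] = x                      # set TAG(k) = X
        self.data[k] = self.memory[x]         # DATA(k) = Mem[X]
        return self.data[k]

    def write(self, x, value):
        self.memory[x] = value                # start write to Mem[X] (write-through here)
        if x in self.tags:                    # HIT: change DATA(i)
            self.data[self.tags.index(x)] = value
        else:                                 # MISS: select line k, fill with new Mem[X]
            k = random.randrange(len(self.tags))
            self.tags[k] = x
            self.data[k] = value
```

The tag array stands in for the parallel tag comparison; later slides refine how that search is limited or parallelized.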

  3. How do we search the cache?
  • We have to perform the search in parallel and/or limit the number of places in the cache where a particular address (block) can be found.
  • Direct-mapped cache: a block can be in only one place in the cache.
  • Fully associative cache: a block can be anywhere in the cache; we search all cache lines in parallel.
  • Set-associative cache: a block can be in a few (usually 2 to 8) places in the cache, which are searched in parallel.
  • A set is a collection of cache locations in which a given memory block may be placed.

  4. Direct-Mapped Cache
  The address is split into a t-bit tag, a k-bit index, and a b-bit block offset. Each set is one line, and the cache has 2^k lines (4 sets in the figure), each holding a valid bit V, a tag, and a data block of 2^b words. The index selects a line, the stored tag is compared (=) against the address tag to produce HIT, and the block offset selects the data word or byte.
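The tag/index/offset split can be shown in a few lines of Python (the bit widths in the example are arbitrary illustrative values):

```python
def split_address(addr, k, b):
    """Split an address into (tag, index, block offset) for a direct-mapped cache
    with 2**k lines and 2**b words per block."""
    offset = addr & ((1 << b) - 1)        # low b bits select the word in the block
    index = (addr >> b) & ((1 << k) - 1)  # next k bits select the cache line
    tag = addr >> (b + k)                 # remaining high bits are stored and compared
    return tag, index, offset

# 6-bit address 100 01 1 with k = 2 index bits and b = 1 offset bit
assert split_address(0b100011, 2, 1) == (0b100, 0b01, 0b1)
```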

  5. Fully Associative Cache (to minimize collisions)
  Here the set is the whole cache. Each line holds a valid bit V, a t-bit tag, and a data block. The address tag is compared (=) against every line's tag simultaneously; a match on any line signals a HIT, and the block offset selects the data word or byte.
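A fully associative lookup compares the address tag against every line (a sketch; hardware performs all comparisons simultaneously, while this loop does them one at a time):

```python
def lookup_fully_assoc(lines, addr, b):
    """lines: list of (valid, tag, block) tuples.
    Everything above the block offset is the tag: there are no index bits,
    because the set is the whole cache."""
    offset = addr & ((1 << b) - 1)
    tag = addr >> b
    for valid, line_tag, block in lines:  # hardware: parallel tag comparators
        if valid and line_tag == tag:
            return block[offset]          # HIT: select word within the block
    return None                           # MISS

lines = [(1, 0b1011, [16, 23]), (0, 0, None)]
assert lookup_fully_assoc(lines, 0b10111, 1) == 23
```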

  6. 2-Way Set-Associative Cache (to reduce overhead)
  Each set consists of 2 lines, one in Way A and one in Way B (4 sets in the figure). The address is split into tag, index, and block offset as before; the index selects a set, the address tag is compared (=) against both ways' tags in parallel, and a match in either way signals a HIT. The block offset then selects the data word or byte.
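A minimal 2-way set-associative lookup, checking both ways of the selected set (conceptually in parallel), might look like this; the data layout is an illustrative assumption:

```python
def lookup_2way(cache, addr, k, b):
    """cache[index] is a list of two ways; each way is a (valid, tag, block) tuple.
    Returns the addressed word on a HIT, or None on a MISS."""
    offset = addr & ((1 << b) - 1)
    index = (addr >> b) & ((1 << k) - 1)
    tag = addr >> (b + k)
    for valid, way_tag, block in cache[index]:  # compare both ways' tags
        if valid and way_tag == tag:            # HIT in this way
            return block[offset]
    return None                                 # MISS

# 2 sets (k = 1), 2-word blocks (b = 1); set 1 holds tag 10 with words [41, 42]
cache = [
    [(0, 0, None), (0, 0, None)],
    [(1, 0b10, [41, 42]), (0, 0, None)],
]
assert lookup_2way(cache, 0b1011, 1, 1) == 42
```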

  7. ISSUE: Replacement Strategy
  Associativity implies choices...
  • Direct-mapped: compare the address with only one tag in the cache; location A can be stored in exactly one cache line.
  • N-way set-associative: compare the address with N tags in the cache simultaneously; location A can be stored in exactly one set, but in any of the N cache lines belonging to that set.
  • Fully associative: compare the address with each tag simultaneously; location A can be stored in any cache line.

  8. Replacement Strategies
  • LRU (least-recently used): keeps the most-recently used locations in cache. For each set we need to keep an ordered list of N items: O(N log2 N) "LRU bits" per set, plus complex logic.
  • Cheaper options: first-in first-out (FIFO), random.
  LRU example: a 4-way set-associative cache has 4 lines in each set. Focus on set i, listing ways from most- to least-recently used:
    (A,B,C,D) Hit C  → (C,A,B,D)
    (C,A,B,D) Hit C  → (C,A,B,D)
    (C,A,B,D) Miss   → (D,C,A,B)   a new line is brought into the cache, replacing the LRU line at way D, and is used by the processor
    (D,C,A,B) Hit B  → (B,D,C,A)
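The recency-ordered list in the example can be maintained directly (a sketch; real hardware encodes the ordering in LRU bits per set, not a list):

```python
def lru_touch(order, way):
    """Move `way` to the most-recently-used (front) position."""
    order.remove(way)
    order.insert(0, way)

def lru_victim(order):
    """The least-recently-used way sits at the back of the list."""
    return order[-1]

order = ['A', 'B', 'C', 'D']   # MRU ... LRU
lru_touch(order, 'C')          # Hit C
assert order == ['C', 'A', 'B', 'D']
victim = lru_victim(order)     # Miss: replace the LRU line at way D
assert victim == 'D'
lru_touch(order, victim)       # the new line in way D becomes MRU
assert order == ['D', 'C', 'A', 'B']
lru_touch(order, 'B')          # Hit B
assert order == ['B', 'D', 'C', 'A']
```

The asserted states reproduce the slide's sequence step by step.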

  9. Direct-Mapped Cache Example (slightly different from L18, Slide 17)
  2^6 words in main memory, word addressed; 4 cache lines (00-11), 2 words per block (Word 0, Word 1). Address format: tag | index | block offset.
    Operation    Data  Behavior
    R 100 00 0    37   MISS; line 00 ← tag 100, words 37 38
    R 100 00 1    38   HIT
    R 100 01 1    42   MISS; line 01 ← tag 100, words 41 42
    R 010 11 0    33   MISS; line 11 ← tag 010, words 33 46
    R 101 11 1    23   MISS; line 11 ← tag 101, words 16 23
    R 010 11 0    33   MISS because of collision; line 11 ← tag 010, words 33 46
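The access sequence in this example can be replayed with a small simulator (a sketch tracking only tags, which is all the hit/miss behavior depends on):

```python
def run_direct_mapped(trace):
    """trace: list of (tag, index, offset) reads against a 4-line direct-mapped cache.
    Returns 'HIT'/'MISS' for each access."""
    lines = {}                      # index -> tag currently cached in that line
    results = []
    for tag, index, offset in trace:
        if lines.get(index) == tag:
            results.append('HIT')
        else:
            results.append('MISS')  # includes collision misses: the old tag is evicted
            lines[index] = tag
    return results

trace = [('100', '00', 0), ('100', '00', 1), ('100', '01', 1),
         ('010', '11', 0), ('101', '11', 1), ('010', '11', 0)]
assert run_direct_mapped(trace) == ['MISS', 'HIT', 'MISS', 'MISS', 'MISS', 'MISS']
```

The final access misses even though tag 010 was cached earlier: tag 101 evicted it from line 11, which is exactly the collision the slide points out.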

  10. 2-Way Set-Associative Cache with LRU Example
  Address format: tag | index (1 bit, selecting set 0 or set 1) | block offset. LRU lists are most-recently-used first.
    Operation     Data  Behavior  Set 0 LRU       Set 1 LRU
    R 1000 0 0     37   MISS      (B,A) → (A,B)   (B,A)
    R 1000 0 1     38   HIT       (A,B)           (B,A)
    R 1000 1 1     42   MISS      (A,B)           (B,A) → (A,B)
    R 0101 1 0     33   MISS      (A,B)           (A,B) → (B,A)
    R 1011 1 1     23   MISS      (A,B)           (B,A) → (A,B)
    R 0101 1 0     33   HIT       (A,B)           (A,B) → (B,A)
  Final contents: Way A holds tag 1000 (words 37 38) in set 0 and tag 1011 (words 16 23) in set 1; Way B holds tag 0101 (words 33 46) in set 1. Tag 1000 (words 41 42) was brought into set 1 and later evicted.
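Replaying the same style of trace against a 2-way set-associative cache with LRU shows the final access now hitting (a sketch mirroring the table above; the initial LRU orders follow the example):

```python
def run_2way_lru(trace):
    """trace: list of (tag, set_index). Each set has ways 'A' and 'B'; the LRU list
    is kept most-recently-used first. Returns 'HIT'/'MISS' per access."""
    tags = {0: {'A': None, 'B': None}, 1: {'A': None, 'B': None}}
    lru = {0: ['B', 'A'], 1: ['B', 'A']}  # initial order from the example
    results = []
    for tag, s in trace:
        hit_way = next((w for w in 'AB' if tags[s][w] == tag), None)
        if hit_way:
            results.append('HIT')
            lru[s].remove(hit_way)
            lru[s].insert(0, hit_way)     # hit way becomes MRU
        else:
            results.append('MISS')
            victim = lru[s][-1]           # replace the LRU way
            tags[s][victim] = tag
            lru[s].remove(victim)
            lru[s].insert(0, victim)      # filled way becomes MRU
    return results

trace = [('1000', 0), ('1000', 0), ('1000', 1),
         ('0101', 1), ('1011', 1), ('0101', 1)]
assert run_2way_lru(trace) == ['MISS', 'HIT', 'MISS', 'MISS', 'MISS', 'HIT']
```

Unlike the direct-mapped run, tag 0101 survives the arrival of tag 1011 because the set has two ways, so the last access is a HIT.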

  11. Valid Bits
  Each cache line carries a valid bit V alongside its tag and data block.
  • The valid bit must be 1 for a cache line to HIT.
  • At power-up or reset, we set all valid bits to 0.
  • A line's valid bit is set to 1 when the line is first replaced (filled).
  • The cache is flushed by setting all valid bits to 0, under external program control.
  [Figure: lines holding <A>/<A+1> and <B>/<B+1> with V = 1, other lines with V = 0, between the processor and memory]

  12. Write Policy
  What happens when we want to write data into a particular address? Some possibilities are:
  • Write-through: writes go to main memory and the cache.
  • Write-back: write the cache; write main memory only when the block is replaced.
  [Figure: processor exchanging addresses and data with the cache, which sits in front of main memory]

  13. Write-Back
  ON REFERENCE TO Mem[X]: look for X among the cache tags...
  • HIT: X = TAG(i), for some cache line i
    • READ: return DATA(i)
    • WRITE: change DATA(i) (the write to Mem[X] is deferred until replacement)
  • MISS: X not found in the TAG of any cache line
    • REPLACEMENT SELECTION: select some line k to hold Mem[X]; write back: write DATA(k) to Mem[TAG[k]]
    • READ: read Mem[X]; set TAG[k] = X, DATA[k] = Mem[X]
    • WRITE: set TAG[k] = X, DATA[k] = new Mem[X] (the write to Mem[X] is deferred)

  14. Dirty Bits for Write-Back Caches
  Dirty bits signify data that has been modified in the cache but not in main memory. Each line now carries a dirty bit D in addition to V, the tag, and the data block.
  In the figure, the line holding <A>/<A+1> has D = 1 and the line holding <B>/<B+1> has D = 0. When the line corresponding to A is replaced, its data block has to be written to main memory since its dirty bit is set; there is no need to write back the line corresponding to B.

  15. Write-Back with "Dirty" Bits
  ON REFERENCE TO Mem[X]: look for X among the cache tags...
  • HIT: X = TAG(i), for some cache line i
    • READ: return DATA(i)
    • WRITE: change DATA(i); set D[i] = 1 (the write to Mem[X] is deferred)
  • MISS: X not found in the TAG of any cache line
    • REPLACEMENT SELECTION: select some line k to hold Mem[X]; if D[k] == 1 (write back), write DATA(k) to Mem[TAG[k]]
    • READ: read Mem[X]; set TAG[k] = X, DATA[k] = Mem[X], D[k] = 0
    • WRITE: set TAG[k] = X, DATA[k] = new Mem[X], D[k] = 1
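The write-back algorithm with dirty bits can be modeled in a few lines (a toy, fully associative model with one word per block and LRU replacement; the class and field names are illustrative):

```python
class WriteBackCache:
    """Toy write-back cache with dirty bits: one word per block, LRU over all lines."""
    def __init__(self, num_lines, memory):
        self.memory = memory      # main memory: address -> value
        self.lines = []           # list of [tag, data, dirty], MRU first
        self.num_lines = num_lines

    def _find(self, x):
        for line in self.lines:
            if line[0] == x:      # HIT: move line to MRU position
                self.lines.remove(line)
                self.lines.insert(0, line)
                return line
        return None

    def _replace(self, x):
        if len(self.lines) == self.num_lines:
            tag, data, dirty = self.lines.pop()  # evict the LRU line k
            if dirty:                            # D[k] == 1: write back
                self.memory[tag] = data          # write DATA(k) to Mem[TAG[k]]
        line = [x, None, False]
        self.lines.insert(0, line)
        return line

    def read(self, x):
        line = self._find(x)
        if line is None:                         # MISS
            line = self._replace(x)
            line[1], line[2] = self.memory[x], False  # DATA[k] = Mem[X], D[k] = 0
        return line[1]

    def write(self, x, value):
        line = self._find(x) or self._replace(x)
        line[1], line[2] = value, True           # change DATA, set D = 1; Mem[X] deferred
```

A written value stays only in the cache (dirty) until its line is evicted, at which point it is flushed to memory; that is exactly the behavior slide 14 illustrates with lines A and B.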

  16. Caches in the Beta
  Pipeline stages IF, RF, ALU, MEM, WB, with the cache between the pipeline and MAIN MEMORY.
  Problem: memory access times (in IF and MEM) limit the Beta clock speed.
  Solution: use a cache for both instruction fetches and data accesses: assume the HIT time when setting the pipeline clock period, and STALL the pipe on misses.

  17. Memory Hierarchy in Modern Processors
  Levels between the processor P and disk: registers, L1, L2, main memory, DISK.
    Level         Access time  Capacity  Block size  Associativity
    Registers     0.5 clk      4 KB      32 B        explicitly managed (compiler)
    L1            1 clk        64 KB     64 B        2-way set associative
    L2            10 clks      4 MB      128 B       direct mapped
    Main memory   100 clks     4 GB      4-16 KB     fully associative
    DISK          10^6 clks    1 TB

  18. Next Time: Communication Technology (Dilbert: S. Adams)
