Consider a direct-mapped cache with 4-word blocks and a total size of 8 blocks (32 words).
Reference sequence (word addresses): 6, 7, 8, 9, 80, 6, 7, 8, 9, 81
For each reference, find the Block Address, the Cache Address, and whether it is a Hit or a Miss.
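[Figure: direct-mapped cache organization. The 32-bit Address (bits 31 to 0) is divided into a 25-bit Tag, a 3-bit Index, a 2-bit Block Offset, and a 2-bit Byte Offset. Each of the 8 entries holds a valid bit (v), a Tag, and four 32-bit words (Word3 to Word0). The stored Tag is compared (=) with the address Tag to produce Hit, and a Mux selects the 32-bit Data word.]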
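The field split in the figure can be sketched in a few lines of Python. This is an illustration only: the function name `split_address` and the dictionary layout are assumptions, while the field widths (25-bit tag, 3-bit index, 2-bit block offset, 2-bit byte offset) come from the slide.

```python
# Field widths from the slide; names are hypothetical.
BYTE_OFFSET_BITS = 2   # byte within a 32-bit word
BLOCK_OFFSET_BITS = 2  # word within a 4-word block
INDEX_BITS = 3         # one of 8 cache entries


def split_address(addr: int) -> dict:
    """Split a 32-bit byte address into tag, index, block offset, byte offset."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    block_offset = (addr >> BYTE_OFFSET_BITS) & ((1 << BLOCK_OFFSET_BITS) - 1)
    index = (addr >> (BYTE_OFFSET_BITS + BLOCK_OFFSET_BITS)) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BYTE_OFFSET_BITS + BLOCK_OFFSET_BITS + INDEX_BITS)
    return {
        "tag": tag,
        "index": index,
        "block_offset": block_offset,
        "byte_offset": byte_offset,
    }


# Word address 80 is byte address 320 (4 bytes per word).
fields = split_address(80 * 4)
```

For word address 80 the index comes out as 4, which agrees with block address 20 modulo 8.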
Block Address = Word Address / 4 (integer part)
Cache Address = Block Address modulo 8

Block Address   Word Addresses             Cache Address
0               3     2     1     0        0
1               7     6     5     4        1
2               11    10    9     8        2
3               15    14    13    12       3
7               31    30    29    28       7
8               35    34    33    32       0
15              63    62    61    60       7
X               4X+3  4X+2  4X+1  4X       X modulo 8
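The mapping in the table above can be written as two small functions. This is a minimal sketch; the names `block_address`, `cache_address`, and `words_in_block` are assumptions chosen for illustration, and the constants are the 4-word blocks and 8-block cache from the slides.

```python
WORDS_PER_BLOCK = 4  # from the slide: 4-word blocks
NUM_BLOCKS = 8       # from the slide: 8 blocks in the cache


def block_address(word_addr: int) -> int:
    """Block Address = Word Address / 4, keeping the integer part."""
    return word_addr // WORDS_PER_BLOCK


def cache_address(word_addr: int) -> int:
    """Cache Address = Block Address modulo 8."""
    return block_address(word_addr) % NUM_BLOCKS


def words_in_block(x: int) -> list:
    """Word addresses 4X .. 4X+3 held by block X."""
    return [WORDS_PER_BLOCK * x + i for i in range(WORDS_PER_BLOCK)]
```

Blocks 0 and 8 share cache address 0, and blocks 7 and 15 share cache address 7, matching the last column of the table.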
Consider a direct-mapped cache with 4-word blocks and a total size of 8 blocks (32 words).
Cache Address = (Word Addr / 4) modulo 8

Reference Sequence
Word Address   Block Address   Cache Address   Hit or Miss
6              1               1               Miss
7              1               1               Hit
8              2               2               Miss
9              2               2               Hit
80             20              4               Miss
6              1               1               Hit
7              1               1               Hit
8              2               2               Hit
9              2               2               Hit
81             20              4               Hit
Consider a direct-mapped cache with 4-word blocks and a total size of 8 blocks (32 words).
Cache Address = (Word Addr / 4) modulo 8

Reference Sequence
Word Address   Block Address   Cache Address   Hit or Miss
6              1               1               Miss
7              1               1               Hit
8              2               2               Miss
9              2               2               Hit
68             17              1               Miss
6              1               1               Miss
7              1               1               Hit
8              2               2               Hit
9              2               2               Hit
69             17              1               Miss
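Both direct-mapped reference sequences can be replayed with a small simulator. The following is a minimal sketch rather than anything from the original slides: the name `simulate_direct_mapped` is hypothetical, and each entry stores the whole block address instead of only the tag, which is a simplification that gives the same hit/miss behavior.

```python
def simulate_direct_mapped(word_addrs, num_blocks=8, words_per_block=4):
    """Return (word, block, cache address, outcome) for each reference."""
    # cache[i] holds the block address stored at cache address i.
    # None plays the role of a cleared valid bit.
    cache = [None] * num_blocks
    trace = []
    for w in word_addrs:
        block = w // words_per_block
        index = block % num_blocks
        hit = cache[index] == block
        if not hit:
            cache[index] = block  # a miss replaces whatever was there
        trace.append((w, block, index, "Hit" if hit else "Miss"))
    return trace


first = simulate_direct_mapped([6, 7, 8, 9, 80, 6, 7, 8, 9, 81])
second = simulate_direct_mapped([6, 7, 8, 9, 68, 6, 7, 8, 9, 69])

for row in second:
    print(row)
```

In the second sequence, blocks 1 and 17 both map to cache address 1, so each one evicts the other and the later references to 6 and 69 miss.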
How about putting a block in any unused block of the eight blocks?
[Entry layout: Tag | Word3 | Word2 | Word1 | Word0]
• How can you find it?
• Expand the Tag to the block address and compare
How about putting a block in any unused block of the eight blocks?
[Address: Block Address (28 bits), stored as the Tag. Entry layout: Tag | Word3 | Word2 | Word1 | Word0]
• Fully Associative Memory – addressed by its contents
Fully Associative Memory – Addressed by its contents
[Address fields: Block Address (28 bits) | Block Offset | Byte Offset]
• For a practical hit time, the Tag and the Block Address must be compared in parallel
• Only feasible for a small number of blocks
Fully Associative Memory – Addressed by its contents
[Figure: the Address is divided into Block Address (28 bits), Block Offset, and Byte Offset. The Blk Addr is compared (=) in parallel with the Tag of every Tag/Data entry; the comparator outputs are combined to produce Hit, and a Mux selects the Data. Valid bit not shown.]
• Block Offset selects Word
• Hardware not feasible for a large cache
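The idea of finding a block by comparing the block address against every stored tag can be sketched as follows. Hardware does the comparisons in parallel; this software sketch simply loops. The names `lookup`, `place`, and `run` are hypothetical, and replacement when the cache is full is deliberately left out because the slides have not introduced it at this point.

```python
NUM_BLOCKS = 8       # eight blocks, as in the running example
WORDS_PER_BLOCK = 4


def lookup(tags, block_addr):
    """Return the entry whose tag equals block_addr, or None on a miss."""
    for entry, tag in enumerate(tags):
        if tag == block_addr:
            return entry
    return None


def place(tags, block_addr):
    """Store block_addr in the first unused entry; return None if none is free."""
    for entry, tag in enumerate(tags):
        if tag is None:
            tags[entry] = block_addr
            return entry
    return None  # ASSUMPTION: no replacement policy in this sketch


def run(word_addrs):
    tags = [None] * NUM_BLOCKS  # the tag is the full block address
    results = []
    for w in word_addrs:
        block = w // WORDS_PER_BLOCK
        if lookup(tags, block) is not None:
            results.append("Hit")
        else:
            place(tags, block)
            results.append("Miss")
    return results


outcomes = run([6, 7, 8, 9, 68, 6, 7, 8, 9, 69])
```

With the sequence 6, 7, 8, 9, 68, 6, 7, 8, 9, 69, blocks 1 and 17 no longer compete for one location, so only the first touch of each block misses.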
Make sets of Blocks Associative
Two-way set associative
[Figure: entries indexed 0, 1, ..., 2^k - 1, each holding Tag0/Data0 and Tag1/Data1. The Address is divided into Tag, Index, Block Offset, and Byte Offset. Valid bit not shown.]
• Address by Index
• Compare two Tags in parallel for Hit
Block replacement strategies
• For each Index there are 2, 4, ... n options for replacement
• Strategies
  • LRU – Least Recently Used
    • Replace the block that has been unused for the longest time
  • Random
    • Select the block to be replaced randomly
• Implementation
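The two strategies can be sketched as victim-selection functions. This is an illustration under stated assumptions, not the slides' implementation: the names `choose_victim_lru` and `choose_victim_random` are hypothetical, and exact LRU is modeled here with a per-way timestamp of the most recent access.

```python
import random


def choose_victim_lru(last_used):
    """last_used[way] is the time of the most recent access to that way.

    The victim is the way that has been unused for the longest time,
    i.e. the one with the smallest timestamp.
    """
    return min(range(len(last_used)), key=lambda way: last_used[way])


def choose_victim_random(num_ways, rng=random):
    """Pick the way to replace uniformly at random."""
    return rng.randrange(num_ways)


# Four ways last touched at times 5, 2, 9 and 7: way 1 is least recently used.
victim = choose_victim_lru([5, 2, 9, 7])
```

Random selection needs no bookkeeping at all, whereas exact LRU needs state that is updated on every access.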
Consider a two-way set-associative cache with 4-word blocks and a total size of 8 blocks (32 words). Each set holds two blocks (Entry 0 and Entry 1).
Cache Address (Set) = (Word Addr / 4) modulo 4

Reference Sequence
Word Address   Block Address   Cache Address (Set)   Hit or Miss
6              1               1                     Miss
7              1               1                     Hit
8              2               2                     Miss
9              2               2                     Hit
68             17              1                     Miss
6              1               1                     Hit
7              1               1                     Hit
8              2               2                     Hit
9              2               2                     Hit
69             17              1                     Hit
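The two-way example can be replayed with a short simulator. This is a minimal sketch: the name `simulate_set_associative` is hypothetical, LRU replacement is an assumption (this particular sequence never fills a set, so no eviction actually occurs), and each way stores the full block address for simplicity.

```python
def simulate_set_associative(word_addrs, num_sets=4, ways=2, words_per_block=4):
    """Return (word, block, set, outcome) for each reference."""
    # Each set is a list of block addresses, ordered from least to
    # most recently used.
    sets = [[] for _ in range(num_sets)]
    trace = []
    for w in word_addrs:
        block = w // words_per_block
        set_index = block % num_sets
        current = sets[set_index]
        if block in current:
            current.remove(block)  # will be re-appended as most recent
            outcome = "Hit"
        else:
            if len(current) == ways:
                current.pop(0)     # ASSUMPTION: evict the LRU block
            outcome = "Miss"
        current.append(block)
        trace.append((w, block, set_index, outcome))
    return trace


two_way = simulate_set_associative([6, 7, 8, 9, 68, 6, 7, 8, 9, 69])
for row in two_way:
    print(row)
```

Blocks 1 and 17 both map to set 1, but the set has room for both, so the conflict misses seen in the direct-mapped cache disappear.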
Make sets of Blocks Associative
Four-way set associative
[Figure: entries indexed 0, 1, ..., 2^m - 1, each holding Tag0/Data0, Tag1/Data1, Tag2/Data2, and Tag3/Data3. The Address is divided into Tag, Index, Block Offset, and Byte Offset. Valid bit not shown.]
• Address by Index
• Compare four Tags in parallel for Hit
• Can generalize to n-way associative
DECStation 3100 with 64 KB instruction cache and 64 KB data cache, each with 4-word block size
Program = gcc

Associativity   Instruction miss rate   Data miss rate   Combined miss rate
1               2.0%                    1.7%             1.9%
2               1.6%                    1.4%             1.5%
4               1.6%                    1.4%             1.5%
Four-way set associative
[Figure: the 32-bit Address is divided into Tag, Index, a 2-bit Block Offset, and a 2-bit Byte Offset. Each entry holds four blocks, each with a valid bit (v), a Tag, and Data: Tag0/Data0 to Tag3/Data3.]
• Number of Blocks = 2^n
  • Select 4, then n = 2
• Select the number of entries in the cache (a power of 2)
  • If 256, then the Index is 8 bits
• Cache has 256 x 4 blocks = 1 K blocks
  • = 1 K blocks x 4 words/block = 4 K words
  • = 16 KB
• Tag = 32 - 2 - 2 - 8 = 20 bits
• Each entry has 4 x (1 + 20 + 128) bits
  • = 4 x 149 = 596 bits
• Total Cache Memory = 256 x 596 bits
  • = 152576 bits
  • = 149 K bits
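The sizing arithmetic above can be checked with a short calculation. This is a sketch for verification only; the function name `cache_bits` and its parameter names are assumptions, and it assumes all sizes are powers of two.

```python
def cache_bits(address_bits=32, index_bits=8, ways=4,
               words_per_block=4, bytes_per_word=4):
    """Recompute the tag width and total storage for the four-way example."""
    # For power-of-two sizes, (n - 1).bit_length() equals log2(n).
    byte_offset_bits = (bytes_per_word - 1).bit_length()
    block_offset_bits = (words_per_block - 1).bit_length()
    tag_bits = address_bits - index_bits - block_offset_bits - byte_offset_bits
    data_bits = words_per_block * bytes_per_word * 8  # data bits per block
    bits_per_entry = ways * (1 + tag_bits + data_bits)  # 1 valid bit per block
    entries = 1 << index_bits
    return {
        "tag_bits": tag_bits,
        "bits_per_entry": bits_per_entry,
        "total_bits": entries * bits_per_entry,
        "data_bytes": entries * ways * words_per_block * bytes_per_word,
    }


sizes = cache_bits()
print(sizes)
```

The result reproduces the slide's figures: a 20-bit tag, 596 bits per entry, 152576 bits in total (149 K bits), of which 16 KB is data.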
Four-way set associative
[Figure: the 32-bit Address is divided into a 20-bit Tag, an 8-bit Index, a 2-bit Block Offset, and a 2-bit Byte Offset. The Index selects one of 256 entries (0 to 255); the four stored tags, Tag0 to Tag3, are compared (=) in parallel with the address Tag to produce Hit0 to Hit3. Each block has a valid bit (v).]
• On a MISS there are 4 OPTIONS for the block to replace
LRU Approximation
Add the following three bits to each entry of the cache:
• MRR(0) = 1 if Data 0 or Data 1 read last
         = 0 if Data 2 or Data 3 read last
• MRR(1) = 1 if Data 1 read last
         = 0 if Data 0 read last
• MRR(2) = 1 if Data 2 read last
         = 0 if Data 3 read last
LRU Approximation
• If MRR(0) = 1, then choose the Data 2, Data 3 pair
• If MRR(2) = 1, then choose Data 3 as LRU
• Note the LRU could have been Data 0 or Data 1.
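The three MRR bits can be sketched as an update rule plus a victim choice. The slide spells out only the case MRR(0) = 1, MRR(2) = 1; the remaining cases below are filled in by symmetry and are an assumption, as are the function names `touch`, `choose_victim`, and `victim_after` and the all-zero initial state.

```python
def touch(mrr, way):
    """Update the three MRR bits after Data `way` is read."""
    if way in (0, 1):
        mrr[0] = 1                     # Data 0 or Data 1 read last
        mrr[1] = 1 if way == 1 else 0  # which of the pair was read last
    else:
        mrr[0] = 0                     # Data 2 or Data 3 read last
        mrr[2] = 1 if way == 2 else 0


def choose_victim(mrr):
    """Pick the block to replace from the MRR bits."""
    if mrr[0] == 1:
        # The (Data 0, Data 1) pair was read last: choose from Data 2, Data 3.
        return 3 if mrr[2] == 1 else 2
    # ASSUMPTION (by symmetry): otherwise choose from Data 0, Data 1.
    return 0 if mrr[1] == 1 else 1


def victim_after(accesses):
    mrr = [0, 0, 0]  # ASSUMPTION: bits start cleared
    for way in accesses:
        touch(mrr, way)
    return choose_victim(mrr)


approx = victim_after([0, 2, 3, 1])
```

After the reads 0, 2, 3, 1 the bits select Data 2, although Data 0 is the true least recently used block. This is the situation the slide's note describes.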
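Four-way set associative: Write
[Figure: the 32-bit Address is divided into a 20-bit Tag, an 8-bit Index, a 2-bit Block Offset, and a 2-bit Byte Offset. The Index selects one of 256 entries (0 to 255), each holding Tag0/Data0 to Tag3/Data3 with valid bits (v); the four tag comparisons (=) produce Hit0 to Hit3.]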
Write-Through
• Write to the block in cache and in main memory
• 4-way associative example:
  • Read Valid and Tag to find the block.
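The write-through step for one four-way entry can be sketched as below. This is an illustration under stated assumptions: the function name `write_through`, the dictionary layout of each block, and modeling main memory as a dictionary keyed by word address are all hypothetical. The behavior on a write miss is not given in the text above, so the sketch assumes that a miss updates main memory only.

```python
def write_through(entry, memory, tag, block_offset, word_addr, value):
    """Write one word; return the block (0..3) that hit, or None on a miss."""
    hit = None
    for i, block in enumerate(entry):
        # Read Valid and Tag to find the block.
        if block["valid"] and block["tag"] == tag:
            hit = i
            break
    if hit is not None:
        entry[hit]["data"][block_offset] = value  # update the cached copy
    # Write-through: main memory is updated on every write.
    # ASSUMPTION: on a miss, only memory is written (no block is allocated).
    memory[word_addr] = value
    return hit


# One entry of a four-way cache; only block 2 is valid.
entry = [{"valid": False, "tag": 0, "data": [0, 0, 0, 0]} for _ in range(4)]
entry[2] = {"valid": True, "tag": 5, "data": [10, 11, 12, 13]}
memory = {}

hit_block = write_through(entry, memory, tag=5, block_offset=1, word_addr=645, value=99)
miss_block = write_through(entry, memory, tag=7, block_offset=0, word_addr=900, value=42)
```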