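Memory access times
• # clock cycles to send the address (say 1)
• # clock cycles to initiate each DRAM access (say 15)
• # clock cycles to transfer a word of data (say 1)
Clock cycles required to access 4 words:
• One-word-wide memory: 1 + 4 x 15 + 4 x 1 = 65
• Four-word-wide memory and bus: 1 + 1 x 15 + 1 = 17
• Four interleaved banks, one-word-wide bus: 1 + 1 x 15 + 4 x 1 = 20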
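The three expressions above can be checked with a short sketch. It reads them as a one-word-wide memory, a four-word-wide memory and bus, and four interleaved banks; the function name, parameter names, and dictionary keys are hypothetical and chosen only for illustration.

```python
def memory_access_cycles(words, addr=1, dram=15, xfer=1):
    """Clock cycles to fetch `words` words under three memory organizations.

    Defaults follow the slide: 1 cycle to send the address, 15 cycles per
    DRAM access, 1 cycle per bus transfer. Names here are illustrative.
    """
    return {
        # Every word needs its own DRAM access and its own transfer.
        "one_word_wide": addr + words * dram + words * xfer,
        # One DRAM access fetches all words; one wide transfer returns them.
        "wide": addr + 1 * dram + 1 * xfer,
        # Banks are accessed in parallel; words return one per cycle.
        "interleaved": addr + 1 * dram + words * xfer,
    }


if __name__ == "__main__":
    for name, cycles in memory_access_cycles(4).items():
        print(f"{name}: {cycles} cycles")
```

With the slide's numbers, the wide and interleaved organizations cut the four-word access from 65 cycles to 17 and 20 cycles respectively.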
Improving performance
• Two ways of improving performance:
  • decreasing the miss ratio: associativity
  • decreasing the miss penalty: multilevel caches
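Why these two levers matter can be seen from the standard average memory access time relation, hit time + miss ratio x miss penalty. The sketch below is only an illustration: the function name and all numeric values are hypothetical and do not come from the slides.

```python
def average_access_time(hit_time, miss_ratio, miss_penalty):
    """Average memory access time in cycles: hit time + miss ratio * miss penalty."""
    return hit_time + miss_ratio * miss_penalty


if __name__ == "__main__":
    # Hypothetical baseline: 1-cycle hit, 5% miss ratio, 65-cycle miss penalty.
    base = average_access_time(1, 0.05, 65)
    # Lower miss ratio (for example, through higher associativity).
    fewer_misses = average_access_time(1, 0.03, 65)
    # Lower miss penalty (for example, through a second-level cache).
    cheaper_misses = average_access_time(1, 0.05, 20)
    print(base, fewer_misses, cheaper_misses)
```

Either change lowers the average, which is why the slide lists both.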
Decreasing miss ratio with associativity
[Figure: cache organized with 2 blocks / set, 4 blocks / set, and 8 blocks / set]
Tag size versus associativity
Cache of 4K blocks, four-word block size (four-word cache lines), and 32-bit addresses
• Direct mapped
  • Byte offset = 4 bits (each block = 4 words = 16 bytes)
  • Index + Tag = 32 – 4 = 28 bits
  • For 4K blocks, 12 index bits are required
  • #Tag bits for each block = 28 – 12 = 16
  • Total #Tag bits = 16 x 4K = 64 Kbits
• 4-way set-associative
  • #Sets = 4K / 4 = 1K, therefore 10 index bits are required
  • #Tag bits for each block = 28 – 10 = 18
  • Total #Tag bits = 4 x 18 x 1K = 72 Kbits
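The tag arithmetic above generalizes to any associativity. The following is a minimal sketch, assuming 4-byte words and power-of-two sizes; the function and parameter names are hypothetical.

```python
def tag_storage(num_blocks, words_per_block, ways, addr_bits=32, bytes_per_word=4):
    """Return (tag bits per block, total tag bits) for a cache.

    Assumes num_blocks, words_per_block, and ways are powers of two.
    """
    block_bytes = words_per_block * bytes_per_word
    offset_bits = block_bytes.bit_length() - 1   # log2 of block size in bytes
    num_sets = num_blocks // ways
    index_bits = num_sets.bit_length() - 1       # log2 of number of sets
    tag_bits = addr_bits - offset_bits - index_bits
    return tag_bits, tag_bits * num_blocks


if __name__ == "__main__":
    for ways in (1, 4):
        per_block, total = tag_storage(4096, 4, ways)
        print(f"{ways}-way: {per_block} tag bits/block, {total // 1024} Kbits total")
```

Each doubling of associativity halves the number of sets, so it removes one index bit and adds one tag bit per block.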
Block replacement policy
In a direct-mapped cache, when a miss occurs, the requested block can go in only one position. In a set-associative cache, each block can be stored in any of several positions within its set. If all the positions are filled, which block should be replaced?
• Least Recently Used (LRU) policy: replace the block that has gone unused the longest
• Random policy: choose a block at random and replace it
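The two policies above can be sketched for a single cache set as follows. This is an illustrative software model, not how hardware implements replacement; the class and function names are hypothetical.

```python
import random
from collections import OrderedDict


class LRUSet:
    """One set of a set-associative cache with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tags ordered least to most recently used

    def access(self, tag):
        """Return True on a hit, False on a miss (after filling the block)."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)   # evict the least recently used tag
        self.blocks[tag] = None
        return False


def random_victim(tags, rng=random):
    """Random policy: pick any resident tag as the block to replace."""
    return rng.choice(list(tags))


if __name__ == "__main__":
    s = LRUSet(ways=2)
    for tag in ["A", "B", "A", "C", "B"]:
        print(tag, "hit" if s.access(tag) else "miss")
```

In the usage example, the access to C evicts B rather than A, because A was touched more recently.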