Cache Basics (Sections 1.7, 5.1)
• A cache is a small, fast memory located close to the CPU that holds the most recently accessed code or data.
• A block is a fixed-size collection of data containing the requested word that is retrieved from memory.
• Temporal locality tells us that we are likely to need this word again in the near future.
• Spatial locality tells us that the other data in the block may be needed soon.
Cache Basics (Cont’d)
• The time required to service a cache miss depends on the latency and bandwidth of the memory.
• Latency determines the time to retrieve the first word of the block.
• Bandwidth determines the time to retrieve the rest of the block.
• Hit (or miss) rate is the fraction of cache accesses that result in a hit (or a miss).
• Example on page 42.
Performance of Cache Memory
Memory stall cycles = Number of misses x Miss penalty
                    = IC x (Misses / Instruction) x Miss penalty
                    = IC x (Memory references / Instruction) x Miss rate x Miss penalty
CPU execution time = (CPU clock cycles + Memory stall cycles) x Clock cycle time
Example on page 43 (a worked sketch with assumed numbers follows below).
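To make the formulas concrete, here is a minimal sketch in C that simply plugs in assumed numbers; the instruction count, references per instruction, miss rate, miss penalty, base CPI, and clock rate are all illustrative values, not the textbook's page-43 example.

```c
#include <stdio.h>

int main(void) {
    /* Assumed illustrative parameters -- not the textbook's numbers. */
    double ic               = 1e9;   /* instruction count (IC) */
    double refs_per_instr   = 1.5;   /* memory references per instruction */
    double miss_rate        = 0.02;  /* fraction of references that miss */
    double miss_penalty     = 100.0; /* clock cycles per miss */
    double base_cpi         = 1.0;   /* CPU clock cycles per instruction, ignoring stalls */
    double clock_cycle_time = 1e-9;  /* seconds per clock cycle (1 GHz clock) */

    /* Memory stall cycles = IC x (refs / instruction) x miss rate x miss penalty */
    double stall_cycles = ic * refs_per_instr * miss_rate * miss_penalty;

    /* CPU execution time = (CPU clock cycles + memory stall cycles) x clock cycle time */
    double cpu_cycles = ic * base_cpi;
    double exec_time  = (cpu_cycles + stall_cycles) * clock_cycle_time;

    printf("Memory stall cycles: %.0f\n", stall_cycles);
    printf("CPU execution time:  %.3f s\n", exec_time);
    return 0;
}
```

With these assumed values the stall cycles come to 1e9 x 1.5 x 0.02 x 100 = 3e9, three times the 1e9 base cycles, which is why miss rate and miss penalty dominate performance tuning.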
Where Can a Block Be Placed in a Cache? (Figure 5.2)
• Direct mapped: each block has only one place in the cache, given by (Block address) mod (Number of blocks in cache).
• Fully associative: a block can be placed anywhere in the cache.
• Set associative: a block can be placed in a restricted set of places in the cache. A set is a group of blocks; with n blocks per set, the placement is n-way set associative. The set is chosen as (Block address) mod (Number of sets in cache), as in the sketch below.
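A small sketch of the mapping arithmetic; the cache geometry (256 blocks, 4-way) and the block address are assumed values chosen only for illustration.

```c
#include <stdio.h>

int main(void) {
    /* Assumed illustrative geometry: 256 blocks total, 4-way set associative. */
    unsigned num_blocks = 256;
    unsigned ways       = 4;
    unsigned num_sets   = num_blocks / ways;   /* 64 sets */

    unsigned block_address = 12345;            /* arbitrary example block address */

    /* Direct mapped: exactly one possible frame for this block. */
    unsigned direct_slot = block_address % num_blocks;

    /* Set associative: one possible set, any of the 'ways' frames within it. */
    unsigned set_index = block_address % num_sets;

    printf("Direct-mapped slot: %u\n", direct_slot);
    printf("4-way set index:    %u (block may go in any of %u ways)\n", set_index, ways);
    return 0;
}
```

A direct-mapped cache is the special case with one block per set, and a fully associative cache is the special case with a single set, so the same mod computation covers all three placements.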
How Is a Block Found If It Is in the Cache?
• Each block has an address tag and an index that together give the block address (Figure 5.3).
• A block offset points to the desired data within the block (the sketch below shows one way to slice an address into these fields).
• The index field selects the set.
• The tag field is compared to determine a hit.
• Increasing associativity means increasing the tag field and decreasing the index field.
• Fully associative caches have no index field.
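A minimal sketch of the address-slicing arithmetic, assuming a 5-bit offset and an 8-bit index (matching the Alpha example later in this section); the address value itself is arbitrary.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* Assumed field widths: 5-bit block offset, 8-bit index, remaining bits are tag. */
    const unsigned offset_bits = 5;
    const unsigned index_bits  = 8;

    uint64_t address = 0x1F3A7C2;   /* arbitrary example byte address */

    uint64_t offset = address & ((1u << offset_bits) - 1);
    uint64_t index  = (address >> offset_bits) & ((1u << index_bits) - 1);
    uint64_t tag    = address >> (offset_bits + index_bits);

    printf("tag=0x%llx index=%llu offset=%llu\n",
           (unsigned long long)tag,
           (unsigned long long)index,
           (unsigned long long)offset);
    return 0;
}
```

The cache uses the index to select a set, compares the stored tag of each frame in that set against the computed tag, and declares a hit on a match with a valid block; the offset then picks the word within the block.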
Which Block Should Be Replaced?
• On a cache miss, a block already in the cache may need to be evicted to make room for the incoming block.
• In a direct-mapped cache there is only one possible place for each block, so the choice is simple.
• In fully associative or set-associative caches, three replacement strategies are common (an LRU sketch follows this list):
• Random
• Least-recently used (LRU)
• First in, first out (FIFO)
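A minimal LRU sketch for a single 4-way set, using per-way age counters; the structure names and the counter-based bookkeeping are illustrative, not a specific hardware implementation.

```c
#include <stdint.h>

#define WAYS 4

/* Per-way state for one cache set: tag, valid bit, and an age counter.
   The valid block with the largest age is the least recently used. */
struct way { uint64_t tag; int valid; unsigned age; };

/* Returns the way index that should receive the incoming block. */
static int pick_victim_lru(struct way set[WAYS]) {
    int victim = 0;
    for (int i = 0; i < WAYS; i++) {
        if (!set[i].valid)                    /* prefer an empty way */
            return i;
        if (set[i].age > set[victim].age)     /* otherwise the oldest wins */
            victim = i;
    }
    return victim;
}

/* Called on every access that hits 'hit_way': age every block, then reset the hit one. */
static void touch_lru(struct way set[WAYS], int hit_way) {
    for (int i = 0; i < WAYS; i++)
        if (set[i].valid) set[i].age++;
    set[hit_way].age = 0;
}
```

Usage follows the policy directly: on a hit to way i, call touch_lru(set, i); on a miss, pick_victim_lru(set) names the frame to refill. Random and FIFO need less bookkeeping, which is why they are attractive at high associativity.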
What Happens on a Write?
Options when writing to the cache:
• Write through: write to the cache and to the memory.
• The next lower level has the most current copy.
• Write back: write to the cache only (sketched below).
• The write occurs at the speed of the cache.
• A dirty bit records whether the block has been modified.
Options on a write miss:
• Write allocate: the block is loaded into the cache on a write miss.
• No-write allocate: the block is not loaded into the cache.
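A rough sketch of the write-back path and the dirty bit; the structure and function names are illustrative, and this is a software model of the policy rather than a hardware design.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 32

/* One cache block with the dirty bit mentioned above. */
struct block {
    uint64_t tag;
    int      valid;
    int      dirty;                 /* set when the cached copy differs from memory */
    uint8_t  data[BLOCK_SIZE];
};

/* Write-back policy: update only the cached copy and mark it dirty.
   Memory is updated later, when the block is evicted. */
static void write_back_store(struct block *b, unsigned offset, uint8_t value) {
    b->data[offset] = value;
    b->dirty = 1;
}

/* On eviction, a dirty block must first be written to the next lower level. */
static void evict(struct block *b, uint8_t *memory_block) {
    if (b->valid && b->dirty)
        memcpy(memory_block, b->data, BLOCK_SIZE);
    b->valid = 0;
    b->dirty = 0;
}
```

Write back pairs naturally with write allocate, and write through with no-write allocate, since keeping the block in the cache only pays off when later writes can be absorbed locally.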
Alpha AXP 21064 Data Cache (Figure 5.5)
• 8192-byte data cache
• 32-byte blocks (5-bit offset, 8-bit index)
• Direct-mapped
• Write through with a 4-block write buffer
• No-write allocate: writes go around the cache on a miss
• 34-bit address: 21-bit tag, 8-bit index, 5-bit offset (checked arithmetically below)
• The write buffer uses merging
• Separate instruction and data caches
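A quick sketch verifying that the stated field widths follow from the geometry; it uses only the numbers given above.

```c
#include <stdio.h>

int main(void) {
    unsigned cache_bytes  = 8192;                      /* total data capacity */
    unsigned block_bytes  = 32;                        /* block size */
    unsigned address_bits = 34;                        /* physical address width */

    unsigned num_blocks  = cache_bytes / block_bytes;  /* 256 blocks (direct-mapped) */
    unsigned offset_bits = 5;                          /* log2(32 bytes per block)  */
    unsigned index_bits  = 8;                          /* log2(256 blocks)          */
    unsigned tag_bits    = address_bits - index_bits - offset_bits; /* 34 - 8 - 5 = 21 */

    printf("blocks=%u offset=%u index=%u tag=%u\n",
           num_blocks, offset_bits, index_bits, tag_bits);
    return 0;
}
```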
Cache Performance
• Average memory access time = Hit time + Miss rate x Miss penalty
• CPU time = (CPU execution clock cycles + Memory stall clock cycles) x Clock cycle time
• Examples on pages 384-389
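A short worked instance of the average-memory-access-time formula, with an assumed hit time, miss rate, and miss penalty; these are not the numbers from the textbook examples on pages 384-389.

```c
#include <stdio.h>

int main(void) {
    /* Assumed illustrative parameters. */
    double hit_time     = 1.0;    /* cycles for a cache hit */
    double miss_rate    = 0.05;   /* 5% of accesses miss */
    double miss_penalty = 100.0;  /* cycles to service a miss */

    /* Average memory access time = Hit time + Miss rate x Miss penalty */
    double amat = hit_time + miss_rate * miss_penalty;

    printf("AMAT = %.1f cycles\n", amat);   /* 1 + 0.05 * 100 = 6 cycles */
    return 0;
}
```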