Caches in Systems (COMP25212 Cache 4)
Learning Objectives
To understand:
• The "3 C's" model of cache performance
• Time penalties for starting with an empty cache
• Systems interconnect issues with caching, and their solutions
• Caching and virtual memory
Describing Cache Misses (the "3 C's")
• Compulsory misses: the first access to a line can never hit, because the line has not been fetched before (a "cold" miss)
• Capacity misses: the working set is larger than the cache, so lines are evicted before they are reused
• Conflict misses: too many addresses map to the same set or line, causing evictions even when the cache is not full (the sketch below illustrates each pattern)
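A minimal C sketch of access patterns that typically trigger each miss type. It assumes a 32 KB direct-mapped data cache with 64-byte lines; the sizes and the array name are illustrative, not part of the slides.

```c
#include <stddef.h>

#define CACHE_SIZE (32 * 1024)   /* illustrative 32 KB direct-mapped D-cache */

static char big[4 * CACHE_SIZE]; /* working set larger than the cache */

void miss_examples(void)
{
    volatile char sink;

    /* Compulsory miss: the very first touch of a line can never hit,
       because the line has not been loaded before. */
    sink = big[0];

    /* Capacity misses: the working set (4x the cache size) cannot all
       fit, so earlier lines are evicted before they are reused. */
    for (size_t i = 0; i < sizeof big; i += 64)
        sink = big[i];

    /* Conflict misses: in a direct-mapped cache, addresses exactly
       CACHE_SIZE apart map to the same line and keep evicting each
       other, even though only two lines are in use. */
    for (int n = 0; n < 1000; n++) {
        sink = big[0];
        sink = big[CACHE_SIZE];
    }
    (void)sink;
}
```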
Cache Performance Again
• With today's caches, how long does it take:
  a) to fill the L3 cache? (8 MB)
  b) to fill the L2 cache? (256 KB)
  c) to fill the L1 D-cache? (32 KB)
• Number of lines = (cache size) / (line size)
• For the 32 KB L1 D-cache: 32 KB / 64 B = 512 lines
• 512 memory accesses at 20 ns each = 10.24 µs, roughly 10 µs
• About 20,000 clock cycles at 2 GHz
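A small C program that reproduces this arithmetic for all three cache levels, using the figures assumed on the slide: 64-byte lines, 20 ns of memory access time per line fill, and a 2 GHz clock.

```c
#include <stdio.h>

int main(void)
{
    const double line_bytes = 64;
    const double mem_ns     = 20;     /* one line fill from memory */
    const double clock_ghz  = 2.0;

    const char  *name[] = { "L1 D", "L2", "L3" };
    const double size[] = { 32.0 * 1024, 256.0 * 1024, 8.0 * 1024 * 1024 };

    for (int i = 0; i < 3; i++) {
        double lines   = size[i] / line_bytes;          /* lines to fill     */
        double fill_us = lines * mem_ns / 1000.0;       /* fill time in us   */
        double cycles  = fill_us * 1000.0 * clock_ghz;  /* fill time in cycles */
        printf("%-4s: %8.0f lines, %8.2f us to fill, %10.0f cycles\n",
               name[i], lines, fill_us, cycles);
    }
    return 0;
}
```

For the 32 KB L1 D-cache this prints 512 lines, 10.24 µs and about 20,000 cycles, matching the slide.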
Caches in Systems
[Diagram: CPU with separate L1 instruction and data caches, a unified L2, on-chip interconnect, RAM, and Input/Output devices (e.g. disk, network); I/O also reads and writes memory (how often?)]
Cache Consistency Problem 1
• Problem: I/O writes to memory, so the cached copy of that data becomes outdated (stale)
[Diagram: an item "I" held in the L1 data cache, L2 and RAM; I/O updates the copy in RAM, leaving the cached copies stale]
Cache Consistency Problem 2
• Problem: I/O reads memory, but the cache holds a newer copy that has not yet been written back
[Diagram: an item "I" held in the L1 data cache, L2 and RAM; the cached copy is newer than the one the I/O device reads from RAM]
Cache Consistency: Software Solutions
• The O/S knows where I/O takes place in memory
  • Mark I/O areas as non-cacheable (how?)
• The O/S knows when I/O starts and finishes
  • Clear (flush or invalidate) the caches before and after I/O? (see the sketch below)
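The second idea can be sketched in C as below. The cache-maintenance and DMA helpers (cache_clean_range, cache_invalidate_range, start_device_read/write, wait_for_io) are hypothetical names standing in for whatever the OS and architecture actually provide; the structure, not the API, is the point.

```c
#include <stddef.h>

/* Hypothetical OS-level primitives; real kernels expose equivalents,
 * but the names and exact semantics vary by architecture and OS. */
void cache_clean_range(void *addr, size_t len);       /* write dirty lines back to RAM */
void cache_invalidate_range(void *addr, size_t len);  /* drop (possibly stale) lines   */
void start_device_read(void *buf, size_t len);        /* device -> memory transfer     */
void start_device_write(const void *buf, size_t len); /* memory -> device transfer     */
void wait_for_io(void);

void io_with_software_coherence(void *buf, size_t len)
{
    /* Output (Problem 2): the newest data may still be only in the cache,
       so clean it out to RAM before the device reads memory. */
    cache_clean_range(buf, len);
    start_device_write(buf, len);
    wait_for_io();

    /* Input (Problem 1): the device writes RAM behind the cache's back,
       so invalidate any cached copies before the CPU reads the buffer. */
    start_device_read(buf, len);
    wait_for_io();
    cache_invalidate_range(buf, len);
}
```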
Hardware Solutions: 1
[Diagram: the same CPU/L1/L2/RAM/I/O structure, with the hardware keeping the cached copies of item "I" consistent with memory on every I/O access]
• Unfortunately: this tends to slow down the cache
Hardware Solutions: 2 - Snooping
• L2 keeps track of the L1 contents
• Snoop logic in the cache observes every memory cycle
[Diagram: snoop logic attached to the caches, watching traffic between the on-chip interconnect, RAM and Input/Output]
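A common action when the snoop observes an external write is to invalidate any matching line. Below is a toy C model of that idea, assuming a direct-mapped cache with 512 lines of 64 bytes; the sizes are illustrative, and real snoop logic is of course hardware sitting next to the tag RAM rather than software.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES  512
#define LINE_BYTES 64

/* Toy model of one cache: direct-mapped, tracking only tag + valid bit. */
struct cache {
    uint32_t tag[NUM_LINES];
    bool     valid[NUM_LINES];
};

/* Snoop logic: called for every memory write observed on the interconnect
   (e.g. an I/O device writing RAM).  If the written address is present in
   the cache, the cached copy is now stale, so invalidate that line. */
void snoop_write(struct cache *c, uint32_t phys_addr)
{
    uint32_t line  = phys_addr / LINE_BYTES;
    uint32_t index = line % NUM_LINES;
    uint32_t tag   = line / NUM_LINES;

    if (c->valid[index] && c->tag[index] == tag)
        c->valid[index] = false;   /* force a miss on the next CPU access */
}
```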
Caches and Virtual Addresses
• CPU addresses are virtual
• Memory addresses are physical
• Recap: the TLB translates virtual to physical addresses
• Which addresses should the cache use?
Option 1: Cache by Physical Addresses
[Diagram: CPU address goes through the TLB, then to the cache ($), then to RAM]
• BUT: address translation is in series with the cache lookup
• SLOW
Option 2: Cache by Virtual Addresses
[Diagram: CPU address goes straight to the cache ($), with the TLB only on the path from the cache to RAM]
• BUT:
• Snooping? (the snoop sees physical addresses, but the cache is indexed and tagged with virtual ones)
• Aliasing? (two different virtual addresses can map to the same physical location, ending up in two separate cache lines)
• More functional difficulties
Option 3: Translate in Parallel with the Cache Lookup
• Translation only affects the high-order bits of the address
• The address within the page remains unchanged
• So the low-order bits of the physical address = the low-order bits of the virtual address
• Select the "index" field of the cache address from within those low-order bits
• Only the "tag" bits are changed by translation
Option 3 in Operation
[Diagram: a 32-bit virtual address split into a 20-bit virtual page number, a 5-bit index, and a 7-bit within-line offset; the index selects a cache line while the TLB translates the page number in parallel, the stored tag is compared ("= ?") with the translated physical page number to decide a hit, and a multiplexer selects the data from the line]
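A short C sketch of this bit-field split, using the 20/5/7 widths from the diagram; the example address is arbitrary.

```c
#include <stdint.h>
#include <stdio.h>

/* Field widths from the slide: 20-bit virtual page number, 5-bit index,
   7-bit offset within the line (20 + 5 + 7 = 32-bit virtual address). */
#define OFFSET_BITS 7
#define INDEX_BITS  5
#define PAGE_BITS   (OFFSET_BITS + INDEX_BITS)   /* 12-bit page offset => 4 KB pages */

int main(void)
{
    uint32_t vaddr = 0x12345678;   /* arbitrary example address */

    uint32_t offset = vaddr & ((1u << OFFSET_BITS) - 1);
    uint32_t index  = (vaddr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t vpn    = vaddr >> PAGE_BITS;

    /* Because the index and offset lie entirely within the 12-bit page
       offset, they are identical in the virtual and physical address: the
       cache can be indexed while the TLB translates vpn to a physical page
       number.  Only the tag comparison needs the translated bits. */
    printf("vpn=0x%05x  index=%u  offset=%u\n", vpn, index, offset);
    return 0;
}
```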
The Last Word on Caching?
• You ain't seen nothing yet!
[Diagram: a multi-core system with several CPUs, each with its own L1 instruction and data caches and its own L2, sharing L3 caches, on-chip interconnect, RAM and Input/Output]