
CSCE 212 Chapter 7 Memory Hierarchy

Instructor: Jason D. Bakos


Presentation Transcript


  1. CSCE 212 Chapter 7 Memory Hierarchy • Instructor: Jason D. Bakos

  2. Memory Hierarchy • Programmers want more memory and faster memory • Problems: • Denser memories require longer access times • Example: papers on your desk vs. papers in your filing cabinet • Fast memories are extremely expensive per unit capacity • Examples: • SRAM: 0.5 – 5 ns access time, $1K/GB • DRAM: 50 – 70 ns access time, $100/GB • Magnetic disk: 5 – 20 ms access time, $0.10/GB

  3. Locality • Goal: • Achieve the access time of smaller memories but have the effective capacity of larger memories • Solution: • Temporal locality • memory locations are accessed more than once • Spatial locality • when a memory location is accessed, there’s a good chance a nearby location will be accessed in the near future
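To make spatial locality concrete, here is a small sketch that counts how many distinct cache lines a strided walk over memory touches. The function name `lines_touched` and the 8-word line size are assumptions chosen for illustration, not something the slides specify.

```python
def lines_touched(num_accesses, stride_words, words_per_line=8):
    """Count distinct cache lines touched by a strided walk over word addresses."""
    # Word address i * stride_words lives in line (address // words_per_line).
    return len({(i * stride_words) // words_per_line for i in range(num_accesses)})

# 64 consecutive words share lines, so only a few lines must be fetched.
print(lines_touched(64, 1))  # 8
# A stride of one full line lands every access in a different line.
print(lines_touched(64, 8))  # 64
```

With a unit stride, each fetched line serves eight accesses; with a stride equal to the line size, every access needs a new line, so the cache gets no benefit from spatial locality.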

  4. Memory Hierarchy

  5. Memory Hierarchy • Each level of the hierarchy stores a subset of the level below it • Each level can only communicate with the level below it • For now, assume 2-level hierarchy • CPU-cache-RAM • cache is usually on-chip • Sometimes the data we need is not in cache • hit rate: fraction of accesses found in the cache • Block or line: unit of data moved between levels • exploits spatial locality • miss penalty • time required to move a line to the top of the hierarchy (may vary) • [Figure: CPU, cache, main memory]

  6. Caches • Questions: • How do we know if the requested location is in the cache? • How do we find it?

  7. Cache Organization • Fully associative • Too many tags to compare! • [Figure: tag array holding address(31 downto (log2 n + 2)) beside lines of n words each]

  8. Direct Mapped Cache

  9. Direct Mapped Cache • Direct mapped: each memory location maps to only one location in the cache • [Figure: 8 lines indexed by addr(7:5), 000 through 111; each line holds a tag addr(31:8) and 8 words]
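A direct-mapped lookup can be sketched in a few lines. This is a minimal model using the geometry on the slide (8 lines, 8 words per line, index addr(7:5), tag addr(31:8)); the class name `DirectMappedCache` and the sample addresses are made up for illustration, and only tags are tracked, not data.

```python
NUM_LINES = 8  # index is addr(7:5), so 3 index bits

class DirectMappedCache:
    def __init__(self):
        # One tag slot per line; None means the line is invalid (empty).
        self.tags = [None] * NUM_LINES

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        index = (addr >> 5) & 0x7   # addr(7:5) selects the line
        tag = addr >> 8             # addr(31:8) identifies which block is there
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag      # replace whatever was in the line
        return False

cache = DirectMappedCache()
results = [cache.access(a) for a in (0x000, 0x004, 0x100, 0x000)]
print(results)  # [False, True, False, False]
```

Addresses 0x000 and 0x100 share index 000 but differ in tag, so they keep evicting each other even though the other seven lines are empty.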

  10. Addresses • The memory address can be partitioned: • tag bits • index: log2(lines) bits (which line?) • word offset: log2(line_size) bits (which word in the line?) • byte offset: 2 bits (which byte in the word?) • Example: 128 lines, 16-word lines: • tag bits 31:13 • index 12:6 • word offset 5:2 • byte offset 1:0
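The partitioning can be checked with a short helper. This sketch assumes 32-bit byte addresses, 4-byte words, and power-of-two sizes; `split_address` is a hypothetical name. For 128 lines of 16 words it yields 7 index bits, 4 word-offset bits, and 2 byte-offset bits, leaving the upper 19 bits as the tag.

```python
def split_address(addr, num_lines=128, words_per_line=16):
    """Split a 32-bit byte address into (tag, index, word_offset, byte_offset)."""
    byte_bits = 2                                 # 4 bytes per word
    word_bits = words_per_line.bit_length() - 1   # log2(words_per_line)
    index_bits = num_lines.bit_length() - 1       # log2(num_lines)

    byte_offset = addr & 0b11
    word_offset = (addr >> byte_bits) & (words_per_line - 1)
    index = (addr >> (byte_bits + word_bits)) & (num_lines - 1)
    tag = addr >> (byte_bits + word_bits + index_bits)
    return tag, index, word_offset, byte_offset

tag, index, word, byte = split_address(0x12345678)
print(hex(tag), hex(index), word, byte)  # 0x91a2 0x59 14 0
```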

  11. Cache Organization

  12. The Three C’s • Three different kinds of misses: • Compulsory (cold-start) misses • First access to a block • Capacity misses • Replaced block is needed again • Because… cache capacity isn’t sufficient for the program • Conflict (collision) misses • Multiple blocks compete for the same set

  13. Associativity • 2-way set associative: • Two choices where to store a given line • Replacement policy (e.g. LRU) • [Figure: two ways (tags 0 and tags 1), each with 8 sets indexed by addr(7:5), 000 through 111; each entry holds a tag addr(31:8) and a line of 8 words]
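A rough model of the 2-way organization with LRU replacement follows. It keeps the slide's index and tag fields (addr(7:5) and addr(31:8)); the class name `TwoWayCache` and the access trace are illustrative assumptions, and only tags are modeled.

```python
from collections import OrderedDict

NUM_SETS = 8
WAYS = 2

class TwoWayCache:
    def __init__(self):
        # Each set holds up to WAYS tags, ordered least to most recently used.
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then filled)."""
        index = (addr >> 5) & 0x7
        tag = addr >> 8
        ways = self.sets[index]
        if tag in ways:
            ways.move_to_end(tag)      # mark as most recently used
            return True
        if len(ways) == WAYS:
            ways.popitem(last=False)   # evict the least recently used tag
        ways[tag] = None
        return False

cache = TwoWayCache()
trace = [0x000, 0x100, 0x000, 0x100, 0x200, 0x000]
results = [cache.access(a) for a in trace]
print(results)  # [False, False, True, True, False, False]
```

The two blocks that map to set 000 now coexist, so the third and fourth accesses hit. A third competing block (0x200) still forces an eviction, which is why higher associativity reduces conflict misses but does not eliminate them.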

  14. Associative Cache Organization

  15. Cache Behavior • Hits at the top-level cache can usually be performed in one (or a few) clock cycles • Misses stall the processor • Writes can be handled using • Write-through (write allocate, write no-allocate) • When cache data is changed, the lower level memory is updated immediately • Use a write buffer • Write-back • When cache data is changed, the lower level memory isn’t updated until the cache line containing the changes is replaced
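The difference between the two write policies shows up in how many writes reach the lower level. The sketch below counts them for a tiny direct-mapped cache with write-allocate; the function `memory_writes`, the trace format, and the cache size are all assumptions made for illustration.

```python
def memory_writes(trace, policy, num_lines=4):
    """Count lower-level writes for a trace of ("r" or "w", line_number) accesses.

    Assumes a direct-mapped, write-allocate cache. Dirty lines still resident
    when the trace ends are not counted.
    """
    resident = [None] * num_lines   # which memory line each cache slot holds
    dirty = [False] * num_lines
    writes = 0
    for op, line in trace:
        slot = line % num_lines
        if resident[slot] != line:
            # Miss: a write-back cache must flush a dirty victim first.
            if policy == "write-back" and dirty[slot]:
                writes += 1
            resident[slot] = line
            dirty[slot] = False
        if op == "w":
            if policy == "write-through":
                writes += 1          # every store is sent to memory
            else:
                dirty[slot] = True   # defer the update until eviction
    return writes

trace = [("w", 0)] * 10 + [("r", 4)]
print(memory_writes(trace, "write-through"))  # 10
print(memory_writes(trace, "write-back"))     # 1
```

Ten stores to the same line cost ten memory writes under write-through, which is why a write buffer is used to hide them. Write-back pays once, when the dirty line is replaced.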

  16. Memory Systems • Main memory is DRAM, designed for density (not access time) • How to reduce miss penalty?

  17. Average Memory Access Time • AMAT = hit_time + miss_rate * miss_penalty • Reduce miss rate: • Larger cache (capacity misses) • Increase associativity (conflict misses) • Replacement policy • Each of these may increase hit time and miss penalty • Reduce miss penalty: • Wider or banked memory bus
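As a worked example of the AMAT formula, the sketch below compares two cache designs. The numbers are hypothetical (in clock cycles) and chosen only to show the trade-off: the second design has a higher hit time but a lower miss rate.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same unit as hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical designs; all values are in clock cycles.
direct_mapped = amat(hit_time=1, miss_rate=0.10, miss_penalty=50)
two_way = amat(hit_time=2, miss_rate=0.06, miss_penalty=50)
print(direct_mapped, two_way)
```

Under these assumed numbers the direct-mapped design averages about 6 cycles per access and the 2-way design about 5, so the extra hit time is repaid by the lower miss rate. With a smaller miss penalty the comparison could go the other way.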

  18. Virtual Memory • Main memory acts as a cache to secondary storage • Allows memory to be shared • Makes memory appear to be larger than it physically is • Each program has its own address space • Enforces protection • Virtual memory block is called a page, a miss is called a page fault • Virtual addresses are translated into physical addresses • Address mapping / address translation • Combination of hardware and software
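Address translation can be sketched as a table lookup on the virtual page number, with the page offset passed through unchanged. The 4 KiB page size, the dictionary-based page table, and the names `translate` and `PageFault` are assumptions for illustration; the slides do not fix any of them.

```python
PAGE_BITS = 12  # assumed 4 KiB pages

class PageFault(Exception):
    """Raised when the virtual page has no mapping in main memory."""

def translate(vaddr, page_table):
    """Map a virtual address to a physical address using a per-process table."""
    vpn = vaddr >> PAGE_BITS                  # virtual page number
    offset = vaddr & ((1 << PAGE_BITS) - 1)   # offset within the page
    ppn = page_table.get(vpn)
    if ppn is None:
        raise PageFault(f"virtual page {vpn:#x} is not in main memory")
    return (ppn << PAGE_BITS) | offset

# Hypothetical mapping: virtual page 0 -> physical page 5, 1 -> 2.
page_table = {0x0: 0x5, 0x1: 0x2}
print(hex(translate(0x1ABC, page_table)))  # 0x2abc
```

Because each process has its own table, the same virtual address in two processes can map to different physical pages, which is how sharing and protection are enforced.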

  19. Virtual Memory

  20. Virtual Memory

  21. Page Faults • Main memory is 100,000 times faster than disk • Page faults are expensive • Reduce page fault rate • Fully associative placement of pages in memory • Each process has a page table that maps virtual addresses to physical addresses • OS creates space on disk for all the process’s pages • Swap space • OS maintains another table that keeps track of each page in main memory • During a page fault, the OS must decide which page to replace • Least recently used (LRU) • Write-back used for writes
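The effect of LRU replacement on the page fault count can be simulated directly. This is a minimal sketch, assuming fully associative placement as on the slide; the function name `count_page_faults` and the reference string are made up for illustration.

```python
from collections import OrderedDict

def count_page_faults(references, num_frames):
    """Count page faults for a page reference string under LRU replacement."""
    frames = OrderedDict()   # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)         # refresh recency on a hit
        else:
            faults += 1
            if len(frames) == num_frames:
                frames.popitem(last=False)   # replace the LRU page
            frames[page] = None
    return faults

refs = [1, 2, 3, 1, 4, 2, 5, 1]
print(count_page_faults(refs, 3))  # 7
print(count_page_faults(refs, 4))  # 5
```

A real operating system usually approximates LRU rather than tracking exact recency, since updating an ordering on every memory reference would be too costly.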

  22. Page Table

  23. Page Table

  24. TLB • Page lookups must be performed in hardware • Page table is cached on-chip • Translation-lookaside buffer • Small fully associative or large limited associative

  25. Integrating Cache and VM • Data cannot be in the cache unless it is present in main memory • Cache can be • physically addressed (TLB in critical path) • virtually addressed (TLB out of critical path) • Cache miss requires TLB access • TLB miss means: • page is in memory but we need the TLB entry, or • page is not in memory (page fault) • (both handled by OS software)

  26. TLB Misses and Page Faults • When a virtual address causes a page fault… • Look up page table entry and find location on disk • Choose a physical page to replace, write back if dirty • Read page from disk into chosen physical page (allow another process to run) • TLB miss in MIPS • BadVAddr set, special exception triggered (8000 0000), go to TLB miss handler • Context register: • bits 31:20: base of the page table • bits 19:2: virtual address of the missing page • Use Context register directly to load missing entry • If the page table entry is invalid, a page fault exception occurs at the normal handler (8000 0180) • Move missing entry to EntryLo register • Execute tlbwr to move EntryLo to TLB at address stored in Random register (free running counter) • Execute eret to return • TLB miss exception doesn’t save process state (fast) while page fault does (slow)
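The MIPS refill sequence above can be modeled in Python to show why the Context register makes the handler short: its value is already the address of the page table entry. This is a software model, not MIPS assembly. The field positions follow the slide (bits 31:20 and 19:2); the dictionary standing in for memory, the 4-entry TLB, and all function names are hypothetical.

```python
PTE_BASE_SHIFT = 20        # Context bits 31:20: base of the page table
VPN_SHIFT = 2              # Context bits 19:2: missing virtual page number
VPN_MASK = (1 << 18) - 1   # 18-bit field, per the slide
TLB_SIZE = 4               # hypothetical; chosen small for the example

class PageFault(Exception):
    """Stands in for the exception taken at the normal handler."""

def make_context(pte_base, vpn):
    """Pack the Context register; the result is the byte address of the PTE."""
    return (pte_base << PTE_BASE_SHIFT) | ((vpn & VPN_MASK) << VPN_SHIFT)

def tlb_miss_handler(context, memory, tlb, random_index):
    """Model the refill: load the PTE via Context, then write it at Random."""
    entry_lo = memory.get(context)       # load the entry Context points to
    if entry_lo is None:
        raise PageFault(hex(context))    # invalid entry: slower page fault path
    vpn = (context >> VPN_SHIFT) & VPN_MASK
    tlb[random_index % TLB_SIZE] = (vpn, entry_lo)   # models tlbwr
    return entry_lo

context = make_context(0x801, 3)
memory = {context: 0x2A}                 # hypothetical PTE value
tlb = [None] * TLB_SIZE
tlb_miss_handler(context, memory, tlb, random_index=6)
print(hex(context), tlb)  # 0x8010000c [None, None, (3, 42), None]
```

No process state is saved anywhere in this path, which mirrors the slide's point that a TLB miss is cheap compared with a page fault.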
