Caching IV Andreas Klappenecker CPSC321 Computer Architecture
Virtual Memory • Processor generates virtual addresses • Memory is accessed using physical addresses • Virtual and physical memory are broken into blocks of memory, called pages • A virtual page may be • absent from main memory, residing on disk, • or mapped to a physical page
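The mapping above starts by splitting a virtual address into a virtual page number and a page offset. A minimal sketch, assuming 32-bit addresses and the 4 KB page size used later in these slides (the function name and example address are illustrative, not from the slides):

```python
# Split a 32-bit virtual address into a virtual page number (VPN)
# and a page offset, assuming 4 KB (2^12-byte) pages.
PAGE_OFFSET_BITS = 12
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def split_virtual_address(va):
    """Return (virtual page number, offset within the page)."""
    vpn = va >> PAGE_OFFSET_BITS       # upper bits select the page
    offset = va & (PAGE_SIZE - 1)      # lower 12 bits stay unchanged
    return vpn, offset

vpn, offset = split_virtual_address(0x00402ABC)
print(hex(vpn), hex(offset))   # 0x402 0xabc
```

Only the page number is translated; the offset passes through to the physical address unchanged.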
Virtual Memory • Main memory can act as a cache for the secondary storage (disk) • The processor generates a virtual address, which address translation maps to a physical address • Advantages: • illusion of having more physical memory • program relocation • protection
Pages: virtual memory blocks • Page faults: if data is not in memory, retrieve it from disk • huge miss penalty, thus pages should be fairly large (e.g., 4 KB) • reducing page faults is important (LRU is worth the price) • faults can be handled in software instead of hardware • write-through takes too long, so we use write-back • Example: page size 2^12 = 4 KB; 2^18 physical pages; main memory <= 1 GB; virtual memory <= 4 GB
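The example's numbers can be checked directly: 2^18 physical pages of 2^12 bytes give 2^30 bytes (1 GB) of physical memory, and a 32-bit virtual address space of 4 GB holds 2^20 virtual pages.

```python
# Working through the slide's example: 4 KB pages, 2^18 physical
# pages, and a 32-bit (4 GB) virtual address space.
page_size = 2**12                    # 4 KB per page
physical_pages = 2**18
physical_memory = physical_pages * page_size
print(physical_memory == 2**30)      # True: 1 GB of physical memory

virtual_memory = 2**32               # 32-bit virtual addresses
virtual_pages = virtual_memory // page_size
print(virtual_pages)                 # 1048576, i.e. 2^20 virtual pages
```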
Page Faults • Incredibly high penalty for a page fault • Reduce the number of page faults by optimizing page placement • Use fully associative placement • a full search of pages is impractical • pages are located via a full table that indexes memory, called the page table • the page table resides in main memory
Page Tables The page table maps each page to either a page in main memory or to a page stored on disk
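This two-way mapping can be sketched as a lookup: each entry is either valid (resident, with a physical page number) or invalid (on disk, triggering a page fault on access). The dict-based table, entry fields, and example numbers below are illustrative, not from the slides:

```python
# Minimal page-table sketch: each virtual page number (VPN) maps
# either to a physical page number (PPN) or to a disk location.
PAGE_OFFSET_BITS = 12

page_table = {
    0x002: {"valid": True,  "ppn": 0x0A1},          # resident in memory
    0x003: {"valid": False, "disk_block": 7342},    # currently on disk
}

def translate(vpn, offset):
    entry = page_table.get(vpn)
    if entry is None or not entry["valid"]:
        # A real OS would now bring the page in from disk and retry.
        raise RuntimeError("page fault")
    return (entry["ppn"] << PAGE_OFFSET_BITS) | offset

print(hex(translate(0x002, 0x4C)))   # 0xa104c
```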
Making Memory Access Fast • Page tables slow us down • Memory access will take at least twice as long • access page table in memory • access page • What can we do? Memory access is local => use a cache that keeps track of recently used address translations, called translation lookaside buffer
Making Address Translation Fast A cache for address translations: translation lookaside buffer
Translation Lookaside Buffer • Some typical values for a TLB • TLB size: 32–4096 entries • Block size: 1–2 page table entries (4–8 bytes each) • Hit time: 0.5–1 clock cycle • Miss penalty: 10–30 clock cycles • Miss rate: 0.01%–1%
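These typical values make a quick effective-access-time estimate possible. Picking one point from each range above (1-cycle hit, 20-cycle miss penalty, 1% miss rate; the specific choices are illustrative):

```python
# Effective translation time = hit time + miss rate * miss penalty,
# using values from within the slide's typical TLB ranges.
hit_time = 1.0        # cycles for a TLB hit
miss_penalty = 20.0   # cycles to walk the page table on a TLB miss
miss_rate = 0.01      # 1% of translations miss in the TLB

effective_time = hit_time + miss_rate * miss_penalty
print(effective_time)   # 1.2 cycles per translation on average
```

Even a 1% miss rate only adds 0.2 cycles on average, which is why a small TLB makes paged virtual memory practical.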
More Modern Systems • Modern machines use very complicated memory systems
Some Issues • Processor speeds continue to increase very fast, much faster than DRAM or disk access times improve • Design challenge: dealing with this growing disparity • Trends: • synchronous SRAMs (provide a burst of data) • redesign DRAM chips to provide higher bandwidth or on-chip processing • restructure code to increase locality • use prefetching (make the cache visible to the ISA)
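The "restructure code to increase locality" trend can be sketched with a classic example: traversing a matrix row by row touches consecutive memory locations, while traversing it column by column strides across rows. Both orders compute the same sum, but in a language with row-major array storage (such as C) the first order is far friendlier to caches; the Python version below only illustrates the access pattern:

```python
# Two traversal orders over the same matrix. sum_row_major visits
# elements in storage order (good spatial locality in a row-major
# layout); sum_col_major jumps a full row's worth of memory per access.
N = 256
matrix = [[i * N + j for j in range(N)] for i in range(N)]

def sum_row_major(m):
    total = 0
    for i in range(N):
        for j in range(N):      # inner loop walks one row sequentially
            total += m[i][j]
    return total

def sum_col_major(m):
    total = 0
    for j in range(N):
        for i in range(N):      # inner loop strides from row to row
            total += m[i][j]
    return total

print(sum_row_major(matrix) == sum_col_major(matrix))   # True
```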
Algorithm for Success • Read Chapters 5 - 7 • get the big picture • Read again • focus on the little details • do calculations • work problems • Get enough sleep! • What should be reviewed?
Project • Provide a working solution • it is better to submit a working solution implementing a subset of instructions • if you submit a faulty version, comment your bugs • have test programs that exercise all instructions • have a full report that explains your design • the report should include a table of control signals