
Understanding Virtual Memory Management in Microprocessors

Learn about virtual memory concepts, address translation, page tables, TLB, and the role of the OS in managing memory resources efficiently for multiple programs.

dvalenzuela

Presentation Transcript


  1. ECE 463/563, Fall ’18: Virtual Memory. Prof. Eric Rotenberg. ECE 463/563, Microprocessor Architecture, Prof. Eric Rotenberg

  2. Virtual Memory
     • Every program has its own virtual memory
       • Large virtual address space
       • Divided into virtual pages
     • When a program runs, it needs physical memory
     • Physical memory is actual storage:
       • DRAM: main memory
       • Hard disk: overflow storage for main memory
     • The operating system (O/S) manages physical memory as a shared resource among many running programs
     • When a program first accesses a particular virtual page, the O/S is invoked to allocate a physical page in main memory that will now correspond to the program’s virtual page
       • Upon starting a program: the O/S “loader” allocates initial physical pages for the program’s text (code) and data segments (globals, stack)
       • During program execution: the O/S “page-fault handler” allocates physical pages for first-time accesses to new virtual pages (e.g., heap, stack). The handler also swaps physical pages between main memory (DRAM) and the overflow storage for main memory (hard disk).

  3. Virtual Memory (cont.)
     [Figure: Virtual memory for Program #1 and virtual memory for Program #2, each divided into virtual pages numbered by virtual page number (VPN 0, 1, 2, …). Physical memory consists of DRAM (main memory), divided into physical pages numbered by physical page number (PPN 0 … N), and hard disk (swap space and file system), which serves as overflow storage for main memory. One virtual page and one physical page are labeled.]

  4. Virtual Memory (cont.)
     [Figure: The same diagram as the previous slide, with the page numbers labeled by the abbreviations VPN and PPN.]

  5. Virtual Memory (cont.)
     [Figure: The same diagram as the previous slide. One program’s VPN column now extends through VPN 12, and a “??” label appears in the diagram.]

  6. O/S Page Tables
     • The O/S maintains a page table (PT) per running program
       • The PT is a software data structure used to translate the program’s virtual addresses to physical addresses
       • The PT is searched using the virtual address (specifically, its virtual page number)
     • There is one PT entry for each virtual page used by the program. Contents of a PT entry:
       • Whether the corresponding physical page is in main memory (DRAM) or in swap space (hard disk)
         • If in main memory: the PT entry provides the physical page number
         • If in swap space: the PT entry provides the location on disk
       • A PT entry typically has other information too (recency of access, protection bits, etc.)

  7. Virtual-to-Physical Address Translation
     [Figure: A running program issues a virtual address. The O/S page table for the running program translates the virtual address to a physical address. The physical address is used to access the L1 I/D caches (entry point to the memory hierarchy).]

  8. Virtual-to-Physical Address Translation
     [Flowchart: the input is a VPN and the output is a PPN.]
     • First time the VPN is referenced?
       • Yes: first-ever access to the VPN (Scenario 3)
       • No: is the page in DRAM or on disk?
         • In DRAM (Scenario 1)
         • On disk: swap in from disk to DRAM, a “page fault” (Scenario 2)
     • Every scenario ends with a PPN in DRAM

  9. Overhead of Virtual Memory
     • The program counter is a virtual address
       • Each instruction fetch requires address translation
     • Loads and stores generate virtual addresses
       • Each load and store requires address translation
     • For every instruction fetch, load, and store, call the O/S to search the page table?
       • This would have unacceptable performance
       • The O/S address translation function takes 10s to 100s of instructions and several memory accesses (to access the page table). The exact overhead depends on the scenario, page table organization, etc.
     • There has to be a better way…

  10. Translation Lookaside Buffer (TLB)
      • The TLB is a small cache of recently used address translations
      • The TLB is defined in the ISA, because software and hardware collaborate on the TLB
      • Hardware role
        • Hardware provides and accesses the TLB
          • Provides: the TLB is a hardware table
          • Accesses: hardware searches the TLB for the desired translation. If the TLB doesn’t have the translation, hardware calls the O/S by raising a TLB miss exception.
      • Software role
        • Software manages the TLB
          • When there is a TLB miss, the O/S is responsible for handling the miss. Once it has a translation, it writes the translation into the TLB, at a location of its choosing.

  11. Virtual-to-Physical Address Translation, with TLB
      [Figure: While the CPU is executing the application, a virtual address from the running program goes to the TLB, which translates it to a physical address in 1 cycle. The physical address is used to access the L1 I/D caches (entry point to the memory hierarchy). On a TLB miss exception, the CPU executes the O/S TLB miss exception handler, which consults the O/S page table for the running program (10s-100s of cycles) and uses a TLB-write instruction to put the translation into the TLB.]
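The two latencies in the figure suggest a simple cost model. The sketch below is not from the slides: it applies the usual average-access-time formula to address translation. The function name `avg_translation_cycles` and the miss rate and penalty used in the usage note are illustrative assumptions.

```c
#include <assert.h>

/* Average cycles per translation (an assumed model, not from the slides):
 * every access pays the TLB hit time, and the fraction of accesses that
 * miss additionally pays the O/S miss-handler penalty. */
double avg_translation_cycles(double hit_cycles, double miss_rate,
                              double miss_penalty_cycles) {
    assert(miss_rate >= 0.0 && miss_rate <= 1.0);
    return hit_cycles + miss_rate * miss_penalty_cycles;
}
```

For example, with a 1-cycle TLB hit, an assumed 1% miss rate, and an assumed 100-cycle handler, `avg_translation_cycles(1.0, 0.01, 100.0)` comes to about 2 cycles per translation, which is why a high TLB hit rate matters.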

  12. TLB Organization
      • TLB organization
        • Can be direct-mapped, set-associative, or fully-associative
        • Many modern RISC ISAs define it to be fully-associative since software manages it
      • Unified versus split TLBs
        • Unified: one TLB for both instruction and data address translation
        • Split: separate TLBs for instruction and data address translation
          • I-TLB: TLB for instruction address translation (program counter). The I-TLB sits alongside the L1 I-cache.
          • D-TLB: TLB for data address translation (loads and stores). The D-TLB sits alongside the L1 D-cache.

  13. Using the TLB for translation
      • Example: Fully-associative TLB
      [Figure: A four-entry TLB. Each entry holds a valid bit (v), a virtual page number (VPN), and a physical page number (PPN). The incoming virtual page number X is compared (=?) against the VPN of every entry, and the TLB outputs the physical page number X’.]
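The lookup in the figure can be modeled in software as follows. This is a minimal sketch, not the course's code: the names `tlb_entry`, `tlb_lookup`, and `tlb_write` are hypothetical, the four-entry size simply mirrors the figure, and the loop stands in for comparisons that hardware performs on all entries at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 4  /* assumed size, chosen to match the four entries drawn */

struct tlb_entry {
    bool     valid;  /* v: entry holds a usable translation */
    uint32_t vpn;    /* virtual page number */
    uint32_t ppn;    /* physical page number */
};

struct tlb {
    struct tlb_entry entry[TLB_ENTRIES];
};

/* Fully-associative search: any entry may hold any VPN, so every entry is
 * checked. Returns true and stores the PPN on a hit; returns false on a miss
 * (where hardware would raise a TLB miss exception). */
bool tlb_lookup(const struct tlb *t, uint32_t vpn, uint32_t *ppn) {
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (t->entry[i].valid && t->entry[i].vpn == vpn) {
            *ppn = t->entry[i].ppn;
            return true;
        }
    }
    return false;
}

/* Software-managed refill: the caller (the O/S handler) picks the slot. */
void tlb_write(struct tlb *t, size_t slot, uint32_t vpn, uint32_t ppn) {
    assert(slot < TLB_ENTRIES);
    t->entry[slot] = (struct tlb_entry){ true, vpn, ppn };
}
```

Because the table is fully associative, `tlb_write` places no restriction on which slot a given VPN may occupy; the choice of slot is left entirely to the caller.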

  14. TLB increases hit time
      [Figure: Without virtual memory, the address goes directly to the L1 cache (1 cycle). With virtual memory, the virtual address goes to the TLB (1 cycle), and the resulting physical address goes to the L1 cache (1 cycle).]

  15. Using the TLB for translation: A closer look
      • Ex: page size = 4KB, so # page offset bits = log2(4KB) = 12
      [Figure: The 32-bit virtual address is split into a virtual page number (bits 31-12) and a page offset (bits 11-0). The TLB translates the virtual page number into a physical page number (bits 31-12 of the physical address); the page offset (bits 11-0) is unchanged. The cache interprets the physical address (bits 31-0) as tag, index, and block offset.]
      • Observation: What if the index bits were entirely contained in the page offset bits?
        • Then the first part of the cache access (indexing) would not wait on the TLB
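The bit fields on this slide can be extracted with shifts and masks. The sketch below assumes the slide's 32-bit addresses and 4KB pages; the function names and the `block_bits` / `index_bits` cache parameters are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_BITS 12u  /* log2(4KB), as on the slide */

/* Upper 20 bits: virtual page number (or physical page number). */
uint32_t page_number(uint32_t addr) {
    return addr >> PAGE_OFFSET_BITS;
}

/* Lower 12 bits: page offset, which translation leaves unchanged. */
uint32_t page_offset(uint32_t addr) {
    return addr & ((1u << PAGE_OFFSET_BITS) - 1u);
}

/* Physical address = translated page number concatenated with the offset. */
uint32_t make_physical(uint32_t ppn, uint32_t offset) {
    assert(offset < (1u << PAGE_OFFSET_BITS));
    return (ppn << PAGE_OFFSET_BITS) | offset;
}

/* The cache's view of an address: tag | index | block offset.
 * block_bits = log2(block size), index_bits = log2(# sets). */
uint32_t cache_index(uint32_t addr, unsigned block_bits, unsigned index_bits) {
    return (addr >> block_bits) & ((1u << index_bits) - 1u);
}

uint32_t cache_tag(uint32_t addr, unsigned block_bits, unsigned index_bits) {
    return addr >> (block_bits + index_bits);
}
```

With, say, 64-byte blocks and 64 sets (6 + 6 = 12 bits), the index and block offset lie entirely inside the page offset, so `cache_index` gives the same result for a virtual address and for its translated physical address. That is the property the slide's observation relies on.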

  16. Accessing TLB and Cache in Parallel
      [Figure: The virtual page number (bits 31-12 of the virtual address) goes to the TLB, which produces the physical page number (bits 31-12). In parallel, the page offset (bits 11-0) supplies the index and block offset used to access the tag array and data array. The tag read from the tag array is compared (=?) with the physical page number from the TLB, and the block offset selects the word.]
      • Cache hit time reduces from two cycles to one!
        • Because the cache can now be indexed in parallel with the TLB access (only the final tag match uses the output from the TLB)
      • But some constraints...

  17. Constraint: Size of 1 cache way
      • Constraint for “physically-indexed cache with parallel cache/TLB access”
        • Index and block offset bits are contained within the page offset bits
        • Therefore: the total amount of storage in 1 way of the cache should not exceed the page size
      [Figure: An N-way set-associative cache drawn as Way 1, Way 2, …, Way N. Each way has one block per set, so its height is # sets and its width is the block size.]

  18. Page size / associativity tradeoff
      • From previous slide: (# sets) × (block size) ≤ page size
      • Cache size equation: cache size = (# sets) × (block size) × (associativity)
      • Therefore: associativity ≥ (cache size) / (page size)
      • Example: MC88110
        • Page size = 4KB
        • I$, D$ both: 8KB 2-way set-associative
        • (8KB / 4KB) = 2 ways
      • Example: VAX series
        • Page size = 512B
        • For a 16KB cache, need assoc. = (16KB / 512B) = 32-way set-associative!
      • Moral: sometimes associativity is thrust upon you
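The two examples on this slide divide cache size by page size to get the required number of ways. A small sketch of that arithmetic follows; the function names `way_fits_in_page` and `min_associativity` are hypothetical, and the code makes no claim about how any real design tool checks this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Constraint for parallel cache/TLB access: the storage in one way
 * (# sets x block size) must not exceed the page size. */
bool way_fits_in_page(size_t num_sets, size_t block_bytes, size_t page_bytes) {
    return num_sets * block_bytes <= page_bytes;
}

/* Smallest associativity for which one way fits in a page:
 * ceiling(cache size / page size), and at least 1. */
size_t min_associativity(size_t cache_bytes, size_t page_bytes) {
    assert(cache_bytes > 0 && page_bytes > 0);
    return (cache_bytes + page_bytes - 1) / page_bytes;
}
```

Here `min_associativity(8 * 1024, 4 * 1024)` gives 2 for the MC88110 example and `min_associativity(16 * 1024, 512)` gives 32 for the VAX example, matching the slide.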

  19. Backup Slides
      • The following slides are for interest only

  20. Notation on following slide
      • X: Virtual page number to be translated into a physical page number
      • X’: Physical page number corresponding to X
      • V: Virtual page number of a victim page

          struct page_table_entry {
              bool resident;          // 'true' if in DRAM, 'false' if on disk (swap space)
              unsigned int ppn;       // if resident==true, this is physical page number
              disk_loc_type disk_loc; // if resident==false, this is location on disk
          };

          // PT is a hash table (unbounded array) of page table entries.
          // Key (index) of the hash table is virtual page number.

      • PT[X].resident
        • If true: X is in DRAM, at PT[X].ppn. Thus X’ = PT[X].ppn.
        • If false: X is not in DRAM. PT[X].ppn is bogus; PT[X].disk_loc indicates the disk location.

  21. Virtual-to-Physical Address Translation
      [Flowchart: the input is X and the output is X’.]
      • Is X in PT?
        • No (first-ever access to X): is there a free DRAM page?
          • Yes: PT[X].resident = true; PT[X].ppn = freelist.pop()
          • No: SwapOut(V); PT[V].resident = false; PT[V].disk_loc = …; PT[X].resident = true; PT[X].ppn = PT[V].ppn
        • Yes: check PT[X].resident
          • true: X’ = PT[X].ppn
          • false (swap in X from disk to DRAM, a “page fault”): is there a free DRAM page?
            • Yes: PT[X].resident = true; PT[X].ppn = freelist.pop(); SwapIn(X)
            • No: SwapOut(V); PT[V].resident = false; PT[V].disk_loc = …; PT[X].resident = true; PT[X].ppn = PT[V].ppn; SwapIn(X)
      • Every path ends with X’ = PT[X].ppn
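The flowchart can be sketched as a small C model. Several things here are assumptions rather than the course's implementation: the page table is a fixed array with an `in_pt` flag instead of a hash table, the victim is chosen round-robin, swapping is modeled only by counters, and the disk location is left out because the slide elides it. All names other than `resident` and `ppn` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_VPAGES 8  /* assumed: virtual pages per program */
#define NUM_PPAGES 2  /* assumed: physical pages in DRAM */

struct page_table_entry {
    bool in_pt;        /* assumed flag standing in for "Is X in PT?" */
    bool resident;     /* true if in DRAM, false if on disk */
    unsigned int ppn;  /* valid only when resident */
};

struct vm {
    struct page_table_entry pt[NUM_VPAGES];
    unsigned int freelist[NUM_PPAGES];
    size_t free_count;
    unsigned int owner[NUM_PPAGES];  /* VPN currently held by each PPN */
    unsigned int next_victim;        /* round-robin victim pointer (assumed policy) */
    unsigned int swap_ins, swap_outs;
};

void vm_init(struct vm *m) {
    *m = (struct vm){0};
    for (unsigned int p = 0; p < NUM_PPAGES; p++)
        m->freelist[m->free_count++] = p;
}

/* "Is there a free DRAM page?" Pop the free list if so; otherwise evict V. */
static unsigned int get_dram_page(struct vm *m) {
    if (m->free_count > 0)
        return m->freelist[--m->free_count];
    unsigned int ppn = m->next_victim;
    m->next_victim = (m->next_victim + 1) % NUM_PPAGES;
    unsigned int v = m->owner[ppn];
    m->swap_outs++;              /* SwapOut(V) */
    m->pt[v].resident = false;   /* PT[V].disk_loc would be recorded here */
    return ppn;                  /* X takes over PT[V].ppn */
}

/* Translate virtual page number x into its physical page number X'. */
unsigned int translate(struct vm *m, unsigned int x) {
    assert(x < NUM_VPAGES);
    struct page_table_entry *e = &m->pt[x];
    if (!e->in_pt || !e->resident) {
        bool first_access = !e->in_pt;
        e->ppn = get_dram_page(m);
        e->resident = true;
        e->in_pt = true;
        m->owner[e->ppn] = x;
        if (!first_access)
            m->swap_ins++;       /* SwapIn(X): the page-fault path */
    }
    return e->ppn;               /* X' = PT[X].ppn */
}
```

With only two physical pages, touching a third virtual page forces an eviction, and touching the evicted page again takes the swap-in path, so a short sequence of `translate` calls exercises each branch of the flowchart.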
