
Introduction to Systems Programming Lecture 7




  1. Introduction to Systems Programming Lecture 7 Paging

  2. Reminder: Virtual Memory • Processes use virtual address space • Every process has its own address space • The address space can be larger than physical memory • Only part of the virtual address space is mapped to physical memory at any time, using a page table • The MMU & OS collaborate to move memory contents to and from disk.

  3. Virtual Address Space Backing Store

  4. Virtual Memory Operation What happens when a process references a page that is only in the backing store? • Recognize the location of the page on disk • Choose a free page frame (evicting a page if necessary) • Bring the page from disk into memory • These steps need hardware and software cooperation
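The steps above can be sketched in a few lines of Python. This is only an illustrative model (the names `handle_page_fault`, `backing_store`, etc. are invented for the sketch); a real handler runs in the kernel and also programs the MMU.

```python
def handle_page_fault(page, page_table, free_frames, backing_store, memory):
    """Sketch of the slide's steps (illustrative names, not a real OS API)."""
    frame = free_frames.pop()               # step 2: choose a free frame
    memory[frame] = backing_store[page]     # step 3: bring the page in from disk
    page_table[page] = frame                # map page -> frame; it is now "present"
    return frame

# Tiny model: 4 physical frames, one page waiting in the backing store.
memory = [None] * 4
page_table = {}
free_frames = [0, 1, 2, 3]
backing_store = {"p7": "contents of page 7"}

f = handle_page_fault("p7", page_table, free_frames, backing_store, memory)
print(page_table)   # {'p7': 3}
```

When `free_frames` is empty, a page-replacement algorithm (later slides) must first pick a victim frame.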

  5. Steps in Handling a Page Fault (Avishai Wool, lecture 7, slide 5)

  6. Where is the non-mapped memory? • In a special disk area (“page file” or swap space), e.g. C:\pagefile.sys • Each process has part of its memory inside this file • Mapped pages that are in memory may be more up to date than their disk copies (if they were written to “recently”)

  7. Issues with Virtual Memory • The page table can be very large: • 32-bit addresses, 4KB pages (12-bit offsets) → 2^20 pages == over 1 million pages • Each process needs its own page table • Page lookup has to be very fast: • if an instruction takes 4ns, a page table lookup should take around 1ns • The page fault rate has to be very low.

  8. 4-bit page number

  9. Multi-Level Page Tables • Example: split the 32-bit virtual address into: • a 10-bit PT1 field (indexes 1024 entries) • a 10-bit PT2 field • a 12-bit offset → 4KB pages → 2^20 pages (1 million) • A 1024-entry table, with 4 bytes per entry, is exactly 4KB == one page • But: not all second-level page tables need to be kept in memory!
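The 10/10/12 split is just bit masking and shifting, as this small sketch shows (the example address is arbitrary):

```python
def split_address(vaddr):
    """Split a 32-bit virtual address into (PT1, PT2, offset) = 10/10/12 bits."""
    offset = vaddr & 0xFFF            # low 12 bits: byte within the 4KB page
    pt2 = (vaddr >> 12) & 0x3FF       # next 10 bits: index into a 2nd-level table
    pt1 = (vaddr >> 22) & 0x3FF       # top 10 bits: index into the top-level table
    return pt1, pt2, offset

print(split_address(0x00403004))   # (1, 3, 4)
```

The MMU then uses PT1 to pick a second-level table, PT2 to pick the frame, and carries the offset through unchanged.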

  10. Second-level page table / Top-level page table • The MMU does 2 lookups for each virtual → physical mapping • Each top-level entry points to 2^10 × 2^12 bytes = 4MB of memory

  11. Why Do Multi-Level Page Tables Help? • Usually large parts of the address space are unused. Example (cont.): • The process gets 12MB of address space (out of the possible 4GB) • Program instructions in the low 4MB • Data in the next 4MB of addresses • Stack in the top 4MB of addresses

  12. Second-level page table / Top-level page table • Only 4 page tables need to be in memory: • the top-level table • three second-level tables

  13. Paging in 64 bit Linux

  14. Page lookup has to be very fast • A memory-to-register copy command: 100000: … 100004: mov bx, *200012 100008: … • References 2 memory addresses: • Fetch instruction (address=100004) • Fetch data (address = 200012)

  15. Paging as a Cache Mechanism • Idea behind all caches: overcome slow access by keeping a few, most-popular items in a fast-access area. • This works because some items are much more popular than others. • The page table acts as a cache (items = pages)

  16. Cache Concepts • Cache Hit (Miss): the requested item is (not) in the cache • In paging terminology, “miss” == page fault • Thrashing: lots of cache misses • Effective Access Time: Prob(hit)*hit_time + Prob(miss)*miss_time

  17. Example: Cost of Page Faults • Memory cycle: 10ns = 10×10^-9 s • Disk access: 10ms = 10×10^-3 s • Hit rate 98%, page-fault rate 2% • Effective access time = 0.98 × 10×10^-9 + 0.02 × 10×10^-3 ≈ 2.000098×10^-4 s ≈ 200μs. This shows that a tolerable page-fault rate must be far smaller than 2%!
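The effective-access-time formula from the previous slide can be checked directly:

```python
def effective_access_time(p_hit, hit_time, miss_time):
    """Expected access time: weighted average of hit and miss costs (seconds)."""
    return p_hit * hit_time + (1 - p_hit) * miss_time

# Slide's numbers: 10ns memory cycle, 10ms disk access, 98% hit rate.
eat = effective_access_time(0.98, 10e-9, 10e-3)
print(eat)   # ~2.000098e-04 s, i.e. about 200 microseconds per access
```

Even a 2% miss rate makes the average access ~20,000× slower than a pure memory cycle, since the disk term dominates completely.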

  18. Page Replacement Algorithms Which page to evict?

  19. Structure of a Page Table Entry

  20. Page Table Entry Fields • Present = 1 → the page is in a frame • Protection: usually 3 bits, “RWX” • R = 1 → read permission • W = 1 → write permission • X = 1 → execute permission (contains instructions) • Modified = 1 → the page was written to since being copied in from disk. Also called the dirty bit. • Referenced = 1 → the page was read “recently”
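These fields are packed into a single machine word per entry. The bit positions below are an assumption chosen for illustration only; real layouts are architecture-specific (e.g. x86 defines its own):

```python
def decode_pte(pte):
    """Decode a page table entry. Bit layout is HYPOTHETICAL, for illustration:
    bit 0 Present, bits 1-3 R/W/X, bit 4 Modified, bit 5 Referenced,
    bits 12+ frame number."""
    return {
        "present":    bool(pte & 0x01),
        "read":       bool(pte & 0x02),
        "write":      bool(pte & 0x04),
        "execute":    bool(pte & 0x08),
        "modified":   bool(pte & 0x10),
        "referenced": bool(pte & 0x20),
        "frame":      pte >> 12,
    }

# Frame 5, present, readable, modified, referenced -- but not writable.
pte = (5 << 12) | 0x20 | 0x10 | 0x02 | 0x01
print(decode_pte(pte))
```

The MMU checks the protection bits on every access; an access that violates them causes a fault instead of a translation.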

  21. The Modified and Referenced bits • The bits are updated by the MMU • They give hints to the OS during page-fault processing, when it chooses which page to evict: • If Modified = 0 then there is no need to copy the page back to disk; the disk and memory copies are the same • So evicting such a page is cheaper • It is better to evict pages with Referenced = 0: not used recently → maybe they won't be used in the near future.

  22. Page Replacement Algorithms • A page fault forces a choice • The OS must pick a page to be removed to make room for the incoming page • If the evicted page is modified → the page must first be saved to disk • If not modified: the OS can just overwrite it • Try not to evict a “popular” page • it will probably need to be brought back in soon

  23. Optimal Page Replacement Algorithm • Replace the page needed at the farthest point in the future • Suppose the page table can hold 8 pages, and the list of page requests is • 1,2,3,4,5,6,7,8,9,8,9,8,7,6,5,4,3,2,1 • The request for page 9 causes a page fault; the optimal choice is to evict page 1, since it is needed farthest in the future
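Since the full request list is given, the optimal policy can be simulated (a minimal sketch, assuming demand paging with no prefetching):

```python
def opt_faults(requests, frames):
    """Optimal replacement: evict the page whose next use is farthest away."""
    mem, faults = [], 0
    for i, page in enumerate(requests):
        if page in mem:
            continue
        faults += 1
        if len(mem) < frames:
            mem.append(page)
        else:
            future = requests[i + 1:]
            # Pages never used again count as "infinitely far" in the future.
            victim = max(mem, key=lambda p: future.index(p) if p in future else len(future))
            mem[mem.index(victim)] = page
    return faults

reqs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(opt_faults(reqs, 8))   # 10: eight cold-start faults, 9 evicts 1, the final 1 faults
```

No realizable algorithm can do better on this string, which makes OPT the benchmark that the following algorithms are measured against.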

  24. Optimal not realizable • The OS does not know which pages will be requested in the future • So: try to approximate the optimal algorithm • Use the Referenced & Modified bits as hints: • the bits are set when a page is referenced or modified • the hardware sets them on every memory reference

  25. FIFO Page Replacement Algorithm • Maintain a linked list of all pages • in order they came into memory (ignore M & R bits) • Page at beginning of list replaced • Disadvantage • page in memory the longest time may be very popular
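FIFO needs only the arrival order, so it simulates in a few lines (a sketch assuming demand paging):

```python
from collections import deque

def fifo_faults(requests, frames):
    """FIFO replacement: evict the page that has been in memory the longest."""
    mem, faults = deque(), 0
    for page in requests:
        if page in mem:
            continue                # hit: FIFO ignores the reference entirely
        faults += 1
        if len(mem) == frames:
            mem.popleft()           # evict the oldest page, ignoring R & M bits
        mem.append(page)
    return faults

reqs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(fifo_faults(reqs, 8))   # 10
```

On this particular string FIFO happens to match OPT's fault count, but in general it can evict heavily used pages simply because they are old.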

  26. Not Recently Used • At process start, all M & R bits are set to 0 • On each clock tick (interrupt), the R bits are set to 0 • Pages are classified: • 0: not referenced, not modified • 1: not referenced, modified (possible: a clock tick cleared R after the page was modified) • 2: referenced, not modified • 3: referenced, modified • NRU removes a page at random • from the lowest-numbered non-empty class
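The class number is just 2·R + M, which makes the victim selection a one-liner (a sketch; page names and tuples are illustrative):

```python
import random

def nru_victim(pages):
    """pages: list of (name, R, M) tuples. Evict a random page from the
    lowest-numbered non-empty class, where class = 2*R + M."""
    lowest = min(2 * r + m for _, r, m in pages)
    candidates = [name for name, r, m in pages if 2 * r + m == lowest]
    return random.choice(candidates)

# A is class 3, B and D are class 1 (not referenced, but modified), C is class 2.
pages = [("A", 1, 1), ("B", 0, 1), ("C", 1, 0), ("D", 0, 1)]
print(nru_victim(pages))   # "B" or "D"
```

Note the ordering implied by the classes: an unreferenced dirty page (class 1) is preferred over a referenced clean one (class 2), i.e. recency matters more than write-back cost.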

  27. 2nd Chance Page Replacement • Holds a FIFO list of pages • Looks for the oldest page not referenced in last tick. • Repeat while oldest has R=1: • Move to end of list (as if just paged in) • Set R=0 • At worst, all pages have R=1, so degenerates to regular FIFO.
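The "move to end and clear R" loop looks like this (a sketch; the `[name, R]` list entries stand in for real page descriptors):

```python
from collections import deque

def second_chance_evict(pages):
    """pages: deque of [name, R] entries, oldest at the left.
    Returns the name of the evicted page."""
    while True:
        name, r = pages.popleft()
        if r:
            pages.append([name, 0])   # second chance: clear R, treat as newly loaded
        else:
            return name               # oldest page with R=0: evict it

q = deque([["A", 1], ["B", 1], ["C", 0], ["D", 1]])
print(second_chance_evict(q))    # C: A and B are recycled to the end with R=0
print([name for name, _ in q])   # ['D', 'A', 'B']
```

The "Clock" variant keeps the pages in a fixed circular buffer and moves only a hand pointer, avoiding the list shuffling.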

  28. Operation of 2nd chance • pages sorted in FIFO order • Page list when a fault occurs at time 20 and A has its R bit set (the numbers above the pages are loading times)

  29. Least Recently Used (LRU) • Motivation: pages used recently may be used again soon • So: evict the page that has been unused for the longest time • Must keep a linked list of pages • most recently used at the front, least at the rear • this list must be updated on every memory reference!! • Alternatively, keep a counter in each page table entry • choose the page with the lowest counter value • periodically zero the counters
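The ordered-list variant simulates easily (a sketch; real LRU hardware support is what makes the per-reference update feasible):

```python
def lru_faults(requests, frames):
    """LRU with an ordered list: most recently used page at the front."""
    mem, faults = [], 0
    for page in requests:
        if page in mem:
            mem.remove(page)      # hit: just move the page to the front
        else:
            faults += 1
            if len(mem) == frames:
                mem.pop()         # evict the least recently used (rear)
        mem.insert(0, page)       # every reference updates the order
    return faults

reqs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(reqs, 3), lru_faults(reqs, 4))   # 10 8
```

Unlike FIFO, the update on a *hit* is exactly the expensive part: in a real machine it would happen on every memory reference, which is why LRU is approximated rather than implemented exactly.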

  30. Simulating LRU with Aging • A more sophisticated use of the R bit • Each page has a b-bit counter • On each clock interrupt the OS computes: counter(t+1) = 1/2 × counter(t) + R × 2^(b-1) [the same as a right shift, inserting the R bit as the MSB] • After 3 ticks: counter = R3×2^(b-1) + R2×2^(b-2) + R1×2^(b-3) • Aging only remembers b ticks back • The most recent reference has the most weight
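The shift-and-insert update is one line of bit manipulation; here it is applied for three ticks to check the formula above (b = 8 bits, an arbitrary choice):

```python
def age(counter, r_bit, bits=8):
    """One clock tick of the aging counter: shift right, insert R as the MSB."""
    return (counter >> 1) | (r_bit << (bits - 1))

c = 0
for r in [1, 0, 1]:       # R bits observed over three ticks, oldest first
    c = age(c, r)
print(format(c, "08b"))   # 10100000 = 1*2^7 + 0*2^6 + 1*2^5, matching the formula
```

At eviction time the page with the numerically smallest counter is the best LRU approximation: it was referenced least recently and least often within the last b ticks.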

  31. Example: LRU/aging • The hardware only supports a single R bit per page • The aging algorithm simulates LRU in software • It remembers 8 clock ticks back • (example counter values from the figure: 120, 176, 136, 32, 88, 40)

  32. The Working Set • Idea: processes have locality of reference: • in any short “time window” of their execution, they reference only a small fraction of their pages • Working set = the set of pages “currently” in use • W(k,t) = the set of pages used in the last k ticks, at time t • If the working set is in memory for all t → no page faults • If memory is too small to hold the working set → thrashing (a high page-fault rate)

  33. Evolution of the Working Set Size with k • w(k,t) = |W(k,t)| is the size of the working set at time t • w(k,t) grows quickly for small k and then flattens out, so it isn't worthwhile to make k very large

  34. The Working Set Page Replacement Algorithm • Each page has: • an R bit • a time-of-last-use (in virtual time) • The R bit is cleared every clock tick • At page-fault time, loop over the pages: • if R=1 → put the current time in time-of-last-use • if R=0: calculate age = current time − time-of-last-use • if age > τ: not in the working set, evict • The hardware does not set the “time-of-last-use”: it is an approximate value maintained by the OS.
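One pass of that loop can be sketched as follows (the dict fields, the τ value, and evicting the *oldest* candidate are illustrative choices; variants evict the first qualifying page they find):

```python
TAU = 50   # working-set window in virtual-time units (an assumed value)

def ws_victim(pages, now, tau=TAU):
    """One page-fault-time pass of the Working Set algorithm.
    pages: list of dicts with 'name', 'R', 'last_use' (virtual times).
    Returns the oldest page outside the working set, or None."""
    victim = None
    for p in pages:
        if p["R"]:
            p["R"] = 0
            p["last_use"] = now           # referenced recently: refresh the timestamp
        elif now - p["last_use"] > tau:
            if victim is None or p["last_use"] < victim["last_use"]:
                victim = p                # remember the oldest candidate so far
    return victim["name"] if victim else None

pages = [{"name": "A", "R": 1, "last_use": 10},
         {"name": "B", "R": 0, "last_use": 30},
         {"name": "C", "R": 0, "last_use": 5}]
print(ws_victim(pages, now=100))   # C: oldest page with R=0 and age > TAU
```

If the pass returns None, every page is in the working set and the OS must fall back to some other choice (or conclude that memory is too small).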

  35. Working Set Algorithm - Example

  36. The WSClock Algorithm • [Carr and Hennessey, 1981] • Keep pages in a circular list • Each page has R, M, and time-of-last-use • “clock” arm points to a page

  37. WSClock - details At a page fault: • If R=1: set R=0, set the time of use, and continue to the next page • Else if age > τ and M=0 (clean page) → use it • Else if age > τ and M=1 (dirty page) → • schedule a write to disk, and continue to the next page • If the hand goes all the way around: • if a write to disk was scheduled, continue looking for a clean page: eventually a write will complete • else pick some random (preferably clean) page.
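A simplified sweep can be sketched like this. It is a deliberately reduced model: the scheduled write-back is modelled by just clearing M (as if the write had already completed), and the "all the way around" fallback picks the page under the hand rather than a random one.

```python
TAU = 50   # working-set window (an assumed value)

def wsclock_pick(pages, hand, now, tau=TAU):
    """Simplified WSClock sweep over a circular list.
    pages: list of dicts with 'name', 'R', 'M', 'last_use'.
    Returns (victim_name, new_hand)."""
    n = len(pages)
    for _ in range(2 * n):                  # at most two sweeps around the clock
        p = pages[hand]
        hand = (hand + 1) % n
        if p["R"]:
            p["R"] = 0                      # referenced: give it another chance
            p["last_use"] = now
        elif now - p["last_use"] > tau:
            if p["M"] == 0:
                return p["name"], hand      # old and clean: evict it
            p["M"] = 0                      # model: schedule the write; page becomes clean
    return pages[hand]["name"], hand        # fallback: pick some page

pages = [{"name": "A", "R": 1, "M": 0, "last_use": 10},
         {"name": "B", "R": 0, "M": 1, "last_use": 20},
         {"name": "C", "R": 0, "M": 0, "last_use": 30}]
victim, new_hand = wsclock_pick(pages, hand=0, now=100)
print(victim)   # C: B is old but dirty (write scheduled), C is old and clean
```

The circular list is the point of the design: the hand resumes where it stopped, so work done clearing R bits and scheduling writes is not repeated on the next fault.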

  38. WSClock example

  39. Review of Page Replacement Algorithms

  40. Modeling Paging Algorithms

  41. Belady's Anomaly: FIFO • The “P”s in the figure show which page references cause page faults • FIFO with 3 page frames vs. FIFO with 4 page frames

  42. Belady’s anomaly - conclusions • Increasing the number of frames does not always reduce the number of page faults! • Depends on the algorithm. • FIFO suffers from this anomaly.
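The anomaly is easy to reproduce with a self-contained FIFO simulator on the classic reference string (a sketch assuming demand paging):

```python
from collections import deque

def fifo_faults(requests, frames):
    """Count page faults under FIFO replacement."""
    mem, faults = deque(), 0
    for page in requests:
        if page not in mem:
            faults += 1
            if len(mem) == frames:
                mem.popleft()           # evict the oldest page
            mem.append(page)
    return faults

s = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]    # the classic anomaly reference string
print(fifo_faults(s, 3), fifo_faults(s, 4))  # 9 10 -- more frames, MORE faults
```

With 4 frames the early pages survive long enough that the later re-references to 1 and 2 hit at the wrong moment, shifting every subsequent eviction; stack algorithms like LRU (next slides) cannot behave this way.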

  43. Model, showing LRU • State of the memory array, M, after each item in the reference string is processed (figure: frames in “Memory” above the line, pages on “Disk” below)

  44. Details of LRU in the Model • A referenced page is moved to the top of its column • The other pages are pushed down (as in a stack) • If a page is brought in from disk → this models a page fault • Observation: the location of a page in its column is not influenced by the number of frames! • Conclusion: adding frames (pushing the memory/disk boundary line down) cannot cause more page faults • LRU does NOT suffer from Belady's anomaly.

  45. Concepts for review • Page, Frame, Page Table • Multi-level page tables • Cache hit / Cache miss • Thrashing • Effective access time • Modified bit / Referenced bit • Optimal page replacement • Not recently used (NRU) • FIFO page replacement • 2nd Chance / Clock • Least Recently Used (LRU) • Working Set • Working Set page replacement • WSClock • Belady's Anomaly
