Introduction to Systems Programming Lecture 7. Paging.
Reminder: Virtual Memory • Processes use a virtual address space • Every process has its own address space • The address space can be larger than physical memory. • Only part of the virtual address space is mapped to physical memory at any time, using a page table. • The MMU and the OS collaborate to move memory contents to and from disk.
Virtual Address Space Backing Store
Virtual Memory Operation • What happens when a process references a page that is in the backing store? • Recognize the location of the page • Choose a free page frame • Bring the page from disk into memory • These steps need hardware and software cooperation
Steps in Handling a Page Fault [figure] (Avishai Wool, lecture 7)
Where is the non-mapped memory? • Special disk area (“page file” or swap space), e.g. C:\pagefile.sys • Each process has part of its memory inside this file • The mapped pages that are in memory may be more up to date than the disk copies (if they were written to “recently”)
Issues with Virtual Memory • Page table can be very large: • 32-bit addresses, 4KB pages (12-bit offsets) → 2^20 pages == over 1 million pages • Each process needs its own page table • Page lookup has to be very fast: • Instruction in 4ns → page table lookup should be around 1ns • Page fault rate has to be very low.
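The size argument on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the 4-byte entry size is an assumption taken from the multi-level example in this lecture, and the constant names are made up for illustration.

```python
# Sizes from the slide: 32-bit virtual addresses, 4KB pages.
ADDRESS_BITS = 32
PAGE_SIZE = 4 * 1024
ENTRY_BYTES = 4  # assumption: 4 bytes per page table entry

# 4KB = 2^12, so the offset within a page needs 12 bits.
offset_bits = PAGE_SIZE.bit_length() - 1

# The remaining address bits select the page.
num_pages = 2 ** (ADDRESS_BITS - offset_bits)

# A flat (single-level) table needs one entry per page, per process.
flat_table_bytes = num_pages * ENTRY_BYTES

print(offset_bits, num_pages, flat_table_bytes // (1024 * 1024), "MB")
```

Under these assumptions a flat table costs 4MB per process, which is what motivates splitting the table into levels.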
Multi-Level Page Tables • Example: split the 32-bit virtual address into: • 10-bit PT1 field (indexes 1024 entries) • 10-bit PT2 field • 12-bit offset → 4KB pages → 2^20 pages (1 million) • A 1024-entry table, with 4 bytes per entry, is exactly 4KB == one page • But: not all page tables need to be kept in memory!
[Figure: top-level page table and second-level page tables] • MMU does 2 lookups for each virtual → physical mapping • Each top-level entry covers 2^10 x 2^12 = 4MB of memory
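The 10/10/12 split and the two lookups can be sketched as follows. The function names and the use of nested dicts as page tables are illustrative assumptions, not the MMU's actual data layout.

```python
def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (PT1, PT2, offset)."""
    offset = vaddr & 0xFFF          # low 12 bits
    pt2 = (vaddr >> 12) & 0x3FF     # next 10 bits
    pt1 = (vaddr >> 22) & 0x3FF     # top 10 bits
    return pt1, pt2, offset


def translate(vaddr, top_level):
    """Two-level lookup. top_level maps PT1 -> {PT2 -> frame number}.

    A missing entry at either level is treated as a page fault.
    """
    pt1, pt2, offset = split_virtual_address(vaddr)
    second_level = top_level.get(pt1)      # lookup 1: top-level table
    if second_level is None:
        raise LookupError("page fault: no second-level table")
    frame = second_level.get(pt2)          # lookup 2: second-level table
    if frame is None:
        raise LookupError("page fault: page not present")
    return (frame << 12) | offset


# Hypothetical mapping: virtual page (PT1=1, PT2=3) lives in frame 7.
tables = {1: {3: 7}}
print(hex(translate(0x00403004, tables)))
```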
Why Do Multi-Level Page Tables Help? • Usually large parts of the address space are unused. Example (cont.): • Process gets 12MB of address space (out of the possible 4GB) • Program instructions in low 4MB • Data in next 4MB of addresses • Stack in top 4MB of addresses
[Figure: top-level page table and second-level page tables] • Only 4 page tables need to be in memory: • Top-level table • Three second-level tables
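The count of four tables can be reproduced with a small sketch. The helper name and the region list are illustrative assumptions; the region placement follows the example above (code in the low 4MB, data in the next 4MB, stack in the top 4MB).

```python
MB = 1024 * 1024
GB = 1024 * MB
SPAN = 4 * MB  # address range covered by one second-level table


def second_level_tables_needed(regions):
    """Count distinct top-level entries touched by (start, length) regions."""
    used = set()
    for start, length in regions:
        first = start // SPAN
        last = (start + length - 1) // SPAN
        used.update(range(first, last + 1))
    return len(used)


regions = [
    (0, 4 * MB),                 # program instructions
    (4 * MB, 4 * MB),            # data
    (4 * GB - 4 * MB, 4 * MB),   # stack
]
second_level = second_level_tables_needed(regions)
total_tables = 1 + second_level          # plus the top-level table
total_bytes = total_tables * 4 * 1024    # each table is one 4KB page
print(second_level, total_tables, total_bytes)
```

With 4-byte entries this comes to 16KB of page tables, compared with 4MB for a flat table covering the whole 4GB space.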
Page lookup has to be very fast • A memory-to-register copy command: 100000: … 100004: mov bx, *200012 100008: … • References 2 memory addresses: • Fetch instruction (address=100004) • Fetch data (address = 200012)
Paging as a Cache Mechanism • Idea behind all caches: overcome slow access by keeping a few, most popular items in a fast-access area. • Works because some items are much more popular than others. • Page table acts as a cache (items = pages)
Cache Concepts • Cache Hit (Miss): requested item is (not) in the cache • In paging terminology, “miss” == page fault • Thrashing: lots of cache misses • Effective Access Time: Prob(hit)*hit_time + Prob(miss)*miss_time
Example: Cost of Page Faults • Memory cycle: 10ns = 10*10^-9 sec • Disk access: 10ms = 10*10^-3 sec • Hit rate 98%, page-fault rate = 2% • Effective access time = 0.98 * 10*10^-9 + 0.02 * 10*10^-3 = 2.000098*10^-4 sec (about 0.2ms, roughly 20,000 times slower than a memory cycle) • This shows that the real page-fault rate must be a lot smaller than 2%!
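The calculation can be reproduced, and turned around to ask how low the fault rate must be. This is a minimal sketch; the function names and the 10% slowdown target are assumptions chosen for illustration.

```python
def effective_access_time(hit_rate, hit_time, miss_time):
    """Prob(hit)*hit_time + Prob(miss)*miss_time, all times in seconds."""
    return hit_rate * hit_time + (1.0 - hit_rate) * miss_time


def max_fault_rate(hit_time, miss_time, slowdown):
    """Largest fault rate p keeping the effective time within
    (1 + slowdown) * hit_time.

    Solves (1 - p)*hit_time + p*miss_time = (1 + slowdown)*hit_time.
    """
    return slowdown * hit_time / (miss_time - hit_time)


MEMORY_CYCLE = 10e-9   # 10ns
DISK_ACCESS = 10e-3    # 10ms

eat = effective_access_time(0.98, MEMORY_CYCLE, DISK_ACCESS)
limit = max_fault_rate(MEMORY_CYCLE, DISK_ACCESS, slowdown=0.10)
print(eat, limit)
```

Under these numbers, keeping the slowdown within 10% requires a fault rate of about one in ten million references.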
Page Replacement Algorithms Which page to evict?
Page Table Entry Fields • Present = 1 → page is in a frame • Protection: usually 3 bits, “RWX” • R = 1 → read permission • W = 1 → write permission • X = 1 → execute permission (contains instructions) • Modified = 1 → page was written to since being copied in from disk. Also called the dirty bit. • Referenced = 1 → page was referenced (read or written) “recently”
The Modified and Referenced bits • Bits are updated by the MMU • Give hints for OS page fault processing, when it chooses which page to evict: • If Modified = 0 then no need to copy the page back to disk; the disk and memory copies are the same. • So evicting such a page is cheaper • Better to evict pages with Referenced = 0: not used recently → maybe won’t be used in the near future.
Page Replacement Algorithms • Page fault forces a choice • OS must pick a page to be removed to make room for the incoming page • If the evicted page is modified → page must first be saved • If not modified: can just overwrite it • Try not to evict a “popular” page • It will probably need to be brought back in soon
Optimal Page Replacement Algorithm • Replace the page needed at the farthest point in the future • Suppose memory can hold 8 pages, and the list of page requests is • 1,2,3,4,5,6,7,8,9,8,9,8,7,6,5,4,3,2,1 • The request for page 9 causes a page fault → evict page 1 (its next use is the farthest away)
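The example can be replayed with a small simulation. This is a sketch only: it works because the simulator is handed the whole request list in advance, which a real OS never has. The function name is an assumption.

```python
def optimal_faults(requests, frames):
    """Simulate optimal replacement; return (fault count, evicted pages)."""
    memory = set()
    faults = 0
    evictions = []
    for i, page in enumerate(requests):
        if page in memory:
            continue
        faults += 1
        if len(memory) >= frames:
            future = requests[i + 1:]

            def next_use(p):
                # Distance to the next request for p; never used again = infinity.
                return future.index(p) if p in future else float("inf")

            victim = max(memory, key=next_use)
            memory.remove(victim)
            evictions.append(victim)
        memory.add(page)
    return faults, evictions


requests = [1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 9, 8, 7, 6, 5, 4, 3, 2, 1]
faults, evictions = optimal_faults(requests, frames=8)
print(faults, evictions[0])
```

The first eviction happens at the request for page 9 and removes page 1, matching the slide.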
Optimal not realizable • OS does not know which pages will be requested in the future. • Try to approximate the optimal algorithm. • Use the Reference & Modified bits as hints: • Bits are set when page is referenced, modified • Hardware sets the bits on every memory reference
FIFO Page Replacement Algorithm • Maintain a linked list of all pages • in order they came into memory (ignore M & R bits) • Page at beginning of list replaced • Disadvantage • page in memory the longest time may be very popular
Not Recently Used (NRU) • At process start, all M & R bits are set to 0 • Each clock tick (interrupt), R bits are set to 0. • Pages are classified • Class 0: not referenced, not modified • Class 1: not referenced, modified (possible, because the clock tick clears R but not M) • Class 2: referenced, not modified • Class 3: referenced, modified • NRU removes a page at random • from the lowest-numbered non-empty class
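A minimal sketch of the NRU choice, assuming each page is described by its (R, M) bits. The class number is computed as 2*R + M, so that not referenced and not modified is the lowest class and referenced and modified is the highest; the function name and the dict layout are illustrative assumptions.

```python
import random


def nru_victim(pages, rng=random):
    """pages maps page name -> (R, M). Returns the page to evict."""
    classes = {0: [], 1: [], 2: [], 3: []}
    for page, (r, m) in pages.items():
        classes[2 * r + m].append(page)
    # Scan classes from lowest to highest; pick at random inside the first non-empty one.
    for number in range(4):
        if classes[number]:
            return rng.choice(classes[number])
    raise ValueError("no pages to evict")


# B is the only page that is not referenced (but modified), so it is chosen.
print(nru_victim({"A": (1, 1), "B": (0, 1), "C": (1, 0)}))
```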
2nd Chance Page Replacement • Holds a FIFO list of pages • Looks for the oldest page not referenced in last tick. • Repeat while oldest has R=1: • Move to end of list (as if just paged in) • Set R=0 • At worst, all pages have R=1, so degenerates to regular FIFO.
Operation of 2nd Chance [figure] • Pages sorted in FIFO order • Page list if a fault occurs at time 20 and A has its R bit set (numbers above pages are loading times)
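The loop can be sketched with a queue. Pages A to H follow the figure's naming; the R bits of every page other than A are assumed to be 0 here, since they are not given in the text, and the function name is hypothetical.

```python
from collections import deque


def second_chance_victim(queue, r_bits):
    """queue: deque of pages, oldest first. r_bits: page -> 0/1.

    Pages with R=1 get their bit cleared and move to the back,
    as if they had just been paged in. Returns the evicted page.
    """
    while True:
        page = queue.popleft()
        if r_bits[page]:
            r_bits[page] = 0
            queue.append(page)
        else:
            return page


queue = deque("ABCDEFGH")
r_bits = {page: 0 for page in queue}
r_bits["A"] = 1          # A was referenced, so it gets a second chance
victim = second_chance_victim(queue, r_bits)
print(victim, "".join(queue))
```

If every page has R=1, the loop clears all bits in one pass and then evicts the original oldest page, which is the FIFO choice.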
Least Recently Used (LRU) • Motivation: pages used recently may be used again soon • So: evict the page that has been unused for the longest time • Must keep a linked list of pages • most recently used at front, least recently used at rear • update this list on every memory reference! • Alternatively, keep a counter in each page table entry • choose the page with the lowest counter value • periodically zero the counter
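A minimal software simulation of the list-based version, using Python's `OrderedDict` as the recency list (least recently used at the front of the dict, most recently used at the end). This only illustrates the policy; the cost of doing this on every memory reference is exactly why it is impractical in a real system. The function name and the short request string are assumptions.

```python
from collections import OrderedDict


def lru_faults(requests, frames):
    """Count page faults under exact LRU with the given number of frames."""
    memory = OrderedDict()
    faults = 0
    for page in requests:
        if page in memory:
            memory.move_to_end(page)        # now the most recently used
        else:
            faults += 1
            if len(memory) >= frames:
                memory.popitem(last=False)  # evict the least recently used
            memory[page] = True
    return faults


print(lru_faults([1, 2, 3, 1, 4, 2], frames=3))
```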
Simulating LRU with Aging • A more sophisticated use of the R bit • Each page has a b-bit counter • Each clock interrupt the OS does: counter(t+1) = 1/2*counter(t) + R*2^(b-1) [same as right-shift & insert R bit as MSB] • After 3 ticks: counter = R3*2^(b-1) + R2*2^(b-2) + R1*2^(b-3) • Aging only remembers b ticks back • Most recent reference has more weight
Example: LRU/aging • Hardware only supports a single R bit per page • The aging algorithm simulates LRU in software • Remembers 8 clock ticks back • [Figure: 8-bit aging counters; counter values shown: 120, 176, 136, 32, 88, 40]
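The per-tick update is a shift and an OR. The sketch below uses made-up R-bit histories (not the values from the figure) to show the weighting; the function names are assumptions.

```python
def age_tick(counter, r_bit, bits=8):
    """One clock tick: shift right, insert the R bit as the new MSB."""
    return (counter >> 1) | (r_bit << (bits - 1))


def run_ticks(r_history, bits=8):
    """Apply age_tick for a sequence of R bits, oldest tick first."""
    counter = 0
    for r_bit in r_history:
        counter = age_tick(counter, r_bit, bits)
    return counter


# Both pages were referenced in two of three ticks, but the page
# referenced in the latest tick ends up with the larger counter.
print(run_ticks([1, 0, 1]), run_ticks([1, 1, 0]))

# A reference is forgotten after b = 8 further ticks.
print(run_ticks([1] + [0] * 8))
```

On a page fault the OS would evict the page with the lowest counter.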
The Working Set • Idea: processes have locality of reference: • At any short “time window” in their execution, they reference only a small fraction of the pages. • Working set = set of pages “currently” in use. • W(k,t) = set of pages used in the last k ticks at time t • If the working set is in memory for all t → no page faults. • If memory is too small to hold the working set → thrashing (high page fault rate).
Evolution of the Working Set Size with k [figure: w(k,t) plotted against k] • w(k,t) is the size of the working set at time t • It isn’t worthwhile to keep k very large
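W(k,t) and its size can be computed directly from a reference trace. This sketch assumes one page reference per tick and an invented trace; the function name is hypothetical.

```python
def working_set(references, k, t):
    """Pages referenced in the last k ticks ending at tick t (inclusive).

    references[i] is the page referenced at tick i (assumption: one per tick).
    """
    start = max(0, t - k + 1)
    return set(references[start:t + 1])


trace = [1, 2, 1, 3, 1, 2, 4, 4, 4, 4]
print(working_set(trace, k=4, t=5))

# The size never shrinks as k grows, and it stops growing once the
# window covers every page the process has touched.
sizes = [len(working_set(trace, k, t=9)) for k in range(1, 11)]
print(sizes)
```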
The Working Set Page Replacement Algorithm • Each page has • R bit • Time-of-last-use (in virtual time) • R bit cleared every clock tick • At page fault time, loop over pages • If R=1 put current time in time-of-last-use • If R=0: calculate age = current-time - time-of-last-use • If age > τ : not in W.S., evict • Hardware does not set the “time-of-last-use”: it’s an approximate value set by OS.
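The page-fault scan can be sketched as below. The slide does not say what to do when no page is older than τ; falling back to the oldest page with R=0 is an assumption made here, as are the function name and the dict layout.

```python
def ws_choose_victim(pages, now, tau):
    """pages: name -> {"R": 0/1, "last_use": virtual time}.

    Returns the page to evict, or None if every page has R=1.
    """
    victim = None
    oldest = None
    for name, entry in pages.items():
        if entry["R"]:
            entry["last_use"] = now          # in use during this tick
            continue
        age = now - entry["last_use"]
        if age > tau and victim is None:
            victim = name                    # outside the working set
        if oldest is None or entry["last_use"] < pages[oldest]["last_use"]:
            oldest = name                    # fallback candidate (assumption)
    return victim if victim is not None else oldest


pages = {
    "A": {"R": 1, "last_use": 10},
    "B": {"R": 0, "last_use": 90},
    "C": {"R": 0, "last_use": 40},
}
print(ws_choose_victim(pages, now=100, tau=50))
```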
The WSClock Algorithm • [Carr and Hennessy, 1981] • Keep pages in a circular list • Each page has R, M, and time-of-last-use • “clock” arm points to a page
WSClock - details At page fault: • If R=1, set R=0, set time-of-use, continue to next page • Elseif age > τ and M=0 (clean page) use it • Elseif age > τ and M=1 (dirty page) • Schedule a write to disk, and continue to next page • If hand goes all the way around: • If a write to disk was scheduled, continue looking for a clean page: eventually a write will complete • Else pick some random (preferably clean) page.
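A sketch of one sweep of the clock hand. There is no real disk here, so a "scheduled write" is just recorded in a list, and after a full circle the sketch treats the first scheduled write as already completed; both simplifications, the function name, and the list-of-dicts layout are assumptions.

```python
import random


def wsclock_victim(pages, hand, now, tau, rng=random):
    """pages: list of {"R", "M", "last_use"} dicts in circular order.

    Returns (victim index, new hand position, indices scheduled for write).
    """
    n = len(pages)
    scheduled = []
    for step in range(n):
        i = (hand + step) % n
        page = pages[i]
        if page["R"]:
            page["R"] = 0
            page["last_use"] = now
        elif now - page["last_use"] > tau:
            if not page["M"]:
                return i, (i + 1) % n, scheduled   # old and clean: use it
            scheduled.append(i)                     # old and dirty: write it out
    # The hand went all the way around.
    if scheduled:
        i = scheduled[0]
        pages[i]["M"] = 0      # assumption: this write has completed by now
        return i, (i + 1) % n, scheduled
    clean = [i for i, page in enumerate(pages) if not page["M"]]
    i = rng.choice(clean) if clean else rng.randrange(n)
    return i, (i + 1) % n, scheduled


pages = [
    {"R": 1, "M": 0, "last_use": 0},
    {"R": 0, "M": 1, "last_use": 10},
    {"R": 0, "M": 0, "last_use": 20},
    {"R": 0, "M": 0, "last_use": 95},
]
victim, hand, writes = wsclock_victim(pages, hand=0, now=100, tau=50)
print(victim, hand, writes)
```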
Belady's Anomaly: FIFO [figure: FIFO with 3 page frames, and FIFO with 4 page frames] • P's show which page references cause page faults
Belady’s anomaly - conclusions • Increasing the number of frames does not always reduce the number of page faults! • Depends on the algorithm. • FIFO suffers from this anomaly.
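The anomaly is easy to reproduce in a simulation. The reference string below is the one commonly used in textbooks to demonstrate it; that it matches the figure in this lecture is an assumption, since the figure's string is not in the text. The function name is hypothetical.

```python
from collections import deque


def fifo_faults(requests, frames):
    """Count page faults under FIFO replacement."""
    queue = deque()     # pages in load order, oldest at the left
    resident = set()
    faults = 0
    for page in requests:
        if page in resident:
            continue
        faults += 1
        if len(queue) >= frames:
            resident.remove(queue.popleft())   # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults


reference_string = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(reference_string, 3), fifo_faults(reference_string, 4))
```

With this string, FIFO takes more page faults with 4 frames than with 3.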
Model, showing LRU [figure: state of the memory array, M, after each item in the reference string is processed; the upper part of each column is in memory, the lower part is on disk]
Details of LRU in Model • Page is referenced → moved to top of column. • Other pages are pushed down (as in a stack). • If it is brought from disk → models a page fault. • Observation: location of page in column is not influenced by the number of frames! • Conclusion: adding frames (pushing the line down) cannot cause more page faults. • LRU does NOT suffer from Belady’s anomaly.
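The column model can be sketched as a single stack shared by all frame counts. A reference faults with n frames exactly when the page sits deeper than position n in the stack (or is not in it yet), so one pass gives the fault count for every n. The reference string is an illustrative assumption (the same textbook string that trips up FIFO), and the function names are hypothetical.

```python
def lru_stack_distances(requests):
    """For each reference, the 1-based depth of the page in the LRU stack
    before the reference (infinity for a first-time reference)."""
    stack = []          # index 0 is the top of the column
    distances = []
    for page in requests:
        if page in stack:
            depth = stack.index(page) + 1
            stack.remove(page)
        else:
            depth = float("inf")
        stack.insert(0, page)   # referenced page moves to the top
        distances.append(depth)
    return distances


def faults_with(distances, frames):
    """A reference faults when the page is below the memory/disk line."""
    return sum(1 for depth in distances if depth > frames)


reference_string = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
distances = lru_stack_distances(reference_string)
print([faults_with(distances, n) for n in (3, 4, 5)])
```

Because the distances do not depend on the number of frames, raising n can only remove faults from the count, never add them.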
Concepts for review • Page, Frame, Page Table • Multi-level page tables • Cache hit / Cache miss • Thrashing • Effective access time • Modified bit / Referenced bit • Optimal page replacement • Not recently used (NRU) • FIFO page replacement • 2nd Chance / Clock • Least Recently Used (LRU) • Working Set • Working Set page replacement • WSClock • Belady's Anomaly