Paging Algorithms
Vivek Pai / Kai Li
Princeton University

Virtual Memory Gedankenexperiment
• Assume memory costs $20 per 256MB
• What does it cost to fill a 32-bit system?
• What does it cost to fill a 64-bit system?
• What about at $1 per 256MB?
• What implications does this have for the design of virtual-to-physical translation when using 64-bit address spaces?
  • (hint: think hierarchical page tables)

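The arithmetic behind these questions can be checked with a few lines of Python. This is a minimal sketch: `fill_cost` and `CHUNK_BYTES` are names made up for illustration, and "fill" is assumed to mean buying one byte of RAM for every address in a byte-addressed space.

```python
# Back-of-the-envelope cost of fully populating an address space.
CHUNK_BYTES = 256 * 2**20  # 256MB, the unit the slide puts a price on


def fill_cost(address_bits: int, dollars_per_chunk: int) -> int:
    """Dollars needed to buy 2**address_bits bytes of memory."""
    chunks = 2**address_bits // CHUNK_BYTES
    return chunks * dollars_per_chunk


for bits in (32, 64):
    for price in (20, 1):
        print(f"{bits}-bit at ${price}/256MB: ${fill_cost(bits, price):,}")
```

At $20 per 256MB this gives $320 for 32 bits and roughly $1.4 trillion for 64 bits. A 64-bit space will not be fully backed by RAM at these prices, so its translation structures only need to cover the parts actually in use.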
Memory Hierarchy Revisited
[Figure: hierarchy, in order: CPU, TLB, L1, L2, Main Memory, Devices]
• What does this imply about L1 addresses?
• Where do we hope requests get satisfied?

Memory Hierarchy Re-Revisited
[Figure: hierarchy, in order: CPU, L1, TLB, L2, Main Memory, Devices]
• Now what does this imply about L1 addresses?
• Any speed benefits?
• Any drawbacks?

Definitions
• Paging – moving pages to (from) disk
  ex: paging begins five minutes into the test
• Pressure – the demand for some resource (often used when demand exceeds supply)
  ex: the system experienced memory pressure
• Optimal – the best (theoretical) strategy
• Eviction – throwing something out
  ex: cache lines and memory pages got evicted
• Pollution – bringing in useless pages/lines
  ex: this strategy causes high cache pollution

Big Picture
[Figure: page-fault handling; labels: ref, Page table, VM fault, Load, page i, memory M, Free frame]

Really Big Picture
• Every “page-in” requires an eviction
  • Hopefully, kick out a less-useful page
  • Dirty pages require writing, clean pages don’t
  • Where do you write? To “swap space”
• Goal: kick out the page that’s least useful
• Problem: how do you determine utility?
  • Heuristic: temporal locality exists
  • Kick out pages that aren’t likely to be used again

More definitions
• Thrashing / Flailing – extremely high rate of paging, usually induced by other decisions
• Dirty/Clean – indicates whether modifications have been made versus the copy on stable storage
• Heuristic – set of rules to use when no good rigorous answer exists
• Temporal – in time
• Spatial – in space (location)
• Locality – re-use – it makes the world go round

What Makes This Hard?
• Perfect reference stream hard to get
  • Every memory access would need bookkeeping
• Imperfect information available, cheaply
  • Play around with PTE permissions, info
• Overhead is a bad idea
  • If no memory pressure, ideally no bookkeeping
  • In other words, make the common case fast

Steps in Paging
• Data structures
  • A list of unused page frames
  • Data structure to map a frame to its pid/virtual address
• On a page fault
  • Get an unused frame or a used frame
  • If the frame is used
    • If it has been modified, write it to disk
    • Invalidate its current PTE and TLB entry
  • Load the new page from disk
  • Update the faulting PTE and invalidate its TLB entry
  • Restart the faulting instruction

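The fault-handling steps can be sketched as a toy simulation. Everything here is an assumption made for illustration: a single process (so no pid), a page table held in a dict, a TLB modeled as a set of cached virtual pages, and disk I/O recorded in a list instead of being performed. `handle_page_fault` and its parameters are hypothetical names.

```python
def handle_page_fault(vpage, page_table, free_frames, frame_owner, tlb,
                      choose_victim, disk_log):
    """Toy page-fault handler; returns the frame now holding vpage.

    page_table:    dict vpage -> {"frame": int, "valid": bool, "dirty": bool}
    free_frames:   list of unused frame numbers
    frame_owner:   dict frame -> vpage currently stored in that frame
    tlb:           set of vpages with a cached translation
    choose_victim: callable(frame_owner) -> frame to reuse (the policy)
    disk_log:      list that collects the simulated disk operations
    """
    if free_frames:
        frame = free_frames.pop()            # get an unused frame
    else:
        frame = choose_victim(frame_owner)   # or take a used one
        old = frame_owner[frame]
        if page_table[old]["dirty"]:
            disk_log.append(("write", old))  # modified: write it to disk
        page_table[old]["valid"] = False     # invalidate its PTE ...
        tlb.discard(old)                     # ... and its TLB entry
    disk_log.append(("read", vpage))         # load the new page from disk
    page_table[vpage] = {"frame": frame, "valid": True, "dirty": False}
    tlb.discard(vpage)                       # drop any stale translation
    frame_owner[frame] = vpage
    return frame                             # caller restarts the instruction
```

The replacement policy is passed in as `choose_victim`, so any of the algorithms on the following slides could be plugged in without changing the handler.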
Optimal or MIN
• Algorithm
  • Replace the page that won’t be used for the longest time
• Pros
  • Minimal page faults
  • This is an off-line algorithm for performance analysis
• Cons
  • No on-line implementation
• Also called Belady’s Algorithm

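Because MIN needs the future, it can only be run over a recorded trace. The following is a minimal off-line sketch that counts faults for a given reference list; `opt_faults` is a hypothetical name, and the scan for the next use is deliberately simple rather than efficient.

```python
def opt_faults(refs, num_frames):
    """Count page faults under the MIN/OPT policy for a known trace."""
    refs = list(refs)
    frames = set()
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) == num_frames:
            future = refs[i + 1:]

            def next_use(p):
                # Pages never referenced again sort as "furthest away".
                return future.index(p) if p in future else len(future)

            # Evict the resident page whose next use is furthest in the future.
            frames.remove(max(frames, key=next_use))
        frames.add(page)
    return faults
```

For example, `opt_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3)` gives a lower bound that any on-line policy's fault count on the same trace can be compared against.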
Not Recently Used (NRU)
• Algorithm
  • Pick a page at random from the first non-empty class below (in order)
    • Not referenced and not modified
    • Not referenced and modified
    • Referenced and not modified
    • Referenced and modified
• Pros
  • Easy to implement
• Cons
  • Not very good performance, takes time to classify

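The class ordering can be encoded as a two-bit number, with the referenced bit as the high bit. This is a small sketch under that assumption; `nru_victim` and the dict layout are made up for illustration.

```python
import random


def nru_victim(pages, rng=random):
    """Pick an NRU victim.

    pages: dict mapping page -> (referenced, modified), each bit 0 or 1.
    """
    classes = {0: [], 1: [], 2: [], 3: []}
    for page, (referenced, modified) in pages.items():
        # Class number = 2*R + M, so class 0 is (not referenced, not modified).
        classes[2 * referenced + modified].append(page)
    for cls in range(4):
        if classes[cls]:
            return rng.choice(classes[cls])  # random pick within the class
    raise ValueError("no resident pages")
```

Note that every call classifies all pages, which is the "takes time to classify" cost from the slide.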
First-In-First-Out (FIFO)
[Figure: FIFO queue of pages 5, 3, 4, 7, 9, 11, 2, 1, 15, with one end labeled "Recently loaded" and the other "Page out"]
• Algorithm
  • Throw out the oldest page
• Pros
  • Low-overhead implementation
• Cons
  • May replace the heavily used pages

FIFO with Second Chance
[Figure: FIFO queue of pages 5, 3, 4, 7, 9, 11, 2, 1, 15, with ends labeled "Recently loaded" and "Page out"; an arrow labeled "If reference bit is 1" sends the oldest page back to the recently loaded end]
• Algorithm
  • Check the reference bit of the oldest page
  • If it is 0, then replace it
  • If it is 1, clear the reference bit, move it to the end of the list, and continue searching
• Pros
  • Fast and does not replace a heavily used page
• Cons
  • The worst case may take a long time

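The search loop can be sketched with a `collections.deque`. The representation is an assumption for illustration: the oldest page sits at the left of the deque, and reference bits live in a separate dict that hardware would normally set on access.

```python
from collections import deque


def second_chance_victim(queue, ref_bit):
    """Evict and return one page; mutates both arguments.

    queue:   non-empty deque of resident pages, oldest at the left
    ref_bit: dict page -> 0 or 1
    """
    while True:
        page = queue.popleft()        # inspect the oldest page
        if ref_bit[page]:
            ref_bit[page] = 0         # clear the bit ...
            queue.append(page)        # ... and treat it as newly loaded
        else:
            del ref_bit[page]
            return page               # bit was 0: replace this page
```

The worst case from the slide shows up when every bit is 1: the loop goes once around the whole queue clearing bits before it evicts the page it started with.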
Clock: A Simple FIFO with 2nd Chance
[Figure: pages arranged in a circle, with a hand pointing to the oldest page]
• FIFO clock algorithm
  • Hand points to the oldest page
  • On a page fault, follow the hand to inspect pages
• Second chance
  • If the reference bit is 1, set it to 0 and advance the hand
  • If the reference bit is 0, use it for replacement
• What is the difference between Clock and the previous one?

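A minimal sketch of the clock, assuming a fixed array of frames and a software-simulated reference bit that is set on every access (including the access that loads the page). The class name and fields are hypothetical.

```python
class Clock:
    """Toy clock replacement over a fixed number of frames."""

    def __init__(self, num_frames):
        self.pages = [None] * num_frames  # page held by each frame
        self.ref = [0] * num_frames       # reference bit of each frame
        self.where = {}                   # page -> frame index
        self.hand = 0                     # points at the oldest page

    def access(self, page):
        """Reference a page; return True if this caused a page fault."""
        if page in self.where:
            self.ref[self.where[page]] = 1
            return False
        # Follow the hand: frames with bit 1 get a second chance.
        while self.ref[self.hand]:
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.pages)
        victim = self.pages[self.hand]    # None while frames are still free
        if victim is not None:
            del self.where[victim]
        self.pages[self.hand] = page
        self.ref[self.hand] = 1
        self.where[page] = self.hand
        self.hand = (self.hand + 1) % len(self.pages)
        return True
```

In this sketch the policy matches list-based second chance; the difference is that pages stay in their frames and only the hand moves, so nothing is relinked when a page gets its second chance.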
Enhanced FIFO with 2nd-Chance Algorithm
• Same as the basic FIFO with 2nd chance, except that this method considers both reference bit and modified bit
  • (0,0): neither recently used nor modified
  • (0,1): not recently used but modified
  • (1,0): recently used but clean
  • (1,1): recently used and modified
• Pros
  • Avoid write back
• Cons
  • More complicated

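One common way to scan with both bits is a multi-pass sweep: first look for a (0,0) frame without touching any bits, then look for (0,1) while clearing reference bits, and repeat. The sketch below assumes that scheme, which may not be exactly the variant the slide has in mind; `enhanced_victim` and the frame layout are hypothetical.

```python
def enhanced_victim(frames, hand):
    """Return the index of the frame to replace.

    frames: non-empty list of dicts with "ref" and "mod" bits (0 or 1)
    hand:   index at which the scan starts
    Reference bits are cleared as a side effect of the second pass.
    """
    n = len(frames)
    while True:
        # Pass 1: prefer a clean, unreferenced frame; leave bits untouched.
        for step in range(n):
            i = (hand + step) % n
            if (frames[i]["ref"], frames[i]["mod"]) == (0, 0):
                return i
        # Pass 2: accept an unreferenced dirty frame; clear bits on the way.
        for step in range(n):
            i = (hand + step) % n
            if (frames[i]["ref"], frames[i]["mod"]) == (0, 1):
                return i
            frames[i]["ref"] = 0
        # Every reference bit is now 0, so the next round must succeed.
```

Preferring (0,0) over (0,1) is what avoids the write-back: a clean victim can be dropped without any disk write.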
More Page Frames → Fewer Page Faults?
• Consider the following reference string with 4 page frames
  • FIFO replacement
  • 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • 10 page faults
• Consider the same reference string with 3 page frames
  • FIFO replacement
  • 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
  • 9 page faults!
• This is called Belady’s anomaly

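The two fault counts can be reproduced with a short FIFO simulation. This is a minimal sketch; `fifo_faults` is a hypothetical helper, not part of the original slides.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement."""
    queue = deque()    # resident pages, oldest at the left
    resident = set()   # same pages, for fast membership tests
    faults = 0
    for page in refs:
        if page in resident:
            continue   # a hit does not change FIFO order
        faults += 1
        if len(queue) == num_frames:
            resident.remove(queue.popleft())  # throw out the oldest page
        queue.append(page)
        resident.add(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for n in (3, 4):
    print(f"{n} frames: {fifo_faults(refs, n)} faults")
```

Running it on the reference string from the slide gives 9 faults with 3 frames and 10 with 4.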
Least Recently Used (LRU)
• Algorithm
  • Replace the page that hasn’t been used for the longest time
• Question
  • What hardware mechanisms are required to implement LRU?

Implement LRU
• Perfect
  • Use a timestamp on each reference
  • Keep a list of pages ordered by time of reference
[Figure: list of pages 5, 3, 4, 7, 9, 11, 2, 1, 15, with ends labeled "Least recently used" and "Most recently used"]

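In software, the ordered list can be modeled with `collections.OrderedDict`. This is only a sketch of the bookkeeping that perfect LRU implies, run over a recorded trace; real hardware does not update a list on every reference. `lru_faults` is a hypothetical name.

```python
from collections import OrderedDict


def lru_faults(refs, num_frames):
    """Count page faults under exact LRU replacement."""
    recency = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in refs:
        if page in recency:
            recency.move_to_end(page)    # every hit reorders the list
            continue
        faults += 1
        if len(recency) == num_frames:
            recency.popitem(last=False)  # evict the least recently used
        recency[page] = True
    return faults
```

The `move_to_end` call on every hit is the per-reference work that makes perfect LRU expensive.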
Approximate LRU
[Figure: pages arranged from most recently used to least recently used, at three granularities]
• LRU: N categories
  • Pages in order of last reference
• Crude LRU: 2 categories
  • Pages referenced since the last page fault
  • Pages not referenced since the last page fault
• 8-bit count: 256 categories
  • 0, 1, 2, 3, ..., 254, 255

Aging: Not Frequently Used (NFU)
[Figure: 8-bit counters, one row per page, one column per successive clock tick]
00000000 00000000 10000000 01000000 10100000
00000000 10000000 01000000 10100000 01010000
10000000 11000000 11100000 01110000 00111000
00000000 00000000 00000000 10000000 01000000
• Algorithm
  • Shift reference bits into counters
  • Pick the page with the smallest counter
• Main difference between NFU and LRU?
  • NFU has a short history (counter length)
• How many bits are enough?
  • In practice 8 bits are quite good
• Pros: Requires one reference bit
• Cons: Requires looking at all counters

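The counter update can be sketched as follows, assuming a periodic tick at which each page's reference bit is sampled and then cleared. `age` and `aging_victim` are hypothetical names, and the set of referenced pages stands in for the hardware bits.

```python
def age(counters, referenced, bits=8):
    """Apply one clock tick to every counter, in place.

    counters:   dict page -> int counter
    referenced: set of pages touched since the previous tick
    """
    top = 1 << (bits - 1)
    for page in counters:
        counters[page] >>= 1          # older history moves right
        if page in referenced:
            counters[page] |= top     # newest reference enters at the left


def aging_victim(counters):
    """Return the page with the smallest counter (scans all counters)."""
    return min(counters, key=counters.get)
```

A page referenced on ticks 1 and 3 but not on tick 2 ends up with counter 10100000, the same pattern that appears in the figure. A reference more than `bits` ticks old has been shifted out entirely, which is the "short history" the slide mentions.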
Where Do We Get Storage?
• 32-bit VA to 32-bit PA – no space, right?
• Offset within page is the same
  • No need to store offset
• 4KB page = 12 bits of offset
  • Those 12 bits are “free” in PTE
• Page # + other info <= 32 bits
  • Makes storing info easy
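The bit budget can be illustrated by packing a frame number and a few flags into one 32-bit word. The flag names and bit positions below are assumptions chosen for the example, not the layout of any particular architecture.

```python
PAGE_SHIFT = 12                       # 4KB page -> 12 offset bits
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # low 12 bits

# Hypothetical flag positions inside the 12 "free" bits of the PTE.
PTE_VALID = 1 << 0
PTE_DIRTY = 1 << 1
PTE_REFERENCED = 1 << 2


def make_pte(frame_number, flags):
    """Pack a 20-bit physical frame number and flag bits into 32 bits."""
    return (frame_number << PAGE_SHIFT) | flags


def translate(pte, vaddr):
    """Combine the frame from the PTE with the offset from the address."""
    if not pte & PTE_VALID:
        raise LookupError("page fault")
    frame_base = pte & ~OFFSET_MASK   # strip the flag bits
    return frame_base | (vaddr & OFFSET_MASK)
```

For instance, `translate(make_pte(0xABCDE, PTE_VALID), 0x12345678)` keeps the offset `0x678` from the virtual address and takes the upper 20 bits from the PTE.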