Practical, transparent operating system support for superpages. Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox. OSDI 2002.
Introduction • This paper addresses the issue of OS-level support for superpages • A superpage is a page that is larger than the hardware base page. • Superpage capability must be present in the hardware – cannot be software-based.
Background • Page-based virtual memory • Page tables • Translation lookaside buffers (TLB) • Purpose • The problem with TLBs
Background Summary • Virtual memory automates the movement of a process’s address space (code and data) between disk and primary memory. • Virtual addresses must be translated to physical addresses using information stored in the page table. • Page tables are stored in primary memory.
Page Tables and Address Translation • Extra memory references due to page table lookups degrade performance • TLB (translation lookaside buffer) – faster memory; caches portions of the page table • If most memory references “hit” in the TLB, the overhead of address translation is acceptable. • TLB coverage: the amount of memory that can be accessed strictly through TLB entries.
TLB Coverage – the Problem • Computer memories have increased in size faster than TLBs. • TLB coverage as a percentage of total memory has decreased over the years. • At the time this paper was written, most TLBs covered a megabyte or less of physical memory • Many applications have working sets that are not completely covered by the TLB • Result: more TLB misses, poorer performance.
The Situation • Problem: Reduced TLB coverage makes virtual memory systems less efficient. • A solution: Increase page size • Each TLB entry represents one page • Increasing page size increases TLB coverage • But …pages that are too large are inefficient: • Weaken locality and waste storage • Increase fragmentation
Superpages • Compromise: hardware supports more than one page size • Ordinary pages = base pages • Superpage: a power-of-2 multiple of the base page size • One superpage maps several base pages. • Let base page = 4KB and superpage = 64KB. Using superpages, TLB coverage is up to 16 times greater with no increase in TLB size
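The 16x figure follows from coverage = number of TLB entries × page size. A minimal sketch of that arithmetic, assuming a hypothetical 128-entry TLB (the entry count is an illustration, not a figure from the paper):

```python
KB = 1024
ENTRIES = 128  # hypothetical TLB size, chosen only for illustration


def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes of memory reachable through TLB entries alone."""
    return entries * page_size


base = tlb_coverage(ENTRIES, 4 * KB)    # every entry maps a 4KB base page
best = tlb_coverage(ENTRIES, 64 * KB)   # every entry maps a 64KB superpage
print(base // KB, "KB ->", best // KB, "KB, ratio", best // base)
```

The ratio is "up to" 16 because it is reached only when every TLB entry holds a superpage mapping.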
Hardware Support for Superpages • Superpage capability is common in modern computers, but it isn’t well supported by the OS. • Most modern computers provide several different page sizes • e.g. 64KB, 512KB, 4MB for an Alpha processor whose base page size is 8KB (authors’ example & testbed) • Although implemented for the Alpha chip, the design presented in this paper is general.
Why Several Page Sizes? • Large page sizes reduce the size of the page table, increase TLB coverage, optimize I/O time. • But … they can also greatly increase the memory requirements of a process • Some pages are only partially filled • Small localities = a kind of internal fragmentation (page only partially referenced) • If pages are not filled, paging traffic can actually increase instead of decrease.
Why Several Page Sizes? • Small page sizes reduce internal fragmentation (amount of wasted space in an allocated block). • But … they have all the problems that large pages solve, plus they also have the possibility of increasing page faults. • Solution: Use multiple page sizes
[Figure: External fragmentation (free space scattered between blocks allocated to A, B, and C) versus internal fragmentation (allocated but unused space inside an allocated block)]
Multiple Page Sizes Present Problems • Memory management becomes more complex • Uniform page size is simple • External fragmentation – reduces the opportunity to use superpages • Consequently most general purpose OS’s don’t use superpages, at least for user space.
[Figure: External fragmentation. Memory initially holds SP1 through SP4. SP4 leaves and is replaced by SP5; SP2 leaves. No room for a superpage.]
Hardware-Imposed Constraints • Limited to page sizes provided by hardware • Must have enough contiguous free memory to store each superpage • Superpage addresses must be aligned on the superpage size: e.g., a 64KB SP must start at address 0, or 64KB, or 128KB, etc. • TLB entry only has one set of bits (R, M, etc.) and thus can only provide coarse-grained info – not good for efficient page management.
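The alignment constraint reduces to a bit test, because superpage sizes are powers of two. A small sketch; the function names are illustrative and not taken from the paper:

```python
KB = 1024


def is_aligned(addr: int, size: int) -> bool:
    """True if addr is a multiple of size (size must be a power of two)."""
    return addr & (size - 1) == 0


def superpage_base(addr: int, size: int) -> int:
    """Round addr down to the start of the enclosing aligned region."""
    return addr & ~(size - 1)


# A 64KB superpage may start at 0, 64KB, 128KB, ... but not at 72KB.
print(is_aligned(128 * KB, 64 * KB), is_aligned(72 * KB, 64 * KB))
```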
Design Decisions • Acquire base pages on demand and “promote” to a superpage at some later time, versus load the entire superpage when one base page is faulted in • What size superpage should be created? • If base pages are acquired on demand, when to promote? • Reservation-based allocation (set aside space for a superpage when the first base page is loaded) versus relocation-based allocation (wait until a superpage is to be created, then move the existing pages to contiguous locations)
Authors’ Assumptions • The virtual address space of a process is a collection of virtual memory objects: code, data, stack, heap, memory-mapped files, etc. • Each object is mapped contiguously to virtual address space • The virtual address space may be sparse – there may be gaps between objects. • OS will not automatically create a superpage when a new page is loaded – wait to see if it makes sense.
Issues: Allocation • Allocation: when a page is loaded because of a page fault it must be mapped to a physical frame. • In non-superpage systems any frame will do • In a superpage system we may later decide to include this page in a superpage: now we have to find room for the other pages that are contiguous with the one already loaded
Allocation Approaches • Relocation-based – incurs overhead of moving pages when superpages are created. • Reservation-based – how much space should you reserve? • Find a contiguous range of page frames, aligned on the correct address, to match a SP size • The authors developed a reservation-based system.
Reservation-based Allocation • When a page is initially loaded choose a superpage size and reserve contiguous frames to hold the eventual superpage. • Consider size and alignment • Don’t know if adjoining pages will ever be needed by the program • Decide now what the final superpage size will be.
Reservation-based Allocation – choosing the page size • Possibilities: • the largest superpage size available • a superpage size that most closely matches the VM object the page belongs to • a smaller size, based on memory availability. • Tradeoff: performance gains from large page versus possible loss of contiguous memory space that is needed later
[Figure 2: Reservation-based allocation. Labels: object mapping, mapped pages, virtual address space, superpage alignment boundaries, physical address space, allocated frames, unused page frame reservation]
Relocation-based contiguity • Relocation-based methods have to re-copy pages into other frames when they decide to create a superpage. • This is less likely to cause fragmentation than reservation-based allocation, but has a heavier processing overhead, similar to compaction schemes. • Find contiguous space, move existing pages
Review - 3/31/09 • Superpage: a large page • Purpose: improve TLB coverage • Tradeoff: uniform (small) page size versus variable (large and small) page sizes • Simplicity versus complexity • No external fragmentation versus external fragmentation • Limited TLB coverage vs extended TLB coverage
Review - 3/31/09 • Constraints • Maintain address alignment • Manage fragmentation (maintain contiguity) • Restrict overhead • Relocation-based methods versus reservation-based methods • Copying overhead versus fragmentation
Outline • Issues for a superpage management system: • Allocation and fragmentation control (already discussed) • Promotion • Demotion • Eviction • Storage management • Details of the Navarro et al. system
Issues: Promotion • Initially, base pages are placed in a reserved block of frames, but are treated separately. • Promote when enough pages have been loaded to justify creating a superpage: • Combine TLB entries into one entry • Update page table to show new superpage size • Load remaining pages, if necessary • Promotion may be incremental • Tradeoff: early promotion (before all base pages have been faulted in) reduces TLB misses but wastes memory if all pages of the superpage are not needed
Issues: Demotion • Reduce superpage size • To individual base pages • To a smaller superpage • Required if memory is needed for new pages and unused base pages must be evicted (page replacement) • Difficulty: use bits and dirty bits aren’t as helpful as they are in the base page table.
Issues: Eviction • When memory is full, a superpage may be evicted • All its base pages are released. • If the dirty bit is set, the entire superpage must be written to disk, even if only part of it has changed. • If one of the pages is faulted in later, the process starts over
Design of System Proposed by Navarro, et al. • The system discussed in this paper is reservation-based. • It supports multiple superpage sizes to reduce internal fragmentation • It demotes infrequently referenced pages to reclaim memory frames • It is able to maintain contiguous pages without using compaction
Design Issues • Reservation-based allocation • Choosing a page size • Fragmentation control • Incremental promotions • Speculative demotions • Paging out dirty superpages • How does this system address the issues which have been previously identified?
Allocation in this system • A page fault triggers a decision: does the page have an existing reservation or not? • If not, then • select a preferred superpage size, • locate a set of contiguous, aligned frames • load the page into the correct frame • enter the mapping in the page table • reserve the remaining frames • Or, load the page into a previously reserved frame
Choosing a Superpage Size in This System • Since the decision is made early, can’t decide based on process’s behavior. • Base decision on the memory object type; prefer too large to too small • If the decision is too large, it is easy to reclaim the unneeded space • If the decision is too small, relocation is needed
Guidelines for Choosing Superpage Size • For fixed-size memory objects (e.g. code segments) reserve the largest superpage possible, considering alignment and existing reservations, that does not extend beyond the end of the object. • For dynamic-sized objects (stacks, heaps) that grow one page at a time: same guidelines, but allow the reservation to extend beyond the end, to allow the object to grow.
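These guidelines can be sketched as a search from the largest size downward. This is a simplified illustration, not the authors' code: sizes are counted in 8KB base pages using the Alpha sizes quoted earlier, existing reservations are ignored, and `preferred_size` is a hypothetical name.

```python
SIZES = (512, 64, 8, 1)  # in base pages: 4MB, 512KB, 64KB, 8KB


def preferred_size(fault_page, obj_start, obj_end, dynamic=False):
    """Pick the reservation for a faulting virtual page number.

    obj_start/obj_end bound the memory object in pages (obj_end exclusive).
    Returns (first_page, n_pages) of the aligned region to reserve.
    """
    for n in SIZES:
        start = fault_page - fault_page % n   # aligned region containing the page
        end = start + n
        if start < obj_start:
            continue                          # would begin before the object
        if end > obj_end and not dynamic:
            continue                          # fixed-size object: must not overrun
        return start, n


# A 100-page object starting at page 0, with a fault on page 70:
print(preferred_size(70, 0, 100))                # fixed-size object
print(preferred_size(70, 0, 100, dynamic=True))  # stack or heap may overrun
```

The final size of one base page always fits, so the loop always returns for a faulting page inside the object.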
Preempting Reservations in This System • After a page fault, if the guidelines call for a superpage that is too large for any available free block: • Reserve a smaller size superpage or • Preempt an existing reservation that has enough unallocated frames to satisfy the request • This system uses preemption wherever possible.
Fragmentation Control • When different sizes of superpages are used in the same system physical memory can become fragmented. • Result: there are not enough large, properly aligned blocks of free memory. • Navarro et al. propose several implementation techniques to address this problem
Fragmentation Control in This System • The “buddy allocator” (free list manager) maintains multiple lists of free blocks, ordered by size • When possible, coalesce adjacent blocks of free memory to form larger blocks. • A page replacement daemon periodically selects pages to be swapped out. It is modified to include contiguity as one of the factors to be considered.
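Coalescing in a buddy allocator can be sketched as follows. This is the textbook binary buddy scheme, used here as an assumption for illustration; the slides do not say how the paper's allocator is organized around the hardware page sizes, and the class name is hypothetical.

```python
from collections import defaultdict


class BuddyFreeLists:
    """Free lists indexed by order; a block of order k spans 2**k frames."""

    def __init__(self, max_order: int):
        self.max_order = max_order
        self.free = defaultdict(set)   # order -> set of starting frame numbers

    def release(self, start: int, order: int) -> None:
        """Free a block and merge it with its buddy as long as the buddy is free."""
        while order < self.max_order:
            buddy = start ^ (1 << order)       # buddy differs in exactly one bit
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)     # absorb the buddy
            start = min(start, buddy)          # merged block starts at the lower one
            order += 1
        self.free[order].add(start)


lists = BuddyFreeLists(max_order=3)
lists.release(0, 0)
lists.release(1, 0)   # frames 0 and 1 merge into one order-1 block
print(dict(lists.free))
```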
Promotion to Superpage Status in This System • How does a set of base pages become a superpage? • Suppose a superpage consists of 8 base pages – system reserves space for 8 when first is loaded. If other pages are referenced, load into reserved frames. • At some point, decide to treat as a superpage instead of several base pages.
Promotion in This System • If a sub-superpage is entirely populated, this system will promote it (incremental promotion); e.g. if 4 aligned pages of a 16 page superpage are faulted in, create a small superpage. • This system promotes only regions that are fully populated. • In some systems, promotion occurs if some fraction of the superpage is loaded. Before promoting, must load other pages.
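Incremental promotion of fully populated regions can be sketched as below. The sizes of 4 and 16 pages follow the example on this slide rather than real hardware sizes, and the set `populated` stands in for whatever bookkeeping the kernel keeps; both are assumptions of this illustration.

```python
SIZES = (1, 4, 16)  # page counts from the slide's example, smallest first


def promotion_size(populated: set, page: int) -> int:
    """Largest aligned, fully populated region containing `page`.

    `populated` holds the page numbers faulted in so far within one reservation.
    """
    best = 1
    for n in SIZES[1:]:
        start = page - page % n
        if all(p in populated for p in range(start, start + n)):
            best = n        # this level is complete; try the next larger one
        else:
            break           # a gap here rules out every larger size too
    return best


# Pages 4..7 of a 16-page reservation are in memory: promote to a 4-page superpage.
print(promotion_size({4, 5, 6, 7}, 7))
```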
Key Observation • Once a program accesses one page in a memory object, it is likely to access the rest of the pages shortly thereafter (or not at all): spatial locality • array references • mapped file • Conclude: If a superpage is not created soon after the initial pages are loaded, it probably isn’t going to happen.
Demotion (preemption) • Demotion is a side-effect of page replacement; when a base page is evicted, its superpage is demoted. • Demotion in this system is also recursively incremental. • Speculative demotion: demote active superpages to determine if the whole page is still in use or just parts. • When the paging daemon resets the R bit of a base page, demote accordingly if memory is scarce.
Paging Out Dirty Superpages • If a dirty superpage is to be flushed to disk, there is no way to tell if one page is dirty or all pages. • Writing out the entire superpage is a huge performance hit. • Navarro et al.’s solution: don’t allow writes to clean superpages. • If a process tries to write to a clean SP, demote it. • Repromote later if all base pages are dirty.
Alternate Approach • The authors experimented with another method to allow their system to deduce if a base page had been modified. • Compute the cryptographic hash digest of a page’s contents when it is loaded; do so again when a page is flushed. If there is no change, the page is clean • Conclusion: too time consuming, but experiments with modifications were planned.
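The digest comparison can be sketched as below. SHA-1 is an assumption of this example (the slide says only "cryptographic hash"), and `TrackedPage` is a hypothetical name.

```python
import hashlib


def digest(page: bytes) -> bytes:
    return hashlib.sha1(page).digest()


class TrackedPage:
    """Remember a page's digest at load time and compare again at flush time."""

    def __init__(self, contents: bytes):
        self.contents = bytearray(contents)
        self.loaded_digest = digest(bytes(self.contents))

    def is_clean(self) -> bool:
        # Clean means the contents match what was loaded, bit for bit.
        return digest(bytes(self.contents)) == self.loaded_digest


page = TrackedPage(bytes(8192))   # one zero-filled 8KB base page
page.contents[0] = 1              # a write makes the page dirty
print(page.is_clean())
```

Hashing every page at load and at flush is the cost the slide refers to. A side effect of the approach is that a page rewritten with its original contents still counts as clean.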
Tracking Reservations • Multi-list reservation scheme • One list for each hardware page size • A reserved block is placed on a list according to how large an extent could be preempted, without affecting allocated pages. • A reservation for 64KB may have only 8KB contiguous, aligned, un-allocated memory • Each list sorted by how recently reserved • Preempt from head of list (least recently allocated) • Fully populated pages aren’t in reservation lists
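A rough sketch of the multi-list idea, with one recency-ordered list per extent size. The search order in `preempt` (smallest adequate size first) and the omission of re-listing a reservation's leftover frames after preemption are simplifying assumptions; all names are hypothetical.

```python
from collections import OrderedDict

SIZES = (1, 8, 64)  # largest preemptible extent, in base pages (8KB, 64KB, 512KB)


class ReservationLists:
    def __init__(self):
        # size -> reservations in order of last allocation, oldest first
        self.lists = {n: OrderedDict() for n in SIZES}

    def touch(self, res_id, largest_free_extent: int) -> None:
        """Re-file a reservation after a frame in it has been allocated."""
        for lst in self.lists.values():
            lst.pop(res_id, None)
        if largest_free_extent in self.lists:      # extent 0 = fully populated: unlisted
            self.lists[largest_free_extent][res_id] = None   # tail = most recent

    def preempt(self, wanted: int):
        """Return the least recently allocated reservation that can yield `wanted` pages."""
        for n in SIZES:
            if n >= wanted and self.lists[n]:
                res_id, _ = self.lists[n].popitem(last=False)  # head of the list
                return res_id
        return None


r = ReservationLists()
r.touch("a", 8)
r.touch("b", 8)
r.touch("a", 8)        # "a" just allocated again, so "b" is now the oldest
print(r.preempt(8))
```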
More Design Issues • Population map • Tracks allocated base pages • When a page fault occurs, can be used to find out if the page has a reservation • Also useful for deciding when to promote • Helps to identify unallocated regions in existing reservations.
Goal of Superpage Management Systems • Good TLB coverage with minimal internal fragmentation • Conclusion: create the largest superpage possible that isn’t larger than the size of the memory object (except for stack/heap). • If there isn’t enough memory, preempt existing reservations (these pages had their chance)
Current Usage • Superpages are most often used today to store portions of the kernel and various buffers. • Reason: the memory requirements for these objects are static and can be known in advance. • Superpage size can be chosen to fit the object. • Superpage use in application space is the harder issue.
Current Research • This paper focuses on the use of superpages in application memory, as opposed to kernel memory. • An ongoing research area: memory compaction – whenever there are idle CPU cycles, work to establish large contiguous blocks of free memory • Compare to disk management
Summary: Potential Advantages of Superpages • Ideally, superpages can improve performance • Without increasing size of TLB (which would be expensive and increase TLB access time) • Without increasing base page size (which can lead to internal fragmentation) • Superpages allow use of small (base) and large (super) page sizes at the same time.
Summary - Tradeoff • Large superpages increase TLB coverage • Large superpages are more likely to fragment memory. (Why?) • Benefits of large superpages must be weighed against “contiguity restoration techniques” • Pages loaded into reserved areas must be loaded at the proper offset. • Must be enough space for the entire superpage • More overhead for free space management