
Practical, transparent operating system support for superpages

Practical, transparent operating system support for superpages. Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox (Rice University). Appears in: Fifth Symposium on Operating Systems Design and Implementation (OSDI 2002). Presented by: David R. Choffnes.


Presentation Transcript


  1. Practical, transparent operating system support for superpages • Juan Navarro, Sitaram Iyer, Peter Druschel, Alan Cox (Rice University) • Appears in: Fifth Symposium on Operating Systems Design and Implementation (OSDI 2002) • Presented by: David R. Choffnes

  2. Outline • The superpage problem • Related Approaches • Design • Implementation • Evaluation • Conclusion

  3. Introduction • TLB coverage • Definition • Effect on performance • Superpages • Wasted memory • Fragmentation • Contribution • General, transparent superpages • Deals with fragmentation • Contiguity-aware page replacement algo • Demotion/Eviction of dirty superpages

  4. The Superpage Problem • TLB coverage trend • [Figure: TLB coverage as a percentage of main memory over time] • Factor of 1000 decrease in 15 years • TLB miss overhead annotations on the plot: 5%, 5-10%, 30%
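
As a rough worked example of TLB coverage (the amount of memory that can be reached through the TLB without a miss), the sketch below multiplies the number of entries by the page size. The numbers are the ones listed on the evaluation setup slide of this talk (128-entry DTLB, 8 KB base pages, 4 MB superpages, 512 MB RAM); the function name is illustrative, and the calculation assumes every entry maps a page of the same size.

```python
KB = 1024
MB = 1024 * KB

def tlb_coverage_bytes(entries: int, page_size: int) -> int:
    """Memory reachable without a TLB miss, assuming all entries map pages of one size."""
    return entries * page_size

ram = 512 * MB
base_coverage = tlb_coverage_bytes(128, 8 * KB)   # base pages only
super_coverage = tlb_coverage_bytes(128, 4 * MB)  # largest superpages only

# Coverage expressed as a percentage of main memory.
print(f"8 KB pages: {100 * base_coverage / ram:.2f}% of RAM")
print(f"4 MB pages: {100 * super_coverage / ram:.2f}% of RAM")
```

With base pages alone the TLB covers 1 MB of the 512 MB machine, which is why larger pages are attractive.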

  5. The Superpage Problem • Increasing TLB coverage • More TLB entries is expensive • Larger page size leads to internal fragmentation and increased I/O • Solution: use multiple page sizes • Superpage definition • Hardware-imposed constraints • Finite set of page sizes (subset of powers of 2) • Contiguity • Alignment
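
The three hardware-imposed constraints above (supported size, contiguity, alignment) can be sketched as a simple eligibility check. This is a minimal illustration, not code from the paper: the function name is hypothetical, sizes are expressed in base pages, and the size set assumes the Alpha page sizes listed on the next slide (8 KB, 64 KB, 512 KB, 4 MB).

```python
# Alpha page sizes in units of 8 KB base pages: 8 KB, 64 KB, 512 KB, 4 MB.
ALPHA_SIZES = (1, 8, 64, 512)

def superpage_eligible(first_vpn, frames, sizes=ALPHA_SIZES):
    """Check whether the virtual pages starting at first_vpn, backed by the
    physical frame numbers in `frames`, could be mapped by one TLB entry."""
    n = len(frames)
    if n not in sizes:                      # finite set of page sizes
        return False
    contiguous = all(frames[i] == frames[0] + i for i in range(n))
    aligned = first_vpn % n == 0 and frames[0] % n == 0
    return contiguous and aligned

print(superpage_eligible(16, list(range(64, 72))))   # aligned, contiguous
print(superpage_eligible(16, list(range(65, 73))))   # physical range misaligned
```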

  6. A superpage TLB • Alpha: 8, 64, 512 KB; 4 MB • Itanium: 4, 8, 16, 64, 256 KB; 1, 4, 16, 64, 256 MB • [Figure: a TLB translating virtual addresses to physical addresses, with a base page entry (size=1) and a superpage entry (size=4) mapping virtual memory onto physical memory]

  7. Superpage Issues and Tradeoffs • Allocation • Relocation • Reservation

  8. Issue 1: superpage allocation • [Figure: pages A, B, C, D of virtual memory placed in physical memory relative to superpage boundaries] • How / when / what size to allocate?

  9. Superpage Issues (Cont.) • Promotion • Incremental • Timing (not too soon, not too late) • Demotion and Eviction • Hardware reference and dirty bit limitation

  10. Issue 2: promotion • Promotion: create a superpage out of a set of smaller pages • mark page table entry of each base page • When to promote? • Wait for app to touch pages? May lose opportunity to increase TLB coverage. • Create small superpage? May waste overhead. • Forcibly populate pages? May cause internal fragmentation.

  11. Superpage Issues: Fragmentation • Fragmentation • Memory becomes fragmented due to • use of multiple page sizes • persistence of file cache pages • scattered wired (non-pageable) pages • Contiguity as contended resource

  12. Related Approaches • HP-UX and IRIX Reservations • Not transparent • Page Relocation • Used exclusively, leads to lower performance due to increased TLB misses • Hardware Support • Talluri and Hill: Remove contiguity requirement • This approach: hybrid reservation and relocation system with page replacement that is biased toward pages that contribute to contiguity

  13. Design • Reservation-based superpage management • Multiple superpage sizes • Demotion of sparsely referenced superpages • Preservation of contiguity w/o compaction • Efficient disk I/O for partially modified SPs • Uses buddy allocator for contiguous regions
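
To make the buddy allocator mentioned above concrete, here is a minimal, generic sketch of the technique (not the FreeBSD implementation described in the paper). The class and method names are hypothetical. Blocks are power-of-two runs of frames; allocation splits a larger block, and freeing merges a block with its "buddy" whenever the buddy is also free, which is how contiguity is recovered without compaction.

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order frames (illustrative only)."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free[o] holds start frames of free blocks of 2**o frames.
        self.free = {o: set() for o in range(max_order + 1)}
        self.free[max_order].add(0)

    def alloc(self, order):
        """Return the start frame of a free block of 2**order frames, or None."""
        for o in range(order, self.max_order + 1):
            if self.free[o]:
                start = min(self.free[o])
                self.free[o].remove(start)
                while o > order:            # split, keeping the upper half free
                    o -= 1
                    self.free[o].add(start + (1 << o))
                return start
        return None

    def release(self, start, order):
        """Free a block and coalesce with its buddy as long as possible."""
        while order < self.max_order:
            buddy = start ^ (1 << order)
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)

pool = BuddyAllocator(3)        # 8 frames
a = pool.alloc(0)               # one frame
b = pool.alloc(1)               # two contiguous frames
pool.release(a, 0)
pool.release(b, 1)              # everything coalesces back into one block
```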

  14. Key observation • Once an application touches the first page of a memory object, it is likely that it will quickly touch every page of that object • Example: array initialization • Opportunistic policies • superpages as large and as soon as possible • as long as no penalty if wrong decision

  15. Reservations • Set of frames initially reserved at page fault • Fixed-size objects: largest aligned superpage that is not larger than the object • Dynamic objects: same as fixed, but reservation is allowed to extend beyond the end of the object • Preemption • If no available memory for allocation request, system will preempt the reservation whose most recent page allocation occurred least recently
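
The reservation sizing rule above can be sketched as follows. This is an interpretation under stated assumptions, not the paper's code: the function name is hypothetical, offsets and sizes are in base pages relative to an object whose start is assumed to be aligned, the size set assumes the Alpha page sizes, and checks against overlapping existing reservations are omitted.

```python
ALPHA_SIZES = (1, 8, 64, 512)   # in 8 KB base pages

def reservation_extent(fault_page, object_pages, dynamic=False, sizes=ALPHA_SIZES):
    """Pick (start, size) of the reservation made at a page fault.

    Fixed-size objects: largest aligned superpage around the faulting page
    that is not larger than the object and stays inside it.
    Dynamic objects: same, but the extent may reach past the object's end.
    """
    for size in sorted(sizes, reverse=True):
        if size > object_pages:
            continue
        start = fault_page - fault_page % size   # align down
        if dynamic or start + size <= object_pages:
            return start, size
    return fault_page, 1                          # fall back to a base page

print(reservation_extent(70, 100))                # fixed-size object
print(reservation_extent(70, 100, dynamic=True))  # growing object
```

For a 100-page object with a fault at page 70, the fixed-size rule settles for an 8-page extent because the 64-page extent would run past the end, while the dynamic rule accepts the 64-page extent.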

  16. Managing reservations • [Figure: reservation lists keyed by largest unused (and aligned) chunk: 4, 2, 1] • Best candidate for preemption at front: the reservation whose most recently populated frame was populated the least recently

  17. Other Design Issues • Fragmentation control • Coalescing • Contiguity-aware page replacement • Incremental promotions • Occurs as soon as a superpage region is fully populated • Speculative demotion • Occurs on eviction (recursively) • Occurs on first write to clean superpage • Overhead too high for hash digests • Daemon periodically demotes pages speculatively • Necessary due to reference bit limitation

  18. Incremental promotions • Promotion policy: opportunistic • [Figure: superpage sizes as the region is populated: 2, 4, 4+2, 8]
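
Incremental promotion, as described on the previous two slides, promotes a region as soon as it is fully populated. The sketch below illustrates the decision under simplifying assumptions: a plain Python set stands in for the population map, the function name is hypothetical, and sizes are the Alpha page sizes in base pages.

```python
def promoted_size(populated, page, sizes=(1, 8, 64, 512)):
    """Largest supported size whose aligned region around `page` is fully
    populated. `populated` is a set of base-page indices that includes `page`."""
    best = 1
    for size in sorted(sizes):
        start = page - page % size
        if all(p in populated for p in range(start, start + size)):
            best = size
        else:
            break   # a larger enclosing region cannot be full either
    return best

touched = set(range(8))              # first 8 base pages populated
print(promoted_size(touched, 3))     # promote to the 8-page size
touched |= set(range(8, 63))         # 63 of 64 pages populated
print(promoted_size(touched, 3))     # still only the 8-page size
touched.add(63)
print(promoted_size(touched, 3))     # now the 64-page size
```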

  19. More Design Issues • Multi-list reservation scheme • One list of each page size supported by hardware • Reservations sorted by allocation recency • Preemption removes from head of list • Reservation recursively broken into extents • Fully populated extents are not put in reservation lists • Population map • Reserved frame lookup • Overlap avoidance • Promotion decisions • Preemption assistance
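
A rough sketch of the multi-list reservation scheme follows. It is a simplification under assumptions, not the paper's data structure: the class and method names are hypothetical, each list is keyed by the largest unused chunk a reservation could yield, the recursive breaking of reservations into extents is omitted, and taking the victim from the smallest sufficient list is an assumption made here for illustration.

```python
from collections import OrderedDict

class ReservationLists:
    """One recency-ordered list per page size; head = least recently allocated."""

    def __init__(self, sizes):
        self.lists = {s: OrderedDict() for s in sizes}
        self.where = {}                       # reservation id -> list key

    def touch(self, rid, largest_free_chunk):
        """Record a page allocation in reservation `rid` and move it to the tail."""
        old = self.where.pop(rid, None)
        if old is not None:
            del self.lists[old][rid]
        # Fully populated reservations (no free chunk) are not kept in any list.
        if largest_free_chunk in self.lists:
            self.lists[largest_free_chunk][rid] = True
            self.where[rid] = largest_free_chunk

    def preempt(self, wanted):
        """Remove and return the reservation at the head of a suitable list."""
        for size in sorted(self.lists):
            if size >= wanted and self.lists[size]:
                rid, _ = self.lists[size].popitem(last=False)
                del self.where[rid]
                return rid
        return None

r = ReservationLists((1, 8, 64))
r.touch("a", 8)
r.touch("b", 8)
r.touch("a", 8)          # "a" is now the most recently allocated
victim = r.preempt(8)    # "b": its last allocation is the oldest
```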

  20. Implementation Notes • FreeBSD uses three lists of pages in A-LRU order: active, inactive, cache • Contiguity-aware page daemon • Cache considered available for allocation • Daemon activated when contiguity falls low • Clean file-backed pages moved to inactive as soon as file is closed • Wired page clustering • Multiple mappings

  21. Evaluation • Setup • FreeBSD 4.3 • Alpha 21264, 500 MHz, 512 MB RAM • 8 KB, 64 KB, 512 KB, 4 MB pages • 128-entry DTLB, 128-entry ITLB • Unmodified applications

  22. Best-Case Results • TLB miss reduction usually above 95% • SPEC CPU2000 integer • 11.2% improvement (0 to 38%) • SPEC CPU2000 floating point • 11.0% improvement (-1.5% to 83%) • Other benchmarks • FFT (200^3 matrix): 55% • 1000x1000 matrix transpose: 655% • 30%+ in 8 out of 35 benchmarks

  23. Benefits of multiple page sizes • [Figures: Speedups; TLB Miss Reduction]

  24. Sustained benefits • Use Web server to fragment memory, then use FFTW to see how quickly memory is reclaimed • FFTW reaches a speedup of almost 55%; Web server performance degrades only 1.6% on a successive run • Concurrent execution: only 3% degradation with modified page daemon

  25. Fragmentation control • [Figure: normalized contiguity of free memory (0 to .8) over time (0 to 10 min), with and without fragmentation control, across alternating web server and FFT runs; FFT runs are labeled no speedup, partial speedup, or full speedup]

  26. Adversary applications • Incremental promotion • Slowdown of 8.9%, of which 7.2% is hardware-specific • Sequential access • 0.1% degradation • Preemption • 1.1% degradation • General overhead • Use superpage supporting mechanisms, but don’t promote: 1-2% performance degradation

  27. Et Cetera • Dirty Superpages • Performance penalty of not demoting is a factor of 20 • Scalability • Most operations O(1), O(S) or O(S*R) • Daemon, promotion, demotion and dirty/reference bit emulation are linear • Promotion/Demotion is amortized to O(S) for programs that need to change page size only early in life • Dirty/Reference bits: Motivates the need for clustered page tables either in OS or HW

  28. Conclusion • Effective, transparent and efficient support for superpages • Demonstrates effectiveness of multiple page sizes • Improved performance for nearly all applications • Minimal overhead • Scalable to large numbers of page sizes
