Memory Management and RMAP VM of 2.6 By A.R.Karthick (karthick_r@infosys.com)
Memory Hierarchies • Access time increases as you move down the hierarchy: L1 cache → L2 cache → RAM → Hard Disk
Page Tables • Define the virtual-to-physical mapping • The page global directory (PGD), page mid-level directory (PMD) and page table entry (PTE) define the course of the translation • Example (32-bit x86, PMD folded into the PGD): the virtual address 0x00080c0f splits into a 10-bit PGD index, a 10-bit PTE index and a 12-bit page offset: 0000 0000 00 | 00 1000 0000 | 1100 0000 1111, giving pgd index 0, pte index 128 (1 << 7) and offset 0xc0f (see the sketch below)
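A minimal user-space sketch of this split. The constants below mirror the classic 32-bit x86 two-level layout; they are illustrative stand-ins, not the kernel's own macros:

    #include <stdio.h>

    /* Illustrative constants for the classic 32-bit x86 two-level layout
     * with the PMD folded into the PGD: 10-bit pgd index, 10-bit pte index,
     * 12-bit page offset. */
    #define PAGE_SHIFT   12
    #define PTRS_PER_PTE 1024
    #define PGDIR_SHIFT  22

    int main(void)
    {
        unsigned long vaddr = 0x00080c0f;  /* example address from the slide */

        unsigned long pgd_index = vaddr >> PGDIR_SHIFT;
        unsigned long pte_index = (vaddr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
        unsigned long offset    = vaddr & ((1UL << PAGE_SHIFT) - 1);

        /* Prints: pgd=0 pte=128 offset=0xc0f */
        printf("pgd=%lu pte=%lu offset=0x%lx\n", pgd_index, pte_index, offset);
        return 0;
    }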
Page Table Entry (PTE) Status Bits • PAGE_PRESENT • PAGE_RW • PAGE_USER • PAGE_RESERVED • PAGE_ACCESSED • PAGE_DIRTY • INTERNAL_STATUS
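A small sketch of how these flag bits would be tested once a pte value is in hand. The numeric values are an assumption modeled on the i386 bit layout, not quoted from the kernel headers:

    #include <stdio.h>

    /* Assumed flag values modeled on i386; only for illustration. */
    #define PAGE_PRESENT  0x001
    #define PAGE_RW       0x002
    #define PAGE_USER     0x004
    #define PAGE_ACCESSED 0x020
    #define PAGE_DIRTY    0x040

    int main(void)
    {
        unsigned long pte = 0x063;  /* present | rw | accessed | dirty */

        if (pte & PAGE_PRESENT)
            printf("page is resident in RAM\n");
        if (pte & PAGE_RW)
            printf("page is writable\n");
        if (pte & PAGE_DIRTY)
            printf("page has been written since the last writeback\n");
        return 0;
    }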
Page Fault • Processor exception raised when the CPU cannot map a virtual address to a physical address. • Handled by do_page_fault in arch/i386/mm/fault.c. • Write-protection (COW) faults are routed to do_wp_page. • For pages that are in swap, do_swap_page is called. • For pages not present at all, do_no_page is called; it faults in either an anonymous zero page or an existing page. • Page faults populate the LRU cache.
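A simplified sketch of this routing; the enum and helper are hypothetical stand-ins, and the real dispatch in do_page_fault()/handle_pte_fault() is considerably richer:

    #include <stdio.h>

    enum fault_kind { FAULT_NOT_PRESENT, FAULT_IN_SWAP, FAULT_WRITE_PROTECT };

    static void route_fault(enum fault_kind kind)
    {
        switch (kind) {
        case FAULT_NOT_PRESENT:
            /* do_no_page(): fault in an existing file-backed page or an
             * anonymous zero page. */
            printf("do_no_page: map existing or anonymous zero page\n");
            break;
        case FAULT_IN_SWAP:
            /* do_swap_page(): bring the page back from swap (it may still
             * be sitting in the swap cache). */
            printf("do_swap_page: read page back from swap\n");
            break;
        case FAULT_WRITE_PROTECT:
            /* do_wp_page(): break copy-on-write, give the writer its own copy. */
            printf("do_wp_page: COW break, copy the page\n");
            break;
        }
    }

    int main(void)
    {
        route_fault(FAULT_WRITE_PROTECT);
        return 0;
    }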
Page Replacement Algorithms • Optimal replacement: not realizable in practice • Not Recently Used (NRU): a crude approximation • FIFO: inefficient • Second chance: better than the above • Clock replacement: more efficient than second chance
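As a reference point for the list above, a toy second-chance/clock eviction over a fixed set of frames; the frame count and referenced bits are invented for illustration:

    #include <stdio.h>

    #define NFRAMES 4

    struct frame { int page; int referenced; };

    /* Sweep the clock hand: clear referenced bits until an unreferenced
     * frame is found, which becomes the victim. */
    static int clock_evict(struct frame frames[], int *hand)
    {
        for (;;) {
            struct frame *f = &frames[*hand];
            *hand = (*hand + 1) % NFRAMES;
            if (!f->referenced)
                return f->page;          /* evict this frame's page */
            f->referenced = 0;           /* give it a second chance */
        }
    }

    int main(void)
    {
        struct frame frames[NFRAMES] = { {10, 1}, {11, 0}, {12, 1}, {13, 1} };
        int hand = 0;

        printf("evicting page %d\n", clock_evict(frames, &hand)); /* page 11 */
        return 0;
    }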
Page Replacement Algorithms • LRU: least recently used replacement • NFU: not frequently used replacement • Page-ageing based replacement • Working-set algorithm based on each process's locality of reference • Working-set based clock algorithms • LRU with ageing and the working-set algorithms are efficient and the ones commonly used
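The ageing idea behind the NFU/page-ageing entries above comes down to a single counter update: every scan shifts the age right and ORs the referenced bit into the top, so recent references outweigh old ones. Values below are illustrative:

    #include <stdio.h>

    static unsigned char age_tick(unsigned char age, int referenced)
    {
        age >>= 1;              /* old references decay */
        if (referenced)
            age |= 0x80;        /* a fresh reference dominates */
        return age;
    }

    int main(void)
    {
        unsigned char age = 0;

        age = age_tick(age, 1);   /* 0x80: just referenced */
        age = age_tick(age, 0);   /* 0x40: reference ages away */
        age = age_tick(age, 1);   /* 0xa0: new reference on top */
        printf("age counter = 0x%02x\n", age);
        return 0;
    }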
Page replacement handling in the Linux kernel • Page Cache • Pages are added to the page cache for fast lookup. • Page-cache pages are looked up by their address space and page index. • Inode/disk-block pages, shared pages and anonymous pages make up the page cache. • Swap-cache pages, also part of the page cache, represent pages being swapped. • Anonymous pages enter the swap cache at swap-out time; shared pages enter it when they become dirty.
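A toy lookup keyed on the (address space, page index) pair described above; the table size, hash and types are invented, so this only illustrates the keying idea, not the kernel's actual page-cache data structure:

    #include <stdio.h>
    #include <stdint.h>

    #define TABLE_SIZE 64

    struct cached_page {
        const void   *mapping;   /* which address space owns the page   */
        unsigned long index;     /* page-sized offset within that space */
        int           in_use;
    };

    static struct cached_page table[TABLE_SIZE];

    static unsigned int hash_key(const void *mapping, unsigned long index)
    {
        return (unsigned int)(((uintptr_t)mapping / 64 + index) % TABLE_SIZE);
    }

    static void add_page(const void *mapping, unsigned long index)
    {
        unsigned int h = hash_key(mapping, index);
        table[h].mapping = mapping;
        table[h].index   = index;
        table[h].in_use  = 1;
    }

    static int find_page(const void *mapping, unsigned long index)
    {
        unsigned int h = hash_key(mapping, index);
        return table[h].in_use &&
               table[h].mapping == mapping && table[h].index == index;
    }

    int main(void)
    {
        int inode_a, inode_b;   /* stand-ins for two address spaces */

        add_page(&inode_a, 3);
        printf("(a,3) cached? %d\n", find_page(&inode_a, 3));  /* 1 */
        printf("(b,3) cached? %d\n", find_page(&inode_b, 3));  /* 0 */
        return 0;
    }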
LRU Cache • The LRU cache is made up of per-zone active and inactive lists. • Per-CPU active and inactive page vectors make LRU-cache additions faster. • These lists are populated during page faults and when page-cache pages are accessed or referenced. • kswapd is the per-node page-out kernel thread that balances the LRU cache and trickles out pages using an approximation of the LRU algorithm. • Page stealing is performed on a page vector, i.e. in batches. • Page states: active, inactive dirty, inactive clean, per-CPU cold pages
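The per-CPU page-vector idea can be sketched in user space: pages are buffered locally and folded into the zone LRU lists one batch at a time, so the list lock is taken once per batch instead of once per page. The vector size and types below are illustrative:

    #include <stdio.h>

    #define PAGEVEC_SIZE 14          /* illustrative batch size */

    struct pagevec {
        int nr;
        int pages[PAGEVEC_SIZE];     /* stand-in for struct page pointers */
    };

    static void drain_to_lru(struct pagevec *pv)
    {
        /* One lock acquisition would cover this whole batch. */
        printf("adding %d pages to the zone LRU lists\n", pv->nr);
        pv->nr = 0;
    }

    static void lru_cache_add(struct pagevec *pv, int page)
    {
        pv->pages[pv->nr++] = page;
        if (pv->nr == PAGEVEC_SIZE)
            drain_to_lru(pv);        /* amortise the list-lock cost */
    }

    int main(void)
    {
        struct pagevec pv = { 0 };

        for (int i = 0; i < 30; i++)
            lru_cache_add(&pv, i);
        if (pv.nr)
            drain_to_lru(&pv);       /* flush the leftover partial batch */
        return 0;
    }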
Zone Balancing • kswapd performs zone balancing based on the pages_high, pages_low and pages_min watermarks. • A zone is considered balanced when its free pages are above pages_high. • The page-out process takes pages by scanning the inactive list in batches. • Batch page stealing scales well for large physical memory.
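A simplified check in the spirit of this watermark test; the field names mirror the slide, while the numbers and decision strings are illustrative, not the kernel's exact policy:

    #include <stdio.h>

    struct zone_info {
        const char   *name;
        unsigned long free_pages;
        unsigned long pages_min, pages_low, pages_high;
    };

    static const char *zone_state(const struct zone_info *z)
    {
        if (z->free_pages <= z->pages_min)
            return "critical: direct (synchronous) reclaim";
        if (z->free_pages <= z->pages_low)
            return "low: wake kswapd for background reclaim";
        if (z->free_pages >= z->pages_high)
            return "balanced: kswapd can sleep";
        return "reclaiming until pages_high is reached";
    }

    int main(void)
    {
        struct zone_info normal = { "Normal", 900, 255, 510, 765 };

        /* 900 >= pages_high, so the zone is considered balanced. */
        printf("zone %s: %s\n", normal.name, zone_state(&normal));
        return 0;
    }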
RMAP • Maintains a mapping from a page back to its pte/virtual address • Greatly speeds up the page-unmap path by removing the need to scan each process's virtual address space • Unmapping of shared pages improves greatly because pte mappings are available for shared pages • Page faults are reduced because pte entries are unmapped only when required • Reduced search space during page replacement, since only inactive pages are touched • Low overhead is added to the fork, page fault, mmap and exit paths to maintain the reverse mapping
RMAP
struct pte_chain {
    unsigned long next_and_idx;
    pte_addr_t ptes[NRPTE];
} ____cacheline_aligned;
• The next_and_idx field packs both the index of the lowest used pte slot in this chain and the pointer to the next pte_chain, aiding fast pte chaining (see the sketch below). • pte chains keep their free slots at the top (head) of the ptes[] array, and additions happen from the tail. • The process's mm_struct pointer is kept in the page's address space field and is used at swap-out time.
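A toy illustration of the next_and_idx packing: because each chain block is cache-line aligned, the low bits of its address are free to carry the slot index. NRPTE, the helpers and the 64-byte alignment here are assumptions for the sketch, not the kernel's definitions:

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>

    #define NRPTE    7
    #define IDX_MASK ((uintptr_t)NRPTE)    /* low bits reserved for the index */

    struct chain {
        uintptr_t     next_and_idx;        /* next chain pointer | slot index */
        unsigned long ptes[NRPTE];
    };

    static struct chain *chain_next(const struct chain *c)
    {
        return (struct chain *)(c->next_and_idx & ~IDX_MASK);
    }

    static unsigned int chain_idx(const struct chain *c)
    {
        return (unsigned int)(c->next_and_idx & IDX_MASK);
    }

    int main(void)
    {
        /* 64-byte aligned blocks keep the low address bits clear. */
        struct chain *a = aligned_alloc(64, 64);
        struct chain *b = aligned_alloc(64, 64);

        if (!a || !b)
            return 1;

        a->next_and_idx = (uintptr_t)b | 3;   /* next chain b, lowest used slot 3 */

        printf("next chain at %p, lowest used slot %u\n",
               (void *)chain_next(a), chain_idx(a));
        free(a);
        free(b);
        return 0;
    }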
VM Overcommit Policies • Allow processes to commit more address space than the actual memory available (RAM plus swap). • Overcommit policy is set through sysctl vm.overcommit_{memory,ratio}. • 0: heuristic overcommit (the default; obviously excessive commits are refused). • 1: always overcommit. • 2: strict accounting; the commit limit is total swap pages plus overcommit_ratio percent of total RAM pages (see the sketch below). • mmap, mprotect, munmap, brk and shared memory all affect the committed total.
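A sketch of the strict-mode (2) commit limit, computed as total swap plus overcommit_ratio percent of total RAM; the numbers are illustrative:

    #include <stdio.h>

    int main(void)
    {
        unsigned long total_ram_pages  = 262144;  /* 1 GiB of 4 KiB pages */
        unsigned long total_swap_pages = 131072;  /* 512 MiB of swap      */
        unsigned long overcommit_ratio = 50;      /* vm.overcommit_ratio  */

        unsigned long commit_limit =
            total_ram_pages * overcommit_ratio / 100 + total_swap_pages;

        /* 131072 + 131072 = 262144 pages (1 GiB) may be committed in total. */
        printf("commit limit = %lu pages\n", commit_limit);
        return 0;
    }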
References • Primarily the Linux kernel 2.6 source code • "Towards an O(1) VM" by Rik van Riel, Proceedings of the Ottawa Linux Symposium