Resource Management: Beancounters
Agenda • Current state of resource management in the Linux kernel • Beancounters overview • User memory management • I/O accounting • Kernel memory management • Network buffers accounting • Performance
Current state • Per-process accounting and limiting (rlimits) • Manages individual processes • Memory limits are mostly ignored by the kernel • Group-based management • Absent • Global statistics • Not suitable for group isolation
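To illustrate the per-process interface mentioned above, here is a minimal sketch that reads the calling process's RSS rlimit with the standard getrlimit(2) call. The helper names read_rss_rlimit and print_rss_rlimit are hypothetical. On current Linux kernels RLIMIT_RSS can be set and read but is not enforced, which is one instance of a memory limit the kernel ignores.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the per-process RSS limit.
 * Returns 0 on success, -1 on error (errno is set by getrlimit). */
int read_rss_rlimit(struct rlimit *out)
{
    return getrlimit(RLIMIT_RSS, out);
}

/* Print the soft limit; RLIM_INFINITY means "no limit". */
void print_rss_rlimit(void)
{
    struct rlimit rl;

    if (read_rss_rlimit(&rl) != 0) {
        perror("getrlimit");
        return;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RSS soft limit: unlimited\n");
    else
        printf("RSS soft limit: %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
}
```

The value applies to one process only, so it says nothing about the combined consumption of a group of tasks.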
Operating system resources • Memory • CPU time • IO bandwidth • Networking bandwidth • Disk space
Beancounters basics • A beancounter manages a group of tasks • Resource counter parameters • held – the current consumption level • limit – the maximum allowed level of consumption • barrier – the "shortage warning" line, at which each resource controller may take precautions • fails – the number of rejected allocations • A beancounter is assigned once during the process lifetime
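The four counter parameters can be sketched as a small C structure with charge and uncharge helpers. This is an illustration only: the names bc_counter, bc_charge, bc_uncharge and the three result codes are assumptions, not the actual beancounters API, and the locking that kernel code would need is left out.

```c
/* Hypothetical layout of one resource counter, mirroring the
 * four parameters listed on the slide. */
struct bc_counter {
    unsigned long held;    /* current consumption level */
    unsigned long barrier; /* "shortage warning" line */
    unsigned long limit;   /* maximum allowed consumption */
    unsigned long fails;   /* number of rejected allocations */
};

enum bc_result {
    BC_OK,       /* charged, consumption is not above the barrier */
    BC_WARN,     /* charged, but consumption is above the barrier */
    BC_REJECTED  /* not charged: the limit would be exceeded */
};

/* Try to charge `amount` units to the counter. */
enum bc_result bc_charge(struct bc_counter *c, unsigned long amount)
{
    /* Compare against the remaining room to avoid overflow. */
    if (c->held > c->limit || amount > c->limit - c->held) {
        c->fails++;
        return BC_REJECTED;
    }
    c->held += amount;
    return c->held > c->barrier ? BC_WARN : BC_OK;
}

/* Give `amount` units back to the counter. */
void bc_uncharge(struct bc_counter *c, unsigned long amount)
{
    /* Clamp so that an unbalanced uncharge cannot wrap around. */
    c->held = amount > c->held ? 0 : c->held - amount;
}
```

A caller that receives BC_WARN is where a resource controller could take its precautions, for example by refusing less important allocations while still serving critical ones.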
Accounting details • [Diagram labels: user space, kernel space, process, kernel object, beancounter]
Beancounters controlled resources • User memory • Length of mappings • RSS • Locked pages • Dirty page cache • Kernel memory • Network buffers • Miscellaneous resources • Number of tasks • Number of files • Number of sockets • Number of file locks • Number of PTYs • Number of signals • Active dentry cache
User memory management • VMA lengths accounting • Graceful rejects of VM region allocation • Take precautions against overcommitment • RSS accounting • Real memory usage • OOM killer priorities • Dirty page cache accounting • IO statistics and scheduling
VMA lengths accounting • VMAs classification (“Lengths of mappings” resource) • unreclaimable: private and anonymous • reclaimable: shared file mappings • Pages classification (“RSS” resource) • unused: parts of mapped regions • used: touched pages • [Diagram labels: task address space, reclaimable VMAs, unreclaimable VMAs, used pages, unused pages]
VMA lengths accounting pros'n'cons • Pros • The way to track the host commitment level • Graceful rejects of address space growth • Cons • Hard limiting of address space growth
RSS accounting • Drawbacks • Additional pointer on the struct page • Extra locking during page faults • [Diagram labels: first touch, N touches, page, beancounter]
Shared pages accounting • Account the page to the first beancounter • Non-uniform statistics for similar beancounters • Account a whole page for each beancounter • The values accounted are not related to the actual memory usage • Account fractions of the page to all beancounters • The “middle” way used in the beancounters
Page fractions accounting • Algorithm benefits • O(1) algorithm of adding and removing • The sum of RSS over all beancounters equals the number of pages actually in use • [Diagram: pages shared among BC1, BC2, BC3 and BC4 in fractions of 1, ½ and ¼]
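The slide does not spell the algorithm out, so the following is only one way to get power-of-two fractions like those in the diagram, under stated assumptions: when a new beancounter maps a page, the share of the most recently added holder is split in half. Adding then touches two entries regardless of how many beancounters share the page, and the shares always sum to one page. All names (page_shares, shares_add, shares_total, PAGE_UNITS, MAX_SHARERS) are hypothetical, and removal is not covered.

```c
#include <stddef.h>

#define PAGE_UNITS  (1u << 16) /* fixed-point value of one whole page */
#define MAX_SHARERS 16         /* illustrative bound on sharers per page */

struct page_shares {
    int          bc_id[MAX_SHARERS]; /* beancounters mapping the page */
    unsigned int units[MAX_SHARERS]; /* share of each, in PAGE_UNITS */
    size_t       count;              /* number of valid entries */
};

/* Add a beancounter to the page.
 * Returns 0 on success, -1 if the table is full. */
int shares_add(struct page_shares *p, int bc_id)
{
    if (p->count >= MAX_SHARERS)
        return -1;

    if (p->count == 0) {
        /* The first toucher is charged the whole page. */
        p->units[0] = PAGE_UNITS;
    } else {
        /* Split the most recent holder's share in two. */
        unsigned int half = p->units[p->count - 1] / 2;

        p->units[p->count - 1] -= half;
        p->units[p->count] = half;
    }
    p->bc_id[p->count++] = bc_id;
    return 0;
}

/* Sum of all shares; stays equal to PAGE_UNITS once the page is mapped. */
unsigned int shares_total(const struct page_shares *p)
{
    unsigned int sum = 0;

    for (size_t i = 0; i < p->count; i++)
        sum += p->units[i];
    return sum;
}
```

With three beancounters added in turn, the shares come out as ½, ¼ and ¼ of a page, so summing RSS over all beancounters counts the page exactly once.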
Dirty page cache accounting • [Diagram labels: first touch, N touches, dirty, clean, IO, beancounter, unmap, last unmap]
RSS accounting pros'n'cons • Pros • Node memory utilization statistics • Asynchronous IO scheduling • Ground for fair page reclamation • Cons • Performance issues • Memory consumption by auxiliary data structures
Kernel memory management • Reason • Limited normal zone • Mainly for 32-bit arches • Major problem • Object freeing context • Reference counters • RCU
Kernel MM data structures (pages) • Buddy page allocator • Additional pointer on the struct page • Vmalloc • 0th page's pointer • [Diagram labels: struct page, struct vm_struct]
Kernel MM data structures (slab) • Array of pointers after the slab • [Diagram labels: struct slab, kmem_bufctl_t[N], beancounters, N objects]
Kernel MM drawbacks • A slab can carry fewer objects • Slabs could become “offslab”
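The first drawback follows from simple arithmetic: every object on the slab needs one more pointer in the on-slab bookkeeping area, so fewer objects fit. A rough sketch, where objs_per_slab is a hypothetical helper and alignment and cache coloring are ignored:

```c
#include <stddef.h>

/* Number of objects that fit in a slab when each object also needs
 * `per_obj_meta` bytes of on-slab bookkeeping (for example a
 * kmem_bufctl_t entry, plus a beancounter pointer when accounting
 * is enabled). Alignment and coloring are ignored. */
size_t objs_per_slab(size_t slab_bytes, size_t header_bytes,
                     size_t obj_bytes, size_t per_obj_meta)
{
    if (slab_bytes <= header_bytes || obj_bytes + per_obj_meta == 0)
        return 0;
    return (slab_bytes - header_bytes) / (obj_bytes + per_obj_meta);
}
```

With purely illustrative sizes (a 4096-byte slab, a 32-byte header, 60-byte objects and a 4-byte bufctl entry) the slab holds 63 objects; adding an 8-byte pointer per object lowers that to 56.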
Kernel MM pros'n'cons • Pros • Tracking of kernel memory usage • Cons • None (all are already optimized out)
Network buffers accounting • Mainstream accounting shortcomings • slab overhead is not included • up to 30% for usual Ethernet frames • unpredictable difference for non-Ethernet MTU • no way to recalculate skb->truesize
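The slab overhead mentioned above comes from the allocator handing out more memory than was requested. A minimal sketch of the effect, assuming power-of-two size classes (an assumption made for illustration; the helpers round_up_pow2 and alloc_overhead_pct are hypothetical and do not reproduce the kernel's actual skb sizing):

```c
#include <stddef.h>
#include <stdint.h>

/* Round up to the next power of two, standing in for the allocator's
 * size classes. Values too large to round are returned unchanged. */
size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    if (n > SIZE_MAX / 2 + 1)
        return n;
    while (p < n)
        p <<= 1;
    return p;
}

/* Memory handed out beyond the request, as a percentage of the request. */
unsigned int alloc_overhead_pct(size_t requested)
{
    size_t allocated;

    if (requested == 0)
        return 0;
    allocated = round_up_pow2(requested);
    return (unsigned int)((allocated - requested) * 100 / requested);
}
```

Under this assumption a 1600-byte request lands in a 2048-byte object, an overhead of 28%, while a request of exactly 2048 bytes has none. That dependence on the request size is why the difference becomes hard to predict for other MTUs.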
Implementation basics • Separate accounting for • send and receive buffers • TCP and all the other types of traffic • Implementation is straightforward: • account actual memory usage for objects with undefined or infinite lifetime • select(2) compatibility • Buffer space guarantees
Packets context handling • [Diagram labels: process, beancounter, network socket, SKBs]
Performance • RSS accounting – the bottleneck
Main future directions • Optimization • Pre-charging • Kernel memory • VMA lengths • On-demand accounting • Active dentry cache • RSS • RSS limits • Page reclamation • Better TCP window management
That's all folks • Questions? • Comments? http://download.openvz.org/~xemul/