Region-based Memory Management
• Regions represent areas of memory
• Objects are allocated “in” a given region
• Various deallocation options
• Various safety (no access of freed objects) options

    Region r = newregion();
    for (i = 0; i < 10; i++) {
        int *x = ralloc(r, (i + 1) * sizeof(int));
        work(i, x);
    }
    deleteregion(r);

Policy choices
• Deallocation
  • garbage collection (GC)
  • per-object free (per-object)
  • region deletion (all-at-once)
• Safety
  • none (none)
  • reachability (GC)
  • per-region reference counting (RC)
  • statically checked (static)

Some Existing Region Systems

System       Deallocation                Safety
arenas       all-at-once                 none
apache       all-at-once                 none
zones        per-object                  none
Stoutamire   GC                          GC
vmalloc      all-at-once or per-object   none
TT           all-at-once                 static
CWM          all-at-once                 static
C@/RC        all-at-once                 RC

Why Regions?
• per-region allocation/deallocation policies
  • zones (D.T. Ross, 1967), vmalloc (K. Vo, 1996)
• performance
  • arenas (D. Hanson, 1990)
• locality benefits
  • Stoutamire (1997)
• expressiveness
  • apache, arenas, C@/RC
• target for compiler-inferred memory management
  • Tofte & Talpin (1994), Crary, Walker, Morrisett (1999)

Why Regions? (more reasons)
• statically guaranteed memory safety
  • CWM (1999)
• target for garbage collection
  • Wang & Appel (2001)

Region Performance: Allocation and Deallocation
• Applies to all-at-once only
• Basic strategy:
  • allocate a big block of memory
  • individual allocation is:
    • pointer increment
    • overflow test
  • deallocation frees the list of big blocks
  • all operations are fast
[Figure: a region, with labels "wastage" and "alloc point"]

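The basic strategy above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code of any system named in these slides: the block size, the `block` layout, and the bodies of `newregion`, `ralloc`, and `deleteregion` are all hypothetical and chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096                      /* assumed size of a "big block" */
#define ALIGNMENT  _Alignof(max_align_t)     /* every object is max-aligned */

/* Hypothetical layout: a region is a linked list of big blocks. */
typedef struct block {
    struct block *next;    /* older, already filled block */
    size_t used;           /* the "alloc point": bytes handed out so far */
    size_t cap;            /* bytes available in data[] */
    max_align_t data[];    /* payload, suitably aligned */
} block;

typedef struct region { block *head; } *Region;

static block *new_block(size_t cap, block *next) {
    block *b = malloc(sizeof(block) + cap);
    if (!b) return NULL;
    b->next = next;
    b->used = 0;
    b->cap = cap;
    return b;
}

Region newregion(void) {
    Region r = malloc(sizeof *r);
    if (!r) return NULL;
    r->head = new_block(BLOCK_SIZE, NULL);
    if (!r->head) { free(r); return NULL; }
    return r;
}

void *ralloc(Region r, size_t n) {
    /* Round up to the alignment (no guard against huge n in this sketch). */
    n = (n + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    block *b = r->head;
    if (n > b->cap - b->used) {              /* overflow test */
        size_t cap = n > BLOCK_SIZE ? n : BLOCK_SIZE;
        b = new_block(cap, r->head);         /* rest of the old block is wasted */
        if (!b) return NULL;
        r->head = b;
    }
    void *p = (char *)b->data + b->used;
    b->used += n;                            /* pointer increment */
    return p;
}

void deleteregion(Region r) {
    /* Deallocation frees the list of big blocks, not individual objects. */
    for (block *b = r->head; b != NULL; ) {
        block *next = b->next;
        free(b);
        b = next;
    }
    free(r);
}
```

In this sketch the common path of `ralloc` is one comparison and one addition, and the unused tail of a block that could not fit a request corresponds to the "wastage" label in the figure.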
Region Performance: Locality
• Regions can express locality:
  • Sequential allocs in a region can share a cache line
  • Allocs in different regions are less likely to pollute the cache for each other
• Example: moss
  • 24% faster when frequently accessed small objects are placed in a different region from infrequently accessed large objects

Locality: moss
• 1-region version: small & large objects in 1 region
• 2-region version: small & large objects in 2 regions
• 45% fewer cycles lost to r/w stalls in 2-region version

Region Expressiveness
• Adds some structure to memory management
• Few regions:
  • easier to keep track of
  • delay freeing to convenient "group" time (e.g., end of an iteration, closing a device, etc.)
• No need to write "free this data structure" functions

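As an illustration of the "group time" idea, the sketch below allocates a linked list in a fresh region on each iteration and releases it with a single call, so no list-walking free function is written. The toy fixed-capacity region, its capacity argument, and the names `sum_list` and `process_batches` are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy fixed-capacity region, just enough for this example (hypothetical,
   not the interface of any particular region library). */
typedef struct region { size_t used, cap; max_align_t data[]; } *Region;

Region newregion(size_t cap) {
    Region r = malloc(sizeof *r + cap);
    if (r) { r->used = 0; r->cap = cap; }
    return r;
}

void *ralloc(Region r, size_t n) {
    size_t a = _Alignof(max_align_t);
    n = (n + a - 1) / a * a;                 /* keep allocations aligned */
    if (n > r->cap - r->used) return NULL;   /* toy region never grows */
    void *p = (char *)r->data + r->used;
    r->used += n;
    return p;
}

void deleteregion(Region r) { free(r); }

struct node { int value; struct node *next; };

/* Builds the list 1..n inside region r and returns the sum of its values
   (or -1 if the toy region runs out of space). */
static long sum_list(Region r, int n) {
    struct node *head = NULL;
    for (int i = 1; i <= n; i++) {
        struct node *nd = ralloc(r, sizeof *nd);
        if (!nd) return -1;
        nd->value = i;
        nd->next = head;
        head = nd;
    }
    long s = 0;
    for (struct node *p = head; p != NULL; p = p->next) s += p->value;
    return s;
}

/* One region per iteration: all nodes are freed together at the end. */
long process_batches(int batches, int n) {
    long total = 0;
    for (int b = 0; b < batches; b++) {
        Region r = newregion(64 * 1024);
        if (!r) return -1;
        total += sum_list(r, n);
        deleteregion(r);     /* no "free this list" function is needed */
    }
    return total;
}
```

The trade-off in this style is that nothing in the region is reclaimed before the end of the iteration, which is acceptable when the iteration is the natural lifetime of the data.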
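Region Static Checking: Region Type Systems
• Basic idea: name regions in types
• A simple region type system:
  • $\tau = \mathrm{int} \mid \mathrm{region}\,@\,\rho \mid \langle \tau_1, \ldots, \tau_n \rangle\,@\,\rho \mid \tau \to \tau' \mid \forall\rho.\tau$
  • $\rho$: region variables
• Example:
  • $\forall\rho.(\langle \mathrm{int}, \mathrm{int} \rangle\,@\,\rho \to \mathrm{int})$
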
Region Static Checking: Tofte & Talpin
• Regions follow stack discipline
• letregion $\rho$ in e:
  • allocate a region named $\rho$
  • evaluate e (can use $\rho$)
  • delete region $\rho$
• safe if:
  • (region) type of e does not use $\rho$
  • $\rho$ is not free in the letregion's environment
• deallocation of regions is required...
• problem: pure stack discipline too restrictive ("leaks")
  • Aiken, Fähndrich, Levien: allocate late, deallocate early
  • Tofte & Talpin: other optimisations

Region Static Checking: Capabilities
• Crary, Walker, Morrisett:
  • capabilities available at each program point:
    • $\rho^1$: read objects in $\rho$, allocate in $\rho$, freergn $\rho$
      • guarantee: no other regions alias $\rho$ (so freergn safe)
    • $\rho^+$: read objects in $\rho$, allocate in $\rho$
    • $\rho^1 < \rho^+$ (capability "subtyping")
  • capabilities threaded through the program:
    • newrgn adds $\rho^1$ to the current capabilities
    • freergn removes $\rho^1$ from current capabilities
    • function calls can temporarily "lose" capabilities (but recoverable on return)
  • no capabilities allowed at exit:
    • deallocation of regions is required

Static Checking Limitations
• Some types are not expressible: list of regions
• Ease of programming is unknown
• No clear bounds on memory usage

Region Dynamic Checking: RC
• Features of RC:
  • region-based allocation:
    • newregion/deleteregion/ralloc
  • safety via reference-counting (RC):
    • RC(region r) = number of references to objects in r from outside r
    • deleteregion(r) fails if RC(r) > 0
  • type annotations to describe program's region structure

Example

    struct list { int i; struct list @next; } *a, *b;

    Region r = newregion();     /* RC(r) = 0 */
    b = rcons(r, 77, null);     /* RC(r) = 1: b points into r */
    a = rcons(r, 23, b);        /* RC(r) = 2: a and b point into r */
    b->next = a;                /* RC(r) = 2: a pointer within r is not counted */
    a = b = null;               /* RC(r) = 0 */
    deleteregion(r);            /* succeeds, since RC(r) = 0 */

Region advantages (over regular RC):
• good for cyclic structures
• space cost of RCs is negligible

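The example can be replayed in plain C by making the count updates explicit. This is only a simulation under stated assumptions: RC itself is a C dialect whose compiler inserts these updates, whereas here the `home` and `chain` bookkeeping fields, the helper `set_external`, and the function `rc_demo` are hypothetical names introduced for illustration, and each cell is allocated with `malloc` rather than from a real region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct list;
typedef struct region {
    int rc;                   /* references into this region from outside */
    struct list *objects;     /* every cell of the region, for deletion */
} *Region;

struct list {
    int i;
    struct list *next;        /* plays the role of a same-region pointer */
    Region home;              /* region holding this cell (bookkeeping) */
    struct list *chain;       /* links all cells of the region */
};

Region newregion(void) { return calloc(1, sizeof(struct region)); }

struct list *rcons(Region r, int i, struct list *next) {
    assert(next == NULL || next->home == r);   /* same-region check */
    struct list *c = malloc(sizeof *c);
    if (!c) return NULL;
    c->i = i; c->next = next; c->home = r;
    c->chain = r->objects; r->objects = c;
    return c;
}

/* Store into a pointer variable that lives outside any region. */
void set_external(struct list **var, struct list *val) {
    if (val)  val->home->rc++;
    if (*var) (*var)->home->rc--;
    *var = val;
}

/* Returns 1 and frees the region if RC(r) == 0, otherwise fails with 0. */
int deleteregion(Region r) {
    if (r->rc > 0) return 0;
    for (struct list *c = r->objects; c != NULL; ) {
        struct list *n = c->chain;
        free(c);
        c = n;
    }
    free(r);
    return 1;
}

/* Replays the slide example, recording RC(r) after each step. */
int rc_demo(int trace[6]) {
    struct list *a = NULL, *b = NULL;   /* counted like the slide's a and b */
    Region r = newregion();
    if (!r) return 0;
    trace[0] = r->rc;
    set_external(&b, rcons(r, 77, NULL)); trace[1] = r->rc;
    set_external(&a, rcons(r, 23, b));    trace[2] = r->rc;
    if (!a || !b) return 0;               /* allocation failed: give up */
    b->next = a;                          trace[3] = r->rc; /* cycle inside r */
    trace[4] = deleteregion(r);           /* refused: a and b still point in */
    set_external(&a, NULL);
    set_external(&b, NULL);               trace[5] = r->rc;
    return deleteregion(r);               /* now succeeds */
}
```

Because the cycle between the two cells lies entirely inside `r`, it never contributes to the count, which is why the region can be deleted once `a` and `b` are cleared.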
RC: Type annotations
• User view:
  • int *traditional x: "traditional" C pointer (not to region)
  • struct list { int i; struct list *sameregion next; }: pointer within same region
• Abstract view (ignoring issues with null):
  • region type system (like for static systems) with addition of existential types, $\tau = \ldots \mid \exists\rho.\tau$, and runtime checks
  • anylist $= \lambda\rho.\langle \mathrm{int}, \exists\rho_1.\mathrm{anylist}[\rho_1] \rangle\,@\,\rho$
  • list $= \lambda\rho.\langle \mathrm{int}, \mathrm{list}[\rho] \rangle\,@\,\rho$
  • runtime check that two region variables are identical:
    • chk $\rho_1 = \rho_2$

RC: Implementation
• Compiles to C
• Most RC updates for local variables are avoided
• Assignments to fields and globals produce obvious RC updates (cost: 16-23 instructions)
• Deleting a region is expensive (scan)

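One way to picture the "obvious RC updates" on a field assignment is the barrier below. It is a hedged sketch, not the code the RC compiler emits: the per-object `home` header, the `regionof` helper, and `store_field` are assumptions made for illustration, and the slides do not say how RC finds the region of an object.

```c
#include <assert.h>
#include <stddef.h>

typedef struct region { long rc; } Region;

/* Hypothetical object layout: a header naming the owning region, where
   NULL means the object lives outside any region (for example a global). */
typedef struct obj {
    Region *home;
    struct obj *field;
} obj;

static Region *regionof(const obj *o) { return o ? o->home : NULL; }

/* Barrier for the assignment "container->field = val". */
void store_field(obj *container, obj *val) {
    Region *from  = regionof(container);
    Region *old_r = regionof(container->field);
    Region *new_r = regionof(val);

    /* If old and new targets share a region, the updates would cancel. */
    if (old_r != new_r) {
        /* Only references that cross a region boundary are counted. */
        if (new_r && new_r != from) new_r->rc++;
        if (old_r && old_r != from) old_r->rc--;
    }
    container->field = val;
}
```

For instance, storing a pointer to an object of region `r2` into an object of region `r1` increments `r2`'s count, while a store between two objects of the same region leaves every count unchanged. A field annotated as same-region would need only the equality check and no count update.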
RC: Experiments
• Machine: 333 MHz UltraSparc I, Solaris 2.7
• Benchmarks: 8 medium to large C programs
• Regions vs malloc/free
• C compiler: gcc 2.95
• Measurements with UltraSparc internal counters

The Benchmarks
Eight C programs:
• cfrac: factorise large integers
• gröbner: find the Gröbner basis of a set of polynomials
• mudlle: byte-code compiler
• lcc: the lcc compiler
• tile: partitions text files based on word frequency
• moss: software plagiarism detector
• rc: RC compiler
• apache: apache web server

Results: Ease of Use (from old implementation)
• Size of substantive changes:
  • cfrac: 18 of 4203 lines
  • gröbner: 111 of 3219 lines
  • mudlle: 22 of 5078 lines
  • lcc: 349 of 12430 lines
  • tile: 10 of 926 lines
  • moss: 4 of 2675 lines
• Types of changes:
  • extra copying
  • clear unused references
  • work around prototype limitations

Dynamic Checking Limitations
• Runtime overhead: 0-20%
• Must clear dangling references
• Small number of objects/region is bad:
  • RC more painful
  • space & time overhead