The University of Texas at Austin
Free-Me: A Static Analysis for Individual Object Reclamation
Samuel Z. Guyer, Tufts University
Kathryn S. McKinley, University of Texas at Austin
Daniel Frampton, Australian National University
Motivation
• Automatic memory reclamation (GC)
• No need for explicit “free”
• Garbage collector reclaims memory
• Eliminates many programming errors
• Problem: when do we get memory back?
• Frequent GCs: reclaim memory quickly, with high overhead
• Infrequent GCs: lower overhead, but lots of garbage in memory
Example
• Notice: String idName is often garbage

void parse(InputStream stream) {
  while (not_done) {
    // Read a token (new String)
    String idName = stream.readToken();
    // Look up in symbol table
    Identifier id = symbolTable.lookup(idName);
    // If not there, create new identifier, add to symbol table
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    }
    // Compute on identifier
    computeOn(id);
  }
}
Solution
• Garbage does not accumulate
• String idName is garbage, free immediately

void parse(InputStream stream) {
  while (not_done) {
    String idName = stream.readToken();
    Identifier id = symbolTable.lookup(idName);
    if (id == null) {
      id = new Identifier(idName);
      symbolTable.add(idName, id);
    } else
      free(idName);
    computeOn(id);
  }
}
Our approach
Potential: 1.7X performance, malloc/free vs GC in tight heaps (Hertz & Berger, OOPSLA 2005)
• Adds free() automatically
• FreeMe compiler pass inserts calls to free()
• Preserve software engineering benefits
• Can’t determine lifetimes for all objects
• Works with the garbage collector
• Implementation of free() depends on collector
• Goal:
• Incremental, “eager” memory reclamation
Results: reduce GC load, improve performance
Outline
• Motivation
• Analysis
• Results
• Related work
• Conclusions
FreeMe Analysis
• Goal:
• Determine when an object becomes unreachable
Not a whole-program analysis*
• Idea: pointer analysis + liveness
• Pointer analysis for reachability
• Liveness analysis for when
Within a method, for allocation site “p = new A”, where can we place a call to “free(p)”?
* I’ll describe the interprocedural parts later
Pointer Analysis
• Connectivity graph
• Variables
• Allocation sites
• Globals (statics)
• Analysis algorithm
Flow-insensitive, field-insensitive

String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);

[Graph nodes: symbolTable, id, idName, (global), Identifier, readTokenString]
Adding liveness
• Key: An object is reachable only while at least one incoming pointer is live
From a variable: live range of the variable
From a global: live from the pointer store onward
From other object: live from the pointer store until source object becomes unreachable
Reachability is the union of all these live ranges
[Graph nodes: idName, (global), Identifier, readTokenString]
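The three liveness rules above can be sketched with one bit per program point. This is only an illustration: the class `EdgeLivenessSketch`, its method names, and the integer numbering of program points are assumptions made for this example, not part of FreeMe or JikesRVM, and intersecting with the source object's reachable set is a simplification of "until source object becomes unreachable".

```java
import java.util.BitSet;

// Hypothetical sketch: live ranges are bit sets over program points 0..numPoints-1.
final class EdgeLivenessSketch {
    /** Edge from a local variable: live exactly where the variable is live. */
    static BitSet fromVariable(BitSet variableLiveRange) {
        return (BitSet) variableLiveRange.clone();
    }

    /** Edge from a global: live from the pointer store onward. */
    static BitSet fromGlobal(int storePoint, int numPoints) {
        BitSet live = new BitSet(numPoints);
        live.set(storePoint, numPoints); // [storePoint, numPoints)
        return live;
    }

    /** Edge from another object: live from the store onward, while the source is reachable. */
    static BitSet fromObject(int storePoint, int numPoints, BitSet sourceReachable) {
        BitSet live = fromGlobal(storePoint, numPoints);
        live.and(sourceReachable); // simplification: clip to where the source is reachable
        return live;
    }

    /** Reachability of the target object: union of the live ranges of all incoming edges. */
    static BitSet reachable(BitSet... incomingEdges) {
        BitSet result = new BitSet();
        for (BitSet edge : incomingEdges) {
            result.or(edge);
        }
        return result;
    }
}
```

With a variable live at points 0 to 2 and a store into a global at point 4 (out of 6 points), the union leaves point 3 as the only point where the object is not reachable.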
Liveness Analysis
• Computed as sets of edges
• Variables
• Heap pointers

String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);

[Graph nodes: idName, (global), Identifier, readTokenString]
Where can we free it?
• Where object exists -minus-
• Where reachable

String idName = stream.readToken();
Identifier id = symbolTable.lookup(idName);
if (id == null) {
  id = new Identifier(idName);
  symbolTable.add(idName, id);
}
computeOn(id);

Compiler inserts call to free(idName)
[Graph node: readTokenString]
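The "exists minus reachable" rule can be sketched as a set difference over program points. The point numbering below is hand-assigned for one iteration of the parse loop and is purely hypothetical, as are the class and method names; a real analysis would derive these sets from the control-flow graph rather than hard-code them.

```java
import java.util.BitSet;

// Hypothetical sketch of free placement: points where the object exists minus points where it is reachable.
final class FreePlacementSketch {
    static BitSet freePoints(BitSet exists, BitSet reachable) {
        BitSet result = (BitSet) exists.clone();
        result.andNot(reachable); // set difference
        return result;
    }

    /** Hand-numbered points for the parse loop (assumed numbering, for illustration only). */
    static BitSet parseLoopExample() {
        // 0: readToken   1: lookup   2: id == null test   3: new Identifier
        // 4: symbolTable.add   5: else edge (id was found)   6: computeOn
        BitSet exists = new BitSet();
        exists.set(1, 7);        // the String exists once readToken has returned

        BitSet viaVariable = new BitSet();
        viaVariable.set(1, 5);   // idName is still needed up to the add at point 4

        BitSet viaGlobal = new BitSet();
        viaGlobal.set(4);        // stored into the symbol table at point 4
        viaGlobal.set(6);        // conservatively still reachable after the branches merge

        BitSet reachable = (BitSet) viaVariable.clone();
        reachable.or(viaGlobal);
        return freePoints(exists, reachable);
    }
}
```

Under these assumptions the only free point is the else edge (point 5), which matches the `else free(idName)` placement shown on the Solution slide.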
Interprocedural component
• Detection of factory methods
• Return value is a new object
• Can be freed by the caller
• Effects of methods called
• Describes how parameters are connected
• Compilation strategy:
• Summaries pre-computed for all methods
• Free-me only applied to hot methods

String idName = stream.readToken();
symbolTable.add(idName, id);
Summary for Hashtable.add: (0 → 1) (0 → 2)
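One way to read a summary such as (0 → 1) (0 → 2) is as a list of edges between parameter positions that the caller replays on its own connectivity graph at the call site. The sketch below assumes that reading; the class `SummarySketch`, the `int[][]` encoding, and the use of node names as strings are all illustrative choices, not the representation used in the FreeMe implementation.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: apply a method summary at a call site.
final class SummarySketch {
    /**
     * summaryEdges: pairs {from, to} of parameter positions (0 is the receiver).
     * actuals: graph node bound to each parameter position at this call.
     * graph: connectivity graph, node name -> nodes it may point to (updated in place).
     */
    static Map<String, Set<String>> applySummary(int[][] summaryEdges,
                                                 String[] actuals,
                                                 Map<String, Set<String>> graph) {
        for (int[] edge : summaryEdges) {
            String from = actuals[edge[0]];
            String to = actuals[edge[1]];
            graph.computeIfAbsent(from, k -> new TreeSet<>()).add(to);
        }
        return graph;
    }
}
```

For the call `symbolTable.add(idName, id)`, binding position 0 to the `(global)` node, position 1 to `readTokenString`, and position 2 to `Identifier` adds the two edges from `(global)` that appear in the slide's graph.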
Implementation in JikesRVM
• FreeMe added to OPT compiler
• Run-time: depends on collector
• Mark/sweep
Free-list: free() operation
• Generational mark/sweep
Unbump: move nursery “bump pointer” backward
Unreserve: reduce copy reserve
• Very low overhead
• Run longer without collecting
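The "unbump" idea can be sketched with a toy bump-pointer nursery. This is not JikesRVM code: the class `BumpNurserySketch` and its methods are hypothetical, offsets stand in for addresses, and the sketch assumes that free() only reclaims space when the freed object is the most recently allocated one, leaving everything else to the collector.

```java
// Hypothetical sketch of a bump-pointer nursery with an "unbump" free().
final class BumpNurserySketch {
    private final int limit;  // nursery capacity in bytes
    private int cursor = 0;   // the bump pointer: next free offset

    BumpNurserySketch(int capacityBytes) {
        this.limit = capacityBytes;
    }

    /** Returns the start offset of the new object, or -1 if the nursery is full. */
    int alloc(int size) {
        if (size <= 0 || size > limit - cursor) {
            return -1; // a real system would trigger a collection here
        }
        int start = cursor;
        cursor += size;
        return start;
    }

    /** Unbump: if the object sits at the top of the nursery, move the bump pointer backward. */
    boolean free(int start, int size) {
        if (size > 0 && start >= 0 && start + size == cursor) {
            cursor = start;
            return true;
        }
        return false; // not on top: leave the object for the garbage collector
    }

    int used() {
        return cursor;
    }
}
```

Freeing an object that is not on top is a no-op here, which keeps free() cheap; the reclaimed bytes are immediately reused by the next allocation, so the nursery fills more slowly and collections happen less often.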
[Chart: volume freed, in MB, on a 0% to 100% axis, for the SPEC benchmarks and the DaCapo benchmarks, each group ordered by increasing allocation size. Axis labels (MB), as extracted: 74, 91, 98, 105, 180, 183, 263, 271, 103, 348, 515, 523, 716, 822, 1544, 8195. Benchmarks: db, ps, fop, mtrt, pmd, antlr, jess, jack, xalan, bloat, javac, jython, hsqldb, raytrace, compress, pseudojbb]
[Chart: volume freed by FreeMe per benchmark, in MB, on a 0% to 100% axis. FreeMe mean: 32%. Bar labels (MB), as extracted: 73, 73, 45, 163, 673, 278, 30, 222, 75, 1607, 34, 24, 57, 16, 22, 0]
Compare to stack-like behavior
• Notice: Stacks and regions won’t work for the example
• idName escapes some of the time
• Not biased: 35% vs 65%
• Comparison: restrict placement of free()
Object must not escape
• No conditional free
• No factory methods
• Other approaches:
• Optimistic, dynamic stack allocation [Azul, Corry 06]
• Scalar replacement
[Chart: volume freed with stack-like placement per benchmark, in MB, on a 0% to 100% axis. Stack-like mean: 20%. Bar labels (MB), as extracted: 73, 73, 45, 103, 21, 1566, 15, 146, 56, 16, 28, 35, 14, 6, 3, 0]
[Chart: Mark/sweep – GC time, all benchmarks. Callouts: 30%, 9%]
[Chart: Mark/sweep – time, all benchmarks. Callouts: 20%, 15%, 6%]
[Chart: GenMS – time, all benchmarks]
[Chart: GenMS – GC time, all benchmarks]
Why doesn’t this help?
• FreeMe mostly finding short-lived objects
• Nursery reclaims dead objects for free (cost ~ survivors)
Note: the number of GCs is greatly reduced
[Chart: Bloat – GC time. Callout: 12%]
Related work
• Compile-time memory management
• Functional languages [Barth 77, Hughes 92, Mazur 01]
• Shape analysis [Shaham 03, Cherem 06]
• Stack allocation [Gay 00, Blanchet 03, Choi 03]
• Tied to stack frames or other scopes
• Objects from a site must not escape
• Region inference [Tofte 96, Hallenberg 02]
• Cheap reclamation – no scoping constraints
• Similar all-or-nothing limitation
Conclusions
• FreeMe analysis
• Finds many objects to free: often 30% to 60%
• Most are short-lived objects
• GC + explicit free()
• Advantage over stack/region allocation: no need to make decision at allocation time
• Generational collectors
• Nursery works very well
• Abandon techniques that replace nursery?
• Mark-sweep collectors
• 50% to 200% speedup
• Works better as memory gets tighter
Embedded applications: compile-ahead, memory constrained, non-moving collectors