Parallel, Real-Time Garbage Collection
Daniel Spoonhower
Guy Blelloch, Robert Harper, David Swasey
Carnegie Mellon University

Motivation: Support Real-Time in C#
• Real-time requires predictability
• Focus on garbage collection in this work
• Real-time collection must be:
  • predictable – allow programmers to reason about code
  • efficient – RT apps must meet deadlines
  • scalable – multi-threaded apps need a multi-threaded GC

Our Results
• Collector implementation available online
  • incremental GC w/ stacklets real-soon-now
• C# benchmarks also available
• Published work in VEE ’05
  • using residency to balance GC tradeoffs
  • dynamically adapt to application behavior

Outline of Talk
• Implementing a real-time collector
  • incremental collection
  • challenges in Rotor
• Using residency to balance GC tradeoffs
  • between copying and non-copying GC
  • semantic constraints (e.g. pinning)
  • use of space and time (i.e. overhead)

Incremental Collection
Divide collector work into small pieces
• interleave collector execution with application
• avoid long interruptions of application
• divide call stack into fixed-size stacklets
• typical metric: (minimum) mutator utilization

Minimum Mutator Utilization (MMU)

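The slide names the metric without spelling it out. Under the usual definition, the MMU for a window length w is the smallest fraction of any w-long interval in which the application (the mutator) gets to run. The sketch below computes it from a list of collector pauses; the function names, the pause trace, and the time units are all illustrative assumptions, not taken from the talk.

```python
def mutator_utilization(pauses, start, window):
    """Fraction of the window [start, start + window] in which the mutator runs."""
    end = start + window
    paused = sum(max(0.0, min(e, end) - max(s, start)) for s, e in pauses)
    return 1.0 - paused / window


def mmu(pauses, total, window):
    """Minimum mutator utilization over all windows of length `window`.

    pauses: non-overlapping (start, end) collector pauses within [0, total].
    """
    if not 0 < window <= total:
        raise ValueError("window must be in (0, total]")
    last_start = float(total - window)
    # Paused time is piecewise linear in the window position, so its maximum
    # is reached where a window starts at a pause start, ends at a pause end,
    # or sits at either end of the trace.
    candidates = {0.0, last_start}
    for s, e in pauses:
        for t in (s, e - window):
            candidates.add(min(max(float(t), 0.0), last_start))
    return min(mutator_utilization(pauses, t, window) for t in candidates)


# Hypothetical trace: two pauses of length 1 in a run of length 10.
trace = [(2, 3), (5, 6)]
for w in (1, 4, 10):
    print(w, mmu(trace, total=10, window=w))
```

Short windows can fall entirely inside a pause, so utilization there drops to zero; breaking collector work into smaller pieces is what raises the MMU at small window sizes.
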
Building on Previous Work
• We extend replicating copying collection
  • Nettles & O’Toole; Cheng & Blelloch
  • based on semi-space copying collection
  • good time bounds, high space overhead
• Rotor requires GC to support pinned objects
  • part of foreign function interface
  • determined by programmer
Incompatible with pure copying GC!

Outline of Talk
• Implementing a real-time collector
  • incremental collection
  • challenges in Rotor
• Using residency to balance tradeoffs
  • between copying and non-copying GC
  • semantic constraints (e.g. pinning)
  • use of space and time (i.e. overhead)

Adaptive Hybrid Collection
• Bartlett: mostly copying collection
  • reduce fragmentation in constrained GC
• We extend this to uniformly account for:
  • pinned objects
  • large objects
  • densely populated pages

Mostly Copying Collection
• “Copy objects... most of the time.”
• Key ideas:
  • divide heap into pages
  • determine strategy for each page dynamically: objects may be evacuated or promoted in place
• Gain benefits of both copying and non-copying collection

Mostly Copying Collection
divide heap into pages…
…reserve pages for allocation…
…allocate objects…
…begin collection…
…flip…
…some objects are unreachable…
…evacuate reachable objects…
…pinned…
…in-place promotion!
…and reclaim unused space.

Don’t Relocate Large Objects
• Large objects often relegated to separate heap
  • ... because they are expensive to copy
• We use in-place promotion instead.
  • avoids usual penalties of copying large objects
  • no additional heap space required
  • use small objects to fill in gaps

Don’t Relocate Large Objects
• They don’t cause (much) fragmentation
  • little benefit to moving them around
• Similarly for densely populated pages:
  • expensive to evacuate
  • little fragmentation

Page Residency as Guiding Principle
• Residency = density of reachable objects
  • avoid evacuating objects from dense pages
  • focus on fragmented parts of the heap
• Large objects occupy pages w/ high residency.
  • therefore, they are promoted in place

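As a rough illustration of residency as a per-page density, the sketch below sums the bytes of reachable objects that fall on each page and divides by the page size. The 4096-byte page size, the `(address, size)` representation, and the function name are assumptions made for the example; the talk does not describe the collector's actual bookkeeping.

```python
from collections import defaultdict


def page_residency(live_objects, page_size=4096):
    """Map page index -> fraction of the page covered by reachable objects.

    live_objects: iterable of (address, size) pairs for reachable objects.
    An object that spans several pages contributes to each page it touches.
    """
    live_bytes = defaultdict(int)
    for addr, size in live_objects:
        end = addr + size
        while addr < end:
            page = addr // page_size
            chunk = min(end, (page + 1) * page_size) - addr
            live_bytes[page] += chunk
            addr += chunk
    return {page: n / page_size for page, n in live_bytes.items()}


# Hypothetical heap: two small objects on page 0, one large object filling
# pages 1 and 2, and one small object on page 3.
live = [(0, 1024), (2048, 1024), (4096, 8192), (12288, 512)]
print(page_residency(live))
```

In this example the large object leaves pages 1 and 2 fully resident, which is the case the slide singles out for in-place promotion.
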
Revisiting Allocation
• Copying collectors use simple space check
  • constant time when there is sufficient space
  • requires contiguous unused space
• In-place promotion requires more sophisticated management of free space.
  • need to find a free gap of appropriate size

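The contrast can be sketched with two textbook allocators: a bump-pointer check of the kind a copying collector relies on, and a plain first-fit search over free gaps. Both functions and their data layouts are illustrative assumptions, not the collector's implementation.

```python
def bump_allocate(region, size):
    """Constant-time allocation from one contiguous free region.

    region: dict with 'free' (next free address) and 'limit' (end of region).
    Returns the allocated address, or None if the region is too small.
    """
    if region["limit"] - region["free"] < size:
        return None
    addr = region["free"]
    region["free"] += size
    return addr


def first_fit_allocate(gaps, size):
    """Allocate from the first gap that is large enough.

    gaps: list of [start, length] free gaps in address order, updated in place.
    Returns the allocated address, or None if no gap fits.
    """
    for i, (start, length) in enumerate(gaps):
        if length >= size:
            if length == size:
                del gaps[i]  # gap is used up entirely
            else:
                gaps[i] = [start + size, length - size]  # shrink the gap
            return start
    return None


region = {"free": 0, "limit": 64}
print(bump_allocate(region, 48), bump_allocate(region, 32))

gaps = [[0, 16], [32, 64], [128, 32]]
print(first_fit_allocate(gaps, 24), gaps)
```

The bump allocator does a single comparison but only works on contiguous space, while first-fit has to search the gaps left behind by objects that stayed in place.
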
Overview of Our Hybrid Algorithm
• Collection strategy determined for each page
• Two data structures for heap traversal:
  • Cheney queues for evacuated objects
  • mark stack for objects promoted in place
• Allocation using a modified first-fit
  • favor free pages (treated as large gaps)
  • other free list algorithms also applicable

Two Mechanisms to Reclaim Space
• Reclaim space both...
  • via evacuation
    • use residency to determine which objects should be evacuated
  • during allocation
    • use residency to determine which pages have enough space to make allocation worthwhile

Collector Configurations
• Two residency parameters:
  • r_evac = evacuation threshold, maximum residency at which objects are evacuated
  • r_alloc = allocation threshold, maximum residency at which free space is reused
  (both measured as % of page)
• Classic algorithms fall out as special cases
  • semi-space: r_evac = 100%, r_alloc = 0%
  • mark-sweep: r_evac = 0%, r_alloc = 100%

A Continuum of Configurations
• r_evac = when objects are evacuated
• r_alloc = when free space is reused

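One way to read the two thresholds is as a pair of per-page predicates. The sketch below assumes residency is a fraction between 0 and 1, that both comparisons are inclusive, and that pinned pages are never evacuated; the `Config` class, the function names, and the `HYBRID` values are hypothetical and chosen only to show a point between the two classic extremes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    r_evac: float   # evacuate pages whose residency is at most this
    r_alloc: float  # reuse free space on pages whose residency is at most this


SEMI_SPACE = Config(r_evac=1.0, r_alloc=0.0)   # evacuate everything, allocate only in empty pages
MARK_SWEEP = Config(r_evac=0.0, r_alloc=1.0)   # move nothing, reuse every gap
HYBRID = Config(r_evac=0.25, r_alloc=0.75)     # hypothetical intermediate setting


def should_evacuate(residency, cfg, pinned=False):
    """Decide at the start of a collection whether to copy objects off a page."""
    # With r_evac = 0 only empty pages qualify, so nothing is actually moved.
    return not pinned and residency <= cfg.r_evac


def may_allocate_into(residency, cfg):
    """Decide during allocation whether a page's free space is worth reusing."""
    return residency <= cfg.r_alloc


for name, cfg in (("semi-space", SEMI_SPACE), ("mark-sweep", MARK_SWEEP), ("hybrid", HYBRID)):
    print(name, [should_evacuate(r, cfg) for r in (0.1, 0.5, 0.9)])
```

Moving either threshold between 0 and 1 slides the collector along the continuum: a higher `r_evac` copies more, and a higher `r_alloc` fills more gaps in partially occupied pages.
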
Adaptive Hybrid Collection
• So far...
  • outlined a hybrid algorithm for garbage collection
  • shown how residency thresholds give rise to a continuum of configurations
• Next: how to determine residency...

Measuring Residency
• Residency can be computed during a heap traversal at little additional cost.
• However, we need to know residency at the beginning of collection...

Use Past Measurements
• One possibility: explicitly measure residency during each collection (using an extra traversal).
• Instead, measure and predict future values
  • historical measurements avoid a second pass
  • measured residency still accounts for changes in program behavior over time

Existing GCs Also Predict Residency
• Other hybrids fix residency estimates in advance.
  • syntactic properties to distinguish objects
  • e.g. size, type, allocation point
• Age: “Most objects die young.”
  • “Nursery is sparsely populated.”
  • i.e. using age to predict residency
  • however, generational collectors are often inconsistent in using age (e.g. large objects)

Recovering from Poor Predictions
• Adaptive GC recovers naturally from poor predictions.
  • at most two collections to recover space
  • without changes to algorithm

Recovering from Poor Predictions
• Consider the impact of overestimation:
  • many sparse pages remain – high fragmentation
  • measurements give true residency for each page
  • next collection will relocate objects to reclaim space
• GC with poor predictions behaves like an explicit measurement phase.
  • in effect, only corrects inaccurate estimates
• Allocation uses true residency, not predictions.
  • actual residency is known at the end of collection

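The recovery argument can be traced on a single page with a toy model. The sketch assumes the simplest possible predictor (the last measured residency becomes the next prediction), hypothetical threshold values, and a page whose live data does not change between the two collections; none of these details come from the talk.

```python
R_EVAC = 0.25   # hypothetical evacuation threshold
R_ALLOC = 0.75  # hypothetical allocation threshold


def collect(predicted, true_residency):
    """Run one collection on a single unpinned page.

    The strategy is chosen from the prediction; the traversal then yields
    the measured residency that is known at the end of the collection.
    """
    action = "evacuate" if predicted <= R_EVAC else "promote in place"
    # Evacuation empties the page; in-place promotion leaves live data there.
    measured = 0.0 if action == "evacuate" else true_residency
    return action, measured


def simulate(initial_prediction, true_residency, collections=2):
    """Return (action, measured residency, reusable for allocation) per collection."""
    prediction = initial_prediction
    log = []
    for _ in range(collections):
        action, measured = collect(prediction, true_residency)
        reusable = measured <= R_ALLOC  # allocation uses the measurement
        log.append((action, measured, reusable))
        prediction = measured  # last measurement predicts the next collection
    return log


# A sparse page (20% live) wrongly predicted to be dense (90%).
for step in simulate(initial_prediction=0.9, true_residency=0.2):
    print(step)
```

In this run the first collection promotes the sparse page in place but records its true residency, so its gaps are already available to the allocator and the second collection evacuates it.
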
Experimental Results
• If we know usage patterns of an application in advance, we can choose an appropriate algorithm.
  • adaptive hybrid GC relieves us of this choice
• Consider the impact of heap size on GC overhead
  • caveat: other effects of GC not shown here

Impact of Heap Size
• Heap size = memory available to collector
• We expect copying collectors to perform well when there is plenty of space.
• For small heaps, semi-space collectors may not satisfy all allocation requests.
• Hybrid yields good performance in all cases.

Impact of Heap Size (two plot slides; figures not available in this transcript)

Other Measurements
• Same configuration performs well in many different environments
  • different applications
  • small and large footprints (100 kB to 100 MB)

Related Work
• Replicating Collection – Nettles & O’Toole; Cheng & Blelloch
• Mostly Copying – Bartlett; Smith & Morrisett
• Garbage First – Detlefs, Flood, Heller, & Printezis (also uses residency as a measure of cost)
• Metronome – Bacon, Cheng, & Rajan (different point in the “mostly” design space)

Summary
• Incremental collection
  • divide up collector work over time
• Adaptive hybrid collection
  • use page residency to guide runtime decisions
  • continuous spectrum of tracing collectors
• Measure and adapt to application behavior
  • predict residency based on past measurements
  • naturally recovers from poor estimates

Useful Links
• VEE ’05 paper:
  • http://www.cs.cmu.edu/~spoons/gc/vee05.pdf
• Collector implementation:
  • http://www.cs.cmu.edu/~spoons/gc/
• C# benchmarks:
  • http://www.cs.cmu.edu/~spoons/gc/benchmarks.html