Immix: A Mark-Region Garbage Collector
Curtis Dunham, CS 395T Presentation, Feb 2, 2011
I believe this presentation to be ~95% Steve’s PLDI talk, ~4% Jennifer Sartor’s 395T presentation, and < 1% mine. Thanks to Steve Blackburn and Jennifer Sartor for their 2008 and 2009 Immix presentations, respectively.
Comparison to Prior Work; Contributions
• Status quo before this work: Mark-Sweep, Mark-Compact, Copying
• Axes of comparison: mutator performance (bump pointer, locality), space efficiency, fast collection
• Post-Immix, the new world of GC: Immix is mark-region with opportunistic defragmentation
• Does both (every object is either marked or copied) in one pass
• Non-semispace
GC Fundamentals: Algorithmic Components
• Identification: Tracing (implicit), Reference Counting (explicit)
• Allocation: Free List, Bump Allocation
• Reclamation: Sweep-to-Free, Compact, Evacuate
GC Fundamentals: Canonical Garbage Collectors
• Mark-Sweep [McCarthy 1960]: free-list + trace + sweep-to-free
• Mark-Compact [Styger 1967]: bump allocation + trace + compact
• Semi-Space [Cheney 1970]: bump allocation + trace + evacuate
Thanks to Steve for his Immix presentation from 2008.
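The two allocation styles named on this slide can be contrasted with a minimal sketch. The class names are hypothetical, addresses are plain integers, and the first-fit free list is a simplification chosen for brevity (an assumption, not what any particular collector does).

```python
class BumpAllocator:
    """Hypothetical bump-pointer allocator over a contiguous address range."""

    def __init__(self, start, end):
        self.cursor = start  # next free address
        self.limit = end     # one past the last usable address

    def alloc(self, size):
        # Allocation is a bounds check plus a pointer increment.
        if self.cursor + size > self.limit:
            return None  # out of space: a real collector would trigger GC
        addr = self.cursor
        self.cursor += size
        return addr


class FreeListAllocator:
    """Hypothetical first-fit free list; cells are (address, size) pairs."""

    def __init__(self, cells):
        self.cells = list(cells)

    def alloc(self, size):
        for i, (addr, cell_size) in enumerate(self.cells):
            if cell_size >= size:
                # Split the cell and keep any remainder on the list.
                if cell_size > size:
                    self.cells[i] = (addr + size, cell_size - size)
                else:
                    del self.cells[i]
                return addr
        return None


bump = BumpAllocator(0, 64)
a, b = bump.alloc(16), bump.alloc(24)  # consecutive requests are adjacent

free_list = FreeListAllocator([(100, 8), (200, 32)])
c, d = free_list.alloc(16), free_list.alloc(8)  # placement follows the holes
```

Objects allocated back to back by the bump allocator are adjacent in memory, which is the locality benefit the slides attribute to bump allocation; the free list places objects wherever a cell happens to fit.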
Sweep-to-Region and Mark-Region
• Mark-Sweep: free-list + trace + sweep-to-free
• Mark-Compact: bump allocation + trace + compact
• Semi-Space: bump allocation + trace + evacuate
• Mark-Region: bump allocation + trace + sweep-to-region
Reclamation: Sweep-to-Free, Compact, Evacuate, Sweep-to-Region
Naïve Mark-Region
• Contiguous allocation into regions
• Excellent locality
• For simplicity, objects cannot span regions
• Simple mark phase (like mark-sweep)
• Mark objects and their containing region
• Unmarked regions can be freed
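The mark phase and sweep-to-region step described above can be sketched as follows. This is an illustration under stated assumptions: the region size, the `edges` dictionary standing in for object references, and both function names are hypothetical.

```python
REGION_SIZE = 1024  # illustrative region size in bytes (assumption)


def mark(roots, edges):
    """Trace from the roots; return live object addresses and marked regions.

    `edges` maps an object's address to the addresses it references.
    """
    live, marked_regions = set(), set()
    stack = list(roots)
    while stack:
        addr = stack.pop()
        if addr in live:
            continue
        live.add(addr)
        # Objects never span regions here, so one region mark suffices.
        marked_regions.add(addr // REGION_SIZE)
        stack.extend(edges.get(addr, ()))
    return live, marked_regions


def sweep_to_region(num_regions, marked_regions):
    """Every unmarked region is free for bump allocation again."""
    return [r for r in range(num_regions) if r not in marked_regions]


# Objects at 0 and 2100 are reachable; the object at 1200 is garbage.
edges = {0: [2100], 1200: [0], 2100: []}
live, marked = mark(roots=[0], edges=edges)
free_regions = sweep_to_region(4, marked)
```

Note that the sweep never visits individual objects: it only inspects one mark per region, which is why collection stays cheap.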
Heap Organization
• Blocks (analogous to regions)
• Recyclable: reusable for (more) allocation
• Immix block = 32KB
• Lines: 256 per block
• Objects can span lines
• Immix line = 128B
• Opportunistic defragmentation
• Candidate and target blocks: move from mostly-empty to mostly-full
• Single pass to mark and copy
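The block and line sizes above fix the arithmetic: 32KB / 128B = 256 lines per block. A small sketch of the address calculations follows; the helper names and the `heap_base` parameter are hypothetical, and blocks are assumed to be laid out contiguously from the heap base.

```python
BLOCK_BYTES = 32 * 1024  # Immix block size from the slide
LINE_BYTES = 128         # Immix line size from the slide
LINES_PER_BLOCK = BLOCK_BYTES // LINE_BYTES


def locate(addr, heap_base=0):
    """Return (block index, line index within the block) for an address."""
    offset = addr - heap_base
    return offset // BLOCK_BYTES, (offset % BLOCK_BYTES) // LINE_BYTES


def lines_spanned(addr, size, heap_base=0):
    """Number of lines an object of `size` bytes at `addr` touches."""
    first = (addr - heap_base) // LINE_BYTES
    last = (addr - heap_base + size - 1) // LINE_BYTES
    return last - first + 1
```

For example, a 64-byte object starting at byte 100 crosses the boundary at byte 128 and therefore touches two lines, which is why objects spanning lines must be accounted for when marking.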
Immix: Lines and Blocks
“In a mark-region collector, region size embodies the collector’s space-time tradeoff.”
Large Regions
✓ More contiguous allocation
✗ Fragmentation (false marking)
Small Regions
✗ Increased metadata overhead
✗ Constrained object sizes
✗ Fragmentation (can’t fill blocks)
Lines & Blocks (block = N pages; line = approx. 1 cache line; block states shown: Free, Recyclable)
✓ Less fragmentation
✓ Fast common case
▫ TLB locality, cache locality
▫ Block > 4× max object size
▫ Objects span lines
▫ Lines marked with objects
Allocation Policy (Recycling)
• Recycle partially marked blocks first
• Minimize fragmentation
• Maximize sharing of freed blocks
• Recycle in address order
• We explored other options
• Allocate into free blocks last
• Effect on locality and fragmentation?
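The ordering policy above can be sketched as a simple sort. The function name and the dictionary representation (block start address mapped to its count of marked lines) are hypothetical; ordering the free blocks by address is also an assumption, since the slide only says they come last.

```python
LINES_PER_BLOCK = 256  # 32KB block / 128B line


def allocation_order(blocks):
    """Order blocks for allocation: recyclable first, then completely free.

    `blocks` maps a block's start address to its count of marked lines.
    A block with every line marked is full and is skipped.
    """
    recyclable = sorted(a for a, m in blocks.items() if 0 < m < LINES_PER_BLOCK)
    free = sorted(a for a, m in blocks.items() if m == 0)
    return recyclable + free


blocks = {0x18000: 10, 0x0: 0, 0x8000: 256, 0x10000: 200}
order = allocation_order(blocks)
```

Here the two partially marked blocks are used in address order before the empty block at address 0, which keeps wholly free blocks available for as long as possible.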
Opportunistic Defragmentation
• Identify source and target blocks (see paper for heuristics)
• Evacuate objects in source blocks
• Allocate into target blocks
• Opportunistic
• Leave in place if no space, or object pinned
• Opportunistically evacuate fragmented blocks
• Lightweight, uses same allocation mechanism
• No cost in common case (specialized GC)
• Source = most holes
• Other heuristics?
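The mark-or-copy decision above can be sketched as follows. This is a simplified illustration, not the paper's heuristics: the function names, the object dictionaries, and the single byte counter standing in for target-block space are all assumptions.

```python
def choose_sources(holes_per_block, count):
    """Pick the `count` blocks with the most holes as evacuation sources."""
    ranked = sorted(holes_per_block, key=lambda b: (-holes_per_block[b], b))
    return set(ranked[:count])


def trace_object(obj, sources, target_free_bytes):
    """Mark or copy one live object; return (action, remaining target space)."""
    in_source = obj["block"] in sources
    if in_source and not obj["pinned"] and obj["size"] <= target_free_bytes:
        return "copied", target_free_bytes - obj["size"]
    # No space, pinned, or not in a source block: mark in place.
    return "marked", target_free_bytes


holes = {"A": 5, "B": 1, "C": 3}
sources = choose_sources(holes, count=1)

objects = [
    {"block": "A", "size": 64, "pinned": False},
    {"block": "A", "size": 64, "pinned": False},
    {"block": "A", "size": 8, "pinned": True},
    {"block": "B", "size": 8, "pinned": False},
]
space, actions = 100, []
for obj in objects:
    action, space = trace_object(obj, sources, space)
    actions.append(action)
```

Every live object ends up either copied or marked within the same trace, so running out of target space degrades gracefully to marking in place rather than failing the collection.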
Details
• Parallelizable
• Coarse sweeping
• Defragmentation
• Demand-driven overflow allocations
• Medium objects
• Metadata space overheads
• For parallel synch: mark bytes (not bits)
• Line and block mark, not just object mark
• Defragmentation headroom
• Overflow allocation block
• Conservative line marking
Other Optimizations
Implicit Marking
✓ Most objects small
▫ Small objects implicitly mark next line
▫ Large objects mark lines exactly
✓ Very fast common case
[Figure: line mark vs. implicit line mark]
Overflow Allocation
▫ Multi-line objects may skip many small holes
▫ Overflow allocation (used on failure)
✓ Large objects uncommon
✓ Very effective solution
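Implicit marking can be sketched as below. This is a minimal illustration assuming that "small" means no larger than one 128B line; the function names and the boolean list of line marks are hypothetical.

```python
LINE_BYTES = 128


def mark_lines(line_marks, offset, size):
    """Mark lines for an object at byte `offset` within a block."""
    first = offset // LINE_BYTES
    if size <= LINE_BYTES:
        # Small object: mark only its first line. It may spill into the
        # next line, which the allocator treats as implicitly marked.
        line_marks[first] = True
    else:
        # Larger object: mark exactly the lines it covers.
        last = (offset + size - 1) // LINE_BYTES
        for line in range(first, last + 1):
            line_marks[line] = True


def find_holes(line_marks):
    """Return (start, length) runs of lines usable for allocation.

    An unmarked line directly after a marked line is skipped, because a
    small object may extend into it.
    """
    holes, start = [], None
    for i, marked in enumerate(line_marks):
        implicit = i > 0 and line_marks[i - 1]
        if not marked and not implicit:
            if start is None:
                start = i
        elif start is not None:
            holes.append((start, i - start))
            start = None
    if start is not None:
        holes.append((start, len(line_marks) - start))
    return holes


marks = [False] * 8
mark_lines(marks, 200, 100)  # small object spilling from line 1 into line 2
mark_lines(marks, 640, 300)  # larger object covering lines 5 to 7 exactly
holes = find_holes(marks)
```

The small object never pays for a size check that walks several lines, yet line 2 is still protected because the hole search skips the line after any marked line. The cost is that one potentially free line per hole goes unused.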
Mark-Region: Immix (Bump Allocation + Trace + Sweep-to-Region)
✓ Good locality
✓ Space efficient
✓ Excellent performance
✓ Simple, very fast collection
[Figure: actual data, taken from geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo]
Total Performance
[Figure: geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo]
Discussion
• Necessity of two-level hierarchy?
• Caching/Paging?
• Efficacy of tuned line/block sizes: e.g. actual TLB miss reduction?
• Implicit marking: do the advantages overcome possible fragmentation?
• Methodology and results
Sticky Performance
• Benefits of sticky?
[Figure: geomean of DaCapo, jvm98, and jbb2000 on 2.4GHz Core 2 Duo]