An On-the-Fly Mark and Sweep Garbage Collector Based on Sliding Views
Hezi Azatchi - IBM
Yossi Levanoni - Microsoft
Harel Paz - Technion
Erez Petrank - Technion

Garbage Collection Today
• Today’s advanced environments: multiprocessors + large memories
• Dealing with multiprocessors:
  • Stop-the-world collection
  • Parallel collection (informal pause time: 300ms)
  • Concurrent collection (informal pause time: 30ms)
  • On-the-fly collection (informal pause time: 3ms)
• Informal throughput loss (figure labels): 10%, 10%
GC via Sliding Views

This Talk
• A new on-the-fly mark and sweep collector.
• A synergy of snapshot collection and sliding views.
• Implementation and measurements on the Jikes RVM.
  • Pause times < 2ms
  • Throughput loss 10%.

The Mark-Sweep algorithm [McCarthy 1960]
• Traverse & mark live objects, starting from the roots (including globals).
• White (unmarked) objects may be reclaimed.

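The two phases above can be sketched in a few lines of C. This is a minimal stop-the-world illustration, not the collector presented in this talk; the `Object` layout, the fixed `MAX_FIELDS`, the array-based heap, and the names `mark_from`, `sweep`, and `collect` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 2   /* assumption: every object has two pointer fields */
#define HEAP_MAX   64  /* assumption: small fixed-size heap for the sketch */

typedef struct Object {
    bool marked;                        /* false = white, true = marked */
    bool allocated;                     /* false = slot is free */
    struct Object *fields[MAX_FIELDS];
} Object;

/* Mark phase: traverse from one root with an explicit stack.
 * Objects are marked when pushed, so each is pushed at most once. */
static void mark_from(Object *root, Object **stack, size_t cap) {
    size_t top = 0;
    if (root == NULL || root->marked) return;
    root->marked = true;
    stack[top++] = root;
    while (top > 0) {
        Object *o = stack[--top];
        for (int i = 0; i < MAX_FIELDS; i++) {
            Object *child = o->fields[i];
            if (child != NULL && !child->marked) {
                assert(top < cap);
                child->marked = true;
                stack[top++] = child;
            }
        }
    }
}

/* Sweep phase: reclaim white objects, reset marks on survivors. */
static size_t sweep(Object *heap, size_t n) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < n; i++) {
        if (!heap[i].allocated) continue;
        if (heap[i].marked) {
            heap[i].marked = false;     /* ready for the next cycle */
        } else {
            heap[i].allocated = false;  /* white: reclaim */
            reclaimed++;
        }
    }
    return reclaimed;
}

/* One full collection; returns the number of reclaimed objects. */
static size_t collect(Object *heap, size_t n, Object **roots, size_t nroots) {
    Object *stack[HEAP_MAX];
    assert(n <= HEAP_MAX);
    for (size_t i = 0; i < nroots; i++) mark_from(roots[i], stack, HEAP_MAX);
    return sweep(heap, n);
}
```

Because marking follows pointers rather than counting them, an unreachable cycle is reclaimed like any other white object.
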
Base: a snapshot collection
• A naïve collector:
  • Stop program threads
  • Create a snapshot (replica) of the heap
  • Program threads resume
  • Trace replica concurrently with program
  • Objects identified as unreachable in the replica may be collected.
• Problem: taking a replica of the heap is not realistic.
• [Furusou et al. 91]: use a copy-on-write barrier.
  • No need to copy unless area written
  • Use virtual pages.

Some inefficiencies
• Copying a page requires synchronization.
• Efficiency depends on the system.
• Triggering and copying apply to all fields although only pointers are interesting:
  • Programs work at the object level, whereas this mechanism works at the page level.
  • It is a waste to copy a full page.

Synergy with recently developed techniques
• Goal: we want to copy the pointers in each modified object prior to its first modification.
• The write barrier of the Levanoni-Petrank reference counting collector provides exactly this.
• Use a dirty bit per object. Before a pointer is first modified, save the object's pointer values locally.
• This can be done concurrently by a multithreaded program with no synchronization!

The write barrier (simplified)
    Update(Object **slot, Object *new) {
        Object *old = *slot;      /* read the old value first */
        if (!IsDirty(slot)) {     /* first modification since the last collection? */
            log(slot, old);       /* save the old value locally */
            SetDirty(slot);
        }
        *slot = new;
    }
• Observation: if two threads invoke the write barrier in parallel and both log an old value, then both record the same old value. (A thread writes the slot only after setting the dirty flag, so a thread that read a newer value finds the flag set and does not log.)
• The “real” write barrier:
  • works at the object level
  • has an optimistic initial “if”

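An object-level barrier with an optimistic first check might look roughly like the sketch below. It is only an illustration built from the description on these slides: the `Object` and `LogEntry` layouts, the per-thread `ThreadLog`, and the name `update` are assumptions, and the plain (non-atomic) field accesses leave out the memory-ordering details that a real multithreaded implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FIELDS   2   /* assumption: fixed number of pointer fields */
#define LOG_CAPACITY 64  /* assumption: small fixed-size local buffer */

typedef struct Object {
    bool dirty;                          /* one dirty bit per object */
    struct Object *fields[NUM_FIELDS];
} Object;

/* One log record: the object plus the pointer values it held
 * before its first modification. */
typedef struct LogEntry {
    Object *obj;
    Object *old_fields[NUM_FIELDS];
} LogEntry;

/* Thread-local buffer; only its owning thread writes to it. */
typedef struct ThreadLog {
    LogEntry entries[LOG_CAPACITY];
    size_t count;
} ThreadLog;

static void update(ThreadLog *buf, Object *obj, size_t field, Object *new_value) {
    if (!obj->dirty) {                            /* optimistic initial check */
        assert(buf->count < LOG_CAPACITY);
        LogEntry *e = &buf->entries[buf->count];  /* tentative record */
        e->obj = obj;
        for (size_t i = 0; i < NUM_FIELDS; i++)
            e->old_fields[i] = obj->fields[i];
        if (!obj->dirty) {                        /* re-check before committing */
            buf->count++;                         /* commit the record */
            obj->dirty = true;
        }
        /* otherwise another thread logged the object first; the
         * tentative record is simply overwritten by the next one */
    }
    obj->fields[field] = new_value;
}
```

In this single-threaded form the second check is always true; it only matters when several threads race on the same object, in which case every committed record holds the same old values.
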
Concurrent (intermediate) Algorithm
• Stop all threads
• Scan roots (locals)
• Initiate write barrier usage
• Resume threads
• Trace from roots.
  • Whenever a dirty object is discovered, use the buffers to obtain its pointers.
• Stop write barrier usage
• Sweep to reclaim unmarked objects.
• Clear all buffers and dirty bits.
Next goal: stop one thread at a time

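The tracing step can be sketched as follows: a clean object is scanned directly, while a dirty object is scanned through the pointer values saved for it, so the trace sees the heap as it was when the roots were scanned. The layout below, in particular the `log` pointer from a dirty object to its saved values, and the names `snapshot_fields` and `trace_snapshot` are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FIELDS 2   /* assumption: fixed number of pointer fields */

struct Object;

/* Pointer values saved by the write barrier before the first modification. */
typedef struct LogEntry {
    struct Object *old_fields[NUM_FIELDS];
} LogEntry;

typedef struct Object {
    bool marked;
    bool dirty;
    LogEntry *log;                       /* assumption: set when dirty */
    struct Object *fields[NUM_FIELDS];
} Object;

/* Return the pointer values the object had in the snapshot. */
static Object **snapshot_fields(Object *o) {
    if (o->dirty) {
        assert(o->log != NULL);
        return o->log->old_fields;       /* modified: use the saved values */
    }
    return o->fields;                    /* unmodified: read directly */
}

/* Mark everything reachable from root in the snapshot. */
static void trace_snapshot(Object *root, Object **stack, size_t cap) {
    size_t top = 0;
    if (root == NULL || root->marked) return;
    root->marked = true;
    stack[top++] = root;
    while (top > 0) {
        Object *o = stack[--top];
        Object **ptrs = snapshot_fields(o);
        for (size_t i = 0; i < NUM_FIELDS; i++) {
            Object *child = ptrs[i];
            if (child != NULL && !child->marked) {
                assert(top < cap);
                child->marked = true;
                stack[top++] = child;
            }
        }
    }
}
```

In the usage below, `a` pointed to `b` in the snapshot and was later redirected to `c`; the trace still marks `b`, because it follows the saved value rather than the current one.
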
The Sliding Views “Framework”
• Avoid simultaneous halting. Instead, stop one thread at a time.
• View of the heap is a “sliding view”.
• There is a time interval in which all objects are read. (But not one single point in time.)

Danger in Sliding Views
• Interleaving of the sliding view with the program:
  1. Here sliding view reads P2 (NULL).
  2. Program does: P1 → O; P2 → O; P1 → NULL.
  3. Here sliding view reads P1 (NULL).
• Problem: reachability of O not noticed!
• Solution: “snooping”. If a pointer to O is stored while the sliding view is taken, do not reclaim O.

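The interleaving above can be replayed in a small C sketch. It assumes that P1 already points to O when the view starts, and it models snooping as a per-thread list of stored pointers that the collector later treats as additional roots; the `Mutator` struct and the names `store` and `o_survives` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SNOOP_CAPACITY 16  /* assumption: small fixed-size snoop buffer */

typedef struct Object { int id; } Object;

typedef struct Mutator {
    bool snoop;                        /* on while the sliding view is taken */
    Object *snooped[SNOOP_CAPACITY];   /* targets of pointer stores seen meanwhile */
    size_t nsnooped;
} Mutator;

/* Pointer store with snooping only (logging is omitted in this sketch). */
static void store(Mutator *m, Object **slot, Object *new_value) {
    if (m->snoop && new_value != NULL) {
        assert(m->nsnooped < SNOOP_CAPACITY);
        m->snooped[m->nsnooped++] = new_value;
    }
    *slot = new_value;
}

/* Replay the interleaving; return true if the collector learns that
 * O must not be reclaimed. */
static bool o_survives(bool snooping_enabled) {
    Object o = { 1 };
    Object *p1 = &o, *p2 = NULL;       /* assumption: P1 -> O initially */
    Mutator m = { .snoop = snooping_enabled };

    Object *view_p2 = p2;              /* sliding view reads P2: NULL */
    store(&m, &p2, &o);                /* program: P2 -> O */
    store(&m, &p1, NULL);              /* program: P1 -> NULL */
    Object *view_p1 = p1;              /* sliding view reads P1: NULL */

    bool seen_in_view = (view_p1 == &o) || (view_p2 == &o);
    bool seen_by_snoop = false;
    for (size_t i = 0; i < m.nsnooped; i++)
        if (m.snooped[i] == &o) seen_by_snoop = true;
    return seen_in_view || seen_by_snoop;
}
```

With snooping off the view never observes a pointer to O, although O stays reachable through P2; with snooping on, the store into P2 records O.
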
The Sliding Views Algorithm
• Initiate snooping and write barrier usage
• For each thread:
  • Stop thread and scan its roots (locals)
• Stop snooping
• Trace from roots and snooped objects.
  • Whenever a dirty object is discovered, use the buffers to obtain its actual values.
• Stop write barrier usage
• Sweep to reclaim unmarked objects.
• Clear all buffers and dirty bits.

Optimizing the write barrier
• We only need to store:
  1. non-null pointer values of the object,
  2. while tracing is on,
  3. for objects that have not been traced,
  4. the object once.
• Implication of 3: new objects are never stored.
• Slow path of the write barrier is seldom taken (~ 1/300)

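The four conditions can be folded into the barrier's fast-path test, as in the sketch below. Everything here is an assumption made for illustration: the `traced` flag, the `Barrier` struct with its `slow_path_hits` counter, and the idea that a new object is created with `traced` already set (one way to obtain the implication of condition 3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FIELDS   2   /* assumption: fixed number of pointer fields */
#define LOG_CAPACITY 64  /* assumption: small fixed-size local buffer */

typedef struct Object {
    bool dirty;                          /* already stored in this cycle */
    bool traced;                         /* already visited by the tracer */
    struct Object *fields[NUM_FIELDS];
} Object;

typedef struct Barrier {
    bool tracing_on;                     /* set by the collector while tracing */
    Object *saved[LOG_CAPACITY];         /* saved non-null pointer values */
    size_t nsaved;
    size_t slow_path_hits;               /* how often the slow path ran */
} Barrier;

static void update(Barrier *b, Object *obj, size_t field, Object *new_value) {
    /* Conditions 2, 3 and 4: tracing is on, the object has not been
     * traced, and the object has not been stored yet. */
    if (b->tracing_on && !obj->traced && !obj->dirty) {
        b->slow_path_hits++;
        for (size_t i = 0; i < NUM_FIELDS; i++) {
            if (obj->fields[i] != NULL) {        /* condition 1: non-null only */
                assert(b->nsaved < LOG_CAPACITY);
                b->saved[b->nsaved++] = obj->fields[i];
            }
        }
        obj->dirty = true;
    }
    obj->fields[field] = new_value;
}
```

All other updates take only the fast path, which is a few flag tests followed by the store itself.
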
Write Barrier Statistics

Performance Measurements
• Implementation for Java on the Jikes Research JVM
• Compared collectors:
  • Jikes parallel collector (Parallel)
  • Jikes concurrent RC (Jikes concurrent)
• Benchmarks:
  • Server benchmark: SPECjbb2000 (business-like transactions in a large firm)
  • Client benchmarks: SPECjvm98 (mostly single-threaded client benchmarks)

Pause Times vs. Parallel (figure label: Jikes parallel)
Pause Times vs. Jikes Concurrent
SPECjbb2000 Throughput (figure label: Jikes parallel)
SPECjvm98 Throughput (figure label: Jikes parallel)
SPECjbb2000 Throughput
SPECjvm98 Throughput
SPECjbb2000 Throughput

Most Related Collector
• Vast literature on on-the-fly mark & sweep collectors.
• The state-of-the-art collector is by Doligez-Leroy-Gonthier [POPL 93-94]
• Implemented for Java by IBM research: Domani-Kolodner-Petrank [PLDI 2000], Domani et al. [ISMM 2000]
• Our new collector is the only alternative for tracing on-the-fly.

Comparison?
• Figure labels: Parent p, o1, o2
• No available research implementation for Java.
• Some thoughts on locality: a difference in the write barrier on pointer modification:
  • [DLG]: Mark ex-referenced object
  • [This work]: Copy (seldom) parent pointers, check (frequently) parent mark bits.

Related Work
• Snapshot tracing:
  • Demers et al. (1990), Furusou et al. (1991)
• On-the-fly tracing:
  • Dijkstra et al. (1976), Steele (1976), Lamport (1976), Kung & Song (1977), Gries (1977), Ben-Ari (1982, 1984), Huelsbergen et al. (1993, 1998)
  • Doligez-Gonthier-Leroy (1993-4), Domani-Kolodner-Petrank (2000)
• The RC sliding views algorithm:
  • [Levanoni & Petrank: OOPSLA 01]
• Generational extension of sliding views:
  • Azatchi & Petrank [Compiler Construction 2003]

Conclusions
• A new non-intrusive, efficient mark & sweep garbage collector suitable for multiprocessors.
• An implementation on Jikes and measurements on a multiprocessor.
• Low pause times (1ms) and a small throughput penalty (10%).