This article explores various garbage collection techniques such as uniprocessor, Cheney copying, mark-sweep, tricolor, and incremental tracing. It discusses the advantages, pitfalls, and implementation details of each technique.
Uniprocessor Garbage Collection Techniques Paul R. Wilson
The Two-Phase Abstraction • 1. Detection • 2. Reclamation
Why Garbage Collect at All? • Safety • Memory leaks • Continued use of freed pointers • Simplicity
Why Garbage Collect at All? • Flexibility • Hard coded program limits • Efficiency! • Who is responsible for deletion? • Extraneous copies
Liveness and Garbage • There is a root set which is defined as live. • Anything reachable from a live pointer is also live • Everything else is garbage
The Root Set • Static global and module variables • Local variables • Variables on any activation stack(s) • Everyone else • Anything reachable from a live value
Reference Counting Advantages • Implicitly distributes garbage collection • Real Time guarantees with deferred reclamation • Keep a list of zeroed objects not yet processed • Memory efficiency, can utilize all available memory with no work room
Reference Counting Pitfalls • Conservative: needs a separate GC technique to reclaim cycles • Expensive: pointer reassignment requires: • Increment • Decrement • Zero check • Stack variables are created and destroyed frequently • Can be optimized to some extent
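The cost of a pointer reassignment and the cycle problem can be sketched in a few lines. This is a minimal model, not a real allocator: the names `Obj`, `assign`, `release`, and the `freed` log are hypothetical and exist only for illustration.

```python
class Obj:
    """Hypothetical heap object carrying an explicit reference count."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}

freed = []  # names of reclaimed objects, for inspection only

def release(obj):
    """Decrement, zero-check, and reclaim (recursively) on zero."""
    obj.rc -= 1
    if obj.rc == 0:
        for child in obj.fields.values():
            if child is not None:
                release(child)
        obj.fields.clear()
        freed.append(obj.name)

def assign(holder, field, new):
    """One pointer reassignment: increment the new target, decrement the old."""
    old = holder.fields.get(field)
    if new is not None:
        new.rc += 1
    holder.fields[field] = new
    if old is not None:
        release(old)

root = Obj("root")
root.rc = 1  # assumption: the root is pinned and never released

# Acyclic chain root -> a -> b: dropping the root pointer frees both.
a, b = Obj("a"), Obj("b")
assign(root, "x", a)
assign(a, "next", b)
assign(root, "x", None)

# Cycle root -> c <-> d: dropping the root pointer frees neither.
c, d = Obj("c"), Obj("d")
assign(root, "y", c)
assign(c, "next", d)
assign(d, "next", c)
assign(root, "y", None)
```

After the last call, `c` and `d` still hold a count of 1 each although nothing reachable points to them, which is why a separate tracing technique is needed for cycles.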
Deferred Reference Counting • Defer deletion of zero counted objects • Periodically scan the stack for pointers
Mark-Sweep Collection • Starting From the root set traverse all pointers via depth/breadth first search. • Free everything that is not marked.
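A minimal sketch of the two phases, assuming the heap is modeled as a dictionary from object names to the names they point to (a simplification chosen for illustration; `mark_sweep` is a hypothetical helper):

```python
def mark_sweep(heap, roots):
    """heap: dict mapping an object name to the list of names it points to."""
    # Mark phase: depth-first traversal from the root set.
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap[obj])

    # Sweep phase: visit every object in the heap and free the unmarked ones.
    garbage = [obj for obj in heap if obj not in marked]
    for obj in garbage:
        del heap[obj]
    return marked, set(garbage)

heap = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"], "E": []}
live, dead = mark_sweep(heap, roots=["A"])
```

Note that the unreachable cycle `C <-> D` is reclaimed, and that the sweep loop touches every object in the heap regardless of how much of it is live.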
Non-Copying issues • Same as for traditional allocators • Fragmentation • Memory block size management • Locality of reference- interleaved new/old • General issues- work proportional to heap size
Copying Advantages • Memory locality preserved • Disadvantages • Lots of copying! • “Scavenging”
Stop and Copy • How to update multiple pointers to the same object? • Forwarding pointers • Copying work is proportional to the amount of live data. Assuming this stays roughly constant, increasing memory will increase efficiency.
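A hedged sketch of a Cheney-style copy with forwarding pointers. Assumptions: cells are dictionaries with a `val` and a list of `ptrs` (indices into fromspace, or `None`), and the forwarding pointers live in a side table `forward` rather than being written into the old objects as a real collector would do. `cheney_collect` is a hypothetical name.

```python
def cheney_collect(fromspace, roots):
    """Copy everything reachable from roots into a fresh tospace."""
    tospace = []
    forward = {}  # fromspace index -> tospace index (forwarding pointers)

    def copy(i):
        if i is None:
            return None
        if i not in forward:
            # First visit: copy the cell and leave a forwarding pointer.
            forward[i] = len(tospace)
            cell = fromspace[i]
            tospace.append({"val": cell["val"], "ptrs": list(cell["ptrs"])})
        # Later visits just follow the forwarding pointer.
        return forward[i]

    new_roots = [copy(r) for r in roots]

    # The scan index chases the end of tospace; cells between them are
    # copied but still hold fromspace pointers.
    scan = 0
    while scan < len(tospace):
        cell = tospace[scan]
        cell["ptrs"] = [copy(p) for p in cell["ptrs"]]
        scan += 1
    return tospace, new_roots

fromspace = [
    {"val": "garbage", "ptrs": []},
    {"val": "a", "ptrs": [2, 3]},
    {"val": "b", "ptrs": [3]},
    {"val": "shared", "ptrs": [1]},
]
tospace, new_roots = cheney_collect(fromspace, roots=[1])
```

The cell `shared` is pointed to by both `a` and `b` but is copied once; the second pointer is resolved through the forwarding table. The unreachable cell is never visited, so the work done depends only on the live data.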
Non Copying Version • Facts • Allocated with a color • Fragmentation • Advantages • Does not require pointer rewriting • Supports obscure pointer formats, C friendly
In place collection • Conservative estimates • Useful for languages like C • Pointers can be safely passed to foreign libraries not written with Garbage Collection in mind
Incremental Tracing Collectors • The ‘Mutator’ • The reachability graph may change • From the garbage collector’s point of view the actual application is merely a coroutine or concurrent process with an unfortunate tendency to modify data structures that the collector is trying to traverse • Floating Garbage • Can’t survive more than one extra round
Real Time Garbage Collection • Incremental Tracing Collectors • In Place Collection • Many readers, single writer (mutator) • As a Copying Collector • Multiple Readers, Multiple Writers
Tricolor Marking • White • Initial color for an object subject to collection • Black • Objects that will be retained after the current round • Gray • Object has been reached, but not its descendents • Wave front effect
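The wave front can be sketched as an incremental marker that scans one gray object per step. The class name `TricolorMarker` and the dictionary heap model are assumptions made for illustration only.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

class TricolorMarker:
    def __init__(self, heap, roots):
        self.heap = heap                      # name -> list of child names
        self.color = {o: WHITE for o in heap}
        self.gray = []                        # worklist: the wave front
        for r in roots:
            self.shade(r)

    def shade(self, obj):
        """White -> gray: reached, but descendants not yet scanned."""
        if self.color[obj] == WHITE:
            self.color[obj] = GRAY
            self.gray.append(obj)

    def step(self):
        """Scan one gray object; return False once no gray objects remain."""
        if not self.gray:
            return False
        obj = self.gray.pop()
        for child in self.heap[obj]:
            self.shade(child)
        self.color[obj] = BLACK               # gray -> black: fully scanned
        return True

    def black_points_to_white(self):
        return any(self.color[o] == BLACK and self.color[c] == WHITE
                   for o in self.heap for c in self.heap[o])

heap = {"r": ["a", "b"], "a": ["c"], "b": [], "c": [], "x": ["c"]}
marker = TricolorMarker(heap, roots=["r"])
invariant_held = True
while marker.step():
    invariant_held = invariant_held and not marker.black_points_to_white()
```

As long as the mutator does not interfere, no black object ever points to a white one, so everything still white when the worklist empties (here `x`) can be reclaimed.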
Read Barrier • Detects an attempt to read a white object and immediately colors it gray
Write Barrier • Traps attempts to write a pointer into an object
Some algorithms • Snapshot-at-beginning write barrier • Black-only read barrier • Baker’s read barrier • Dijkstra’s write Barrier • Steele’s write Barrier
Baker’s Read Barrier • Allocates Black • Grey Objects cannot be reverted to white • Immediately Invalidates fromspace • Any pointer access to fromspace causes the GC to grey the target object by copying it to tospace if necessary and updating the pointer.
Baker’s Non Copying Scheme • Real Time Friendly
Black Only Read Barrier • When a white object in fromspace is touched it is scanned completely.
Replication Copying Collection • Until copying from fromspace to tospace is completed, the mutator continues to read from fromspace. • Write updates must be trapped to update tospace. • Single simultaneous ‘flip’ where all pointers are updated. • Expensive for standard hardware, but cheap for functional languages
Real time considerations • Read barriers add an unpredictable cost per pointer access • Nilsen: background scavenger, reserve only • Write barrier may be more expensive overall, but the cost per access is well bounded • Guaranteeing progress: allocation clock, frees per allocation • Statically allocate troublesome objects
Results • Write barrier more efficient on standard hardware
Snapshot at the Beginning • Catches pointers which try to escape from white objects • If a pointer is replaced in a black object, the replaced pointer is first stored. All overwritten pointers are saved via a write barrier. • All objects that are live at the beginning of collection remain live • Allocate black during collection round • Incremental Update • Reverts black to gray when an object is written to, or else grays the newly pointed-to object
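A small sketch of the snapshot-at-beginning barrier, which grays the pointer being overwritten before the store happens. The heap model (named objects with fixed slots) and the helpers `shade`, `snapshot_write`, `finish_marking`, and `demo` are hypothetical.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def shade(color, gray, obj):
    if obj is not None and color[obj] == WHITE:
        color[obj] = GRAY
        gray.append(obj)

def snapshot_write(heap, color, gray, src, slot, new):
    """Write barrier: save (gray) the overwritten pointer, then store."""
    shade(color, gray, heap[src][slot])
    heap[src][slot] = new

def finish_marking(heap, color, gray):
    while gray:
        obj = gray.pop()
        for child in heap[obj]:
            shade(color, gray, child)
        color[obj] = BLACK

def demo(keep_reference):
    # Mid-collection state: r is already scanned, a is on the wave front,
    # and w is reachable only through a.
    heap = {"r": ["a", None], "a": ["w"], "w": [None]}
    color = {"r": BLACK, "a": GRAY, "w": WHITE}
    gray = ["a"]

    moved = heap["a"][0]                               # mutator loads w
    snapshot_write(heap, color, gray, "a", 0, None)    # a drops w
    if keep_reference:
        snapshot_write(heap, color, gray, "r", 1, moved)  # hide w in black r
    finish_marking(heap, color, gray)
    return color

kept = demo(keep_reference=True)
dropped = demo(keep_reference=False)
```

In both runs `w` ends up black. When the mutator keeps the reference this is required for correctness; when it drops it, `w` is floating garbage that survives until the next round.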
Incremental Update with Write Barrier (Dijkstra) • Catches pointers that try to hide in black objects • Reverts black to gray • If the overwritten pointer is not pointed to elsewhere then it is garbage • Allocated white. Newly allocated objects assumed unreachable
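The problem an incremental-update barrier solves, and the two repairs mentioned in these slides (revert the black source to gray, or gray the newly pointed-to object), can be sketched as follows. The function `run` and the barrier labels `"revert_source"` and `"shade_target"` are hypothetical names for this illustration, not terms from the original paper.

```python
WHITE, GRAY, BLACK = "white", "gray", "black"

def finish_marking(heap, color, gray):
    while gray:
        obj = gray.pop()
        for child in heap[obj]:
            if color[child] == WHITE:
                color[child] = GRAY
                gray.append(child)
        color[obj] = BLACK

def run(barrier):
    # Mid-collection state: r already scanned (black), a on the wave front
    # (gray), w not yet reached (white) and reachable only through a.
    heap = {"r": ["a"], "a": ["w"], "w": []}
    color = {"r": BLACK, "a": GRAY, "w": WHITE}
    gray = ["a"]

    # Mutator stores a pointer to w into the black object r ...
    heap["r"].append("w")
    if barrier == "shade_target" and color["w"] == WHITE:
        color["w"] = GRAY
        gray.append("w")
    elif barrier == "revert_source" and color["r"] == BLACK:
        color["r"] = GRAY
        gray.append("r")
    # ... and then erases the only pointer the collector would have followed.
    heap["a"].remove("w")

    finish_marking(heap, color, gray)
    return color

no_barrier = run(None)
shaded = run("shade_target")
reverted = run("revert_source")
```

Without a barrier, `w` stays white even though `r` points to it, so a live object would be freed. Either barrier variant leaves `w` black at the end of marking.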
Motivation for a new Strategy • Most objects are short lived • 80% to 90% die within a few million instructions • Objects that don’t die quickly are more likely to live a while • Long lived objects are copied over and over • Excessive Paging in Scanning if the heap must exceed available physical memory
Variations of generational collection • Intergenerational references • Write barrier • Old to younger • Young to old • Collection • Advancement policies • Advance always • Advance after 2 rounds • Counter in the header field? • Advance always? • Semispace in the last generation • 3 spaces • Bucket brigade • Mark-compact in the oldest generation for memory efficiency
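A minimal sketch of one point in this design space: two generations, a write barrier that records old-to-young pointers in a remembered set, and the "advance always" policy. The class `GenHeap` and its methods are hypothetical and chosen only to illustrate the idea.

```python
class GenHeap:
    def __init__(self):
        self.young = {}          # name -> list of names it points to
        self.old = {}
        self.remembered = set()  # old objects that may point into young

    def alloc(self, name):
        self.young[name] = []

    def write(self, src, dst):
        """Pointer store with a write barrier for old-to-young references."""
        fields = self.old[src] if src in self.old else self.young[src]
        fields.append(dst)
        if src in self.old and dst in self.young:
            self.remembered.add(src)

    def minor_collect(self, roots):
        """Trace only the young generation, then advance every survivor."""
        stack = [r for r in roots if r in self.young]
        for o in self.remembered:
            stack.extend(c for c in self.old[o] if c in self.young)
        live = set()
        while stack:
            obj = stack.pop()
            if obj in live:
                continue
            live.add(obj)
            stack.extend(c for c in self.young[obj] if c in self.young)
        for obj in live:
            self.old[obj] = self.young[obj]   # "advance always"
        self.young = {}
        self.remembered.clear()  # no young objects remain to be pointed at
        return live

h = GenHeap()
h.alloc("a")
h.minor_collect(roots=["a"])   # a survives and is advanced to the old generation
for name in ("b", "c", "d"):
    h.alloc(name)
h.write("a", "b")              # old -> young: recorded by the barrier
h.write("b", "c")
survivors = h.minor_collect(roots=["a"])
```

The second collection never scans the old generation: `b` is found only because the barrier remembered `a`, and `d` is reclaimed without tracing any old object.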