
Conservative Garbage Collection

Stephan Lesch, January 9, 2002, slesch@studcs.uni-sb.de


Presentation Transcript


  1. Conservative Garbage Collection • Stephan Lesch • January 9, 2002 • slesch@studcs.uni-sb.de

  2. Contents • Intro • Conservative GC • Mostly Copying Collection • Hidden Pointer Problems • GC for C++

  3. So Far • Type-accurate GC: • locations of pointers are known • no pointer arithmetic • often tailored to one software product • usually supported by compiler/runtime system

  4. Ambiguous Roots Collection • every register/word is a potential pointer • non-supportive environment: little/no knowledge about register usage or object/stack layout • should work with any C/C++ program • programmers don't want to pay for GC unless needed • must coexist with explicit memory management • The middle way: programmer/compiler provide information to recognize pointers

  5. Conservative GC Boehm/Demers/Weiser (Xerox PARC) [1988] • non-moving mark-and-deferred-sweep collector • fully conservative, no reliance on compiler no extra bits to distinguish pointer/non-pointer no additional object headers • for C and C++ • for Unix, OS/2, Mac, Win95/NT • supports incremental/generational collection • can function as space leak detector

  6. Heap Layout Two logically distinct heaps: Standard heap • malloc / free • compatible with existing code • no pointers to collected heap! Collected heap • GC_malloc • GC_free to free known garbage • pointers to standard heap ignored

  7. Layout of Collected Heap • made up of blocks (e.g. 4 K, aligned to 4 K boundaries) • one object size per block • for each object size: • bitmap to mark allocated objects • freelist (linked list of heap block slots) • reclaimable blocks queue (deferred sweep) • heap-block free-list

  8. Finding headers & bit maps

  9. Allocation • for objects > 1/2 block: allocate chunk of blocks (heap-block free-list) • none available → GC • not enough space reclaimed → expand heap • for small objects: pop free-list for this size • free-list is empty → resume sweep phase • still empty → GC • not enough space reclaimed → expand heap • Clear object after allocation!
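
The small-object path on this slide can be sketched in C as follows. This is a toy illustration under stated assumptions, not the BDW collector's code: there is a single size class, the deferred sweep is modelled by one pending block, and the names `Slot`, `resume_sweep`, `alloc_small` and `gc_requests` are hypothetical. Where a real collector would run a collection and possibly expand the heap, the sketch only counts the request and returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SLOT_BYTES = 32, NSLOTS = 4 };

/* A free slot stores the free-list link in its own first word. */
typedef union Slot {
    union Slot   *next;
    unsigned char bytes[SLOT_BYTES];
} Slot;

static Slot  slots[NSLOTS];      /* one heap block holding one object size */
static Slot *free_list;          /* free-list for this size class */
static int   sweep_pending = 1;  /* 1 while the deferred sweep has a block left */
static int   gc_requests;        /* hypothetical stand-in for "GC, then expand heap" */

/* Sweep the pending block: thread all of its slots onto the free-list. */
static void resume_sweep(void)
{
    if (!sweep_pending) return;
    sweep_pending = 0;
    for (int i = 0; i < NSLOTS; i++) {
        slots[i].next = free_list;
        free_list = &slots[i];
    }
}

static void *alloc_small(void)
{
    if (free_list == NULL) resume_sweep();   /* free-list empty: resume sweep */
    if (free_list == NULL) {                 /* still empty: a real collector would GC */
        gc_requests++;
        return NULL;
    }
    Slot *s = free_list;                     /* pop free-list for this size */
    free_list = s->next;
    memset(s, 0, sizeof *s);                 /* clear object after allocation */
    return s;
}
```

Clearing the slot also erases the stale free-list link, which would otherwise look like a pointer to the next scan.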

  10. Finding Roots & Pointers • possible roots: registers, stack, static areas • no cooperation from compiler • treat every word as potential pointer • ignore interior pointers (standard) • prefer marking from false pointers over ignoring valid pointers Conservative Pointer Identification: given word p; • does p refer to the collected heap? • does it point into heap block allocated by collector? • does it point to the beginning of an object in that block? if yes, • mark object in block header • push object onto mark stack finally: reset mark bits of objects on free-lists
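
The three validity tests above can be sketched in C over a toy heap. Everything here is an assumption made for illustration (the `BlockHeader` layout, byte-per-object mark arrays instead of bitmaps, the helper `toy_place`); it is not the BDW collector's actual data structure or API, and a real collector would also push the marked object onto the mark stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u
#define NBLOCKS    4u
#define MAX_OBJS   (BLOCK_SIZE / 16u)   /* smallest object size in this toy: 16 bytes */

typedef struct {
    size_t  obj_size;             /* 0 means the block is not in use */
    uint8_t allocated[MAX_OBJS];  /* 1 if the slot holds an allocated object */
    uint8_t marked[MAX_OBJS];     /* mark bits kept in the block header */
} BlockHeader;

static unsigned char heap[NBLOCKS * BLOCK_SIZE];
static BlockHeader   headers[NBLOCKS];

/* Hypothetical helper: pretend an object of obj_size bytes lives in this slot. */
static void *toy_place(size_t block, size_t obj_size, size_t slot)
{
    assert(block < NBLOCKS && obj_size >= 16 && (slot + 1) * obj_size <= BLOCK_SIZE);
    headers[block].obj_size = obj_size;
    headers[block].allocated[slot] = 1;
    return heap + block * BLOCK_SIZE + slot * obj_size;
}

/* Returns 1 and sets the mark bit if the word passes all three tests. */
static int mark_if_valid(uintptr_t word)
{
    uintptr_t base = (uintptr_t)heap;

    /* Test 1: does the word refer to the collected heap at all? */
    if (word < base || word >= base + sizeof heap) return 0;

    /* Test 2: does it point into a block handed out by the collector? */
    size_t offset = (size_t)(word - base);
    BlockHeader *h = &headers[offset / BLOCK_SIZE];
    if (h->obj_size == 0) return 0;

    /* Test 3: does it point to the beginning of an object in that block? */
    size_t in_block = offset % BLOCK_SIZE;
    if (in_block % h->obj_size != 0) return 0;   /* interior pointer: ignored */
    size_t slot = in_block / h->obj_size;
    if (!h->allocated[slot]) return 0;

    h->marked[slot] = 1;
    return 1;
}
```

An integer that happens to equal an object's start address passes all three tests, which is exactly the misidentification case the collector accepts in order never to miss a valid pointer.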

  11. Misidentification • integers accidentally fulfilling validity tests • avoid need to trace from interior pointers... • ... or unaligned pointers: 000000090000000A • avoid addresses with lots of trailing 0’s • try to avoid generating false references: • collector clears non-atomic objects after alloc • GC_malloc_atomic for objects without pointers • programmer should initialize structures • programmer should destroy obsolete pointers (“dead pointers on stack are often the most significant source of leaks”)

  12. Black Listing Idea: don’t allocate in heap blocks at addresses likely to collide with invalid pointers: • black list references to vicinity of heap which fail validity tests • extra run before first allocation finds false references in static data • additional space overhead < 10% • but: difficult to allocate >100K without spanning black-listed blocks
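
A minimal C sketch of the black-listing idea, under assumptions that are not from the slides: the heap may grow into a fixed range of `NBLOCKS` blocks starting at a made-up address `heap_base`, and the names `note_false_reference` and `pick_block` are hypothetical. The scanner reports every word that failed the validity tests; the allocator then avoids blocks such a word points into.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u
#define NBLOCKS    8u                        /* blocks the heap may grow into */

static uintptr_t heap_base = 0x100000u;      /* assumed start of the reserved range */
static uint8_t   in_use[NBLOCKS];
static uint8_t   black_listed[NBLOCKS];

/* Called by the scanner for a word that failed the validity tests. */
static void note_false_reference(uintptr_t word)
{
    if (word < heap_base || word >= heap_base + NBLOCKS * BLOCK_SIZE)
        return;                              /* not in the vicinity of the heap */
    black_listed[(word - heap_base) / BLOCK_SIZE] = 1;
}

/* Returns the index of the first free block that is not black-listed, or -1. */
static int pick_block(void)
{
    for (size_t i = 0; i < NBLOCKS; i++) {
        if (!in_use[i] && !black_listed[i]) {
            in_use[i] = 1;
            return (int)i;
        }
    }
    return -1;
}
```

The sketch also makes the drawback on the slide visible: a large object needs a run of consecutive blocks, and each black-listed block breaks such runs.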

  13. Influence of Data Structures Problems with: large structures + interior pointers strongly connected structures Lisp: • small disjoint garbage structures • lists constructed of cons-cells => Conservative GC worked well, memory leaks remain bounded (<8% leakage, constant amount) KRC: • large, strongly connected structures • next pointers in objects => collector thrashed [Wentworth, 1990]

  14. Efficiency (1) Comparative studies by Zorn, 1992; Detlefs et al. 1994 • "real-world" C programs (perl, xfig, GhostScript) • comparing BDW with explicit managers • replace malloc() with GC_malloc(), remove free() • no further adaptation • used outdated versions (4.3 vs. 1.6/2.6)

  15. Efficiency (2) • realistic alternative to explicit memory management (20% avg execution time overhead over best managers, up to 57% in worst case) • marks 3 MB/s on SparcStation II • up to 3 times heap usage for small heaps (fixed cost for collector’s internal structs) • needs substantially more space to avoid over-frequent GC • works best with programs using very small objects • might co-exist poorly with cache management (heap blocks aligned on 4K boundaries)

  16. Incremental/Generational Mode • marking in small steps interleaved with mutator • need to detect later changes to connectivity in traced parts of graph: • read dirty bits for pages • write-protect memory and catch faults • when mark stack is empty: trace from all marked objects on dirty heap blocks • reduces avg. pause times, increases total exec time • generational: GC uses knowledge of which pages were recently modified

  17. Mostly Copying Collection • Joel Bartlett, 1988 (Digital) • hybrid conservative / copying collector: • roots are treated conservatively (don’t move referenced objects) • objects only accessible from heap-allocated objects are copied (assumes pointers in heap-allocated data can be found accurately) • faster allocation • fewer problems with pointer identification • more accurate GC

  18. Object layout • programmer has no control over object layout • what if object layout should match hardware registers or file structures? • [Figure: object layout; labels: header, size, #pointers, pointers, user data, non-pointers]

  19. Heap layout • blocks with space identifiers • [Figure: root pointing into the heap; current_space = 1, next_space = 1; blocks with space ids 1, 0, 1, 42; the blocks with ids 0 and 42 are currently unused]

  20. Allocation • within a block: • inc free-pointer • dec free-slots-count • if necessary: search for free block (space_id ≠ current_space/next_space), set its space_id to next_space • current_space = next_space during allocation
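
The allocation steps above can be sketched in C. This is a toy under stated assumptions, not Bartlett's code: the heap is a fixed array of four blocks measured in words, and the names `mc_alloc`, `heap_words`, `free_ptr` and `free_slots` are hypothetical. When no free block is left the sketch returns NULL, where a real collector would start a collection.

```c
#include <assert.h>
#include <stddef.h>

#define NBLOCKS       4u
#define WORDS_PER_BLK 128u

static unsigned space_id[NBLOCKS];                  /* space identifier per block */
static size_t   heap_words[NBLOCKS][WORDS_PER_BLK];
static unsigned current_space = 1, next_space = 1;  /* equal while the mutator runs */
static size_t  *free_ptr;                           /* next free word in the open block */
static size_t   free_slots;                         /* words left in the open block */

static size_t *mc_alloc(size_t nwords)
{
    assert(nwords > 0 && nwords <= WORDS_PER_BLK);
    if (nwords > free_slots) {
        /* Search for a free block: one that belongs to neither space. */
        size_t b = 0;
        while (b < NBLOCKS &&
               (space_id[b] == current_space || space_id[b] == next_space))
            b++;
        if (b == NBLOCKS) return NULL;   /* a real collector would GC here */
        space_id[b] = next_space;
        free_ptr    = heap_words[b];
        free_slots  = WORDS_PER_BLK;
    }
    size_t *obj = free_ptr;
    free_ptr   += nwords;                /* inc free-pointer */
    free_slots -= nwords;                /* dec free-slots-count */
    return obj;
}
```

Because allocation is a pointer bump within a block, it avoids the per-size free-list handling of a non-moving collector, which is the "faster allocation" benefit claimed for Mostly Copying.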

  21. Collection • GC when heap is half full (half of heap blocks have space_id=current_space) • next_space = current_space +1 mod n • Fromspace = current_space blocks • Tospace = next_space blocks • scan roots conservatively for pointers into heap • move potentially referred objects to Tospace: • changing space_id of their blocks to next_space • add block to Tospace scan list • copy graphs accessible from blocks on scan list

  22. Heap after Collection • [Figure: root pointing into the heap; current_space = 2, next_space = 2; blocks with space ids 2, 2, 1, 42; the blocks with ids 1 and 42 are currently unused]

  23. Bartlett's GC algorithm (1)
      gc() =
        next_space = (current_space + 1) mod 077777
        Tospace_queue = empty
        for R in Roots
          promote(block(R))
        while Tospace_queue != empty
          blk = pop(Tospace_queue)
          for obj in blk
            for S in Children(obj)
              S = copy(S)
        current_space = next_space

  24. Bartlett's GC algorithm (2)
      promote(block) =
        if Heap_bottom ≤ block ≤ Heap_top and space(block) == current_space
          space(block) = next_space
          allocatedBlocks = allocatedBlocks + 1
          push(block, Tospace_queue)
      copy(p) =
        if space(p) == next_space or p == nil
          return p
        if forwarded(p)
          return forwarding_address(p)
        np = move(p, free)
        free = free + size(p)
        forwarding_address(p) = np
        return np
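
The promote/copy pseudocode can be turned into a small runnable C sketch. It rests on several assumptions that are not from the slides: a fixed pool of four blocks holding four equal-sized objects each, objects with exactly two accurately known child pointers, no heap expansion, and a per-block scan index (`scanned`) used in place of the Tospace queue. All names other than `promote`, `copy`, `current_space` and `next_space` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBLOCKS 4
#define PER_BLK 4

typedef struct Obj {
    struct Obj *forward;    /* forwarding address, NULL if not moved */
    struct Obj *child[2];   /* heap pointers, assumed to be found accurately */
    int value;
} Obj;

static Obj pool[NBLOCKS * PER_BLK];
static int space[NBLOCKS];                   /* space id per block, 0 = never used */
static int used[NBLOCKS], scanned[NBLOCKS];  /* objects allocated / already scanned */
static int current_space = 1, next_space = 1;
static int copy_block = -1;                  /* Tospace block receiving copies */

static int block_of(const Obj *p) { return (int)((p - pool) / PER_BLK); }

/* Hypothetical mutator helper: allocate an object in block b. */
static Obj *new_obj(int b, int value)
{
    assert(b >= 0 && b < NBLOCKS && used[b] < PER_BLK);
    space[b] = current_space;
    Obj *o = &pool[b * PER_BLK + used[b]++];
    o->forward = NULL; o->child[0] = o->child[1] = NULL; o->value = value;
    return o;
}

/* Ambiguous root: if it points into a Fromspace block, keep the whole block in place. */
static void promote(uintptr_t word)
{
    uintptr_t lo = (uintptr_t)pool, hi = (uintptr_t)(pool + NBLOCKS * PER_BLK);
    if (word < lo || word >= hi) return;
    int b = (int)((word - lo) / (sizeof(Obj) * PER_BLK));
    if (space[b] != current_space) return;
    space[b] = next_space;
    scanned[b] = 0;
}

static Obj *copy(Obj *p)
{
    if (p == NULL || space[block_of(p)] == next_space) return p;
    if (p->forward != NULL) return p->forward;
    if (copy_block < 0 || used[copy_block] == PER_BLK) {   /* open a free block */
        int b = 0;
        while (b < NBLOCKS && (space[b] == current_space || space[b] == next_space)) b++;
        assert(b < NBLOCKS);                                /* toy: no heap expansion */
        space[b] = next_space; used[b] = 0; scanned[b] = 0;
        copy_block = b;
    }
    Obj *np = &pool[copy_block * PER_BLK + used[copy_block]++];
    *np = *p;                 /* move the object */
    p->forward = np;          /* leave the forwarding address behind */
    return np;
}

static void gc(const uintptr_t *roots, size_t nroots)
{
    next_space = (current_space + 1) % 077777;
    copy_block = -1;
    for (size_t i = 0; i < nroots; i++) promote(roots[i]);
    for (int progress = 1; progress; ) {       /* scan Tospace until nothing is new */
        progress = 0;
        for (int b = 0; b < NBLOCKS; b++)
            while (space[b] == next_space && scanned[b] < used[b]) {
                Obj *o = &pool[b * PER_BLK + scanned[b]++];
                o->child[0] = copy(o->child[0]);
                o->child[1] = copy(o->child[1]);
                progress = 1;
            }
    }
    current_space = next_space;
}
```

Objects referenced from an ambiguous root stay at their address because only their block's space id changes; objects reached through heap pointers are moved, and a second reference to the same object resolves through the forwarding address.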

  25. Generational Mode (1) • One bit in space_id indicates young/old generation • Other bits approximate age of objects/blocks • Minor collection: • when 50% of free space after last GC is full • young objects reachable from roots/remembered set are promoted en masse (change space_id/copy) • remembered set: maintained via memory protection

  26. Generational Mode (2) • Major collection (mark-compact): • when old generation occupies >85% of heap • mark accessible objects in old generation • pass 1: find old generation blocks <1/3 filled, copy objects to free space leaving forwarding addresses • pass 2: rescan old generation, correct pointers using forwarding addresses • expand heap if >75% full • maintaining remembered set costs time, but often saves more time during GC (20% time improvement on Scheme compiler) • also reduces pause times in interactive programs

  27. Efficiency (1) • no thorough studies • space overhead: space_ids, type info, block links, promotion bits: 2% for 512-byte blocks; tagging data increases overhead • Mostly Copying vs. BDW: Mostly Copying probably better with many short-lived objects, benefits from faster allocation

  28. Experiences • generational version: 20% runtime improvement for Scheme-to-C compiler • significant performance increase in CAD program (reduced paging) • bad results for non-generational collector for Modula-2 w. very large heaps (10s of Megabytes) • choose GC strategy that fits behaviour of mutator

  29. The optimising Compiler/User Devil • conservative GC defeated by temporarily hidden pointers - parts of graph may appear unreachable during a GC: • pointer arithmetic • adding tag bits • e.g. optimized array traversal:
      for (i=0; i<SIZE; i++) ...x[i]...;
      ...x...;
      may be compiled as
      xend = x+SIZE;
      for (; x<xend; x++) ...*x...;
      x -= SIZE;
      ...x...;
      inside loop x is interior pointer, afterwards x points one past the end
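
The tag-bit case can be illustrated with a short C sketch. It assumes a deliberately strict scanner that accepts only words equal to an object's start address (the names `conservatively_reachable`, `add_tag` and `strip_tag` are hypothetical), and it assumes the object is at least 2-byte aligned so that the low bit is free for the tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any scanned word equals the object's start address. */
static int conservatively_reachable(const uintptr_t *area, size_t n, const void *obj)
{
    for (size_t i = 0; i < n; i++)
        if (area[i] == (uintptr_t)obj) return 1;
    return 0;
}

/* The program hides a flag in the low bit of the pointer... */
static uintptr_t add_tag(const void *p) { return (uintptr_t)p | 1u; }

/* ...and removes it again before dereferencing. */
static void *strip_tag(uintptr_t w) { return (void *)(w & ~(uintptr_t)1); }
```

If the only reference to an object is stored in tagged form, such a scanner never sees the start address and may reclaim an object the program can still reach through `strip_tag`. The same happens when the only surviving value is `x + SIZE` from the loop above.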

  30. Machine-specific Optimizations
      struct l_thing { char thing[35000]; struct l_thing *next; };
      struct l_thing *tail(struct l_thing *x) { return (x->next); }
      on IBM RISC System/6000, tail() translates to
      AIU r3=r3,1               ; r3 += 65536
      L   r3=SHADOW(r3,-30536)  ; = original r3 + 35000
      BA  lr

  31. Boehm and Chase’s Solution (1) • local root set of function f at any point in execution: • register/auto variables • previously computed values of direct sub-expressions of incompletely evaluated expressions, e.g. malloc's return value in malloc(size) + 4 • global root set: • declared static and extern variables • local root sets of all call sites in call chain • any values stored in other areas scanned by collector • valid base pointer: • pointer to anywhere inside an object or one past its end • BDW can handle such pointers

  32. Boehm and Chase’s Solution (2) • every object on garbage collected heap must be accessible from global root set through chain of base pointers • conservative collection safe with strictly ANSI-compatible programs • suggested implementation: • preprocess source using macros that prevent code generator from discarding live base pointers prematurely • compile normally • post-process assembly code, removing macro artifacts • transparent to programmer & compiler • may interfere with instruction scheduling • may increase register pressure

  33. Ellis and Detlefs’s solution • annotate operations on pointers with names of base pointers from which they’re derived • compiler treats these operations as uses of the original base pointers, extending their live ranges • code generation must respect live ranges • requires changes to compiler • does not alter sources • does not rely on behaviour of volatile declarations

  34. GC for C++ • object-oriented languages often use more heap-allocated data • generate more complex data structures • GC uncouples memory management from class interfaces instead of dispersing it through code

  35. Conservative GC for C++ • requires no changes to language • restriction on coding style holds: no hidden pointers (converted to int) • existing code may violate the restriction • aggressive optimisers may as well • safety must be enforced in code-generator • some support for finalization (GC_register_finalizer) - assuming few objects need finalization

  36. Mostly Copying for C++ • storing all pointers at beginning of objects interferes with inheritance (fast field lookup) • here: user supplies callback methods to identify pointers:
      class Tree {
      public:
        Tree* left;
        Tree* right;
        int data;
        Tree (int x);
        GCCLASS(Tree);
        ...
      };
      GCPOINTERS(Tree) {
        gcpointer(left);
        gcpointer(right);
      }
      • GCPOINTERS macro generates callback method Tree::GCPointers • currently no support for finalisation

  37. Benefits of pointer locating methods • programmer may solve unsure reference problem: union { int n; thing *ptr; } x; • enables semantically accurate marking, e.g. stacks, queues • automatic GC retains uncleared references to removed elements • programmer can omit them: even better than type-accurate GC

  38. Using Object Descriptors • Detlefs, 1991: extension to Mostly Copying • insert descriptor into object headers • Bitmap format: • 1 word with 32 bits indicating pointer/non-pointer words • use if only first 32 words of user data contain pointers, can’t handle unsure references • Indirect format: • pointer to byte array encoding sure/unsure references and non-pointer values • array can be compressed using repeat counts • Fast indirect format: • array of ints; 1st number indicates repetitions of rest • subsequent numbers = number of words to skip to reach next pointer, negative number indicates unsure reference
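
The bitmap format can be sketched in C as a decoder that lists which words of an object hold pointers. The bit order is an assumption (bit i of the descriptor describes word i of the user data), and the function name `pointer_slots` is hypothetical; the slides do not specify either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the indices of pointer words into out and returns how many there are.
   Only the first 32 words can be described by a one-word bitmap. */
static size_t pointer_slots(uint32_t bitmap, size_t nwords, size_t out[32])
{
    size_t count = 0;
    for (size_t i = 0; i < nwords && i < 32; i++) {
        if (bitmap & (UINT32_C(1) << i))
            out[count++] = i;      /* word i holds a pointer */
    }
    return count;
}
```

Under this bit order, an object laid out as two pointer words followed by one integer word, like the `left`, `right` and `data` fields of the Tree class on slide 36, would carry the descriptor 0x3.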

  39. Conclusion • GC effective for traditional imperative languages • realistic alternative to explicit memory management for most applications • not yet suitable for real-time / safety-critical applications • no big constraints on coding style, except hidden pointer problem • gc’ing allocators competitive even with code not written for GC • GC should have hooks for client/programmer to communicate their knowledge: • explicit deallocation calls • atomic objects • hints of appropriate times to collect
