
Garbage Collection Introduction and Overview

Excerpted from a presentation by Christian Schulte, Programming Systems Lab, Universität des Saarlandes, Germany (schulte@ps.uni-sb.de).


Presentation Transcript


  1. Garbage Collection: Introduction and Overview. Excerpted from a presentation by Christian Schulte, Programming Systems Lab, Universität des Saarlandes, Germany, schulte@ps.uni-sb.de

  2. Garbage Collection… …is concerned with the automatic reclamation of dynamically allocated memory after its last use by a program

  3. Garbage collection… • Dynamically allocated memory • Last use by a program • Examples for automatic reclamation

  4. Kinds of Memory Allocation static int i; void foo(void) { int j; int* p = (int*) malloc(…); }

  5. Static Allocation • By compiler (in the static data area) • Available through entire runtime • Fixed size static int i; void foo(void) { int j; int* p = (int*) malloc(…); }

  6. Automatic Allocation • Upon procedure call (on stack) • Available during execution of call • Fixed size static int i; void foo(void) { int j; int* p = (int*) malloc(…); }

  7. Dynamic Allocation • Dynamically allocated at runtime (on heap) • Available until explicitly deallocated • Dynamically varying size static int i; void foo(void) { int j; int* p = (int*) malloc(…); }

  8. Dynamically Allocated Memory • Also: heap-allocated memory • Allocation: malloc, new, … • before first usage • Deallocation: free, delete, dispose, … • after last usage • Needed for • C++, Java: objects • SML: datatypes, procedures • anything that outlives procedure call
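
The last bullet, "anything that outlives procedure call", can be illustrated with a minimal sketch. The helper name `make_counter` is hypothetical and not part of the slides; it only contrasts an automatic variable with a heap-allocated one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns an int that outlives the call. */
int *make_counter(int start) {
    int local = start;              /* automatic: gone when the call returns */
    int *p = malloc(sizeof *p);     /* dynamic: lives until free(p) */
    if (p != NULL)
        *p = local;                 /* copy the value into heap memory */
    return p;                       /* returning &local instead would dangle */
}
```

The caller owns the result and has to call `free` on it after the last use, which is exactly the obligation garbage collection removes.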

  9. Getting it Wrong • Forget to free (memory leak) • program eventually runs out of memory • long-running programs: OSs, servers, … • Free too early (dangling pointer) • lucky: illegal access detected by OS • horror: memory reused, in simultaneous use • programs can behave arbitrarily • crashes might happen much later • Estimates of effort • Up to 40%! [Rovner, 1985]
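
Both mistakes are easy to make in a few lines of C. The following is a sketch only; the function names `leaky_length` and `careful_length` are made up for illustration. The dangling access is described in a comment rather than executed, since executing it would be undefined behaviour.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Memory leak: the copy is never freed, so each call loses strlen(s)+1 bytes. */
size_t leaky_length(const char *s) {
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, s);
    return strlen(copy);          /* missing free(copy) */
}

/* Corrected version: free exactly once, after the last use. */
size_t careful_length(const char *s) {
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, s);
    size_t len = strlen(copy);    /* last use of copy */
    free(copy);
    /* Reading copy[0] from here on would be a dangling-pointer access. */
    return len;
}
```

A long-running server calling `leaky_length` in a loop would grow without bound, while a read through `copy` after the `free` might appear to work until the block is reused.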

  10. Nodes and Pointers • Node n • Memory block, cell • Pointer p • Link to node • Node access: *p • Children children(n) • set of pointers to nodes referred to by n

  11. Mutator • Abstraction of program • introduces new nodes with pointer • redirects pointers, creating garbage

  12. Shared Nodes • Nodes referred to by several pointers • Makes manual deallocation hard • local decision impossible • must respect other pointers to the node • Cycles are an instance of sharing

  13. Last Use by a Program • Question: When is node M no longer used by the program? • Let P be any program not using M • New program sketch: Execute P; Use M; • Hence: M is used ⇔ P terminates • We are doomed: halting problem! • So “last use” is undecidable!

  14. Safe Approximation • Decidable and also simple • What does safe mean? • only unused nodes are freed • What does approximation mean? • some unused nodes might not be freed • Idea • keep the nodes that can be accessed by the mutator

  15. Reachable Nodes • Reachable from root set • processor registers • static variables • automatic variables (stack) • Reachable from reachable nodes

  16. Summary: Reachable Nodes • A node n is reachable, iff • n is element of the root set, or • n is element of children(m) and m is reachable • Reachable node also called “live”

  17. Mark and Sweep • Compute set of reachable nodes • Free nodes known to be not reachable

  18. Reachability: Safe Approximation • Safe • access to an unreachable node is impossible • depends on language semantics • but C/C++? later… • Approximation • a reachable node might never be accessed • programmer must know about this! • have you been aware of this?

  19. Example Garbage Collectors • Mark-Sweep • Others • Mark-Compact • Reference Counting • Copying • see Chapters 1 and 2 of [Lins&Jones,96]

  20. The Mark-Sweep Collector • Compute reachable nodes: Mark • tracing garbage collector • Free not reachable nodes: Sweep • Run when out of memory: Allocation • First used with LISP [McCarthy, 1960]

  21. Allocation node* new() { if (free_pool is empty) mark_sweep(); …

  22. Allocation node* new() { if (free_pool is empty) mark_sweep(); return allocate(); }
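
The slides leave `free_pool` and `allocate()` abstract. One possible reading, given here as a sketch under assumptions, is a free list threaded through a small fixed heap. `HEAP_SIZE`, `init_heap`, and the `next_free` field are assumptions made for this example, and returning `NULL` stands in for the point where the slide's `new()` would call `mark_sweep()`.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_SIZE 4                 /* assumed: tiny heap for illustration */

typedef struct node {
    struct node *next_free;         /* link used only while the node is free */
    int value;
} node;

static node heap[HEAP_SIZE];        /* contiguous heap of equal-sized nodes */
static node *free_pool = NULL;      /* head of the free list */

/* Put every node of the heap into the free pool. */
void init_heap(void) {
    free_pool = NULL;
    for (int i = HEAP_SIZE - 1; i >= 0; i--) {
        heap[i].next_free = free_pool;
        free_pool = &heap[i];
    }
}

/* Take one node from the pool; NULL means the pool is empty. */
node *allocate(void) {
    node *n = free_pool;
    if (n != NULL)
        free_pool = n->next_free;
    return n;
}
```

With equal-sized nodes, allocation is a constant-time pop from the list, which is why the collector only has to run when the pool is empty.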

  23. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); …

  24. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); /* all live nodes marked */ …

  25. Recursive Marking void mark(node* n) { if (!is_marked(n)) { set_mark(n); … } }

  26. Recursive Marking void mark(node* n) { if (!is_marked(n)) { set_mark(n); … } } /* nodes reachable from n marked */

  27. Recursive Marking void mark(node* n) { if (!is_marked(n)) { set_mark(n); for (m in children(n)) mark(m); } } /* i-th recursion: nodes on path with length i marked */
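
The recursive marking above can be written as compilable C once a node layout is fixed. The layout below (a mark bit plus a fixed array of `MAX_CHILDREN` child pointers, with `NULL` meaning "no child") is an assumption for this sketch, not something the slides specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2              /* assumed fixed fan-out */

typedef struct node {
    bool marked;                            /* mark bit stored in the node */
    struct node *children[MAX_CHILDREN];    /* NULL means no child */
} node;

/* Mark n and everything reachable from it. */
void mark(node *n) {
    if (n == NULL || n->marked)
        return;                     /* already visited: stops on cycles and sharing */
    n->marked = true;
    for (int i = 0; i < MAX_CHILDREN; i++)
        mark(n->children[i]);
}
```

The mark check before recursing is what lets the traversal terminate on cycles and visit shared nodes only once.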

  28. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); sweep(); …

  29. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); sweep(); /* all nodes on heap live */ …

  30. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); sweep(); /* all nodes on heap live and not marked */ …

  31. Eager Sweep void sweep() { node* n = heap_bottom; while (n < heap_top) { … } }

  32. Eager Sweep void sweep() { node* n = heap_bottom; while (n < heap_top) { if (is_marked(n)) clear_mark(n); else free(n); n++; /* pointer arithmetic already advances by sizeof(*n) bytes */ } }
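
A compilable version of the eager sweep, as a sketch under assumptions: the heap is a fixed array of equal-sized nodes, "freeing" a node means linking it into a free pool, and the pool is rebuilt from scratch on every sweep. `HEAP_SIZE`, `next_free`, and `free_count` are names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 4                 /* assumed: tiny contiguous heap */

typedef struct node {
    bool marked;
    struct node *next_free;         /* free-pool link */
} node;

static node heap[HEAP_SIZE];
static node *free_pool = NULL;

/* Visit every node once: unmark live nodes, collect the rest. */
void sweep(void) {
    free_pool = NULL;                           /* rebuild the pool from scratch */
    for (node *n = heap; n < heap + HEAP_SIZE; n++) {
        if (n->marked) {
            n->marked = false;                  /* live: reset for next cycle */
        } else {
            n->next_free = free_pool;           /* garbage: back into the pool */
            free_pool = n;
        }
    }
}

/* Helper for inspection: number of nodes currently in the pool. */
size_t free_count(void) {
    size_t count = 0;
    for (node *n = free_pool; n != NULL; n = n->next_free)
        count++;
    return count;
}
```

The loop touches every node on the heap, live or not, which is why sweep time depends on heap size rather than on the number of live nodes.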

  33. The Garbage Collector void mark_sweep() { for (r in roots) mark(r); sweep(); if (free_pool is empty) abort("Memory exhausted"); }
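
Putting allocation, marking, and sweeping together gives a small but complete collector. This is a sketch under several assumptions that the slides do not make: a fixed array heap, binary nodes, an explicit `roots` array standing in for registers, statics, and the stack, and `new_node` returning `NULL` where the slide calls `abort`. All names except `mark`, `sweep`, and `mark_sweep` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 4
#define MAX_ROOTS 4

typedef struct node {
    bool marked;
    struct node *left, *right;      /* child fields; left doubles as pool link */
} node;

static node heap[HEAP_SIZE];
static node *free_pool = NULL;
static node *roots[MAX_ROOTS];      /* assumed explicit root set */

void mark(node *n) {
    if (n == NULL || n->marked)
        return;
    n->marked = true;
    mark(n->left);
    mark(n->right);
}

void sweep(void) {
    free_pool = NULL;
    for (node *n = heap; n < heap + HEAP_SIZE; n++) {
        if (n->marked) {
            n->marked = false;
        } else {
            n->left = free_pool;    /* unreachable: link into the pool */
            n->right = NULL;
            free_pool = n;
        }
    }
}

void mark_sweep(void) {
    for (int i = 0; i < MAX_ROOTS; i++)
        mark(roots[i]);
    sweep();
}

/* Reset roots and marks, then let sweep place all nodes in the pool. */
void gc_init(void) {
    for (int i = 0; i < MAX_ROOTS; i++)
        roots[i] = NULL;
    for (int i = 0; i < HEAP_SIZE; i++)
        heap[i].marked = false;
    sweep();
}

/* Allocate a node, collecting first if the pool is empty. */
node *new_node(void) {
    if (free_pool == NULL)
        mark_sweep();
    if (free_pool == NULL)
        return NULL;                /* the slide aborts with "Memory exhausted" */
    node *n = free_pool;
    free_pool = n->left;
    n->left = n->right = NULL;
    return n;
}
```

A caller registers its live pointers in `roots` before allocating; any node not reachable from there is reclaimed by the next collection, including nodes that form cycles among themselves.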

  34. Assumptions • Nodes can be marked • Size of nodes known • Heap contiguous • Memory for recursion available • Child fields known!

  35. Assumptions: Realistic • Nodes can be marked • Size of nodes known • Heap contiguous • Memory for recursion available • Child fields known

  36. Assumptions: Conservative • Nodes can be marked • Size of nodes known • Heap contiguous • Memory for recursion available • Child fields known

  37. Mark-Sweep Properties • Covers cycles and sharing • Time depends on • live nodes (mark) • live and garbage nodes (sweep) • Computation must be stopped • non-interruptible stop/start collector • long pause • Nodes remain unchanged (as not moved) • Heap remains fragmented

  38. Software Engineering Issues • Design goal in SE: • decompose systems • in orthogonal components • Clashes with letting each component do its memory management • liveness is global property • leads to “local leaks” • lacking power of modern gc methods

  39. Typical Cost • Early systems (LISP): up to 40% [Steele,75] [Gabriel,85] • source of the “garbage collection is expensive” myth • Well-engineered systems of today: 10% of entire runtime [Wilson, 94]

  40. Areas of Usage • Programming languages and systems • Java, C#, Smalltalk, … • SML, Lisp, Scheme, Prolog, … • Perl, Python, PHP, JavaScript • Modula 3, Microsoft .NET • Extensions • C, C++ (Conservative) • Other systems • Adobe Photoshop • Unix filesystem • Many others in [Wilson, 1996]

  41. Understanding Garbage Collection: Benefits • Programming garbage collection • programming systems • operating systems • Understand systems with garbage collection (e.g. Java) • memory requirements of programs • performance aspects of programs • interfacing with garbage collection (finalization)

  42. References • Garbage Collection. Richard Jones and Rafael Lins, John Wiley & Sons, 1996. • Uniprocessor garbage collection techniques. Paul R. Wilson, ACM Computing Surveys. To appear. • Extended version of IWMM 92, St. Malo.
