Garbage Collection: Introduction and Overview
Christian Schulte
Programming Systems Lab, Universität des Saarlandes, Germany
schulte@ps.uni-sb.de
Purpose of Talk
• Explaining basic
  • concepts
  • terminology (never to be explained again)
• Garbage collection…
  • …is simple
  • …can be explained at a high level
• Organization
Overview
• What is garbage collection
  • objects of interest
  • principal notions
  • classic examples with assumptions and properties
• Discussion
  • software engineering issues
  • typical cost
  • areas of usage
  • why knowledge is profitable
• Organizational
  • Material
  • Requirements
Garbage Collection…
…is concerned with the automatic reclamation of dynamically allocated memory after its last use by a program
• dynamically allocated memory
• last use by a program
• automatic reclamation
Garbage collection…
• Dynamically allocated memory
• Last use by a program
• Examples for automatic reclamation
Kinds of Memory Allocation
    static int i;                     /* static allocation */
    void foo(void) {
      int j;                          /* automatic allocation */
      int* p = (int*) malloc(…);      /* p points to dynamically allocated memory */
    }
Static Allocation
• By compiler (in the static data area)
• Available through entire runtime
• Fixed size
Automatic Allocation
• Upon procedure call (on stack)
• Available during execution of call
• Fixed size
Dynamic Allocation
• Dynamically allocated at runtime (on heap)
• Available until explicitly deallocated
• Dynamically varying size
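The three kinds of allocation can be seen side by side in a small, self-contained C sketch. The names `calls`, `make_cell`, and `demo_allocation` are made up for this illustration and do not come from the slides.

```c
#include <assert.h>
#include <stdlib.h>

static int calls;                     /* static: one instance for the whole run */

/* Returns a heap cell holding the call number, or NULL if malloc fails.
   The caller must free() the result. */
int *make_cell(void) {
    int number = ++calls;             /* automatic: gone when make_cell returns */
    int *cell = malloc(sizeof *cell); /* dynamic: lives until free() */
    if (cell != NULL)
        *cell = number;
    return cell;                      /* returning &number instead would dangle */
}

/* Hypothetical demo: yields 1 + 2 = 3 on its first run, -1 if allocation fails. */
int demo_allocation(void) {
    int *a = make_cell();
    int *b = make_cell();
    int sum = -1;
    if (a != NULL && b != NULL) {
        assert(*b == *a + 1);         /* the static counter persisted across calls */
        sum = *a + *b;
    }
    free(a);                          /* free(NULL) is a no-op */
    free(b);
    return sum;
}
```

Only the heap cells outlive the call that created them, which is why they need an explicit `free` (or a garbage collector).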
Dynamically Allocated Memory
• Also: heap-allocated memory
• Allocation: malloc, new, …
  • before first usage
• Deallocation: free, delete, dispose, …
  • after last usage
• Needed for
  • C++, Java: objects
  • SML: datatypes, procedures
  • anything that outlives a procedure call
Getting it Wrong
• Forget to free (memory leak)
  • program eventually runs out of memory
  • long-running programs: OSs, servers, …
• Free too early (dangling pointer)
  • lucky: illegal access detected by OS
  • horror: memory reused, in simultaneous use
  • programs can behave arbitrarily
  • crashes might happen much later
• Estimates of effort
  • Up to 40%! [Rovner, 1985]
Nodes and Pointers
• Node n
  • Memory block, cell
• Pointer p
  • Link to node
  • Node access: *p
• Children children(n)
  • set of pointers to nodes referred to by n
Mutator
• Abstraction of program
  • introduces new nodes with pointers
  • redirects pointers, creating garbage
Shared Nodes
• Nodes referred to by several pointers
• Makes manual deallocation hard
  • local decision impossible
  • must respect other pointers to node
• Cycles are an instance of sharing
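Why a local decision is impossible can be illustrated with a minimal C sketch. A node does not record who refers to it, so finding out whether it is shared takes a scan over all other nodes. The `node` type, `count_referrers`, and `demo_sharing` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *next;               /* single child pointer, NULL if none */
} node;

/* Counts how many of the given nodes point to target. This needs a scan
   over all nodes, because target does not know its referrers. */
size_t count_referrers(const node *target, const node nodes[], size_t n) {
    assert(target != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (nodes[i].next == target)
            count++;
    return count;
}

/* Two lists share the node n[2]; returns the number of referrers (2). */
size_t demo_sharing(void) {
    node n[3] = {0};                 /* all next pointers start as NULL */
    n[0].next = &n[2];               /* list A: n[0] -> n[2] */
    n[1].next = &n[2];               /* list B: n[1] -> n[2] */
    /* Freeing "all of list A" would also free n[2] and leave n[1].next
       dangling, although list A alone gives no hint of that. */
    return count_referrers(&n[2], n, 3);
}
```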
Garbage collection…
• Dynamically allocated memory
• Last use by a program
• Examples for automatic reclamation
Last Use by a Program
• Question: When is node M no longer used by the program?
• Let P be any program not using M
• New program sketch:
    Execute P;
    Use M;
• Hence: M is used ⇔ P terminates
• We are doomed: halting problem!
• So “last use” is undecidable!
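The program sketch "Execute P; Use M;" can be written down in C as follows. This is only an illustration of the reduction, not a proof. `program`, `run_q`, `use_m`, `p_halts`, and `demo_last_use` are assumed names, and a flag stands in for "M was used".

```c
#include <assert.h>
#include <stdbool.h>

typedef void (*program)(void);   /* hypothetical stand-in for "any program P" */

static bool m_used;              /* records whether node M was ever used */

static void use_m(void) { m_used = true; }

/* Q = "Execute P; Use M;": M is used exactly when P returns. */
void run_q(program p) {
    m_used = false;
    p();                         /* if P never terminates, use_m() is never reached */
    use_m();
}

/* A trivially terminating P, used only for illustration. */
void p_halts(void) { }

/* Runs Q with a terminating P and reports whether M was used. */
bool demo_last_use(void) {
    run_q(p_halts);
    assert(m_used);              /* P terminated, so M was used */
    return m_used;
}
```

A tool that could decide in advance whether `m_used` ever becomes true for an arbitrary `p` would also decide whether `p` terminates.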
Safe Approximation
• Decidable and also simple
• What does safe mean?
  • only unused nodes are freed
• What does approximation mean?
  • some unused nodes might not be freed
• Idea
  • keep the nodes that can be accessed by the mutator
Reachable Nodes
• Reachable from root set
  • processor registers
  • static variables
  • automatic variables (stack)
• Reachable from reachable nodes
Summary: Reachable Nodes
• A node n is reachable iff
  • n is element of the root set, or
  • n is element of children(m) and m is reachable
• Reachable node also called “live”
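The recursive definition translates almost directly into code. The following C sketch assumes nodes with at most two children and a `visited` flag per node. All names (`node`, `visit`, `count_reachable`, `demo_reachability`) are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

typedef struct node {
    struct node *child[MAX_CHILDREN]; /* children(n); NULL means "no child" */
    bool visited;
} node;

/* n is reachable if it is a root or a child of a reachable node. */
static void visit(node *n) {
    if (n == NULL || n->visited)
        return;                       /* the flag also stops us on cycles */
    n->visited = true;
    for (int i = 0; i < MAX_CHILDREN; i++)
        visit(n->child[i]);
}

/* Counts the nodes in all[] that are reachable from the given roots. */
size_t count_reachable(node *roots[], size_t nroots, node all[], size_t nall) {
    size_t count = 0;
    for (size_t i = 0; i < nall; i++)
        all[i].visited = false;
    for (size_t i = 0; i < nroots; i++)
        visit(roots[i]);
    for (size_t i = 0; i < nall; i++)
        if (all[i].visited)
            count++;
    return count;
}

/* Builds a small graph, then lets the "mutator" redirect one pointer. */
size_t demo_reachability(void) {
    node n[4] = {0};
    n[0].child[0] = &n[1];
    n[1].child[0] = &n[2];
    n[2].child[0] = &n[1];            /* cycle between n[1] and n[2] */
    n[3].child[0] = &n[0];            /* n[3] points in, but nothing points to it */
    node *roots[] = { &n[0] };
    assert(count_reachable(roots, 1, n, 4) == 3);  /* n[0], n[1], n[2] */
    n[0].child[0] = NULL;             /* redirect: n[1] and n[2] become garbage */
    return count_reachable(roots, 1, n, 4);        /* only n[0] is left */
}
```

Note that the cycle between `n[1]` and `n[2]` does not keep them alive once the pointer from `n[0]` is gone.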
MyGarbageCollector
• Compute set of reachable nodes
• Free nodes known to be unreachable
• Known as mark-sweep
  • in a second…
Reachability: Safe Approximation
• Safe
  • access to an unreachable node is impossible
  • depends on language semantics
  • but C/C++? later…
• Approximation
  • a reachable node might never be accessed
  • programmer must know about this!
  • have you been aware of this?
Garbage collection…
• Dynamically allocated memory
• Last use by a program
• Examples for automatic reclamation
Example Garbage Collectors
• Mark-Sweep
• Others
  • Mark-Compact
  • Reference Counting
  • Copying
  • skipped here
  • read Chapters 1 and 2 of [Jones&Lins, 96]
The Mark-Sweep Collector
• Compute reachable nodes: Mark
  • tracing garbage collector
• Free unreachable nodes: Sweep
• Run when out of memory: Allocation
• First used with LISP [McCarthy, 1960]
Allocation
    node* new() {
      if (free_pool is empty)
        mark_sweep();
      return allocate();
    }
The Garbage Collector
    void mark_sweep() {
      for (r in roots)
        mark(r);
      /* now: all live nodes marked */
      …
Recursive Marking
    void mark(node* n) {
      if (!is_marked(n)) {
        set_mark(n);
        for (m in children(n))
          mark(m);
      }
    }
• On return: nodes reachable from n marked
• i-th recursion: nodes on path with length i marked
The Garbage Collector
    void mark_sweep() {
      for (r in roots)
        mark(r);
      sweep();
      /* now: all nodes on heap live and not marked */
      …
Eager Sweep
    void sweep() {
      node* n = heap_bottom;
      while (n < heap_top) {
        if (is_marked(n))
          clear_mark(n);   /* live: reset mark for the next collection */
        else
          free(n);         /* garbage: return node to the free pool */
        n++;               /* next node, i.e. sizeof(*n) bytes further */
      }
    }
The Garbage Collector
    void mark_sweep() {
      for (r in roots)
        mark(r);
      sweep();
      if (free_pool is empty)
        abort("Memory exhausted");
    }
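Putting allocation, marking, and sweeping together, a compilable toy version might look like the C sketch below. It relies on simplifying assumptions that are not in the slides: a fixed heap of eight equally sized nodes, two child fields per node, an explicit `roots` array as root set, and an `in_use` flag to tell allocated nodes from free ones. The allocator is called `new_node` and returns NULL instead of aborting when memory is exhausted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_ROOTS 4

typedef struct node {
    struct node *left, *right;   /* child fields, known to the collector */
    bool marked;
    bool in_use;                 /* false while the node sits in the free pool */
} node;

static node heap[HEAP_SIZE];     /* contiguous heap of equally sized nodes */
static node *free_pool;          /* free nodes, linked through `left` */
static node *roots[MAX_ROOTS];   /* hypothetical root set */

void heap_init(void) {
    free_pool = NULL;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        heap[i] = (node){ .left = free_pool };
        free_pool = &heap[i];
    }
    for (size_t i = 0; i < MAX_ROOTS; i++)
        roots[i] = NULL;
}

static void mark(node *n) {
    if (n == NULL || n->marked)
        return;
    n->marked = true;
    mark(n->left);
    mark(n->right);
}

static void sweep(void) {
    for (node *n = heap; n < heap + HEAP_SIZE; n++) {
        if (n->marked) {
            n->marked = false;   /* live: clear mark for the next collection */
        } else if (n->in_use) {
            n->in_use = false;   /* garbage: return node to the free pool */
            n->left = free_pool;
            n->right = NULL;
            free_pool = n;
        }
    }
}

void mark_sweep(void) {
    for (size_t i = 0; i < MAX_ROOTS; i++)
        mark(roots[i]);
    sweep();
}

/* Returns a fresh node, or NULL if the heap is full of live nodes. */
node *new_node(void) {
    if (free_pool == NULL)
        mark_sweep();
    if (free_pool == NULL)
        return NULL;
    node *n = free_pool;
    free_pool = n->left;
    *n = (node){ .in_use = true };
    return n;
}

/* Fills the heap with one live node and seven garbage nodes, then
   allocates once more and returns the number of free nodes afterwards. */
size_t demo_mark_sweep(void) {
    heap_init();
    roots[0] = new_node();
    assert(roots[0] != NULL);
    for (int i = 0; i < HEAP_SIZE - 1; i++)
        (void)new_node();        /* result dropped: these become garbage */
    node *n = new_node();        /* pool is empty, so this triggers mark_sweep */
    assert(n != NULL);
    roots[0]->left = n;
    size_t free_count = 0;
    for (node *f = free_pool; f != NULL; f = f->left)
        free_count++;
    return free_count;           /* 7 reclaimed, 1 handed out again */
}
```

In this sketch the ninth allocation finds the pool empty, the collection reclaims the seven unreferenced nodes, and six remain free after the new node is handed out.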
Assumptions
• Nodes can be marked
• Size of nodes known
• Heap contiguous
• Memory for recursion available
• Child fields known!
Assumptions: Realistic
• Nodes can be marked
• Size of nodes known
• Heap contiguous
• Memory for recursion available
• Child fields known
Assumptions: Conservative
• Nodes can be marked
• Size of nodes known
• Heap contiguous
• Memory for recursion available
• Child fields known
Mark-Sweep Properties
• Covers cycles and sharing
• Time depends on
  • live nodes (mark)
  • live and garbage nodes (sweep)
• Computation must be stopped
  • non-interruptible stop/start collector
  • long pause
• Nodes remain unchanged (as not moved)
• Heap remains fragmented
Variations of Mark-Sweep
• In your talk…
Implementation
• In your talk…
Efficiency Analysis
• In your talk…
Comparison
• In your talk…