Understanding Garbage Collection in Programming Languages

Explore memory management and storage allocation in programming languages through the lens of garbage collection. This lecture covers static, stack, and heap allocation, along with collection techniques such as reference counting, mark and sweep, and stop and copy, and the advantages and disadvantages of each approach.


Presentation Transcript


  1. Lecture 22: Shameless Self-Promotion From bnelson@netcom.com (Bob Nelson) Subject Re: NT vs. Linux Date Fri, 5 Jul 1996 05:11:22 GMT Newsgroups comp.os.linux.advocacy,comp.sys.ibm.pc.hardware, comp.os.ms-windows.win95.misc, comp.os.mswindows.nt.misc, alt.flame,alt.fan.bill-gates,alt.destroy.microsoft ------------------------------------------------------------------------ Toni Anzlovar (toni.anzlovar@kiss.uni-lj.si) wrote: > Why does everybody want to RUN WORD? Why does nobody want to write and edit > text? Simple. A *tremendous* number of documents are written using Microsoft Word. One that is particularly ironic is the guide to LCLint -- a very popular lint tool -- often the lint of choice in the linux world. David Evans http://www.cs.virginia.edu/~evans CS655: Programming Languages University of Virginia Computer Science

  2. Menu • Garbage Collection • Theory of Type Qualifiers • PS3, Question 5 • LCLint University of Virginia CS 655

  3. Static Storage Allocation • All storage allocated at compile time • Advantages: • Fast • Safe (cannot run out of memory) • Disadvantages: • Limited expressiveness • No recursion, no dynamic structures • Inefficient (sizes must be known at compile time) • FORTRAN

  4. Stack Allocation • Activation records • Storage allocated on procedure entrance, deallocated on procedure exit • Advantages over static allocation: • Supports recursion • Local structures size may vary • But: storage lifetimes fixed to procedures • Algol60

  5. Heap Allocation • Dynamically allocate storage • Advantages • Storage size and lifetime controlled by programmer • Disadvantages • Storage size and lifetime controlled by programmer • Heap fills up with garbage

  6. What is Garbage? • Allocated memory that will never be used again • Conservative Predictions: • Java, CLU, LISP, ML • Objects that are not reachable • C/C++ • Reachability is much harder because of pointer arithmetic, casting • Linda • No way to tell

  7. Reference Counting • Every allocated object has an associated reference counter, rc • Creating a new object, rc = 0 • Assignment (creating a reference), rc++ • Losing a reference: rc--; if rc == 0 free object • Advantages: • Overhead distributed • Disadvantages: • High overhead on assignments, block exits • Can’t reclaim cyclic structures
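
The counter rules on this slide can be sketched in C roughly as follows. This is a minimal illustration that assumes each object holds a single `next` reference; the names `Obj`, `obj_new`, `obj_retain`, `obj_release`, `obj_assign`, and `live_objects` are hypothetical and do not come from the lecture.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object with one outgoing reference. */
typedef struct Obj {
    int rc;              /* reference counter */
    struct Obj *next;    /* the only reference this object can hold */
} Obj;

int live_objects = 0;    /* allocated and not yet freed */

/* Creating a new object: rc = 0 until a reference to it is stored. */
Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->rc = 0;
    o->next = NULL;
    live_objects++;
    return o;
}

/* Creating a reference: rc++. */
Obj *obj_retain(Obj *o) {
    if (o != NULL) o->rc++;
    return o;
}

/* Losing a reference: rc--; if rc == 0, free the object. */
void obj_release(Obj *o) {
    if (o == NULL) return;
    assert(o->rc > 0);
    if (--o->rc == 0) {
        obj_release(o->next);   /* the dying object loses its own reference */
        free(o);
        live_objects--;
    }
}

/* Assignment to a reference slot: retain the new target, release the old one. */
void obj_assign(Obj **slot, Obj *value) {
    obj_retain(value);
    obj_release(*slot);
    *slot = value;
}
```

Every pointer store goes through `obj_assign`, which is the per-assignment overhead the slide lists as a disadvantage. Retaining the new value before releasing the old one keeps self-assignment safe.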

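  8. Cyclic structures • [Figure: List x refers to a ring of three nodes linked through next; the node x refers to has rc = 2, the other two have rc = 1] • x := new List (); • [Figure: after x is reassigned, every node in the ring has rc = 1] • Cannot be reclaimed!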
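
The failure on cyclic structures can be reproduced with a small C sketch. It assumes a ring of three nodes as in the slide's diagram; `List`, `list_new`, `drop`, `cycle_demo`, and `freed` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node with a reference counter. */
typedef struct List {
    int rc;
    struct List *next;
} List;

int freed = 0;   /* total number of nodes reclaimed so far */

List *list_new(void) {
    List *l = calloc(1, sizeof *l);   /* rc = 0, next = NULL */
    assert(l != NULL);
    return l;
}

/* Losing a reference: rc--; if rc == 0, free the node and drop its reference. */
void drop(List *l) {
    if (l != NULL && --l->rc == 0) {
        drop(l->next);
        free(l);
        freed++;
    }
}

/* Builds a ring of three nodes referenced by x, then loses x.
   Returns how many nodes reference counting reclaimed at that point. */
int cycle_demo(void) {
    int before = freed;
    List *a = list_new(), *b = list_new(), *c = list_new();
    a->next = b; b->rc++;
    b->next = c; c->rc++;
    c->next = a; a->rc++;
    a->rc++;                  /* x refers to a, so a->rc == 2 */

    drop(a);                  /* x is lost: a->rc falls to 1, never to 0 */
    int reclaimed = freed - before;

    /* Cleanup for the demo only: break the ring by hand. */
    c->next = NULL;
    drop(a);                  /* now a->rc reaches 0 and the chain unwinds */
    return reclaimed;
}
```

Without the manual cleanup at the end, the three nodes would stay allocated for the rest of the program even though nothing can reach them.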

  9. Reachability • “Root” is reachable • Object is reachable, if there is a reachable reference to it • Roots: • CLU • Every object reference on the stack, own variables • Java • Every object reference on the stack, static (global) references

  10. Mark and Sweep root

  11. Mark and Sweep root

  12. Mark and Sweep • Stop everything • Mark objects reachable from roots • Just follow all references recursively • Reclaim everything that isn’t marked • Disadvantages • Long pauses (what gave GC a bad name) • Advantages • Simple, no overhead except when GC’ing
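
A rough C sketch of the mark and sweep steps listed on this slide, assuming a small fixed pool of objects with at most two references each. The names `Node`, `pool`, `node_alloc`, `mark`, and `collect` are hypothetical, and a real collector would find its roots on the stack and in globals rather than take them as an argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8
#define MAX_REFS 2

/* Hypothetical heap object: a mark bit plus a few outgoing references. */
typedef struct Node {
    bool in_use;
    bool marked;
    struct Node *refs[MAX_REFS];
} Node;

static Node pool[POOL_SIZE];   /* the whole "heap" */

/* Hand out the first free slot, or NULL when the pool is full. */
Node *node_alloc(void) {
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i] = (Node){ .in_use = true };
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase: follow all references recursively. */
static void mark(Node *n) {
    if (n == NULL || n->marked) return;
    n->marked = true;
    for (size_t i = 0; i < MAX_REFS; i++) mark(n->refs[i]);
}

/* Mark from the roots, then sweep the pool. Returns the number reclaimed. */
size_t collect(Node **roots, size_t nroots) {
    size_t reclaimed = 0;
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = false;   /* reclaim everything that isn't marked */
            reclaimed++;
        }
        pool[i].marked = false;       /* reset for the next collection */
    }
    return reclaimed;
}
```

Unlike reference counting, this reclaims unreachable cycles, because objects are judged by reachability from the roots and not by how many references point at them.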

  13. Stop and Copy • Divide storage into two spaces • Stop everything • Start from roots, copy all reachable objects to new space (switching references to point to new space as you go) • Advantages: • Improves locality – better cache behavior • Disadvantages: • Have to waste memory (need new space to copy into) • Changes references (okay if language has address transparency)
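
The copying step can be sketched in C along these lines. The sketch assumes fixed-size cells with one reference field and uses a Cheney-style scan pointer, which is one common way to implement stop and copy and is not necessarily the variant the lecture had in mind. `Cell`, `cell_alloc`, `evacuate`, and `collect` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SPACE_SIZE 8

/* Hypothetical heap cell with one reference field. */
typedef struct Cell {
    int value;
    struct Cell *next;      /* the cell's only reference */
    struct Cell *forward;   /* new address once the cell has been copied */
} Cell;

static Cell space_a[SPACE_SIZE], space_b[SPACE_SIZE];
static Cell *from_space = space_a, *to_space = space_b;
static size_t alloc_top = 0;   /* next free slot in from_space */

/* Bump allocation in the current space; NULL when it is full. */
Cell *cell_alloc(int value, Cell *next) {
    if (alloc_top == SPACE_SIZE) return NULL;
    Cell *c = &from_space[alloc_top++];
    c->value = value;
    c->next = next;
    c->forward = NULL;
    return c;
}

/* Copy a cell into to_space unless already copied; return its new address. */
static Cell *evacuate(Cell *c, size_t *free_top) {
    if (c == NULL) return NULL;
    if (c->forward == NULL) {
        Cell *copy = &to_space[(*free_top)++];
        copy->value = c->value;
        copy->next = c->next;      /* still an old address; fixed by the scan */
        copy->forward = NULL;
        c->forward = copy;
    }
    return c->forward;
}

/* Copy everything reachable from the roots, updating the roots in place.
   Returns the number of live cells. */
size_t collect(Cell **roots, size_t nroots) {
    size_t scan = 0, free_top = 0;
    for (size_t i = 0; i < nroots; i++)
        roots[i] = evacuate(roots[i], &free_top);
    while (scan < free_top) {      /* switch references as we go */
        to_space[scan].next = evacuate(to_space[scan].next, &free_top);
        scan++;
    }
    Cell *tmp = from_space;        /* the two spaces trade roles */
    from_space = to_space;
    to_space = tmp;
    alloc_top = free_top;
    return free_top;
}
```

After `collect`, live cells sit contiguously at the start of the new space, which is the locality benefit the slide mentions. Every root and every `next` field has changed value, which is why the technique needs address transparency.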

  14. Garbage Collecting in C/C++ • Problem: what is reachable? • Approach 1: • Keep table of malloc’ed objects • Assume all values that look like pointers (have value in address range) are pointers, and make all pointer-like values on stack, in registers, in static storage the roots • Any object not reachable from roots is garbage
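
Approach 1 might look roughly like this in C. The sketch assumes objects are word-aligned, a whole number of words long, and fully initialized, and it takes the root words as an explicit array instead of reading the real stack and registers. `Entry`, `gc_register`, `lookup`, `scan`, and `gc_count_garbage` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJECTS 16

/* One row of the table of allocated objects. */
typedef struct {
    uintptr_t start;   /* first address of the object */
    size_t size;       /* size in bytes */
    bool marked;
} Entry;

static Entry table[MAX_OBJECTS];
static size_t nentries = 0;

/* Record an allocation; a real collector would do this in its malloc wrapper. */
void gc_register(void *p, size_t size) {
    assert(nentries < MAX_OBJECTS);
    table[nentries++] = (Entry){ (uintptr_t) p, size, false };
}

/* A word "looks like a pointer" if its value falls inside a registered object. */
static Entry *lookup(uintptr_t word) {
    for (size_t i = 0; i < nentries; i++) {
        if (word >= table[i].start && word - table[i].start < table[i].size)
            return &table[i];
    }
    return NULL;
}

/* Treat each word as a possible pointer; mark what it hits and scan that too. */
static void scan(const uintptr_t *words, size_t nwords) {
    for (size_t i = 0; i < nwords; i++) {
        Entry *e = lookup(words[i]);
        if (e != NULL && !e->marked) {
            e->marked = true;
            scan((const uintptr_t *) e->start, e->size / sizeof(uintptr_t));
        }
    }
}

/* Returns how many registered objects are not reachable from the root words. */
size_t gc_count_garbage(const uintptr_t *roots, size_t nroots) {
    size_t garbage = 0;
    for (size_t i = 0; i < nentries; i++) table[i].marked = false;
    scan(roots, nroots);
    for (size_t i = 0; i < nentries; i++) {
        if (!table[i].marked) garbage++;
    }
    return garbage;
}
```

The collector is conservative in both directions that matter: an integer that happens to equal an address keeps an object alive, and a pointer that has been disguised as something else is not seen at all.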

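  15. GC Test Program 1
      #include <stdint.h>
      #include <stdlib.h>

      /* Splits a pointer into two integer halves so that, for a while,
         no single value holds the address of the first block. */
      char *evil (void) {
          char *s = malloc (1000);
          uintptr_t adr1 = (uintptr_t) s & ~(uintptr_t) 0xFFFF;  /* high bits */
          uintptr_t adr2 = (uintptr_t) s & (uintptr_t) 0xFFFF;   /* low 16 bits */
          s = malloc (1000);            /* the first block now has no direct reference */
          s = (char *) (adr1 | adr2);   /* rebuild the pointer to the first block */
          return s;
      }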

  16. Test Fragment 2
      char *s = ...;    // read a new string
      int len = 0;
      while (*s != '\0') {
          *s = tolower ((unsigned char) *s);
          s++; len++;
      }
      s = s - len;      // point back to string start

  17. Boehm [PLDI 96] • ANSI limits pointer arithmetic to within an allocated object • Source code transformations to explicitly mark live references (put them in GC roots) • Check source code for casts from non-pointer to pointer

  18. GC Summary • After 40 years, still an active research area • PLDI 2000: 3 GC papers (of 31) • Concurrent; generational; dealing with contaminated storage • Performance penalty can be low or negative (improved cache behavior)

  19. Alternatives to GC • Never reclaim storage • Works fine for most PC applications, but not for embedded systems • Loses locality – real reason to GC most programs • Manually reclaim storage • Buggy, dangerous and time-consuming • Support manual reclamation with static checking

  20. Type Qualifiers • $q\ \tau \le \tau$: negative (narrowing) qualifier • $\tau \le q\ \tau$: positive (widening) qualifier • Is unsigned (in C) a qualifier? No, neither $\text{unsigned int} \le \text{int}$ nor $\text{int} \le \text{unsigned int}$ is true.

  21. Subtyping Rule
      $$\frac{Q \le Q' \qquad \tau_i = \tau'_i \quad \forall i \in [1..n]}{Q\ c(\tau_1, \ldots, \tau_n) \le Q'\ c(\tau'_1, \ldots, \tau'_n)} \quad \text{[type-constructor]}$$
      Why not $\le$ in the premise $\tau_i = \tau'_i$?

  22. PS3, Question 5: Subtyping
      $$\frac{S \le T}{\text{array}[S] \le \text{array}[T]} \quad \text{[monotonic-arrays]}$$
      $$\frac{S = T}{\text{array}[S] \le \text{array}[T]} \quad \text{[specialization of type-constructor]}$$

  23. PS3, Question 5 • Goal: call Scrunch (p) with a p that allows attacker to violate type safety • [monotonic-arrays] means typeof(p) must be $\le$ array[String], so we can have typeof(p) = array[EvilString] where EvilString $\le$ String • Method overriding allows attacker to define EvilString.concat $\le$ String.concat

  24. Overriding String.concat
      $$\frac{P_1 \le Q_1, \ \ldots, \ P_n \le Q_n, \quad S \le T}{\text{proc}(P_1, \ldots, P_n)\ \text{returns}\ (S) \le \text{proc}(Q_1, \ldots, Q_n)\ \text{returns}\ (T)} \quad \text{[monotonic-procs]}$$
      So, String EvilString.concat (EvilString s) $\le$ String String.concat (String s)

  25. Implementing EvilString
      class EvilString extends String {
          private int hackPointer;
          String concat (EvilString s) {
              this.hackPointer = // forge an address
              return super.concat (s);
          }
          ...
      }
      Can attacker make Scrunch call EvilString.concat (s) where s is not an EvilString?

  26. No! Scrunch calls s.concat after initializing,
          String s = "";
          ...
          s = s.concat ((String) ar[i]);
          ...
      so EvilString.concat is never called! Attacker loses. If the initialization were replaced with
          String s = a[0];
      then Scrunch would call EvilString.concat(EvilString) if a[0] is an EvilString and a[1] is a String that is not an EvilString.

  27. Real Java • Has the [monotonic-array] rule!
      String [] s;
      SubString [] t;
      ...
      s = t;
      s[0] = new String ("test");
      SubString tt = t[0];
      • The assignment to s[0] produces an ArrayStoreException at run time

  28. const • const is a positive qualifier: $\tau \le \text{const}\ \tau$ • $\tau$ values can be cast to const $\tau$ • const $\tau$ values cannot be cast to $\tau$ (Note: ANSI C allows the cast, but modifying a const object through the result is undefined behavior) • const qualified values can be initialized but not updated • From the standard library: char *strcpy (char *dest, const char *src);
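
How the positive qualifier plays out in C can be seen in a small sketch. The functions `count_upper` and `lower_in_place` are hypothetical examples; only `isupper`, `tolower`, and `strcmp` come from the standard library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Needs read access only, so the parameter is const-qualified.
   Both char * and const char * arguments are accepted. */
size_t count_upper(const char *s) {
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isupper((unsigned char) *s)) n++;
    }
    return n;
}

/* Needs write access, so the parameter is a plain char *.
   Passing a const char * here draws a compiler diagnostic. */
void lower_in_place(char *s) {
    for (; *s != '\0'; s++) {
        *s = (char) tolower((unsigned char) *s);
    }
}
```

A `char buf[]` argument can be passed to either function, while a `const char *` can be passed only to `count_upper` unless the caller adds an explicit cast.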

  29. A e1: const ref(2) A e2 : 2 [assign’] A e1:= e2:  unit Assign A e1:ref(2) A e2 : 2 [assign] A e1:= e2:  unit University of Virginia CS 655

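  30. Call • No changes necessary: $\tau \le \text{const}\ \tau$ • No $\text{const}\ \tau \le \tau$ rule means $\text{const}\ \tau \not\le \tau$ • So, existing [call] rule disallows passing const $\tau$ as $\tau$ parameter.
      $$\frac{A \vdash e_1 : \tau_2 \to \tau \qquad A \vdash e_2 : \tau_2}{A \vdash e_1(e_2) : \tau} \quad \text{[call]}$$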

  31. LCLint Approach • Programmers add annotations (formal specifications) • Simple and precise • Describe programmer's intent: Types, memory management, data hiding, aliasing, modification, nullness, etc. (project group 3: buffer overflows) • LCLint detects inconsistencies between annotations and code • Simple (fast!) dataflow analyses

  32. Sample Annotation: only
      extern only char *gptr;
      extern only out null void *malloc (int);
      • Reference (return value) owns storage • No other persistent (non-local) references to it • Implies obligation to transfer ownership • Transfer ownership by: • Assigning it to an external only reference • Returning it as an only result • Passing it as an only parameter, e.g., extern void free (only void *);

  33. Example • extern only out null void *malloc (int); in library
      1 int dummy (void) {
      2   int *ip= (int *) malloc (sizeof (int));
      3   *ip = 3;
      4   return *ip;
      5 }
      LCLint output:
      dummy.c:3:4: Dereference of possibly null pointer ip: *ip
         dummy.c:2:13: Storage ip may become null
      dummy.c:4:14: Fresh storage ip not released before return
         dummy.c:2:43: Fresh storage ip allocated
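
One way to rewrite `dummy` so that both complaints no longer apply is sketched below. The `-1` error value is an assumption made for illustration; the slide does not say how the function should behave when allocation fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Checks the null result and releases the storage before returning. */
int dummy(void) {
    int *ip = malloc(sizeof *ip);
    if (ip == NULL) {
        return -1;          /* assumed error value, not from the slide */
    }
    *ip = 3;                /* safe: ip is known to be non-null here */
    int result = *ip;
    free(ip);               /* ownership of the only storage passes to free */
    return result;
}
```

The null check addresses the "possibly null pointer" message, and the call to `free` discharges the obligation that the `only` annotation on `malloc` places on the caller.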

  34. only • Try: only is a negative qualifier: $\text{only}\ \tau \le \tau$ • From stdlib: only void *malloc (size_t); void free (only void *); • Does call rule work? (only pass onlys as onlys) • But, after call state is changed: only char *x; ... free (x); free(x);

  35. Operational Semantics • After passing as only, becomes dead. • Configuration: < Instructions, PC, Store > • Store: loc $\to$ < value, state $\in$ { only, dead, ... } >

  36. Pass as Only • If: Instructions[PC] = f (e) & Store (f) = < v_f, only   > & Store (e) = < v_e, only > • Then: PC = PC + 1; Store’ = Store[e $\mapsto$ < , dead >]

  37. Assign Only • If: Instructions[PC] = l := r & Store (l) = < v_l, only > & Store (r) = < v_r, only > • Then: PC = PC + 1; Store’ = Store[l $\mapsto$ < v_r, only >] [r $\mapsto$ < , dead >]
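
The two transitions can be mimicked in C as a sketch. Here a store location is reduced to an integer value plus a state tag, and a transition that cannot fire is reported by returning `false`; `Loc`, `State`, `pass_as_only`, and `assign_only` are hypothetical names, and the value left in a dead location is simply not touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { STATE_ONLY, STATE_DEAD } State;

/* Hypothetical store location: < value, state >. */
typedef struct {
    int value;
    State state;
} Loc;

/* Pass as only: the argument must be only; afterwards it is dead.
   Returns false when the rule does not apply. */
bool pass_as_only(Loc *e) {
    assert(e != NULL);
    if (e->state != STATE_ONLY) return false;
    e->state = STATE_DEAD;
    return true;
}

/* Assign only: l := r needs both sides only; l takes r's value
   and stays only, r becomes dead. */
bool assign_only(Loc *l, Loc *r) {
    assert(l != NULL && r != NULL);
    if (l->state != STATE_ONLY || r->state != STATE_ONLY) return false;
    l->value = r->value;
    l->state = STATE_ONLY;
    r->state = STATE_DEAD;
    return true;
}
```

With `x` in state only, the first `pass_as_only(&x)` plays the role of `free (x)` and succeeds, while a second call returns `false`. This is the double free that a purely static reading of only as a qualifier would miss.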

  38. Still Challenge Problem Remaining • How do you handle declarations? • How do you handle block exits? • Would denotational semantics work better? • What about a combination of static and operational? • How do you handle other annotations consistently?

  39. Summary • Theory of Type Qualifiers uses: • Static Semantics (Typing Judgments) • Subtyping Rules • Type Polymorphism • Type Inference • Lambda Calculus • Operational Semantics • If you understand everything in this paper, you know 75% of what you need to for the final.

  40. Charge • Next time: • Wacky Programming Paradigms • Guidelines for Rotunda Presentations • Signup for Final Timeslots • Project Final Reports due Friday • All team members should read complete drafts of your report
