Dive into the world of memory management and storage allocation in programming languages through the lens of garbage collection. Explore different methods such as static allocation, stack allocation, and heap allocation, along with techniques like reference counting, mark and sweep, stop and copy. Learn about the advantages and disadvantages of various garbage collection approaches.
Lecture 22: Shameless Self-Promotion
From: bnelson@netcom.com (Bob Nelson)
Subject: Re: NT vs. Linux
Date: Fri, 5 Jul 1996 05:11:22 GMT
Newsgroups: comp.os.linux.advocacy, comp.sys.ibm.pc.hardware, comp.os.ms-windows.win95.misc, comp.os.ms-windows.nt.misc, alt.flame, alt.fan.bill-gates, alt.destroy.microsoft

Toni Anzlovar (toni.anzlovar@kiss.uni-lj.si) wrote:
> Why does everybody want to RUN WORD? Why does nobody want to write and edit
> text?
Simple. A *tremendous* number of documents are written using Microsoft Word. One that is particularly ironic is the guide to LCLint -- a very popular lint tool -- often the lint of choice in the linux world.

David Evans
http://www.cs.virginia.edu/~evans
CS655: Programming Languages
University of Virginia Computer Science
Menu
• Garbage Collection
• Theory of Type Qualifiers
• PS3, Question 5
• LCLint
University of Virginia CS 655
Static Storage Allocation
• All storage allocated at compile time
• Advantages:
  • Fast
  • Safe (cannot run out of memory)
• Disadvantages:
  • Limited expressiveness
    • No recursion, no dynamic structures
  • Inefficient (sizes must be known at compile time)
• FORTRAN
Stack Allocation
• Activation records
  • Storage allocated on procedure entrance, deallocated on procedure exit
• Advantages over static allocation:
  • Supports recursion
  • Sizes of local structures may vary
• But: storage lifetimes fixed to procedures
• Algol60
Heap Allocation
• Dynamically allocate storage
• Advantages
  • Storage size and lifetime controlled by programmer
• Disadvantages
  • Storage size and lifetime controlled by programmer
  • Heap fills up with garbage
What is Garbage?
• Allocated memory that will never be used again
• Conservative Predictions:
  • Java, CLU, LISP, ML
    • Objects that are not reachable
  • C/C++
    • Reachability is much harder because of pointer arithmetic, casting
  • Linda
    • No way to tell
Reference Counting
• Every allocated object has an associated reference counter, rc
  • Creating a new object, rc = 0
  • Assignment (creating a reference), rc++
  • Losing a reference: rc--; if rc == 0 free object
• Advantages:
  • Overhead distributed
• Disadvantages:
  • High overhead on assignments, block exits
  • Can’t reclaim cyclic structures
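Cyclic structures
  List x
  x := new List ();
[Figure: list nodes linked through their next fields into a cycle, each labeled with its reference count; once x is dropped every count is still 1.]
• Cannot be reclaimed!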
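The counting rules and the cycle problem from the two slides above can be sketched in C. This is a minimal illustration, not the lecture's code: the names `Obj`, `obj_new`, `obj_assign`, `obj_release` and the `live_objects` counter are all hypothetical, and each object is assumed to hold a single outgoing reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object layout: a count plus one outgoing reference. */
typedef struct Obj {
    int rc;              /* number of references to this object */
    struct Obj *next;    /* the single reference this object holds */
} Obj;

static int live_objects = 0;   /* bookkeeping for the demo only */

/* Creating a new object: rc = 0 until a reference to it is stored. */
Obj *obj_new(void) {
    Obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    live_objects++;
    return o;
}

/* Losing a reference: rc--, and free the object when rc reaches 0. */
void obj_release(Obj *o) {
    if (o == NULL) return;
    assert(o->rc > 0);
    if (--o->rc == 0) {
        obj_release(o->next);   /* a freed object drops its own reference */
        free(o);
        live_objects--;
    }
}

/* Assignment *slot = value: count the new reference, then drop the old one. */
void obj_assign(Obj **slot, Obj *value) {
    if (value != NULL) value->rc++;
    Obj *old = *slot;
    *slot = value;
    obj_release(old);
}
```

With these helpers, a two-node chain is reclaimed as soon as its only external reference is cleared, while a node whose `next` points back at itself keeps `rc == 1` after the external reference is cleared and is never freed.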
Reachability
• “Root” is reachable
• Object is reachable, if there is a reachable reference to it
• Roots:
  • CLU
    • Every object reference on the stack, own variables
  • Java
    • Every object reference on the stack, static (global) references
Mark and Sweep
[Figure: object graph with its root labeled]
Mark and Sweep
• Stop everything
• Mark objects reachable from roots
  • Just follow all references recursively
• Reclaim everything that isn’t marked
• Disadvantages
  • Long pauses (what gave GC a bad name)
• Advantages
  • Simple, no overhead except when GC’ing
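The mark and sweep steps listed above can be sketched over a tiny fixed-size heap. Everything here is an assumption made for illustration: the `Node` layout with two references, the `HEAP_SIZE` constant, and the `gc_alloc` and `gc_collect` names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8   /* hypothetical tiny heap */

typedef struct Node {
    bool in_use;                 /* slot is currently allocated */
    bool marked;                 /* set during the mark phase */
    struct Node *left, *right;   /* outgoing references */
} Node;

static Node heap[HEAP_SIZE];

/* Hand out the first free slot, or NULL when the heap is full. */
Node *gc_alloc(void) {
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i] = (Node){ .in_use = true };
            return &heap[i];
        }
    }
    return NULL;
}

/* Mark phase: follow all references recursively from one object. */
static void mark(Node *n) {
    if (n == NULL || n->marked) return;   /* the marked check stops on cycles */
    n->marked = true;
    mark(n->left);
    mark(n->right);
}

/* Mark from every root, then sweep: reclaim whatever was not marked.
   Returns the number of objects reclaimed. */
size_t gc_collect(Node **roots, size_t nroots) {
    assert(nroots == 0 || roots != NULL);
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);

    size_t reclaimed = 0;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = false;
            reclaimed++;
        }
        heap[i].marked = false;   /* reset for the next collection */
    }
    return reclaimed;
}
```

Unlike reference counting, an unreachable cycle is reclaimed here, because nothing reachable from the roots ever marks it.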
Stop and Copy
• Divide storage into two spaces
• Stop everything
• Start from roots, copy all reachable objects to new space (switching references to point to new space as you go)
• Advantages:
  • Improves locality: better cache behavior
• Disadvantages:
  • Have to waste memory (need new space to copy into)
  • Changes references (okay if language has address transparency)
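A copying collector along the lines of this slide might look like the sketch below. It uses forwarding pointers and a scan index over the new space (the approach usually credited to Cheney), which the slide does not name; the `Cell` layout, `SPACE_SIZE`, and the `sc_*` function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SPACE_SIZE 8   /* hypothetical size of each semispace */

typedef struct Cell {
    struct Cell *forward;   /* new address, set once the cell has been copied */
    struct Cell *ref;       /* single outgoing reference */
    int value;
} Cell;

static Cell space_a[SPACE_SIZE], space_b[SPACE_SIZE];
static Cell *from_space = space_a, *to_space = space_b;
static size_t next_free = 0;

/* Bump allocation in the current space. */
Cell *sc_alloc(int value) {
    if (next_free == SPACE_SIZE) return NULL;   /* a real collector would GC here */
    Cell *c = &from_space[next_free++];
    c->forward = NULL;
    c->ref = NULL;
    c->value = value;
    return c;
}

size_t sc_used(void) { return next_free; }

/* Copy one cell to the new space unless it was already moved;
   either way return its new address. */
static Cell *copy(Cell *c, size_t *free_idx) {
    if (c == NULL) return NULL;
    if (c->forward == NULL) {
        assert(*free_idx < SPACE_SIZE);
        Cell *dst = &to_space[(*free_idx)++];
        *dst = *c;
        c->forward = dst;
    }
    return c->forward;
}

/* Copy everything reachable from the roots, updating references as we go. */
void sc_collect(Cell **roots, size_t nroots) {
    size_t free_idx = 0, scan = 0;
    for (size_t i = 0; i < nroots; i++) roots[i] = copy(roots[i], &free_idx);
    while (scan < free_idx) {   /* fix up references inside copied cells */
        to_space[scan].ref = copy(to_space[scan].ref, &free_idx);
        scan++;
    }
    Cell *tmp = from_space;     /* swap the roles of the two spaces */
    from_space = to_space;
    to_space = tmp;
    next_free = free_idx;
}
```

After a collection the survivors sit next to each other at the start of the new space, which is the locality benefit the slide mentions, and every root has a new address, which is why the language needs address transparency.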
Garbage Collecting in C/C++
• Problem: what is reachable?
• Approach 1:
  • Keep table of malloc’ed objects
  • Assume all values that look like pointers (have value in address range) are pointers, and make all pointer-like values on stack, in registers, in static storage the roots
  • Any object not reachable from roots is garbage
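The table-based approach can be sketched as follows. This is only the root scan, under stated assumptions: the caller supplies the root words as an array rather than the collector walking the real stack and registers, a full collector would also scan inside each marked object, and the names `tracked_malloc` and `count_garbage` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_OBJECTS 16   /* hypothetical table capacity */

typedef struct {
    uintptr_t start;   /* address returned by malloc */
    size_t size;       /* size of the block in bytes */
    bool marked;       /* some root word appears to point into it */
} Entry;

static Entry table[MAX_OBJECTS];
static size_t n_entries = 0;

/* Keep a table of malloc'ed objects. */
void *tracked_malloc(size_t size) {
    assert(n_entries < MAX_OBJECTS);
    void *p = malloc(size);
    if (p != NULL) table[n_entries++] = (Entry){ (uintptr_t) p, size, false };
    return p;
}

/* Treat a word as a pointer if its value falls inside a tracked block. */
static void mark_word(uintptr_t word) {
    for (size_t i = 0; i < n_entries; i++) {
        if (word >= table[i].start && word - table[i].start < table[i].size)
            table[i].marked = true;
    }
}

/* Scan the given root words and count the tracked blocks that no root
   appears to point into. */
size_t count_garbage(const uintptr_t *words, size_t nwords) {
    for (size_t i = 0; i < n_entries; i++) table[i].marked = false;
    for (size_t w = 0; w < nwords; w++) mark_word(words[w]);

    size_t garbage = 0;
    for (size_t i = 0; i < n_entries; i++)
        if (!table[i].marked) garbage++;
    return garbage;
}
```

The scheme is conservative in one direction only: an integer that happens to look like an address keeps a block alive, but a pointer that is split or otherwise disguised across several words makes a live block look like garbage.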
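GC Test Program 1
  #include <stdlib.h>

  char *evil (void) {
      char *s = malloc (1000);
      /* Split the pointer into two halves so that no single word holds it
         (the masks assume 32-bit addresses). */
      long int adr1 = (long int) s & 0xFFFF0000;
      long int adr2 = (long int) s & 0x0000FFFF;
      s = malloc (1000);            /* the first block now looks unreachable */
      s = (char *) (adr1 | adr2);   /* rebuild the pointer to the first block */
      return s;
  }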
Test Fragment 2
  char *s = ...; // read a new string
  int len = 0;
  while (*s != '\0') {
      *s = tolower (*s);
      s++;
      len++;
  }
  s = s - len; // point back to string start
Boehm [PLDI 96]
• ANSI limits pointer arithmetic to within an allocated object
• Source code transformations to explicitly mark live references (put them in GC roots)
• Check source code for casts from non-pointer to pointer
GC Summary
• After 40 years, still an active research area
  • PLDI 2000: 3 GC papers (of 31)
  • Concurrent; generational; dealing with contaminated storage
• Performance penalty can be low or negative (improved cache behavior)
Alternatives to GC
• Never reclaim storage
  • Works fine for most PC applications, but not for embedded systems
  • Loses locality: real reason to GC most programs
• Manually reclaim storage
  • Buggy, dangerous and time-consuming
• Support manual reclamation with static checking
Type Qualifiers
• $q\,\tau \le \tau$: q is a negative (narrowing) qualifier
• $\tau \le q\,\tau$: q is a positive (widening) qualifier
• Is unsigned (in C) a qualifier?
  • No, neither $\text{unsigned int} \le \text{int}$ nor $\text{int} \le \text{unsigned int}$ is true.
Subtyping Rule
$$\frac{Q \le Q' \qquad \tau_i = \tau'_i \quad i \in [1..n]}{Q\ c(\tau_1, \ldots, \tau_n) \le Q'\ c(\tau'_1, \ldots, \tau'_n)}\ \text{[type-constructor]}$$
• Why not $\tau_i \le \tau'_i$?
PS3, Question 5: Subtyping
$$\frac{S \le T}{\text{array}[S] \le \text{array}[T]}\ \text{[monotonic-arrays]}$$
$$\frac{S = T}{\text{array}[S] \le \text{array}[T]}\ \text{[specialization of type-constructor]}$$
PS3, Question 5
• Goal: call Scrunch (p) with a p that allows attacker to violate type safety
• [monotonic-arrays] means typeof(p) must be $\le$ array[String], so we can have typeof(p) = array[EvilString] where EvilString $\le$ String
• Method overriding allows attacker to define EvilString.concat $\le$ String.concat
Overriding String.concat
$$\frac{P_1 \le Q_1, \ldots, P_n \le Q_n, \quad S \le T}{\text{proc}\ (P_1, \ldots, P_n)\ \text{returns}\ (S) \le \text{proc}\ (Q_1, \ldots, Q_n)\ \text{returns}\ (T)}\ \text{[monotonic-procs]}$$
So, String EvilString.concat (EvilString s) $\le$ String String.concat (String s)
Implementing EvilString
  class EvilString extends String {
      private int hackPointer;
      String concat (EvilString s) {
          this.hackPointer = ...; // forge an address
          return super.concat (s);
      }
      ...
  }
• Can attacker make Scrunch call EvilString.concat (s) where s is not an EvilString?
No!
• Scrunch calls s.concat after initializing,
    String s = "";
    ...
    s = s.concat ((String) ar[i]);
    ...
  so, EvilString.concat is never called! Attacker loses.
• If the initialization were replaced with
    String s = a[0];
  then Scrunch would call EvilString.concat(EvilString) if a[0] is an EvilString and a[1] is a String that is not an EvilString.
Real Java
• Has the [monotonic-array] rule!
    String [] s;
    SubString [] t;
    ...
    s = t;
    s[0] = new String ("test");
    SubString tt = t[0];
• The assignment to s[0] produces an ArrayStoreException at run time
const
• const is a positive qualifier: $\tau \le \text{const}\ \tau$
  • $\tau$ values can be cast to $\text{const}\ \tau$
  • $\text{const}\ \tau$ values cannot be cast to $\tau$ (Note: ANSI C allows the cast, but modifying an object defined const through the result is undefined behavior)
• const qualified values can be initialized but not updated
• From string.h: char *strcpy (char *, const char *);
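The direction of the const rule shows up directly in C parameter passing. In this small sketch the function names `count_char` and `overwrite_first` are made up for illustration: a `char *` argument is accepted where `const char *` is expected, but the reverse needs an explicit cast.

```c
#include <assert.h>
#include <stddef.h>

/* Reads through a const reference: callers may pass either
   char * or const char * (tau <= const tau). */
size_t count_char(const char *s, char c) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c) n++;
    }
    return n;
}

/* Writes through its parameter, so it needs a non-const reference.
   Passing a const char * here without a cast is diagnosed by the compiler. */
void overwrite_first(char *s, char c) {
    assert(s != NULL);
    if (s[0] != '\0') s[0] = c;
}
```

Calling `count_char` on a writable buffer and on a string literal both compile cleanly, while `overwrite_first` should only ever be given the writable buffer.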
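Assign
$$\frac{A \vdash e_1 : \text{ref}(\tau_2) \qquad A \vdash e_2 : \tau_2}{A \vdash e_1 := e_2 : \text{unit}}\ \text{[assign]}$$
$$\frac{A \vdash e_1 : \lnot\text{const ref}(\tau_2) \qquad A \vdash e_2 : \tau_2}{A \vdash e_1 := e_2 : \text{unit}}\ \text{[assign']}$$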
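Call
$$\frac{A \vdash e_1 : \tau_2 \to \tau \qquad A \vdash e_2 : \tau_2}{A \vdash e_1(e_2) : \tau}\ \text{[call]}$$
• No changes necessary: $\tau \le \text{const}\ \tau$
• No $\text{const}\ \tau \le \tau$ rule means $\text{const}\ \tau \not\le \tau$
• So, existing [call] rule disallows passing $\text{const}\ \tau$ as a $\tau$ parameter.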
LCLint Approach
• Programmers add annotations (formal specifications)
  • Simple and precise
  • Describe programmers' intent:
    • Types, memory management, data hiding, aliasing, modification, nullness, etc. (project group 3: buffer overflows)
• LCLint detects inconsistencies between annotations and code
  • Simple (fast!) dataflow analyses
Sample Annotation: only
  extern only char *gptr;
  extern only out null void *malloc (int);
• Reference (return value) owns storage
• No other persistent (non-local) references to it
• Implies obligation to transfer ownership
• Transfer ownership by:
  • Assigning it to an external only reference
  • Returning it as an only result
  • Passing it as an only parameter: e.g., extern void free (only void *);
Example
• extern only out null void *malloc (int); in library
  1 int dummy (void) {
  2   int *ip= (int *) malloc (sizeof (int));
  3   *ip = 3;
  4   return *ip;
  5 }
• LCLint output:
  dummy.c:3:4: Dereference of possibly null pointer ip: *ip
     dummy.c:2:13: Storage ip may become null
  dummy.c:4:14: Fresh storage ip not released before return
     dummy.c:2:43: Fresh storage ip allocated
only
• Try: only is a negative qualifier: $\text{only}\ \tau \le \tau$
• From stdlib:
    only void *malloc (size_t);
    void free (only void *);
• Does call rule work? (only pass onlys as onlys)
• But, after call state is changed:
    only char *x;
    ...
    free (x);
    free (x);
Operational Semantics
• After being passed as only, a reference becomes dead.
• Configuration: $\langle \text{Instructions}, PC, \text{Store} \rangle$
• $\text{Store}: \text{loc} \to \langle \text{value}, \text{state} \in \{\text{only}, \text{dead}, \ldots\} \rangle$
Pass as Only
$$\frac{\text{Instructions}[PC] = f(e) \qquad \text{Store}(f) = \langle v_f, \text{only} \rangle \qquad \text{Store}(e) = \langle v_e, \text{only} \rangle}{PC' = PC + 1 \qquad \text{Store}' = \text{Store}[e \mapsto \langle\ \ , \text{dead} \rangle]}$$
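Assign Only
$$\frac{\text{Instructions}[PC] = l := r \qquad \text{Store}(l) = \langle v_l, \text{only} \rangle \qquad \text{Store}(r) = \langle v_r, \text{only} \rangle}{PC' = PC + 1 \qquad \text{Store}' = \text{Store}[l \mapsto \langle v_r, \text{only} \rangle][r \mapsto \langle\ \ , \text{dead} \rangle]}$$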
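The two transition rules above can be read as a tiny state machine, sketched here in C. Several simplifications are assumptions of this sketch rather than part of the slides: locations are array indices, the condition on f is dropped, a dead location keeps its old value because the slides do not show what it holds, and the names `Config`, `pass_as_only` and `assign_only` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { ONLY, DEAD } State;   /* the slides leave the state set open */

typedef struct {
    int value;
    State state;
} Slot;

#define STORE_SIZE 4   /* hypothetical number of locations */

typedef struct {
    Slot slots[STORE_SIZE];   /* the store: loc -> <value, state> */
    size_t pc;                /* program counter */
} Config;

/* [Pass as Only] for an instruction f(e): applies only when e is `only`.
   On success e becomes dead and PC advances; otherwise nothing changes. */
bool pass_as_only(Config *c, size_t e) {
    assert(e < STORE_SIZE);
    if (c->slots[e].state != ONLY) return false;
    c->slots[e].state = DEAD;
    c->pc++;
    return true;
}

/* [Assign Only] for an instruction l := r: both sides must be `only`.
   l takes r's value and stays only; r becomes dead. */
bool assign_only(Config *c, size_t l, size_t r) {
    assert(l < STORE_SIZE && r < STORE_SIZE && l != r);
    if (c->slots[l].state != ONLY || c->slots[r].state != ONLY) return false;
    c->slots[l].value = c->slots[r].value;
    c->slots[r].state = DEAD;
    c->pc++;
    return true;
}
```

Under this reading, the double `free (x)` from the earlier slide gets stuck on the second call: the first call leaves x dead, so no rule applies and the function returns false without advancing PC.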
Remaining Challenge Problems
• How do you handle declarations?
• How do you handle block exits?
• Would denotational semantics work better?
• What about a combination of static and operational?
• How do you handle other annotations consistently?
Summary
• Theory of Type Qualifiers uses:
  • Static Semantics (Typing Judgments)
  • Subtyping Rules
  • Type Polymorphism
  • Type Inference
  • Lambda Calculus
  • Operational Semantics
• If you understand everything in this paper, you know 75% of what you need to know for the final.
Charge
• Next time:
  • Wacky Programming Paradigms
  • Guidelines for Rotunda Presentations
  • Signup for Final Timeslots
• Project Final Reports due Friday
  • All team members should read complete drafts of your report