280 likes | 287 Views
This lecture discusses garbage collection in C, including pointer arithmetic, type checking, and the challenges of implementing garbage collection in C.
E N D
Lecture 18: Deep C Garbage Collection David Evans http://www.cs.virginia.edu/evans CS201j: Engineering Software University of Virginia Computer Science
Menu • Pointers in C • Pointer Arithmetic • Type checking in C • Why is garbage collection hard in C? CS 201J Fall 2002
What are those arrows really? Heap Stack sb “hello” CS 201J Fall 2002
Pointers • In Java, an object reference is really just an address in memory • But Java doesn’t let programmers manipulate addresses directly Heap Stack 0x80496f0 0x80496f4 0x80496f8 hell o\0\0\0 0x80496fb sb 0x80496f8 0x8049700 0x8049704 0x8049708 CS 201J Fall 2002
Pointers in C • Addresses in memory • Programs can manipulate addresses directly &expr Evaluates to the address of the location expr evaluates to *exprEvaluates to the value stored in the address expr evaluates to CS 201J Fall 2002
&*%&@#*! int f (void) { int s = 1; int t = 1; int *ps = &s; int **pps = &ps; int *pt = &t; **pps = 2; pt = ps; *pt = 3; t = s; } s == 1, t == 1 s == 2, t == 1 s == 3, t == 1 s == 3, t == 3 CS 201J Fall 2002
Rvalues and Lvalues What does = really mean? int f (void) { int s = 1; int t = 1; t = s; t = 2; } left side of = is an “lvalue” it evaluates to a location (address)! right side of = is an “rvalue” it evaluates to a value There is an implicit * when a variable is used as an rvalue! CS 201J Fall 2002
BLISS Aside • BLISS [Wulf71] • Made getting values explicit s = .t; Puts the value in the location t in the location s CS 201J Fall 2002
Parameter Passing in C • Actual parameters are rvalues void swap (int a, int b) { int tmp = b; b = a; a = tmp; } int main (void) { int i = 3; int j = 4; swap (i, j); … } The value of i (3) is passed, not its location! swap does nothing CS 201J Fall 2002
Parameter Passing in C • Can pass addresses around void swap (int *a, int *b) { int tmp = *b; *b = *a; *a = tmp; } int main (void) { int i = 3; int j = 4; swap (&i, &j); … } The value of &i is passed, which is the address of i CS 201J Fall 2002
Beware! int *value (void) { int i = 3; return &i; } void callme (void) { int x = 35; } int main (void) { int *ip; ip = value (); printf (“*ip == %d\n", *ip); callme (); printf ("*ip == %d\n", *ip); } > splint value.c Splint 3.0.1.7 --- 08 Aug 2002 value.c: (in function value) value.c:4:10: Stack-allocated storage &i reachable from return value: &i A stack reference is pointed to by an external reference when the function returns. The stack-allocated storage is destroyed after the call, leaving a dangling reference. (Use -stackref to inhibit warning) … But it could really be anything! *ip == 3 *ip == 35 CS 201J Fall 2002
Manipulating Addresses char s[6]; s[0] = ‘h’; s[1] = ‘e’; s[2]= ‘l’; s[3] = ‘l’; s[4] = ‘o’; s[5] = ‘\0’; printf (“s: %s\n”, s); expr1[expr2] in C is just syntactic sugar for *(expr1 + expr2) s: hello CS 201J Fall 2002
Obfuscating C char s[6]; *s = ‘h’; *(s + 1) = ‘e’; 2[s] = ‘l’; 3[s] = ‘l’; *(s + 4) = ‘o’; 5[s] = ‘\0’; printf (“s: %s\n”, s); s: hello CS 201J Fall 2002
Fun with Pointer Arithmetic int match (char *s, char *t) { int count = 0; while (*s == *t) { count++; s++; t++; } return count; } int main (void) { char s1[6] = "hello"; char s2[6] = "hohoh"; printf ("match: %d\n", match (s1, s2)); printf ("match: %d\n", match (s2, s2 + 2)); printf ("match: %d\n", match (&s2[1], &s2[3])); } &s2[1] &(*(s2 + 1)) s2 + 1 The \0 is invisible! match: 1 match: 3 match: 2 CS 201J Fall 2002
Condensing match int match (char *s, char *t) { int count = 0; while (*s == *t) { count++; s++; t++; } return count; } int match (char *s, char *t) { char *os = s; while (*s++ == *t++); return s – os - 1; } s++ evaluates to spre, but changes the value of s Hence, C++ has the same value as C, but has unpleasant side effects. CS 201J Fall 2002
Type Checking in C • Java: only allow programs the compiler can prove are type safe • C: trust the programmer. If she really wants to compare apples and oranges, let her. Exception: run-time type errors for downcasts and array element stores. CS 201J Fall 2002
Type Checking int main (void) { char *s = (char *) 3; printf ("s: %s", s); } Windows2000 (earlier versions of Windows would just crash the whole machine) CS 201J Fall 2002
In Praise of Type Checking int match (int *s, int *t) { int *os = s; while (*s++ == *t++); return s - os; } int main (void) { char s1[6] = "hello"; char s2[6] = "hello"; printf ("match: %d\n", match (s1, s2)); } match: 2 CS 201J Fall 2002
Different Matching int different (int *s, int *t) { int *os = s; while (*s++ != *t++); return s - os; } int main (void) { char s1[6] = "hello"; printf ("different: %d\n", different ((int *)s1, (int *)s1 + 1)); } different: 29 CS 201J Fall 2002
So, why is it hard to garbage collect C? CS 201J Fall 2002
Mark and Sweep (Java version) active = all objects on stack while (!active.isEmpty ()) newactive = { } foreach (Object a in active) mark a as reachable foreach (Object o that a points to) if o is not marked newactive = newactive U { o } active = newactive sweep () // remove unmarked objects on heap CS 201J Fall 2002
Mark and Sweep (C version?) active = all pointers on stack while (!active.isEmpty ()) newactive = { } foreach (pointer a in active) mark *a as reachable foreach (address p that a points to) if *p is not marked newactive = newactive U { *p } active = newactive sweep () // remove unmarked objects on heap CS 201J Fall 2002
GC Challenges char *f (void) { char *s = (char *) malloc (sizeof (char) * 100); s = s + 20; *s = ‘a’; return s – 20; } There may be objects that only have pointers to their middle! CS 201J Fall 2002
GC Challenges char *f (void) { char *s = (char *) malloc (sizeof (char) * 100); int x = (int) s; s = 0; return (char *) x; } There may be objects that are reachable through values that have non-pointer apparent types! CS 201J Fall 2002
GC Challenges char *f (void) { char *s = (char *) malloc (sizeof (char) * 100); int x = (int) s; x = x - &f; s = 0; return (char *) (x + &f); } There may be objects that are reachable through values that have non-pointer apparent types and have values that don’t even look like addresses! CS 201J Fall 2002
Why not just do reference counting? Where can you store the references? Remember C programs can access memory directly, better not change how objects are stored! CS 201J Fall 2002
Summary • Garbage collection depends on: • Knowing which values are addresses • Knowing that objects without references cannot be reached • Both of these are problems in C • Nevertheless, there are some garbage collectors for C. • Change meaning of some programs • Slow down programs a lot • Are not able to find all garbage CS 201J Fall 2002
Charge • PS6 due Tuesday • Exam 2 out Thursday • Send review questions if you want a review class • Remaining classes: • Java Byte Codes • Security • Concurrency without synchronization • Project Management Garbage Collectors (COAX, Seoul, 18 June 2002) CS 201J Fall 2002