Cyclone Memory Management Greg Morrisett Cornell University & Microsoft Research
Joint Work: • Trevor Jim, Dan Grossman, Mike Hicks, • Yanling Wang, James Cheney.
The Cyclone Project • Cyclone is a type-safe dialect of C: • Started with C syntax and semantics. • Threw out things that could lead to type errors: • Unsafe casts, unions, pointer arithmetic, deallocation, … • Yields a safe but very small language. • Making up for missing features through: • An advanced type system (polymorphism, regions, …) • An intra-procedural flow analysis (def. assignment,…) • New language features (tagged unions, fat pointers,…) • Run-time checks (array bounds checks,…)
Cyclone in Practice • Cyclone compiler, tools, & libraries • Over 100 KLOC • Eating our own dog food ensures some degree of quality, usability. • Windows and Linux device drivers • UPenn: In-kernel Network Monitoring • Cornell & Maryland: MediaNet • Leiden Institute: Open Kernel Environment • Moving on to embedded systems. • Utah: RBClick Router
This Talk • Region-based Memory Management • Type-safe stack allocation • Interaction with ADTs and polymorphism • Adding lexical arena allocation • Adding dynamic arena allocation • Integrating unique pointers • Flow analysis • Sharing unique objects • Temporary aliasing
Spot the problem… • char *itoa(int i) { • char buf[20]; • sprintf(buf,"%d",i); • return buf; • } • Most compilers will warn you of this…
But Consider... • list_t global = NULL; • // inserts item into global list • void insert(char *item) { • list_t e = (list_t)malloc(sizeof(struct List)); • e->hd = item; • e->tl = global; • global = e; • } • void foo(int x) { • char buf[20]; • sprintf(buf, "%d", x); • insert(buf); • }
Regions: • We use a region-based type system to prevent dereferencing a dangling pointer. • Based on work by Tofte & Talpin plus others. • Each lexical block is given a unique (compile-time) name called a region. • Each pointer type indicates the region into which the value points. • Only pointers into live regions can be dereferenced. • Region polymorphism ensures reusable code. • Region inference and default region annotations minimize the burden on the programmer.
Points: what you think you wrote • typedef struct Point {int x,y;} pt; • void addTo(pt * a, pt * b) { • a->x += b->x; • a->y += b->y; • } • pt p = {1,2}; • void main() { • pt q = {3,4}; • pt * pptr = &p; • pt * qptr = &q; • addTo(pptr, qptr); • }
Points: what you really wrote • typedef struct Point {int x,y;} pt; • void addTo<r1,r2>(pt *r1 a, pt *r2 b : {r1, r2}) { • a->x += b->x; • a->y += b->y; • } • pt p = {1,2}; • void main() `main:{ • pt q = {3,4}; • pt *Heap pptr = &p; • pt *`main qptr = &q; • addTo<Heap,`main>(pptr, qptr); • } • Point p is allocated in the heap region (Heap). • Point q is allocated in main's region (`main). • The region of an object is reflected in a pointer's type.
Points Continued • typedef struct Point {int x,y;} pt; • void addTo<r1, r2>(pt *r1 a, pt *r2 b : {r1, r2}) { • a->x += b->x; • a->y += b->y; • } • pt p = {1,2}; • void main() `main:{ • pt q = {3,4}; • pt *Heap pptr = &p; • pt *`main qptr = &q; • addTo<Heap,`main>(pptr, qptr); • } • This function is parameterized by two regions corresponding to the two pointers passed in. • By default, we assume such regions are live across the call. • Any caller has to prove that the regions will be live.
Points Inferred • typedef struct Point {int x,y;} pt; • void addTo(pt * a, pt * b) { • a->x += b->x; • a->y += b->y; • } • pt p = {1,2}; • void main() { • pt q = {3,4}; • pt * pptr = &p; • pt * qptr = &q; • addTo(pptr, qptr); • } • Missing regions in prototypes are replaced with a fresh type variable (just like type-polymorphism). • We generalize over the region variables for function arguments. • We perform local type inference. • Result: very few region annotations.
Region Subtyping • Because blocks are allocated in a LIFO fashion, we can safely treat a pointer into region r1 as a pointer into r2 if r1 was defined before r2. • int foo(int *r1 x, int *r2 y) • `foo:{ int *`foo z; • if (rand()) z = x; // r1 <: `foo • else • z = y; // r2 <: `foo • ... • }
Region Subtyping, contd. • Programmers can specify region ordering relations as pre-conditions to functions. • int *r1 foo(int *r1 x, int *r2 y: r2<:r1) • { • if (rand()) return x; • else • return y; • } • Note: the Heap outlives all regions…
Dangling Pointers Revisited • char *??? itoa(int i) • `itoa:{ • char buf[20]; • sprintf(buf,"%d",i); • return buf; // buf: char*`itoa • } • The buffer lives in region `itoa. • But `itoa is not in scope for the return type. • And `itoa does not outlive any other region. • Ergo, there’s no way to make the example type-check. • It seems as though pointers cannot escape the scope defining their lifetimes. • Or can they?
Ensuring Soundness • Without a mechanism for hiding regions in types, it would be sufficient to check whether a region is in scope to determine whether it is still live. • But Cyclone includes support for polymorphism and abstract data types (∀, ∃), which provide ways to hide regions within abstracted types. • Closures and objects also hide types. • ∃ can be used to encode closures, objects. • We must somehow ensure these pointers aren't dereferenced after the region is deallocated.
Live Region Sets • We keep track of the set of live regions (always a subset of those in scope). • Analogous to the effects used by Tofte & Talpin. • To access an ADT that hides some abstract set of regions S, you must present evidence that S is a subset of the currently live regions. • Achieved by lifting region subtype constraints to sets of regions: r <: S meaning if r is live, then so are all the regions in S. • For details, see PLDI’02 paper.
Beyond Stack Allocation • Thus far, the only way to allocate an object is by declaring it as a local variable and taking its address (stack allocation). • Pros: • No run-time checks -> all errors caught at compile time. • Covers caller allocates & callee reads/writes. • Easy to determine space bounds (w/o recursion). • Constant time [de]allocation. • Sufficient for lots of systems code! • Cons: • Caller doesn’t know how much to allocate -- gets(). • Lifetimes of objects are constrained to LIFO.
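The "caller allocates & callee reads/writes" idiom listed above can be sketched in plain C (not Cyclone) as a repaired version of the earlier itoa example. The name itoa_into and its error convention are hypothetical, chosen only for illustration; the point is that the buffer lives in the caller's frame, so the result cannot outlive its storage.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the talk): the caller owns buf.
 * Returns 0 on success, -1 if buf cannot hold the digits plus the NUL. */
int itoa_into(char *buf, size_t len, int i) {
    int n = snprintf(buf, len, "%d", i);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

A caller would declare char buf[20]; on its own stack and pass it in, which also shows the limitation noted in the Cons: the caller has to guess the size up front.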
Lexical Arena Allocation • char *r rgets(handle_t<r>); • char *r append(handle_t<r>, char *r1, char *r2); • void bar(int n) { • region<r> r; • char *r str1 = rmalloc(r,6); • strncpy(str1, "Hello ", 5); • { region<s> s; • char *s str2 = rgets(s); • str1 = append(r,str1,str2); • } • printf("%s",str1); • }
Lexical Arena Allocation, step by step • region<r> r; declares a new region r, whose lifetime corresponds to the block. The variable r is a handle for allocating in r. • rmalloc(r, n) allocates n bytes in the region for which r is a handle. • region<s> s; declares a region s with handle s. • rgets allocates its result in s by using the handle it is passed. • Same for append, except the result for this call goes in r. • Storage for s (i.e., str2) is deallocated at the closing brace of the inner block. • Storage for r is deallocated at the closing brace of bar.
Runtime Organization • Regions are linked lists of pages. • Arbitrary inter-region references. • Similar to arena-style allocators. • [Figure: the runtime stack alongside the regions]
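As a rough illustration of "regions are linked lists of pages", here is a minimal arena sketch in plain C. It is not the Cyclone runtime: the names struct page, struct region, region_alloc, region_free_all and the 4096-byte page size are all assumptions made for this example.

```c
#include <stdlib.h>
#include <stddef.h>

#define PAGE_BYTES 4096  /* assumed page size, for illustration only */

struct page {
    struct page *next;   /* older pages belonging to the same region */
    size_t used;         /* bytes already handed out from data[] */
    size_t cap;          /* usable bytes in data[] */
    max_align_t data[];  /* payload, aligned for any object type */
};

struct region { struct page *pages; };  /* newest page first */

/* Bump-allocate n bytes, adding a fresh page when the current one is full. */
void *region_alloc(struct region *r, size_t n) {
    size_t a = _Alignof(max_align_t);
    if (n == 0) n = 1;
    if (n > (size_t)-1 / 2) return NULL;   /* guard against overflow */
    n = (n + a - 1) / a * a;               /* keep later objects aligned */
    struct page *p = r->pages;
    if (p == NULL || p->cap - p->used < n) {
        size_t cap = n > PAGE_BYTES ? n : PAGE_BYTES;
        p = malloc(sizeof *p + cap);
        if (p == NULL) return NULL;
        p->next = r->pages;
        p->used = 0;
        p->cap = cap;
        r->pages = p;
    }
    void *out = (unsigned char *)p->data + p->used;
    p->used += n;
    return out;
}

/* Deallocate the whole region at once by walking the page list. */
void region_free_all(struct region *r) {
    struct page *p = r->pages;
    while (p != NULL) {
        struct page *next = p->next;
        free(p);
        p = next;
    }
    r->pages = NULL;
}
```

Individual objects are never freed; only region_free_all releases memory, which mirrors the block-exit deallocation of a lexical region. In this sketch the cost of that call grows with the number of pages rather than the number of objects.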
Lexical Arenas • Based on Tofte-Talpin's ML+regions • Regions introduced with lexical scope (i.e., have LIFO lifetimes) • Can support O(1) memory management operations. • Adds dynamic allocation. • Supports “callee allocates” idioms. • Arenas are often used in C programs (e.g., LCC & Apache) • Beats the crap out of Real-time Java proposals. • Unlike the ML-Kit: • Programmer controls where objects are allocated instead of implicit inference. • In practice, ML-Kit requires coding idioms. • More code, but more control, easier to reason about space requirements.
But Still Too Limited • LIFO arena lifetimes are still too strict for many programs. • No notion of “tail-call” for regions. • Makes it hard to code iterative algorithms where state must be transferred from the previous iteration to the next (e.g., a copying collector). • Data lifetimes are statically determined. • Consider a server or GUI that creates some state upon a request, and only deallocates that state upon a subsequent request. • Creating/destroying a region is relatively expensive. • Must install exception handler/deconstructor to ensure memory is reclaimed upon an uncaught exception. • NB: real-time Java has the same troubles…
To Address these Issues • Dynamic arenas • Can allocate or deallocate the arena at will. • But an extra check is required for access. • (Checks can be amortized.) • Unique pointers • Lightweight (single-object) regions • Arbitrary lifetimes. • But restrictions on making copies • (Restrictions can be temporarily lifted.)
Dynamic Arenas • dynhandle_t<r> • Think of it as a possibly-NULL pointer to a region. • It will be NULL if the region has been freed. • Three operations: • dynregion() returns ∃r.dynhandle_t<r>. Note that r is not in the live set. • region h = open(dynhandle_t<r>){ ... } dynamically checks that r has not been freed, records that the region is opened, and then grants access to the region for the specified scope. • free(dynhandle_t<r>) checks that the region r is not open and has not been freed, and then deallocates the storage.
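The run-time checks described for open and free can be modeled in plain C. This is only a sketch of the bookkeeping, not Cyclone's implementation: struct dyn_region and the dynregion_* function names are hypothetical, a single malloc'd buffer stands in for the region's pages, and the scoped open { ... } construct is split into explicit open/close calls.

```c
#include <stdbool.h>
#include <stdlib.h>

struct dyn_region {
    void *storage;   /* stand-in for the region's pages */
    int open_count;  /* number of scopes that currently have it open */
    bool freed;      /* set once the storage has been released */
};

struct dyn_region *dynregion_new(size_t bytes) {
    struct dyn_region *h = malloc(sizeof *h);
    if (h == NULL) return NULL;
    h->storage = malloc(bytes);
    if (h->storage == NULL) { free(h); return NULL; }
    h->open_count = 0;
    h->freed = false;
    return h;
}

/* Checks that the region has not been freed, then records it as open.
 * Returns NULL when access must be refused. */
void *dynregion_open(struct dyn_region *h) {
    if (h->freed) return NULL;
    h->open_count++;
    return h->storage;
}

/* Marks the end of one open scope. */
void dynregion_close(struct dyn_region *h) {
    if (h->open_count > 0) h->open_count--;
}

/* Refuses to free a region that is open or already freed. */
bool dynregion_free(struct dyn_region *h) {
    if (h->freed || h->open_count > 0) return false;
    free(h->storage);
    h->storage = NULL;
    h->freed = true;
    return true;
}
```

The handle itself outlives the storage so that later open attempts can still be checked; one check at open then covers every access made inside that scope.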
Notes on Dynamic Arenas • We’ve traded some potential for dynamic failure for greatly increased flexibility. • So flexible, in fact, it’s possible to write a copying garbage collector within the language (see Wang & Appel, Monnier & Shao, etc.) • You can amortize the dynamic checks. • Can think of opening a dynamic region as trying to map a page (put it in your TLB) for some scope. Subsequent access within the scope is then cheap. • Opening a dynamic region is a convenient synchronization point for shared data among threads. • See D. Grossman’s TLDI paper and up-coming thesis. • It’s easy to support asynchronous revocation. • See C. Hawblitzel’s thesis.
Unique Pointers • Arenas are relatively expensive: • Must install/tear-down exception handler. • Good for batching up allocation/access. • Too heavyweight for individual objects. • Unique pointers provide very lightweight “anonymous” regions. • T*U ≈ ∃r.T*r. • Can create and destroy at will. • But there are strong restrictions on creating copies of a unique pointer to ensure uniqueness.
Copying Kills • void foo(int *U x) { • int *U y = x; • *y = 3; • free(y); • *x; // oops! • } • Flow analysis prevents old copies of unique pointers from being used, so this is flagged as a compile-time error.
Flow Analysis • Tracks whether a value is defined at each point. • Needed to ensure you don’t use an uninitialized value. • Copying a unique value makes the source undefined • void foo(int *U*U x) { • int *U y;// y undefined • y = *x; // y defined, but *x undefined • *y = 3; // okay, y is defined • f(y); // y now undefined • *y; // compile-time error! • }
Joins • void foo(int *U x) { • if (rand()) • free(x); • *x; // x undefined • } • The analysis conservatively considers a value undefined if it’s undefined on any path. • This can lead to leaks, so we warn if it’s undefined on one path, but not another. • Considered making it an error, but too many false positives, and it prevents the next feature…
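The join rule stated above is small enough to write down directly. The sketch below is plain C with hypothetical names (def_state, join_result, join_paths); it models only the rule for one variable at one merge point, not the full Cyclone analysis.

```c
enum def_state { UNDEFINED = 0, DEFINED = 1 };

struct join_result {
    enum def_state state;   /* state of the variable after the merge */
    int warn_possible_leak; /* 1 when the two paths disagree */
};

/* Merge the states reaching a control-flow join from two paths:
 * the value counts as defined only if it is defined on both. */
struct join_result join_paths(enum def_state a, enum def_state b) {
    struct join_result r;
    r.state = (a == DEFINED && b == DEFINED) ? DEFINED : UNDEFINED;
    r.warn_possible_leak = (a != b);
    return r;
}
```

For the foo example, the branch that calls free(x) yields UNDEFINED and the fall-through path yields DEFINED, so the merge is UNDEFINED with a warning, and the later *x is rejected.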
Sharing Unique Pointers • void foo(int *U*r x) { • int *U*r y = x; • free(*x); // *x undefined • // so, *y undefined • } • To be sound, it seems we must have completely accurate, global alias information or else prevent unique pointers from being placed in shared objects. • The latter approach is the norm for linear types.
Swap to the Rescue • We allow placing unique objects in shared ones, but the only way to extract the unique object is by using an atomic swap. • int foo(int *U*r x) { • int *U*r y = x; • int *U temp = malloc(sizeof(int)); • *temp = 3; • *y :=: temp; • free(temp); • return **x; • }
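The same example can be mimicked in plain C to show why swap keeps things sound: the shared cell always holds exactly one owned object, and whatever comes out of it is owned by the local variable. The names swap_unique and replace_with_three are hypothetical, C does not enforce uniqueness, and this swap is not atomic the way Cyclone's :=: is.

```c
#include <stdlib.h>

/* Exchange the pointer held in a shared cell with a local one.
 * Afterwards each side owns what the other side owned before. */
void swap_unique(int **cell, int **local) {
    int *tmp = *cell;
    *cell = *local;
    *local = tmp;
}

/* Mirrors foo above: install a fresh object holding 3 and free the old one.
 * Returns the new contents, or -1 if allocation fails (assumed convention). */
int replace_with_three(int **cell) {
    int *temp = malloc(sizeof *temp);
    if (temp == NULL) return -1;
    *temp = 3;
    swap_unique(cell, &temp);  /* temp now owns the object that was in the cell */
    free(temp);                /* safe: no other reference to the old object */
    return **cell;
}
```

Because the old object is only reachable through temp after the swap, freeing it cannot leave a dangling pointer inside the shared cell.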
Unifying U and r • With swap, unique pointers are great! • Supports sharing (even among threads). • Yet also supports grabbing a unique object. • In turn, provides fine-grained memory mgmt. • But uniqueness is still a strong constraint. • Have to write functions so they return the arguments they don’t consume. • Example: list length or list map. • I should be able to code these once, and use them for unique lists or shared ones. • But to traverse a unique list, you have to use swaps and reverse it as you crawl over it, and then reverse it again on the way back up.
Alias Declarations • alias <r> x = y { … } • When y has type T*U and r is fresh: • Makes a copy of y and binds it to x. • The type of x is T*r (i.e., shareable) • So x can be freely copied or traversed • But r isn’t in scope outside {…} so no copy can escape. • y is undefined within {…} so it can’t be freed. • After {…}, y becomes defined again.
Notes on Alias Declarations • Generalization of Wadler’s Let-! • Closer to Walker & Watkins Let-region • But we support a form of deep aliasing • T[U] can be treated as T[r] as long as T is a covariant type constructor. • E.g., list<U> can be treated as list<r> throughout the scope of the alias declaration. • This makes it possible to write libraries where much code can be shared.
From the List Library • mlist_t<a,r>, // mutable list <: • list_t<a,r> // immutable lists • void freelist(mlist_t<a,U> x); • int length(list_t<a,r> x); • void foo(mlist_t<int,U> x) { • int i; • alias<r> list_t<int,r> y = x in { • i = length(y); // x is undefined inside the alias scope • } • freelist(x); • }
Compiler Can Often Infer Alias: • mlist_t<a,r>, // mutable list <: • list_t<a,r> // immutable lists • void freelist(mlist_t<a,U> x); • int length(list_t<a,r> x); • void foo(mlist_t<int,U> x) { • int i; • i = length(x); // alias inferred! • freelist(x); • }
Summary: • Cyclone provides flexible, real-time, user-controlled memory management with static type safety guarantees. • Stack & lexical arenas require no checks, but only support “static” lifetimes. • Dynamic arenas support arbitrary lifetimes, but require some run-time checks. • Unique pointers support lightweight regions and arbitrary lifetimes, but have restrictions on sharing. • Both dynamic arenas and unique pointers can be temporarily treated as lexical pointers. • Crucial for building re-usable libraries. • The region-based type system provides a unifying framework.
What Next? • Reference counting • Some preliminary support based on unique pointers. • Bounded arenas • I really think the real-time and embedded guys will like this stuff. • Better type error messages! • Very, very, very, very hard… • Region constraints (instead of parameters) • Like “where” clauses in ML modules
For more info: • Download the code: • www.cs.cornell.edu/projects/Cyclone • www.research.att.com/projects/Cyclone