Yarra: A lightweight extension of C for data integrity & local reasoning
Work in Progress!
David Walker, Princeton University
Joint with: Karthik Pattabiraman, Cole Schlesinger, Nikhil Swamy, Ben Zorn
Report from the Front Lines
Far from the lily waters and ivory towers that house local reasoners and separation logicians, a pitched battle is being waged:
Hackers & Legacy Software vs. C, C++ programmers
The Skirmishes
[Figure: timeline from 2000 to 2010 plotting frequency of attacks, annotated with attacks and the defenses raised against them. Source: Ben Zorn]
Attacks labeled on the figure:
• buffer overflow on stack; corrupt return address
• buffer overflow on heap; execute heap data
• attack data integrity; corrupt program without changing control flow
Defenses labeled on the figure:
• XOR heap metadata with random number
• Stack Guard
• heap layout randomization
• /GS flag
• DEP: turn off HW execute bits in heap
• write integrity testing (WIT)
• this talk
A non-control data attack
[source: Akritidis et al.; inspired by true nullhttpd attack]
Web Server Code:
    char cgiCommand[1024];
    char cgiDir[1024];

    void ProcessCGIRequest(char* msg, int sz) {
      int i = 0;
      while (i < sz) {            /* sz is never checked against sizeof(cgiCommand) */
        cgiCommand[i] = msg[i];
        i++;
      }

      ExecuteRequest(cgiDir, cgiCommand);
    }
[Figure: cgiCommand sits next to cgiDir; an overflow of cgiCommand overwrites cgiDir]
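The attack above can be reproduced in a small, well-defined model. The sketch below is an illustration only: it assumes the two globals are adjacent (modeled as two halves of one flat buffer so that the overflow is defined behavior in standard C), and the directory path and the helper request_corrupts_dir are hypothetical names that do not come from the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 1024

/* Assumption: one flat buffer stands in for the two adjacent globals. */
static char memory[2 * BUF_SIZE];
static char *const cgiCommand = memory;            /* bytes 0..1023    */
static char *const cgiDir     = memory + BUF_SIZE; /* bytes 1024..2047 */

/* Same unchecked copy loop as on the slide. */
static void process_cgi_request(const char *msg, size_t sz) {
    assert(sz <= sizeof memory); /* keep the demo inside the model */
    for (size_t i = 0; i < sz; i++) {
        cgiCommand[i] = msg[i];
    }
}

/* Hypothetical driver: sends command_len filler bytes followed by tail,
   and returns 1 if the request changed cgiDir, 0 otherwise. */
static int request_corrupts_dir(size_t command_len, const char *tail) {
    static const char original_dir[] = "/usr/local/httpd/cgi-bin"; /* made-up path */
    char msg[2 * BUF_SIZE];
    size_t tail_len = strlen(tail) + 1; /* include the terminator */

    assert(command_len + tail_len <= sizeof msg);
    strcpy(cgiDir, original_dir);
    memset(msg, 'A', command_len);
    memcpy(msg + command_len, tail, tail_len);
    process_cgi_request(msg, command_len + tail_len);

    return strcmp(cgiDir, original_dir) != 0;
}
```

Calling request_corrupts_dir(BUF_SIZE, "/bin") leaves cgiDir holding "/bin" without touching any return address or function pointer, which is what makes this a non-control data attack.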
Two Perspectives
• The conventional perspective:
  • The write operation on cgiCommand is out-of-bounds. The fault lies in the definition and implementation of cgiCommand: it misimplements indexing operations.
• The data integrity perspective:
  • The integrity of the cgiDir data structure is compromised. The fault lies in the definition and implementation of cgiDir: it fails to protect itself from external agents.
Engineering Considerations
• The conventional perspective:
  • bounds for all data structures must be maintained
  • indices proven within bounds of structures
• The data integrity perspective:
  • bounds only for critical (high integrity) data structures must be maintained
  • indices proven not within critical structures
  • new, efficient implementation possibilities
  • e.g., allocate critical objects on designated pages; flip hardware protection bits to prevent writes to critical data, allowing “safe” linking with buggy, unmodified libraries
Yarra
• A lightweight extension to C
• Programmers use type declarations to specify data integrity constraints
• A compiler & run-time system dynamically enforce the constraints
• Yarra’s program logic possesses sound frame rules that apply even when modules are linked to (almost) arbitrary C or assembly libraries
• Yarra = arraY⁻¹ (that is, "array" reversed)
Yarra Type Definitions
• Yarra type definitions allow programmers to declare their data integrity intentions:
    yarra typedef char dir[1024];
    dir cgiDir;
• Only operations on pointers with static type dir may access the data structure cgiDir
• By using a pointer with the appropriate type, a programmer declares their intention to modify a particular structure
    yarra typedef char dir[1024];    /* type declaration for data with high integrity */

    char cgiCommand[1024];
    dir cgiDir;                      /* high integrity data structure protected by run-time system */

    void ProcessCGIRequest(char* msg, int sz) {
      int i = 0;
      while (i < sz) {
        cgiCommand[i] = msg[i];
        i++;
      }

      ExecuteRequest(cgiDir, cgiCommand);
    }
[Figure: cgiCommand overflows into cgiDir]
On overflow, the access pointer has type char[] but the memory written to has type dir.
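One way to picture the dynamic check is a shadow map that records a type tag for every byte. The sketch below is a plain C model under that assumption, not Yarra's actual run-time system: the tag names, checked_store, and send_request are hypothetical, and a violation returns 0 where a real system would presumably signal a failure.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 1024

enum tag { TAG_ORDINARY = 0, TAG_DIR = 1 };

static char memory[2 * BUF_SIZE];     /* cgiCommand, then cgiDir */
static enum tag shadow[2 * BUF_SIZE]; /* zero-initialized: all TAG_ORDINARY */

/* Model of "dir cgiDir;": tag the upper half of memory as dir. */
static void protect_cgi_dir(void) {
    for (size_t a = BUF_SIZE; a < 2 * BUF_SIZE; a++) {
        shadow[a] = TAG_DIR;
    }
}

/* A store is legal only when the pointer's static type matches the tag
   of the byte being written. Returns 1 on success, 0 on a violation. */
static int checked_store(enum tag static_type, size_t addr, char value) {
    assert(addr < sizeof memory);
    if (shadow[addr] != static_type) {
        return 0;
    }
    memory[addr] = value;
    return 1;
}

/* The slide's copy loop with every store checked. The pointer has static
   type char[], so it carries TAG_ORDINARY. Returns the bytes stored. */
static size_t process_cgi_request(const char *msg, size_t sz) {
    size_t i = 0;
    while (i < sz) {
        if (!checked_store(TAG_ORDINARY, i, msg[i])) {
            break; /* write into memory of type dir: stop here */
        }
        i++;
    }
    return i;
}

/* Hypothetical driver: send sz filler bytes, report how many were stored. */
static size_t send_request(size_t sz) {
    static char msg[2 * BUF_SIZE];
    assert(sz <= sizeof msg);
    for (size_t i = 0; i < sz; i++) {
        msg[i] = 'A';
    }
    protect_cgi_dir();
    return process_cgi_request(msg, sz);
}
```

In this model an oversized request stops at byte 1024, the first byte tagged dir, while a store through a pointer of static type dir to the same byte succeeds.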
A (very) simple allocator
[Figure: a meta array of tags (1 0 1 0 0 0) above a data array of cells. The meta data is protected. A cell tagged 0 is unallocated and protected; a cell tagged 1 is allocated and unprotected.]
[Figure: meta tags 1 0 1 0 0 0 above the data cells]
    yarra typedef struct { int tag; } metaT;
    yarra typedef struct { int nothing; } unusedT;

    union item {
      unusedT unused;
      int used;
    };

    static metaT* meta[SIZE];
    static item* data[SIZE];
    yarra typedef struct { int tag; } metaT;
    yarra typedef struct { int junk; } unusedT;

    union item {
      unusedT unused;
      int used;
    };

    static metaT* meta[SIZE];
    static item* data[SIZE];

    int *alloc() {
      for (int i = 0; i < SIZE; i++) {
        if (meta[i]->tag == 0) {               /* cell i is free */
          meta[i]->tag = 1;                    /* mark it allocated */
          unbless(unusedT, data[i]->unused);   /* hand the cell to the client */
          return data+i;
        }
      }
      return NULL; // out of memory
    }
    yarra typedef struct { int tag; } metaT;
    yarra typedef struct { int junk; } unusedT;

    union item {
      unusedT unused;
      int used;
    };

    static metaT* meta[SIZE];
    static item* data[SIZE];

    void free(int *datum) {
      if (datum >= data && datum < data+SIZE) {
        int i = datum-data;
        if (meta[i]->tag == 0)
          abort("double free");
        else {
          meta[i]->tag = 0;                    /* mark the cell free */
          bless(unusedT, data[i]->unused);     /* protect it again */
        }
      } else {
        abort("out of bounds datum");
      }
      return;
    }
Yarra Operations
• bless(T,p): memory pointed to by p may only be used at type T from now on
• unbless(T,p): memory pointed to by p was blessed in the past and from now on may be freely used at any type
• isa(T): true if memory has been blessed as T
• yalloc(T): allocate memory blessed at T
• p->f, p[i]: use memory; dynamic legality depends upon static type of p
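The life cycle of a blessed location can be modeled with one tag per cell. This is a sketch under stated assumptions rather than the Yarra run-time: cells are addressed by index, Yarra types are represented by small positive integers, isa takes the location as an explicit second argument, the store function stands in for p[i] = v, failures are reported as false, and yalloc is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CELLS 8
#define STANDARD 0 /* tag for the ordinary heap (not a Yarra type) */

typedef int yarra_type; /* hypothetical: positive ids stand for Yarra types */

static int cells[CELLS];
static yarra_type tags[CELLS]; /* zero-initialized: every cell starts ordinary */

/* isa(T, p): true if cell p is currently tagged T. */
static bool isa(yarra_type t, size_t p) {
    assert(p < CELLS);
    return tags[p] == t;
}

/* bless(T, p): from now on cell p may only be used at type T. */
static bool bless(yarra_type t, size_t p) {
    if (t == STANDARD || !isa(STANDARD, p)) {
        return false; /* only ordinary memory can be blessed */
    }
    tags[p] = t;
    return true;
}

/* unbless(T, p): cell p was blessed as T and becomes ordinary again. */
static bool unbless(yarra_type t, size_t p) {
    if (t == STANDARD || !isa(t, p)) {
        return false; /* p is not currently blessed as T */
    }
    tags[p] = STANDARD;
    return true;
}

/* Model of p[i] = v: legal only if the pointer's static type matches
   the tag of the cell. An ordinary pointer has static type STANDARD. */
static bool store(yarra_type static_type, size_t p, int v) {
    if (!isa(static_type, p)) {
        return false;
    }
    cells[p] = v;
    return true;
}
```

For example, after bless(1, 3) a store through an ordinary pointer, store(STANDARD, 3, 9), is rejected, while store(1, 3, 9) succeeds; after unbless(1, 3) the ordinary store is legal again.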
Yarra Abstract Semantics
• Abstractly, objects of each Yarra type are implemented in their own, independent heap
• Programmers reason as if this were the case
• The heaps metaT and unusedT show up in specs
[Figure: three separate heaps labeled unusedT, metaT, and heap]
Yarra Abstract Semantics
[Figure: three separate heaps labeled unusedT, metaT, and heap]
• bless(T,p) moves (transfers) a location p from the standard heap to the heap T
• unbless(T,p) moves p from T to the standard heap
• Programmers use classical first-order assertions to reason about the states of these independent heaps:
  • loc ∈ dom(unusedT) means “loc is in the heap unusedT”
Yarra Specs
[Figure: the meta array (tags 1 0 1 0 0 0) is in heap metaT; allocated data cells are in the standard heap; unallocated data cells are in heap unusedT]
Allocator Module Invariant:
    forall i:int. 0 <= i < SIZE ==>
      let loc = data+i in
        ((metaT[loc] == 0 && loc ∈ dom(unusedT)) ||
         (metaT[loc] == 1 && loc ∉ dom(unusedT)))
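The invariant can be checked mechanically in a plain C model of the allocator. The sketch assumes the reading in which a cell tagged 0 is in dom(unusedT) and a cell tagged 1 is not; membership in the unusedT heap is modeled by a boolean array, and the names allocator_init, model_alloc, model_free, and invariant_holds are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIZE 6

static int  meta_tag[SIZE];   /* models metaT: 0 = free, 1 = allocated */
static int  data[SIZE];       /* the cells handed out to clients */
static bool in_unusedT[SIZE]; /* models dom(unusedT) */

/* The module invariant: free cells are protected, allocated cells are not. */
static bool invariant_holds(void) {
    for (size_t i = 0; i < SIZE; i++) {
        bool free_and_protected = meta_tag[i] == 0 && in_unusedT[i];
        bool used_and_open      = meta_tag[i] == 1 && !in_unusedT[i];
        if (!(free_and_protected || used_and_open)) {
            return false;
        }
    }
    return true;
}

/* Initial state: every cell is free and blessed as unusedT. */
static void allocator_init(void) {
    for (size_t i = 0; i < SIZE; i++) {
        meta_tag[i] = 0;
        in_unusedT[i] = true;
    }
}

static int *model_alloc(void) {
    assert(invariant_holds());
    for (size_t i = 0; i < SIZE; i++) {
        if (meta_tag[i] == 0) {
            meta_tag[i] = 1;
            in_unusedT[i] = false; /* plays the role of unbless(unusedT, ...) */
            assert(invariant_holds());
            return &data[i];
        }
    }
    return NULL; /* out of memory */
}

/* Returns false on a null, out-of-bounds, or doubly freed pointer. */
static bool model_free(int *datum) {
    assert(invariant_holds());
    if (datum == NULL || datum < data || datum >= data + SIZE) {
        return false;
    }
    size_t i = (size_t)(datum - data);
    if (meta_tag[i] == 0) {
        return false; /* double free */
    }
    meta_tag[i] = 0;
    in_unusedT[i] = true; /* plays the role of bless(unusedT, ...) */
    assert(invariant_holds());
    return true;
}
```

Each operation assumes the invariant on entry and re-establishes it on exit, which is the shape of the {P & R} C {Q & R} obligation. Setting in_unusedT for an allocated cell by hand makes invariant_holds return false.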
Yarra Modular Reasoning via Classical Anti-Frame Rule
foo.yarra:
    yarra t1 ... yarra tk
    static t1’ x1 = e1; ... static tj’ xj = ej;
    invariant R;
    t g () requires P ensures Q modifies M {C}
Verification Conditions:
• FV(R) ⊆ {t1,...,tk,x1,...,xj}
• initial_state |= R
• {P & R} C {Q & R}
External Interface:
• {P} g () {Q}
• modifies(g()) = modifies(C) - {x1,...,xj}
Linking the Allocator to a Client
allocator:
    yarra metaT
    yarra unusedT
    static item *data
    alloc { ... }
    free { ... }
client: any client that:
• uses alloc
• uses free
• uses pointer arithmetic
• has buffer overflows
• uses the ordinary heap arbitrarily
• does not have metaT and unusedT in scope
Allocator Module Invariant R:
    forall i:int. 0 <= i < SIZE ==>
      let loc = data+i in
        ((metaT[loc] == 0 && loc ∈ dom(unusedT)) ||
         (metaT[loc] == 1 && loc ∉ dom(unusedT)))
Free vars of R are a subset of the static locals and the yarra heaps
Work In Progress
• Studying examples:
  • nullhttpd vulnerability, ssh vulnerability, telnet vulnerability, ftp vulnerability
  • bget allocator (requires nested yarra struct types and lots of casting), bsdmalloc
• Implementation of C extensions and run-time system:
  • software solution inspired by WIT (Akritidis et al.)
  • would like a hardware-aided solution too
• Developing a program logic:
  • related to recent ideas on “linear maps”
Related Work
• WIT (Write Integrity Testing)
  • alias analysis of C code to assign “colors” to instructions and to objects
  • when an instruction’s static color fails to correspond to an object’s dynamic color, failure is signalled
  • pro: 12% performance overhead
  • pro: no programmer work
  • con: alias analysis doesn’t always work, resulting in few colors or a useless coloring
  • con: no semantics or added reasoning principles
  • con: no separate compilation/support for libraries
Related Work
• Samurai [Pattabiraman, Zorn]
  • similar goals to WIT, but designed to handle unmodified libraries and separate compilation
  • implementation based on replication of critical data as opposed to region checking
• Flicker [Pattabiraman, Zorn]
  • programming system for low-power devices
  • some structures marked “not critical” (low integrity)
  • low integrity data structures stored in faulty, low-power memory
  • substantial power gains for little reliability degradation
• Godstopper Categories [Benton, Birkedal]
Conclusion
• Yarra is an experimental extension to C designed to support and enforce data integrity specifications
• Yarra enables local reasoning, even when linking with arbitrary, possibly buggy modules