Static Analysis of Memory Errors Mooly Sagiv Tel Aviv University
Project Goals • Statically determine that data are used in a sound way • No unexpected software behavior • In C • No undefined semantics (ANSI C) • Prevent bad programming styles • In Java • Certain exceptions will never be raised • Sound analysis • Minimal false alarms
Sample Cleanness Problems • C string-related errors • Unsafe calls to strcpy(), strcat()… • Out-of-bounds references • Pointer arithmetic • Java interface requirements for library usage
String Manipulation Cleanness Checking Nurit Dor & Greta Yorsh http://www.cs.tau.ac.il/~nurr
Are String Violations Common? • FUZZ study (1995) • Random test programs on various systems • 9 different UNIX systems • 18% – 23% hang or crash • 80% are string related errors • CERT advisory • 50% of attacks are abuses of buffer overflows
Example – unsafe call to strcpy()
simple() {
    char s[20];
    char *p;
    char t[10];
    strcpy(s, "Hello");
    p = s + 5;
    strcpy(p, " world!");
    strcpy(t, s);
}
Cleanness is always violated: alloc(t) = 10, len(s) = 12
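The violation reported for simple() can also be observed at run time. The sketch below is only an illustration, not part of CSSV: checked_strcpy and simple_checked are hypothetical names, and the destination size is passed explicitly because plain C cannot query alloc(dst).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only when the cleanness
 * condition alloc(dst) > len(src) holds. Returns 0 on success, -1 otherwise. */
int checked_strcpy(char *dst, size_t dst_alloc, const char *src)
{
    size_t src_len = strlen(src);      /* len(src), excluding the null byte */
    if (dst_alloc <= src_len)          /* no room for the string plus its null */
        return -1;
    memcpy(dst, src, src_len + 1);     /* copy the characters and the null */
    return 0;
}

/* Replays simple() from the slide with checked copies.
 * Returns -1 as soon as one copy would overflow its destination. */
int simple_checked(void)
{
    char s[20];
    char *p;
    char t[10];

    if (checked_strcpy(s, sizeof s, "Hello") != 0)
        return -1;
    p = s + 5;
    if (checked_strcpy(p, sizeof s - 5, " world!") != 0)
        return -1;
    return checked_strcpy(t, sizeof t, s);   /* len(s) = 12, alloc(t) = 10 */
}
```

The first two copies fit, and the last one is rejected on every execution, which is why the slide says the violation is "always" rather than "potentially" present.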
Example – unsafe pointer arithmetic
/* from web2c [strpascal.c] */
void null_terminate(char *s) {
    while ( *s != ' ' )
        s++;
    *s = 0;
}
Cleanness is potentially violated: offset(s) = alloc(buff(s))
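One way to rule out the potential violation in null_terminate() is to bound the scan by the buffer size. This is a minimal sketch under that assumption; null_terminate_bounded and its alloc parameter are hypothetical and do not come from web2c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounded variant: scans at most `alloc` bytes.
 * Returns 0 if a space was found and replaced by the null byte,
 * -1 if the buffer contains no space. */
int null_terminate_bounded(char *s, size_t alloc)
{
    size_t i;
    for (i = 0; i < alloc; ++i) {
        if (s[i] == ' ') {
            s[i] = 0;
            return 0;
        }
    }
    return -1;   /* the unbounded loop would have walked past the buffer here */
}
```

The -1 case corresponds to the situation the analysis warns about: the pointer reaches offset(s) = alloc without having seen a space.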
Complicated Example
/* from web2c [fixwrites.c] */
#define BUFSIZ 1024
char buf[BUFSIZ];
char *insert_long(char *cp) {
    char temp[BUFSIZ];
    …
    for (i = 0; &buf[i] < cp; ++i)
        temp[i] = buf[i];
    strcpy(&temp[i], "(long)");
    strcpy(&temp[i+6], cp);
    …
[Figure: cp points into buf; temp holds the copied prefix of buf followed by "(long)"]
Cleanness is potentially violated: 7 + offset(cp) ≥ BUFSIZ
Complicated Example
/* from web2c [fixwrites.c] */
#define BUFSIZ 1024
char buf[BUFSIZ];
char *insert_long(char *cp) {
    char temp[BUFSIZ];
    …
    for (i = 0; &buf[i] < cp; ++i)
        temp[i] = buf[i];
    strcpy(&temp[i], "(long)");
    strcpy(&temp[i+6], cp);
    …
[Figure: cp points into buf; temp holds the copied prefix of buf followed by "(long)"]
Cleanness is potentially violated: offset(cp) + 7 + len(cp) ≥ BUFSIZ
(given 7 + offset(cp) < BUFSIZ)
Vulnerable String Manipulation • Pointers to buffers: char *p = buffer; … while (…) p++; • Standard string manipulation functions: strcpy(), strcat(), … • NULL termination: strncpy(), …
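The NULL-termination item deserves a concrete case: strncpy() writes no terminating null byte when the source has at least as many characters as the count it is given. The sketch below shows one common remedy; copy_terminated is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strncpy followed by explicit termination.
 * Truncates src if needed; dst is always null-terminated when dst_alloc > 0.
 * Returns 1 if src did not fit completely, 0 otherwise. */
int copy_terminated(char *dst, size_t dst_alloc, const char *src)
{
    if (dst_alloc == 0)
        return 1;                      /* nothing can be stored at all */
    strncpy(dst, src, dst_alloc - 1);  /* copies at most dst_alloc - 1 characters */
    dst[dst_alloc - 1] = '\0';         /* strncpy alone may leave this byte unset */
    return strlen(src) >= dst_alloc;
}
```

For example, strncpy(d, "Hello", 5) into a 5-byte array d leaves d without any null byte, so a later strlen(d) would read past the array; copy_terminated stores "Hell" instead and reports the truncation.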
C Static String Verifier (CSSV) Objectives • Modular analysis • Procedure pre-condition/post-condition/mod • Automatically generate procedure specification • Handle full C • Multi-level pointers • Structures • Reduce complexity of transformation • Linear in the number of variables
[Figure: CSSV architecture. Labels: C files; Pre, Mod, Post; procedure name; Pointer Analysis; procedure's pointer info; C2IP; integer procedure; AWP; Integer Analysis; potential error messages]
Advantages of Procedure Specification • Modular analysis • Not all the code is available • Enables more expensive analyses • User control of the verification • Detect errors at point of logical error • Improve the precision of the analysis • Check additional properties • Beyond ANSI-C
Specification and Soundness • All errors are detected • Violation of procedure’s precondition • Call • Violation of procedure's postcondition • Return • Violation of statement’s precondition • …a[i]…
Specification – strcpy
char* strcpy(char* dst, char *src)
    requires ( string(src) ∧ alloc(dst) > len(src) )
    mod len(dst), is_nullt(dst)
    ensures ( len(dst) == pre@len(src) ∧ return == pre@dst )
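The strcpy contract can be read as run-time assertions around the call. The wrapper below is a hedged sketch of that reading, not how CSSV works (CSSV checks the contract statically): strcpy_contract is a hypothetical name, dst_alloc stands in for alloc(dst), and string(src) is assumed rather than checked.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper. `dst_alloc` stands in for alloc(dst), which plain C
 * cannot query; src is assumed to be null-terminated (string(src)). */
char *strcpy_contract(char *dst, size_t dst_alloc, const char *src)
{
    size_t pre_len_src = strlen(src);    /* pre@len(src) */
    char *pre_dst = dst;                 /* pre@dst */
    char *ret;

    assert(dst_alloc > pre_len_src);     /* requires: alloc(dst) > len(src) */

    ret = strcpy(dst, src);              /* mod: len(dst), is_nullt(dst) */

    assert(strlen(dst) == pre_len_src);  /* ensures: len(dst) == pre@len(src) */
    assert(ret == pre_dst);              /* ensures: return == pre@dst */
    return ret;
}
```

A failed requires assertion points at the caller, while a failed ensures assertion would point at the implementation, which mirrors the split between call-site and return-site violations.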
Specification – insert_long()
/* insert_long.c */
#include "insert_long.h"
char buf[BUFSIZ];
char * insert_long (char *cp) {
    char temp[BUFSIZ];
    int i;
    for (i = 0; &buf[i] < cp; ++i) {
        temp[i] = buf[i];
    }
    strcpy (&temp[i], "(long)");
    strcpy (&temp[i + 6], cp);
    strcpy (buf, temp);
    return cp + 6;
}
char * insert_long(char *cp)
    requires ( string(cp) ∧ buf ≤ cp < buf + BUFSIZ )
    mod cp.strlen
    ensures ( len(cp) == pre[len(cp) + 6] ∧ return_value == cp + 6 )
[Figure: CSSV architecture for a leaf procedure. Labels: C files; leaf procedure; Pointer Analysis; procedure's pointer info; C2IP; side effect; Mod; integer procedure; AWP; Pre]
[Figure: CSSV architecture for a leaf procedure. Labels: C files; leaf procedure; Pointer Analysis; procedure's pointer info; C2IP; Pre, Mod, Post; integer procedure; Integer Analysis; potential error messages]
C2IP
Source procedure:
char * insert_long (char *cp) {
    char temp[BUFSIZ];
    int i;
    require string(cp);
    for (i = 0; &buf[i] < cp; ++i) {
        temp[i] = buf[i];
    }
    strcpy(&temp[i], "(long)");
    …
Integer program:
int cp.offset;
int temp.offset = 0;
int stemp.msize = BUFSIZ;
int stemp.len;
int stemp.is_nullt;
int i;
assume(sbuf.is_nullt ∧ 0 ≤ cp.offset ≤ sbuf.len ≤ sbuf.alloc);
for (i = 0; i < cp.offset; ++i) {
    assert(0 ≤ i < stemp.msize ∧ (stemp.is_nullt ⇒ i ≤ stemp.len));
    assert(-i ≤ cp.offset < -i + sbuf.len);
    if (sbuf.is_nullt ∧ sbuf.len == i) {
        stemp.len = i;
        stemp.is_nullt = true;
    } else …
assert(0 ≤ i < stemp.msize - 6);
assume(stemp.len == i + 6);
…
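The idea behind C2IP, replacing each buffer by a few integers and each string operation by checks and updates on them, can be imitated in ordinary C. The sketch below is an illustration under simplifying assumptions, not the actual translation: struct abs_buf, abs_strcpy and abs_insert_long are hypothetical names, concrete integers are used where the real analysis works with linear constraints, and temp is assumed to hold no null byte before the first copy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical integer summary of one buffer, named after the slide's variables. */
struct abs_buf {
    int  msize;     /* allocated size in bytes */
    int  len;       /* index of the first null byte; meaningful only if is_nullt */
    bool is_nullt;  /* does the buffer hold a null-terminated string? */
};

/* Integer-level effect of strcpy(&dst[offset], src): checks the precondition
 * and, if it holds, updates the summary of dst.
 * Returns false when a potential error would be reported. */
bool abs_strcpy(struct abs_buf *dst, int offset, const struct abs_buf *src)
{
    if (!src->is_nullt)                         /* string(src) */
        return false;
    if (offset < 0 || offset > dst->msize)      /* destination pointer in bounds */
        return false;
    if (dst->msize - offset <= src->len)        /* alloc(dst) > len(src) */
        return false;
    if (!(dst->is_nullt && dst->len < offset))  /* an earlier null would stay first */
        dst->len = offset + src->len;
    dst->is_nullt = true;
    return true;
}

/* Replays the two strcpy calls of insert_long for i = offset(cp). */
bool abs_insert_long(int bufsiz, int i, int cp_len)
{
    struct abs_buf temp = { bufsiz, 0, false };      /* assumed: no null copied yet */
    struct abs_buf lit  = { 7, 6, true };            /* the literal "(long)" */
    struct abs_buf cp   = { cp_len + 1, cp_len, true }; /* only len matters for a source */

    return abs_strcpy(&temp, i, &lit) && abs_strcpy(&temp, i + 6, &cp);
}
```

With bufsiz = 1024, abs_insert_long succeeds exactly when i + 7 + cp_len ≤ 1024, which ties the two strcpy checks back to the cleanness conditions on offset(cp) and len(cp).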
AWP • Approximate the Weakest Precondition • Backward integer analysis • Generates a precondition
AWP – insert_long() • Generate the following precondition: string(cp) ∧ len(buf) ≤ offset(cp) + 1017 • Not the weakest precondition: string(cp) ∧ len(buf) ≤ 1017
Implementation • Using: • ASToolKit [Microsoft] • GOLF [Microsoft – Manuvir Das] • New Polka [IMAG - Bertrand Jeannet] • Main steps: • Simplifier • Pointer analysis • C2IP • Integer Analysis
Preliminary results (web2C) Up to four times faster than SAS01
The Canvas Project: Component ANnotation, Verification And Stuff. J. Field, D. Goyal, G. Ramalingam, IBM Research http://www.research.ibm.com/menage/canvas
The problem • Class libraries and software components are supposed to • make building complex applications from "parts" easier • make a market for pre-packaged code... • ...but in practice • programming with components is hard • inadequate documentation • lack of source code • increased API complexity (to allow for customization) • Programmers often resort to iterative trial-and-error methods to get components to work in their application
Canvas Goals • The component designers specify component conformance constraints • Develop automated certification tools to determine whether the client satisfies the component's conformance constraints • Focus on Java™ libraries and JavaBeans™
Our Approach • Specify component behavior in a Java-like language (EASL) • Use TVLA to statically analyze the Java heap • Specialize the algorithm for the component
The Concurrent Modification Problem (PLDI’02, Berlin) • Static analysis of Java programs manipulating Java 2 collections • Inconsistent usages of iterators • An Iterator object i defined on a collection object c • No use of i may be preceded by an update to the contents of c, unless the update was also made via i
class Make {
    private Worklist worklist;
    public static void main(String[] args) {
        Make m = new Make();
        m.initializeWorklist(args);
        m.processWorklist();
    }
    void initializeWorklist(String[] args) {
        ...; worklist = new Worklist(); ... // add some items to worklist
    }
    void processWorklist() {
        Set s = worklist.unprocessedItems();
        for (Iterator i = s.iterator(); i.hasNext(); ) {
            Object item = i.next();
            if (...) processItem(item);
        }
    }
    void processItem(Object i) { ...; doSubproblem(...); }
    void doSubproblem(...) { ... worklist.addItem(newitem); ... }
}
public class Worklist {
    Set s;
    public Worklist() { ...; s = new HashSet(); ... }
    public void addItem(Object item) { s.add(item); }
    public Set unprocessedItems() { return s; }
}
EASL Specification
class Version {}
class Collection {
    Version version;
    Collection() { version = new Version(); }
    boolean add(Object o) { version = new Version(); }
    Iterator iterator() { return new Iterator(this); }
}
class Iterator {
    Collection set;
    Version definingVersion;
    Iterator(Collection s) { definingVersion = s.version; set = s; }
    void remove() {
        requires (definingVersion == set.version);
        set.version = new Version();
        definingVersion = set.version;
    }
    Object next() { requires (definingVersion == set.version); }
}
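The version discipline in the EASL specification can be modelled in a few lines of C. This is only a sketch of the idea: all names below are hypothetical, and an integer counter stands in for the fresh Version object that the specification allocates on every update (so, unlike fresh objects, it could in principle wrap around).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C model of the EASL specification. */
struct collection { unsigned version; };
struct iterator   { struct collection *set; unsigned defining_version; };

void collection_init(struct collection *c) { c->version = 0; }

/* Any direct update of the collection yields a new version. */
void collection_add(struct collection *c) { c->version++; }

struct iterator collection_iterator(struct collection *c)
{
    struct iterator it = { c, c->version };   /* definingVersion = s.version */
    return it;
}

/* requires (definingVersion == set.version) */
bool iterator_is_valid(const struct iterator *it)
{
    return it->defining_version == it->set->version;
}

/* remove(): the update is made via the iterator, so the iterator adopts
 * the new version and stays valid. Returns false if the requires clause fails. */
bool iterator_remove(struct iterator *it)
{
    if (!iterator_is_valid(it))
        return false;
    it->set->version++;
    it->defining_version = it->set->version;
    return true;
}
```

In the Make/Worklist example, the call to worklist.addItem() inside the loop plays the role of collection_add(): it changes the version behind the iterator's back, so the next i.next() violates the requires clause.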
Prototype [Figure: prototype tool chain. Labels: Java; Soot; Jimple AST; J2TVP Translator; CFG + actions; EASL; Specialize; action definition; Three Value Logic Analyzer; analysis result; potential cleanness violations]
Conclusion • Ambitious sound analyses • Very few false alarms • Scaling is an issue • Use staged analyses • Use modular analysis • Use encapsulation