
Static Code Checking: Security and Concurrency

Ben Watson, The George Washington University. CS 297 Security and Programming Languages. June 9, 2005.


Presentation Transcript


  1. Static Code Checking: Security and Concurrency Ben Watson The George Washington University CS 297 Security and Programming Languages June 9, 2005

  2. The Video

  3. The Problem • How to discover errors in code without running it • Code can run for weeks or months without displaying the error • Many errors are caused by pieces of code that are very difficult to test • Device drivers – manufacturers aren’t always good at this, and one OS company can’t possibly test all the tens of thousands of devices out there • The Windows 98 crash was caused by a bad scanner driver • Concurrent code—debugging complicated concurrency problems is a nightmare x n.

  4. The Scope • Lines of Code (estimated)

  5. The Real Problem • We’re only human • No person, no group of people can possibly manually debug anything as complicated as an OS and its related pieces • Good tools are not enough • Can’t rely on thorough annotations of entire code base • Can’t rely on manual directions: the more automated the better

  6. The Solutions • MECA: security checking system • RacerX: race condition and deadlock detection • General rule inference from source code

  7. MECA: Statically Checking Security Properties • Checks low-level properties (pointer safety, etc.) • Relies on annotations that propagate through the analysis • Goals • Expressiveness • Low manual overhead: programmers only have to type in relatively few annotations • Few false positives

  8. How MECA Works • Uses a modified GCC compiler • The compiler parses the source and emits an abstract syntax tree (AST) • The AST is used to build a control-flow graph (CFG) • The annotation propagator uses the CFG to propagate annotations through the entire graph • Checkers are run on the completed graph • Results are ranked and filtered

  9. An example • Rule: the OS kernel may not dereference a user pointer directly (there are “paranoid” functions to access the data pointed to by a user pointer) • Such pointers are referred to as “tainted” pointers • Annotate: • Tainted variables, parameters, and fields • Functions that produce tainted values

  10. Source annotations

      struct myStruct {
          /*@ tainted */ int *p;
      };

      /*@ tainted */ int *foo(/*@ tainted */ int *p);

      void memcpy(/*@ !tainted */ void *dst,
                  /*@ !tainted */ void *src,
                  unsigned nbytes);

  11. Source annotations

      // Binding:
      /*@ set_length($ret, sz) */
      void *malloc(unsigned sz);

      // Global: all sys_* calls are tainted
      /*@ global $param ${!strncmp(current_fn, "sys_", 4)} ==> tainted */

  12. Propagation

      void bar(/*@ tainted */ void *p);
      struct S { char *buf; };

      // Before analysis
      void foo(char **p, struct S *s) {
          char *r;
          struct S *ss;
          r = *p;
          bar(r);        // taints r and *p
          ss = s;
          bar(ss->buf);  // taints ss->buf and s->buf
      }

      // At the end of analysis:
      void foo(/*@ tainted (*p) */ char **p, /*@ tainted (s->buf) */ struct S *s);
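The propagation step on the slide above can be approximated with a very small model. The sketch below is an assumption-laden simplification, not MECA's actual propagator: it treats an assignment as making two names denote the same value (a union-find merge) and treats a call to bar() as tainting its argument. The enum names and helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the values that appear in foo() on the slide. */
enum { R, DEREF_P, SS_BUF, S_BUF, UNRELATED, NUM_VALUES };

static int parent[NUM_VALUES];    /* union-find forest over values */
static bool tainted[NUM_VALUES];  /* meaningful only at a class root */

static void init_values(void) {
    for (int i = 0; i < NUM_VALUES; i++) {
        parent[i] = i;
        tainted[i] = false;
    }
}

static int find_root(int v) {
    assert(v >= 0 && v < NUM_VALUES);
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];  /* path halving */
        v = parent[v];
    }
    return v;
}

/* Record `a = b`: both names now belong to one class and share its taint. */
static void record_assignment(int a, int b) {
    int ra = find_root(a), rb = find_root(b);
    if (ra == rb)
        return;
    parent[ra] = rb;
    tainted[rb] = tainted[rb] || tainted[ra];
}

static void mark_tainted(int v) { tainted[find_root(v)] = true; }
static bool is_tainted(int v)   { return tainted[find_root(v)]; }

/* Replays the body of foo() from the slide, statement by statement. */
static void analyze_foo(void) {
    init_values();
    record_assignment(R, DEREF_P);     /* r = *p;                        */
    mark_tainted(R);                   /* bar(r);                        */
    record_assignment(SS_BUF, S_BUF);  /* ss = s; ss->buf aliases s->buf */
    mark_tainted(SS_BUF);              /* bar(ss->buf);                  */
}
```

After analyze_foo() runs, is_tainted(DEREF_P) and is_tainted(S_BUF) both hold, which matches the inferred annotations on foo's parameters, while UNRELATED stays untainted.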

  13. MECA results • On average, one manual annotation led to 682 checks • Linux 2.5.63 Bugs:

  14. RacerX • Static detection of race conditions and deadlocks • Designed to find errors in large, multi-threaded systems • Sorts errors by severity (the hard part) • They checked Linux, FreeBSD, and a mystery OS that has only 500,000 lines of code

  15. Deadlock • Thread 1 has locked resource A • Thread 2 has locked resource B • Thread 1 needs resource B to complete • Thread 2 needs resource A to complete • Neither can proceed: these threads are deadlocked

  16. Race condition • Multiple threads access the same memory • If memory is unprotected: • Two threads can simultaneously write to the same memory (bad) • One thread can read while another writes (bad) • Two threads can simultaneously read from the same memory (probably ok) • It’s a race because the final value depends non-deterministically on which thread gets there first.
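To make the "who gets there first" point concrete, the sketch below replays two schedules of the same increment by hand instead of using real threads, so the outcome is deterministic. The Worker type and the helper names are hypothetical; each worker's `local` field stands in for a CPU register.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int local; } Worker;  /* one thread's private copy */

static void load(Worker *w, const int *shared) {
    assert(w != NULL && shared != NULL);
    w->local = *shared;
}
static void add_one(Worker *w)                 { w->local += 1; }
static void store(const Worker *w, int *shared) { *shared = w->local; }

/* Worker a finishes its read-modify-write before worker b starts. */
static int run_serialized(void) {
    int counter = 0;
    Worker a, b;
    load(&a, &counter); add_one(&a); store(&a, &counter);
    load(&b, &counter); add_one(&b); store(&b, &counter);
    return counter;
}

/* Both workers read the old value before either one writes back. */
static int run_interleaved(void) {
    int counter = 0;
    Worker a, b;
    load(&a, &counter);
    load(&b, &counter);
    add_one(&a); store(&a, &counter);
    add_one(&b); store(&b, &counter);  /* overwrites a's update */
    return counter;
}
```

run_serialized() yields 2 and run_interleaved() yields 1: one increment is lost purely because of the order in which the unprotected reads and writes happened.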

  17. Avoiding the Problem • If data is never accessed by more than one thread, you don’t have to worry about concurrency • If program logic ensures that only one thread accesses data, you don’t need to worry about locking the data • If you’re writing a shared component, you almost always have to worry about concurrency

  18. Algorithm • The “lockset” algorithm detects both types of problems • Lockset: the set of locks held at a given point, where locks are acquired and released by paired calls such as • Lock()/Unlock() • InterruptDisable()/InterruptEnable() • Etc.

  19. Algorithm • Top-down analysis of the control-flow graph • Add/remove locks from the lockset as they are acquired/released • Check for race/deadlock on each statement • Cache results to cope with the exponential number of paths through the graph
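The add/remove/check loop described above can be sketched for a single straight-line path. This is a deliberately reduced illustration and not RacerX itself: there is no control-flow graph, no interprocedural analysis, and no caching, and the Stmt type and check_path function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_LOCK, OP_UNLOCK, OP_ACCESS } OpKind;

typedef struct {
    OpKind kind;
    int id;  /* lock number for OP_LOCK/OP_UNLOCK, variable number for OP_ACCESS */
} Stmt;

/* Walks one path, tracking the lockset as a bitmask. Stores the indices of
 * accesses made while no lock is held into `flagged` (up to max_flagged
 * entries) and returns how many were stored. */
static size_t check_path(const Stmt *path, size_t n,
                         size_t *flagged, size_t max_flagged) {
    unsigned lockset = 0;  /* bit i set means lock i is currently held */
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        assert(path[i].id >= 0 && path[i].id < 32);
        unsigned bit = 1u << path[i].id;

        switch (path[i].kind) {
        case OP_LOCK:
            lockset |= bit;
            break;
        case OP_UNLOCK:
            lockset &= ~bit;
            break;
        case OP_ACCESS:
            if (lockset == 0 && count < max_flagged)
                flagged[count++] = i;
            break;
        }
    }
    return count;
}
```

For the path lock(0), access(7), unlock(0), access(7), only the final access (index 3) is flagged, because it is the only one made with an empty lockset.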

  20. Deadlock Check • Basically, finds whether there are cycles in the lockset dependencies • If lock a is obtained, then lock b, we have: • a → b • Following this line of reasoning, we can discover cases that look like this: • a → b → c → a
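Finding a chain such as a → b → c → a is ordinary cycle detection on a directed graph. The sketch below assumes a small fixed number of locks, stores "b was acquired while a was held" in an adjacency matrix, and uses a depth-first search; all names are hypothetical and the real tool's data structures are not described on the slides.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8

/* acquired_after[a][b] is true when lock b was taken while a was held (a → b). */
static bool acquired_after[MAX_LOCKS][MAX_LOCKS];

static void reset_orders(void) {
    for (int a = 0; a < MAX_LOCKS; a++)
        for (int b = 0; b < MAX_LOCKS; b++)
            acquired_after[a][b] = false;
}

static void record_order(int a, int b) {
    assert(a >= 0 && a < MAX_LOCKS && b >= 0 && b < MAX_LOCKS);
    acquired_after[a][b] = true;
}

/* color: 0 = unvisited, 1 = on the current DFS stack, 2 = fully explored */
static bool dfs_finds_cycle(int lock, int color[MAX_LOCKS]) {
    color[lock] = 1;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (!acquired_after[lock][next])
            continue;
        if (color[next] == 1)  /* back edge: a lock-order cycle */
            return true;
        if (color[next] == 0 && dfs_finds_cycle(next, color))
            return true;
    }
    color[lock] = 2;
    return false;
}

static bool has_lock_cycle(void) {
    int color[MAX_LOCKS] = { 0 };
    for (int lock = 0; lock < MAX_LOCKS; lock++)
        if (color[lock] == 0 && dfs_finds_cycle(lock, color))
            return true;
    return false;
}
```

Recording 0 → 1 and 1 → 2 produces no cycle; adding 2 → 0 closes the loop and has_lock_cycle() returns true.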

  21. Deadlock Check • Deciding how important a cycle is turns out to be non-trivial • Basically, rank a cycle higher if it involves: • Global locks rather than local locks • A small depth difference rather than a big one • Fewer threads rather than more threads
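One way to turn those three preferences into an ordering is a lexicographic comparator. The slides do not say how the criteria are weighted against each other, so the priority order below (global locks first, then depth difference, then thread count) is an assumption, as are the DeadlockReport fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    int id;                 /* report identifier */
    bool uses_global_lock;  /* cycle involves a global lock */
    int depth_difference;   /* smaller is considered more credible */
    int thread_count;       /* fewer threads is considered more credible */
} DeadlockReport;

/* Orders reports so the most credible ones come first. */
static int compare_reports(const void *lhs, const void *rhs) {
    const DeadlockReport *a = lhs;
    const DeadlockReport *b = rhs;

    if (a->uses_global_lock != b->uses_global_lock)
        return a->uses_global_lock ? -1 : 1;
    if (a->depth_difference != b->depth_difference)
        return a->depth_difference < b->depth_difference ? -1 : 1;
    if (a->thread_count != b->thread_count)
        return a->thread_count < b->thread_count ? -1 : 1;
    return 0;
}

static void rank_reports(DeadlockReport *reports, size_t n) {
    assert(reports != NULL || n == 0);
    qsort(reports, n, sizeof reports[0], compare_reports);
}
```

A real ranking would likely combine more signals, but even this simple ordering pushes local-lock, deep, many-thread reports to the bottom of the list, where false positives are cheaper to ignore.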

  22. Race Checking • This is even harder than deadlock detection • Must answer: • Is the lockset valid? (If not, you will have LOTS of false positives) • Can the unprotected memory be accessed by more than one thread? • Does the access need to be protected? • Two reads do not a wrong make • Must annotate API functions that require locks

  23. Race Checking • Deciding if code is multithreaded: • Inferred from “programmer belief” – if a piece of code contains concurrency-related statements, the code is probably multi-threaded • Annotations—designate API functions as requiring locks

  24. Race Checking • Does memory need to be protected? • If it’s never written to, no. • If it’s only written on initialization, no. • On a certain code path, if there is a high number of variables that are potentially written to concurrently, probably. • Anything that can’t be written atomically, yes. (Although this is pretty much anything, especially if you have more than 1 CPU.) • If a variable is statistically likely to be protected by locking code (“Programmer Belief”), probably.
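The "statistically likely to be protected" test can be illustrated with a simple ratio. The sketch below is an assumption, not the statistic the papers use: it counts how often a variable is accessed with and without a lock, infers a belief when the locked fraction reaches a caller-chosen threshold, and flags the variable only if some accesses violate that belief. AccessCounts and both functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;      /* variable name, for reporting */
    int locked_accesses;   /* accesses made while some lock was held */
    int unlocked_accesses; /* accesses made with no lock held */
} AccessCounts;

/* Infers the belief "this variable should be locked" from usage counts. */
static bool probably_needs_lock(const AccessCounts *c, double threshold) {
    assert(c != NULL);
    int total = c->locked_accesses + c->unlocked_accesses;
    if (total == 0)
        return false;
    return (double)c->locked_accesses / total >= threshold;
}

/* A variable is worth inspecting when the belief holds but is violated. */
static bool is_suspicious(const AccessCounts *c, double threshold) {
    return probably_needs_lock(c, threshold) && c->unlocked_accesses > 0;
}
```

With a threshold of 0.8, a variable locked in 9 of 10 accesses is flagged for its one unlocked access, while a variable locked in only 1 of 10 accesses is assumed not to need a lock at all.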

  25. RacerX: Results

  26. Pop Quiz – Question 1 • If you have read the 3rd paper, you may not answer this question. • Find the bug:

      if (card == NULL) {
          printk(KERN_ERR "capidrv-%d: … %d!\n", card->contrnr, id);
      }

  27. Pop Quiz – Answer 1

      if (card == NULL) {
          // Bug: card is known to be NULL here, yet card->contrnr dereferences it
          printk(KERN_ERR "capidrv-%d: … %d!\n", card->contrnr, id);
      }

  28. Pop Quiz – Question 2 • If you have read the 3rd paper, you may not answer this question. • Find the bug:

      struct mxser_struct *info = tty->driver_data;
      unsigned long flags;

      if (!tty || !info->xmit_buf)
          return 0;

  29. Pop Quiz – Answer 2

      struct mxser_struct *info = tty->driver_data;  // Bug: tty is dereferenced here...
      unsigned long flags;

      if (!tty || !info->xmit_buf)                   // ...before it is checked for NULL
          return 0;
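For comparison, a consistent version of the second quiz snippet checks the pointer before using it. The sketch below is a guess at what a fix could look like, not the actual kernel patch: the two struct definitions are hypothetical, heavily reduced stand-ins for the kernel types, the function name is invented, and the extra check on info is an added precaution.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types used on the slide. */
struct mxser_struct { unsigned char *xmit_buf; };
struct tty_struct   { void *driver_data; };

/* Returns 1 when the transmit buffer can be used, 0 otherwise. */
static int mxser_has_xmit_buf(struct tty_struct *tty) {
    struct mxser_struct *info;

    if (!tty)                 /* check first ... */
        return 0;
    info = tty->driver_data;  /* ... dereference second */
    if (!info || !info->xmit_buf)
        return 0;
    return 1;
}

/* Usage check: the NULL case that the quiz code would have dereferenced. */
static void check_null_tty_is_rejected(void) {
    assert(mxser_has_xmit_buf(NULL) == 0);
}
```

The check and the dereference now agree about whether tty can be NULL, which is exactly the kind of internal consistency the inference approach looks for.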

  30. General Methodology • Take advantage of programmer beliefs • Statistics are our friend • If something is usually done a certain way, then instances that violate that pattern should be examined • Check internal consistency • Discover rules that are built into the code • Minimal to no annotation

  31. Conclusion • The methods presented tonight provide some of the best ways to find errors: • Millions of lines of code can be checked with at most hundreds of lines of annotations • The bugs these methods find are fairly specific in nature (they revolve around well-structured code constructs)

  32. References • Junfeng Yang, Ted Kremenek, Yichen Xie, and Dawson Engler. MECA: an Extensible, Expressive System and Language for Statically Checking Security Properties. ACM CCS, 2003. • Dawson Engler and Ken Ashcraft. RacerX: Effective, Static Detection of Race Conditions and Deadlocks. SOSP, 2003. • Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code. SOSP, 2001. • Source Lines of Code, http://www.answers.com/topic/source-lines-of-code • Concurrency – Part 2: Avoiding the Problem, http://blogs.msdn.com/larryosterman/archive/2005/02/15/373460.aspx
