An Overview of the Saturn Project Saturn Overview

The Three-Way Trade-Off
• Precision (today’s focus)
  • Modeling programs accurately enough to be useful
• Scalability (today’s focus)
  • Saying anything at all about large programs
• Human Effort (not so much about this . . .)
  • How much work must the user do?
  • Either giving specifications, or interpreting results

Precision
• Primary abstraction is done at function boundaries.
• Intraprocedural analysis with minimal abstraction.
[Diagram: a function int f(int x) { . . . } with its formula Ff, and the abstractions A(Ff), A(Fg), A(Fh) used at function boundaries]

Scalability
• Design constraint: SAT formula size ~ function size
• Analyze one function at a time
• Parallel implementation
  • Server sends functions to clients to analyze
  • Typically use 50-100 cores to analyze Linux

Summaries
• Abstract at function boundaries
  • Compute a summary of each function’s behavior
• Summaries should be small
  • Ideally linear in the size of the function’s interface
• Summaries are our primary form of abstraction
  • Saturn delays abstraction to function boundaries
Slogan: Analysis design is summary design!

Expressiveness
• Analyses written in Calypso
• Logic programs
  • Express traversals of the program
  • E.g., backwards/forwards propagation
• Constraints
  • For when we don’t know traversal order
• Written ~40,000 lines of Calypso code

Availability
• An open source project
  • BSD license
  • All Calypso code available for published experiments
saturn.stanford.edu

People
• Brian Hackett
• Isil Dillig
• Suhabe Bugrara
• Yichen Xie (past)
• Alex Aiken
• Thomas Dillig
• Peter Hawkins

Outline
• Saturn overview
• An example analysis
  • Intraprocedural
  • Interprocedural
• What else can you do?
  • Survey of results

Saturn Architecture
[Diagram components: C Program, C Frontend, C Syntax Databases, Calypso analyses, Calypso Interpreter, Constraint Solvers, Summary Databases, Summary Reports, UI]

Parsing and C Frontend
[Diagram components: Source Code, Build Interceptor, Preprocessed Source Code, CIL frontend, other possible frontends, Abstract Syntax Tree Databases]

Calypso
• General purpose logic programming language
  • Pure
  • Prolog-like syntax
• Bottom-up evaluation
  • Magic sets transformation
• Also a (minor) moon of Saturn
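
Bottom-up evaluation can be illustrated outside Calypso. The sketch below is in Python rather than Calypso so that it runs as-is. It derives every consequence of two Prolog-style rules by applying them repeatedly until no new facts appear. The predicates edge and reach and the function bottom_up_reach are hypothetical names chosen for illustration; this is not Saturn's interpreter, and the magic sets transformation is not shown.

```python
def bottom_up_reach(edges):
    """Naive bottom-up evaluation of the rules
         reach(X, Y) :- edge(X, Y).
         reach(X, Z) :- reach(X, Y), edge(Y, Z).
    Starts from the base facts and adds derived facts until a fixpoint."""
    reach = set(edges)
    while True:
        derived = {(x, z) for (x, y) in reach for (y2, z) in edges if y == y2}
        new_facts = derived - reach
        if not new_facts:
            return reach
        reach |= new_facts


facts = bottom_up_reach({("a", "b"), ("b", "c"), ("c", "d")})
print(sorted(facts))
```

A top-down (Prolog-style) evaluator would instead start from a query; bottom-up evaluation computes all facts, which is why a transformation that restricts it to the relevant ones is useful.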

Helpful Features
• Strong static type and mode checking
• Permanent data (sessions)
  • Stored as Berkeley DB databases
  • A session is just a named bundle of predicates
• Support for unit-at-a-time analysis

Extensible Interpreter
[Diagram components: Logic Program Interpreter, SAT Solver (#sat predicate, …), LP Solver, DOT graph package, UI package]

Scalability
• Interpreter is not very efficient
  • OK, it’s slow
• But can run distributed analyses
  • 50-100 CPUs
• Scalability is more important than raw speed
  • Can run intensive analyses of the entire Linux kernel (>6 MLOC) in a few hours.

Cluster Architecture
[Diagram components: Master Node, Databases, Worker Node 1 through Worker Node 100, each worker with Calypso and a DB]

Job Scheduling
• Job = a function body
• Dynamically track dependencies between jobs
  • Rerun jobs if new dependencies found
  • Optimistic concurrency control
• Iterate to fixpoint for circular dependencies
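
The scheduling idea can be sketched as a sequential worklist in Python. This is a minimal sketch under stated assumptions: it runs on one machine, it omits the distribution and optimistic concurrency control, and the names run_to_fixpoint, analyze, lookup and CALLS are hypothetical. The toy "analysis" computes, for each function, the set of functions it may call transitively; dependencies are recorded only when a job actually reads a callee's summary, and a job is rerun whenever a summary it read has changed.

```python
from collections import deque

# Hypothetical call graph with a cycle between f and g.
CALLS = {"main": ["f"], "f": ["g"], "g": ["f", "lock"], "lock": []}


def analyze(fn, lookup):
    """Toy per-function job: direct callees plus whatever their summaries say."""
    out = set()
    for callee in CALLS[fn]:
        out.add(callee)
        out |= lookup(callee)
    return frozenset(out)


def run_to_fixpoint(jobs, analyze):
    summaries = {}
    dependents = {}  # callee -> callers that have read its summary
    queue = deque(jobs)
    queued = set(queue)
    runs = 0
    while queue:
        fn = queue.popleft()
        queued.discard(fn)

        def lookup(callee, caller=fn):
            # The dependency is discovered dynamically, when the summary is read.
            dependents.setdefault(callee, set()).add(caller)
            return summaries.get(callee, frozenset())

        new_summary = analyze(fn, lookup)
        runs += 1
        if new_summary != summaries.get(fn):
            summaries[fn] = new_summary
            # Rerun every job that used the now-stale summary.
            for caller in dependents.get(fn, ()):
                if caller not in queued:
                    queue.append(caller)
                    queued.add(caller)
    return summaries, runs


summaries, runs = run_to_fixpoint(CALLS, analyze)
print(summaries["main"], runs)
```

The loop terminates here because the toy summaries only grow and the set of functions is finite; the circular dependency between f and g is resolved by iterating until neither summary changes.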

Calypso Analyses
[Diagram components: C Syntax Predicates, CFG Construction, Constraint Solvers, Memory Model, Function Pointer Analysis, Alias Analysis, Typestate verifier, NULL checker]

The Paradigmatic Locking Analysis
Check that a thread does not:
• acquire the same lock twice
• release the lock twice
Otherwise the application may deadlock or crash.
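
Specification
[State machine diagram with states unlocked, locked, and error]
• unlocked --lock--> locked
• locked --unlock--> unlocked
• locked --lock--> error
• unlocked --unlock--> error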

Basic Setup
• We assume
  • one locking function lock(l)
  • one unlocking function unlock(l)
• We analyze one function at a time
  • produce a locking summary describing the FSM transitions associated with a given lock

An Example Function & Summary

f( . . ., lock *L, . . .) {
    lock(L);
    . . .
    unlock(L);
}

L: unlocked -> unlocked
   locked -> error

• Summaries are input state -> output state
  • The net effect of the function on the lock
• Summary size is independent of function size
  • Bounded by the square of the number of states
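
The "net effect" reading of a summary can be checked with a small Python sketch. It assumes, for illustration only, that each primitive is a deterministic map from input state to output state and that error is absorbing; the names STATES, LOCK, UNLOCK and compose are hypothetical. Composing the effect of lock with the effect of unlock reproduces the summary shown for f.

```python
STATES = ("locked", "unlocked", "error")

# Effect of each primitive on the lock state (error treated as absorbing).
LOCK = {"locked": "error", "unlocked": "locked", "error": "error"}
UNLOCK = {"locked": "unlocked", "unlocked": "error", "error": "error"}


def compose(first, second):
    """Net effect of running `first` and then `second`."""
    return {s0: second[first[s0]] for s0 in STATES}


summary_f = compose(LOCK, UNLOCK)
print(summary_f)
```

However long the body of f is, the result still has one entry per input state. A summary that is a relation rather than a map, as in the general case, has at most one entry per (input, output) pair, which is where the square-of-the-number-of-states bound comes from.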

Lock States
type lockstate ::= locked | unlocked | error.
• Predicates to describe lock states on nodes and edges of the CFG:
predicate node_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
predicate edge_state(P:pp,L:t_trace,S:lockstate,G:g_guard).
• Program point: pp is a unique id for each point in the program
• Trace: t_trace is a unique name for a memory location
• Guard: g_guard is a boolean constraint

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
  • Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
  • assignments
  • tests
  • function calls

Initializing a Lock
• Use a fresh boolean variable LG
• Interpretation:
  • LG is true ⇒ L is locked
  • ¬LG is true ⇒ L is unlocked
• Enforces that L cannot be both locked and unlocked simultaneously

Notation
(lock, state, guard) at P
At program point P, the lock is in state if guard is true.

Initialization Rules

node_state(P0,L,locked,LG) :-
    entry(P0), is_lock(L), fresh_variable(L, LG).
node_state(P0,L,unlocked,UG) :-
    entry(P0), node_state(P0,L,locked,LG), #not(LG, UG).

fresh_variable allocates a new boolean variable associated with lock L.

f( . . ., lock *L, . . .) { . . . }
P0: (L, locked, LG) (L, unlocked, UG)

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
  • Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
  • assignments
  • tests
  • function calls

Joins
[Diagram: the two branches of if (…) carry (L, locked, F1) and (L, locked, F2); after the join the state is (L, locked, F1 ∨ F2)]

node_state(P,L,S,G) :-
    edge_state(P,L,S,_),
    \/edge_state(P,L,S,EG):#or_all(EG,G).

Note: There is no abstraction in the join . . .
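
The join can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the Calypso implementation: guards are represented as nested tuples such as ("or", "F1", "F2") over variable names, and join and evaluate are hypothetical helper names. For each lock and state, the node guard is the disjunction of the guards on the incoming edges, so no path information is discarded.

```python
from collections import defaultdict


def join(edge_states):
    """edge_states: list of (lock, state, guard) facts on edges entering a node.
    Returns {(lock, state): guard}, OR-ing the guards of matching edges."""
    grouped = defaultdict(list)
    for lock, state, guard in edge_states:
        grouped[(lock, state)].append(guard)
    return {
        key: (("or",) + tuple(guards)) if len(guards) > 1 else guards[0]
        for key, guards in grouped.items()
    }


def evaluate(guard, env):
    """Evaluate a guard under an assignment of booleans to variable names."""
    if isinstance(guard, str):
        return env[guard]
    op, *args = guard
    if op == "not":
        return not evaluate(args[0], env)
    if op == "or":
        return any(evaluate(a, env) for a in args)
    if op == "and":
        return all(evaluate(a, env) for a in args)
    raise ValueError(f"unknown operator: {op}")


incoming = [
    ("L", "locked", "F1"),
    ("L", "locked", "F2"),
    ("L", "unlocked", ("not", "F1")),
]
node = join(incoming)
print(node[("L", "locked")])
```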

The Intraprocedural Analysis
1. Initialize lock states at function entry
2. Join operator:
  • Combine edges to produce successor’s node_state
3. Transfer functions for every primitive:
  • assignments
  • function calls
  • etc.

Assignments
Assignments do not affect lock state:

edge_state(P1,L,S,G) :-
    assign(P0,P1,_), node_state(P0,L,S,G).

P0: (L, S, G)
X = E;
P1: (L, S, G)

Interprocedural Analysis Basics
• Function summaries are the building blocks of interprocedural analysis.
• Generating a function summary requires:
  • Predicates encoding relevant facts
  • A session to store these predicates.

Interprocedural Analysis Outline
1. Generating function summaries
2. Using function summaries
  • How do we retrieve the summary of a callee?
  • How do we map facts associated with a callee to the namespace of the currently analyzed function?

Summary Declaration

session sum_locking(FN:string) containing[lock_trans].
predicate lock_trans(L: t_trace, S0: lockstate, S1: lockstate).

Declares a persistent database sum_locking(function name) holding lock_trans facts.

Summary Generation: Primitives
Summaries for lock and unlock:

sum_locking("lock")->lock_trans(*arg0,locked,error) :- .
sum_locking("lock")->lock_trans(*arg0,unlocked,locked) :- .
sum_locking("unlock")->lock_trans(*arg0,unlocked,error) :- .
sum_locking("unlock")->lock_trans(*arg0,locked,unlocked) :- .

*arg0 is the memory location modified by lock and unlock.

Summary Generation: Other Functions

sum_locking(F)->lock_trans(L, S0, S1) :-
    current_function(F),
    entry(P0), node_state(P0, L, S0, G0),
    exit(P1), node_state(P1, L, S1, G1),
    #and(G0, G1, G), guard_satisfiable(G).

F( . . ., lock *L, . . .) { . . . }
P0: (L, S0, G0)
P1: (L, S1, G1)
If SAT(G0 ∧ G1), then F: S0 → S1
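
The generation rule pairs every entry state with every exit state and keeps the pair when the conjunction of their guards is satisfiable. The Python sketch below illustrates this under stated assumptions: guards are plain Python predicates over an assignment, the satisfiability check is brute-force enumeration standing in for the SAT solver, and satisfiable and summarize are hypothetical names. The example states are those of a function whose body is lock(L); unlock(L).

```python
from itertools import product


def satisfiable(guard, variables):
    """Brute-force stand-in for a SAT query: try every assignment."""
    return any(
        guard(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def summarize(entry_states, exit_states, variables):
    """entry_states / exit_states: lists of (lock, state, guard).
    Emits (lock, s0, s1) whenever the entry and exit guards can hold together."""
    summary = set()
    for lock0, s0, g0 in entry_states:
        for lock1, s1, g1 in exit_states:
            if lock0 == lock1 and satisfiable(
                lambda env: g0(env) and g1(env), variables
            ):
                summary.add((lock0, s0, s1))
    return summary


def locked_on_entry(env):
    return env["LG"]


def unlocked_on_entry(env):
    return not env["LG"]


entry = [("L", "locked", locked_on_entry), ("L", "unlocked", unlocked_on_entry)]
# After lock(L); unlock(L): error if L was locked on entry, unlocked otherwise.
exit_ = [("L", "error", locked_on_entry), ("L", "unlocked", unlocked_on_entry)]

result = summarize(entry, exit_, ["LG"])
print(sorted(result))
```

The pair (locked, unlocked) is rejected because its guard LG ∧ ¬LG is unsatisfiable, which is how path-sensitive information inside the function prunes the summary.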

Summary Application Rule

call_transfer(I, L, S0, S1, G) :-
    direct_call(I, F), call(P0, _, I),
    sum_locking(F)->lock_trans(CL, S0, S1),
    instantiate(s_call{I}, P0, CL, L, G).

G( . . .) { F(. . .) }
P0: (L, S0, G)
F: S0 → S1
After the call: (L, S1, G)
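
Applying a summary at a call site can be sketched in Python as follows. This is a simplified illustration, not the Calypso rule itself: the mapping from callee names to caller names is a plain dictionary (a hypothetical stand-in for instantiate, which in the rule above also yields a guard), guards are opaque strings, and apply_summary is a hypothetical name. Locks that the callee summary does not mention are not handled here.

```python
def apply_summary(states_before, callee_summary, actuals):
    """states_before: set of (lock, state, guard) facts at the call site.
    callee_summary: set of (callee_lock, s0, s1) transitions.
    actuals: maps callee trace names (e.g. "*arg0") to caller trace names."""
    after = set()
    for callee_lock, s0, s1 in callee_summary:
        caller_lock = actuals.get(callee_lock)
        for lock, state, guard in states_before:
            if lock == caller_lock and state == s0:
                # Same guard, new state: the callee's net effect is applied.
                after.add((lock, s1, guard))
    return after


# Summary of the primitive lock, as given for sum_locking("lock").
LOCK_SUMMARY = {("*arg0", "locked", "error"), ("*arg0", "unlocked", "locked")}

before = {("L", "locked", "LG"), ("L", "unlocked", "not LG")}
after = apply_summary(before, LOCK_SUMMARY, {"*arg0": "L"})
print(sorted(after))
```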

Applications
• Bug finding
• Verification
• Software Understanding

Saturn Bug Finding
• Early work
  • Locking
    • Scalable Error Detection using Boolean Satisfiability. POPL 2005
  • Memory leaks
    • Context- and Path-Sensitive Memory Leak Detection. FSE 2005
  • Scripting languages
    • Static Detection of Security Vulnerabilities in Scripting Languages. 15th USENIX Security Symposium, 2006
• Recent work
  • Inconsistency Checking
    • Static Error Detection Using Semantic Inconsistency Inference. PLDI 2007

Examples: Null pointer dereferences

Lessons Learned
• Saturn-based tools improve bug-finding
  • Multiple times more bugs than previous results
  • Lower false positive rate
• Why?
  • “Sounder” than previous bug finding tools
    • bit-level modeling, handling casts, aliasing, etc.
  • Precise
    • Fully intraprocedurally path-sensitive
    • Partially interprocedurally path-sensitive

Lessons Learned (Cont.)
• Design of function summary is key to scalability and precision
• Summary-based analysis only looks at the relevant parts of the heap for a given function
• Programmers write functions with simple interfaces

Saturn Verification
• Unchecked user pointer dereferences
  • Important OS security property
  • Also called “probing” or “user/kernel pointers”
• Precision requirements
  • Context-sensitive
  • Flow-sensitive
  • Field-sensitive
  • Intraprocedurally path-sensitive

Current Results for Linux-2.6.1
• 6.2 MLOC with 91,543 functions
• Verified 616 / 627 system call arguments
  • 98.2%
  • 11 false alarms
• Verified 851,686 / 852,092 dereferences
  • 99.95%
  • 406 false alarms

Preliminary Lessons Learned
• Bug finders can be sloppy: ignore functions or points-to edges that inhibit scalability or precision
• Soundness is substantially more difficult than finding bugs
• Lightweight, sparsely placed annotations
  • Have programmers add some information
  • Makes verification tractable
  • Only 22 annotations needed for user pointer analysis

Saturn for Software Understanding
• A program analysis is a code search engine
• Generic question: Do programmers ever do X?
  • Write an analysis to find out
  • Run it on lots of code
  • Classify the results
  • Write a paper . . .

Examples
• Aliasing is used in very stylized ways, at least in C
  • Cursors into data structures
  • Parent/child pointers
  • And 7 other idioms
  • How is Aliasing Used in Systems Software? FSE 2006
• Do programmers take the address of function ptrs?
  • Answer: Almost never.
  • Allows simpler analysis of function pointers

Other Things We’ve Thought About
• Shape analysis
  • We notice the lack of shape information
• Interprocedural path-sensitivity
  • Needed for some common programming patterns
• Proving correctness of Saturn analyses

Related Work
• Lots
  • All bug finding and verification tools of the last 10 years
• Particularly, though
  • Systems using logic programming (bddbddb)
  • ESP
  • Metal
  • CQual
  • Blast

saturn.stanford.edu