Explore techniques and challenges of ESP program analysis for reliable software development. Learn about managing resources, scalability, and ensuring program correctness. Case studies and flow-sensitive dataflow analysis included.
CleanL TSys DSL DFA WP/SP MC ATP ESP [Das et al PLDI 2002] Interface usage rules in documentation • Order of operations, data access • Resource management • Incomplete, wordy, not checked Violated rules ⇒ crashes • Failed runtime checks • Unreliable software
ESP [Das et al PLDI 2002] (Figure: a C program and a set of rules are given to ESP, which answers Safe or Not Safe)
ESP [Das et al PLDI 2002] • ESP is a program analysis that keeps track of object state at each program point • e.g.: is a file handle open or closed? • Challenge: scale to large programs • One scalability issue: merge nodes • Always analyzing both sides of a merge node separately ⇒ exponential (or non-terminating) program analyses • ESP has a heuristic for handling merges that • avoids exponential blow-up and runs fast in practice • maintains enough precision to verify programs
Prop Sim example: stdio usage in gcc
void main () {
  if (dump) fil = fopen(dumpFile, "w");
  if (p) x = 0; else x = 1;
  if (dump) fclose(fil);
}
Prop Sim example: stdio usage in gcc
void main () {
  if (dump) Open;
  if (p) x = 0; else x = 1;
  if (dump) Close;
}
Prop Sim example: stdio usage in gcc (Figure: property state machine for the abstracted program, with states $uninit, Opened and $error and edges labeled Open, Print, Close, Print/Close, Open and *)
Example: no path-sensitivity (Figure: CFG of the example: entry; branch on dump, T side: Open; branch on p, T side: x = 0, F side: x = 1; branch on dump, T side: Close; exit. Property-state sets shown on the CFG: {$uninit}, {$uninit,Opened}, {$uninit,Opened}, {$error,$uninit,Opened})
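The state sets on this slide can be reproduced with a few lines of set arithmetic. The following is only a sketch: the transition table is an assumption read off the state-machine slide (Open takes $uninit to Opened, Print stays in Opened, Close returns to $uninit, everything else goes to $error), and the names `FSM` and `apply` are hypothetical.

```python
# Assumed property FSM for stdio usage; unlisted transitions go to $error.
UNINIT, OPENED, ERROR = "$uninit", "Opened", "$error"
FSM = {(UNINIT, "Open"): OPENED, (OPENED, "Print"): OPENED, (OPENED, "Close"): UNINIT}

def apply(states, event):
    """Apply one event to every property state in the set."""
    return {FSM.get((s, event), ERROR) for s in states}

# Path-insensitive analysis: each merge node takes the union of both sides
# and remembers nothing about which branch was taken.
entry = {UNINIT}
after_first_if = apply(entry, "Open") | entry            # T side opens, F side skips
after_second_if = after_first_if | after_first_if         # x = 0 / x = 1 leave the file alone
at_exit = apply(after_second_if, "Close") | after_second_if

print(sorted(at_exit))
```

Because the second test of dump is not correlated with the first, Close is applied to $uninit as well, so $error appears at exit even though no real execution reaches it.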
Example: full path-sensitivity (Figure: same CFG, annotated with the symbolic states along one path: [$uninit], [$uninit|dump=T], [Opened|dump=T], [Opened|dump=T,p=T], [Opened|dump=T,p=T,x=0], [Opened|dump=T,p=T,x=0], [$uninit|dump=T,p=T,x=0])
Example: ESP technique (Figure: same CFG, annotated with symbolic states. Before the merge after the branch on p: [Opened|dump=T,p=T,x=0] and [Opened|dump=T,p=F,x=1]. Other states shown: [$uninit], [Opened|dump=T], [$uninit|dump=F], [Opened|dump=T], [$uninit|dump=F], [$uninit|dump=T], [$uninit|dump=F], [$uninit|dump=T], [$uninit])
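The merge behaviour visible in the figure (states with the same property state are combined, states with different property states stay apart) can be sketched as follows. This is a simplified illustration and not the ESP implementation: execution states are plain dictionaries of known facts, the transition table is assumed from the state-machine slide, and all function names are hypothetical.

```python
UNINIT, OPENED, ERROR = "$uninit", "Opened", "$error"
FSM = {(UNINIT, "Open"): OPENED, (OPENED, "Print"): OPENED, (OPENED, "Close"): UNINIT}

def event(states, ev):
    """Apply a property event to every symbolic state."""
    return [(FSM.get((prop, ev), ERROR), env) for prop, env in states]

def assign(states, var, value):
    """Record `var = value` in every execution state."""
    return [(prop, {**env, var: value}) for prop, env in states]

def branch(states, var, value):
    """Follow one side of `if (var)`: drop contradicting states, record the fact."""
    return [(prop, {**env, var: value})
            for prop, env in states if env.get(var, value) == value]

def merge(states):
    """Keep one symbolic state per property state; join execution states
    by keeping only the facts that all merged states agree on."""
    grouped = {}
    for prop, env in states:
        if prop in grouped:
            old = grouped[prop]
            grouped[prop] = {k: v for k, v in old.items() if k in env and env[k] == v}
        else:
            grouped[prop] = dict(env)
    return list(grouped.items())

s0 = [(UNINIT, {})]
s1 = merge(event(branch(s0, "dump", True), "Open") + branch(s0, "dump", False))
s2 = merge(assign(branch(s1, "p", True), "x", 0) + assign(branch(s1, "p", False), "x", 1))
s3 = merge(event(branch(s2, "dump", True), "Close") + branch(s2, "dump", False))
print(s1, s2, s3, sep="\n")
```

After the branch on p, the two Opened states collapse to [Opened|dump=T], so the irrelevant facts about p and x are dropped while the correlation with dump survives. The number of symbolic states at any point is bounded by the number of property states, which is what keeps the analysis from blowing up.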
Case study: stdio usage in gcc • cc1 from gcc version 2.5.3 (Spec95) • Does cc1 always print to opened files? • cc1 is a complex program: • 140K non-blank, non-comment lines of C • 2149 functions, 66 files, 1086 globals • Call graph includes one 450-function SCC
Experimental results • Precision • Verification succeeds for every file handle • No transitions to $error; no false errors • Scalability • Average per handle: 72.9 seconds, 49.7 MB • Single 1GHz PIII laptop with 512 MB RAM • Proved that: • Each of the 646 calls to fprintf in the source code prints to a valid, open file
ESP follow-up • ESP has since been run on large real-world applications • ESP/X: local intra-procedural version • PSE: post-mortem analysis • run ESP backwards to figure out what caused a crash
Course overview • Cross-cutting issues • Correctness • Ordering transformations and analyses • Dataflow analysis and variations • iterative dataflow analysis • program representations • interprocedural • flow-insensitive • path-sensitive • Applications • Pointer analysis • Optimizing OO languages • Program reliability • Rhodium
Flow-sensitive intraproc dataflow analysis • Iterative dataflow analysis • flow functions, lattice-theoretic formulation • Termination • monotonic flow functions + finite height lattice • Meet over all paths vs. meet over all feasible paths vs. dataflow analysis • For distributive problems, MOP = dataflow analysis
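As a concrete instance of iterative dataflow analysis, here is a minimal worklist solver for a forward analysis whose facts are sets joined by union, applied to reaching definitions on a small loop. The CFG, the node names and the helper `solve_forward` are all made up for illustration.

```python
from collections import deque

def solve_forward(cfg, entry, transfer, init=frozenset()):
    """Worklist solver: forward analysis, facts are sets, join is union."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    out = {n: frozenset() for n in cfg}
    work = deque(cfg)
    while work:
        n = work.popleft()
        in_n = frozenset(init) if n == entry else frozenset()
        for p in preds[n]:
            in_n |= out[p]                      # join over predecessors
        new_out = transfer(n, in_n)
        if new_out != out[n]:                   # re-queue successors only on change
            out[n] = new_out
            work.extend(s for s in cfg[n] if s not in work)
    return out

# Hypothetical CFG: d1: x = 0; loop: while (...); d2: x = x + 1; exit
cfg = {"d1": ["loop"], "loop": ["d2", "exit"], "d2": ["loop"], "exit": []}
gen = {"d1": {"d1"}, "d2": {"d2"}}
kill = {"d1": {"d2"}, "d2": {"d1"}}             # both definitions write x

def transfer(node, facts):
    return frozenset((facts - kill.get(node, set())) | gen.get(node, set()))

result = solve_forward(cfg, "d1", transfer)
print({n: sorted(v) for n, v in result.items()})
```

The loop terminates because the transfer functions are monotonic and the lattice (subsets of {d1, d2}) has finite height: each node's output can only grow, and only finitely often.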
Program representations • Simple • AST • CFG • More advanced • Dataflow Graph • Control Dependence Graph • Program Dependence Graph • SSA
Interprocedural analysis • Context-insensitive • caller summaries and callee summaries • Context-sensitive • call-strings as context (k-CFA, “call-strings”) • dataflow info as context • bottom-up, complete summaries • top-down, partial summaries (partial transfer functions)
Flow-insensitive analysis • Keep only one piece of information for the entire program/procedure • Loses precision, but reduces space consumption
Path-sensitive analysis • Enhance dataflow to try to keep paths separate • Two kinds of path-sensitive analysis: • aim towards MOP • aim towards removing infeasible paths (branch correlations)
Pointer analysis • Started with simple naïve intraproc analysis with allocation site summaries • To scale to large programs: • make naïve pointer analysis flow-insensitive (Andersen) • make each node have only one outgoing edge, which makes it near-linear time (Steensgaard) • add one level of flow to regain some precision (One-level flow)
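To make the flow-insensitive (Andersen-style) variant concrete, the sketch below computes points-to sets by iterating inclusion constraints to a fixpoint. It is a naive illustration, not an efficient solver; the statement encoding and the function name `andersen` are assumptions, and Steensgaard's unification step is not shown.

```python
from collections import defaultdict

def andersen(stmts):
    """Naive inclusion-based points-to analysis over (kind, lhs, rhs) statements."""
    pts = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":        # lhs = &rhs
                updates = [(lhs, {rhs})]
            elif kind == "copy":      # lhs = rhs
                updates = [(lhs, set(pts[rhs]))]
            elif kind == "load":      # lhs = *rhs
                updates = [(lhs, set().union(*(pts[t] for t in pts[rhs])))]
            else:                     # "store": *lhs = rhs
                updates = [(t, set(pts[rhs])) for t in pts[lhs]]
            for var, targets in updates:
                if not targets <= pts[var]:
                    pts[var] |= targets
                    changed = True
    return pts

# Hypothetical program: p = &a; q = &b; r = p; pp = &p; *pp = q;
stmts = [("addr", "p", "a"), ("addr", "q", "b"), ("copy", "r", "p"),
         ("addr", "pp", "p"), ("store", "pp", "q")]
result = andersen(stmts)
print({v: sorted(t) for v, t in result.items()})
```

Statement order is ignored, which is the sense in which the analysis is flow-insensitive: r is reported as possibly pointing to b even though the store through pp comes after the copy.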
Program analysis and program reliability • Property simulation • path-sensitive analysis in polynomial time • uses clever heuristic for merges • algorithm behind ESP • Predicate abstraction and iterative refinement • given set of predicates, compute predicates that hold at each program point • iteratively refine set of predicates • core of BLAST and SLAM
Looking forward (discussion) • What are the current hot topics in compilers and program analysis? • Compilers and program analysis 20 years from now?
Looking forward: Concurrency • Hardware trends are making exploiting concurrency more and more important • Language features and compiler technology to express and exploit concurrency • Current examples: • race detection • primitives for concurrency and efficient implementations (e.g.: atomic primitive)
Looking forward: Scalability • Scale to large programs while retaining precision • Current examples: • Use scalable constraint solvers such as SAT (SATURN) • Use compact representations such as BDDs
Looking forward: Verification • Tradeoffs between: • automation • scalability • precision • domain-specificity • Current examples • ESP, BLAST, SLAM, Rhodium
Looking forward: Extensibility • Removing barrier to entry to the compiler • New models of using compilers for • domain-specific checkers • domain-specific optimizations • Current examples: • Rhodium, Collider