Symbolic Path Simulation in Path-Sensitive Dataflow Analysis
Hari Hampapuram, Jason Yue Yang, Manuvir Das
Center for Software Excellence (CSE), Microsoft Corporation

Gist of Results
• Symbolic path simulation engine supporting:
  • Merge
    • For merge-based path-sensitive analysis
  • Function summaries
    • For scalable global analysis
  • Pointers
    • Our main client is Windows
Jason Yang, Microsoft

Infeasible Path False Positive
    extern int a, b;
    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;
        if (b > 0)
            y = 1;
        else
            y = 2;
        if (x != 1)
            UseHandle(handle);
    }
[Figure: property state machine with states START, OPEN, CLOSE, and ERROR; transitions labeled OpenHandle, CloseHandle, and UseHandle (twice)]

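The false positive on this slide can be reproduced with a small sketch of the property state machine. The transition table below is an assumption read off the figure labels (OpenHandle: START to OPEN; CloseHandle: OPEN to CLOSE; UseHandle: OPEN stays OPEN, anything else goes to ERROR), and the names `step`, `step_set`, and the `EV_*` events are hypothetical. The handle is assumed to be OPEN on entry to Process.

```c
#include <stdbool.h>

/* Property states from the figure, encoded as bits so a set fits in a mask.
   The transitions are an assumption, not taken from the slides. */
typedef enum { START = 1, OPEN = 2, CLOSE = 4, ERROR = 8 } PropState;
typedef enum { EV_OPEN_HANDLE, EV_CLOSE_HANDLE, EV_USE_HANDLE } Event;

/* Apply one event to a single property state.
   Any transition not listed above is assumed to lead to ERROR. */
PropState step(PropState s, Event e) {
    switch (e) {
    case EV_OPEN_HANDLE:  return s == START ? OPEN : ERROR;
    case EV_CLOSE_HANDLE: return s == OPEN ? CLOSE : ERROR;
    case EV_USE_HANDLE:   return s == OPEN ? OPEN : ERROR;
    }
    return ERROR;
}

/* Apply an event to a set of property states (bitmask). */
unsigned step_set(unsigned set, Event e) {
    unsigned out = 0;
    for (unsigned bit = START; bit <= ERROR; bit <<= 1)
        if (set & bit)
            out |= step((PropState)bit, e);
    return out;
}

/* Path-insensitive view of Process(): both outcomes of "a > 0" are merged,
   so the correlation between x and the handle state is lost. */
bool merged_analysis_reports_error(void) {
    unsigned then_branch = step_set(OPEN, EV_CLOSE_HANDLE); /* {CLOSE} */
    unsigned else_branch = OPEN;                            /* {OPEN}  */
    unsigned merged = then_branch | else_branch;
    return (step_set(merged, EV_USE_HANDLE) & ERROR) != 0;
}

/* Path-sensitive view: "x != 1" holds only on the else path,
   where the handle is still OPEN. */
bool path_sensitive_reports_error(void) {
    unsigned else_branch = OPEN;
    return (step_set(else_branch, EV_USE_HANDLE) & ERROR) != 0;
}
```

Under these assumptions the merged analysis reports an error at UseHandle, while the path-sensitive one does not, which is the infeasible-path false positive the slide illustrates.
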
Need for Merge
• The “knob” for the scalability vs. precision tradeoff
  • Always merge (traditional dataflow): false errors
  • Always separate: exponential blow-up
• Driven by client analyses

Merge Criterion for ESP
• Selective merging based on property states
• Partition symbolic states into property states and everything else
• If the incoming paths differ in property states, track them separately; otherwise, merge them.

Merge Criterion for ESP: Example
    extern int a, b;
    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;
        if (b > 0)
            y = 1;
        else
            y = 2;
        if (x != 1)
            UseHandle(handle);
    }
• At the join after the first branch (a > 0): property states change along the paths
  • Do not merge
• At the join after the second branch (b > 0): property states are the same
  • Merge
• Still maintains the needed fact: “If CloseHandle is called, branch should fail.”

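The merge criterion can be sketched as a join operation that keys on the property state. This is a minimal illustration, not the ESP implementation: the `SymState` layout, the constant-or-unknown `Fact` domain, and the function names are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { OPEN, CLOSE } PropState;

/* Execution-state fact about one variable: a known constant or unknown. */
typedef struct { bool known; int value; } Fact;

typedef struct {
    PropState prop; /* property state: drives the merge decision       */
    Fact x, y;      /* "everything else": merged when prop states agree */
} SymState;

/* Join two facts: keep the value only if both paths agree on it. */
Fact join_fact(Fact a, Fact b) {
    if (a.known && b.known && a.value == b.value)
        return a;
    Fact unknown = { false, 0 };
    return unknown;
}

/* Add an incoming state at a join point holding n states so far.
   States with an equal property state are merged; otherwise the
   incoming state is tracked separately. The caller must make sure
   the array has room for one more element. Returns the new count. */
size_t add_at_join(SymState *states, size_t n, SymState in) {
    for (size_t i = 0; i < n; i++) {
        if (states[i].prop == in.prop) {
            states[i].x = join_fact(states[i].x, in.x);
            states[i].y = join_fact(states[i].y, in.y);
            return n;
        }
    }
    states[n] = in;
    return n + 1;
}
```

Applied to Process, the first join keeps two states (CLOSE with x = 1, OPEN with x = 2). At the second join each of them absorbs both y outcomes, so two states remain rather than four, and x is still known in each.
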
Need for Function Summaries
    extern int a, b;
    void Process(int handle) {
        int x, y;
        if (a > 0) {
            CloseHandle(handle);
            x = 1;
        } else
            x = 2;
        if (b > 0)
            y = Foo(b);
        else
            y = 2;
        if (x != 1)
            UseHandle(handle);
    }
• Partial transfer functions
• Computed on-demand
• Enforced by “into-binding” and “back-binding”

Support for Language Features
• Pointers
• Field-based objects
• Operator expressions
• …

Symbolic Simulator Architecture
[Figure: layered architecture, top to bottom; two client applications, each with its own Simulation Interface, above one Simulation State Manager]
• Client applications: defect detection, core dump analysis, test generation, code review, ...
• Simulation Interface (SI): “semantic translator”
• Simulation State Manager (SSM): “theorem prover”

Semantic Domains
• Environment
  • ProgramSymbol → Loc
  • Managed by Simulation Interface
• Store
  • Loc → Val
  • Managed by Simulation State Manager
• Region-based model for symbolic store
  • region ∈ Loc
  • value ∈ Val

Simulation State Manager (SSM)
• Tracking symbolic simulation states to answer queries about path feasibility
• What should be tracked?
  • Mapping of store: region → value
  • Constraints on values

Regions
• Variable regions vs. deref regions
  • Important for pointer dereference
  • Important for supporting merge and binding
    void Process(int *p, int *q) {
        int x = *p;
        int y = *q;
        if (p != q)
            return;
        if (*p != *q)
            …   // Not reachable
    }
• Variable regions: R(p), R(q), R(x), R(y)
• Deref regions: R(*p), R(*q)

Values
• Constant values (integers, floats, …)
• Operator values (arithmetic, bitwise, relational)
• Symbolic values (general constraint variables)
• Region-initial values (constraint variables for initial values)
• Pointer values (for points-to relationship)
• Field-based values (for compound types)

Need for Region-Initial Values
• Important for function summary
  • Pre-condition: simulation state at Entry node
  • Post-condition: simulation state at Exit node
• Input values vs. current values
• To support lazy initialization for input values
  • An input region gets region-initial values by default, unless it has been killed
  • Need to maintain a kill set

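Lazy initialization with a kill set can be sketched as follows. This is an illustrative model only: the fixed-size `Store`, the two-kind `Value` type, and the `store_*` function names are assumptions for the example and do not come from the slides.

```c
#include <stdbool.h>

#define MAX_REGIONS 8

typedef enum { VAL_CONST, VAL_REGION_INITIAL } ValKind;

/* payload is the constant for VAL_CONST, or the region id for
   VAL_REGION_INITIAL (the symbolic "value this region had at Entry"). */
typedef struct { ValKind kind; int payload; } Value;

typedef struct {
    bool  killed[MAX_REGIONS];  /* kill set: regions written since Entry */
    Value current[MAX_REGIONS]; /* meaningful only for killed regions    */
} Store;

/* At the Entry node nothing has been killed. */
void store_init(Store *s) {
    for (int r = 0; r < MAX_REGIONS; r++) {
        s->killed[r] = false;
        s->current[r].kind = VAL_CONST;
        s->current[r].payload = 0;
    }
}

/* Reading an unkilled region yields its region-initial value on demand,
   so input values never have to be materialized up front. */
Value store_read(const Store *s, int region) {
    if (s->killed[region])
        return s->current[region];
    Value initial = { VAL_REGION_INITIAL, region };
    return initial;
}

/* Writing a region kills it: later reads see the current value. */
void store_write(Store *s, int region, Value v) {
    s->killed[region] = true;
    s->current[region] = v;
}
```

In this model a summary's post-condition can refer to `VAL_REGION_INITIAL` values of regions the function never wrote, which is what lets input values and current values be told apart.
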
Decision Procedures
• Current implementation:
  • Equality (e.g., a == b): equivalence classes
  • Disequality (e.g., a != b): multi-maps between equivalence classes
  • Inequality (e.g., a < b): a graph (nodes are equivalence classes and edges are inequality relations)
• Can plug in other theorem provers if needed

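The equality and disequality parts can be sketched with a union-find structure. This is a simplified stand-in, not the engine's data structure: disequalities are kept as a flat list of value pairs rather than multi-maps between equivalence classes, inequalities are omitted, and every name below is hypothetical.

```c
#include <stdbool.h>

#define MAX_VALS  16
#define MAX_DISEQ 16

typedef struct {
    int parent[MAX_VALS];    /* union-find forest over value ids     */
    int diseq[MAX_DISEQ][2]; /* recorded disequalities, as value ids */
    int ndiseq;
} Constraints;

void cs_init(Constraints *c) {
    for (int v = 0; v < MAX_VALS; v++)
        c->parent[v] = v;
    c->ndiseq = 0;
}

/* Representative of v's equivalence class (with path halving). */
int cs_find(Constraints *c, int v) {
    while (c->parent[v] != v) {
        c->parent[v] = c->parent[c->parent[v]];
        v = c->parent[v];
    }
    return v;
}

/* A path is infeasible if some a != b has a and b in the same class. */
bool cs_consistent(Constraints *c) {
    for (int i = 0; i < c->ndiseq; i++)
        if (cs_find(c, c->diseq[i][0]) == cs_find(c, c->diseq[i][1]))
            return false;
    return true;
}

/* Assume a == b; returns false if the path has become infeasible. */
bool cs_assume_eq(Constraints *c, int a, int b) {
    c->parent[cs_find(c, a)] = cs_find(c, b);
    return cs_consistent(c);
}

/* Assume a != b; returns false if the path has become infeasible.
   If the table is full the constraint is dropped, which only loses
   precision: fewer paths are refuted, none are refuted wrongly. */
bool cs_assume_ne(Constraints *c, int a, int b) {
    if (c->ndiseq < MAX_DISEQ) {
        c->diseq[c->ndiseq][0] = a;
        c->diseq[c->ndiseq][1] = b;
        c->ndiseq++;
    }
    return cs_consistent(c);
}
```

A client would call `cs_assume_eq` or `cs_assume_ne` at each branch and stop exploring a path as soon as one of them returns false.
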
Merge
• Moves symbolic states upwards in the lattice
  • Fewer constraints on path feasibility after merge
• Maps the memory graphs and the associated constraints on values
[Figure: JOIN of memory graphs. Regions R1, R1’, R1’’ (each labeled 0xEFD0) and R2, R2’, R2’’ with values $1, $2, $3 and constraints $1 > 0, $2 > 0, $3 > 0]

Example Client Analysis: ESP
• Path-sensitive, context-sensitive, inter-procedural defect detection tool for large C/C++ programs

Simulation Interface (SI)
• Fetching regions and values
• Assignments
  • E.g., x = 1;
• Branches
  • E.g., a == b;
• Procedure call (into-binding)
• Call back (back-binding)

Into-Binding
• Two approaches:
  • Binding precise calling context into callee
    • Less demand on reasoning power to refute infeasible paths
    • More suitable for top-down analysis
  • Binding no constraints (TOP) into callee
    • More demand on reasoning power to refute infeasible paths
    • More suitable for bottom-up analysis
• Binding from caller Call node to callee Entry node
  • Bind parameters
  • Bind global variables
  • Bind constraints

Back-Binding
• Binding from callee Exit node to caller Return node
  • Bind the region-initial values of input regions
  • Bind values of output regions
  • Bind constraints

Experiences
• Security properties for a future version of Windows
  • Difficult to check with other tools
• Scalability
  • E.g., for all device drivers, found ~500 errors in 20 hours
• Precision
  • E.g., for Windows kernel (216,000 LOC, 9755 functions)

Summary
• Critical for improving precision
• Scalable enough for industrial programs
• Other client analyses
  • PSE
  • Iterative refinement for ESP
• Beneficial to have built-in support for merge, function summaries, and pointers

Thank You!
For more information, please visit http://www.microsoft.com/windows/cse/pa