Symbolic Path Simulation in Path-Sensitive Dataflow Analysis

Presentation Transcript


  1. Symbolic Path Simulation in Path-Sensitive Dataflow Analysis
     Hari Hampapuram, Jason Yue Yang, Manuvir Das
     Center for Software Excellence (CSE), Microsoft Corporation

  2. Gist of Results
     • Symbolic path simulation engine supporting:
       • Merge
         • For merge-based path-sensitive analysis
       • Function summaries
         • For scalable global analysis
       • Pointers
     • Our main client is Windows
     Jason Yang, Microsoft

  3. Infeasible Path → False Positive

         extern int a, b;

         void Process(int handle) {
             int x, y;
             if (a > 0) {
                 CloseHandle(handle);
                 x = 1;
             } else
                 x = 2;
             if (b > 0)
                 y = 1;
             else
                 y = 2;
             if (x != 1)
                 UseHandle(handle);
         }

     [Diagram: property state machine with states START, OPEN, CLOSE, ERROR and transitions labeled OpenHandle, CloseHandle, UseHandle, UseHandle]
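The property state machine from this slide can be sketched as a small transition function. This is only an illustration, not ESP's implementation: the enum names, `step`, and `run` are hypothetical, and the edges are an assumed reading of the diagram labels (UseHandle on an open handle stays in OPEN, UseHandle on a closed handle goes to ERROR, and any transition not drawn leaves the state unchanged).

```c
#include <assert.h>

/* Hypothetical names; the slide only shows the diagram labels. */
typedef enum { ST_START, ST_OPEN, ST_CLOSE, ST_ERROR } HandleState;
typedef enum { EV_OPEN_HANDLE, EV_CLOSE_HANDLE, EV_USE_HANDLE } HandleEvent;

/* One step of the property automaton. Transitions that are not drawn
   on the slide are assumed to leave the state unchanged. */
static HandleState step(HandleState s, HandleEvent e) {
    switch (s) {
    case ST_START:
        return e == EV_OPEN_HANDLE ? ST_OPEN : s;
    case ST_OPEN:
        /* UseHandle on an open handle is assumed to be fine. */
        return e == EV_CLOSE_HANDLE ? ST_CLOSE : s;
    case ST_CLOSE:
        return e == EV_USE_HANDLE ? ST_ERROR : s;
    default:
        return ST_ERROR;  /* ERROR is absorbing */
    }
}

/* Run a sequence of n events starting from state s. */
static HandleState run(HandleState s, const HandleEvent *ev, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        s = step(s, ev[i]);
    return s;
}
```

Under these assumptions, the infeasible path through `Process` (take `a > 0`, then take `x != 1`) is the event sequence CloseHandle, UseHandle starting from OPEN, which ends in ERROR. An analysis that cannot rule that path out reports the false positive.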

  4. Infeasible Path → False Positive (repeats the code and state machine of slide 3)

  5. Need for Merge
     • The “knob” for the scalability vs. precision tradeoff
       • Always merge (traditional dataflow) → false errors
       • Always separate: exponential blow-up
     • Driven by client analyses

  6. Merge Criterion for ESP
     • Selective merging based on property states
       • Partition symbolic states into property states and everything else
       • If the incoming paths differ in property states, track them separately; otherwise, merge them.
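The criterion above can be sketched as a join operation at a control-flow merge point. This is a minimal illustration under stated assumptions, not ESP's code: `SymState`, `StateSet`, and `join_in` are hypothetical names, and "everything else" is reduced to a single fact about one variable `x`.

```c
#include <assert.h>

#define MAX_STATES 8

/* Hypothetical symbolic state: one property state plus a single tracked
   fact ("x has a known constant value") standing in for everything else. */
typedef struct {
    int prop;     /* property state, e.g. 0 = OPEN, 1 = CLOSE */
    int x_known;  /* 1 if the value of x is known on this path */
    int x_val;    /* meaningful only when x_known is 1 */
} SymState;

typedef struct {
    SymState s[MAX_STATES];
    int n;
} StateSet;

/* Add an incoming path state at a join point. A state whose property
   state matches an existing one is merged into it (a fact survives only
   if both paths agree on it); a state with a new property state is
   tracked separately. */
static void join_in(StateSet *set, SymState in) {
    for (int i = 0; i < set->n; i++) {
        SymState *cur = &set->s[i];
        if (cur->prop == in.prop) {
            if (!(cur->x_known && in.x_known && cur->x_val == in.x_val))
                cur->x_known = 0;  /* paths disagree: drop the fact */
            return;
        }
    }
    assert(set->n < MAX_STATES);
    set->s[set->n++] = in;
}
```

With this sketch, joining (CLOSE, x = 1) and (OPEN, x = 2) keeps two states, so the correlation between the property state and `x` survives the join.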

  7. Merge Criterion for ESP → Example (same code as slide 3)
     • Property states different along paths

  8. Merge Criterion for ESP → Example (same code as slide 3)
     • Property states different along paths → Do not merge

  9. Merge Criterion for ESP → Example (same code as slide 3)
     • Property states change along paths → Do not merge
     • Property states are the same

  10. Merge Criterion for ESP → Example (same code as slide 3)
     • Property states change along paths → Do not merge
     • Property states are the same → Merge

  11. Merge Criterion for ESP → Example (same code as slide 3)
     • Property states change along paths → Do not merge
     • Property states are the same → Merge
     • Still maintains the needed fact: “If CloseHandle is called, branch should fail.”

  12. Need for Function Summaries

         extern int a, b;

         void Process(int handle) {
             int x, y;
             if (a > 0) {
                 CloseHandle(handle);
                 x = 1;
             } else
                 x = 2;
             if (b > 0)
                 y = Foo(b);
             else
                 y = 2;
             if (x != 1)
                 UseHandle(handle);
         }

     • Partial transfer functions
     • Computed on demand
     • Enforced by “into-binding” and “back-binding”

  13. Support for Language Features
     • Pointers
     • Field-based objects
     • Operator expressions
     • …

  14. Symbolic Simulator Architecture
     [Diagram: Client Application (defect detection, core dump analysis, test generation, code review, ...); Simulation Interface (SI), labeled “Semantic translator”; Simulation State Manager (SSM), labeled “Theorem prover”]

  15. Semantic Domains
     • Environment
       • ProgramSymbol → Loc
       • Managed by Simulation Interface
     • Store
       • Loc → Val
       • Managed by Simulation State Manager
     • Region-based model for symbolic store
       • Regions play the role of Loc
       • Values play the role of Val

  16. Simulation State Manager (SSM)
     • Tracks symbolic simulation states to answer queries about path feasibility
     • What should be tracked?
       • Mapping of the store: region → value
       • Constraints on values

  17. Regions
     • Variable regions vs. deref regions
       • Important for pointer dereference
       • Important for supporting merge and binding
     • Variable regions: R(p), R(q), R(x), R(y)
     • Deref regions: R(*p), R(*q)

         void Process(int *p, int *q) {
             int x = *p;
             int y = *q;
             if (p != q)
                 return;
             if (*p != *q)
                 …   // Not reachable
         }

  18. Values
     • Constant values (integers, floats, …)
     • Operator values (arithmetic, bitwise, relational)
     • Symbolic values (general constraint variables)
     • Region-initial values (constraint variables for initial values)
     • Pointer values (for points-to relationships)
     • Field-based values (for compound types)

  19. Need for Region-Initial Values
     • Important for function summaries
       • Pre-condition: simulation state at Entry node
       • Post-condition: simulation state at Exit node
       • Input values vs. current values
     • To support lazy initialization for input values
       • An input region gets its region-initial value by default, unless it has been killed
       • Need to maintain a kill set
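Lazy initialization with a kill set can be sketched as follows. This is an illustrative sketch only: `Store`, `store_read`, `store_write`, and the encoding of a region-initial value as `INITIAL_BASE + region` are assumptions made for the example, not the simulator's actual representation.

```c
#include <assert.h>

#define MAX_REGIONS 16

/* Hypothetical encoding: the region-initial value of region r is the
   symbolic id INITIAL_BASE + r; concrete values are assumed to be
   smaller than INITIAL_BASE. */
#define INITIAL_BASE 1000

typedef struct {
    int killed[MAX_REGIONS];   /* kill set: 1 once the region is overwritten */
    int current[MAX_REGIONS];  /* current value, valid only if killed */
} Store;

/* At function entry nothing has been killed, so no values are stored. */
static void store_init(Store *st) {
    for (int r = 0; r < MAX_REGIONS; r++) {
        st->killed[r] = 0;
        st->current[r] = 0;
    }
}

/* An assignment kills the region and records its new value. */
static void store_write(Store *st, int region, int value) {
    assert(region >= 0 && region < MAX_REGIONS);
    st->killed[region] = 1;
    st->current[region] = value;
}

/* A read of an unkilled region yields its region-initial value lazily. */
static int store_read(const Store *st, int region) {
    assert(region >= 0 && region < MAX_REGIONS);
    return st->killed[region] ? st->current[region]
                              : INITIAL_BASE + region;
}
```

The point of the design is that the entry state never has to enumerate input regions: a region that is only read keeps referring to its input value, which is what a summary's pre-condition and post-condition need to relate.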

  20. Decision Procedures
     • Current implementation:
       • Equality (e.g., a == b): equivalence classes
       • Disequality (e.g., a != b): multi-maps between equivalence classes
       • Inequality (e.g., a < b): a graph (nodes are equivalence classes and edges are inequality relations)
     • Can plug in other theorem provers if needed
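The equality and disequality parts can be sketched with a union-find structure. This is a simplified illustration, not the SSM's implementation: all names are hypothetical, a flat list of disequalities stands in for the multi-maps between equivalence classes, and inequalities are left out.

```c
#include <assert.h>

#define MAX_VALS 32
#define MAX_DISEQ 32

typedef struct {
    int parent[MAX_VALS];  /* union-find forest over value ids */
    int dl[MAX_DISEQ];     /* left sides of recorded a != b facts */
    int dr[MAX_DISEQ];     /* right sides of recorded a != b facts */
    int ndiseq;
} Constraints;

static void cs_init(Constraints *c) {
    for (int i = 0; i < MAX_VALS; i++)
        c->parent[i] = i;  /* every value starts in its own class */
    c->ndiseq = 0;
}

/* Representative of v's equivalence class, with path halving. */
static int cs_find(Constraints *c, int v) {
    assert(v >= 0 && v < MAX_VALS);
    while (c->parent[v] != v) {
        c->parent[v] = c->parent[c->parent[v]];
        v = c->parent[v];
    }
    return v;
}

/* Returns 1 if no recorded disequality relates two equal values. */
static int cs_consistent(Constraints *c) {
    for (int i = 0; i < c->ndiseq; i++)
        if (cs_find(c, c->dl[i]) == cs_find(c, c->dr[i]))
            return 0;
    return 1;
}

/* Assume a == b; returns 0 if the path becomes infeasible. */
static int cs_assume_eq(Constraints *c, int a, int b) {
    int ra = cs_find(c, a);
    int rb = cs_find(c, b);
    c->parent[ra] = rb;
    return cs_consistent(c);
}

/* Assume a != b; returns 0 if the path becomes infeasible. */
static int cs_assume_ne(Constraints *c, int a, int b) {
    assert(c->ndiseq < MAX_DISEQ);
    c->dl[c->ndiseq] = a;
    c->dr[c->ndiseq] = b;
    c->ndiseq++;
    return cs_consistent(c);
}
```

A branch is refuted when an assumption returns 0. For example, after assuming `v0 != v2` and `v0 == v1`, assuming `v1 == v2` is reported as infeasible.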

  21. Merge
     • Moves symbolic states upwards in the lattice
       • Fewer constraints on path feasibility after merge
     • Maps the memory graphs and the associated constraints on values
     [Diagram: JOIN of two memory graphs. R1 and R1’ both hold 0xEFD0, and the joined region R1’’ holds 0xEFD0. R2 holds $1 with $1 > 0 and R2’ holds $2 with $2 > 0; the joined region R2’’ holds $3 with $3 > 0.]
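The join pictured on this slide can be sketched region by region. This is a toy model under stated assumptions, not the simulator's merge: `MemState` and `mem_join` are hypothetical, each region carries only one possible constraint ("value > 0"), and symbolic ids are assumed never to collide with constant values.

```c
#include <assert.h>

#define NREGIONS 2

/* Hypothetical memory state: one value id per region, plus a flag
   saying whether the constraint "value > 0" is recorded for it. */
typedef struct {
    int val[NREGIONS];
    int positive[NREGIONS];
} MemState;

/* Join two states. A region keeps its value if both sides agree;
   otherwise it gets a fresh symbolic value taken from *next_sym.
   A constraint survives only if it is recorded on both sides. */
static MemState mem_join(const MemState *a, const MemState *b,
                         int *next_sym) {
    MemState out;
    for (int r = 0; r < NREGIONS; r++) {
        out.val[r] = (a->val[r] == b->val[r]) ? a->val[r]
                                              : (*next_sym)++;
        out.positive[r] = a->positive[r] && b->positive[r];
    }
    return out;
}
```

Mirroring the diagram, joining a state where R2 holds `$1 > 0` with one where R2 holds `$2 > 0` yields a fresh `$3` that is still known to be positive, while the shared constant 0xEFD0 in R1 is kept as is.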

  22. Example Client Analysis → ESP
     • Path-sensitive, context-sensitive, inter-procedural defect detection tool for large C/C++ programs

  23. Simulation Interface (SI)
     • Fetching regions and values
     • Assignments
       • E.g., x = 1;
     • Branches
       • E.g., a == b;
     • Procedure call (into-binding)
     • Call back (back-binding)

  24. Into-Binding
     • Two approaches:
       • Binding precise calling context into callee
         • Less demand on reasoning power to refute infeasible paths
         • More suitable for top-down analysis
       • Binding no constraints (TOP) into callee
         • More demand on reasoning power to refute infeasible paths
         • More suitable for bottom-up analysis
     • Binding from caller Call node to callee Entry node
       • Bind parameters
       • Bind global variables
       • Bind constraints

  25. Back-Binding
     • Binding from callee Exit node to caller Return node
       • Bind the region-initial values of input regions
       • Bind values of output regions
       • Bind constraints

  26. Experiences
     • Security properties for a future version of Windows
       • Difficult to check with other tools
     • Scalability
       • E.g., for all device drivers, found ~500 errors in 20 hours
     • Precision:
       • E.g., for Windows kernel (216,000 LOC, 9755 functions)

  27. Summary
     • Critical for improving precision
     • Scalable enough for industrial programs
     • Other client analyses
       • PSE
       • Iterative refinement for ESP
     • Beneficial to have built-in support for merge, function summaries, and pointers

  28. Thank You! For more information, please visit http://www.microsoft.com/windows/cse/pa
