
Symbolic Execution



Presentation Transcript


  1. Symbolic Execution Willem Visser Stellenbosch University

  2. RW 745 - Willem Visser Overview • What is Symbolic Execution • History of Symbolic Execution • Symbolic PathFinder • Concolic Execution aka Dynamic SE • DSE vs classic SE • Resources: • download the docker image at https://hub.docker.com/r/willemvisser/willem-jpf-mutation/ • docker run -it willemvisser/willem-jpf-mutation • “cd ..” look for TUTORIAL files to follow steps

  3. Acknowledgements Corina Pasareanu: my ex-colleague from NASA Ames and probably the world’s leading expert on symbolic execution, for doing this YouTube video (Symbolic Execution and Model Checking for Testing) and for putting the presentation on how JPF’s symbolic execution now works on the web at http://www.slideworld.com/slideshows.aspx/Symbolic-Execution-of-Java-Bytecode-ppt-823844

  4. What is Symbolic Execution? • Static Analysis Technique • Executes code in a non-standard way • Instead of concrete inputs, symbolic values are manipulated • At each program location, the state of the system is defined by • The current assignments to the symbolic inputs and local variables • A symbolic state represents a set of concrete states • A path condition that must hold for the execution to reach this location • Condition on the inputs to reach the location • Program counter • At each branch in the code, both paths must be followed • On the true branch: the condition is added to the path condition • On the false branch: the negation of the condition is added to the path condition • If a branch is infeasible, then execution along that branch is terminated • Idea first floated in mid 1970s

  5. Symbolic Execution: Walking Many Paths at Once
     Code: if ((pres < pres_min) || (pres > pres_max)) { … } else { … }
     Concrete execution: [pres = 460; pres_min = 640; pres_max = 960]
     Symbolic execution: [pres = X; pres_min = MIN; pres_max = MAX], starting from [PC: TRUE]
     Then-branch: [PC: X < MIN] or [PC: X > MAX]
     Else-branch: [PC: X >= MIN && X <= MAX]
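
The pressure check on this slide can be walked through in code. The following is a minimal sketch, not how SPF is implemented: `PathCondition`, `Constraint` and `PressureCheck` are hypothetical names, and a bounded brute-force search stands in for a real constraint solver. It forks the path condition at each test of the short-circuit `||` and keeps the feasible paths. The second path is written out in full as `X >= MIN && X > MAX`, which the slide abbreviates to `X > MAX`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a path condition is a conjunction of constraints over (X, MIN, MAX).
final class PathCondition {
    interface Constraint { boolean holds(int x, int min, int max); }

    private final List<String> labels = new ArrayList<>();
    private final List<Constraint> constraints = new ArrayList<>();

    // Returns a new path condition extended with one more conjunct.
    PathCondition and(String label, Constraint c) {
        PathCondition next = new PathCondition();
        next.labels.addAll(labels);
        next.constraints.addAll(constraints);
        next.labels.add(label);
        next.constraints.add(c);
        return next;
    }

    // Assumption: a bounded brute-force search replaces the solver. It can miss
    // witnesses outside 0..3, which is enough for this small example only.
    boolean satisfiable() {
        for (int x = 0; x <= 3; x++)
            for (int min = 0; min <= 3; min++)
                for (int max = 0; max <= 3; max++)
                    if (holdsFor(x, min, max)) return true;
        return false;
    }

    private boolean holdsFor(int x, int min, int max) {
        for (Constraint c : constraints)
            if (!c.holds(x, min, max)) return false;
        return true;
    }

    @Override public String toString() {
        return labels.isEmpty() ? "TRUE" : String.join(" && ", labels);
    }
}

final class PressureCheck {
    // Follows both outcomes of each test in: if (pres < pres_min || pres > pres_max)
    static List<String> explore() {
        List<String> paths = new ArrayList<>();
        PathCondition start = new PathCondition();  // [PC: TRUE]
        PathCondition below = start.and("X < MIN", (x, min, max) -> x < min);
        PathCondition notBelow = start.and("X >= MIN", (x, min, max) -> x >= min);
        PathCondition above = notBelow.and("X > MAX", (x, min, max) -> x > max);
        PathCondition inRange = notBelow.and("X <= MAX", (x, min, max) -> x <= max);
        if (below.satisfiable()) paths.add("then: " + below);
        if (above.satisfiable()) paths.add("then: " + above);
        if (inRange.satisfiable()) paths.add("else: " + inRange);
        return paths;
    }

    public static void main(String[] args) {
        explore().forEach(System.out::println);
    }
}
```

A path condition that contains a constraint together with its negation, such as `X < MIN && X >= MIN`, has no witness, so `satisfiable()` returns false and execution along that path would be terminated.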

  6. Concrete Execution Path (example)

         int x, y;
         if (x > y) {
           x = x + y;
           y = x - y;
           x = x - y;
           if (x > y) assert(false);
         }

     Trace for x = 1, y = 0: 1 >? 0; x = 1 + 0 = 1; y = 1 - 0 = 1; x = 1 - 1 = 0; 0 >? 1

  7. Symbolic Execution Tree (example)

         int x, y;
         if (x > y) {
           x = x + y;
           y = x - y;
           x = x - y;
           if (x > y) assert(false);
         }

     Symbolic inputs: x = X, y = Y
     X >? Y
     [ X <= Y ] END
     [ X > Y ] x = X + Y
     [ X > Y ] y = X + Y - Y = X
     [ X > Y ] x = X + Y - X = Y
     [ X > Y ] Y >? X
     [ X > Y, Y > X ] END (infeasible, so the assert is unreachable)
     [ X > Y, Y <= X ] END
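
The symbolic store from this tree can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: `Lin` and `SwapSymbolic` are hypothetical names, symbolic values are restricted to linear expressions a*X + b*Y, and the infeasibility check is a special case that only works because the two branch conditions are exact opposites. It is not a general decision procedure.

```java
// Hypothetical symbolic value: the linear expression a*X + b*Y over the two inputs.
final class Lin {
    final int a, b;
    Lin(int a, int b) { this.a = a; this.b = b; }
    Lin plus(Lin o)  { return new Lin(a + o.a, b + o.b); }
    Lin minus(Lin o) { return new Lin(a - o.a, b - o.b); }
    boolean isZero() { return a == 0 && b == 0; }
    @Override public String toString() { return a + "*X + " + b + "*Y"; }
}

final class SwapSymbolic {
    static final Lin X = new Lin(1, 0), Y = new Lin(0, 1);

    // Symbolic store {x, y} after the three assignments on the X > Y path.
    static Lin[] swap() {
        Lin x = X, y = Y;
        x = x.plus(y);   // x = X + Y
        y = x.minus(y);  // y = X + Y - Y = X
        x = x.minus(y);  // x = X + Y - X = Y
        return new Lin[] { x, y };
    }

    // Reaching the assert needs X - Y > 0 (outer branch) and x - y > 0 (inner branch).
    // If the two left-hand sides sum to zero, they cannot both be positive.
    static boolean assertProvablyUnreachable() {
        Lin[] store = swap();
        Lin outer = X.minus(Y);
        Lin inner = store[0].minus(store[1]);
        return outer.plus(inner).isZero();
    }

    public static void main(String[] args) {
        Lin[] store = swap();
        System.out.println("x = " + store[0] + ", y = " + store[1]);
        System.out.println("assert unreachable: " + assertProvablyUnreachable());
    }
}
```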

  8. LET'S TRY SPF

  9. History of Symbolic Execution • 1975-76 • James King • Lori Clarke • 1980-2003 • Nothing much happened • Major improvement in SAT solving + Moore’s Law • 2003 Generalized Symbolic Execution • Classic King/Clarke style but for modern programming language, namely Java • 2005 DART (Directed Automated Random Testing) • First concolic/DSE system

  10. Popular SE Systems • Dynamic Symbolic Execution • CUTE (C) and jCUTE (Java) • CREST (C) • PEX (.NET) • SAGE (x86 binaries) • KLEE (C) ? • [New] Jalangi (JavaScript) • Classic Symbolic Execution • KLEE (C) ? • Symbolic PathFinder (Java)

  11. Generalized Symbolic Execution (2003): Khurshid, Pasareanu, Visser • Main idea is how to handle complex data structures • Secondary was the use of model checking as an underlying infrastructure for symbolic execution

  12. Data Structure Example

          class Node {
            int elem;
            Node next;
            Node swapNode() {
              if (next != null)
                if (elem > next.elem) {
                  Node t = next;
                  next = t.next;
                  t.next = this;
                  return t;
                }
              return this;
            }
          }

      [Figure: input lists with constraints and the corresponding output lists for swapNode; cases include a NullPointerException, E0 <= E1, and E0 > E1]

  13. Lazy Initialization Algorithm. Precondition: acyclic list. Consider executing next = t.next; [Figure: heap shapes over nodes E0, E1 and reference t, before and after lazy initialization of t.next; successors shown include null and uninitialized (?) nodes]

  14. First we need a quick primer on JPF. Let's start by playing with JPF

  15. JPF Overview • What is JPF? • Extending JPF • Listeners • Bytecode Factories • Model classes • Getting started • Download, Install and Run (in Eclipse) • Google Summer of Code

  16. What is JPF? • surprisingly hard to summarize - can be used for many things • extensible virtual machine framework for Java bytecode verification: workbench to efficiently implement all kinds of verification tools • typical use cases: • software model checking (deadlock & race detection) • deep inspection (numeric analysis, invalid access) • test case generation (symbolic execution) • ... and many more

  17. History of JPF • not a new project: around for 10 years and continuously developed: • 1997 - project started as front end for Spin model checker • 1999 - reimplementation as concrete virtual machine for software model checking (concurrency defects) • 2003 - introduction of extension interfaces • 2005 - open sourced on Sourceforge • 2008 - participation in Google Summer of Code • 2009 - moved to own server, hosting extension projects and Wiki

  18. No Free Lunch • you need to learn • JPF is not a lightweight tool • flexibility has its price - configuration can be intimidating • might require extension for your SUT (properties, libraries) • you will encounter unimplemented/missing parts (e.g. UnsatisfiedLinkError) • usually easy to implement • exception: state-relevant native libraries (java.io, java.net) • can be either modeled or stubbed • you need suitable test drivers

  19. Key Points • JPF is research platform and production tool (basis) • JPF is designed for extensibility • JPF is open source • JPF is an ongoing collaborative development project • JPF cannot find all bugs • JPF is moderately sized system (~200ksloc core + extensions) • JPF represents >20 man year development effort • JPF is pure Java application (platform independent)

  20. JPF and the Host JVM • verified Java program is executed by JPF, which is a virtual machine implemented in Java, i.e. runs on top of a host JVM ⇒ easy to get confused about who executes what

  21. JPF Top-level Structure • two major constructs: Search and JVM • JVM produces program states • Search is the JVM driver

  22. Search Policies • state explosion mitigation: search the interesting state space part first (“get to the bug early, before running out of memory”) • Search instances encapsulate (configurable) search policies

  23. Exploring Choices • model checker needs choices to explore state space • there are many potential types of choices (scheduling, data, ..) • choice types should not be hardwired in model checker

  24. Choice Generators • transitions begin with a choice and extend until the next ChoiceGenerator (CG) is set (by instruction, native peer or listener) • advance positions the CG on the next unprocessed choice (if any) • backtrack goes up to the next CG with unprocessed choices • Choice Generators are configurable as well, i.e. create your own
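
The advance/backtrack cycle described on this slide can be sketched as a depth-first search over a stack of choice generators. The classes below are hypothetical stand-ins written for illustration; `IntChoice` is not JPF's `ChoiceGenerator`, and a real search also stores and restores program state, which is left out here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a choice generator; this is not JPF's ChoiceGenerator class.
final class IntChoice {
    private final int[] values;
    private int next = 0;
    int current;

    IntChoice(int[] values) { this.values = values; }
    boolean hasMoreChoices() { return next < values.length; }
    void advance() { current = values[next++]; }
}

final class ChoiceSearch {
    // Depth-first enumeration of every combination of choices, one choice point per level.
    static List<String> explore(int[][] choicePoints) {
        List<String> paths = new ArrayList<>();
        if (choicePoints.length == 0) return paths;
        Deque<IntChoice> stack = new ArrayDeque<>();
        stack.push(new IntChoice(choicePoints[0]));
        while (!stack.isEmpty()) {
            IntChoice top = stack.peek();
            if (!top.hasMoreChoices()) {  // exhausted: backtrack to the previous generator
                stack.pop();
                continue;
            }
            top.advance();                // position on the next unprocessed choice
            if (stack.size() < choicePoints.length) {
                // the transition ends when the next choice point is reached
                stack.push(new IntChoice(choicePoints[stack.size()]));
            } else {
                paths.add(describe(stack));  // end of one complete path
            }
        }
        return paths;
    }

    // Lists the current choice of every generator, outermost first.
    private static String describe(Deque<IntChoice> stack) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<IntChoice> it = stack.descendingIterator(); it.hasNext();) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(it.next().current);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        explore(new int[][] { {0, 1}, {0, 1, 2} }).forEach(System.out::println);
    }
}
```

With a two-way choice followed by a three-way choice, the search visits six paths and backtracks to the first generator only after the second one has run out of choices.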

  25. Listeners, the JPF Plugins

  26. Example Listener: Checking NonNull Annotation on Return

          public class NonnullChecker extends ListenerAdapter {
            ...
            public void executeInstruction(JVM vm) {
              Instruction insn = vm.getLastInstruction();
              ThreadInfo ti = vm.getLastThreadInfo();
              if (insn instanceof ARETURN) { // check @NonNull method returns
                ARETURN areturn = (ARETURN)insn;
                MethodInfo mi = insn.getMethodInfo();
                if (areturn.getReturnValue(ti) == null) {
                  if (mi.getAnnotation("java.annotation.Nonnull") != null) {
                    Instruction nextPc = ti.createAndThrowException(
                        "java.lang.AssertionError",
                        "null return from @Nonnull method: " + mi.getCompleteName());
                    ti.setNextPC(nextPc);
                    return;
                  }
                }
                ...

  27. Bytecode Instruction Factories

  28. Example – Bytecode Factory • provide alternative Instruction classes for relevant bytecodes • create & configure InstructionFactory that instantiates them

      Source code:

          void notSoObvious(int x){
            int a = x*50;
            int b = 19437583;
            int c = a;
            for (int k=0; k<100; k++){
              c += b;
              System.out.println(c);
            }
          }
          ...
          notSoObvious(21474836);

      Bytecode produced by the compiler: ... [20] iinc [21] goto 10 [10] iload_4 [11] bipush [12] if_icmpge 22 [13] iload_3 [14] iload_2 [15] iadd ...

      JPF configuration (used at class loading): vm.insn_factory.class = .numeric.NumericInstructionFactory

      Code execution (by JPF):

          class IADD extends Instruction {
            Instruction execute (.., ThreadInfo ti) {
              int v1 = ti.pop();
              int v2 = ti.pop();
              int res = v1 + v2;
              if ((v1>0 && v2>0 && res<=0) …throw ArithmeticException..
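
The overflow that the numeric instruction factory is meant to catch can be reproduced in plain Java without JPF. This is a minimal sketch: `OverflowCheck`, `checkedAdd` and `firstOverflow` are hypothetical names, and the negative-operand half of the sign test is an assumption added here, since the slide elides the rest of the condition. The standard library's `Math.addExact` performs the same kind of check.

```java
final class OverflowCheck {
    // Addition that throws instead of silently wrapping around.
    // The positive case mirrors the sign test on the slide; the negative case is an assumption.
    static int checkedAdd(int v1, int v2) {
        int res = v1 + v2;
        if ((v1 > 0 && v2 > 0 && res <= 0) || (v1 < 0 && v2 < 0 && res >= 0)) {
            throw new ArithmeticException("integer overflow: " + v1 + " + " + v2);
        }
        return res;
    }

    // Runs the loop of notSoObvious(x) and returns the index k at which
    // c += b first overflows, or -1 if it never does.
    // Note: the multiplication x * 50 itself is not checked in this sketch.
    static int firstOverflow(int x) {
        int a = x * 50;
        int b = 19437583;
        int c = a;
        for (int k = 0; k < 100; k++) {
            try {
                c = checkedAdd(c, b);
            } catch (ArithmeticException e) {
                return k;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("first overflow at k = " + firstOverflow(21474836));
    }
}
```

For the input on the slide, a = 1073741800, and after 55 additions c = 2142808865, so the 56th addition (k = 55) exceeds Integer.MAX_VALUE.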

  29. Now back to SPF

  30. JPF Symbolic Execution • JPF-SE • Original approach based on program transformation • 2003-2007 • SPF (Symbolic JPF) • Based on non-standard bytecode interpretation • 2008-… • Rest of the presentation focuses on this

  31. Symbolic JPF • JPF search engine used • To generate and explore the symbolic execution tree • Also used to analyze thread interleavings and other forms of non-determinism that might be present in the code • No state matching performed • In general, undecidable • To limit the (possibly) infinite symbolic search state space resulting from loops, we put a limit on • The model checker’s search depth or • The number of constraints in the path condition • Off-the-shelf decision procedures/constraint solvers used to check path conditions • Model checker backtracks if path condition becomes infeasible • Generic interface for multiple decision procedures • Choco (for linear/non-linear integer/real constraints, mixed constraints), http://sourceforge.net/projects/choco/ • IASolver (for interval arithmetic) http://www.cs.brandeis.edu/~tim/Applets/IAsolver.html

  32. Implementation JPF Structure: • Key mechanisms: • JPF’s bytecode instruction factory • Replace or extend standard concrete execution semantics of byte-codes with non-standard symbolic execution • Attributes associated w/ program state • Stack operands, fields, local variables • Store symbolic information • Propagated as needed during symbolic execution • Other mechanisms: • Choice generators: • For handling branching conditions during symbolic execution • Listeners: • For printing results of symbolic analysis (method summaries) • For enabling dynamic change of execution semantics (from concrete to symbolic) • Native peers: • For modeling native libraries, e.g. capture Math library calls and send them to the constraint solver Instruction Factory

  33. An Instruction Factory for Symbolic Execution of Byte-codes • We created SymbolicInstructionFactory • Contains instructions for the symbolic interpretation of byte-codes • New Instruction classes derived from JPF’s core • Conditionally add new functionality; otherwise delegate to super-classes • Approach enables simultaneous concrete/symbolic execution • JPF core: • Implements concrete execution semantics based on stack machine model • For each method that is executed, maintains a set of Instruction objects created from the method byte-codes • Uses abstract factory design pattern to instantiate Instruction objects

  34. Attributes for Storing Symbolic Information • Used previous experimental JPF extension of slot attributes • Additional, state-stored info associated with locals & operands on stack frame • Generalized this mechanism to include field attributes • Attributes are used to store symbolic values and expressions created during symbolic execution • Attribute manipulation done mainly inside JPF core • We only needed to override instruction classes that create/modify symbolic information • E.g. numeric, compare-and-branch, type conversion operations • Sufficiently general to allow arbitrary value and variable attributes • Could be used for implementing other analyses • E.g. keep track of physical dimensions and numeric error bounds or perform concolic execution • Program state: • A call stack/thread: • Stack frames/executed methods • Stack frame: locals & operands • The heap (values of fields) • Scheduling information

  35. Handling Branching Conditions • Symbolic execution of branching conditions involves: • Creation of a non-deterministic choice in JPF’s search • Path condition associated with each choice • Add condition (or its negation) to the corresponding path condition • Check satisfiability (mostly with z3) • If unsatisfiable, instruct JPF to backtrack • Created new choice generator: public class PCChoiceGenerator extends IntIntervalGenerator { PathCondition[] PC; … }

  36. Example: IADD

      Concrete execution of IADD byte-code:

          public class IADD extends Instruction {
            …
            public Instruction execute(… ThreadInfo th){
              int v1 = th.pop();
              int v2 = th.pop();
              th.push(v1+v2,…);
              return getNext(th);
            }
          }

      Symbolic execution of IADD byte-code:

          public class IADD extends ….bytecode.IADD {
            …
            public Instruction execute(… ThreadInfo th){
              Expression sym_v1 = ….getOperandAttr(0);
              Expression sym_v2 = ….getOperandAttr(1);
              if (sym_v1 == null && sym_v2 == null)
                // both values are concrete
                return super.execute(… th);
              else {
                int v1 = th.pop();
                int v2 = th.pop();
                th.push(0,…); // don’t care
                …
                ….setOperandAttr(Expression._plus(sym_v1,sym_v2));
                return getNext(th);
              }
            }
          }

  37. Example: IFGE

      Concrete execution of IFGE byte-code:

          public class IFGE extends Instruction {
            …
            public Instruction execute(… ThreadInfo th){
              cond = (th.pop() >=0);
              if (cond)
                next = getTarget();
              else
                next = getNext(th);
              return next;
            }
          }

      Symbolic execution of IFGE byte-code:

          public class IFGE extends ….bytecode.IFGE {
            …
            public Instruction execute(… ThreadInfo th){
              Expression sym_v = ….getOperandAttr();
              if (sym_v == null)
                // the condition is concrete
                return super.execute(… th);
              else {
                PCChoiceGen cg = new PCChoiceGen(2);…
                cond = cg.getNextChoice()==0?false:true;
                if (cond) {
                  pc._add_GE(sym_v,0);
                  next = getTarget();
                } else {
                  pc._add_LT(sym_v,0);
                  next = getNext(th);
                }
                if (!pc.satisfiable()) … // JPF backtrack
                else cg.setPC(pc);
                return next;
              }
            }
          }

  38. How to Execute a Method Symbolically
      JPF run configuration:
      +vm.insn_factory.class=gov.nasa.jpf.symbc.SymbolicInstructionFactory (instruct JPF to use symbolic byte-code set)
      +jpf.listener=gov.nasa.jpf.symbc.SymbolicListener (print PCs and method summaries)
      +vm.peer_packages=gov.nasa.jpf.symbc:gov.nasa.jpf.jvm (use symbolic peer package for Math library)
      +symbolic.dp=iasolver (use IASolver as a decision procedure)
      +symbolic.method=UnitUnderTest(sym#sym#con) (method to be executed symbolically; 3rd parameter left concrete)
      Main (main application class containing method under test)
      Symbolic input globals (fields) and method pre-conditions can be specified via user annotations

  39. “Any Time” Symbolic Execution • Symbolic execution • Can start at any point in the program • Can use mixed symbolic and concrete inputs • No special test driver needed – sufficient to have an executable program that uses the method/code under test • Any time symbolic execution • Use specialized listener to monitor concrete execution and trigger symbolic execution based on certain conditions • Unit level analysis in realistic contexts • Use concrete system-level execution to set-up environment for unit-level symbolic analysis • Applications: • Exercise deep system executions • Extend/modify existing tests: e.g. test sequence generation for Java containers

  40. Current State of SPF • Downloadable as jpf-symbc from JPF website • Recent Publication is the main reference for SPF • “Symbolic PathFinder: Integrating Symbolic Execution with Model Checking for Java Bytecode Analysis” in Automated Software Engineering Journal 20(3) 2013

  41. Let's play some more with SPF

  42. Classic vs Dynamic Symbolic Execution • Terminology • Classic Symbolic Execution == Symbolic Execution • Dynamic Symbolic Execution == Concolic • Classic • Everything is symbolic • Need a special environment to run • Dynamic • Concrete and symbolic • Execute the code for real and keep track of the symbolic world on the side

  43. Example: Classic vs Dynamic SE

          int max(int a, int b) {
            if (a > b) return a;
            else return b;
          }
          void test(int x, int y) {
            int z = max(x,y);
            if (z == x) { // L1
            } else { // L2
            }
          }

      Classic:
      [true] test(X,Y) {x = X, y = Y}
      [true] max(X,Y) {a = X, b = Y}
      [X > Y] ret X | [X <= Y] ret Y
      [X > Y] z = max(X,Y) {z = X} | [X <= Y] z = max(X,Y) {z = Y}
      [X > Y & X == X] L1 | [X != X] | [X <= Y & Y == X] L1 | [X <= Y & Y != X] L2

      Dynamic:
      Solve and run: [TRUE] ? test(0,1). Collect path condition: [X <= Y & Y != X] L2
      Negate, solve and run: [X <= Y & Y == X] ? test(0,0). Collect path condition: [X <= Y & Y == X] L1; [X <= Y & Y != X] done before
      Negate, solve and run: [X > Y] ? test(1,0). Collect path condition: [X > Y & X == X] L1
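
The dynamic half of this slide (run, collect the path condition, negate, solve, run again) can be sketched as a small loop. This is an illustration under stated assumptions, not how DART or SPF is built: `ConcolicMax`, `Cond`, `Branch`, `run`, `solve` and `explore` are hypothetical names, the instrumentation is written by hand, and a brute-force search over a small input range stands in for the constraint solver. The concrete inputs it picks differ from those on the slide, but the three paths are the same.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ConcolicMax {
    interface Cond { boolean test(int x, int y); }

    // One branch decision, expressed as a constraint over the symbolic inputs X and Y.
    static final class Branch {
        final String label;
        final Cond cond;
        Branch(String label, Cond cond) { this.label = label; this.cond = cond; }
    }

    // Hand-instrumented test(x, y): runs concretely and records the path condition.
    static String run(int x, int y, List<Branch> trace) {
        int z;
        boolean zIsX;  // tracks the symbolic value of z: X or Y
        if (x > y) { trace.add(new Branch("X > Y", (a, b) -> a > b)); z = x; zIsX = true; }
        else { trace.add(new Branch("X <= Y", (a, b) -> a <= b)); z = y; zIsX = false; }
        if (z == x) {
            trace.add(zIsX ? new Branch("X == X", (a, b) -> true)
                           : new Branch("Y == X", (a, b) -> b == a));
            return "L1";
        }
        // z != x can only happen when z holds Y
        trace.add(new Branch("Y != X", (a, b) -> b != a));
        return "L2";
    }

    // Stand-in solver: keep trace[0..i-1], negate trace[i], search a small input range.
    static int[] solve(List<Branch> trace, int i) {
        for (int x = -2; x <= 2; x++)
            for (int y = -2; y <= 2; y++) {
                boolean ok = !trace.get(i).cond.test(x, y);
                for (int j = 0; j < i && ok; j++) ok = trace.get(j).cond.test(x, y);
                if (ok) return new int[] { x, y };
            }
        return null;  // negated prefix has no solution in range (e.g. X != X)
    }

    static Set<String> explore(int x0, int y0) {
        Set<String> paths = new LinkedHashSet<>();
        Deque<int[]> work = new ArrayDeque<>();
        work.add(new int[] { x0, y0 });
        while (!work.isEmpty()) {
            int[] in = work.poll();
            List<Branch> trace = new ArrayList<>();
            String location = run(in[0], in[1], trace);
            StringBuilder pc = new StringBuilder();
            for (Branch b : trace) pc.append(pc.length() == 0 ? "" : " & ").append(b.label);
            if (!paths.add("[" + pc + "] " + location)) continue;  // path done before
            for (int i = 0; i < trace.size(); i++) {
                int[] next = solve(trace, i);
                if (next != null) work.add(next);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        explore(0, 1).forEach(System.out::println);
    }
}
```

Starting from test(0,1), the loop stops after three distinct paths, because every further negation either has no solution (`X != X`) or leads to a path that was already explored.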

  44. Example: Why bother with Dynamic?

          native int max(int a, int b);
          void test(int x, int y) {
            int z = max(x,y);
            if (z == x) { // L1
            } else { // L2
            }
          }

      Classic:
      [true] test(X,Y) {x = X, y = Y}
      [true] max(X,Y) {a = X, b = Y}
      ????? (the native method cannot be executed symbolically)
      SPF has a “Symcrete” mode where we concretize on-the-fly. KLEE has a similar feature.

      Dynamic:
      Solve and run: [TRUE] ? test(0,1). Collect path condition: [Y != X] L2
      Negate, solve and run: [Y == X] ? test(0,0). Collect path condition: [Y == X] L1; [Y != X] done before

  45. Approximations

          public class DART {
            public static void test (int x, int y) {
              if (x*x*x > 0) {
                if (x > 0 && y == 10) abort (); // A
              } else {
                if (x > 0 && y == 20) abort (); // B
              }
            }
            public static void main (String[] args) {
              test (2, 9);
            }
          }

      PC1: x*x*x > 0 && x > 0 && y != 10
      Instead of using x = 2 everywhere, we replace x with 2 until the formula becomes linear
      Negated: 4x > 0 && x > 0 && y == 10
      Thus it can handle:

          public class DART {
            public static void test (int x, int y) {
              if (x*x*x > 0) {
                fine();
              } else {
                abort();
              }
            }
          }
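
The linearization step on this slide can be tried out directly. This is a minimal sketch under stated assumptions: `DartApprox` and `solveLinearized` are hypothetical names, `test` returns a label instead of calling `abort()`, and a brute-force search over a small range replaces the linear solver. Because the simplified formula is only an approximation of the real path condition, the input it produces is checked by running the original method.

```java
final class DartApprox {
    // The method under test from the slide, returning a label instead of calling abort().
    static String test(int x, int y) {
        if (x * x * x > 0) {
            if (x > 0 && y == 10) return "A";
        } else {
            if (x > 0 && y == 20) return "B";
        }
        return "ok";
    }

    // Solves the linearized, negated path condition  c*x > 0 && x > 0 && y == 10,
    // where c = concreteX * concreteX replaces the x*x factor of x*x*x.
    static int[] solveLinearized(int concreteX) {
        int c = concreteX * concreteX;
        for (int x = -20; x <= 20; x++)
            for (int y = -20; y <= 20; y++)
                if (c * x > 0 && x > 0 && y == 10) return new int[] { x, y };
        return null;
    }

    public static void main(String[] args) {
        System.out.println("first run:  " + test(2, 9));
        int[] next = solveLinearized(2);  // concrete x from the first run
        // Validate the approximate solution by executing the real code.
        System.out.println("second run: " + test(next[0], next[1]));
    }
}
```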

  46. Models: it makes (D)SE tick

          int max(int a, int b) {
            if (a > b) return a;
            else return b;
          }
          void test(int x, int y) {
            int z = max(x,y);
            if (z == x) { // L1
            } else { // L2
            }
          }

      Model/Summary for max(a,b): ((a > b) & (RET == a)) || ((a <= b) & (RET == b))
      Solve and run: [TRUE] ? test(0,1)
      PC collected: [(((X > Y) & (Z == X)) || ((X <= Y) & (Z == Y))) & Z != X] L2 [X <= Y & Y != X]
      Negated, solve and run: [X <= Y & Y == X] ? test(0,0)
      PC collected: [(((X > Y) & (Z == X)) || ((X <= Y) & (Z == Y))) & Z == X] L1 [X <= Y & X == Y]
      Done after only 2 paths!

  47. Models Implementations

      Properties file:
      deepsea.config.dump = true
      deepsea.target = examples.simple.MaxChoice3
      deepsea.args =
      deepsea.triggers = examples.simple.MaxChoice3.compute(X:int,Y:int,Z:int)
      deepsea.delegates = java.lang.Math:za.ac.sun.cs.deepsea.models.Math
      deepsea.produceoutput = false
      deepsea.explorer = za.ac.sun.cs.deepsea.explorer.DepthFirstExplorer
      green.log.level = OFF

      Model for max:

          public boolean max_II_I(Symbolizer symbolizer) {
            SymbolicFrame frame = symbolizer.getTopFrame();
            Expression arg0 = frame.pop();
            Expression arg1 = frame.pop();
            Expression var = new IntVariable(Symbolizer.getNewVariableName(), -1000, 1000);
            Expression pc = new Operation(Operator.OR,
                new Operation(Operator.AND,
                    new Operation(Operator.GE, arg0, arg1),
                    new Operation(Operator.EQ, arg0, var)),
                new Operation(Operator.AND,
                    new Operation(Operator.LT, arg0, arg1),
                    new Operation(Operator.EQ, arg1, var)));
            symbolizer.pushExtraConjunct(pc);
            frame.push(var);
            return true;
          }

  48. Approaches How is DSE implemented? • Instrument the source code • Instrument the bytecode • Instrument/Change the machine Screenshot from https://github.com/ksluckow/awesome-symbolic-execution

  49. DEEPSEA High Level View [Figure: JVM vs DEEPSEA; a test environment (JVM running tests and code) next to a DEEPSEA environment (JVM attached to the code via JPDA), annotated "TOO SLOW"]
      What is DEEPSEA • Dynamic Symbolic Execution • Uses Green for constraint solving • Multiple back-end solvers • Caching of constraint results • No instrumentation • Uses Java Platform Debugger Interface • Parallelization for performance • Summaries to reduce path explosion
      Potential Benefits • Better behavioral coverage • Symbolic rather than concrete inputs • High fidelity • Code under test is untouched • Easier to integrate with existing systems • Code runs the same, only the debugger interface is attached • Works for any version of Java • Supports all bytecodes

  50. COASTAL Overview [Figure: architecture diagram with the labels main(...), SUT, InstrumentationClassLoader, Instrumenter, SUT/Math/System .class files, SymbolicState (frames, instanceData, path condition A ∧ B ∧ ...), new values, PathTree, and Strategy (DepthFirstStrategy, BreadthFirstStrategy, RandomStrategy)]
