Regression Verification for Multi-Threaded Programs

Regression Verification for Multi-Threaded Programs Sagar Chaki, SEI-Pittsburgh Arie Gurfinkel, SEI-Pittsburgh Ofer Strichman, Technion-Haifa Originally presented at VMCAI 2012 VSSE 2012

Regression Verification • Formal equivalence checking of two similar programs • Three selling points: • Specification: • not needed • Complexity: • dominated by the change, not by the size of the programs. • Invariants: • Easy to derive automatically high-quality loop/recursion invariants • Tool support: • RVT (Technion) • SymDiff (MSR)

Possible applications of regression verification • Understand how changes propagate through the API • Validate refactoring • Semantic Impact Analysis • GenericTranslation Validation • Proving Determinism • P is equivalent to itself ,P is deterministic • many more…

Definition: Partial Equivalence Let f1 and f2 be functions with equivalent I/O signature f1 and f2are partially equivalent, executions of f1 and f2 on equal inputs …which terminate, result in equal outputs. Undecidable

Multi-Threaded Programs (MTPs) Finite set of root functions : P = f1 || … || fn Each executed in a separate thread Communicate via shared variables is P partially equivalent to itself? …no! shared variable int base = 1; void f1(int n, int *r) { if (n < 1) *r = base; else { int t; f1(n-1, &t); *r = n * t; } } void f2() { base = 2; } thread thread P Computes n! or 2*n!

Towards partial equivalence of MTPs P is a multi-threaded program (P): the set of terminating computations of P R(P) = { (in, out) | ∃ ∈ (P).  begins in in and ends in out} Example: R (P) = {(n, n!), (n, 2n!) | n∈Z} int base = 1; void f1(int n, int *r) { if (n < 1) *r = base; else { int t; f1(n-1, &t); *r = n * t; } } void f2() { base = 2; } P

Partial equivalence of MTPs MTPs P1, P2 arepartially equivalent if R(P1) = R(P2). Denoted p.e.n. (P1, P2) Partial Equivalence of Nondeterministic programs Claim: 8P.²p.e.n.(P,P)

Towards a sound proof rulefor partial equivalence of MTPs • The sequential case • Challenges in extending it to p.e.n. • A proof rule

First: The sequential case [GS’08] Loops/recursion are not a big problem in practice: Recursive calls are replaced with calls to the same uninterpreted function: Abstracts the callees By construction assumes the callees are partially equivalent Sound by induction f1 f2 isolation = UF UF f1 f2 = 9

First: The sequential case [GS’08] • Now we have two isolated functions f1, f2 • Checking partial equivalence is decidable: in1 = in2 = *; o1 = f1(in); o2 = f2(in); assert (o1==o2) • The algorithm traverses up the call graphs while abstracting equal callees. • How can this method be extended to proving p.e.n(f1,f2) ?

What affects partial equivalence of MTPs? Before: in= 1 o1= 1, o2 = 0; (1, h1, 0i) 2 R(p) x1 = x2 = 0 f1() { o1 = x1; o2 = x2; } f2(int in) { x1 = in; x2 = in; } x2 = in; x1 = in; || Swap write order After: in= 1 o1= 1)o2= 1 (1, h1, 0i) R(p) 11

After: in1= 1, in2 = 2 o2 = 1) x1 = 1 < t2 = x1) t1 = 0 ) o1 = 0 (h1,2i, h2,1i)  R(p) What affects partial equivalence of MTPs? x1 = x2 = 0 f1(int in1) { x1 = in1; t1 = x2; o1 = t1; } f2(int in2) { t2 = x1; x2 = in2; o2 = t2; } t1 = x2; x1 = in1; || Swap R/W order Before: in1 = 1, in2 = 2 x1 = 1; t2 = x1 = 1; x2 = in2 = 2; t1 = x2 = 2; o1 = t1 = 2; o2 = t2 = 1; (h1,2i, h2,1i) 2R(p) 12

Before: in= 1 o1= 0, o2 = 1; (1, h0, 1i) 2 R(p) What affects partial equivalence of MTPs? x1 = x2 = 0 f1() { o1 = x1; o2 = x2; } f2(int in) { x1 = in; x2 = in; } o2 = x2; o1 = x1; || Swap read order After: in = 1 o1 = 0)o2 = 0 (1, h0, 1i) R(p) 13

Preprocessing Loops) recursive functions Mutual recursion) simple recursion Non-recursive functions) inlined Function’s return value) parameter Each shared variable x only appears in t = xorx = exp (add auxiliaries) Each auxiliary variable is read once int base = 1; void f1(int n, int *r) { if (n < 1) *r = base; else { int t; f1(n-1, &t); *r = n * t; } } void f2() { base = 2; } P

Mapping • Assume each function is used in a single thread. • Otherwise, duplicate it • Find a mapping between the non-basic types • Find a bijective map between: • threads • shared variables • functions (same proto-type), • in mapped functions: read globals, written-to globals • Without such a mapping: goto end-of-talk.

The Observable Stream • Consider a function f and input in • The observable stream of f(in)’s run is its sequence of • function calls • read/write of shared variables • Example: let x be a shared variable: x x x x

Observable Equivalence of Functions f, f’are observably equivalent,8in. f(in), f’(in)have equal sets of finite observable streams Denote by observe-equiv(f,f’) • Assume: outputs are defined via shared variables • Hence: observable equivalence ) partial equivalence

Observable Equivalence of Functions • We prove observe-equiv(f,f’) by • isolating f,f’. • proving observ. equivalence of all (isolated) pairs • Given f, we build an isolated version [f], in which: • Function calls are replaced with calls to uninterpreted functions • Shared variables are read from an “input” stream • The observable stream is recorded

Reading from an input stream… • For each shared variable x: R(x) ÃR(*) // read R(x) W(x) g(…) W(x) R(x) R(x) R(*) R(*) R(*)

Reading from an input stream… R(x) local W(x) g(…) W(x) R(x) R(x) • Enforce: equal locations)same ‘*’ value • … if their streams up to that point were equal. R(*1) R(*1) R(x) W(x) W(x) local g(…) R(x) local R(x) R(*2) R(*2) R(*3) R(*3)

Summary: from f to [f] Transform function f and f’ to [f] and [f’] by: Function calls are replaced with calls to UFs t = g() ) t = UFg() For (g,g’) 2 map, UFg = UFg’ Shared variables are read from an “input” stream: t = x ) t = UFx(loc) // loc is the location in the stream For (x,x’)2 map, Ufx = Ufx’ The Observable Stream is recorded.

Example: from f to [f] Observable stream Output treated as shared list out; void [f] (int n, int *r) { int t, loc = 0; if (n < 1) { t = UFbase(loc); out += (R,”base”); loc++; out += (W,”r”,t); loc++;} else { t = UFf,t(n-1); out += (C,f,n-1); out += (W,”r”, n * t); loc++; } } int base = 1; void f (int n, int *r) { int t; if (n < 1) { t = base; *r = t;} else { f(n-1,&t); *r = n * t; } } read write function call write f [f]

Checking observable equivalence of [f],[f’] Generate sequential program S: in1 = in2 = *; [f(in1)];[f’(in2)]; rename ([f’].out); // according to map assert([f].out == [f’].out); Validity of S is decidable f [f] S CBMC f’ [f’]

AProof Rule Compositional: check one function pair at a time Sequential: no thread composition General: supports loops/recursion + arbitrary # of threads 8f, f’ 2map. observe-equiv([f], [f’]) p.e.n. (P1, P2)

Summary and Future Thoughts • We suggested foundations of regression verification for MTPs • Notion of partial equivalence of multi-threaded programs • A proof rule • Challenges ahead: • Synchronization primitives: Locks, semaphores, atomic blocks • Dynamic thread creation • Make more complete

Regression Verification for Multi-Threaded Programs

Regression Verification for Multi-Threaded Programs

Presentation Transcript

Multi-threaded RTOS

Thread-specific Heaps for Multi-threaded Programs

Multi-threaded Active Objects

Multi-threaded Active Objects

Multi-threaded applications

Regression-Verification

Regression Verification: Proving the equivalence of similar programs

Multi-threaded Reachability

Regression-Verification

Multi Threaded Chat Server

Multi-threaded Reachability

Dynamic Data-Race Detection in Lock-Based Multi-Threaded Programs

Multi-Threaded Transactions

Distributed Verification of Multi-threaded C++ Programs

Parallelism (Multi-threaded)

Multi-threaded RTOS

Multi-Threaded Video Rendering

Multi-threaded ROOT

Regression Verification: Proving the equivalence of similar programs