Automatically Verifying Concurrent Queue Algorithms. Eran Yahav, Mooly Sagiv. School of Computer Science, Tel-Aviv University. {yahave,msagiv}@post.tau.ac.il http://www.cs.tau.ac.il/~yahave
Automatically Verifying Partial Correctness of Software using Abstract Interpretation • Operates on the program source • Fully automatic • Conservative results • No errors are reported ⇒ partial correctness is guaranteed • But may produce “false alarms” • These make the results less useful • The challenge: avoid false alarms
Concurrent Queues • A common component of concurrent systems • Operating systems • A large number of suggested algorithms • Hard to get right • [Stone90] – races + items may be lost • [Valois94] – items may be lost • … • Mostly given without formal proof of correctness
Automatically Verifying Concurrent Queue Algorithms? • Support the following • Concurrency • Dynamic allocation/deallocation of objects • Destructive updates • Heap references • Unbounded storage (heap) • Dynamic allocation/deallocation of threads • Handling references with sufficient precision for establishing correctness properties • Preceding pointer-analysis phase usually insufficient
Example [Michael & Scott PODC96]
public void enqueue(Object value) {
e_1    node = new QueueItem()              // allocate queue node
e_2    node.val = value                    // copy enqueued value into node
e_3    node.next.ref = NULL
e_4    while(true) {                       // keep trying until done
e_5      tail = this.Tail                  // get Tail.ptr and Tail.count
e_6      next = tail.ref.next              // get next ptr and count
e_7      if (tail == this.Tail) {          // are tails consistent?
e_8        if (next.ref == NULL) {         // was tail pointing to last node?
e_9          if CAS(tail.ref.next, next, <node, next.count+1>) {   // try connect
e_10           break                       // enqueue is done, exit loop
e_11         }
e_12       } else {                        // tail wasn’t pointing to last node
e_13         CAS(this.Tail, tail, <next.ref, tail.count+1>)        // try advance tail
e_14       }
e_15     }
e_16   }
e_17   CAS(this.Tail, tail, <node, tail.count+1>)                  // enqueue done, try swing tail
e_18 }
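The enqueue loop above can be sketched in Python for experimentation. This is a minimal sketch under loud assumptions, not the algorithm as published: the `_cas` helper emulates compare-and-swap with a lock, the modification counters (`count`) are dropped because the sketch relies on Python's garbage collector instead of node reuse, and the names `Queue`, `_cas` and `values` are hypothetical.

```python
import threading


class QueueItem:
    """A list node; field names follow the slide (val, next)."""
    def __init__(self, val=None):
        self.val = val
        self.next = None


class Queue:
    def __init__(self):
        dummy = QueueItem()            # the list always holds a dummy first node
        self.head = dummy
        self.tail = dummy
        self._lock = threading.Lock()  # ASSUMPTION: stands in for hardware CAS atomicity

    def _cas(self, obj, attr, expected, new):
        """Emulated compare-and-swap on obj.attr (hypothetical helper)."""
        with self._lock:
            if getattr(obj, attr) is expected:
                setattr(obj, attr, new)
                return True
            return False

    def enqueue(self, value):
        node = QueueItem(value)                          # e_1 .. e_3
        while True:                                      # e_4
            tail = self.tail                             # e_5
            nxt = tail.next                              # e_6
            if tail is self.tail:                        # e_7
                if nxt is None:                          # e_8
                    if self._cas(tail, "next", None, node):   # e_9: try connect
                        break                            # e_10
                else:
                    self._cas(self, "tail", tail, nxt)   # e_13: help advance tail
        self._cas(self, "tail", tail, node)              # e_17: try swing tail

    def values(self):
        """Walk the list from the dummy node and collect the stored values."""
        out, cur = [], self.head.next
        while cur is not None:
            out.append(cur.val)
            cur = cur.next
        return out


# Four threads enqueue 50 distinct values each.
q = Queue()

def worker(base):
    for i in range(50):
        q.enqueue(base + i)

threads = [threading.Thread(target=worker, args=(k * 100,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Running many threads against such a sketch can only find bugs on the schedules that happen to occur; the point of the talk is to establish the properties for all schedules.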
Correctness • P1: The linked list is always connected • P2: Nodes are only inserted after the last node of the linked list • P3: Nodes are only deleted from the beginning of the linked list • P4: Head always points to the first node in the linked list • P5: Tail always points to a node in the linked list
Rich Problem ⇒ Expressive Formalism • We use first-order logic with transitive closure • Naturally defines the behavior of heap-manipulating programs • Heap references • Heap reachability • Threads as heap-allocated objects (and scheduling) • Can also model integers
Plan • Vanilla verification attempt • Program configurations • Expressing safety properties • Abstraction • Refining the vanilla solution • Instrumentation predicates • Prototype implementation (TVLA/3VMC)
Configurations • A program configuration encodes • global store • program-location of every thread • status of locks and threads • First-order logical structures used to represent program configurations
Configurations as First-order Logical Structures • First-order structure • Objects – individuals • Properties of objects – unary predicates • Relationships between objects – binary predicates • Additional integrity constraints – FO formulas • Object • Type – unary predicates • References between objects – binary predicates • Thread • Program location – unary predicates • Integers • Distinguished zero individual – unary predicate • Successor relationship – binary predicate • With integrity constraints corresponding to Peano axioms
Concrete Configuration [Figure: concrete configuration drawn as a graph. Node and edge labels: at[e_2], rv[this], rv[node], rv[value], rv[Head], rv[Tail], rv[next], iv[Head], iv[Tail], iv[next], zero, succ]
Configurations • Predicates model properties of interest • eq(v1,v2) • is_T(v) • { at[lab](t) : lab ∈ Labels } • { rv[fld](o1,o2) : fld ∈ Fields } • { iv[fld](o1,o2) : fld ∈ Fields } • heldBy(l,t), blocked(t,l), waiting(t,l) • zero(v), succ(v1,v2) • Can use the framework with different predicates
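A configuration of this kind can be held in a small data structure. The following is a minimal sketch, assuming a hypothetical `Structure` class and a hypothetical type predicate name `is_thread`; the individuals `q`, `n0`, `n1` and `t1` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Structure:
    """A 2-valued first-order structure: individuals plus predicate tables."""
    individuals: set = field(default_factory=set)
    unary: dict = field(default_factory=dict)    # predicate name -> set of individuals
    binary: dict = field(default_factory=dict)   # predicate name -> set of (src, dst) pairs

    def holds1(self, pred, v):
        """Value of a unary predicate on individual v."""
        return v in self.unary.get(pred, set())

    def holds2(self, pred, v1, v2):
        """Value of a binary predicate on the pair (v1, v2)."""
        return (v1, v2) in self.binary.get(pred, set())


# One queue object q, two list nodes, and one thread t1 at label e_2.
conf = Structure(
    individuals={"q", "n0", "n1", "t1"},
    unary={"is_thread": {"t1"}, "at[e_2]": {"t1"}},
    binary={
        "rv[Head]": {("q", "n0")},
        "rv[Tail]": {("q", "n1")},
        "rv[next]": {("n0", "n1")},
        "rv[this]": {("t1", "q")},
    },
)
```

Swapping in a different predicate vocabulary only changes the dictionary keys, which mirrors the last bullet: the framework itself is independent of the chosen predicates.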
Tail Reachable from Head [Figure: the concrete configuration with the nodes vh and vt marked] ∀q:nbq, vt. rv[Tail](q,vt) → ∃vh. rv[Head](q,vh) ∧ rv[next]*(vh, vt)
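On a concrete configuration the property can be evaluated directly, with rv[next]* computed as graph reachability. A minimal sketch, assuming relations are stored as sets of pairs; the function names and the `good` and `bad` examples are hypothetical.

```python
def reachable(edges, src, dst):
    """Reflexive transitive closure: is dst reachable from src along edges?"""
    seen, work = set(), [src]
    while work:
        v = work.pop()
        if v == dst:
            return True
        if v in seen:
            continue
        seen.add(v)
        work.extend(d for (s, d) in edges if s == v)
    return False


def tail_reachable_from_head(queues, head, tail, nxt):
    """For every queue q and every vt with rv[Tail](q, vt), some vh with
    rv[Head](q, vh) must reach vt along rv[next] edges."""
    for q in queues:
        for (q1, vt) in tail:
            if q1 != q:
                continue
            if not any(reachable(nxt, vh, vt) for (q2, vh) in head if q2 == q):
                return False
    return True


# A connected list n0 -> n1 -> n2, and a broken one where n2 is cut off.
good = dict(head={("q", "n0")}, tail={("q", "n2")}, nxt={("n0", "n1"), ("n1", "n2")})
bad = dict(head={("q", "n0")}, tail={("q", "n2")}, nxt={("n0", "n1")})
```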
Abstract Program Model • Conservative representation of the concrete model • Use 3-valued logical structures to conservatively represent multiple 2-valued structures • 1 = true • 0 = false • 1/2 = unknown • A join semi-lattice, 0 ⊔ 1 = 1/2 • Conservatively apply actions on abstract configurations
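The three truth values and their join can be sketched with Kleene's 3-valued connectives. This is an illustrative sketch: encoding the values as fractions and the names `k_and`, `k_or`, `k_not` and `join` are assumptions, not part of the original slides.

```python
from fractions import Fraction

TRUE, FALSE, UNKNOWN = Fraction(1), Fraction(0), Fraction(1, 2)


def k_and(a, b):
    """Kleene conjunction: the minimum of the two values."""
    return min(a, b)


def k_or(a, b):
    """Kleene disjunction: the maximum of the two values."""
    return max(a, b)


def k_not(a):
    """Kleene negation: 1 - a, so unknown stays unknown."""
    return 1 - a


def join(a, b):
    """Information-order join: equal values are kept, differing values become 1/2."""
    return a if a == b else UNKNOWN
```

A definite answer survives whenever one operand decides the result, for example `k_and(UNKNOWN, FALSE)` is still false; only genuinely undetermined formulas evaluate to 1/2, and those are the source of potential false alarms.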
Concrete Configuration [Figure: concrete configuration drawn as a graph. Node and edge labels: at[e_2], rv[this], rv[node], rv[value], rv[Head], rv[Tail], rv[next], iv[Head], iv[Tail], iv[next], zero, succ]
Abstract Configuration [Figure: abstract configuration obtained from the concrete one. Node and edge labels: at[e_2], rv[this], rv[node], rv[value], rv[Head], rv[Tail], rv[next], iv[Head], iv[Tail], iv[next], zero, succ]
Canonical Abstraction • Merge all nodes with the same unary predicate values into a single summary node • Join predicate values • Converts a configuration of unbounded size into a 3-valued abstract configuration of bounded size
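The merge-and-join step can be sketched as follows. This is a simplified illustration, not TVLA's implementation: it abstracts binary predicates only, and the unary predicate names `first` and `last` in the example are hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def canonical_abstraction(individuals, unary, binary):
    """Group individuals by the set of unary predicates they satisfy, then
    join the binary predicate values between the resulting groups."""
    preds = sorted(unary)
    members = {}
    for v in individuals:
        # The canonical name of v is the set of unary predicates that hold for it.
        name = frozenset(p for p in preds if v in unary[p])
        members.setdefault(name, set()).add(v)
    abs_binary = {}
    for pred, pairs in binary.items():
        table = {}
        for a, b in product(members, repeat=2):
            values = {Fraction(int((x, y) in pairs))
                      for x in members[a] for y in members[b]}
            # All concrete pairs agree: keep the value. Otherwise: unknown.
            table[(a, b)] = values.pop() if len(values) == 1 else HALF
        abs_binary[pred] = table
    summary = {name for name, vs in members.items() if len(vs) > 1}
    return members, abs_binary, summary


# A four-node list n0 -> n1 -> n2 -> n3 with only the end points distinguished.
individuals = ["n0", "n1", "n2", "n3"]
unary = {"first": {"n0"}, "last": {"n3"}}
binary = {"rv[next]": {("n0", "n1"), ("n1", "n2"), ("n2", "n3")}}
members, abs_binary, summary = canonical_abstraction(individuals, unary, binary)
```

The two interior nodes collapse into one summary node, and every rv[next] edge touching it becomes 1/2. The number of abstract nodes is bounded by the number of distinct unary predicate valuations, regardless of the list length.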
Concrete Bad Configuration [Figure: concrete configuration drawn as a graph. Node and edge labels: at[e_2], rv[this], rv[node], rv[value], rv[Head], rv[Tail], rv[next], iv[Head], iv[Tail], iv[next], zero, succ] ∀q:nbq, vt. rv[Tail](q,vt) → ∃vh. rv[Head](q,vh) ∧ rv[next]*(vh, vt)
Abstract Configuration [Figure: abstract configuration drawn as a graph. Node and edge labels: at[e_2], rv[this], rv[node], rv[value], rv[Head], rv[Tail], rv[next], iv[Head], iv[Tail], iv[next], zero, succ] ∀q:nbq, vt. rv[Tail](q,vt) → ∃vh. rv[Head](q,vh) ∧ rv[next]*(vh, vt)
Instrumentation • Refine the abstraction by recording additional information • Natural idea – record which property-formulae hold via nullary predicates • Corresponds to predicate abstraction • More generally – record subformulae that hold for an individual via unary predicates • Obtain (some) useful results without changing set of predicates per program/property
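Recording a subformula as a unary predicate can be sketched as below. The definitions are assumptions read off the predicate names on the slides: r_by[fld](v) is taken to mean "v is the target of some fld reference", and rt[Head,n](v) to mean "v is reachable from a Head target along next references". The function name `instrument` and the example structure are hypothetical.

```python
def instrument(binary):
    """Derive unary instrumentation predicates from binary rv[...] predicates.
    ASSUMPTION: every key of `binary` has the form "rv[<field>]"."""
    unary = {}
    # r_by[fld](v): some object references v through field fld.
    for pred, pairs in binary.items():
        fld = pred[len("rv["):-1]
        unary[f"r_by[{fld}]"] = {dst for (_, dst) in pairs}
    # rt[Head,n](v): v is reachable from a Head target via rv[next]*.
    nxt = binary.get("rv[next]", set())
    seen, work = set(), [dst for (_, dst) in binary.get("rv[Head]", set())]
    while work:
        v = work.pop()
        if v not in seen:
            seen.add(v)
            work.extend(d for (s, d) in nxt if s == v)
    unary["rt[Head,n]"] = seen
    return unary


binary = {
    "rv[Head]": {("q", "n0")},
    "rv[Tail]": {("q", "n2")},
    "rv[next]": {("n0", "n1"), ("n1", "n2")},
}
u = instrument(binary)
```

Because canonical abstraction merges only nodes that agree on all unary predicates, a value recorded this way stays definite on every abstract node, even where the underlying rv[next] edges have become 1/2.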
Instrumented Concrete Configuration [Figure: the concrete configuration annotated with instrumentation predicates. Labels in addition to the core predicates: r_by[value], rt[value,n], r_by[node], rt[node,n], r_by[this], rt[this,n], r_by[Head], rt[Head,n], r_by[next], r_by[Tail], rt[Tail,n], is[this], i_by[…]]
Instrumented Abstract Configuration [Figure: the abstract configuration annotated with instrumentation predicates. Labels in addition to the core predicates: r_by[value], rt[value,n], r_by[node], rt[node,n], r_by[this], rt[this,n], r_by[Head], rt[Head,n], r_by[next], r_by[Tail], rt[Tail,n], is[this], i_by[…]] ∀q:nbq, vt. rv[Tail](q,vt) → ∃vh. rv[Head](q,vh) ∧ rv[next]*(vh, vt) • ∀q:nbq, v. rv[Tail](q,v) → rt[Head,n](v)
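Best Conservative Interpretation [Diagram: Concretization maps an abstract representation to concrete representations; the operational semantics [|S|] maps a concrete representation to a concrete representation; Abstraction maps the result back to an abstract representation. The abstract semantics [|S|] maps an abstract representation to an abstract representation directly.]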
Abstract Interpretation - Concretization • Statement: e_2 node.val = value [Figure: instrumented abstract configuration before the statement, with a thread at[e_2]]
Abstract Interpretation - Concretization • Statement: e_2 node.val = value [Figure: several configurations, each showing a thread at[e_2] with rv[node] and rv[value] references. Instrumentation labels: r_by[value], rt[value,n], r_by[node], rt[node,n]]
Abstract Interpretation - update • Statement: e_2 node.val = value [Figure: configurations before and after the update; updated threads are at[e_3] and the updated node carries rv[val]. Instrumentation labels: r_by[value], rt[value,n], r_by[val], rt[val,n], r_by[node], rt[node,n], exists[val]]
Abstract Interpretation - abstraction [Figure: the updated configurations after abstraction, with threads at[e_2] and at[e_3]. Labels: rv[value], rv[val], rv[node], r_by[value], rt[value,n], r_by[val], rt[val,n], r_by[node], rt[node,n], exists[val]]
Prototype Implementation • TVLA/3VMC • focus • coerce • Limitations • only intraprocedural • no optimizations used • No partial-order reduction
Conclusions • Common challenges of model checking and abstract interpretation • False alarms • Cost • Scalability • Size • Language features
Summary • Verified interesting safety properties of concurrent queues • Unbounded number of objects and threads • Dynamic allocation of objects and threads
The End http://www.cs.tau.ac.il/~yahave
Integer Representation - Concrete [Figure: integers as a zero node followed by succ edges, with iv[x] and iv[y] pointing into the chain. Before y++, the test x == y ? answers yes; after y++, it answers no]
Integer Representation – No Instrumentation [Figure: concrete zero/succ chain with iv[x] and iv[y]. Before y++, the test x == y ? answers yes; after y++, it answers no]
Integer Representation – No Instrumentation [Figure: abstract zero/succ chain with iv[x] and iv[y]. Both before and after y++, the test x == y ? answers maybe]
Integer Representation – With Instrumentation [Figure: concrete zero/succ chain with iv[x], iv[y] and the instrumentation predicates i_by[x], i_by[y]. Before y++, the test x == y ? answers yes; after y++, it answers no]
Integer Representation – With Instrumentation [Figure: abstract zero/succ chain with iv[x], iv[y] and the instrumentation predicates i_by[x], i_by[y]. Before y++, the test x == y ? answers yes; after y++, it answers no]
References • [Stone90] J.M. Stone. A Simple and Correct Shared-Queue Algorithm Using Compare-and-Swap. In Proceedings of Supercomputing ’90, November 1990. • [Valois94] J.D. Valois. Implementing Lock-Free Queues. In Seventh International Conference on Parallel and Distributed Computing Systems, Las Vegas, NV, October 1994.
Structural Operational Semantics - actions • An action consists of: • precondition formula • update formulae • Precondition formula may use a free variable ts for “currently scheduled” thread • Semantics is non-deterministic
Structural Operational Semantics - actions • lock(v) • precondition: ¬∃t ≠ ts: rval[v](ts,l) ∧ held_by(l,t) • predicate update: held_by’(l1,t1) = held_by(l1,t1) ∨ (l1 = l ∧ t1 = ts) • blocked’(t1,l1) = blocked(t1,l1) ∧ ((l1 ≠ l) ∨ (t1 ≠ ts))
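On a concrete (2-valued) configuration, the lock action can be sketched with relations stored as sets of pairs. The sketch assumes the lock object l has already been resolved from the thread's variable v (the rval[v](ts,l) part of the precondition); the function names are hypothetical.

```python
def lock_enabled(ts, l, held_by):
    """Precondition: no thread other than ts holds lock l.
    held_by is a set of (lock, thread) pairs."""
    return not any(l1 == l and t != ts for (l1, t) in held_by)


def apply_lock(ts, l, held_by, blocked):
    """Update formulae: held_by gains (l, ts); blocked loses (ts, l).
    blocked is a set of (thread, lock) pairs. Inputs are not mutated."""
    new_held = held_by | {(l, ts)}
    new_blocked = {(t1, l1) for (t1, l1) in blocked if l1 != l or t1 != ts}
    return new_held, new_blocked


# Thread t1 acquires lock L while t2 remains blocked on it.
held, blocked = apply_lock("t1", "L", set(), {("t1", "L"), ("t2", "L")})
```

In the abstract setting the same formulae are evaluated in 3-valued logic, so the precondition itself may evaluate to 1/2 and the action is then applied conservatively.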
State Space Exploration
Initialize(C0) {
  for each C ∈ C0
    push(stack, C)
}
explore() {
  while stack is not empty {
    C = pop(stack)
    if not member(C, stateSpace) {
      verify(C)
      stateSpace = stateSpace ∪ {C}
      for each action ac
        for each C’ such that C ⇒_ac C’
          push(stack, C’)
    }
  }
}
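The exploration loop can be sketched as a runnable worklist algorithm. This is a generic sketch, not the TVLA/3VMC code: configurations are assumed to be hashable values, each action is assumed to be a function returning the successors of a configuration, and the toy counter system is made up for illustration.

```python
def explore(initial, actions, verify):
    """Depth-first exploration of all configurations reachable from `initial`.
    `verify` is called once on every newly discovered configuration."""
    stack = list(initial)
    state_space = set()
    while stack:
        c = stack.pop()
        if c not in state_space:
            verify(c)                      # check the safety property on c
            state_space.add(c)
            for action in actions:
                stack.extend(action(c))    # push every successor under this action
    return state_space


# Toy system: a counter modulo 5 with an increment action and a doubling action.
toy_actions = [lambda c: [(c + 1) % 5], lambda c: [(c * 2) % 5]]
visited = []
states = explore({0}, toy_actions, visited.append)
```

Termination depends on the state space being finite, which is what the bounded abstract configurations provide even when the concrete program allocates unboundedly many objects and threads.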
Unbounded Number of Threads • Exploit state-space symmetry • Previous work defined symmetry between process names (indices) • Thread location = thread property • canonical names = symmetry between properties