Applications of TVLA
Mooly Sagiv, Tel Aviv University
Shape Analysis with Applications
http://www.cs.tau.ac.il/~rumster/TVLA/

Outline
• Issues
  • Complexity of TVLA
  • Weak vs. Strong Updates
• Cleanness
  • Null dereferences
  • Memory leaks
  • Freed storage (homework)
• The concurrent modification problem
• Partial correctness
  • Sorting
  • GC
• Total correctness
• Flow dependences
• Multithreading
• Other

Complexity of Shape Analysis
    x = malloc()
    if (…) y1 = x
    if (…) y2 = x
    if (…) y3 = x
    if (…) yn = x

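The program above is a worst case because each branch may or may not execute, so the set of y-variables that alias x at exit can be any subset of {y1, …, yn}. The following is a minimal sketch that enumerates those final alias sets; the names AliasBlowup and finalAliasSets are illustrative and not part of TVLA.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: enumerates the possible final alias sets of
//   x = malloc(); if (...) y1 = x; ... if (...) yn = x;
// Every branch is taken or skipped independently.
class AliasBlowup {
    // Returns every possible set of indices i such that yi == x at exit.
    static Set<Set<Integer>> finalAliasSets(int n) {
        Set<Set<Integer>> states = new HashSet<>();
        states.add(new HashSet<>());              // before the first branch no yi aliases x
        for (int i = 1; i <= n; i++) {
            Set<Set<Integer>> next = new HashSet<>();
            for (Set<Integer> s : states) {
                next.add(s);                      // branch skipped: state unchanged
                Set<Integer> taken = new HashSet<>(s);
                taken.add(i);                     // branch taken: yi now aliases x
                next.add(taken);
            }
            states = next;
        }
        return states;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println("n=" + n + " -> " + finalAliasSets(n).size() + " alias sets");
        }
    }
}
```

The count doubles with every extra branch (2, 4, 8, 16 for n = 1..4), so an analysis that keeps one structure per distinct alias set may have to track exponentially many structures.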
Complexity of TVLA analysis
• Maximal number of nodes in blurred structures
  • 3^|A|, where A is the set of unary abstraction predicates
• Size of 3-valued structure representation
• Action cost
  • Focus
  • Precondition
  • Coerce
  • New
  • Update
  • Coerce
  • Blur

Weak vs. Strong Updates
    if (…) x = y
    else x = z
    x->n = NULL

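To make the distinction concrete, here is a hedged sketch over a simple points-to abstraction (not TVLA's 3-valued structures). It assumes every abstract location stands for exactly one concrete cell; the names UpdateKinds, cellY, and cellZ are hypothetical. When x has a single possible target, the assignment to its n field overwrites the old value (strong update); when x may point to several cells, the new value can only be added to what was there (weak update).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative points-to sketch; assumes each abstract location is one concrete cell.
class UpdateKinds {
    // Abstract effect of "x->n = NULL" on the n field (location -> possible targets).
    static Map<String, Set<String>> assignNull(Set<String> pointsToX,
                                               Map<String, Set<String>> nField) {
        Map<String, Set<String>> out = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : nField.entrySet()) {
            out.put(e.getKey(), new TreeSet<>(e.getValue()));   // copy the old store
        }
        if (pointsToX.size() == 1) {
            // Strong update: the single target is definitely overwritten.
            Set<String> onlyNull = new TreeSet<>();
            onlyNull.add("NULL");
            out.put(pointsToX.iterator().next(), onlyNull);
        } else {
            // Weak update: each possible target may or may not have been written.
            for (String loc : pointsToX) {
                out.computeIfAbsent(loc, k -> new TreeSet<>()).add("NULL");
            }
        }
        return out;
    }

    static Set<String> setOf(String... xs) {
        return new TreeSet<>(Arrays.asList(xs));
    }

    static Map<String, Set<String>> initialHeap() {
        Map<String, Set<String>> n = new HashMap<>();
        n.put("cellY", setOf("b"));   // y's cell: n field points to b
        n.put("cellZ", setOf("d"));   // z's cell: n field points to d
        return n;
    }

    // After "if (...) x = y else x = z", x may point to either cell.
    static Map<String, Set<String>> weakDemo() {
        return assignNull(setOf("cellY", "cellZ"), initialHeap());
    }

    // If x definitely points to y's cell, the old target is dropped.
    static Map<String, Set<String>> strongDemo() {
        return assignNull(setOf("cellY"), initialHeap());
    }

    public static void main(String[] args) {
        System.out.println("weak:   " + weakDemo());
        System.out.println("strong: " + strongDemo());
    }
}
```

In the weak case both cells keep their old n target next to NULL, which is the precision loss that the slide's example illustrates.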
Detecting Incorrect Library Usages (J. Field, D. Goyal, G. Ramalingam, A. Warshavski)
• Java provides libraries for manipulating data structures
  • Collections
  • Lists
  • HashSet
  • …
• Iterators over collections allow sequential access
• Statically detect incorrect library usages

    Set s = worklist.unprocessedItems();
    for (Iterator i = s.iterator(); i.hasNext(); ) {
        Object item = i.next();
        if (...) processItem(item);
    }

The Concurrent Modification Problem
• Static analysis of Java programs manipulating Java 2 collections
• Inconsistent usages of iterators
  • An Iterator object i is defined on a collection object c
  • No use of i may be preceded by an update to the contents of c, unless the update was also made via i
• Guarantees order independence

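The rule can be observed at run time with the standard java.util collections. The sketch below is only an illustration of the dynamic behavior that the static analysis tries to rule out; the class and method names (CmeDemo, staleIteratorThrows, removeViaIteratorIsSafe) are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Illustrative only: shows when java.util's fail-fast iterators raise a CME.
class CmeDemo {
    // Uses an iterator after the collection was updated directly (not via the iterator).
    static boolean staleIteratorThrows() {
        List<String> c = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> i = c.iterator();
        i.next();
        c.add("d");                       // update to c that was not made via i
        try {
            i.next();                     // use of i after the update
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Uses an iterator after an update that was made through that same iterator.
    static boolean removeViaIteratorIsSafe() {
        List<String> c = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> i = c.iterator();
        i.next();
        i.remove();                       // update made via i itself
        try {
            i.next();
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale iterator throws CME: " + staleIteratorThrows());
        System.out.println("remove via iterator is safe: " + removeViaIteratorIsSafe());
    }
}
```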
Artificial Example
    Set v = new Set();
    Iterator i1 = v.iterator();
    Iterator i2 = v.iterator();
    Iterator i3 = i1;
    i1.next();
    i1.remove();
    if (...) { i2.next(); }
    if (...) { i3.next(); }
    v.add("...");
    if (...) { i1.next(); }

class Make {
    private Worklist worklist;

    public static void main(String[] args) {
        Make m = new Make();
        m.initializeWorklist(args);
        m.processWorklist();
    }

    void initializeWorklist(String[] args) {
        ...;
        worklist = new Worklist();
        ... // add some items to worklist
    }

    void processWorklist() {
        Set s = worklist.unprocessedItems();
        for (Iterator i = s.iterator(); i.hasNext(); ) {
            Object item = i.next();
            if (...) processItem(item);
        }
    }

    void processItem(Object i) { ...; doSubproblem(...); }

    void doSubproblem(...) {
        ...
        worklist.addItem(newitem);   // updates the set while processWorklist iterates over it
        ...
    }
}

public class Worklist {
    Set s;
    public Worklist() { ...; s = new HashSet(); ... }
    public void addItem(Object item) { s.add(item); }
    public Set unprocessedItems() { return s; }
}

Static Detection of Concurrent Modifications
• Statically check for CME exceptions
  • Warn against potential CME
  • Sound (conservative) solution
  • Not too many false alarms
• Coding in TVLA
  • Operational semantics
  • Vanilla solution is imprecise (and inefficient)
  • Derive instrumentation predicates
• Java to TVP front-end
  • Extract potentially relevant client code

CME specification in Java’
    class Version { /* represents distinct versions of a Set */ }

    class Collection {
        Version version;
        Collection() { version = new Version(); }
        boolean add(Object o) { version = new Version(); }
        Iterator iterator() { return new Iterator(this); }
    }

    class Iterator {
        Collection set;
        Version definingVersion;
        Iterator(Collection s) { definingVersion = s.version; set = s; }
        void remove() {
            requires (definingVersion == set.version);
            set.version = new Version();
            definingVersion = set.version;
        }
        Object next() { requires (definingVersion == set.version); }
    }

Vanilla TVLA Encoding
• Local iterators are pointers
  • Unary predicates
• Relevant fields are pointer selectors
  • Binary predicates

Improved TVLA Encodings
• Use reachability
• Explicitly maintain relevant information
  • valid[i] = (i.definingVersion == i.set.version)
  • iterOf[i, v] = (i.set == v)
  • mutex[i, j] = (i.set == j.set && i != j)
  • same[v, w] = (v == w)
• Can be automatically derived from the specification
• Polynomial complexity in programs where iterators are not stored in the client heap
  • Meet-over-all-paths solution
• Adaptive to programs with client heap

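As a rough illustration of what these predicates track, the sketch below turns the version-based specification into plain Java and evaluates valid[i] at each iterator use of the artificial example, with every branch taken. It is a concrete simulation under that assumption, not the abstract analysis; the names VersionModel, Coll, and Iter are hypothetical, and each call returns whether the `requires` clause would have held.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative concrete model of the version-based CME specification.
class VersionModel {
    static class Version { }

    static class Coll {
        Version version = new Version();
        void add() { version = new Version(); }          // any update creates a new version
        Iter iterator() { return new Iter(this); }
    }

    static class Iter {
        final Coll set;
        Version definingVersion;
        Iter(Coll s) { set = s; definingVersion = s.version; }

        // valid[i]: the iterator still refers to the current version of its set.
        boolean valid() { return definingVersion == set.version; }

        // Returns whether the precondition of next() held.
        boolean next() { return valid(); }

        // Returns whether the precondition of remove() held, then applies the update.
        boolean remove() {
            boolean ok = valid();
            set.version = new Version();
            definingVersion = set.version;
            return ok;
        }
    }

    static boolean iterOf(Iter i, Coll v) { return i.set == v; }
    static boolean mutex(Iter i, Iter j) { return i.set == j.set && i != j; }

    // The artificial example with all "if (...)" branches taken.
    static List<Boolean> runArtificialExample() {
        List<Boolean> ok = new ArrayList<>();
        Coll v = new Coll();
        Iter i1 = v.iterator();
        Iter i2 = v.iterator();
        Iter i3 = i1;
        ok.add(i1.next());     // i1 is valid
        ok.add(i1.remove());   // valid; creates a new version known only to i1
        ok.add(i2.next());     // i2 is stale: potential CME
        ok.add(i3.next());     // i3 aliases i1, so it is still valid
        v.add();               // direct update invalidates every iterator
        ok.add(i1.next());     // i1 is stale: potential CME
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(runArtificialExample());
    }
}
```

The two failing checks correspond to the uses of i2 after i1.remove() and of i1 after v.add("..."), which are the sites a sound analysis has to report.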
Partial Correctness
• {P} S {Q}
• How to derive loop invariants?
• Abstract interpretation provides a sound solution
• The abstract domain represents a class of program invariants

Example: Sorting of linked lists
    typedef struct node { struct node *n; int data; } *Elements;
• dle(v1, v2) = v1.data ≤ v2.data
• inOrder[dle,n](v) = ∀v1: n(v, v1) → dle(v, v1)
• inROrder[dle,n](v) = ∀v1: n(v, v1) → dle(v1, v)
• Captures intermediate invariants as well

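On a concrete list these predicates are easy to evaluate, which may help in reading the definitions. The sketch below is an assumption-laden illustration (class InOrderCheck and its helpers are not from the slides): since n is a function, each node has at most one successor, so the universal quantifier reduces to a single comparison.

```java
// Illustrative evaluation of dle, inOrder and inROrder on a concrete linked list.
class InOrderCheck {
    static class Node {
        final int data;
        final Node n;
        Node(int data, Node n) { this.data = data; this.n = n; }
    }

    // dle(v1, v2): v1.data <= v2.data
    static boolean dle(Node v1, Node v2) { return v1.data <= v2.data; }

    // inOrder(v): every n-successor v1 of v satisfies dle(v, v1); vacuously true at the tail.
    static boolean inOrder(Node v) { return v.n == null || dle(v, v.n); }

    // inROrder(v): every n-successor v1 of v satisfies dle(v1, v).
    static boolean inROrder(Node v) { return v.n == null || dle(v.n, v); }

    // The list is sorted in ascending order iff inOrder holds at every node.
    static boolean sortedAscending(Node head) {
        for (Node v = head; v != null; v = v.n) {
            if (!inOrder(v)) return false;
        }
        return true;
    }

    // Builds a list holding the given values in order.
    static Node listOf(int... values) {
        Node head = null;
        for (int k = values.length - 1; k >= 0; k--) {
            head = new Node(values[k], head);
        }
        return head;
    }

    public static void main(String[] args) {
        System.out.println(sortedAscending(listOf(1, 2, 2, 5)));
        System.out.println(sortedAscending(listOf(3, 1, 4)));
    }
}
```

A partially sorted list has inOrder true on some nodes and false on others, which is how the predicates can describe intermediate states of a sorting loop.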
    typedef struct node { struct node *n; int data; } *Elements;

    L insert_sort(L x) {
        L r, pr, rn, l, pl;
        r = x;
        pr = NULL;
        while (r != NULL) {
            l = x;
            rn = r->n;
            pl = NULL;
            while (l != r) {
                if (l->data > r->data) {
                    pr->n = rn;
                    r->n = l;
                    if (pl == NULL) x = r;
                    else pl->n = r;
                    r = pr;
                    break;
                }
                pl = l;
                l = l->n;
            }
            pr = r;
            r = rn;
        }
        return x;
    }

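(Figure: L insert_sort(L x) { L r, pr, rn, l, pl; … return x; } applied to an abstract list. Input: list pointed to by x, nodes labeled r[n,x] with inOrder[dle,n] = ½, connected by n and dle edges. Output: list pointed to by x, nodes labeled r[n,x] and inOrder[dle,n], connected by n and dle edges.)
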
/* pred.tvp */
foreach (z in PVar) {
  %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
  %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
}
%i c[n](v) = n+(v, v)
%p dle(v_1, v_2) reflexive transitive
%i inOrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v, v_1) nonabs
%i inROrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v_1, v) nonabs
%r !dle(v_1, v_2) ==> dle(v_2, v_1)

/* cond.tvp */
%action uninterpreted() {
  %t "uninterpreted"
}
%action Is_Not_Null_Var(x1) {
  %t x1 + " != NULL"
  %f { x1(v) }
  %p E(v) x1(v)
}
%action Is_Null_Var(x1) {
  %t x1 + " == NULL"
  %f { x1(v) }
  %p !(E(v) x1(v))
}
%action Is_Eq_Var(x1, x2) {
  %t x1 + " == " + x2
  %f { x1(v), x2(v) }
  %p A(v) x1(v) <-> x2(v)
}
%action Is_Not_Eq_Var(x1, x2) {
  %t x1 + " != " + x2
  %f { x1(v), x2(v) }
  %p !A(v) x1(v) <-> x2(v)
}

%action Greater_Data_L(x1, x2) {
  %t x1 + "->data > " + x2 + "->data"
  %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
  %p !E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}
%action Less_Equal_Data_L(x1, x2) {
  %t x1 + "->data <= " + x2 + "->data"
  %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) }
  %p E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2)
}

stat.tvp
%action Set_Next_Null_L(x1) {
  %t x1 + "->" + n + " = null"
  %f { x1(v) }
  %message !(E(v) x1(v)) -> ...
  {
    n(v_1, v_2) = ...
    is[n](v) = ...
    r[n,x1](v) = ...
    foreach (z in PVar - {x1}) {
      r[n,z](v) = ...
    }
    c[n](v) = ...
    inOrder[dle,n](v) = inOrder[dle,n](v) | x1(v)
    inROrder[dle,n](v) = inROrder[dle,n](v) | x1(v)
  }
}

stat.tvp (more)
%action Malloc_L(x1) {
  %t x1 + " = (L) malloc(sizeof(struct node)) "
  %new
  {
    x1(v) = isNew(v)
    inOrder[dle, n](v1, v2) = …
    inROrder[dle, n](v1, v2) = …
  }
}

Abstract interpretation of: if (x->data <= y->data)
From Local Outlook to Global Outlook
• (Safety) Every time control reaches a given point:
  • there are no garbage memory cells
  • the list is acyclic
  • each cell is locally ordered
• (History) The list is a permutation of the original list

Bugs Found
• Pointer manipulations
  • null dereferences
  • memory leaks
• Forget to sort the first element
• Swap equal elements in bubble sort (non-termination)

    L insert_sort_b2(L x) {
        L r, pr, rn, l, pl;
        if (x == NULL) return NULL;
        pr = x;
        r = x->n;
        while (r != NULL) {
            pl = x;
            rn = r->n;
            l = x->n;
            while (l != r) {
                if (l->d > r->d) {
                    pr->n = rn;
                    r->n = l;
                    pl->n = r;
                    r = pr;
                    break;
                }
                pl = l;
                l = l->n;
            }
            pr = r;
            r = rn;
        }
        return x;
    }

(Figure: abstract output list pointed to by x, with n and dle edges; one node has inOrder[dle,n] = ½ and another has inOrder[dle,n] = 1.)

Properties Not Proved
• (Liveness) Termination
• Stability

Example: Mark and Sweep
    void Mark(Node root) {
        if (root != NULL) {
            pending = ∅
            pending = pending ∪ {root}
            marked = ∅
            while (pending ≠ ∅) {
                x = SelectAndRemove(pending)
                marked = marked ∪ {x}
                t = x→left
                if (t ≠ NULL)
                    if (t ∉ marked)
                        pending = pending ∪ {t}
                t = x→right
                if (t ≠ NULL)
                    if (t ∉ marked)
                        pending = pending ∪ {t}
            }
        }
        assert(marked == Reachset(root))
    }

    void Sweep() {
        unexplored = Universe
        collected = ∅
        while (unexplored ≠ ∅) {
            x = SelectAndRemove(unexplored)
            if (x ∉ marked)
                collected = collected ∪ {x}
        }
        assert(collected == Universe - Reachset(root))
    }

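The pseudocode above translates almost directly into Java sets. The following sketch is an illustrative concrete version (class MarkSweep and the small demo heap are assumptions, not part of the original demo); it checks the two asserted properties on a three-node heap with a cycle and one unreachable node.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative concrete mark and sweep over nodes with left/right fields.
class MarkSweep {
    static class Node { Node left, right; }

    // Returns the set of nodes marked starting from root.
    static Set<Node> mark(Node root) {
        Set<Node> marked = new HashSet<>();
        Set<Node> pending = new HashSet<>();
        if (root != null) pending.add(root);
        while (!pending.isEmpty()) {
            Node x = pending.iterator().next();   // SelectAndRemove(pending)
            pending.remove(x);
            marked.add(x);
            if (x.left != null && !marked.contains(x.left)) pending.add(x.left);
            if (x.right != null && !marked.contains(x.right)) pending.add(x.right);
        }
        return marked;
    }

    // Returns the nodes of the universe that were not marked.
    static Set<Node> sweep(Set<Node> universe, Set<Node> marked) {
        Set<Node> collected = new HashSet<>();
        for (Node x : universe) {
            if (!marked.contains(x)) collected.add(x);
        }
        return collected;
    }

    // Demo heap: a <-> b form a cycle reachable from the root a; c is unreachable.
    // Returns { |marked|, |collected|, 1 if c was collected else 0 }.
    static int[] demo() {
        Node a = new Node(), b = new Node(), c = new Node();
        a.left = b;
        b.right = a;
        c.left = a;
        Set<Node> universe = new HashSet<>();
        universe.add(a);
        universe.add(b);
        universe.add(c);
        Set<Node> marked = mark(a);
        Set<Node> collected = sweep(universe, marked);
        return new int[] { marked.size(), collected.size(), collected.contains(c) ? 1 : 0 };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("marked=" + r[0] + " collected=" + r[1]);
    }
}
```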
Total Correctness
• Usually more complicated
• Need to show that something good eventually happens
• Difficult for programs with unbounded concrete states
• Example: linked lists
  • Show decreased set of reachable locations

Program Dependences
• A statement s1 depends on s2 if
  • s2 writes into a location l
  • s1 reads from location l
  • There is no intervening write in between
• Useful for
  • Parallelization
  • Scheduling
  • Program slicing
• How to compute
  • Scalars
  • Stack pointers
  • Heap allocated pointers

Flow Dependences vs. May-Aliases

    int y; List p, q, t;
    q = (List) malloc();
    p = q;
    t = p;
    p->d = 5;
    t->d = 7;
    y = q->d;

    int y; List p, q;
    q = (List) malloc();
    p = q;
    p->d = 5;
    …
    y = q->d;

    int y; List p, q, t;
    q = (List) malloc();
    p = q;
    t = p;
    p->d = 5;
    p = (List) malloc();
    y = q->d;

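One way to see why may-alias information alone overestimates dependences is to record, for each heap cell, the statement that last wrote its d field. The sketch below does this concretely for the fragments with the t variable; the statement labels s1..s6 and the class LastWrite are hypothetical names introduced for this example. In the first fragment the write through t kills the write through p, so the read of q->d depends only on the later write, even though q aliases both p and t.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative last-write tracking for the d field of heap cells.
class LastWrite {
    static class Cell { }   // stands for one malloc'ed list cell

    // Fragment with p->d = 5 followed by t->d = 7: returns the label y = q->d depends on.
    static String overwriteExample() {
        Map<Cell, String> lastWriteD = new IdentityHashMap<>();
        Cell q = new Cell();           // s1: q = malloc()
        Cell p = q;                    // s2: p = q
        Cell t = p;                    // s3: t = p
        lastWriteD.put(p, "s4");       // s4: p->d = 5
        lastWriteD.put(t, "s5");       // s5: t->d = 7 overwrites the same cell
        return lastWriteD.get(q);      // s6: y = q->d reads that cell
    }

    // Fragment where p is re-allocated after the write: the dependence on s4 survives.
    static String reallocExample() {
        Map<Cell, String> lastWriteD = new IdentityHashMap<>();
        Cell q = new Cell();           // s1: q = malloc()
        Cell p = q;                    // s2: p = q
        Cell t = p;                    // s3: t = p (unused afterwards, as in the fragment)
        lastWriteD.put(p, "s4");       // s4: p->d = 5
        p = new Cell();                // s5: p = malloc(), p no longer aliases q
        return lastWriteD.get(q);      // s6: y = q->d
    }

    public static void main(String[] args) {
        System.out.println("overwrite fragment: y = q->d depends on " + overwriteExample());
        System.out.println("realloc fragment:   y = q->d depends on " + reallocExample());
    }
}
```

The last-write predicates used in the following slides play the role of the lastWriteD map, but over abstract structures rather than a single concrete run.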
    void append() {
        List head, tail, temp;
    l1:  head = (List) malloc();
    l2:  scanf("%c", &head->d);
    l3:  head->n = NULL;
    l4:  tail = head;
    l5:  if (tail->d == 'x') goto l12;
    l6:  temp = (List) malloc();
    l7:  scanf("%c", &temp->d);
    l8:  temp->n = NULL;
    l9:  tail->n = temp;
    l10: tail = tail->n;
    l11: goto l5;
    l12: printf("%c", head->d);
    l13: printf("%c", tail->d);
    exit: ;
    }

/* pred.tvp */
foreach (z in PVar) {
  %p z(v_1) unique box
}
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
  %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
  foreach (l in Label) {
    %p lst_w_v[l,z]()          // l is the last write into the variable z
  }
  foreach (l in Label) {
    %p lst_w_f[l,n](v_1) box   // l is the last write into v_1.n
    %p lst_w_f[l,d](v_1) box   // l is the last write into v_1.d
  }
}
%i c[n](v) = n+(v, v)

void append() { … } (same code as on the previous append() slide)
(Figure: structure annotated with last-write predicates. Variables: lst_w_v[l1, head], lst_w_v[l10, tail], lst_w_v[l6, temp]. Nodes pointed to by temp, head, and tail carry lst_w_f[l9, n], lst_w_f[l8, n], lst_w_f[l2, d], lst_w_f[l7, d].)

Java Concurrency
• Threads and locks are just dynamically allocated objects
• synchronized implements mutual exclusion
• wait, notify and notifyAll coordinate activities across threads

Example - Mutual Exclusion
    l_0: while (true) {
    l_1:   synchronized(sharedLock) {
    l_C:     // critical actions
    l_2:   }
    l_3: }
• Allocate new lock?
• Allocate new thread?
Two threads: (pc1, pc2, lockAcquired1, lockAcquired2)

Program Model
• Interleaving model of concurrency
• Program is modeled as a transition system

Configurations
• A program configuration encodes:
  • global store
  • program-location of every thread
  • status of locks and threads
• First-order logical structures are used to represent program configurations

Configurations
• Predicates model properties of interest
  • is_thread(t)
  • { at[lab](t) : lab ∈ Labels }
  • { rval[fld](o1, o2) : fld ∈ Fields }
  • held_by(l, t)
  • blocked(t, l)
  • waiting(t, l)
• Can use the framework with different predicates

Configurations
(Figure: a configuration with five thread nodes, each labeled is_thread and one of at[l_1], at[l_C], at[l_1], at[l_0], at[l_0], with edges labeled rval[this], held_by, and blocked.)

Configurations
• Program control-flow is not separately represented
• Program location for each thread is encoded inside the configuration
  • { at[lab](t) : lab ∈ Labels }

Structural Operational Semantics - actions
• An action consists of:
  • precondition (when) formula
  • update formulae
• Precondition formula may use a free variable ts for the “currently scheduled” thread
• Semantics is non-deterministic

Structural Operational Semantics - actions
lock(v)
• precondition
  • ¬∃t ≠ ts: rval[v](ts, l) ∧ held_by(l, t)
• predicate update
  • held_by’(l1, t1) = held_by(l1, t1) ∨ (l1 = l ∧ t1 = ts)
  • blocked’(t1, l1) = blocked(t1, l1) ∧ ((l1 ≠ l) ∨ (t1 ≠ ts))

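Read operationally, the lock action fires for the scheduled thread only when no other thread holds the lock; it then records the new holder and clears the thread's blocked entry for that lock. The sketch below is a concrete, simplified reading of that action, assuming string identifiers for threads and locks and a map for held_by (at most one holder per lock); the class LockAction is hypothetical and is not the TVLA encoding.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative concrete configuration with a lock action.
class LockAction {
    final Map<String, String> heldBy = new HashMap<>();   // lock -> thread holding it
    final Set<String> blocked = new HashSet<>();          // entries of the form "thread:lock"

    // Precondition: no thread other than ts holds lock l.
    boolean canLock(String ts, String l) {
        String holder = heldBy.get(l);
        return holder == null || holder.equals(ts);
    }

    // Applies the predicate updates if the precondition holds; returns whether the action fired.
    boolean lock(String ts, String l) {
        if (!canLock(ts, l)) return false;
        heldBy.put(l, ts);               // held_by gains the pair (l, ts)
        blocked.remove(ts + ":" + l);    // blocked loses the pair (ts, l)
        return true;
    }

    public static void main(String[] args) {
        LockAction c = new LockAction();
        c.blocked.add("t1:sharedLock");
        System.out.println("t1 acquires: " + c.lock("t1", "sharedLock"));
        System.out.println("t2 acquires: " + c.lock("t2", "sharedLock"));
        System.out.println("holder: " + c.heldBy.get("sharedLock"));
    }
}
```

Because the second attempt is rejected by the precondition, at most one thread can hold the lock at a time, which is the basis of the mutual exclusion property.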
Safety Properties
• Configuration-local property as logical formula
• Example: mutual exclusion
  • ∀t1, t2: (t1 ≠ t2) → ¬(at[lcrit](t1) ∧ at[lcrit](t2))
• Example: no total deadlock
  • ∃t ∀lb: is_thread(t) ∧ ¬blocked(t, lb)

Concrete Configuration
(Figure: a concrete configuration with five thread nodes, each labeled is_thread and one of at[l_1], at[l_C], at[l_1], at[l_0], at[l_0], with edges labeled rval[this], held_by, and blocked.)