Explore the complexity of TVLA shape analysis, issues with weak vs strong updates, null dereferences, memory leaks, and more. Learn about the concurrent modification problem and how to detect incorrect library usages in Java programs. Discover static detection of concurrent modifications and the improved TVLA encodings. Gain insights into partial correctness and loop invariants with examples.
Applications of TVLA Mooly Sagiv Tel Aviv University Shape Analysis with Applications http://www.cs.tau.ac.il/~rumster/TVLA/
Outline • Issues • Complexity of TVLA • Weak vs. Strong Updates • Cleanness • Null dereferences • Memory leaks • Freed storage (homework) • The concurrent modification problem • Partial correctness • Sorting • GC • Total Correctness • Flow dependences • Multithreading • Other
Complexity of Shape Analysis x = malloc() if (…) y1 = x if (…) y2 = x if (…) y3 = x if (…) yn =x
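The program on this slide can be read as a worst case for analyses that keep alias sets apart: each conditional independently decides whether yk aliases x. The sketch below is only an illustration of that counting argument (the class and method names are hypothetical, not part of TVLA); it enumerates the alias sets reachable after n such conditionals.

```java
import java.util.HashSet;
import java.util.Set;

public class AliasBlowup {
    // Each state is a bit set: bit k is 1 when y(k+1) aliases x.
    // Assumption: every "if (...)" may go either way, independently.
    static Set<Integer> reachableAliasSets(int n) {
        Set<Integer> states = new HashSet<>();
        states.add(0);                              // after x = malloc(): no y aliases x
        for (int k = 0; k < n; k++) {
            Set<Integer> next = new HashSet<>(states);   // branch not taken
            for (int s : states) {
                next.add(s | (1 << k));             // branch taken: yk = x
            }
            states = next;
        }
        return states;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            System.out.println(n + " conditionals -> "
                    + reachableAliasSets(n).size() + " alias sets");
        }
    }
}
```

The number of states doubles with every conditional, which is why the number of structures, not the size of one structure, tends to dominate the cost.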
Complexity of TVLA analysis • Maximal number of nodes in blurred structures • 3^|A|, where A is the set of abstraction predicates • Size of 3-valued structure representation • Action cost • Focus • Precondition • Coerce • New • Update • Coerce • Blur
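The bound on blurred structures follows from how blur merges nodes: nodes that agree on every abstraction predicate collapse into one abstract node. The sketch below is a simplified illustration with hypothetical names; it blurs a concrete (2-valued) structure only, whereas the 3^|A| bound also counts the indefinite value 1/2 that a predicate may take in a 3-valued structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BlurSketch {
    // Canonical name of a concrete node: its truth values on the
    // abstraction predicates, e.g. "11" for x = 1 and r[n,x] = 1.
    static String canonicalName(boolean[] values) {
        StringBuilder sb = new StringBuilder();
        for (boolean b : values) {
            sb.append(b ? '1' : '0');
        }
        return sb.toString();
    }

    // Blur: nodes with the same canonical name become one abstract node.
    static Map<String, List<Integer>> blur(boolean[][] nodes) {
        Map<String, List<Integer>> abstractNodes = new TreeMap<>();
        for (int i = 0; i < nodes.length; i++) {
            abstractNodes
                    .computeIfAbsent(canonicalName(nodes[i]), k -> new ArrayList<>())
                    .add(i);
        }
        return abstractNodes;
    }

    // Upper bound on abstract nodes when each predicate is 0, 1, or 1/2.
    static long maxBlurredNodes(int numAbstractionPredicates) {
        long bound = 1;
        for (int i = 0; i < numAbstractionPredicates; i++) {
            bound *= 3;
        }
        return bound;
    }

    public static void main(String[] args) {
        // A four-cell list; columns are the predicates x and r[n,x].
        boolean[][] list = {
            { true, true }, { false, true }, { false, true }, { false, true }
        };
        System.out.println(blur(list));
        System.out.println(maxBlurredNodes(2));
    }
}
```

In the example the three cells not pointed to by x share a canonical name, so the four-cell list blurs to two abstract nodes regardless of its length.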
Weak vs. Strong Updates if (…) x = y else x = z x->n = NULL
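After the conditional, x may point to the cell of y or to the cell of z, so the store through x cannot simply overwrite either cell's n field. The sketch below illustrates the distinction on a toy points-to map; the names are hypothetical, and it assumes every abstract location stands for exactly one concrete cell (no summary nodes).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class UpdateKinds {
    // nField maps an abstract location to the possible values of its n field.
    static void storeNull(Map<String, Set<String>> nField, Set<String> targetsOfX) {
        if (targetsOfX.size() == 1) {
            // Strong update: x definitely points to this cell, so the old value is killed.
            String loc = targetsOfX.iterator().next();
            nField.put(loc, new TreeSet<>(Collections.singleton("NULL")));
        } else {
            // Weak update: any target may or may not be written, so old values survive.
            for (String loc : targetsOfX) {
                nField.computeIfAbsent(loc, k -> new TreeSet<>()).add("NULL");
            }
        }
    }

    // Heap before the store: a.n = c and b.n = d (y points to a, z points to b).
    static Map<String, Set<String>> demo(Set<String> targetsOfX) {
        Map<String, Set<String>> nField = new HashMap<>();
        nField.put("a", new TreeSet<>(Collections.singleton("c")));
        nField.put("b", new TreeSet<>(Collections.singleton("d")));
        storeNull(nField, targetsOfX);
        return nField;
    }

    public static void main(String[] args) {
        Set<String> afterBranch = new TreeSet<>();
        afterBranch.add("a");
        afterBranch.add("b");
        System.out.println("weak:   " + demo(afterBranch));
        System.out.println("strong: " + demo(Collections.singleton("a")));
    }
}
```

The weak case keeps the stale values c and d alongside NULL, which is the loss of precision that Focus is meant to avoid by splitting the structure before the update.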
Detecting Incorrect Library Usages (J. Field, D. Goyal, G. Ramalingam, A. Warshavski) • Java provides libraries for manipulating data structures • Collections • Lists • HashSet • … • Iterators over collections allow sequential access • Statically detect incorrect library usages
Set s = worklist.unprocessedItems();
for (Iterator i = s.iterator(); i.hasNext(); ) {
    Object item = i.next();
    if (...) processItem(item);
}
The Concurrent Modification Problem • Static analysis of Java programs manipulating Java 2 collections • Inconsistent usages of iterators • An Iterator object i defined on a collection object c • No use of i may be preceded by an update to the contents of c, unless the update was also made via i • Guarantees order independence
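The rule can be observed at run time with the fail-fast iterators of java.util. The sketch below is a small demonstration, not part of the analysis itself: one method updates the collection behind the iterator's back, the other performs the update through the iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class CmeDemo {
    // Update made directly on the collection, then the iterator is used again.
    static boolean updateBehindIterator() {
        List<String> c = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> i = c.iterator();
        i.next();
        c.add("d");                       // update not made via i
        try {
            i.next();                     // use of i after the update
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Update made via the iterator itself: later uses of i stay legal.
    static List<String> removeViaIterator() {
        List<String> c = new ArrayList<>(Arrays.asList("a", "b", "c"));
        for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
            if (i.next().equals("b")) {
                i.remove();
            }
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + updateBehindIterator());
        System.out.println("after remove via iterator: " + removeViaIterator());
    }
}
```

The library only detects the violation dynamically and on a best-effort basis; the static analysis aims to warn about every path on which it could happen.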
Artificial Example Set v = new Set(); Iterator i1 = v.iterator(); Iterator i2 = v.iterator(); Iterator i3 = i1; i1.next(); i1.remove(); if (...) { i2.next(); } if (...) { i3.next(); } v.add("..."); if (...) { i1.next();}
class Make {
    private Worklist worklist;
    public static void main(String[] args) {
        Make m = new Make();
        m.initializeWorklist(args);
        m.processWorklist();
    }
    void initializeWorklist(String[] args) {
        ...;
        worklist = new Worklist();
        ... // add some items to worklist
    }
    void processWorklist() {
        Set s = worklist.unprocessedItems();
        for (Iterator i = s.iterator(); i.hasNext(); ) {
            Object item = i.next();
            if (...) processItem(item);
        }
    }
    void processItem(Object i) { ...; doSubproblem(...); }
    void doSubproblem(...) { ... worklist.addItem(newitem); ... }
}
public class Worklist {
    Set s;
    public Worklist() { ...; s = new HashSet(); ... }
    public void addItem(Object item) { s.add(item); }
    public Set unprocessedItems() { return s; }
}
Static Detection of Concurrent Modifications • Statically check for ConcurrentModificationException (CME) • Warn against potential CME • Sound (conservative) solution • Not too many false alarms • Coding in TVLA • Operational semantics • Vanilla solution is imprecise (and inefficient) • Derive instrumentation predicates • Java to TVP front-end • Extract potentially relevant client code
CME specification in Java’
class Version { /* represents distinct versions of a Set */ }
class Collection {
    Version version;
    Collection() { version = new Version(); }
    boolean add(Object o) { version = new Version(); return true; } // every add counts as a modification
    Iterator iterator() { return new Iterator(this); }
}
class Iterator {
    Collection set;
    Version definingVersion;
    Iterator(Collection s) { definingVersion = s.version; set = s; }
    void remove() {
        requires (definingVersion == set.version);
        set.version = new Version();
        definingVersion = set.version;
    }
    Object next() { requires (definingVersion == set.version); }
}
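The specification can be turned into an executable model to see which calls in the Artificial Example violate a requires clause. The sketch below is an assumption-laden rendering: the class names VSet and VIterator are hypothetical, requires is modeled by throwing IllegalStateException, and every conditional in the example is taken.

```java
public class CmeSpecModel {
    static final class Version { }

    static final class VSet {
        Version version = new Version();
        void add(Object o) { version = new Version(); }
        VIterator iterator() { return new VIterator(this); }
    }

    static final class VIterator {
        final VSet set;
        Version definingVersion;

        VIterator(VSet s) { set = s; definingVersion = s.version; }

        void next() { requireValid(); }

        void remove() {
            requireValid();
            set.version = new Version();      // the set changes ...
            definingVersion = set.version;    // ... but this iterator stays valid
        }

        // Models "requires (definingVersion == set.version)".
        private void requireValid() {
            if (definingVersion != set.version) {
                throw new IllegalStateException("stale iterator");
            }
        }
    }

    static boolean violates(Runnable use) {
        try {
            use.run();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns whether i2.next(), i3.next(), and the final i1.next() violate the spec.
    static boolean[] runArtificialExample() {
        VSet v = new VSet();
        VIterator i1 = v.iterator();
        VIterator i2 = v.iterator();
        VIterator i3 = i1;                    // alias of i1
        i1.next();
        i1.remove();
        boolean i2Stale = violates(i2::next);
        boolean i3Stale = violates(i3::next);
        v.add("...");
        boolean i1Stale = violates(i1::next);
        return new boolean[] { i2Stale, i3Stale, i1Stale };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(runArtificialExample()));
    }
}
```

Under these assumptions i2 is stale after i1.remove(), i3 is not (it is the same iterator object as i1), and i1 becomes stale after v.add. A precise analysis has to track exactly this aliasing.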
Vanilla TVLA Encoding • Local iterators are pointers • Unary predicates • Relevant fields are pointer selectors • Binary predicates
Artificial Example Set v = new Set(); Iterator i1 = v.iterator(); Iterator i2 = v.iterator(); Iterator i3 = i1; i1.next(); i1.remove(); if (...) { i2.next(); } if (...) { i3.next(); } v.add("..."); if (...) { i1.next();}
Improved TVLA Encodings • Use reachability • Explicitly maintain relevant information • valid[i] = (i.definingVersion == i.set.version) • iterOf[i, v] = (i.set == v) • mutex[i, j] = (i.set == j.set && i != j) • same[v, w] = (v == w) • Can be automatically derived from the specification • Polynomial complexity in programs where iterators are not stored in the client heap • Meet-over-all-paths solution • Adaptive to programs with client heap
Partial Correctness • {P} S {Q} • How to derive loop invariants • Abstract interpretation provides a sound solution • The abstract domain represents a class of program invariants
Example: Sorting of linked lists typedef struct node { struct node *n; int data; } *Elements; • dle(v1, v2) = v1.data ≤ v2.data • inOrder[dle, n](v) = ∀v1: n(v, v1) → dle(v, v1) • inROrder[dle, n](v) = ∀v1: n(v, v1) → dle(v1, v) • Captures intermediate invariants as well
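Because n is a function, the quantified definitions reduce to a check on a cell and its successor. The sketch below evaluates the three predicates on a concrete Java list; the Node class mirrors the struct on the slide, and the helper sorted is a hypothetical addition.

```java
public class InOrderCheck {
    static final class Node {
        final int data;
        Node n;
        Node(int data, Node n) { this.data = data; this.n = n; }
    }

    // dle(v1, v2): v1.data <= v2.data
    static boolean dle(Node v1, Node v2) { return v1.data <= v2.data; }

    // inOrder(v): every n-successor of v is at least as large as v.
    static boolean inOrder(Node v) { return v.n == null || dle(v, v.n); }

    // inROrder(v): every n-successor of v is at most as large as v.
    static boolean inROrder(Node v) { return v.n == null || dle(v.n, v); }

    // A list is sorted when inOrder holds at every cell.
    static boolean sorted(Node x) {
        for (Node v = x; v != null; v = v.n) {
            if (!inOrder(v)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node ascending = new Node(1, new Node(2, new Node(2, null)));
        Node descending = new Node(3, new Node(1, null));
        System.out.println(sorted(ascending) + " " + sorted(descending));
    }
}
```

A cell with no successor satisfies both predicates vacuously, which matches the universally quantified form.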
L insert_sort(L x) { L r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r ->n; pl = NULL; while (l != r) { if (l->data > r->data) { pr->n = rn; r->n = l; if (pl == NULL) x = r; else pl->n = r; r = pr; break; } pl = l; l = l->n; } pr = r; r = rn; } return x; } typedef struct node { struct node *n; int data; } *Elements;
L insert_sort(L x) { L r, pr, rn, l, pl; … return x; } [Figure: abstract list pointed to by x before and after the call. Cells are linked by n, related by dle, and marked r[n,x]. Before the call inOrder[dle,n] = 1/2 on the cells; after the call inOrder[dle,n] = 1.]
/*pred.tvp */ foreach (z in PVar) { %p z(v_1) unique box } %p n(v_1, v_2) function %i is[n](v) = E(v_1, v_2) ( v_1 != v_2 & n(v_1, v) & n(v_2, v)) foreach (z in PVar) { %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v)) } %i c[n](v) = n+(v, v) %p dle(v1, v2) reflexive transitive %i inOrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v, v_1) nonabs %i inROrder[dle,n](v) = A(v_1) n(v, v_1) -> dle(v_1, v) nonabs %r !dle(v_1, v_2) ==> dle(v_2, v_1)
/* cond.tvp */ %action uninterpreted() { %t "uninterpreted" } %action Is_Not_Null_Var(x1) { %t x1 + " != NULL" %f { x1(v) } %p E(v) x1(v) } %action Is_Null_Var(x1) { %t x1 + " == NULL" %f { x1(v) } %p !(E(v) x1(v)) } %action Is_Eq_Var(x1, x2) { %t x1 + " == " + x2 %f { x1(v), x2(v) } %p A(v) x1(v) <-> x2(v) } %action Is_Not_Eq_Var(x1, x2) { %t x1 + " != " + x2 %f { x1(v), x2(v) } %p !A(v) x1(v) <-> x2(v) }
%action Greater_Data_L(x1, x2) { %t x1 + "->data > " + x2 + "->data" %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) } %p !E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2) } %action Less_Equal_Data_L(x1, x2) { %t x1 + "->data <= " + x2 + "->data" %f { x1(v_1) & x2(v_2) & dle(v_1, v_2) } %p E(v_1, v_2) x1(v_1) & x2(v_2) & dle(v_1, v_2) }
/* stat.tvp */
%action Set_Next_Null_L(x1) {
  %t x1 + "->" + n + " = null"
  %f { x1(v) }
  %message !(E(v) x1(v)) -> "..."
  {
    n(v_1, v_2) = ...
    is[n](v) = ...
    r[n,x1](v) = ...
    foreach (z in PVar - {x1}) { r[n,z](v) = ... }
    c[n](v) = ...
    inOrder[dle,n](v) = inOrder[dle,n](v) | x1(v)
    inROrder[dle,n](v) = inROrder[dle,n](v) | x1(v)
  }
}
/* stat.tvp (more) */
%action Malloc_L(x1) {
  %t x1 + " = (L) malloc(sizeof(struct node)) "
  %new
  {
    x1(v) = isNew(v)
    inOrder[dle,n](v) = …
    inROrder[dle,n](v) = …
  }
}
Abstract interpretation of if x->data <= y->data
From Local Outlook to Global Outlook • (Safety) Every time control reaches a given point: • there are no garbage memory cells • the list is acyclic • each cell is locally ordered • (History) The list is a permutation of the original list
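These global properties can at least be tested dynamically on concrete inputs. The sketch below ports insert_sort to Java and checks the result of one run; it is a test harness with hypothetical helper names, not a proof, whereas the analysis establishes the properties for all inputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InsertSortCheck {
    static final class Node {
        final int data;
        Node n;
        Node(int data, Node n) { this.data = data; this.n = n; }
    }

    static Node fromArray(int... data) {
        Node head = null;
        for (int i = data.length - 1; i >= 0; i--) {
            head = new Node(data[i], head);
        }
        return head;
    }

    // Direct port of insert_sort from the slides.
    static Node insertSort(Node x) {
        Node r = x, pr = null;
        while (r != null) {
            Node l = x, rn = r.n, pl = null;
            while (l != r) {
                if (l.data > r.data) {
                    pr.n = rn;                 // unlink r
                    r.n = l;                   // relink r in front of l
                    if (pl == null) x = r; else pl.n = r;
                    r = pr;
                    break;
                }
                pl = l;
                l = l.n;
            }
            pr = r;
            r = rn;
        }
        return x;
    }

    // Fails if more than limit cells are reachable (a cycle or extra cells).
    static List<Integer> toList(Node x, int limit) {
        List<Integer> out = new ArrayList<>();
        for (Node v = x; v != null; v = v.n) {
            if (out.size() == limit) {
                throw new IllegalStateException("cycle or extra cells");
            }
            out.add(v.data);
        }
        return out;
    }

    // Equality with the sorted input implies: acyclic, ordered, a permutation.
    static boolean checkGlobalProperties(int... input) {
        List<Integer> out = toList(insertSort(fromArray(input)), input.length);
        int[] expected = input.clone();
        Arrays.sort(expected);
        List<Integer> expectedList = new ArrayList<>();
        for (int d : expected) {
            expectedList.add(d);
        }
        return out.equals(expectedList);
    }

    public static void main(String[] args) {
        System.out.println(checkGlobalProperties(3, 1, 2));
    }
}
```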
Bugs Found • Pointer manipulations • null dereferences • memory leaks • Forget to sort the first element • Swap equal elements in bubble sort (non-termination)
L insert_sort_b2(L x) {
    L r, pr, rn, l, pl;
    if (x == NULL) return NULL;
    pr = x; r = x->n;
    while (r != NULL) {
        pl = x; rn = r->n; l = x->n;
        while (l != r) {
            if (l->d > r->d) {
                pr->n = rn; r->n = l; pl->n = r; r = pr;
                break;
            }
            pl = l; l = l->n;
        }
        pr = r; r = rn;
    }
    return x;
}
[Figure: abstract output list pointed to by x. Cells are linked by n and related by dle; inOrder[dle,n] = 1/2 on one cell and inOrder[dle,n] = 1 on the other.]
Properties Not Proved • (Liveness) Termination • Stability
Example: Mark and Sweep
void Mark(Node root) {
    if (root != NULL) {
        pending = ∅
        pending = pending ∪ {root}
        marked = ∅
        while (pending ≠ ∅) {
            x = SelectAndRemove(pending)
            marked = marked ∪ {x}
            t = x→left
            if (t ≠ NULL) if (t ∉ marked) pending = pending ∪ {t}
            t = x→right
            if (t ≠ NULL) if (t ∉ marked) pending = pending ∪ {t}
        }
    }
    assert(marked == Reachset(root))
}
void Sweep() {
    unexplored = Universe
    collected = ∅
    while (unexplored ≠ ∅) {
        x = SelectAndRemove(unexplored)
        if (x ∉ marked) collected = collected ∪ {x}
    }
    assert(collected == Universe - Reachset(root))
}
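The two phases translate almost line by line into Java collections. The sketch below is one possible rendering with hypothetical names; the demo heap (five cells, two of them unreachable from the root) is an invented example used only to exercise the two assertions.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MarkSweepSketch {
    static final class Node {
        final String name;
        Node left, right;
        Node(String name) { this.name = name; }
    }

    static Set<Node> mark(Node root) {
        Set<Node> marked = new HashSet<>();
        Set<Node> pending = new LinkedHashSet<>();
        if (root != null) {
            pending.add(root);
        }
        while (!pending.isEmpty()) {
            Iterator<Node> it = pending.iterator();   // SelectAndRemove(pending)
            Node x = it.next();
            it.remove();
            marked.add(x);
            for (Node t : new Node[] { x.left, x.right }) {
                if (t != null && !marked.contains(t)) {
                    pending.add(t);
                }
            }
        }
        return marked;
    }

    static Set<Node> sweep(Collection<Node> universe, Set<Node> marked) {
        Set<Node> collected = new HashSet<>();
        for (Node x : universe) {
            if (!marked.contains(x)) {
                collected.add(x);
            }
        }
        return collected;
    }

    // Demo heap: a -> b, a -> c, c -> a (cycle); d -> e, d -> b are garbage.
    static Set<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        Node d = new Node("d"), e = new Node("e");
        a.left = b; a.right = c; c.left = a;
        d.left = e; d.right = b;
        List<Node> universe = Arrays.asList(a, b, c, d, e);
        Set<String> names = new TreeSet<>();
        for (Node x : sweep(universe, mark(a))) {
            names.add(x.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("collected: " + demo());
    }
}
```

The marked set keeps the traversal finite on cyclic heaps, and a garbage cell pointing into live data (d to b) does not keep itself alive.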
Total Correctness • Usually more complicated • Need to show that something good eventually happens • Difficult for programs with unbounded concrete states • Example linked lists • Show decreased set of reachable locations
Program Dependences • A statement s1 depends on s2 if • s2 writes into a location l • s1 reads from location l • There is no intervening write in between • Useful for • Parallelization • Scheduling • Program Slicing • How to compute • Scalars • Stack pointers • Heap allocated pointers
Flow Dependences vs. May-Aliases
int y; List p, q; q = (List) malloc(); p = q; t = p; p->d = 5; t->d = 7; y = q->d;
int y; List p, q; q = (List) malloc(); p = q; p->d = 5; … y = q->d;
int y; List p, q; q = (List) malloc(); p = q; t = p; p->d = 5; p = (List) malloc(); y = q->d;
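One way to see the difference is to record, per heap cell, the label of the last statement that wrote its d field and to look that label up at each read. The sketch below does this dynamically for the first and third program; the statement labels s1 and s2 and all class names are hypothetical.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class LastWriteSketch {
    static final class Cell { int d; }

    // For each cell, the label of the last statement that wrote its d field.
    private final Map<Cell, String> lastWriteToD = new IdentityHashMap<>();

    void writeD(String label, Cell target, int value) {
        target.d = value;
        lastWriteToD.put(target, label);     // kills the previous writer
    }

    // The statement a read of source.d flow-depends on (null if none).
    String readD(Cell source) {
        return lastWriteToD.get(source);
    }

    // p = q; t = p; s1: p->d = 5; s2: t->d = 7; y = q->d;
    static String firstProgram() {
        LastWriteSketch heap = new LastWriteSketch();
        Cell q = new Cell();
        Cell p = q;
        Cell t = p;
        heap.writeD("s1", p, 5);
        heap.writeD("s2", t, 7);
        return heap.readD(q);
    }

    // p = q; s1: p->d = 5; p = malloc(); y = q->d;
    static String thirdProgram() {
        LastWriteSketch heap = new LastWriteSketch();
        Cell q = new Cell();
        Cell p = q;
        heap.writeD("s1", p, 5);
        p = new Cell();                      // p and q are no longer aliases here
        return heap.readD(q);
    }

    public static void main(String[] args) {
        System.out.println(firstProgram() + " " + thirdProgram());
    }
}
```

In the first program p and q are aliases at the read, yet the read does not depend on s1 because s2 intervenes. In the third program they are not aliases at the read, yet the dependence on s1 exists. Alias information at a single point therefore neither implies nor excludes a flow dependence.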
void append() {
    List head, tail, temp;
    l1: head = (List) malloc();
    l2: scanf("%c", &head->d);
    l3: head->n = NULL;
    l4: tail = head;
    l5: if (tail->d == 'x') goto l12;
    l6: temp = (List) malloc();
    l7: scanf("%c", &temp->d);
    l8: temp->n = NULL;
    l9: tail->n = temp;
    l10: tail = tail->n;
    l11: goto l5;
    l12: printf("%c", head->d);
    l13: printf("%c", tail->d);
}
/* pred.tvp */
foreach (z in PVar) { %p z(v_1) unique box }
%p n(v_1, v_2) function
%i is[n](v) = E(v_1, v_2) (v_1 != v_2 & n(v_1, v) & n(v_2, v))
foreach (z in PVar) {
  %i r[n,z](v) = E(v_1) (z(v_1) & n*(v_1, v))
  foreach (l in Label) {
    %p lst_w_v[l,z]() // l is the last write into the variable z
  }
  foreach (l in Label) {
    %p lst_w_f[l,n](v_1) box // l is the last write into v_1.n
    %p lst_w_f[l,d](v_1) box // l is the last write into v_1.d
  }
}
%i c[n](v) = n+(v, v)
void append() {
    List head, tail, temp;
    l1: head = (List) malloc();
    l2: scanf("%c", &head->d);
    l3: head->n = NULL;
    l4: tail = head;
    l5: if (tail->d == 'x') goto l12;
    l6: temp = (List) malloc();
    l7: scanf("%c", &temp->d);
    l8: temp->n = NULL;
    l9: tail->n = temp;
    l10: tail = tail->n;
    l11: goto l5;
    l12: printf("%c", head->d);
    l13: printf("%c", tail->d);
}
[Figure: abstract list with cells pointed to by head, temp, and tail. Nullary predicates: lst_w_v[l1, head], lst_w_v[l10, tail], lst_w_v[l6, temp]. Cell annotations: lst_w_f[l2, d], lst_w_f[l7, d], lst_w_f[l8, n], lst_w_f[l9, n].]
Java Concurrency • Threads and locks are just dynamically allocated objects • synchronized implements mutual exclusion • wait, notify and notifyAll coordinate activities across threads
Example - Mutual Exclusion l_0: while (true) { l_1: synchronized(sharedLock) { l_C: // critical actions l_2: } l_3: } • Allocate new lock ? • Allocate new thread ? Two threads: (pc1,pc2,lockAcquired1,lockAcquired2)
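For a fixed pair of threads and a single lock, the state space is small enough to enumerate explicitly. The sketch below explores all interleavings of a simplified model of the loop and checks that both threads are never at l_C together. The encoding is an assumption: one lock-holder field replaces the two lockAcquired flags, and the lock is released when a thread leaves l_2.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MutexExplorer {
    // Program counters of the simplified model.
    static final int L0 = 0, L1 = 1, LC = 2, L2 = 3;

    // State layout: {pc of thread 1, pc of thread 2, lock holder (0 = free)}.
    static List<int[]> successors(int[] s) {
        List<int[]> out = new ArrayList<>();
        for (int t = 0; t < 2; t++) {
            int[] n = s.clone();
            if (s[t] == L0) {
                n[t] = L1;                    // reach the synchronized block
            } else if (s[t] == L1) {
                if (s[2] != 0) {
                    continue;                 // lock held: thread t cannot move
                }
                n[t] = LC;                    // acquire the lock
                n[2] = t + 1;
            } else if (s[t] == LC) {
                n[t] = L2;                    // finish the critical actions
            } else {
                n[t] = L0;                    // leave the block, release the lock
                n[2] = 0;
            }
            out.add(n);
        }
        return out;
    }

    // Breadth-first exploration of every interleaving.
    static List<int[]> reachable() {
        List<int[]> states = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<int[]> work = new ArrayDeque<>();
        int[] init = { L0, L0, 0 };
        seen.add(Arrays.toString(init));
        work.add(init);
        while (!work.isEmpty()) {
            int[] s = work.remove();
            states.add(s);
            for (int[] n : successors(s)) {
                if (seen.add(Arrays.toString(n))) {
                    work.add(n);
                }
            }
        }
        return states;
    }

    static boolean mutualExclusionHolds() {
        for (int[] s : reachable()) {
            if (s[0] == LC && s[1] == LC) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(reachable().size() + " states, mutual exclusion: "
                + mutualExclusionHolds());
    }
}
```

This enumeration only works because the numbers of threads and locks are fixed. The two questions on the slide (allocate a new lock? a new thread?) are exactly what breaks it and what motivates representing configurations as logical structures.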
Program Model • Interleaving model of concurrency • Program is modeled as a transition system
Configurations • A program configuration encodes: • global store • program-location of every thread • status of locks and threads • First-order logical structures used to represent program configurations
Configurations • Predicates model properties of interest • is_thread(t) • { at[lab](t) : lab ∈ Labels } • { rval[fld](o1,o2) : fld ∈ Fields } • held_by(l,t) • blocked(t,l) • waiting(t,l) • Can use the framework with different predicates
Configurations [Figure: example configuration drawn as a graph. Thread nodes are labeled is_thread together with at[l_0], at[l_1], or at[l_C]; edges are labeled rval[this], held_by, and blocked.]
Configurations • Program control-flow is not separately represented • Program location for each thread is encoded inside the configuration • { at[lab](t) : lab ∈ Labels }
Structural Operational Semantics - actions • An action consists of: • precondition (when) formula • update formulae • Precondition formula may use a free variable ts for the “currently scheduled” thread • Semantics is non-deterministic
Structural Operational Semantics - actions
lock(v)
precondition: ¬∃t ≠ ts: rval[v](ts, l) ∧ held_by(l, t)
predicate update:
held_by’(l1, t1) = held_by(l1, t1) ∨ (l1 = l ∧ t1 = ts)
blocked’(t1, l1) = blocked(t1, l1) ∧ ((l1 ≠ l) ∨ (t1 ≠ ts))
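Read operationally, the lock action is enabled when no thread other than ts holds the lock l that v refers to. Its update adds the pair (l, ts) to held_by and removes (ts, l) from blocked. The sketch below applies that reading to relations stored as sets of string pairs; the representation and names are hypothetical, and the lookup rval[v](ts, l) is simplified by passing l directly.

```java
import java.util.HashSet;
import java.util.Set;

public class LockActionSketch {
    final Set<String> heldBy = new HashSet<>();    // pairs "lock:thread"
    final Set<String> blocked = new HashSet<>();   // pairs "thread:lock"

    // Precondition: there is no thread t != ts with held_by(l, t).
    boolean lockEnabled(String ts, String l) {
        for (String pair : heldBy) {
            String[] p = pair.split(":");
            if (p[0].equals(l) && !p[1].equals(ts)) {
                return false;
            }
        }
        return true;
    }

    // Applies the update formulae when the precondition holds.
    boolean lock(String ts, String l) {
        if (!lockEnabled(ts, l)) {
            return false;
        }
        heldBy.add(l + ":" + ts);         // held_by' = held_by or (l1 = l and t1 = ts)
        blocked.remove(ts + ":" + l);     // blocked' = blocked and (l1 != l or t1 != ts)
        return true;
    }

    public static void main(String[] args) {
        LockActionSketch cfg = new LockActionSketch();
        cfg.blocked.add("t1:L");
        System.out.println(cfg.lock("t1", "L") + " " + cfg.lock("t2", "L"));
    }
}
```

The precondition ignores pairs held by ts itself, so a thread that already holds the lock may take the action again, which fits the reentrant behavior of Java monitors.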
Safety Properties • Configuration-local property as logical formula • Example: mutual exclusion ∀t1, t2: (t1 ≠ t2) → ¬(at[l_crit](t1) ∧ at[l_crit](t2)) • Example: no total deadlock ∃t ∀lb: is_thread(t) ∧ ¬blocked(t, lb)
Concrete Configuration [Figure: concrete configuration drawn as a graph. Thread nodes are labeled is_thread together with at[l_0], at[l_1], or at[l_C]; edges are labeled rval[this], held_by, and blocked.]