Partially Disjunctive Heap Abstraction
Roman Manevich, Mooly Sagiv (Tel Aviv University)
G. Ramalingam, John Field (IBM T.J. Watson)
Motivation • Analysis of Object Oriented programs is hard • Recursive data structures • Unbounded number of objects • Destructive update of references • Scalable heap analyses exist • e.g., flow-insensitive • Not precise enough for verification • Precise heap analyses exist • e.g., SRW shape analysis • Scaling is very challenging
Motivating example: verifying mark phase of GC

    // @Ensures marked == REACH(root)
    void mark(Node root, NodeSet marked) {
        Node x;
        if (root != null) {
            // Worklist of nodes that are reachable but not yet visited
            NodeSet pending = new NodeSet();
            pending.add(root);
            marked.clear();
            while (!pending.isEmpty()) {
                x = pending.selectAndRemove();
                marked.add(x);
                if (x.left != null)
                    if (!marked.contains(x.left))
                        pending.add(x.left);
                if (x.right != null)
                    if (!marked.contains(x.right))
                        pending.add(x.right);
            }
        }
    }
Motivating example: verifying mark phase of GC • [Figure sequence: heap graph with nodes u1 to u6 connected by left and right edges, with variables root and x] • Start: pending = {root}, marked = {} • Step 1: pending = {u3,u2}, marked = {u1} • Step 2: pending = {u4,u2}, marked = {u1,u3} • Step 3: pending = {u2}, marked = {u1,u3,u4} • Step 4: pending = {}, marked = {u1,u3,u4,u2} • Done: the unmarked nodes u5 and u6 are garbage and are removed; u1, u2, u3, u4 remain
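The trace above can be reproduced with a small runnable sketch. The slides do not define `Node` or `NodeSet`, so the versions below are hypothetical stand-ins, and the edges of the six-node heap are an assumption chosen only so that u1 to u4 are reachable from the root and u5, u6 are not.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

final class MarkDemo {
    // Hypothetical node type; the slide only shows the fields left and right.
    static final class Node {
        final String name;
        Node left, right;
        Node(String name) { this.name = name; }
    }

    // Hypothetical stand-in for the NodeSet type used on the slide.
    static final class NodeSet {
        private final Set<Node> elems = new HashSet<>();
        void add(Node n) { elems.add(n); }
        void clear() { elems.clear(); }
        boolean isEmpty() { return elems.isEmpty(); }
        boolean contains(Node n) { return elems.contains(n); }
        // Removes and returns an arbitrary element of a non-empty set.
        Node selectAndRemove() {
            Iterator<Node> it = elems.iterator();
            Node n = it.next();
            it.remove();
            return n;
        }
    }

    // The mark phase from the slide: afterwards, marked == REACH(root).
    static void mark(Node root, NodeSet marked) {
        if (root != null) {
            NodeSet pending = new NodeSet();
            pending.add(root);
            marked.clear();
            while (!pending.isEmpty()) {
                Node x = pending.selectAndRemove();
                marked.add(x);
                if (x.left != null && !marked.contains(x.left)) pending.add(x.left);
                if (x.right != null && !marked.contains(x.right)) pending.add(x.right);
            }
        }
    }

    // Builds an assumed six-node heap and returns the names of the marked nodes.
    static String run() {
        Node[] u = new Node[7];
        for (int i = 1; i <= 6; i++) u[i] = new Node("u" + i);
        u[1].left = u[2]; u[1].right = u[3];   // children of the root
        u[3].left = u[4]; u[4].right = u[1];   // a cycle back to the root
        u[5].left = u[6]; u[6].right = u[4];   // garbage that points into live data
        NodeSet marked = new NodeSet();
        mark(u[1], marked);
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 6; i++)
            if (marked.contains(u[i])) sb.append(sb.length() == 0 ? "" : " ").append(u[i].name);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());   // u1 u2 u3 u4
    }
}
```

The order in which nodes leave `pending` is arbitrary, so intermediate steps may differ from the slides, but the final marked set does not depend on that order.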
Motivating example:verifying mark phase of GC • Powerset heap abstraction • 584 seconds, 189,772 abstract heaps • Definitely too expensive • Can we verify more efficiently? • Partially disjunctive heap abstraction • 3 seconds, 1,133 abstract heaps • TVLA system
Overview and main results • New (parametric) heap abstraction • Uses a heap similarity criterion • Merges “similar” heaps • Robust implementation • Abstraction of choice among TVLA users • Suitable for other shape analysis systems • Empirical results • Significant speedups (2 orders of magnitude) • Precise in most cases
Talk outline • Shape analysis background • Representing heaps via logical structures • Disjunctive (powerset) heap abstraction • Partially disjunctive heap abstraction • Via universe congruence similarity • Empirical results • Related work • Future work • Conclusions
Shape analysis via First-Order logic • SRW 2002: Parametric shape analysis via 3-valued logic • Concrete heaps represented by 2-valued structures over predicate symbols P • A set of individuals (nodes) U • Interpretation of predicate symbols in P: $p_0() \to \{0,1\}$, $p_1(v) \to \{0,1\}$, $p_2(u,v) \to \{0,1\}$
Concrete heap • [Figure: a concrete heap drawn as a 2-valued structure] • Unary predicates label the nodes: root, x, r[root], set[marked], set[pending] • Binary predicates label the edges: left, right
3-valued structures • 2-valued structures abstracted into 3-valued structures by merging individuals • $p_0() \to \{0,1,1/2\}$, $p_1(v) \to \{0,1,1/2\}$, $p_2(u,v) \to \{0,1,1/2\}$ • Kleene's partially ordered set of logical values: 0 and 1 are definite and incomparable, 1/2 (unknown) lies above both • $0 \sqcup 1 = 1/2$
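The information order on Kleene's three values can be sketched in a few lines of Java. This is a minimal illustration rather than TVLA code; the names `Kleene`, `K`, `join`, and `leq` are assumptions introduced here.

```java
final class Kleene {
    enum K {
        ZERO, ONE, HALF;   // HALF stands for the indefinite value 1/2

        // Join in the information order: equal values stay, different values become 1/2.
        K join(K other) { return this == other ? this : HALF; }

        // this is below other in the information order (other is equal or less definite).
        boolean leq(K other) { return this == other || other == HALF; }
    }

    public static void main(String[] args) {
        System.out.println(K.ZERO.join(K.ONE));   // HALF
        System.out.println(K.ZERO.leq(K.ONE));    // false
        System.out.println(K.ONE.leq(K.HALF));    // true
    }
}
```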
Canonical abstraction • Merge individuals with same values for all unary predicates (canonical name) • Bounded structure with at most $2^{|A|}$ individuals • A = set of unary predicates
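The merging step can be sketched as follows. This is a simplified illustration under stated assumptions, not the TVLA implementation: individuals are array indices, a heap is given by boolean tables, and the names `CanonicalAbstraction`, `canonicalName`, `group`, and `abstractEdge` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CanonicalAbstraction {
    // Canonical name: the individual's values for all unary predicates in A, as a bit string.
    static String canonicalName(boolean[] unaryValues) {
        StringBuilder sb = new StringBuilder();
        for (boolean b : unaryValues) sb.append(b ? '1' : '0');
        return sb.toString();
    }

    // Groups individuals by canonical name; each group becomes one abstract individual.
    static Map<String, List<Integer>> group(boolean[][] unary) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int u = 0; u < unary.length; u++)
            groups.computeIfAbsent(canonicalName(unary[u]), k -> new ArrayList<>()).add(u);
        return groups;
    }

    // Value of a binary predicate between two abstract individuals:
    // "0" or "1" if all concrete pairs agree, "1/2" otherwise.
    static String abstractEdge(boolean[][] binary, List<Integer> from, List<Integer> to) {
        boolean seenTrue = false, seenFalse = false;
        for (int u : from)
            for (int v : to) {
                if (binary[u][v]) seenTrue = true; else seenFalse = true;
            }
        if (seenTrue && seenFalse) return "1/2";
        return seenTrue ? "1" : "0";
    }

    public static void main(String[] args) {
        // Columns follow A = {x, root, r[root], set[marked], set[pending]} (assumed example heap).
        boolean[][] unary = {
            {false, true,  true,  true,  false},   // u0: pointed to by root
            {false, false, true,  true,  false},   // u1
            {false, false, true,  true,  false},   // u2: same canonical name as u1
            {false, false, false, false, false},   // u3: not reachable from root
        };
        boolean[][] left = new boolean[4][4];
        left[0][1] = true;   // u0.left = u1
        left[1][2] = true;   // u1.left = u2
        Map<String, List<Integer>> g = group(unary);
        System.out.println(g);   // {01110=[0], 00110=[1, 2], 00000=[3]}
        System.out.println(abstractEdge(left, g.get("00110"), g.get("00110")));   // 1/2
    }
}
```

With five unary predicates there are at most 2^5 = 32 distinct canonical names, which is the bound stated on the slide.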
Canonical abstraction (worked example) • A = {x(v), root(v), r[root](v), set[marked](v), set[pending](v)} • [Figure sequence: each node of the concrete heap is tagged with its canonical name, for example x=0,root=0,r[root]=1,set[marked]=1,set[pending]=0 or x=0,root=0,r[root]=0,set[marked]=0,set[pending]=0, and nodes with equal canonical names are merged] • Abstract heap: bounded number of individuals
Powerset heap abstraction • $\beta$ = canonical abstraction • $\alpha_{pow}(X) = \{\beta(s) \mid s \in X\}$ • LUB (join) is set union • Worst-case is doubly-exponential in |A| • Can make unnecessary distinctions
Partially disjunctive heap abstraction • Use a heap-similarity criterion • We defined similarity by universe congruence • Merge similar heaps • Avoid merging dissimilar heaps
Universe congruent heaps • [Figure: two abstract heaps whose individuals have the same canonical names]
Result of merge • [Figure: the single abstract heap obtained by merging the two congruent heaps]
Non-congruent heaps: no merge • [Figure: two abstract heaps with different canonical names; one contains an r[root], set[pending] node where the other has only r[root], set[marked] nodes]
Definition of partially disjunctive heap abstraction • Two heaps are similar iff they are universe congruent (same canonical names) • $\sqcup_{pi} C$ = merge universe congruent heaps • $\alpha_{pi}(X) = \{\sqcup_{pi} C \mid C$ is a class of universe congruent heaps in $\alpha_{pow}(X)\}$
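The definition can be sketched as grouping abstract heaps by their set of canonical names and joining each group pointwise. This is a simplified illustration under stated assumptions, not the TVLA implementation: the string-keyed value table, the convention that a missing entry reads as 0, and the names `PartialJoin`, `Heap`, `universeCongruent`, `merge`, and `abstractSet` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class PartialJoin {
    enum K {
        ZERO, ONE, HALF;
        K join(K o) { return this == o ? this : HALF; }
    }

    // Simplified abstract heap: the canonical names of its individuals plus a table of
    // predicate values keyed by strings such as "left(01110,00110)" (assumed encoding).
    static final class Heap {
        final TreeSet<String> universe = new TreeSet<>();
        final TreeMap<String, K> values = new TreeMap<>();
        Heap node(String name) { universe.add(name); return this; }
        Heap set(String key, K v) { values.put(key, v); return this; }
    }

    // Universe congruence: both heaps have the same set of canonical names.
    static boolean universeCongruent(Heap a, Heap b) {
        return a.universe.equals(b.universe);
    }

    // Pointwise join of two universe congruent heaps; a missing entry is read as ZERO.
    static Heap merge(Heap a, Heap b) {
        Heap r = new Heap();
        r.universe.addAll(a.universe);
        Set<String> keys = new TreeSet<>(a.values.keySet());
        keys.addAll(b.values.keySet());
        for (String k : keys)
            r.values.put(k, a.values.getOrDefault(k, K.ZERO).join(b.values.getOrDefault(k, K.ZERO)));
        return r;
    }

    // Keeps one heap per universe: congruent heaps are merged, others stay separate.
    static List<Heap> abstractSet(List<Heap> heaps) {
        Map<Set<String>, Heap> byUniverse = new LinkedHashMap<>();
        for (Heap h : heaps)
            byUniverse.merge(h.universe, h, PartialJoin::merge);
        return new ArrayList<>(byUniverse.values());
    }

    public static void main(String[] args) {
        Heap h1 = new Heap().node("01110").node("00110").set("left(01110,00110)", K.ONE);
        Heap h2 = new Heap().node("01110").node("00110").set("left(01110,00110)", K.ZERO);
        Heap h3 = new Heap().node("01110").node("00101");   // has a set[pending] node
        List<Heap> result = abstractSet(Arrays.asList(h1, h2, h3));
        System.out.println(result.size());                                   // 2
        System.out.println(result.get(0).values.get("left(01110,00110)"));   // HALF
    }
}
```

In the example, `h1` and `h2` disagree only on one edge, so they collapse into a single heap where that edge is 1/2, while `h3` has a different universe and is kept as a separate disjunct.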
Characteristics of the partially disjunctive heap abstraction • 3-valued structures partially-ordered • No LUB over singleton structure sets • Join of $\{S_1, S_2\}$: if $S_1$ and $S_2$ are universe congruent, the result is $\{\sqcup_{pi}\{S_1,S_2\}\}$; else it is $\{S_1,S_2\}$, as in the powerset join • Retain definite values of unary predicates • Size of set can be reduced exponentially
Related work • Reducing cost of powerset-based analysis • Function space domain construction • ESP [PLDI 02] • Deutsch [PLDI 94] • Widening operators [Bagnara et al. VMCAI03]
Future work • Experiment with other similarity criteria • Structures with different universes • Deflating operators • Widening operators
Conclusions • A new (parametric) heap abstraction • Partially disjunctive • Merges similar abstract heap descriptors • Significantly more efficient than full powerset • Essential for many TVLA analyses • Often no loss of precision in practice
Parametric partial isomorphism • Structures $S_1 = \langle U_1, I_1 \rangle$ and $S_2 = \langle U_2, I_2 \rangle$ • Isomorphic iff: • Exists bijection $f : U_1 \to U_2$ • Preserves all predicate values • Partially-isomorphic relative to R iff: • Exists bijection $f : U_1 \to U_2$ • Preserves values of relational predicates • $A \subseteq R \subseteq P$
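No LUB over singletons • [Figure: four structures; each individual is labeled with its values for p, q, z] • A: (p=1,q=1,z=1/2), (p=0,q=1,z=0), (p=1,q=0,z=1) • B: (p=0,q=1,z=1), (p=1,q=0,z=0), (p=1,q=1,z=1/2) • C is an upper bound: (p=1,q=0,z=1/2), (p=1/2,q=1,z=1/2) • D is an upper bound: (p=0,q=1,z=1/2), (p=1,q=1/2,z=1/2) • C and D are incomparable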