320 likes | 336 Views
Explore TVLA system for static analysis, first-order logical structures, abstract interpretation, and new representations using BDD and functional maps. Implementation techniques and empirical evaluation are discussed.
E N D
Compactly Representing First-Order Structures for Static Analysis Tel-Aviv University Roman Manevich Mooly Sagiv I.B.M T.J. Watson Ganesan RamalingamJohn FieldDeepak Goyal
Motivation • TVLA is a powerful and general abstract interpretation system • Abstract interpretation in TVLA • Operational semantics is expressed with first-order logic formulae • Program states are represented assets of Evolving First-Order Structures • Space is a major bottleneck
Desired Properties • Sparse data structures • Share common sub-structures • Inherited sharing • Incidental sharing due to program invariants • But feasible time performance • Phase sensitive data structures
Outline • Background • First-order structure representations • Base representation (TVLA 0.91) • BDD representation • Empirical evaluation • Conclusion
First-Order Logical Structures • Generalize shape graphs • Arbitrary set of individuals • Arbitrary set of predicates on individuals • Dynamically evolving • Usually small changes • Properties are extracted by evaluating first order formula: ∃v1 , v: x(v1) ∧ n(v1, v) • Join operator requires isomorphism testing
First-Order Structure ADT • Structure : new() /* empty structure */ • SetOfNodes : nodeSet(Structure) • Node : newNode(Structure) • removeNode(Structure, node) • Kleeneeval(Structure, p(r), <u1, . . . ,ur>) • update(Structure, p(r), <u1, . . . ,ur>, Kleene) • Structurecopy(Structure)
print_all Example /* list.h */typedef struct node { struct node * n; int data;} * L; /* print.c */#include “list.h”void print_all(L y) { L x;x = y; while (x != NULL) { /* assert(x != NULL) */ printf(“elem=%d”, xdata);x = xn; }}
print_all Example n=½ usm=½ u1y=1 n=½ S1 n=½ usm=½ u1y=1 n=½ S0 x = y x’(v) := y(v) copy(S0) : S1 nodeset(S0) : {u1, u} eval(S0, y, u1) : 1 update(S1, x, u1, 1) x=1 eval(S0, y, u) : 0 update(S1, x, u, 0)
print_all Example n=½ while (x != NULL)precondition : ∃v x(v) u1x=1y=1 usm=½ n=½ S1 n=½ x = x nfocus : ∃v1 x(v1) ∧ n(v1, v)x’(v) := ∃v1 x(v1) ∧ n(v1, v) usm=½ u1y=1 S2.0 n=½ u1y=1 ux=1 S2.1 n=1 n=½ n=½ n=½ u.0sm=½ u1y=1 n=1 S2.2 u.1x=1
Overview and Main Results • Two novel representations of first-order structures • New BDD representation • New representation using functional maps • Implementation techniques • Empirical evaluation • Comparison of different representations • Space is reduced by a factor of 4–10 • New representations scale better
Base Representation (Tal Lev-Ami SAS 2000) • Two-Level Map : Predicate (Node Tuple Kleene) • Sparse Representation • Limited inherited sharing by “Copy-On-Write”
BDDs in a Nutshell (Bryant 86) • Ordered Binary Decision Diagrams • Data structure for Boolean functions • Functions are represented as (unique) DAGs x1 x2 x2 x3 x3 x3 x3 0 0 0 1 0 1 0 1
BDDs in a Nutshell (Bryant 86) • Ordered Binary Decision Diagrams • Data structure for Boolean functions • Functions are represented as (unique) DAGs • Also achieve sharing across functions x1 x1 x1 x2 x2 x2 x2 x2 x3 x3 x3 x3 x3 x3 x3 0 1 0 1 0 1 Duplicate Terminals Duplicate Nonterminals Redundant Tests
Encoding Structures Using Integers • Static encoding of • Predicates • Kleene values • Dynamic encoding of nodes • 0, 1, …, n-1 • Encode predicate p’s values as • ep(p).en(u1). en(u2) . … . en(un) . ek(Kleene)
x1 x2 x2 x3 0 1 BDD Representation of Integer Sets • Characteristic function • S={1,5} 1=<001>5=<101> S = (¬x1¬x2x3) (x1¬x2x3)
x1 x2 x2 x3 1 BDD Representation of Integer Sets • Characteristic function • S={1,5} 1=<001>5=<101> S = (¬x1¬x2x3) (x1¬x2x3)
BDD Representation Example n=½ usm=½ S0 n=½ S0 u1y=1 1
BDD Representation Example n=½ usm=½ S0 S1 n=½ S0 u1y=1 x=y n=½ u1x=1y=1 usm=½ n=½ S1 1
S2.2 BDD Representation Example n=½ usm=½ S0 S1 n=½ S0 u1y=1 x=y n=½ u1x=1y=1 usm=½ n=½ S1 x=xn n=½ n=½ n=½ u.0sm=½ u1y=1 n=1 S2.2 u.1x=1 1
S2.2 BDD Representation Example n=½ usm=½ S0 S1 n=½ S0 u1y=1 x=y n=½ u1x=1y=1 usm=½ n=½ S1 x=xn n=½ n=½ n=½ u.0sm=½ u1y=1 n=1 S2.2 u.1x=1 1
Improved BDD Representation • Using this representation directlydoesn’t save space • Observation • Node names can be arbitrarily remapped without affecting the ADT semantics • Our heuristics • Use canonic node names to encode nodes • Increases incidental sharing • Reduces isomorphism test to pointer comparison • 4-10 space reduction
Reducing Time Overhead • Current implementation not optimized • Expensive formula evaluation • Hybrid representation • Distinguish between phases:mutable phase Join immutable phase • Dynamically switch representations
Functional Representation • Alternative representation for first-order structures • Structures represented by maps from integers to Kleene values • Tailored for representing first-order structures • Achieves better results than BDDs • Techniques similar to the BDD representation • More details in the paper
Empirical Evaluation • Benchmarks: • Cleanness Analysis (SAS 2000) • Garbage Collector • CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks • Mobile Ambients (ESOP 2000) • Stress testing the representations • We use “relational analysis” • Save structures in every CFG location
Abstract Counters • Ignore language/implementation details • A more reliable measurement technique • Count only crucial space information • Independent of C/Java
What’s Missing from this Work? • Investigate other node mapping heuristics • Compactly represent sets of structures • Time optimizations
Conclusions • Two novel representations of first-order structures • New BDD representation • New representation using functional maps • Implementation techniques • Normalization techniques are crucial • Empirical evaluation • Comparison of different representations • Space is reduced by a factor of 4–10 • New representations scale better
Conclusions • The use of BDDs for static analysis is not a panacea for space saving • Domain-specific encoding crucial for saving space • Failed attempts • Original implementation of Veith’s encoding • PAG