300 likes | 455 Views
Combined Static and Dynamic Mutability Analysis. Shay Artzi, Michael D. Ernst, David Glasser, Adam Kiezun CSAIL, MIT. Motivation. Start w/a motivation example of why you want this.
E N D
Combined Static and Dynamic Mutability Analysis Shay Artzi, Michael D. Ernst, David Glasser, Adam Kiezun CSAIL, MIT
Motivation • Start w/a motivation example of why you want this. • Motivation slide - is this a real example (maybe some model) That might be good: show a small piece of code (maybe just the declaration, not the body) for which one would want to know that information. Briefly discussing a use could work, too. I don't have something specific in mind here; it just felt like the talk didn't motivate the problem enough. As an audience member, why should I care?
Outline • Parameter Mutability Definition and Applications • Classifying Parameters: • Staged analysis • Dynamic analyses • Static analyses • Evaluation • Conclusions
Mutability Example class List { … int size(List this){ return n;} void add(List this, Object o) {…} List addAll(List this, List l){…} List copy(List this){ return new List().addAll(this); } void remove(List this, Object O) { if(exist(o)) …. else return; } } class List { … int size(Listthis){ return n;} void add(List this, Object o) {…} List addAll(List this, List l){…} List copy(Listthis){ return new List().addAll(this); } void remove(List this, Object O) { if(exist(o)) …. else return; } } class List { … int size(Listthis){ return n;} void add(List this, Object o) {…} List addAll(List this, List l){…} List copy(Listthis){ return new List().addAll(this); } void remove(List this, Object O) { if(exist(o)) …. else return; } } <- Pure Method <- Pure Method * Immutable * Mutable
Mutability/Immutability Definition • Parameter P of method M is: • mutable if some execution of m can change the state of an object that at run-time corresponds to p • Immutable if no such execution exists • All parameters are considered to be fully un-aliased on top-level method entry • A method is pure (side-effect free) if: • All its parameters are immutable (Including receiver and global state)
Using Mutability InformationTest Input Generation • Palulu, Artzi 06 • Generate tests based on a model • Model is a directed graph describing permitted sequences of calls on each object • Model can be pruned (without changing the space it describes) by removing calls that do not mutate specific parameters • Smaller models • Faster systematic exploration • Greater search space exploration by random generator
Using Mutability InformationOther Applications • Program comprehension (Dulado 03) • Modeling (Burdy 05) • Verification (Tkachuk 03) • Compiler optimization (Clausen 97) • Program transformation (Fowler 00) • Regression oracle creation (Marini 05, Xie 06) • Invariant detection (Ernst 01) • Specification mining (Dallmeier 06)
Static Analysis Solutions • Rountev. ICSM 04 • Salcianu. VMCAI 05 • Tschantz. MIT Master Thesis 06 • Properties: • Never classify a mutable parameter as immutable • Problems: • Not scalable • Poor performance (on some programs) • Conservative (may not find all immutable parameters) • Complicated, pointer analysis, etc. • Imprecise; Never classifies as mutable
Analysis 3 Static Analysis 1 Static Analysis 2 Dynamic Our Solution: • Idea: • Combine simple analyses to outperform a complicated one • Harvest the power of both static and dynamic analyses • Optionally use unsound analyses. Unsoundness is mitigated by other analysis in the combination and is acceptable for some • Advantages: • Improve overall accuracy • Scaleable • Pipeline approach: • Connect a series of scalable analyses in a pipeline. • The i/o of each analysis is a classification of all parameters. • Analyses represent imprecision using the unknown classification 3 u 6 m 6 i 1 u 8 m 6 i 15 u 0 m 0 i 9 u 2 m 4 i
Scalable analyses • Dynamic Analysis • Observe execution to classify as mutable • Performance optimizations • Optional accuracy heuristics • Random generation of inputs • Static Analysis • Intra-procedural points-to analysis • Inter-procedural propagation phase
Dynamic Mutability Analysis • Observe program execution • On field/array write: • Find all mutated objects • Mark as mutable all formal parameters corresponding to those objects • Potential misclassifications due to aliased parameters: • Not found to be a problem in practice • A sound, but less scaleable algorithm exists
1. void addAll(List this, List other) { 3. add(this, o); 5. void add(List this, Object o) { Dynamic Analysis Example 1. void addAll(List this, List other) { 1. void addAll(List this, List other) { 2. for(Object o:other) { 4. } 6. ... 8. } 3. add(this, o); 5. void add(List this, Object o) { 5. void add(List this, Object o) { 7. this.array[index] = o; 7. this.array[index] = o; When executing line 3 the List object which corresponds to the receiver of add is mutated. As a result the receiver of add is classified as mutable
Dynamic AnalysisImplementation Optimizations • Maintain a model of the heap • Allows searching backwards in the object graph • Avoid using reflection for searching • Caching • For each object, set of corresponding formal parameters • Lazy computation • Compute reachable only when a write occurs
Performance Heuristic:Classifying as Immutable During Execution • Classify a parameters as immutable • After the method was called N times without observing a mutation to the parameter • Advantages: • Improved performance by preventing further analysis of side-effect-free methods (10%-100% faster in our experiments) • Disadvantages: • May classify mutable parameters as immutable (0.1% misclassification in our experiments)
Accuracy Heuristic: Using Known Mutable Parameters • Treat object passed to a mutable parameter as if it is immediately mutated • Advantages: • Can discover possible mutation that do not necessarily happen in the execution (In practice it was not highly effective in a pipeline due to the static analysis). • Minor performance improvement by not waiting for the field write and storing less information (1% -5% in our experiments) • Disadvantages: • Can propagate misclassification if the input classification contains misclassification (did not happen in our experiments)
void copyInto(List this, addAll(this, other); Using Known Mutable ParametersImproved Performance Example • Assumption: • The receiver of addAll was previously classified as mutable (by another analysis in the pipeline) • Analysis result: • The receiver of copyIntois classified as mutable when passed as the receiver of addAll even void main() { List lst = new List(); List lst2 = new List(); lst2.copyInto(lst, lst2); } void copyInto(List this, List other) { ... addAll(this, other); } void addAll(List this, List other) { for(Object o:other) { add(this, o); } }
Accuracy Heuristic:Classifying Parameters as Immutable After the Execution • Classify a parameter as immutable • At the end of the execution/pipeline • All unknown parameters in methods that were executed more then N times • Advantages: • Algorithm classifies parameters as immutable in addition to mutable (adds 6% correctly classified immutable parameters to the best pipeline) • Disadvantages: • May classify mutable parameters as immutable (0.2% misclassification in our experiments)
Classifying Parameters as Immutable After the ExecutionMisclassification Example void remove(List this, Object O) { if(contains(o)) …. else return; } void main() { List lst = new List(); lst.remove(5); } • The receiver of remove will be classified as immutable in this example
Example Execution Input to the Dynamic Analysis • Provided by user • Exercises complex behavior • May have limited coverage • Generated Randomly (Pacheco 06) • Fully automatic process • Focus on unclassified parameters • Dynamic analysis can be iterated • In practice using focused iterative random input generation suppressedthe need for user input
Static Analysis • Analyses: • Intra-procedural points-to analysis • Inter-procedural propagation analysis • Very simple • Scales well • Very coarse pointer analysis • No complicated escape analysis • No use of method summaries • Designed to be used in conjunction with other analyses • Other analyses make up for its weak points, and vice versa
Intraprocedural Static Analysis • Pointer Analysis • Calculate which parameters each Java local may point to • Assume that method calls alias all parameters • Intraprocedural Static Analysis • Mark direct field and array mutations as mutable • Mark parameters that aren’t directly mutated and don’t escape through method calls as immutable • Leave the rest as unknown
Interprocedural Propagation • Uses the same pointer analysis • Constructs a Parameter Dependency Graph • Shows how values get passed as parameters • Propagates mutability through the graph • Unknown parameters that are only passed to immutable positions should be immutable • Unknown parameters that are passed to mutable positions should be mutable
Static Analysis Example class List { int size(List this){ return n;} void add(List this, Object o) {… this.array[index] = o; … } void addAll(List this, List l){… this.add(x); … } } class List { int size(List this){ return n;} void add(List this, Object o) {… this.array[index] = o; … } void addAll(List this, List l){… this.add(x); … } } class List { int size(List this){ return n;} void add(List this, Object o) {… this.array[index] = o; … } void addAll(List this, List l){… this.add(x); … } } class List { int size(List this){ return n;} void add(List this, Object o) {… this.array[index] = o; … } void addAll(List this, List l){… this.add(x); … } } Initial classification – all unknown After Inter-Procedural propagation After Intra-Procedural analysis
Evaluation • Pipeline construction • Accuracy • Scalability • Applicability • 6 subject programs; largest 185KLOC
Pipeline Construction • Always start with the intra-procedural analysis: • Sound and very simple. • Discover the classification of many trivial or easily detectable parameters • Always run propagation after each analysis: • Only the first time is expensive (parameter graph is only calculated once) • Adds between 0.3%-40% to the correct classification results of a pipeline ending with a non propagation, classification changing analysis • Static analyses should be the first part of the pipeline for the best combination of dynamic and static analysis • The static analyses sets the ground for the following dynamic analysis reducing its imprecision by 75%
Pipeline Construction (cont) • Random input generation is more effective then user input • Using random discovered 3.3% more parameters and misclassify 1% less than using the user input. • Using both fares is a little better than just using random input in the number of correctly classified parameters but with higher misclassification rate • Use all the dynamic analysis heuristics • Using all heuristics increases the percentage of correctly classified parameters in 6.5% with an increase on 0.1 in the misclassification rate. • Best Combination (out of 168):
Accuracy Results • Results for eclipse compiler (107 KLOC):
Scalability Evaluation • Execution times on Daikon (185KLOC): * Only runs on 3 of the 6 programs ** Mostly call graph construction
Applicability Evaluation • Client application: • Palulu, Artzi 06: Model-based test input generation • Model size and creation time (for programs JPPA can run on):
Conclusions • Framework for a staged mutability analysis • Novel dynamic analysis • Iterative random input generation is competitive with user input • Combination of lightweight static and dynamic analysis • Scalable • Accurate • Evaluation • Sheds light into the complexity of the problem and the effectiveness of the analyses applied to it • Investigate tradeoffs between analysis complexity and precision • Improve client applications • Reduce model size for test generation