Automating Big Refactorings for Componentization and the Move to SOA
IBM Programming Languages and Development Environments Seminar 2008
Aharon Abadi, Ran Ettinger and Yishai Feldman
Software Asset Management Group, IBM Haifa Research Lab
Background
• What is Refactoring?
  • The process of gradually improving the design of an existing software system by performing source code transformations that improve its quality in such a way that it becomes easier to maintain the system and reuse parts of it, while preserving the behavior of the original system
• For example: Extract Method

Before:

    void printOwing(double amount) {
        printBanner();
        // print details
        print("name:" + _name);
        print("amount:" + amount);
    }

After:

    void printOwing(double amount) {
        printBanner();
        printDetails(amount);
    }

    void printDetails(double amount) {
        print("name:" + _name);
        print("amount:" + amount);
    }

Source: Martin Fowler's online refactoring catalog
Prior Art • Refactoring to Design Patterns +
Our Interest: Big Refactorings • Enterprise Architecture Patterns
The Gap: Techniques and Tools for Enterprise Refactoring
Automating Big Refactorings
• Enterprise Refactorings
  • Separate Presentation Code from Business Logic (introduce the MVC pattern), Extract Reusable Services (implementing SOA), etc.
• Composition
• Small Refactorings
  • Rename Paragraph, Split/Merge Paragraphs, Extract/Inline Paragraph/Section/Program, Extract Slice, Swap Consecutive (Independent) Statements/Sentences/Paragraphs/Sections, Split/Merge (Consecutive) Conditionals, Loop-Invariant Code Motion, etc.
• Required Quality
  • Flexibility: Allow (user-determined) choice between alternatives
  • Applicability: Avoid unnecessary rejections by precise identification of (weak) preconditions
  • Reliability*: Guarantee behavior preservation
• Enabling Technology: Deep Program Analysis
  • Program analysis infrastructure for legacy enterprise software systems: a powerful static analysis infrastructure using the plan-calculus intermediate representation
* See http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports for examples of bugs in modern IDEs
As-Is Version: Photo Album Web Application Source: Alex Chaffee, draft “Refactoring to MVC” online article, 2002
To-Be Version: Photo Album Web Application
[Figure labels: Controller, View, Presentation Model, Model]
Source: Alex Chaffee, draft "Refactoring to MVC" online article, 2002
Small Refactorings on the move to MVC
• All kinds of renaming
  • Variables, fields, methods, etc.
• Extracting program entities
  • Constants, local (temp) variables, parameters, methods (Extract Method, Replace Temp with Query, Decompose Conditional), classes (Extract Class, Extract Superclass, Extract Method Object)
  • Some reverse refactorings too, to inline program entities
• Moving program entities
  • Constants, fields, methods (Move Method, Pull-Up Method), statements (Swap Statements), classes
• Replace Algorithm
Shortcomings of Eclipse on the move to MVC
• Missing implementation for key transformations
  • Extract Class, Extract Method Object
• Buggy implementation of some refactorings
  • Extract/Inline Local Variable: Ignores potential modification of parameters (on any path from source to target location)
  • See http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports for examples of bugs in (earlier releases of) modern IDEs
• Restricted implementation of existing refactorings
  • Extract Method: contiguous code only; weak control over parameters
  • Move Method: Source class must have a field with type of target class
  • Extract Local Variable: No control over location of declaration
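To make the Extract/Inline Local Variable hazard concrete, here is a minimal Java sketch. It is not a reproduction of any specific Eclipse bug report; the class and method names are hypothetical and chosen only for illustration. It shows why a tool must check for modifications of a parameter on every path between the variable's declaration and its use before inlining.

```java
// Hypothetical illustration; names are assumptions, not taken from the slides.
class InlineTempHazard {
    // Original: 'base' captures the parameter's value before it is modified.
    static int original(int amount) {
        int base = amount * 2;   // local variable that a user asks to inline
        amount = amount + 10;    // parameter modified between declaration and use
        return base + amount;
    }

    // Naive inlining: 'amount * 2' is now evaluated after the modification,
    // so the result differs from the original.
    static int naivelyInlined(int amount) {
        amount = amount + 10;
        return amount * 2 + amount;
    }

    public static void main(String[] args) {
        System.out.println(original(5));        // prints 25
        System.out.println(naivelyInlined(5));  // prints 45
    }
}
```

A behavior-preserving tool would either reject this request or keep the temporary; the two printed values differ, which is exactly the kind of silent change the Reliability requirement rules out.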
Internal Representation: The Plan Calculus
• Wide-spectrum
  • Specification to implementation
• Canonical
  • Abstracts away from syntactic variations
• Language independent
  • All legacy languages have similar capabilities
• Expressive
  • Directly expresses program semantics in terms of data-flow and control-flow
• Convenient for machine manipulation
  • Naturally expresses semantic transformations
Rich, C. 1986. A formal representation for plans in the Programmer's Apprentice. In Readings in Artificial Intelligence and Software Engineering, C. Rich and R. C. Waters, Eds. Morgan Kaufmann Publishers, San Francisco, CA, 491-506.
Program Sliding
• A family of provably-correct code-motion untangling transformations
• Automate slice extraction: sequential composition of a selected slice with its complement (i.e. co-slice); useful for refactoring, componentization, the move to SOA, obfuscation, etc.
• Combine statement reordering with code duplication, including duplication of assignments
• Benefit from the best of leading earlier solutions without suffering some of their respective deficiencies
• Sliding is particularly strong in
  • Preserving behavior
  • Maximizing reuse (of the extracted computation's results, in the complement)
  • Minimizing code duplication, i.e. yielding a smaller, more desirable complement
  • Improving applicability, i.e. fewer reasons to reject a request

Original program:

    MOVE 0 TO TOTAL-SALE
    MOVE 0 TO TOTAL-PAY
    PERFORM VARYING i FROM 1 BY 1 UNTIL i > DAYS
        ADD SALE(i) TO TOTAL-SALE
        COMPUTE TOTAL-PAY = TOTAL-PAY + 0.1*SALE(i)
        IF SALE(i)>1000
            ADD 50 TO TOTAL-PAY
        END-IF
    END-PERFORM
    COMPUTE PAY = TOTAL-PAY / DAYS + 100
    COMPUTE PROFIT = 0.9*TOTAL-SALE - COST

After sliding (the slice computing PAY, followed by its complement computing PROFIT):

    MOVE 0 TO TOTAL-PAY
    PERFORM VARYING i FROM 1 BY 1 UNTIL i > DAYS
        COMPUTE TOTAL-PAY = TOTAL-PAY + 0.1*SALE(i)
        IF SALE(i)>1000
            ADD 50 TO TOTAL-PAY
        END-IF
    END-PERFORM
    COMPUTE PAY = TOTAL-PAY / DAYS + 100

    MOVE 0 TO TOTAL-SALE
    PERFORM VARYING i FROM 1 BY 1 UNTIL i > DAYS
        ADD SALE(i) TO TOTAL-SALE
    END-PERFORM
    COMPUTE PROFIT = 0.9*TOTAL-SALE - COST

Example source: Lakhotia and Deprez (rewritten in COBOL)
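The same untangling can be sketched in Java, the language of the Extract Method example earlier in the deck. This is a hand-written transliteration of the COBOL example, not output of the authors' tool; the class name, method names, and the use of a `double[]` for the SALE table are assumptions made for illustration. Once the slice for the pay computation is separated from its complement, each part can be extracted into its own method.

```java
// Hypothetical Java transliteration of the COBOL sliding example.
class SlidingSketch {
    // Tangled version: one loop computes both TOTAL-SALE and TOTAL-PAY.
    static double[] tangled(double[] sale, double cost) {
        int days = sale.length;
        double totalSale = 0;
        double totalPay = 0;
        for (int i = 0; i < days; i++) {
            totalSale += sale[i];
            totalPay += 0.1 * sale[i];
            if (sale[i] > 1000) {
                totalPay += 50;
            }
        }
        double pay = totalPay / days + 100;
        double profit = 0.9 * totalSale - cost;
        return new double[] { pay, profit };
    }

    // The slice: everything needed to compute PAY, and nothing else.
    static double computePay(double[] sale) {
        int days = sale.length;
        double totalPay = 0;
        for (int i = 0; i < days; i++) {
            totalPay += 0.1 * sale[i];
            if (sale[i] > 1000) {
                totalPay += 50;
            }
        }
        return totalPay / days + 100;
    }

    // The complement (co-slice): what remains, here the PROFIT computation.
    static double computeProfit(double[] sale, double cost) {
        double totalSale = 0;
        for (int i = 0; i < sale.length; i++) {
            totalSale += sale[i];
        }
        return 0.9 * totalSale - cost;
    }

    // Sequential composition of the slice with its complement.
    static double[] untangled(double[] sale, double cost) {
        return new double[] { computePay(sale), computeProfit(sale, cost) };
    }

    public static void main(String[] args) {
        double[] sale = { 500, 1500, 2000 };
        double[] a = tangled(sale, 1000);
        double[] b = untangled(sale, 1000);
        System.out.println(a[0] == b[0] && a[1] == b[1]);
    }
}
```

In this sketch the loop header is duplicated in both parts while each assignment ends up in exactly one of them, which mirrors the trade-off between code duplication and untangling described above.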
Towards a COBOL Refactoring Catalog
• Rename Paragraph
  • This refactoring might look trivial, but as with the renaming of variables, it must be done with care: the new name must be valid, it must not conflict with existing names, and it must be replaced correctly in each call (PERFORM, GO TO, etc.), without violating any column restrictions
• Split/Merge Paragraphs
  • When merging two consecutive paragraphs, one must check that the second is not referenced or, if it is, that each reference always follows a call to the first paragraph, so that the two calls can be merged. Similarly, one must verify that any call to the first paragraph is either followed by a call to the second or is a non-returning call that implies fall-through to the second paragraph
• Extract/Inline Paragraph/Section/Program
  • Could support clone detection too, such that upon extraction, the tool will identify (at least exact) clones of the selected code and suggest replacing them too with a call to the newly introduced program
• Extract Slice (through Sliding)
  • First support the extraction of the code for computing a set of variables in a selected compound statement (or sentence); later add support for extraction from internal program points, i.e., the slicing criterion involves pairs of program point and (sets of) variables of interest (at that particular point); and finally support arbitrary method extraction, i.e., the slicing criterion involves a set of statements (or sentences), not necessarily contiguous, for extraction
• Swap Consecutive (Independent) Executable Program Entities
  • Such as compound statements, sentences, and even paragraphs or sections
• Split/Merge (Consecutive) Conditionals
  • So long as two instances of the conditional's predicate are guaranteed to evaluate similarly
• Loop-Invariant Code Motion
  • Computation flavor: A loop-invariant computation is moved inside/outside that loop, as in optimizing compilers
  • Conditional flavor: Instead of a computation, it is a loop-invariant conditional being moved
  • If moved out, the loop itself is duplicated, for each branch of the conditional, but with each branch simplified based on the known conditional's result
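The conditional flavor of Loop-Invariant Code Motion can be sketched as follows. The example is in Java rather than COBOL to match the deck's earlier Java example, and the class, method, and parameter names are hypothetical. It assumes the predicate (`applyBonus`) is not modified inside the loop, which is the precondition that makes the conditional loop-invariant.

```java
// Hypothetical illustration of moving a loop-invariant conditional out of a loop.
class LoopInvariantConditional {
    // Before: the invariant predicate is tested on every iteration.
    static int before(int[] values, boolean applyBonus) {
        int total = 0;
        for (int v : values) {
            if (applyBonus) {
                total += v + 50;
            } else {
                total += v;
            }
        }
        return total;
    }

    // After: the conditional is moved out; the loop is duplicated once per
    // branch, and each copy is simplified using the known predicate value.
    static int after(int[] values, boolean applyBonus) {
        int total = 0;
        if (applyBonus) {
            for (int v : values) {
                total += v + 50;
            }
        } else {
            for (int v : values) {
                total += v;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = { 1, 2, 3 };
        System.out.println(before(values, true) == after(values, true));
        System.out.println(before(values, false) == after(values, false));
    }
}
```

Moving the conditional back inside (the reverse direction) merges the two loops again, which is only valid when the two loop copies differ solely in the simplified branch.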