
Unification and Refactoring of Clones

This study explores the harmful effects of code clones and their association with error-proneness and increased maintenance effort and cost. It proposes a new approach to clone refactoring, addressing the limitations of current tools and improving software evolution.


Presentation Transcript


  1. Unification and Refactoring of Clones. Giri Panamoottil Krishnan and Nikolaos Tsantalis, Department of Computer Science & Software Engineering. (Clone images created by Rebecca Tiarks et al.)

  2. Motivation • Clones may be harmful • Clones are associated with error-proneness due to inconsistent updates (Juergens et al. @ ICSE’09) • Clones significantly increase maintenance effort and cost (Lozano et al. @ ICSM’08) • Clones are change-prone (Mondal et al. 2012) • Some studies have shown that clones are stable • Takeaway: some clones need to be refactored. IEEE CSMR-WCRE 2014 Software Evolution Week

  3. Motivation cont'd • Current refactoring tools perform poorly. • A study by Tairas & Gray [IST’12] on Type-II clones detected by Deckard in 9 open-source projects revealed that only 10.6% of them could be refactored by Eclipse, and that CeDAR [IST’12] was able to refactor 18.7% of them. • Takeaway: tools should be able to refactor more clones.

  4. Limitation #1 • Current tools can parameterize only a small subset of the differences in clones. • These are mostly differences between variable identifiers, literals, and simple method calls. • Example (Clone #1 vs. Clone #2): Rectangle rectangle = new Rectangle(a, b, c, high - low); vs. Rectangle rectangle = new Rectangle(a, b, c, getHeight());
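The two statements on this slide differ only in their last argument, so one way to unify them is to turn that argument into a parameter of an extracted method. The sketch below is an illustration under assumptions, not the authors' tool output: the class, the stand-in `Rectangle` type, and the method names `createRectangle`, `cloneWithExpression`, and `cloneWithCall` are all hypothetical.

```java
// Sketch: unifying the two clones by parameterizing their single difference.
// Everything except the two differing expressions is a hypothetical name.
class RectangleClones {
    // Minimal stand-in for the Rectangle type used on the slide.
    static final class Rectangle {
        final int x, y, width, height;
        Rectangle(int x, int y, int width, int height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }
    }

    private final int storedHeight;

    RectangleClones(int storedHeight) { this.storedHeight = storedHeight; }

    int getHeight() { return storedHeight; }

    // Extracted method: the differing expression becomes the parameter 'height'.
    static Rectangle createRectangle(int a, int b, int c, int height) {
        return new Rectangle(a, b, c, height);
    }

    // Former clone whose difference is an arithmetic expression.
    Rectangle cloneWithExpression(int a, int b, int c, int high, int low) {
        return createRectangle(a, b, c, high - low);
    }

    // Former clone whose difference is a method call.
    Rectangle cloneWithCall(int a, int b, int c) {
        return createRectangle(a, b, c, getHeight());
    }
}
```

Passing the value as an argument is behavior-preserving here only because both expressions are free of side effects; an expression with side effects would be evaluated at a different point than in the original code.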

  5. Limitation #2 • Current approaches may return non-optimal matching solutions. • They do not explore the entire search space of possible matches. • In case of multiple possible matches, they select the “first” or “best” match. • They face scalability issues due to the problem of combinatorial explosion.

  6. Clone #1 and Clone #2. First fragment: if (orientation == VERTICAL) { Line2D line = new Line2D.Double(); double y0 = dataArea.getMinY(); double y1 = dataArea.getMaxY(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(start2d, y0, start2d, y1); g2.draw(line); } if (range.contains(end)) { line.setLine(end2d, y0, end2d, y1); g2.draw(line); } } else if (orientation == HORIZONTAL) { Line2D line = new Line2D.Double(); double x0 = dataArea.getMinX(); double x1 = dataArea.getMaxX(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(x0, start2d, x1, start2d); g2.draw(line); } if (range.contains(end)) { line.setLine(x0, end2d, x1, end2d); g2.draw(line); } } Second fragment: if (orientation == VERTICAL) { Line2D line = new Line2D.Double(); double x0 = dataArea.getMinX(); double x1 = dataArea.getMaxX(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(x0, start2d, x1, start2d); g2.draw(line); } if (range.contains(end)) { line.setLine(x0, end2d, x1, end2d); g2.draw(line); } } else if (orientation == HORIZONTAL) { Line2D line = new Line2D.Double(); double y0 = dataArea.getMinY(); double y1 = dataArea.getMaxY(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(start2d, y0, start2d, y1); g2.draw(line); } if (range.contains(end)) { line.setLine(end2d, y0, end2d, y1); g2.draw(line); } } Result: 24 differences, NOT APPROVED.

  7. Clone #1 and Clone #2 [Animation frame: the same two fragments as on the previous slide, with the bodies of the VERTICAL and HORIZONTAL branches being re-aligned so that identical bodies face each other.]

  8. Clone #1 and Clone #2. First fragment: if (orientation == VERTICAL) { Line2D line = new Line2D.Double(); double y0 = dataArea.getMinY(); double y1 = dataArea.getMaxY(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(start2d, y0, start2d, y1); g2.draw(line); } if (range.contains(end)) { line.setLine(end2d, y0, end2d, y1); g2.draw(line); } } else if (orientation == HORIZONTAL) { Line2D line = new Line2D.Double(); double x0 = dataArea.getMinX(); double x1 = dataArea.getMaxX(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(x0, start2d, x1, start2d); g2.draw(line); } if (range.contains(end)) { line.setLine(x0, end2d, x1, end2d); g2.draw(line); } } Second fragment: if (orientation == HORIZONTAL) { Line2D line = new Line2D.Double(); double y0 = dataArea.getMinY(); double y1 = dataArea.getMaxY(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(start2d, y0, start2d, y1); g2.draw(line); } if (range.contains(end)) { line.setLine(end2d, y0, end2d, y1); g2.draw(line); } } else if (orientation == VERTICAL) { Line2D line = new Line2D.Double(); double x0 = dataArea.getMinX(); double x1 = dataArea.getMaxX(); g2.setPaint(im.getOutlinePaint()); g2.setStroke(im.getOutlineStroke()); if (range.contains(start)) { line.setLine(x0, start2d, x1, start2d); g2.draw(line); } if (range.contains(end)) { line.setLine(x0, end2d, x1, end2d); g2.draw(line); } } Result: 2 differences, APPROVED.

  9. Minimizing differences • Minimizing the differences during the matching process is critical for refactoring. • Why? • Fewer differences mean fewer parameters for the extracted method (i.e., a more reusable method). • Fewer differences also mean a lower probability of precondition violations (i.e., higher refactoring feasibility). • Matching process objectives: • Maximize the number of matched statements • Minimize the number of differences between them
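To make the second objective concrete, the sketch below counts differences between aligned statements using a deliberately simplified, token-level definition. This definition and the class `DifferenceCountSketch` are assumptions for illustration; the slides do not specify how the approach itself counts differences, so the numbers produced here are not comparable to the 24 and 2 reported earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified, token-level notion of "difference".
class DifferenceCountSketch {
    // Split a statement into identifier/literal tokens, dropping punctuation.
    static List<String> tokens(String statement) {
        List<String> out = new ArrayList<>();
        for (String t : statement.split("\\W+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // Differences between two statements compared token by token;
    // surplus tokens in the longer statement each count as one difference.
    static int differences(String s1, String s2) {
        List<String> a = tokens(s1), b = tokens(s2);
        int common = Math.min(a.size(), b.size());
        int count = Math.abs(a.size() - b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) count++;
        }
        return count;
    }

    // Total differences when statement i of one body is matched with statement i of the other.
    static int totalDifferences(List<String> body1, List<String> body2) {
        int total = 0;
        for (int i = 0; i < Math.min(body1.size(), body2.size()); i++) {
            total += differences(body1.get(i), body2.get(i));
        }
        return total;
    }
}
```

With the statements `double y0 = dataArea.getMinY();` and `line.setLine(start2d, y0, start2d, y1);` as one body, matching against the corresponding x0/x1 statements yields 6 token differences under this definition, while matching against an identical body yields 0. Choosing which bodies to pair therefore directly determines how many parameters the extracted method would need.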

  10. Limitation #3 • Current tools define no preconditions to determine whether clones can be safely refactored. • The parameterization of differences might change the behavior of the program. • Statements in gaps need to be moved before the cloned code. Changing the order of statements might also affect the behavior of the program.

  11. Our goal: improve the state of the art in the refactoring of software clones. • Given two code fragments containing clones, find the control structures that can potentially be refactored. • Find an optimal mapping between the statements of two clones. • Make sure that the refactoring of the clones will preserve program behavior. • Find the most appropriate refactoring strategy to eliminate the clones.

  12. Our approach [Figure: three-phase pipeline] Control Structure Matching → (isomorphic CDT pairs) → PDG Mapping → (differences, unmapped statements) → Precondition Examination

  13. Phase 1: Control Structure Matching • Intuition: two pieces of code can be merged only if they have an identical control structure. • We extract the Control Dependence Trees (CDTs) representing the control structure of the input methods or clones. • We find all non-overlapping largest common subtrees within the CDTs. • Each subtree match will be treated as a separate refactoring opportunity.
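The idea of looking for identical control structure can be sketched as a subtree comparison over two trees. The code below is a minimal sketch under several assumptions: the `Node` type and its string-valued `kind` are hypothetical, children are compared in order, a "subtree" means a node together with all of its descendants, and only the size of the single largest match is reported rather than all non-overlapping matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of comparing control structure in two trees.
class CdtMatchingSketch {
    // A control node such as "if", "for" or "while", with its nested control nodes.
    static final class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();
        Node(String kind, Node... kids) {
            this.kind = kind;
            for (Node k : kids) children.add(k);
        }
    }

    // Number of nodes in the subtree if a and b have identical structure, else 0.
    static int isomorphicSize(Node a, Node b) {
        if (!a.kind.equals(b.kind) || a.children.size() != b.children.size()) return 0;
        int size = 1;
        for (int i = 0; i < a.children.size(); i++) {
            int s = isomorphicSize(a.children.get(i), b.children.get(i));
            if (s == 0) return 0;
            size += s;
        }
        return size;
    }

    // Collect a node and all of its descendants.
    static void collect(Node n, List<Node> out) {
        out.add(n);
        for (Node c : n.children) collect(c, out);
    }

    // Size of the largest pair of structurally identical subtrees, tried over all node pairs.
    static int largestCommonSubtree(Node rootA, Node rootB) {
        List<Node> as = new ArrayList<>(), bs = new ArrayList<>();
        collect(rootA, as);
        collect(rootB, bs);
        int best = 0;
        for (Node a : as) {
            for (Node b : bs) best = Math.max(best, isomorphicSize(a, b));
        }
        return best;
    }
}
```

For example, a tree `method(if(for), while)` and a tree `method(if(for))` do not match at their roots, but they share the two-node subtree `if(for)`, which would then be one candidate refactoring opportunity.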

  14. CDT Subtree Matching [Figure: CDT of Fragment #1 and CDT of Fragment #2, shown side by side; node labels as extracted: x, A, a, y, B, C, b, c, D, E, F, G, f, g, d, e]

  15. Phase 2: PDG Mapping • We extract the PDG subgraphs corresponding to the matched CDT subtrees. • We want to find the common subgraph that satisfies two conditions: • It has the maximum number of matched nodes. • The matched nodes have the minimum number of differences. • This is an optimization problem that can be solved using an adaptation of a Maximum Common Subgraph algorithm [McGregor, 1982].

  16. MCS Algorithm • Builds a search tree in depth-first order, where each node represents a state of the search space. • Explores the entire search space. • It has exponential worst-case complexity. • As the number of possible matching node combinations increases, the width of the search tree grows rapidly (combinatorial explosion).
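The depth-first exploration and the two-part objective from the previous slide can be sketched as a plain exhaustive backtracking search. This is not McGregor's algorithm and not the authors' adaptation: it omits the pruning a practical implementation would need, and the node compatibility rule (equal `type` strings), the boolean edge matrices, and the precomputed `diff` matrix are all assumptions made for the illustration.

```java
import java.util.Arrays;

// Hypothetical brute-force search for a common subgraph of two small graphs.
class McsSearchSketch {
    final String[] typeA, typeB;     // statement kinds; nodes match only if kinds are equal
    final boolean[][] edgeA, edgeB;  // edge[p][q] is true if there is a dependence from p to q
    final int[][] diff;              // diff[i][j]: differences if node i of A maps to node j of B
    int bestMatched = -1;
    int bestDiffs = Integer.MAX_VALUE;
    int[] bestMapping;               // bestMapping[i] = matched node of B, or -1 if unmapped

    McsSearchSketch(String[] typeA, boolean[][] edgeA,
                    String[] typeB, boolean[][] edgeB, int[][] diff) {
        this.typeA = typeA; this.edgeA = edgeA;
        this.typeB = typeB; this.edgeB = edgeB;
        this.diff = diff;
    }

    int[] solve() {
        int[] mapping = new int[typeA.length];
        Arrays.fill(mapping, -1);
        search(0, mapping, new boolean[typeB.length], 0, 0);
        return bestMapping;
    }

    // Each call is one state of the search tree: nodes 0..i-1 of A are already decided.
    private void search(int i, int[] mapping, boolean[] usedB, int matched, int diffs) {
        if (i == typeA.length) {
            // Prefer more matched nodes; break ties by fewer differences.
            if (matched > bestMatched || (matched == bestMatched && diffs < bestDiffs)) {
                bestMatched = matched;
                bestDiffs = diffs;
                bestMapping = mapping.clone();
            }
            return;
        }
        for (int j = 0; j < typeB.length; j++) {
            if (!usedB[j] && compatible(i, j, mapping)) {
                mapping[i] = j;
                usedB[j] = true;
                search(i + 1, mapping, usedB, matched + 1, diffs + diff[i][j]);
                mapping[i] = -1;
                usedB[j] = false;
            }
        }
        search(i + 1, mapping, usedB, matched, diffs); // also try leaving node i unmapped
    }

    // A new pair is allowed only if it preserves the edges to all pairs mapped so far.
    private boolean compatible(int i, int j, int[] mapping) {
        if (!typeA[i].equals(typeB[j])) return false;
        for (int p = 0; p < i; p++) {
            int q = mapping[p];
            if (q < 0) continue;
            if (edgeA[p][i] != edgeB[q][j] || edgeA[i][p] != edgeB[j][q]) return false;
        }
        return true;
    }
}
```

Because every node of the first graph may be paired with any unused compatible node of the second graph or left unmapped, the number of explored states grows exponentially with the number of compatible candidates, which is the combinatorial explosion the slide refers to.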

  17. Divide-and-Conquer • We break the original matching problem into smaller sub-problems based on the control dependence structure of the clones. • Finally, we combine the sub-solutions to give a global solution to the original matching problem.

  18. Bottom-up Divide-and-Conquer [Figure: CDT subtree of Clone #1 (nodes A, B, C, D, E, F, G) and CDT subtree of Clone #2 (nodes a, b, c, d, e, f, g)] Level 2: best sub-solution from (D, d)

  19. Bottom-up Divide-and-Conquer [Figure: CDT subtree of Clone #1 (nodes A, B, C, E, F, G) and CDT subtree of Clone #2 (nodes a, b, c, e, f, g), with the pair (D, d) already processed] Level 2: best sub-solution from (E, e)
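The bottom-up order suggested by these slides can be sketched as a level-by-level traversal over matched pairs of control nodes, starting at the deepest level. The sketch rests on assumptions: `ControlPair`, `levels`, and `solveBottomUp` are hypothetical names, the shape of the tree in the usage note is invented because the figure is not preserved, and each sub-problem is treated as independent and reduced to a single number (matched statements), whereas the actual approach may pass more information from a sub-solution to its parent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch of solving matching sub-problems from the deepest level upward.
class BottomUpSketch {
    // A matched pair of control nodes, e.g. (D, d), with the matched pairs nested under it.
    static final class ControlPair {
        final String name;
        final List<ControlPair> children = new ArrayList<>();
        ControlPair(String name, ControlPair... kids) {
            this.name = name;
            Collections.addAll(children, kids);
        }
    }

    // Group the pairs by depth: index 0 holds the root, the last list the deepest pairs.
    static List<List<ControlPair>> levels(ControlPair root) {
        List<List<ControlPair>> levels = new ArrayList<>();
        List<ControlPair> current = Collections.singletonList(root);
        while (!current.isEmpty()) {
            levels.add(current);
            List<ControlPair> next = new ArrayList<>();
            for (ControlPair p : current) next.addAll(p.children);
            current = next;
        }
        return levels;
    }

    // Solve each sub-problem from the deepest level upward and sum the matched statements.
    // visitOrder records the order in which the pairs were processed.
    static int solveBottomUp(ControlPair root, ToIntFunction<ControlPair> localSolver,
                             List<String> visitOrder) {
        List<List<ControlPair>> levels = levels(root);
        int totalMatched = 0;
        for (int depth = levels.size() - 1; depth >= 0; depth--) {
            for (ControlPair p : levels.get(depth)) {
                visitOrder.add(p.name);
                totalMatched += localSolver.applyAsInt(p);
            }
        }
        return totalMatched;
    }
}
```

For a hypothetical tree in which (A, a) contains (B, b) and (C, c), which in turn contain (D, d), (E, e) and (F, f), (G, g), the pairs are processed in the order (D, d), (E, e), (F, f), (G, g), (B, b), (C, c), (A, a). Each `localSolver` call only has to search among the statements nested directly under one pair, which keeps every individual search small.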

  20. Phase 3: Precondition examination • Preconditions related to clone differences: • Parameterization of differences should not break existing data dependences in the PDGs. • Reordering of unmapped statements should not break existing data dependences in the PDGs. • Preconditions related to method extraction: • The unified code should return one variable at most. • Matched branching (break, continue) statements should be accompanied by the corresponding matched loops in the unified code.
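The precondition that the unified code returns at most one variable follows from the fact that a Java method has a single return value. A minimal sketch of such a check is shown below, assuming the sets of variable names have already been computed by some data-flow analysis; the class and method names are hypothetical and the check is a simplification of what a real tool would examine.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of the "at most one returned variable" precondition.
class ReturnPreconditionSketch {
    // Variables assigned inside the fragment whose values are still needed after it.
    static Set<String> variablesToReturn(Set<String> definedInFragment,
                                         Set<String> usedAfterFragment) {
        Set<String> result = new LinkedHashSet<>(definedInFragment);
        result.retainAll(usedAfterFragment);
        return result;
    }

    // The fragment can become a single method only if it has at most one such variable.
    static boolean satisfiesSingleReturn(Set<String> definedInFragment,
                                         Set<String> usedAfterFragment) {
        return variablesToReturn(definedInFragment, usedAfterFragment).size() <= 1;
    }
}
```

If a fragment defines `line`, `y0`, and `y1` and only `line` is used afterwards, the check passes; if both `line` and `y0` are used afterwards, the extracted method would have to return two values and the check fails.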

  21. Evaluation • We compared our approach with a state-of-the-art tool in the refactoring of Type-II clones, CeDAR [Tairas & Gray, IST’12]. • The dataset comprises 2342 clone groups, detected in 7 open-source projects by the Deckard clone detection tool. • CeDAR is able to analyze only clone groups in which all clones belong to the same Java file.

  22. Clone groups within the same Java file

  23. Clone groups within different Java files • Clones in different files are more difficult to refactor: 36% vs. 27%.

  24. Conclusions • Our approach was able to refactor 83% more clone groups than CeDAR. • Our approach assessed as refactorable 27% of the clone groups in which clones are placed in different files. • The study revealed that 36% of the clone groups can be refactored directly or in the form of sub-clones.

  25. Visit our project at http://jdeodorant.com
