Aiding Comprehension of Cloning Through Categorization • Cory Kapser and Michael W. Godfrey • Software Architecture Group, School of Computer Science, University of Waterloo
Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary
Motivation • Code duplication (“cloning”) is common in large, long-lived industrial software systems. • Negatively affects successful system evolution! • Thus, clone management or removal is desirable.
Problems with clone detection technologies • Comprehension • Result sets often provide little information beyond “it’s a clone” • Scalability • VERY large result sets typical • Accuracy • Esp. false positives
Proposed solution • Classification of clones • Improve comprehension through informative grouping and statistical analysis • Improve scalability through easier navigation • Improve accuracy through region-specific filtering
Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary
Code cloning • A serious problem in industrial software. • Typically, 15% of a system is duplicated code. • As high as 50% in some cases [Ducasse]
Reasons for code cloning • Perceived cost • Time constraints • Insufficient understanding of the underlying problem • Architectural clarity
Problems with clones • Maintenance • Size • Comprehension • Bugs (copied and new) • Indication of poor design
Managing clones • Removal • Documentation
Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary
Our approach • Perform clone detection • Extract/define “regions” from source code • Map clone pairs to regions • Classify clones • Filter clones • Display results
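The "map clone pairs to regions" step above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `Region` and `Fragment` types, the line-overlap measure, and the rule of preferring the innermost region on a tie are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Region:
    """A syntactic region extracted from source (hypothetical shape)."""
    path: str
    start: int  # first line, inclusive
    end: int    # last line, inclusive
    kind: str   # e.g. "function", "loop", "conditional"


@dataclass(frozen=True)
class Fragment:
    """One side of a clone pair as reported by a clone detector."""
    path: str
    start: int
    end: int


def overlap(region: Region, frag: Fragment) -> int:
    """Number of lines shared by a region and a fragment."""
    if region.path != frag.path:
        return 0
    return max(0, min(region.end, frag.end) - max(region.start, frag.start) + 1)


def map_to_region(frag: Fragment, regions: Sequence[Region]) -> Optional[Region]:
    """Pick the region with the most overlap; on a tie, prefer the smaller
    (innermost) region. Both rules are assumptions for illustration."""
    best = max(
        regions,
        key=lambda r: (overlap(r, frag), -(r.end - r.start)),
        default=None,
    )
    if best is None or overlap(best, frag) == 0:
        return None
    return best
```

Each side of a clone pair would be mapped independently, so a pair ends up with two region labels that the classification step can compare.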
The taxonomy • Classifies clones according to attributes such as location and region type of a clone • Hierarchical
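A two-level classification by location and then by region type might look like the sketch below. The category names "Same Directory Clones" and "Function Clones" appear in the results slides; the "Same File" and "Different Directory" levels, the rule that both sides must share a region type, and the file paths in the usage note are assumptions made for this example.

```python
import posixpath
from typing import Optional, Tuple


def location_category(path_a: str, path_b: str) -> str:
    """Top level of the hierarchy: where the two fragments live."""
    if path_a == path_b:
        return "Same File Clones"
    if posixpath.dirname(path_a) == posixpath.dirname(path_b):
        return "Same Directory Clones"
    return "Different Directory Clones"


def classify(
    path_a: str, kind_a: Optional[str], path_b: str, kind_b: Optional[str]
) -> Tuple[str, str]:
    """Second level: the region type, when both sides agree (assumed rule)."""
    location = location_category(path_a, path_b)
    if kind_a is not None and kind_a == kind_b:
        return location, f"{kind_a.capitalize()} Clones"
    return location, "Unclassified"
```

For instance, `classify("fs/ext2/inode.c", "function", "fs/ext2/super.c", "function")` yields `("Same Directory Clones", "Function Clones")`; the paths are illustrative only.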
ADD A SLIDE HERE • To discuss what you hoped your taxonomy would help you with • Why did you pick that design? • Give an example of how using this taxonomy could be helpful in a (simple, made-up) example case
Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary
Case studies • PostgreSQL • 543,387 LOC • 1097 source files • Linux kernel file-system subsystem • 280,177 LOC • 537 source files
Filtering and classification results • 85–87% of clones could be classified using the taxonomy • Fewer unclassified clones in the Same Directory Clones category • A large percentage of false positives was removed by filtering structural and prototype regions.
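Region-specific filtering can be sketched as below. The slides do not state the actual filter criteria, so this assumes the simplest possible rule: drop a clone pair when both of its sides fall in a region type known to produce false positives. The tuple layout and the kind labels `"struct"` and `"prototype"` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (kind of side A, kind of side B, pair identifier): a hypothetical layout.
ClonePair = Tuple[str, str, int]

NOISY_KINDS: FrozenSet[str] = frozenset({"struct", "prototype"})


def filter_pairs(
    pairs: Iterable[ClonePair], noisy: FrozenSet[str] = NOISY_KINDS
) -> List[ClonePair]:
    """Keep a pair unless both sides sit in a noisy region type."""
    return [p for p in pairs if not (p[0] in noisy and p[1] in noisy)]
```

A real filter would more plausibly apply stricter similarity thresholds inside these regions rather than discard every pair outright.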
Overall cloning in the systems • Function Clones dominate the Same Directory Clones. • Most cloning occurs within the same directory.
Frequency of clone types • Very few loop clones • Relatively many conditional clones • Function clones made up 38% of the clone pairs in the Linux fs and 53% of the clone pairs in PostgreSQL
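Percentages like those above fall out of a simple tally once every clone pair carries a category label. The helper below is a minimal sketch; the function name and the input format (one label per clone pair) are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def category_shares(categories: Iterable[str]) -> Dict[str, float]:
    """Percentage of clone pairs falling in each category."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```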
Is it possible to insert a table here with the results, even if it is partial (to show that the work is there and that there are numbers)? • Or maybe a graph? Nice to have this to imply: here’s all the hard work we did, boy did we sweat, and there are so many results that the observations are probably meaningful
Overview • Motivation • Background • Methods • Case Studies • Discussion • Summary
Cloning comprehension • Classification of clones can improve comprehension • Users will have a working understanding of what a clone of a certain type means • We believe navigation of the “clone space” will be greatly improved • We now know more about cloning as it occurs in a software system • Simple metrics are now available
Tool support • Clone Interpretation and Classification System (CICS) • Provides a GUI to navigate classified clones • Will provide benchmarking support for clone detection tools • Many features can be added to complement the sorting of clones in the taxonomy
Overview • Motivation • Background • Methods • Case Studies • Discussion • Summary
Summary • Management of clones is important for the healthy evolution of a software system • We can make the process of managing clones more comprehensible, scalable, and accurate
Future work • Deeper classification • Benchmark suite • IDE plugins • Evolution of clones