
Lectures 17 & 18: Does Code Decay?

“As part of our experience with the production of software for a large telecommunications system, we have observed a nearly unanimous feeling among developers of the software that the code degrades through time and maintenance becomes increasingly difficult and expensive.” Eick et al., 1998.


Presentation Transcript


  1. “As part of our experience with the production of software for a large telecommunications system, we have observed a nearly unanimous feeling among developers of the software that the code degrades through time and maintenance becomes increasingly difficult and expensive.” Eick et al., 1998. MSc Software Maintenance. Lectures 17 & 18: Does Code Decay? Dr Andy Brooks

  2. Case Study. Reference: Does Code Decay? Assessing the Evidence from Change Management Data, Stephen G. Eick, Todd L. Graves, Alan F. Karr, J. S. Marron, and Audris Mockus, NISS-TR-81 (1998), National Institute of Statistical Sciences, 19 T. W. Alexander Drive, PO Box 14006, Research Triangle Park, NC 27709-4006, USA. http://www.niss.org/technicalreports/tr81.pdf “Whether this code decay is real, how it can be characterized, and the extent to which it matters are the questions we address in this paper.” Eick et al., 1998

  3. Previous Work “Early investigations of aging in large software systems by Belady and Lehman [2], [3], [4] reported the near impossibility of adding new code to an aged system without introducing faults.” Eick et al., 1998

  4. Access To Large Data Set • Entire change management history of a 15-year-old, real-time software system for telephone switches: • 100,000,000 lines of code • C, C++, proprietary state description language • 100,000,000 lines of header and make files • Some 50 major subsystems and 5,000 modules • Here, a module is a directory containing several files. • Each release is some 20,000,000 lines of code. • 10,000 developers have been involved.

  5. Categories Of Change • Adaptive • new functionality (e.g. caller ID) • adaptations to new hardware or other changes in environment • Corrective • fixing faults • Perfective • improve maintainability of software • reengineering (refactoring)

  6. Change Process • A new feature (e.g. call waiting) involves hundreds of Initial Modification Requests (IMRs). • Each IMR results in a number of Modification Requests (MRs). • Developers open MRs, perform the changes, and make limited checks that the changes are satisfactory. • Inspections, integration tests, and system tests follow. • An editing change to a single file is captured as a delta. • Lines added and deleted are tracked separately. • A line edit is recorded as a deletion followed by an addition.

  7. Data Tracked By Version Management System • 89 fields, including priority, date opened, date closed, problem, and solution (change & reasons)

  8. Answering Questions About Change Data • Example questions: What files were changed? How many modules, files, and lines were affected? ... • D: answered directly from the version management database • A: answered by aggregation over constituent parts • D*: problematic aspects

  9. What Is Code Decay? • “Code is decayed if it is more difficult to change than it used to be.” • But changes may also become harder simply because the requested changes are inherently more difficult. • Decayed code does not mean that the software fails to meet current requirements. • Decayed code means it is difficult to add new functionality or make other changes.

  10. What Is Code Decay? • Decayed code may have increased value. • The changes that have caused the decay mean more functionality for the customer. • A code unit can decay as a result of changes elsewhere in the software. • A code unit can be inherently complex, so attributing the difficulty of making a change to decay can be misleading. Andy says: a complex application will result in complex software.

  11. Individual Ability • Making changes is less difficult for a more able software maintainer. • Making changes is more difficult for a junior software maintainer. • “A definitive adjustment for developer ability has not been devised and usually we must relegate developer variability to ‘noise’ terms in our models.”

  12. Causes Of Decay • Inappropriate architecture • changes have wide scope • Violation of original design principles • fixed phone -> mobile/fixed phone • Imprecise requirements • ‘crisp code’ not produced • Time pressure • short-cuts, sloppy code, kludges • limited code understanding

  13. Causes Of Decay • Inadequate programming tools • Organizational environment • excessive staff turnover • developers fail to communicate properly • Programmer variability • weak programmers may not understand complex code written by more able colleagues • Inadequate change process • missing version control • handling changes in parallel

  14. Medical Metaphor (disease symptoms, prognosis) • The software is a patient with a disease called code decay. • What are the causes of the disease? • changes made to the code • What are the disease symptoms? • What is the prognosis if you have the disease? • What are the relevant risk factors for the disease? Andy says: I hope you do not smoke.

  15. Symptoms Of Code Decay • Excessively complex code • useful metrics: • standard software complexity metrics? • # loops & conditionals enclosing a line? • A history of frequent changes • also known as ‘code churn’ • A history of faults • fault fixes themselves may not be examples of good programming

  16. Symptoms Of Code Decay • Widely dispersed changes • Changes to well-engineered code tend to be local (within a class). • Kludges • Changes made in the knowledge that they could have been done more elegantly or more efficiently. • Numerous interfaces (entry points) • Possible side-effects of changes elsewhere.

  17. Risk Factors For Code Decay: risk factors increase the chance of decay or worsen its effect. • Size of module m • NCSL(m), number of noncommentary source lines • Age of code • but very stable code might never be changed • variability of age within a code unit may be the key characteristic • Inherent complexity • real-time software is more likely to decay • Organizational churn • company knowledge base degraded • inexperienced developers make changes

  18. Risk Factors For Code Decay: risk factors increase the chance of decay or worsen its effect. • Ported or reused code • Ariane 5 crash was caused by reused code from Ariane 4 • http://edition.cnn.com/WORLD/9606/04/rocket.explode/ • Requirements load • a very large number of requirements is difficult to understand and implement • Inexperienced developers • lack of knowledge • lack of understanding of system architecture (3-tier?)

  19. Code Decay Indices (CDIs) notation • c for changes (MRs) • l for lines of code • f for files • m for modules • c->m means ‘c touches m’ • Part of m is changed by c. • 1{A} • equals 1 if event A occurs • equals 0 otherwise

  20. Code Decay Indices (CDIs) notation • DELTAS(c) • number of deltas associated with c • ADD(c) • number of lines added by c • DEL(c) • number of lines deleted by c • DATE(c) • date on which c is completed • INT(c) • the calendar time required to implement c • DEV(c) • number of developers implementing c
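The notation on slides 19 and 20 can be read as simple functions over a change record. The sketch below is one possible encoding, not the authors' tooling: the `Delta` and `Change` classes, their field names, and the choice to measure INT(c) as days between an assumed `opened` date and the completion date are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Delta:
    """One editing change to a single file (hypothetical record layout)."""
    module: str
    file: str
    added: int
    deleted: int


@dataclass(frozen=True)
class Change:
    """One change c, i.e. one MR (hypothetical record layout)."""
    deltas: tuple
    opened: date          # assumption: start of the implementation interval
    completed: date
    developers: frozenset


def deltas_count(c):      # DELTAS(c)
    return len(c.deltas)


def lines_added(c):       # ADD(c)
    return sum(d.added for d in c.deltas)


def lines_deleted(c):     # DEL(c)
    return sum(d.deleted for d in c.deltas)


def completion_date(c):   # DATE(c)
    return c.completed


def interval_days(c):     # INT(c), assumed here to be completed - opened, in days
    return (c.completed - c.opened).days


def developer_count(c):   # DEV(c)
    return len(c.developers)


def touches(c, m):        # 1{c->m}: 1 if any delta of c changes module m, else 0
    return 1 if any(d.module == m for d in c.deltas) else 0
```

Keeping the indicator `touches` as a 0/1 integer mirrors the 1{A} notation, so the indices on the following slides can be written as plain sums.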

  21. Historical Count Of Changes • The number of changes to a module m in the time interval I: [formula not preserved in this transcript] • With |I| indicating length of time interval I, the frequency of changes is: [formula not preserved in this transcript]
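Read with the notation of slides 19 and 20, the count on slide 21 is plausibly the sum over changes c of 1{c->m} times 1{DATE(c) in I}, and the frequency is that count divided by |I|. The sketch below follows that reading. The function names, the dictionary layout of a change, the half-open interval, and the use of days as the unit of |I| are assumptions made for illustration.

```python
from datetime import date


def count_changes(changes, module, start, end):
    """Number of changes touching `module` completed in the interval [start, end)."""
    return sum(
        1
        for c in changes
        if module in c["modules"] and start <= c["date"] < end
    )


def change_frequency(changes, module, start, end):
    """Changes per day to `module` over [start, end); |I| is measured in days."""
    length_days = (end - start).days
    if length_days <= 0:
        raise ValueError("interval must have positive length")
    return count_changes(changes, module, start, end) / length_days


# Hypothetical change records: which modules each change touches, and when it completed.
example_changes = [
    {"modules": {"m1"}, "date": date(1996, 3, 2)},
    {"modules": {"m1", "m2"}, "date": date(1996, 3, 5)},
    {"modules": {"m2"}, "date": date(1996, 3, 6)},
    {"modules": {"m1"}, "date": date(1996, 4, 1)},  # outside the interval used below
]
```

For example, `count_changes(example_changes, "m1", date(1996, 3, 1), date(1996, 3, 11))` counts the two March changes to `m1`, and `change_frequency` divides that count by the ten-day interval.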

  22. Span Of Changes (Scope Of Changes) • The span is the number of files touched by a change: [formula not preserved in this transcript] • Changes touching more files are more difficult because: • The maintainer might have to spend time understanding unfamiliar files. • Code interfaces might have to be modified.
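The span of a change, written FILES(c) on slide 33, can be sketched as a count of distinct files among the deltas of one change. The representation of a delta as a `(module, file)` pair is an assumption for illustration.

```python
def span(change_deltas):
    """FILES(c): number of distinct files touched by one change.

    `change_deltas` is an iterable of (module, file) pairs, one per delta.
    Files are identified by the pair, so equal file names in different
    modules count separately.
    """
    return len(set(change_deltas))


def touches_more_than_one_file(change_deltas):
    """Indicator 1{FILES(c) > 1}, the quantity tracked over time on slide 27."""
    return 1 if span(change_deltas) > 1 else 0
```

Several deltas to the same file count once, which matches the idea of span as files touched rather than edits made.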

  23. Size • The size of a module m, NCSL(m), is the sum of NCSL(f) over all files f in m. • “most standard software complexity metrics are almost perfectly correlated with NCSL in our data sets”
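As a rough illustration of the size index, the sketch below counts noncommentary source lines for C-style source text and sums them over the files of a module. It is a deliberately crude counter, not the tool used in the study: it ignores comment markers inside string literals, and it treats a block comment that spans code on both sides as joining those lines.

```python
import re


def ncsl(source):
    """Crude NCSL(f): non-blank lines left after removing C-style comments."""
    without_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    without_line = re.sub(r"//[^\n]*", "", without_block)
    return sum(1 for line in without_line.splitlines() if line.strip())


def module_ncsl(files):
    """NCSL(m): sum of NCSL(f) over all files f in module m.

    `files` maps a file name to its source text (hypothetical layout).
    """
    return sum(ncsl(text) for text in files.values())
```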

  24. Age • AGE(m) • the average age of constituent lines • Variability in line ages is also of interest • The tool SeeSoft produces a visualization of the variability in line ages: • files represented by boxes • lengths of lines in the boxes proportional to the number of characters • files that change little have mostly a single color • files that have been changed a lot are multi-colored
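AGE(m) and the variability in line ages can be sketched as a mean and a spread over the dates on which each line was last changed. Using the population standard deviation as the measure of variability, and days as the unit, are assumptions of this sketch; the slide does not say which statistic the authors use.

```python
from datetime import date
from statistics import mean, pstdev


def line_ages_in_days(last_changed, today):
    """Age of each line, given the date it was last changed."""
    return [(today - d).days for d in last_changed]


def average_age(last_changed, today):
    """AGE(m): mean age of the constituent lines, in days."""
    return mean(line_ages_in_days(last_changed, today))


def age_variability(last_changed, today):
    """Spread of line ages (population standard deviation), in days."""
    return pstdev(line_ages_in_days(last_changed, today))
```

A file that is rarely edited has lines of similar age and a small spread, which corresponds to the mostly single-color boxes described for SeeSoft.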

  25. SeeSoft View Of One Module [figure not included in this transcript]

  26. SubSystem Under Analysis • 100 modules • 2,500 files • 6,000 IMRs • 27,000 MRs • 130,000 deltas • 500 different login names made code changes to the subsystem

  27. Temporal Behavior Of The Span Of Changes (different window widths) • The probability that a change will touch more than one file more than doubles, from less than 2% in 1989 to more than 5% in 1996. • Ripples in the high-resolution smooth are not statistically significant. [Figure: smoothed probability against date, 1989 to 1996, with the initial development period marked.]
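A simple way to look at the same trend is the raw yearly share of changes that touch more than one file. This is only a sketch: it is not the smoothing with different window widths described on the slide, and the input layout of `(year, files_touched)` pairs is an assumption.

```python
from collections import defaultdict


def multi_file_share_by_year(changes):
    """Fraction of changes per year that touch more than one file.

    `changes` is an iterable of (year, files_touched) pairs.
    Returns a dict mapping year to a share in [0, 1], in ascending year order.
    """
    totals = defaultdict(int)
    multi = defaultdict(int)
    for year, files_touched in changes:
        totals[year] += 1
        if files_touched > 1:
            multi[year] += 1
    return {year: multi[year] / totals[year] for year in sorted(totals)}
```

With few changes per year the raw shares are noisy, which is one reason a smoothed estimate over a time window is easier to interpret.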

  28. Breakdown In Modularity? • Alone, the increase in span of changes does not imply a breakdown in the modularity of the subsystem. • The increase could simply reflect the growth of the subsystem, and changes with a wide span need not cross module boundaries.

  29. Network Visualization Tool NicheWorks • Each tadpole shape corresponds to a module. • The tadpole tail indicates the picture at the end of the previous year. • Pairs of modules are placed nearby if they have been changed together as part of the same MRs a large number of times.
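The layout rule on slide 29 depends on how often two modules are changed together in the same MR. The sketch below computes those pair counts only; it does not reproduce NicheWorks or its layout algorithm, and the input layout (one collection of module names per MR) is an assumption.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(mr_modules):
    """Count, for each pair of modules, the MRs that changed both.

    `mr_modules` is an iterable with one collection of module names per MR.
    Keys of the result are alphabetically ordered (a, b) tuples.
    """
    counts = Counter()
    for modules in mr_modules:
        for pair in combinations(sorted(set(modules)), 2):
            counts[pair] += 1
    return counts
```

Pairs with high counts would be drawn close together, so a rise in counts between modules from previously separate clusters is the kind of signal described on slide 30.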

  30. NicheWorks View Of The SubSystem Modules (1988, 1989, 1996) • The architecture that separated the functionality of two clusters of modules is breaking down.

  31. Alternative Interpretation • Example changes: implement caller-ID; provide an extra area-code digit. • The inherent difficulty of the desired changes could have been increasing. • The modification request data are not examined independently from this perspective.

  32. Prediction Of Faults (Quality Prognosis) • The best model derived from the data predicts numbers of faults using numbers of changes to the module in the past. • Large recent changes add most to the fault potential. • Parameter 0.75 was determined by statistical analysis. • The number of times a module has been changed is a better predictor than size. • The number of developers working on a module had no effect on fault potential.
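The slide does not preserve the model's formula, so the following is only a sketch of the stated idea that large, recent changes add most to fault potential. It assumes a form in which each past change contributes the logarithm of its size, damped exponentially with its age, and it assumes that 0.75 is the damping rate per year. The function name, the `log(1 + size)` term, and the omission of any scaling constant are illustrative choices, not the authors' fitted model.

```python
import math


def fault_potential(past_changes, damping=0.75):
    """Illustrative fault potential of a module.

    `past_changes` is an iterable of (age_years, lines_changed) pairs, one per
    past change to the module. Each change contributes log(1 + lines_changed),
    weighted by exp(-damping * age_years), so recent and large changes count most.
    """
    return sum(
        math.exp(-damping * age_years) * math.log(1 + lines_changed)
        for age_years, lines_changed in past_changes
    )
```

Under these assumptions a 100-line change made this year contributes far more than the same change made five years ago, and more changes always mean more potential, in line with the slide's remark that the number of past changes predicts faults better than size.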

  33. Prediction Of Effort (Effort Prognosis) • “Can the effort required to implement changes be predicted from symptoms and risk factors for decay?” • Effort data, available only at the feature level, displayed extreme variability, so the results are only suggestive: • A dependency on FILES(c) was discovered, supporting the idea that the span of changes is a symptom of decay. • Some changes involved a small number of deltas but required close to maximum effort.

  34. Summary (Eick et al.) Four analyses demonstrate: • “The increase over time in the number of files touched per change to the code. • The decline in modularity of a subsystem of the code, as measured by changes touching multiple modules. • Contributions of several factors (notably, frequency and recency of change) to fault rates in modules of code, and • That span and size of changes are important predictors (at the feature level) of the effort to implement a change.”

  35. Summary (Eick et al.) • The system studied showed no evidence of dramatic, widespread decay: • In seven years, the probability of a change touching more than one file increased only from under 2% to over 5%. • The architecture that separated the functionality of two clusters of modules is breaking down. • Can code decay prove fatal? • “there are anecdotal reports of systems that have reached a state from which further change is not possible”

  36. Modification Request Difficulty • The nature of the modification requests over time was not analysed, so alternative interpretations of the data set cannot be ruled out. • How can you measure the inherent difficulty of a modification request? • By the span of changes? • By the complexity of the textual description & justification? • The temporal behaviour of the span of changes could be due to the inherent difficulty of modification requests increasing with time. Andy says: we do not know for this data.
