
An Empirical Study of the Relationship Between Code Bad Smells and Software Faults



  1. An Empirical Study of the Relationship Between Code Bad Smells and Software Faults. Min Zhang, School of Computer Science, University of Hertfordshire

  2. Introduction • What is a Code Bad Smell? • Problems in using Code Bad Smells • An overview of the empirical study • Code Bad Smell detection • Fault identification • Results and discussion • Conclusion • Q/A

  3. Code Bad Smells • The 22 Code Bad Smells are bad structures in source code informally identified by Fowler et al. (1999). • Fowler et al. (1999) suggest that Code Bad Smells can give “indications that there is trouble that can be solved by a refactoring”. • They are widely used for detecting refactoring opportunities in software (Mens and Tourwe, 2004).

  4. Problems in Using Code Bad Smells • Fowler et al. (1999) claim that Code Bad Smells are structures which have detrimental effects on software. However, little empirical evidence has been provided to support this claim. • Most existing Code Bad Smell detection tools are metric-based. We question their accuracy.

  5. An Empirical Study of the Relationship between Code Bad Smells and Faults • Objective: Capture the relationship between Code Bad Smells and faults • Targeted Code Bad Smells: Data Clumps, Message Chains, Middle Man, Speculative Generality, and Switch Statements • Research Data: • Eclipse Core Packages (Releases 3.0, 3.0.1, 3.0.2, 3.1 and 3.2) • Apache Commons Packages (Commons IO, Commons Logging, Commons Codec, Commons DbUtils, Commons DBCP, and Commons Net)

  6. Code Bad Smell Detection • Pattern-based Code Bad Smell detection • Define each Code Bad Smell as a particular code pattern • Ideas taken from Gamma et al.’s (1995) definition of the GoF Design Patterns • Use the Recoder API to analyse Java source code

  7. An Example: The Pattern-based Definition of the Message Chains Bad Smell
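
The pattern definition shown on this slide is not preserved in the transcript. As a rough illustration of the idea only, the sketch below flags runs of consecutive method calls in Java source text. It is not the study's Recoder-based pattern: the regular expression, the function name find_message_chains and the threshold min_links are all assumptions made for this example, and a text-level match is much cruder than an analysis of the parsed program.

```python
import re

# One call link such as ".getB()" or ".find(key)". Arguments that themselves
# contain parentheses are deliberately not handled in this sketch.
_CALL_LINK = r"\.\s*[A-Za-z_]\w*\s*\([^()]*\)"


def find_message_chains(java_source: str, min_links: int = 3) -> list[str]:
    """Return expressions with at least `min_links` chained method calls.

    `min_links` is a hypothetical threshold chosen for illustration; the
    slides do not state how long a chain must be to count as a smell.
    """
    pattern = re.compile(
        # Receiver: an identifier, optionally itself a call ...
        r"[A-Za-z_]\w*(?:\s*\([^()]*\))?"
        # ... followed by the required number of chained calls.
        r"(?:" + _CALL_LINK + r"){" + str(min_links) + r",}"
    )
    return [match.group(0) for match in pattern.finditer(java_source)]
```

For example, a statement such as order.getCustomer().getAddress().getZip() is reported, while a single call such as order.getCustomer() is not.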

  8. Fault Identification • Zimmermann et al.’s (2007) fault identification approach: • Locate the tokens “bug”, “fix(ed)” and “update(d)” in CVS comment messages. • If a version entry in CVS contains one or more of these tokens and the tokens are followed by numbers, the version entry is treated as a bug-fixing update. • Those numbers are treated as bug IDs. • Confirm each bug ID against the Bugzilla database.
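
The steps above can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the regular expression, the separators allowed between a token and its number, and the function names are assumptions, and a plain set of known IDs stands in for the Bugzilla lookup.

```python
import re

# Tokens named on the slide, followed by a number. Allowing only whitespace,
# "#" or ":" between token and number is an assumption of this sketch.
_FIX_TOKEN = re.compile(
    r"\b(?:bug|fix(?:ed)?|update(?:d)?)\b[\s#:]*(\d+)",
    re.IGNORECASE,
)


def candidate_bug_ids(commit_message: str) -> set[int]:
    """Numbers that follow a bug/fix/update token in one commit message."""
    return {int(number) for number in _FIX_TOKEN.findall(commit_message)}


def confirmed_bug_ids(commit_message: str, known_bug_ids: set[int]) -> set[int]:
    """Keep only candidates that exist in the bug database.

    `known_bug_ids` is a hypothetical stand-in for a Bugzilla query.
    """
    return candidate_bug_ids(commit_message) & set(known_bug_ids)
```

The confirmation step matters because the token match alone is loose: a message such as "update 3 copyright headers" yields the candidate 3, which is discarded unless a bug with that ID actually exists.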

  9. Results and Discussion: Binary Coding of the Existence of Code Bad Smells (1)

  10. Results and Discussion: Binary Coding of the Existence of Code Bad Smells (2)
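
The coding tables on these two slides are not preserved in the transcript. One plausible reading of the binary coding, sketched below, gives each of the five targeted Code Bad Smells one bit, so that every source code sample falls into one of 32 profiles and profile zero means that none of the five smells was detected. The bit order and the function names are assumptions made for illustration; the study's actual numbering may differ.

```python
# The five targeted smells, in the order listed on slide 5. Which smell maps
# to which bit is an assumption of this sketch.
SMELLS = (
    "Data Clumps",
    "Message Chains",
    "Middle Man",
    "Speculative Generality",
    "Switch Statements",
)


def profile_of(detected: set[str]) -> int:
    """Encode the set of smells detected in one sample as a 5-bit profile."""
    unknown = detected - set(SMELLS)
    if unknown:
        raise ValueError(f"not a targeted smell: {sorted(unknown)}")
    return sum(1 << bit for bit, smell in enumerate(SMELLS) if smell in detected)


def smells_in(profile: int) -> list[str]:
    """Decode a profile number back into the smells it contains."""
    return [smell for bit, smell in enumerate(SMELLS) if profile & (1 << bit)]
```

Under this encoding the five single-smell profiles are the powers of two (1, 2, 4, 8 and 16), and any profile whose second bit is set contains the Message Chains Bad Smell.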

  11. Results and Discussion: One-way Analysis of Variance, Eclipse Data (1)

  12. Results and Discussion: One-way Analysis of Variance, Eclipse Data (2) • The five profiles which indicate the existence of exactly one of the five Code Bad Smells have a significantly lower mean number of faults than profile zero. • All profiles which have a higher mean number of faults than profile zero contain both the Message Chains and the Switch Statements Bad Smells.
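
For readers unfamiliar with the test, a one-way analysis of variance compares the variation between group means with the variation inside the groups. Here the groups would be the profiles and the values the number of faults per source code sample. The sketch below computes only the F statistic, in plain Python; the function name and the input layout are assumptions, and the slides do not say which statistics package or post-hoc comparison the study used.

```python
from statistics import mean


def one_way_anova_f(groups: dict[str, list[float]]) -> float:
    """F statistic for a one-way ANOVA over the given groups.

    `groups` maps a group label (for example a profile number) to the
    observed values in that group (for example faults per sample).
    Requires at least two groups and some variation within the groups.
    """
    values = [v for group in groups.values() for v in group]
    grand_mean = mean(values)
    k = len(groups)   # number of groups
    n = len(values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups.values()
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (v - mean(group)) ** 2 for group in groups.values() for v in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A large F value means the group means differ by more than the within-group scatter would explain; judging significance additionally requires the F distribution with (k - 1, n - k) degrees of freedom.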

  13. Results and Discussion: the Message Chains and Switch Statements

  14. Results and Discussion: the Message Chains and Switch Statements • All source code samples associated with more than 10 faults contain the Message Chains Bad Smell. • The Switch Statements Bad Smell does not show a clear relationship with a high number of faults.

  15. Results and Discussion: One-way Analysis of Variance, Apache Data (1)

  16. Results and Discussion: One-way Analysis of Variance, Apache Data (2) • The five profiles which indicate the existence of exactly one of the five Code Bad Smells have a lower mean number of faults than profile zero. • None of the profiles containing the Message Chains Bad Smell shows a higher mean number of faults than profile zero.

  17. A Detailed Investigation of Message Chains • Objective: • To test whether the Message Chains Bad Smell is directly associated with faults. • To test whether the Message Chains Bad Smell is directly associated with particular types of faults. • Method: • Manually investigate 20 source code samples from the Eclipse project

  18. A Detailed Investigation of Message Chains: Direct Association with Faults

  19. A Detailed Investigation of Message Chains: Fault Classification • Classification Schema: An adapted version of Seaman et al.’s (2008) fault classification schema • Results:

  20. A Detailed Investigation of Message Chains: Result • Message Chains Bad Smell is not likely to be directly associated with faults, but it indicates a complicated software context. • Message Chains Bad Smell is likely to be associated with Algorithm/Method faults.

  21. Conclusion • Source code containing only one of the five Code Bad Smells is not likely to be fault prone. • The Message Chains Bad Smell could cause a high number of faults and is likely to be associated with Algorithm/Method faults, so it deserves further attention. • The Message Chains Bad Smell may not be directly associated with faults but it may indicate a complicated software context.

  22. Q/A

  23. References • FOWLER, M., BECK, K., BRANT, J., OPDYKE, W. & ROBERTS, D. (1999) Refactoring: Improving the Design of Existing Code, Addison Wesley. • GAMMA, E., HELM, R., JOHNSON, R. & VLISSIDES, J. (1995) Design patterns : elements of reusable object-oriented software, Reading, Mass., Addison-Wesley. • MENS, T. & TOURWE, T. (2004) A survey of software refactoring. Software Engineering, IEEE Transactions on, 30, 126-139. • SEAMAN, C. B., SHULL, F., REGARDIE, M., ELBERT, D., FELDMANN, R. L., GUO, Y. & GODFREY, S. (2008) Defect categorization: making use of a decade of widely varying historical data. Proceedings of the Second ACM-IEEE international symposium on Empirical software engineering and measurement. Kaiserslautern, Germany, ACM. • ZIMMERMANN, T., PREMRAJ, R. & ZELLER, A. (2007) Predicting Defects for Eclipse. IN PREMRAJ, R. (Ed.) Predictor Models in Software Engineering, 2007. PROMISE'07: ICSE Workshops 2007. International Workshop on.
