Predicting Bugs Using Antipatterns. Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan.
Predicting Bugs Using Antipatterns Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan
[Diagram: Past Defects, History of Churn (Zimmermann, Hassan et al.); Topic Modeling (Chen et al.); Code; Antipatterns → Model → Predict Bugs]
Antipatterns • weaknesses in design • not technically incorrect and don't prevent a system from functioning
Motivation Antipatterns indicate weaknesses in the design that may increase the risk of bugs in the future. (Fowler 1999)
Approach • Mining Source Code Repositories (CVS Repository) • Detecting Antipatterns • Calculating Metrics • Mining Bug Repositories (Bugzilla) • Analyzing (RQ1, RQ2, RQ3)
Mining Source Code Repositories: Studied Systems
Detecting Antipatterns • DECOR (Moha et al.) • 13 different antipatterns [Figure axes: # of Antipatterns, # Files]
Research Questions RQ1: Do antipatterns affect the density of bugs in files? RQ2: Do the proposed antipattern-based metrics provide additional explanatory power over traditional metrics? RQ3: Can we improve traditional bug prediction models with antipattern information?
RQ1: Do antipatterns affect the density of bugs in files? • Null Hypothesis: the density of bugs in files with antipatterns is the same as in files without antipatterns. • Test: Wilcoxon rank sum test
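The null hypothesis above can be checked with a few lines of code. The following is a minimal sketch of a two-sided Wilcoxon rank sum test using the normal approximation (no continuity correction). The function names and the two density samples are hypothetical and are not taken from the study. The sketch assumes the combined sample is not made up of one single repeated value.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p_value). A positive z means x tends to be larger than y.
    """
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    ranks = rank_with_ties(pooled)
    w = sum(ranks[:n1])            # rank sum of the first sample
    mean_w = n1 * (n + 1) / 2      # expected rank sum under the null
    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value


# Hypothetical bug densities, for illustration only.
with_antipatterns = [0.9, 1.4, 2.1, 0.7, 1.8, 1.2, 2.5, 1.1]
without_antipatterns = [0.2, 0.5, 0.1, 0.8, 0.3, 0.6, 0.0, 0.4]

z, p = wilcoxon_rank_sum(with_antipatterns, without_antipatterns)
print(f"z = {z:.3f}, p = {p:.4f}, reject null at 5%: {p < 0.05}")
```

For real data sets, a statistics package would normally be used instead of a hand-written test; the sketch only makes the ranking and decision steps explicit.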
RQ1: Do antipatterns affect the density of bugs in files? [Figure: Density of Bugs in Files with Antipatterns and in Files without Antipatterns]
RQ2: Metrics • Average Number of Antipatterns (ANA) • Antipattern Recurrence Length (ARL) • Antipattern Cumulative Pairwise Differences (ACPD) • Antipattern Complexity Metric (ACM)
RQ2: Example
Number of antipatterns per file in each version:
File     1.0  2.0  3.0  4.0  5.0  6.0
a.java   3    0    2    3    4    1
b.java   4    1    0    3    5    0
c.java   0    5    4    4    6    5
ANA(a.java) = 2.16, ARL(a.java) = 18.76, ACPD(a.java) = 0
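The ANA value in the example can be reproduced with a short sketch, assuming ANA is the arithmetic mean of a file's antipattern counts over the studied versions. That reading is an assumption, but it matches the reported ANA(a.java): 13/6 ≈ 2.1667, shown on the slide as 2.16. The function name is hypothetical, and the counts are listed in the order they appear in the example. ARL and ACPD are left out because the slides do not give their formulas.

```python
# Antipattern counts per version, in the order they appear in the example.
antipattern_counts = {
    "a.java": [3, 0, 2, 3, 4, 1],
    "b.java": [4, 1, 0, 3, 5, 0],
    "c.java": [0, 5, 4, 4, 6, 5],
}


def average_number_of_antipatterns(counts):
    """ANA, assumed here to be the mean antipattern count over versions."""
    if not counts:
        raise ValueError("at least one version is required")
    return sum(counts) / len(counts)


for file_name, counts in antipattern_counts.items():
    ana = average_number_of_antipatterns(counts)
    print(f"ANA({file_name}) = {ana:.4f}")
```

Because a mean does not depend on the order of its inputs, this result holds regardless of how the counts map to version numbers.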
RQ2: The antipattern-based metrics provide additional explanatory power over traditional metrics • ARL shows the biggest improvement
RQ3: Can we improve traditional bug prediction models with antipattern information? Intra-System Models • Step-wise analysis • Removing independent variables • Collinearity analysis
ARL remained statistically significant and had low collinearity with the other metrics [Figure axis: # Versions]
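The slides do not say how collinearity was measured. As one common check, and purely as an assumption, the sketch below computes pairwise Pearson correlations between metrics and flags pairs whose absolute correlation reaches a threshold. The metric names, the values, and the 0.8 threshold are all hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def collinear_pairs(metrics, threshold=0.8):
    """Return metric-name pairs whose |correlation| is at least threshold."""
    return [
        (a, b)
        for a, b in combinations(metrics, 2)
        if abs(pearson(metrics[a], metrics[b])) >= threshold
    ]


# Hypothetical per-file metric values, for illustration only.
metrics = {
    "loc": [120, 340, 560, 80, 910, 430],
    "churn": [15, 40, 70, 10, 110, 50],
    "arl": [5.2, 1.1, 0.0, 7.4, 2.3, 9.8],
}

for a, b in combinations(metrics, 2):
    print(f"r({a}, {b}) = {pearson(metrics[a], metrics[b]):+.3f}")
print("collinear pairs:", collinear_pairs(metrics))
```

In a model-building workflow, one variable from each flagged pair would typically be dropped before fitting; a metric that is not flagged carries information the others do not.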
RQ3: Can we improve traditional bug prediction models with antipattern information? • ARL can improve cross-system bug prediction on the two studied systems [Figure: F-measure]
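The slides report results as an F-measure without stating the variant. Assuming the standard F1 definition (the harmonic mean of precision and recall), a minimal sketch of the computation looks like this. The function name and the confusion counts are hypothetical and are not results from the study.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """F1 score: harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, where precision or
    recall would be zero or undefined.
    """
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts of buggy files predicted by a model.
score = f_measure(true_positives=30, false_positives=10, false_negatives=20)
print(f"F-measure = {score:.3f}")
```

Comparing this score for a model with and without a given metric is one way to read an improvement in prediction performance.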
RQ1: Do antipatterns affect the density of bugs in files? Hypothesis: There is no difference between the density of future bugs in files with antipatterns and in files without antipatterns. Test: We perform a Wilcoxon rank sum test to accept or reject the hypothesis at the 5% significance level (i.e., we reject it when the p-value < 0.05). Findings: In general, the density of bugs in a file with antipatterns is higher than the density of bugs in a file without antipatterns.