Alexander Serebrenik and Mark van den Brand

Presentation Transcript


  1. Theil index for aggregation of software metrics values (Alexander Serebrenik and Mark van den Brand)

  2. Metrics and evolution: ?
     • Measure: micro, need: macro
     / SET / W&I

  3. How can we aggregate values?
     • Industry: sum, average
       • Not always meaningful
     • Theory: distribution
       • Controversial even for LOC
       • Requires separate effort for each metric
     • Econometrics:
       • Measures of inequality (for wealth distribution)
       • Vasa et al. 2009: Gini coefficient
         • Not decomposable!
       • We: Theil coefficient
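
To illustrate why sum and average are "not always meaningful", here is a minimal sketch that computes an inequality index next to the average. It assumes the first Theil index (Theil T); the slides do not say which variant is used. The function name `theil` and the LOC values are hypothetical.

```python
import math

def theil(values):
    """Theil T index of non-negative values; 0 means perfect equality."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    # Terms with x == 0 contribute 0, following the convention 0 * ln 0 = 0.
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n

# Two hypothetical systems with the same total and average LOC per file.
even = [100, 100, 100, 100]
skewed = [10, 10, 10, 370]

print(sum(even) / len(even), theil(even))
print(sum(skewed) / len(skewed), theil(skewed))
```

Both lists have the same sum and average, yet only the second has a non-zero index, which is the information that sum and average throw away.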

  4. Decomposition?
     • Groups of individuals
     • How can we explain inequality?
       • programming language, development team, application domain
     • How does the inequality evolve? How do the explanations evolve?
     • To measure inequality I, we use the Theil index!
       • Why? There are just two decomposable indices…
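
Decomposability can be written down concretely. As a sketch, assume the index is the first Theil index (Theil T), which the slides do not confirm. The notation is introduced here for illustration: values $x_1,\dots,x_n$ with mean $\bar{x}$ are split into groups $g$ with $n_g$ members, group mean $\bar{x}_g$, and group-internal index $T_g$.

```latex
T = \frac{1}{n}\sum_{i=1}^{n}\frac{x_i}{\bar{x}}\ln\frac{x_i}{\bar{x}}
  = \underbrace{\sum_{g} s_g\,T_g}_{\text{within groups}}
  + \underbrace{\sum_{g} s_g \ln\frac{\bar{x}_g}{\bar{x}}}_{\text{between groups}},
\qquad s_g = \frac{n_g\,\bar{x}_g}{n\,\bar{x}}
```

The between-group term is the part of the inequality that the grouping (for example, programming language) explains; the within-group term is what remains inside the groups.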

  5. Evolution of the Theil index
     [Figure: Theil index over time for JBoss, Debian and Adempiere]
     • Slight increase in the index values.
     • A huge file added and then removed.
     • “Quite stable in time, meaningful deviations”

  6. Explanation of inequality in LOC
     • Adempiere:
       • Programming language is quite poor as an explanation.
       • Categories/packages are better as an explanation.
       • The most significant part of the inequality is due to the inequality within the groups.
     • Debian: [figure comparing package and programming language as explanations]
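
How well a grouping "explains" inequality can be sketched as the between-group share of the total index. The code below assumes the first Theil index (Theil T); the helper names `theil` and `decompose` and all LOC values are hypothetical, not taken from the Adempiere or Debian data.

```python
import math

def theil(values):
    """Theil T index of a list of positive values."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n

def decompose(groups):
    """Split the Theil T index of all values into (between, within) parts.

    groups maps a group label to the list of metric values in that group.
    """
    all_values = [x for values in groups.values() for x in values]
    total = sum(all_values)
    mean = total / len(all_values)
    between = within = 0.0
    for values in groups.values():
        share = sum(values) / total            # group's share of the total value
        group_mean = sum(values) / len(values)
        between += share * math.log(group_mean / mean)
        within += share * theil(values)
    return between, within

# Hypothetical LOC per file, grouped by programming language.
loc_by_language = {
    "Java": [120, 300, 45, 2200, 80],
    "SQL": [7, 9500, 40, 12],
    "XML": [60, 75, 50],
}
between, within = decompose(loc_by_language)
explained = between / (between + within)
print(f"between={between:.3f} within={within:.3f} explained={explained:.1%}")
```

A low `explained` value corresponds to the observation on the slide: most of the inequality sits within the groups, so the grouping is a poor explanation.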

  7. OK… within a group. But which one? (Adempiere)
     • Only two languages contribute significantly to the inequality: Java and SQL.
     • Different packages contribute most at different versions.
     • More and more migration scripts, some very small (3 LOC), some rather big (18800 LOC).
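
The question "which group?" can be sketched by listing each group's term in the within-group part of the index. This assumes the first Theil index (Theil T), where a group's term is its share of the total value times its own index. The names `theil` and `within_contributions`, the package labels, and the LOC values are hypothetical.

```python
import math

def theil(values):
    """Theil T index of a list of positive values."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n

def within_contributions(groups):
    """Return {label: share * T_group}, each group's part of the within-group term."""
    total = sum(sum(values) for values in groups.values())
    return {label: (sum(values) / total) * theil(values)
            for label, values in groups.items()}

# Hypothetical LOC per file, grouped by package.
loc_by_package = {
    "migration": [5, 8, 9000, 30],
    "model": [200, 220, 180],
    "util": [50, 50, 50],
}
contributions = within_contributions(loc_by_package)
# Print groups from largest to smallest contribution.
for label, value in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {value:.4f}")
```

Running this per version of a system would show how the ranking of contributors changes over time.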

  8. Debian (preliminary results): the largest contributor to inequality is ANSI C. Presence of header files?

  9. Conclusions
     • Decomposability for software metrics
       • provides insight into the reasons for inequality
       • allows comparing different groups
     • Theil index is a decomposable inequality measure
     • Useful for assessing evolution
       • of a system as a whole
       • of the system's subcomponents
