Alexander Serebrenik and Mark van den Brand
Theil index for aggregation of software metrics values
Metrics and evolution
• We measure at the micro level, but we need insight at the macro level
/ SET / W&I
How can we aggregate values?
• Industry: sum, average
  • Not always meaningful
• Theory: distribution fitting
  • Controversial even for LOC
  • Requires a separate effort for each metric
• Econometrics:
  • Measures of inequality (for wealth distribution)
  • Vasa et al. 2009: Gini coefficient
    • Not decomposable!
  • We: Theil index
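The slides do not spell out the formula, so the following is only a minimal sketch that assumes the standard econometric definition of the Theil index, T = (1/n) Σ (x_i/μ) ln(x_i/μ), applied to strictly positive metric values. The function name `theil` and the LOC lists are hypothetical and chosen for illustration.

```python
import math

def theil(values):
    """Theil index T of strictly positive values (0 means perfect equality)."""
    n = len(values)
    mean = sum(values) / n
    # Each term weights the log-ratio to the mean by the value's ratio to the mean.
    return sum((x / mean) * math.log(x / mean) for x in values) / n

# Hypothetical LOC per file, for illustration only.
equal_files = [100, 100, 100, 100]
skewed_files = [3, 50, 120, 18800]

print(theil(equal_files))   # 0.0: all files have the same size
print(theil(skewed_files))  # positive, and bounded above by ln(4) for four files
```

Unlike a sum or an average, the result does not depend on the unit of measurement, and its upper bound ln(n) is reached only when a single artifact holds the entire total.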
Decomposition?
• Groups of individuals
• How can we explain inequality?
  • By programming language, development team, application domain
• How does the inequality evolve? How do the explanations evolve?
• To measure inequality we use the Theil index!
  • Why? There are just two decomposable indices…
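Decomposability can be written out as follows. This is the textbook decomposition of the Theil index and is not taken from the slides; the symbols are assumptions introduced here: n values with mean μ are partitioned into groups g of size n_g with mean μ_g, and T_g is the Theil index computed inside group g.

```latex
T = \frac{1}{n}\sum_{i=1}^{n} \frac{x_i}{\mu}\,\ln\frac{x_i}{\mu}
  = \underbrace{\sum_{g} s_g\, T_g}_{\text{within groups}}
  + \underbrace{\sum_{g} s_g \ln\frac{\mu_g}{\mu}}_{\text{between groups}},
\qquad
s_g = \frac{n_g\,\mu_g}{n\,\mu}
```

Here s_g is the share of the total (for example, of all LOC) held by group g. The larger the between-groups term is relative to T, the better the grouping explains the inequality.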
Evolution of the Theil index
[Figure: evolution of the Theil index for JBoss, Debian, and Adempiere]
• Slight increase in the index values.
• A huge file added and then removed.
• “Quite stable in time, meaningful deviations”
Explanation of inequality in LOC
[Figure: panels for Adempiere and Debian; groupings: package, programming language]
• Programming language is quite poor as an explanation.
• Categories/packages are better as an explanation.
• The most significant part of the inequality is due to the inequality within the groups.
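How well a grouping "explains" inequality can be sketched as the between-group share of the total Theil index. The code below assumes the standard Theil definition and decomposition, not a formula given on the slides. The helper names `theil` and `decompose` and the file data are hypothetical.

```python
import math
from collections import defaultdict

def theil(values):
    """Theil index T of strictly positive values."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values) / n

def decompose(values, labels):
    """Split the Theil index into (within, between) for the given grouping."""
    groups = defaultdict(list)
    for x, label in zip(values, labels):
        groups[label].append(x)
    total = sum(values)
    mean = total / len(values)
    within = between = 0.0
    for members in groups.values():
        share = sum(members) / total              # group's share of all LOC
        group_mean = sum(members) / len(members)
        within += share * theil(members)          # inequality inside the group
        between += share * math.log(group_mean / mean)
    return within, between

# Hypothetical files: (LOC, language, package), for illustration only.
files = [(3, "SQL", "migration"), (18800, "SQL", "migration"),
         (120, "Java", "base"), (150, "Java", "base"),
         (90, "Java", "client"), (110, "Java", "client")]
loc = [f[0] for f in files]
for name, idx in (("language", 1), ("package", 2)):
    within, between = decompose(loc, [f[idx] for f in files])
    print(name, "explains", round(between / (within + between), 3))
```

A share close to 0 means that the grouping explains little and most inequality lies within the groups, which is the situation the slide reports.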
OK… within a group. But which one?
[Figure: two Adempiere panels]
• Only two languages contribute significantly to the inequality: Java and SQL.
• Different packages contribute most at different versions.
• More and more migration scripts, some very small (3 LOC), some rather big (18800 LOC).
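To see which group drives the within-group inequality, each group's term s_g · T_g in the within-group sum can be ranked. This is a sketch under the standard Theil decomposition. The function name `within_group_contributions` and the data are hypothetical; only the sizes 3 and 18800 echo the migration scripts mentioned above.

```python
import math
from collections import defaultdict

def within_group_contributions(values, labels):
    """Map each group to s_g * T_g, its term in the within-group sum."""
    groups = defaultdict(list)
    for x, label in zip(values, labels):
        groups[label].append(x)
    total = sum(values)
    contributions = {}
    for label, members in groups.items():
        group_mean = sum(members) / len(members)
        # Theil index computed inside the group only.
        group_theil = sum((x / group_mean) * math.log(x / group_mean)
                          for x in members) / len(members)
        contributions[label] = (sum(members) / total) * group_theil
    return contributions

# Hypothetical LOC per file and the language of each file.
loc = [3, 18800, 120, 150, 90, 110]
lang = ["SQL", "SQL", "Java", "Java", "Java", "Java"]
ranked = sorted(within_group_contributions(loc, lang).items(),
                key=lambda item: item[1], reverse=True)
for label, value in ranked:
    print(f"{label}: {value:.4f}")
```

In this made-up data the SQL group ranks first, because it mixes a tiny and a very large script and also holds most of the total LOC.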
Debian (preliminary results)
• The largest contributions to inequality come from ANSI C. Due to the presence of header files?
Conclusions
• Decomposability for software metrics
  • provides insight into the reasons for inequality
  • allows comparing different groups
• The Theil index is a decomposable inequality measure
• Useful for assessing evolution
  • of a system as a whole
  • of the system's subcomponents