Authors: Barbara Kitchenham, David Ross Jeffery, and Colin Connaughton Misleading Metrics and Unsound Analyses Presenter: Gil Hartman IEEE Software 24(2), pp. 73-78, Mar-Apr 2007
About the authors Barbara Kitchenham - Professor of quantitative software engineering at Keele University, UK David Ross Jeffery - Professor of software engineering at the University of New South Wales, Australia Colin Connaughton - Metrics consultant for IBM’s Application Management Services, Sydney
Introduction • Software project management involves predicting and monitoring software development projects • Measurement is a valuable software-management support tool • Unfortunately, some of the available “expert” advice encourages the use of misleading metrics
Metrics in AMS • Data come from the Application Management Services (AMS) delivery group of IBM Australia • A CMM level 5 organization using standard metrics and analyses • The measurement program was intended to confirm each project’s productivity and to set improvement targets for future projects
ISO/IEC 15939 Software Measurement Process • Indicator: Average productivity • Function: Divide each project’s lines of code by that project’s hours of effort • Model: Compute the mean and standard deviation of all project productivity values • Decision criteria: Confidence intervals computed from the standard deviation (see the sketch below)
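The Function/Model/Decision-criteria chain on this slide can be illustrated with a short Python sketch. The project figures below are hypothetical, not the AMS data: compute each project's productivity, then the mean, standard deviation, and a normal-theory confidence interval, which is exactly the analysis the paper goes on to question.

```python
# Minimal sketch of the indicator described above, using hypothetical data.
import statistics
import math

# Hypothetical (lines of code, effort in hours) pairs, one per project.
projects = [(12000, 800), (4500, 300), (20000, 2500), (1500, 60), (9000, 1100)]

productivity = [loc / hours for loc, hours in projects]   # LOC per hour, per project
mean = statistics.mean(productivity)
std_dev = statistics.stdev(productivity)                  # sample standard deviation

# 95% confidence interval for the mean, assuming (questionably) normal data.
half_width = 1.96 * std_dev / math.sqrt(len(productivity))
print(f"mean = {mean:.2f} LOC/hour, "
      f"95% CI = ({mean - half_width:.2f}, {mean + half_width:.2f})")
```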
Non-normal data distributions • Frequency plot of the AMS productivity data over four years • The simple average isn’t a good estimate of a typical project’s productivity
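A quick numeric illustration of the point about skew, using hypothetical productivity values: a couple of high-productivity outliers pull the mean well above the median, so the mean no longer describes a typical project.

```python
# Hypothetical, skewed productivity values (function points per person-month).
import statistics

productivity = [2, 3, 3, 4, 4, 5, 5, 6, 7, 40, 55]
print(statistics.mean(productivity))    # ~12.2 -- inflated by the two outliers
print(statistics.median(productivity))  # 5     -- closer to a "typical" project
```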
Productivity for application 1 • The standard deviation across all projects is very large • The mean and standard deviation of the total data don’t necessarily relate to a specific application
Application 2 • What can we conclude from the standard run chart?
Scatter plot vs run chart Productivity = Function points / Effort
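A minimal plotting sketch contrasting the two views, using hypothetical size and effort figures: the run chart shows productivity in project order, while the scatter plot relates size to effort directly.

```python
# Hypothetical per-project data; neither series is from the AMS dataset.
import matplotlib.pyplot as plt

size = [120, 300, 80, 450, 200, 600, 150]       # function points
effort = [400, 900, 350, 1300, 700, 1700, 500]  # person-hours
productivity = [s / e for s, e in zip(size, effort)]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(range(1, len(productivity) + 1), productivity, marker="o")
ax1.set(title="Run chart", xlabel="Project (time order)", ylabel="FP per hour")
ax2.scatter(size, effort)
ax2.set(title="Scatter plot", xlabel="Function points", ylabel="Effort (hours)")
plt.tight_layout()
plt.show()
```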
Run charts • Advantages • Can identify productivity trends over time • Provide a comparison with overall mean values • Disadvantages • Actual productivity values are difficult to interpret • The mean and standard deviation can be inflated by high-productivity values for small, unimportant projects
Lessons learned - DO • Base all analysis of project data on data from similar projects • Use graphical representations of productivity data • Use the relationship between effort and size to develop regression models (see the sketch below) • Logarithmic transformations • Actual effort vs. predicted effort • Statistical confidence intervals
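A minimal sketch of the regression-based approach recommended above, using hypothetical data and an ordinary least-squares fit on the logs: regress log(effort) on log(size), then compare actual effort with predicted effort instead of relying on mean productivity.

```python
# Hypothetical project sizes and efforts; the fit itself follows the slide's advice.
import numpy as np

size = np.array([120, 300, 80, 450, 200, 600, 150])       # function points
effort = np.array([400, 900, 350, 1300, 700, 1700, 500])  # person-hours

# Fit log(effort) = a * log(size) + b, then transform back to hours.
a, b = np.polyfit(np.log(size), np.log(effort), 1)
predicted = np.exp(b) * size ** a

for actual, pred in zip(effort, predicted):
    print(f"actual {actual:5.0f} h   predicted {pred:7.1f} h")
```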
Lessons learned - DON’T • Use the mean and standard deviation for either monitoring or prediction purposes • Analyze projects that are dissimilar simply to get more data • Use any metrics that are constructed from the ratio of two independent measures unless you’re sure you understand the measure’s implications
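One well-known pitfall of ratio measures, shown with hypothetical figures: the mean of per-project productivity ratios can differ sharply from overall productivity (total size divided by total effort), so conclusions drawn from the averaged ratio may not hold for the portfolio as a whole.

```python
# A tiny project and a large project, with hypothetical sizes and efforts.
sizes = [100, 2000]     # function points
efforts = [50, 4000]    # person-hours

per_project = [s / e for s, e in zip(sizes, efforts)]
mean_of_ratios = sum(per_project) / len(per_project)   # (2.0 + 0.5) / 2 = 1.25
ratio_of_totals = sum(sizes) / sum(efforts)            # 2100 / 4050 ≈ 0.52

print(mean_of_ratios, ratio_of_totals)  # the small project dominates the average
```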
Conclusion • Charts and metrics can sometimes be misleading • But they often help present statistics and data in a way that is easy to grasp