
Why Software Metric Programs Fail


Presentation Transcript


  1. Why Software Metric Programs Fail Jim Pederson Software Process Improvement Network (SPIN) April 26, 2001

  2. About the Presenter • Metrics Practitioner – not a Guru • Gurus don’t really exist • Primary background is in Aerospace • C-17, SEI level 5, December 2001 • Boeing Corporate and Site Teams and Councils • Database and tool development • Secondary experience is in Teaching • Broad exposure to all software domains • I don’t believe in making process certification a goal in itself apart from business objectives. • I won’t recommend anything that I haven’t done and can’t implement myself.

  3. What are Metrics and Why do Them? • Support Decision Making (or at least they should) • Must be Timely and Representative • Must be accepted by decision maker(s) • Comparison to Expectation • May be either stated or inferred • At higher levels, expectations must be quantitatively defined • Are frequently done to impress • Assessments • Internal Initiatives • Customers

  4. A Layered Architecture View of Software Metrics

  5. What’s Needed to do Metrics • Analysis Layer • Method or process of using data to make decisions • Presentation Layer • Metric Outputs (Charts and Reports) • Data Layer • Underlying data that supports the presentation layer • Process Layer • The process that the data represents • Must also address the capture of data from the process

  6. Metric Analysis Layer • Typically driven by deviation from expected outcome • Process Capability vs. Tolerance vs. “Seat of the Pants” • Deviations should be documented and managed • Analysis, Disposition, Implementation, Follow-up (Plan-Do-Check-Act) • Various Quality Tools should be used to support analysis • Fishbone, Correlation Analysis, DOE (in a couple of rare cases) • Unique metrics can be used to assess corrective action • In the SEI model, the SEPG would play an active role. • Relates project to organizational process (DPP, PCM) • Common or Repetitive Analysis can flow back to Presentation Layer • Willingness must exist on the part of the decision maker(s) to use data • Decision maker generally refers to project or functional management. • Could also refer to work team

  7. Metric Presentation Layer • Different Presentations for different purposes and customers • Project Metrics (most common and easiest to do) • Organizational Metrics (integrates multiple projects) • Process Metrics (crosses projects or domains and is ongoing) • Expected Outcomes should be depicted along with actual data • Initially this is an intuitive process (extrapolate to end point) • Defining expected outcomes requires historical process knowledge • Application of SPC is difficult and somewhat overrated (see the control-limit sketch below) • Common SPC applications represent continuous processes • Software data varies around the project lifecycle • Some sub-process data is fairly continuous but only for short periods of time • Some data (primarily cost related) is self-normalizing but is driven by lower-level data that isn’t. • Generally avoid percentage-based tolerances
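A minimal sketch of the control-limit point, using hypothetical per-build review-effort data: limits derived from the process's own variation say something about capability, while a flat percentage tolerance around a target does not. The numbers and the use of the sample standard deviation (a stand-in for a proper moving-range estimate) are illustrative assumptions, not part of the presentation.

    # Illustration only (hypothetical data): control limits derived from observed
    # process variation vs. a flat percentage tolerance around a planned target.
    import statistics

    review_hours_per_ksloc = [4.1, 3.8, 5.2, 4.6, 3.9, 4.4, 5.0, 4.2]

    mean = statistics.mean(review_hours_per_ksloc)
    sigma = statistics.stdev(review_hours_per_ksloc)  # stand-in for a moving-range estimate

    ucl, lcl = mean + 3 * sigma, mean - 3 * sigma      # limits reflect the process itself

    target = 4.5
    upper_tol, lower_tol = target * 1.10, target * 0.90  # +/-10% band, unrelated to capability

    print(f"process limits: {lcl:.2f} .. {ucl:.2f}")
    print(f"+/-10% tolerance: {lower_tol:.2f} .. {upper_tol:.2f}")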

  8. Metric Presentation Layer (cont.) • Deciding What to Measure • It’s difficult to know what’s worth measuring until you measure it • Don’t define metrics based on what is easy to measure • Establish metrics based on value to the whole • Interesting Observations • Staffing level is a simple metric that is highly predictive of project outcomes and also shows interaction between projects. • Cost metrics must be closely linked to meaningful schedules to have substance (earned value). • Early schedule slippages don’t recover – they generally get worse. • Number of iterative releases is the key variable in modeling project performance. • Most common corrective action is to re-plan the project. • Bad project outcomes are frequently linked directly to bad estimating. • Estimating, planning, and status monitoring (metrics) are intertwined.

  9. Metric Data Layer • Almost always underestimated and under-scoped • From a management perspective, this often can be equated to the “then a miracle occurs” block on a flow chart. • Some aspects of data collection can be done manually but this is generally undesirable • Person dependent • Recurring costs • Inaccurate data that is difficult to reproduce • Metric data automation needs to be approached as an IT development effort • Use of commercial tools doesn’t avoid this problem at all – it can make it even more complex in some respects.

  10. Metric Data Layer – Data Sources/Types • Schedule • Generally tracked in MS Project, which is easily integrated • Quality of schedule data is frequently poor • Cost • Project costs are generally captured in separate cost accounts • Progress should link to quantitative schedule data but frequently doesn’t • Defects and Change Requests • Universally kept but frequently don’t cover much of the life cycle • Requirements / Design • Either tracked in a COTS tool or a spreadsheet • Can supersede certain documents by handling content as data • Configuration • Several common commercial tools • Testing • Various COTS tools and test files - can be treated as data to some extent

  11. Metric Data Layer – Data Model • Estimating • Project Tracking • Project Schedule (generally MS Project) • Financial Performance Data (cost accounting system) • Requirements (RM tool) • Design Definition (RM and modeling tools) • Coding/Implementation (CM tool) • Test Definition (RM tool) • Test Files and Results • Change Requests / Defects • Change Management (a sketch of how these sources could be tied together follows below)
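As a sketch only: one way these sources could hang together in a metrics repository is a record per project that links schedule, cost, and change data by a common project key. The entity and field names below are illustrative assumptions; the slide does not prescribe a schema.

    # Hypothetical, simplified schema; names and fields are illustrative only.
    from dataclasses import dataclass, field
    from datetime import date
    from typing import List, Optional

    @dataclass
    class ScheduleTask:            # from the project schedule (e.g., an MS Project export)
        task_id: str
        planned_finish: date
        actual_finish: Optional[date] = None

    @dataclass
    class CostAccount:             # from the cost accounting system
        account_id: str
        budgeted_hours: float
        actual_hours: float = 0.0

    @dataclass
    class ChangeRequest:           # from the change management / defect tracking tool
        cr_id: str
        phase_found: str
        severity: str

    @dataclass
    class ProjectMetricsRecord:    # one record per project in the metrics data layer
        project_id: str
        tasks: List[ScheduleTask] = field(default_factory=list)
        cost_accounts: List[CostAccount] = field(default_factory=list)
        change_requests: List[ChangeRequest] = field(default_factory=list)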

  12. Metric Process Layer • The data is only as good as the underlying process • From a development perspective, an application (the data layer) can only model the process. • If the process is highly inconsistent, it will greatly limit the availability and validity of metrics data and outputs. • Metrics can show process variation • Look there first when investigating variations • Process also limits the degree to which metric data can capture and reflect the entire life cycle • Metrics need to match the process • Sophisticated metrics and an unsophisticated process will prove to be meaningless and a waste of time

  13. Software Metrics: State of the Practice

  14. Current State of the Practice • Wide gap exists between theory and practice • Theoretical side seems to be focused on academic credibility • Practitioner issues are driven by organizational politics and culture • Gap is caused by failure to address the whole problem • Few people will be equally comfortable with all layers • Move away from the comfort zone

  15. Theoretical Perspective • Driven by Academia, Consultants, and specialists from larger companies (primarily aerospace) • Focuses on presenting and analyzing data with emphasis on application of quality tools and statistical methods • Texts and presentation material on the subject tend to assume that the data exists • Tool features are heavily influenced by these specialists leading to increased complexity and underused features • Cultural and Human Factor issues have been under-addressed • Depth of problem and relation to upper management values • Theoretical perspective is weak in addressing many practical issues that impact metric deployment • Reflects a lack of “real world” experience

  16. Practitioner Perspective • Generally someone within an organization who has knowledge of process, statistics, and data • Breadth of knowledge required makes the pool of candidates small • Generally not the project manager or functional manager • Delegated task • Practitioner generally lacks adequate authority to act on data • Practitioner faces numerous constraints • Process limitations • Lack of data infrastructure • Poor tool integration • Interface problems between functional groups • Relevancy of effort is an ongoing issue • Decision maker interest in and use of data • Maintenance of data on the part of the developer population • Little victories and the value of simplicity

  17. The Use and Abuse of Metrics (Common Interpretational Issues)

  18. Common Misapplications – Product Size • Size is a key metric because it drives effort and is a normalizing data element for other metrics (e.g., defects per KSLOC) • Size can be measured in different ways • SLOC, Function Points, Feature Points, other • Measuring size is inherently difficult • Expertise, existing documentation, software logic • New and modified effort is what’s important (e.g., new/modified SLOC) – not the total size • Relatively few organizations can measure new and modified output • Measuring new and modified output is always subject to a certain amount of inaccuracy (see the diff-counting sketch below) • Size assessment can tend to reward inefficient design and implementation, in that bigger is generally interpreted as better.
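A rough illustration of the "new and modified" point, assuming two versions of a source file: counting added lines in a diff approximates new/modified output. Real SLOC counters are language-aware and exclude comments and blank lines, so this is a simplification, not the presenter's method.

    # Simplified illustration: count new/modified lines between two versions of a file.
    # Real SLOC counters are language-aware and exclude comments and blank lines.
    import difflib

    def new_and_modified_lines(old_text: str, new_text: str) -> int:
        diff = difflib.unified_diff(old_text.splitlines(), new_text.splitlines(), lineterm="")
        # Lines starting with '+' (excluding the '+++' file header) are new or changed.
        return sum(1 for line in diff if line.startswith("+") and not line.startswith("+++"))

    old = "int add(int a, int b) {\n  return a + b;\n}\n"
    new = "int add(int a, int b) {\n  /* overflow not handled */\n  return a + b;\n}\n"
    print(new_and_modified_lines(old, new))  # -> 1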

  19. Common Misapplications - Defects • Measurement of defects is meaningless apart from some assessment of size or effort (normalization; see the worked example below) • Process characteristics frequently limit this metric to the end of the project life cycle • Hard to use as a predictive metric • Always favors the later part of the life cycle regardless of process sophistication • What’s included? • Peer Reviews, Design Reviews, developer found, documentation • Lower than expected totals can be deceiving • Not found, not documented, behind schedule, inaccurate plan • Higher than expected totals can also be deceiving • Higher capture rate, greater than planned complexity, scope growth • A true assessment of defects should consider phase found and severity
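A minimal worked example of the normalization point, with hypothetical counts: raw defect totals only become comparable after dividing by new/modified size, and breaking them out by phase found shows where the process actually catches them.

    # Hypothetical numbers: defect counts are only comparable after normalizing by size.
    defects_by_phase = {"peer review": 42, "integration test": 18, "system test": 7}
    new_modified_ksloc = 25.0

    total_density = sum(defects_by_phase.values()) / new_modified_ksloc
    print(f"overall: {total_density:.1f} defects/KSLOC")

    for phase, count in defects_by_phase.items():
        print(f"{phase}: {count / new_modified_ksloc:.2f} defects/KSLOC")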

  20. Common Misapplications - Progress • Progress can be assessed either from schedule or financial data • Financial progress (earned value) should be linked directly to schedule but frequently isn’t (see the earned-value example below) • The less tangible the linkage, the more arbitrary the metric • Schedule progress can be approached two ways • Very detailed schedule (inch stones) • General schedule supported by detailed measures to assess progress • Either way, maintaining detailed progress information is difficult and needs the active support of the entire team • Everyone performing work must contribute data • Picking “low hanging fruit” can bias the metric (avoid straight-line projections) • Progress metrics are the simplest to present and are probably the most common type of metric. • They are also the hardest to support with data and can easily become highly arbitrary.
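The standard earned-value relationships behind this slide can be stated in a few lines; the to-date values below are hypothetical.

    # Standard earned-value indices, with hypothetical to-date values.
    planned_value = 120.0   # PV: budgeted cost of work scheduled to date
    earned_value = 100.0    # EV: budgeted cost of work actually completed
    actual_cost = 130.0     # AC: actual cost of that completed work

    spi = earned_value / planned_value   # schedule performance index (<1 means behind)
    cpi = earned_value / actual_cost     # cost performance index (<1 means over budget)

    print(f"SPI={spi:.2f}  CPI={cpi:.2f}")
    print(f"SV={earned_value - planned_value}  CV={earned_value - actual_cost}")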

  21. Common Misapplications - Productivity • Can be based either on dollars or hours • Hours are better because they remain constant over time (dollars are affected by rate changes) • Undefined size growth can make productivity appear poor (see the example below) • Size is difficult to assess during implementation • Unclear requirements can make size very difficult to predict up front. • Inefficient design (code bloat) can make productivity appear better than it really is • Resource utilization has a sharp impact on productivity • Functions/objects are normally assigned to individuals • Work scope by function or object is rarely constant • Resources don’t directly expand or contract to fit work scope • Significant productivity increases (as much as 4x) have been realized simply by adding and/or redistributing work • Never assume full-time resource availability • Unplanned tasks • Maintenance, Level of Effort, Overhead
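A small worked example of the size-growth point, with hypothetical numbers: if scope growth is never captured, the effort spent building it is charged against the original size and the team looks less productive than it is.

    # Hypothetical numbers: uncaptured size growth distorts the productivity metric.
    planned_ksloc = 20.0
    actual_ksloc = 26.0        # scope grew during implementation
    hours_spent = 4000.0

    print(f"against planned size: {planned_ksloc * 1000 / hours_spent:.1f} SLOC/hr")
    print(f"against actual size:  {actual_ksloc * 1000 / hours_spent:.1f} SLOC/hr")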

  22. Common Misapplications - Rework • Can be calculated a variety of ways • Actual hours by problem report • Estimated hours by PR based on PR attributes • Separate rework charge number • Tasks relative to schedule milestones (waterfall model) • None of these methods is particularly accurate and all are subject to manipulation. • Defining rework can be contentious • Non-functional vs. non-optimized • Classification of unplanned builds / releases • Some development strategies can significantly increase rework while improving overall efficiency • Re-use, auto code, auto translation

  23. Common Misapplications - Process • Separating the impact of a process change from other project factors is extremely difficult. • Readily subject to manipulation • Too many independent variables • ANOVA or DOE is the only way to do this convincingly (a minimal sketch follows below) • Some process changes can be valuable but meaningless • Should start with overall process performance and drill down • Don’t look for a target of opportunity and try to make it fit • Most processes are not continuous • Either start and stop or are cyclical • Process metrics cross domains and projects • Origin of data is not homogeneous • Be mindful of potential sub-populations
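A minimal sketch of the kind of test this slide alludes to, assuming hypothetical defect-density samples before and after a process change. It deliberately ignores the other project factors the slide warns about, so treat it as an illustration of the mechanics, not a convincing study design.

    # Illustration only: one-way ANOVA on hypothetical defect-density samples taken
    # before and after a process change. Other project factors are ignored here,
    # which is exactly the trap this slide warns about.
    from scipy.stats import f_oneway

    before = [4.2, 5.1, 3.8, 4.6, 5.0]   # defects/KSLOC on projects before the change
    after = [3.1, 3.6, 2.9, 3.4, 3.8]    # defects/KSLOC on projects after the change

    f_stat, p_value = f_oneway(before, after)
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # a small p suggests the difference is not noise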

  24. Common Misapplications – Giving Up • Application of software metrics is difficult • Subject to inaccuracy • Subject to manipulation • Easy to become cynical about overall value • Yet not everything is difficult • Relatively simple yet meaningful metrics exist • Problems are not insurmountable • Knowing potential problems helps you avoid them • Appearance of use vs. actual deployment • Striving for appearance ruins credibility • Consistency vs. Accuracy • Consistency is at least as important as accuracy • Breadth of usage • Data issues tend to be self-correcting when data is used

  25. Conclusions on Why Metric Programs Succeed and Fail

  26. Why Metric Initiatives Fail • Failure to adequately address Data Infrastructure • Assumption that data exists when it doesn’t • Failure to acknowledge resource requirements to manage data • Deployment of Commercial Tools without adequate design effort • No method to meaningfully assess product size (new and modified) or track effort with enough detail to use • Relationship of Process to Data not understood • Limitations on availability of data • Limitations on span of life cycle that data can represent • Assumption that process is consistent • Potential of tools to drive process behavior (both good and bad)

  27. Why Metric Initiatives Fail (cont.) • Failure to overcome Cultural Barriers • Engineering / Developer population inherently hard to manage • Value placed on reacting to problems (fire fighting) as opposed to avoiding them in the first place. • Quantitative management not an organizational value • General fear of metrics (difficult to implement) • Management Support Issues • Lack of top down support (upper management) • Primary focus on certification instead of deployment • Lack of demonstrated support (middle and lower management) • Unwillingness to manage employees to implement process • Failure to do adequate planning (generally goes back to estimating) • Unwillingness to use data to make decisions • Re-focus of metric issues on data and presentation (excuses)

  28. How Metric Initiatives Succeed • Define Success to match Organizational Goals • Know what you want to get out of metrics • Establish Reasonable Expectations • Limited successes are better than large scale failure • Match Metrics to Process • Process must be defined and relatively consistent as a starting point • Take process compliance seriously • Don’t define a more sophisticated process than you intend to implement • Have a Plan • Identify ALL tasks and develop a schedule • Assign resources and manage dependencies • Manage like a real project • Don’t forget the sustaining effort to maintain metric applications

  29. How Metric Initiatives Succeed (cont.) • Create a Common Vision • Must extend from top management through all management levels and technical leads • Any break in the chain could cause the entire effort to fail • Reinforce positive behavior – create enthusiasm • Forecast projects as opposed to selling them • Create Ownership • Use of a specialist is OK but don’t push the effort entirely off-line • Use the data to make decisions, question the rationale of decisions • Don’t commit to the impossible to get work or maintain staffing • Encourage work team involvement • Assess reactive vs. non-reactive activities • Develop a culture that values planning and quantitative decision making
