Measuring process attributes
Good Estimates • Predictions are needed for software development decision-making (Figure 12.1). • A prediction is useful only if it is reasonably accurate, that is, close enough to the actual value. • A prediction/estimate is a range or window, not a single number.
What is an Estimate • A prediction/estimate is a range or window, not a single number. • It is a probabilistic assessment; "estimate" refers to the center of the range. • Formal definition: the median of the distribution (Fig 12.2). • Do not set the estimate as a target. • An estimate should be presented as a triple: the median plus upper and lower bounds (confidence interval).
Evaluating Estimation Accuracy • Relative error (RE) in an estimate: $RE = \dfrac{\text{actual value} - \text{estimated value}}{\text{actual value}}$ • Mean RE for $n$ projects: $\overline{RE} = \dfrac{1}{n} \sum_{i=1}^{n} RE_i$
Evaluating Estimation Accuracy • Mean magnitude of RE for $n$ projects: $\overline{MRE} = \dfrac{1}{n} \sum_{i=1}^{n} MRE_i$, where $MRE_i = |RE_i|$ • If the mean magnitude of RE is small, then our predictions are good. • Conte, Dunsmore and Shen consider a mean magnitude of RE of at most 0.25 acceptable.
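The two averages above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the effort figures are hypothetical, and the magnitude of RE is taken to be its absolute value.

```python
def relative_error(actual, estimate):
    """RE = (actual - estimate) / actual; positive means an underestimate."""
    return (actual - estimate) / actual


def mean_relative_error(actuals, estimates):
    """Signed average of RE; over- and underestimates can cancel out."""
    res = [relative_error(a, e) for a, e in zip(actuals, estimates)]
    return sum(res) / len(res)


def mean_magnitude_relative_error(actuals, estimates):
    """Average of |RE|; errors in either direction accumulate."""
    mres = [abs(relative_error(a, e)) for a, e in zip(actuals, estimates)]
    return sum(mres) / len(mres)


# Hypothetical effort data (person-months) for four completed projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [90.0, 240.0, 50.0, 100.0]

mean_re = mean_relative_error(actuals, estimates)
mmre = mean_magnitude_relative_error(actuals, estimates)
print(f"mean RE = {mean_re:.4f}, mean magnitude of RE = {mmre:.4f}")
```

The signed mean hides errors that cancel, which is why the mean magnitude is the figure compared against the 0.25 threshold.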
Evaluating estimation accuracy • Measure of prediction quality: PRED(q) = k/n, where, out of n projects, k have a magnitude of relative error less than or equal to q. • E.g., PRED(0.25) = 0.47 means that 47% of the predicted values fall within 25% of their actual values. • Conte, Dunsmore and Shen suggest that an estimation technique is acceptable if PRED(0.25) is at least 0.75.
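PRED(q) can be computed by counting the projects whose magnitude of relative error is within q. The sketch below assumes that reading of the definition; the function name and the data are hypothetical.

```python
def pred(actuals, estimates, q=0.25):
    """Fraction of projects whose |actual - estimate| / actual is <= q."""
    hits = sum(
        1 for a, e in zip(actuals, estimates)
        if abs(a - e) / a <= q
    )
    return hits / len(actuals)


# Hypothetical effort data (person-months) for four completed projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [90.0, 240.0, 50.0, 110.0]

pred_25 = pred(actuals, estimates, q=0.25)
acceptable = pred_25 >= 0.75  # threshold suggested by Conte, Dunsmore and Shen
print(pred_25, acceptable)
```

Here three of the four estimates fall within 25% of the actual value, so the technique just meets the suggested threshold.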
Evaluating estimation accuracy • DeMarco suggests the EQF (estimating quality factor) to assess the accuracy of the prediction process. • Estimates are made repeatedly throughout the process as more information becomes known. • Fig 12.3 • Effectiveness of the estimating process = area of the hatched region divided by D x A.
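The ratio in the last bullet can be sketched as follows. The slide does not define the symbols, so the code assumes that D is the actual duration, A is the actual value, and the hatched region is the area between the successive estimates and the actual value. It also assumes each estimate holds until the next one is made. All names and figures are hypothetical.

```python
def estimate_error_ratio(estimates, actual, duration):
    """Hatched area divided by duration * actual.

    estimates: list of (time_made, value) pairs sorted by time, the first
    made at time 0. Each value is assumed to hold until the next estimate
    or until the project ends (an assumption of this sketch).
    """
    hatched = 0.0
    for i, (start, value) in enumerate(estimates):
        end = estimates[i + 1][0] if i + 1 < len(estimates) else duration
        hatched += abs(value - actual) * (end - start)
    return hatched / (duration * actual)


# Hypothetical project: actual effort 100 person-months over 10 months,
# re-estimated at months 0, 4 and 8.
estimates = [(0.0, 60.0), (4.0, 80.0), (8.0, 110.0)]
ratio = estimate_error_ratio(estimates, actual=100.0, duration=10.0)
print(f"ratio = {ratio:.2f}, reciprocal = {1 / ratio:.2f}")
```

A smaller ratio means the estimates stayed closer to the actual value over the life of the project. DeMarco's EQF is usually quoted as the reciprocal, so that larger values are better.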
Cost Estimation: Problems and Approaches • Cost estimation normally refers to the likely amount of effort, time and staffing levels required to build software. • Cost estimation and effort estimation are sometimes used interchangeably.
Problems with cost estimation • The nature of the problem • Estimates are converted into targets, and estimation parameters are manipulated to fit an already-given outcome (price-to-win). • Data collection is not encouraged, so there are no historical records on which to base judgments and predictions.
Current Approaches • Techniques for estimating effort and schedule: • Expert opinion: the estimate is based on experts' past experience • Analogy: identifying a similar past project and adjusting • Decomposition: divide and conquer • Models: using a model relating key inputs and effort • Decomposition and modeling are preferred.
Current Approaches • These techniques can be applied either bottom-up or top-down. • Bottom-up estimation begins with the lowest-level parts of the product or task, and provides estimates for each. • Top-down estimation begins with the overall process or product.
Models of Effort and Cost • Two types of models: • cost models • provide direct estimates of effort or duration • often based on empirical data reflecting factors that contribute to overall cost • input consists of one primary input (size) and a number of secondary adjustment factors (cost drivers: characteristics that are expected to influence effort or duration) • E.g., COCOMO
Models of Effort and Cost • Two types of models: • constraint models • demonstrate the relationship over time between two or more parameters of effort, duration, or staffing level • E.g., the Rayleigh curve (Figure 12.4)
Regression-based models • One of the models used is $E = aS^b$, where a and b are parameters estimated by regression techniques (see Figure 12.5). • The next step is to identify the factors that cause variation between predicted and actual effort (e.g., experience of developers). • In this way an effort adjustment factor F is obtained. • The unadjusted result is then multiplied by this factor to give the adjusted effort ($E = aS^bF$). • F is the product of the cost driver values.
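One common way to estimate a and b is ordinary least squares on the log-transformed data, since $\log E = \log a + b \log S$ is linear. The slides do not say which regression technique is meant, so this is an assumed approach. The function name and the synthetic data are hypothetical.

```python
import math


def fit_power_law(sizes, efforts):
    """Fit E = a * S**b by least squares on (log S, log E)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back from logs
    return a, b


# Synthetic data generated from E = 3 * S**1.1, so the fit should
# recover those parameters; real project data would be noisy.
sizes = [10.0, 20.0, 40.0, 80.0]
efforts = [3.0 * s ** 1.1 for s in sizes]

a, b = fit_power_law(sizes, efforts)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real data, the residuals of this fit are what the cost drivers in F are meant to explain.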
COCOMO • In the 1970s, Boehm derived the constructive cost model (COCOMO). • The original COCOMO is a collection of three models: a basic model to be applied early, an intermediate model to be applied after requirements are specified, and an advanced model to be used when design is complete.
Original COCOMO: Effort • All three have the form $E = aS^bF$, where E is effort in person-months, S is size in thousands of delivered source instructions (KDSI), and F is an adjustment factor (= 1 in the basic model). • The parameters a and b depend on the type of software: organic (data processing), embedded (real-time software within a larger, hardware-based system), or semi-detached (a blend of these); see Table 12.2.
Original COCOMO: Effort • Example 12.7 • There are 15 independent adjustment factors (see Table 12.3), i.e., $F = F_1 \times F_2 \times \dots \times F_{15}$, used in the intermediate and advanced models. • The advanced model uses the intermediate model at the component level, and then a phase-based model is used to build up an estimate for the complete project.
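The effort equation can be sketched as below. Table 12.2 is not reproduced in the slides, so the coefficients used here are the commonly cited Basic COCOMO values, assumed to match that table. Boehm's intermediate model may use different values of a. The cost driver multipliers in the example are hypothetical.

```python
import math

# Commonly cited Basic COCOMO coefficients (a, b); assumed, not taken
# from the slides, to correspond to Table 12.2.
EFFORT_PARAMS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def cocomo_effort(kdsi, mode, cost_drivers=()):
    """E = a * S**b * F in person-months, with F the product of cost drivers.

    With no cost drivers, F = 1, which is the basic model.
    """
    a, b = EFFORT_PARAMS[mode]
    adjustment = math.prod(cost_drivers)  # the product of an empty sequence is 1
    return a * kdsi ** b * adjustment


basic = cocomo_effort(32, "organic")
# Two hypothetical cost driver multipliers, for illustration only.
adjusted = cocomo_effort(32, "organic", cost_drivers=[1.15, 0.91])
print(f"basic: {basic:.1f} PM, adjusted: {adjusted:.1f} PM")
```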
Original COCOMO: Duration • For duration (D) the model $D = aE^b$ is used, where D is duration in months; the parameters are given in Table 12.4. • Example 12.8
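A matching sketch for the duration equation follows. Table 12.4 is not reproduced in the slides, so the parameters are again the commonly cited Basic COCOMO values and should be treated as an assumption. The effort figure in the example is hypothetical.

```python
# Commonly cited Basic COCOMO duration coefficients (a, b); assumed,
# not taken from the slides, to correspond to Table 12.4.
DURATION_PARAMS = {
    "organic": (2.5, 0.38),
    "semi-detached": (2.5, 0.35),
    "embedded": (2.5, 0.32),
}


def cocomo_duration(effort_pm, mode):
    """D = a * E**b, with E in person-months and D in months."""
    a, b = DURATION_PARAMS[mode]
    return a * effort_pm ** b


# Hypothetical organic project estimated at about 91 person-months.
duration = cocomo_duration(91.33, "organic")
average_staff = 91.33 / duration  # implied average staffing level
print(f"duration: {duration:.1f} months, average staff: {average_staff:.1f}")
```

Because b is well below 1, duration grows much more slowly than effort, so larger projects imply larger average teams rather than proportionally longer schedules.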
COCOMO 2.0 • An updated, three-stage version, COCOMO 2.0, was presented in 1995. • It is based on three major stages of any development project: • Stage 1: the project builds prototypes to resolve high-risk issues involving user interfaces, interactions, etc. • Size is estimated in object points based on the number of screens, reports and third-generation language components (refer to pages 265 and 266); reuse is taken into account.
COCOMO 2.0 • Stage 2 • Employs function points as a measure of size. • Function points estimate the functionality captured in the requirements. • Stage 3 • Development has begun. • Can use LOC as a measure of size. • Other differences between the stages can be seen in Table 12.5. • COCOMO 2.0 incorporates reuse and takes maintenance and breakage into account.
Putnam's SLIM Model • Putnam's model assumes that the effort for software development projects is distributed similarly to a collection of Rayleigh curves, one for each major development activity (see Figure 12.6). • It was constructed for US Army use in 1978 to cover projects exceeding 70 KLOC.
Putnam's SLIM Model • Like the COCOMO model, it is based on empirical studies. • Derived from the basic Rayleigh formula: $S = C K^{1/3} t_d^{4/3}$, where S is size in LOC, C is a technology factor, K is total project effort in person-years (including maintenance), and $t_d$ is elapsed time to delivery in years (in theory, the point at which the Rayleigh curve reaches its maximum).
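The slides do not give the Rayleigh formula itself. The sketch below assumes the usual Norden-Rayleigh staffing form, $m(t) = (K / t_d^2)\, t\, e^{-t^2 / (2 t_d^2)}$, which peaks at $t = t_d$ and integrates to K over the whole life cycle. The function names and project figures are hypothetical.

```python
import math


def rayleigh_staffing(t, total_effort, t_d):
    """Staffing level at time t (persons, if effort is in person-years)."""
    return (total_effort / t_d ** 2) * t * math.exp(-t ** 2 / (2 * t_d ** 2))


def rayleigh_cumulative(t, total_effort, t_d):
    """Effort spent up to time t (the integral of the staffing curve)."""
    return total_effort * (1 - math.exp(-t ** 2 / (2 * t_d ** 2)))


# Hypothetical project: K = 100 person-years, delivery at t_d = 2 years.
K, T_D = 100.0, 2.0
peak = rayleigh_staffing(T_D, K, T_D)
spent_at_delivery = rayleigh_cumulative(T_D, K, T_D)
print(f"peak staffing: {peak:.1f}, effort spent by delivery: {spent_at_delivery:.1f}")
```

Under this form only about 39% of K has been spent by delivery, consistent with K including maintenance effort.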
Putnam's SLIM Model • The model can be used to assess the effect of varying the delivery date on the total effort needed to complete the project. • E.g., for a 10% decrease in elapsed time, $S = C K^{1/3} t_d^{4/3} = C K'^{1/3} (0.9\, t_d)^{4/3}$, which gives $K'/K = (1/0.9)^4 \approx 1.52$, i.e., a 52% increase in total life-cycle effort.
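With S and C held fixed, the software equation implies that $K^{1/3} t_d^{4/3}$ is constant, so K scales with $t_d^{-4}$. A minimal sketch of that trade-off, with a hypothetical function name:

```python
def effort_ratio(time_factor):
    """K'/K when delivery time is scaled by time_factor, with S and C fixed.

    From K**(1/3) * t_d**(4/3) = constant it follows that K is
    proportional to t_d**(-4).
    """
    return time_factor ** -4


for factor in (0.9, 1.0, 1.1):
    print(f"delivery time x{factor}: effort x{effort_ratio(factor):.2f}")
```

The fourth-power dependence is what makes the model so sensitive to schedule compression: a 10% shorter schedule costs roughly half as much effort again, while a 10% longer one saves roughly a third.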
Putnam's SLIM Model • To estimate effort/duration, Putnam introduces $D_0 = K / t_d^3$, where $D_0$ is the manpower acceleration constant (12.3 for new software with many interfaces and interactions with other systems, 15 for stand-alone systems, 27 for re-implementations of existing systems).
Putnam's SLIM Model • From the two equations one can derive $K = (S/C)^{9/7} D_0^{4/7}$. • SLIM uses separate Rayleigh curves for design and code, test and validation, maintenance, and management. • Requirement specification is not included.
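The derived equation can be used to obtain both effort and delivery time from a size estimate. In the sketch below, $t_d$ is recovered from $D_0 = K / t_d^3$. The technology factor is a hypothetical value chosen only for illustration; in practice it would be calibrated from local data.

```python
def slim_effort_and_time(size_loc, technology_factor, d0):
    """Return (K in person-years, t_d in years) from the two SLIM equations.

    K   = (S / C)**(9/7) * D0**(4/7)
    t_d = (K / D0)**(1/3), rearranged from D0 = K / t_d**3
    """
    k = (size_loc / technology_factor) ** (9 / 7) * d0 ** (4 / 7)
    t_d = (k / d0) ** (1 / 3)
    return k, t_d


# Hypothetical stand-alone system (D0 = 15) of 100,000 LOC with an
# assumed technology factor of 5000.
SIZE, C, D0 = 100_000, 5000.0, 15.0
k, t_d = slim_effort_and_time(SIZE, C, D0)

# Consistency check: substituting back into S = C * K**(1/3) * t_d**(4/3)
# should reproduce the size.
size_check = C * k ** (1 / 3) * t_d ** (4 / 3)
print(f"K = {k:.1f} person-years, t_d = {t_d:.2f} years, S check = {size_check:.0f}")
```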
Multi-project models • Effort estimates are affected by other projects • Cost can be amortized over several upcoming projects. • Reuse normally involves multiple projects
Problems with existing modeling methods • Conte, Dunsmore and Shen suggest that a model be considered acceptable when PRED(0.25) is at least 0.75, but that rarely happens. • This indicates that the models are insufficient.
Problems with existing modeling methods • Reasons: • Model structure • Many studies agree that b in the effort-duration models is about 1/3. However, there is little consensus about the effect of reducing or extending duration. • Overly complex models • Models with many parameters (e.g., cost drivers) are not necessarily preferable (accuracy, subjectivity, independence, static). • Product size estimation • Size estimates in LOC are not available early in the process.
Dealing with problems of current estimation methods • Use local data definitions • Calibrate models in the actual environment • Use an independent estimation group • Reduce input subjectivity • Make a preliminary estimate (group estimate such as Delphi; estimate by analogy) and re-estimate
Dealing with problems of current estimation methods • Use measures other than LOC for early size estimation • Function points • Specification weight metrics/bang metrics (DeMarco) • Function bang measure: based on the number of functional primitives (bubbles) in a data-flow diagram. • Data bang measure: based on the number of entities in the ER model.
Dealing with problems of current estimation methods • Use locally developed cost models • Five steps in defining a local cost model (DeMarco, 1982): • Decompose cost elements • Formulate cost theory • Collect data • Analyze data and evaluate model • Check model: use PRED(0.25) or another measure to assess acceptable accuracy