Evaluating Software Product, Process, and Resources. We have seen many techniques and methodologies. The question now is how do we evaluate these.
Evaluating Software Product, Process, and Resources
• We have seen many techniques and methodologies. The question now is how to evaluate them.
• We evaluate any entity by measuring a certain attribute, or characteristic, of that entity against a predetermined goal of attaining some desirable level of that attribute.
• Sometimes we evaluate an entity, such as a technique, indirectly, in terms of the result of applying that technique to another entity, such as the product or the people who employed the technique.
4 Main Categories of Evaluation Techniques
• Feature analysis
• Survey
• Case study
• Formal experiment
The degree of difficulty increases from feature analysis to formal experiment.
Feature Analysis
• This is a methodology that:
  - Lists the desired characteristics or features
  - Ranks the desirability (importance) of the characteristics
  - Rates each entity under evaluation on each of the listed characteristics
  - Computes an overall rating for the evaluation
Example of a feature analysis of testing methodologies, with each feature rated on a scale of 1 (lowest) to 5 (highest):
Features | Importance (1-10) | method 1 | method 2 | ... | method n
easy to learn | 9 | 4 | 3 | ... | 5
has good tools | 7 | 3 | 5 | ... | 2
adopted by many | 7 | 5 | 3 | ... | 2
effective | 9 | 4 | 3 | ... | 2
total score | | 128 | 110 | ... | 91
Each rating is weighted by the importance of its feature; for example, method 1's "easy to learn" rating contributes 4 * 9 = 36 to its total.
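The weighted total in the example can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the function name `total_score` are illustrative choices, not part of the slides. Computed this way, method 1 and method 2 come to 128 and 110, and the ratings listed for method n give 91.

```python
# Importance weights (1-10) and ratings (1-5) taken from the example table.
importance = {
    "easy to learn": 9,
    "has good tools": 7,
    "adopted by many": 7,
    "effective": 9,
}

ratings = {
    "method 1": {"easy to learn": 4, "has good tools": 3,
                 "adopted by many": 5, "effective": 4},
    "method 2": {"easy to learn": 3, "has good tools": 5,
                 "adopted by many": 3, "effective": 3},
    "method n": {"easy to learn": 5, "has good tools": 2,
                 "adopted by many": 2, "effective": 2},
}


def total_score(method_ratings, weights):
    """Sum of rating * importance over all features."""
    return sum(weights[f] * method_ratings[f] for f in weights)


scores = {name: total_score(r, importance) for name, r in ratings.items()}

# List the methods from best to worst overall rating.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score}")
```

Because the totals depend heavily on the importance weights, it is worth re-running the calculation with slightly different weights to see whether the ranking changes.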
Survey
• This is a methodology that documents what has happened or what the existing situation is.
• The variables (attributes of interest) are not “manipulated”; data is simply collected on them.
• The methodology includes:
  - Listing a set of attributes of interest
  - Posing those attributes in the form of questions that are easily answerable
  - Collecting and tabulating the answers
  - Analyzing the data for categories, trends, or relationships
Case Study
• A methodology in which a predetermined set of factors of interest is studied, but the factors are not necessarily “controlled.” These factors are compared between (or among) a small number of “cases.”
• A case study of one design methodology across small and large projects:
  - Pick “typical” small and “typical” large projects (a common design methodology applied to projects of different sizes)
• A case study of two design methodologies:
  - Pick two “typical” projects (comparable projects on which to study the two design methodologies)
Formal Experiment
• A methodology in which “controlled” variables (attributes) are manipulated so that their effect on the dependent variables (another set of attributes) can be observed.
• The method involves:
  - Setting up a formal “hypothesis”
  - Designing, setting up, and conducting the experiment to test the hypothesis
  - Collecting and analyzing the experimental results
  - Deciding whether the results support or reject the “hypothesis”
• Two factors are often important in a formal experiment:
  - Careful selection (e.g. randomization) of the subjects or entities for the experiment
  - Replication of the experiment to ensure that the result is not a unique case. Multiple data sets increase the confidence in the observed results.
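As a rough illustration of randomization and hypothesis testing, the sketch below randomly splits subjects into two groups and then uses a permutation test to estimate how likely the observed difference would be by chance alone. The permutation test is one possible analysis chosen here for simplicity, not a method prescribed by the slides; the function names and the defect counts are hypothetical.

```python
import random
from statistics import mean


def randomize(subjects, rng):
    """Randomly split the subjects into two groups (control, treatment)."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(control, treatment, rng, rounds=10_000):
    """Fraction of random relabelings whose difference in group means
    is at least as large as the observed difference."""
    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    n = len(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(mean(pooled[n:]) - mean(pooled[:n])) >= observed:
            hits += 1
    return hits / rounds


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical subjects, assigned to groups at random.
control_team, treatment_team = randomize(
    ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10"], rng
)

# Hypothetical measurements: defects found per tester in each group.
control_defects = [12, 15, 11, 14, 13]
treatment_defects = [18, 17, 21, 16, 19]

p = permutation_p_value(control_defects, treatment_defects, rng)
print(f"estimated p-value: {p:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone; replicating the experiment with new subjects is still needed before trusting the result.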
Which Evaluation Technique to Use?
• The key factors to consider in choosing the evaluation technique:
  - How much control can be placed on the attributes that will affect the outcome of the evaluation. The more control, the more formal the evaluation technique.
  - How much replication can be performed. The more replication, the more formal the evaluation technique.
• Practically speaking:
  - A formal experiment usually involves a limited number of subjects and is “research in the small.”
  - A case study compares against a “typical” situation and is “research in the typical.”
  - A survey usually polls a large number of subjects and is “research in the large.”
Some Pitfalls of Evaluation to Watch
• Confounding: another factor may be causing the effect
• Cause/Effect: the factor may be the result, not the cause
• Chance: the result may be purely accidental
• Homogeneity: all subjects have the same level of the factor
• Miscalculation: the factor level was not calculated correctly
• Bias: there is some bias in the procedure
• Too short: the result may be only a temporary effect
• Wrong amount: the factor has an effect, but to a different degree than what was observed
• Wrong situation: the factor has an effect, but under different circumstances
In evaluations, we often fall into these pitfalls because we:
  - want a certain outcome
  - know the desired goal ahead of time
  - rush to make a judgment
Evaluation and Measurement
• Evaluation often involves measurement. A metric, or measure, is a mapping of the attribute of interest of an entity to a mathematical system.
  - Given a real-world system: (e, relation, operation)
  - and a mathematical system: (n, Relation, Operation)
  - a metric is a mapping M such that:
    - M(e) → n
    - M(relation) → Relation
    - M(operation) → Operation
Consider lines of code (loc) as the measure of program size. Then the following may be observed for the loc metric, which maps program size to integers:
  - loc(program x) → 350
  - loc(program y) → 570
  - loc(program y “larger than” program x) → (570 > 350)
  - loc(program x “concat” program y) → (350 + 570)
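The loc example can be made concrete with a small sketch. Counting non-blank lines is only one possible definition of loc and is an assumption here, as are the helper names `loc` and `concat` and the generated stand-in programs.

```python
def loc(program: str) -> int:
    """Map a program (its source text) to an integer: the number of
    non-blank lines. Other definitions of loc are possible."""
    return sum(1 for line in program.splitlines() if line.strip())


def concat(p: str, q: str) -> str:
    """The real-world 'concat' operation: q appended after p."""
    return p + "\n" + q


# Hypothetical stand-ins for programs x and y with 350 and 570 lines.
program_x = "\n".join(f"statement_{i}()" for i in range(350))
program_y = "\n".join(f"statement_{i}()" for i in range(570))

# The relation "larger than" between programs maps to ">" on integers.
print(loc(program_y) > loc(program_x))

# The operation "concat" on programs maps to "+" on integers.
print(loc(concat(program_x, program_y)) == loc(program_x) + loc(program_y))
```

The two checks are what makes loc a usable measure of size: relations and operations on programs are preserved by the corresponding relations and operations on the numbers.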
Software Measurement
• In looking at software measurements for evaluation, we need to consider:
  - Measurement for assessment, which numerically characterizes the attribute of interest
    - The validity of an assessment measure concerns whether the measure really characterizes the attribute of interest
  - Measurement for prediction, which constructs a mathematical model and a prediction procedure
    - The validity of a prediction system concerns whether it predicts accurately
Many software properties are hard to assess because no well-defined metrics exist for them (e.g. quality, reliability, or maintainability).
Evaluating Software Product (via Quality Property)
• There are many attributes of a software product to consider when evaluating it (usually we mean the attributes of code; sometimes we include other artifacts such as the requirements document).
• The attributes related to the “quality” of software have been grouped differently by different models:
  - Boehm
  - ISO 9126
  - Dromey’s build-your-own “quality” model technique
Example of a “Quality” Attribute Model (Boehm)
[Figure: Boehm’s hierarchy of quality attributes. Nodes shown: General Utility, Original Utility, Other Utility, Portability, Reliability, Efficiency, Human Engineering, Testability, Understandability, Modifiability, Maintainability, Device Independence, Self-contained, Accuracy, Completeness, Robustness/Integrity, Consistency, Accountability, Device Efficiency, Accessibility, Communicativeness, Self-descriptiveness, Structuredness, Conciseness, Legibility, Augmentability]
Example of a “Quality” Attribute Model (ISO 9126)
• Functionality: Suitability, Accuracy, Interoperability, Security
• Reliability *: Maturity, Fault tolerance, Recoverability
• Usability: Understandability, Learnability, Operability
• Efficiency *: Time behavior, Resource behavior
• Maintainability: Analyzability, Changeability, Stability, Testability
• Portability *: Adaptability, Installability, Conformance, Replaceability
* ≡ same as Boehm’s
Dromey’s Criticism of “Quality” Models (1996)
• Dromey believes that “a software product’s tangible internal characteristics or properties determine its external quality attributes.”
• Two difficulties arise in linking “tangible” (measurable, internal) properties with “intangible” (higher-level, external) properties:
  - Many properties seem to influence the “quality” of software.
  - There is little evidence indicating which lower-level properties affect the higher-level quality properties.
Dromey’s Technique of Building a Quality Model
• Pick the software component (e.g. code, requirements, etc.)
• Identify a set of “high”-level quality properties (e.g. from ISO 9126 as a starter).
• Identify the product components (e.g. for code: data variables, assignment statements, etc.)
• Identify the “quality-carrying” properties associated with each component (e.g. what properties affect the quality of a data variable? “Is it correctly defined?” is a tangible property of a data variable)
• Link each lower-level tangible property to a higher-level quality property (e.g. “correct definition” may be linked to the higher-level quality property of “reliability” or “maintainability”)
• Refine and improve the model
What are the metrics for the lower-level properties that were linked to the higher-level quality properties? Unless we can measure software quality, we will not be able to properly assess or predict it.
Evaluating Software Product (via Reuse Property)
• The reusability property of software is a high-level property, like the quality property.
• Software reuse is the “repeated use of any part of a software system”:
  - Code
  - Design
  - Test cases
  - etc.
• There are also re-use “producers” and re-use “consumers.”
What motivates a re-use producer? What motivates a re-use consumer?
There Are Many Ways to View Reuse
• Re-use intention:
  - Black-box oriented: reuse the complete software without modification
    - Must be assured that the re-used software functions in the expected manner
  - White-box oriented: reuse only portions of the software
    - Must understand how the software is put together so that it may be modified for re-use
• Re-use techniques:
  - Compositional: re-usable components are the generic “building blocks” for constructing all types of software
    - Re-use that cuts horizontally across domains
  - Generative: re-usable components are domain specific, for building software only in that industry
    - Re-use that stays vertically within the domain
Reuse May Be Inclusive
• Re-use does not have to be limited to software artifacts.
• Many other things may be “re-used”:
  - People
  - Process
  - Methodology
  - Project plan
  - Project data
  - etc.
Re-use History
• Peppered with “good” intentions and some anecdotal information.
• Several things seem to be lacking, which inhibits re-use:
  - Motivation for re-use providers and consumers
    - What are their goals for re-use?
  - Clear metrics for re-usable items (functionality, quality, performance, etc.)
    - What are the sub-attributes of software product re-use?
    - How do we measure them?
  - An automated system to identify, store, and retrieve re-usable items
Evaluating Processes
• Software development and support all utilize some process:
  - Ad hoc
  - Semi-formal
  - Formally defined and strictly adhered to
• We are interested in process so that it can be improved.
• The underlying belief and assumption is that a good process leads to a good product.
There are many indications that support this assumption. The problem is that we do not have any hard measurements. (How do you measure process?)
Evaluating Process (cont.)
• Many large projects conduct a project post-mortem to objectively assess:
  - People
  - Process
  - Product
• The post-mortem should:
  - Identify areas of strength
  - Identify areas of weakness
  - Identify areas for future improvement
• Project history must be kept for any post-mortem to make sense.
The data to be kept must first be:
  - identified,
  - defined, and then
  - assigned resources to record it
How many software projects do you believe perform these steps at the inception of a project?
Process Assessment
• Several organizations help in evaluating the software process capabilities of an enterprise:
  - SEI (Software Engineering Institute)
    - CMM
    - CMMI
  - ISO (International Organization for Standardization)
    - ISO 9000-3
    - SPICE (Software Process Improvement and Capability dEtermination)
      - Similar to CMM; used mostly in Europe and Canada
Evaluating Resources
• There are many resources involved in a software project, but the main ones are:
  - Human resources
  - Monetary resources
Human Resource Assessment
• SEI initially dealt only with process, but later included human resources and established the People Capability Maturity Model, composed of 5 levels:
  - Initial: the organization takes no active role in developing people.
  - Repeatable: managers start to develop their staff and establish basic work practices in their individual units.
  - Defined: plans are put in place to locate and develop needed talent; people are rewarded for developing skills and meeting broader organizational goals.
  - Managed: teams of individuals are built; quantitative performance goals are set for the teams to achieve.
  - Optimizing: the entire organization is focused on improving both team and individual performance.
Financial Resource Evaluation
• All resources, other than human, may be evaluated with some form of financial assessment (e.g. the metric is $).
• One popular metric is the Net Present Value (NPV):
  - Calculate (estimate) the cost of the investment, year by year, until the end of the project horizon.
  - Calculate the benefits in dollars, year by year, until the end of the project horizon.
  - Subtract the cost from the benefits; assume that is $x. Compute the present value of $x at some discount rate (%). If the NPV is positive, then it is a worthwhile project.
  - Use the NPV of different projects to compare them; the one with the larger NPV is the better investment in terms of current value.
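A minimal sketch of the NPV calculation follows. It discounts each year's net amount (benefit minus cost) separately, which is the usual way of taking the present value of a stream of yearly amounts. The function name `npv`, the 10% rate, the convention that the first entry is the undiscounted present year, and all dollar figures are assumptions made for illustration.

```python
def npv(rate, costs, benefits):
    """Net present value of yearly (benefit - cost), discounted at `rate`.

    Index 0 is taken to be the present (year 0, not discounted); that
    convention is an assumption of this sketch.
    """
    if len(costs) != len(benefits):
        raise ValueError("costs and benefits must cover the same years")
    return sum(
        (benefit - cost) / (1 + rate) ** year
        for year, (cost, benefit) in enumerate(zip(costs, benefits))
    )


# Hypothetical figures in dollars, one entry per year.
project_a = npv(0.10, costs=[1000, 200, 200], benefits=[0, 800, 900])
project_b = npv(0.10, costs=[500, 500, 500], benefits=[0, 700, 1200])

better = "A" if project_a > project_b else "B"
print(f"NPV A = {project_a:.2f}, NPV B = {project_b:.2f}, prefer {better}")
```

Both hypothetical projects have a positive NPV, so both would count as worthwhile; the comparison then favors whichever NPV is larger. The answer is sensitive to the chosen discount rate, so it is worth trying more than one.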