7. Metrics in Reengineering Context Outline • Introduction • Metrics and Measurements • Some metrics • Conclusion
The Reengineering Life-Cycle [figure: Requirements, Designs and Code connected by the activities (0) requirement analysis, (1) model capture, (2) problem detection, (3) problem resolution and (4) code transformation; this lecture focuses on problem detection issues, academic vs. practical]
RoadMap • Some definitions • Metrics and Measurement • Metrics for reverse engineering • Selection of OO metrics • Metrics for trends analysis • Step back and look
Why Metrics in OO Reengineering? Estimating Cost • Is it worthwhile to reengineer, or is it better to start from scratch? • Not covered in this lecture, but… • Difficult: • the company must collect historical data, a heavy process • the data must be comparable: Cobol figures do not apply to a C# project
Why Metrics in OO Reengineering (ii)? • Assessing Software Quality • Which components have poor quality? (Hence could be reengineered) • Which components have good quality? (Hence should be reverse engineered) Metrics as a reengineering tool! • Controlling the Reengineering Process • Trend analysis: which components did change? • Which refactorings have been applied? Metrics as a reverse engineering tool!
Quantitative Quality Model (i) • Quality according to the ISO 9126 standard • Divide-and-conquer approach via a “hierarchical quality model” • Leaves are simple metrics measuring basic attributes • Not really useful, but worth knowing
Metrics and Measurements • [Wey88] defined nine properties that a software metric should satisfy. Read [Fenton] for critiques. • For OO, only 6 properties are really interesting [Chid94, Fenton] • Noncoarseness: • Given a class P and a metric m, another class Q can always be found such that m(P) ≠ m(Q) • not every class has the same value for a metric • Nonuniqueness: • There can exist distinct classes P and Q such that m(P) = m(Q) • two classes can have the same metric value • Monotonicity: • m(P) ≤ m(P+Q) and m(Q) ≤ m(P+Q), where P+Q is the “combination” of the classes P and Q
Metrics and Measurements (ii) • Design Details are Important • The specifics of a class must influence the metric value: even if two classes perform the same actions, their design details should have an impact on the metric value. • Nonequivalence of Interaction • m(P) = m(Q) does not imply m(P+R) = m(Q+R), where R is an interaction with the class. • Interaction Increases Complexity • m(P) + m(Q) < m(P+Q) is possible: • when two classes are combined, the interaction between the two can increase the metric value • Conclusion: Not every measurement is a metric.
Selecting Metrics • Fast • Scalable: you can’t afford O(n²) algorithms when n >= 1 million LOC • Precise • (e.g. #methods: do you count all methods, only public ones, also inherited ones?) • Reliable: you want to compare apples with apples • Code-based • Scalable: you want to collect metrics several times • Reliable: you want to avoid human interpretation • Simple • (e.g. average number of arguments vs. locality of data [LD = SUM |Li| / SUM |Ti|]) • Reliable: complex metrics are hard to interpret
Metrics for Reverse Engineering • Size of the system and its entities • Class size, method size, inheritance • The intuition: a system should not contain too many big entities • really big entities can be difficult and complex to understand • Cohesion of the entities • Class internals • The intuition: a good system is composed of cohesive entities • Coupling between entities • Within inheritance: coupling between class and subclass • Outside of inheritance • The intuition: the coupling between entities should be limited
Sample Size and Inheritance Metrics [metamodel figure: Class, Method and Attribute entities linked by Inherit, BelongTo, Invoke and Access relationships] • Inheritance Metrics • hierarchy nesting level (HNL) • # immediate children (NOC) • # inherited methods, unmodified (NMI) • # overridden methods (NMO) • Class Size Metrics • # methods (NOM) • # instance and class attributes (NIA, NCA) • sum of method sizes (WMC) • Method Size Metrics • # invocations (NOI) • # statements (NOS) • # lines of code (LOC)
Sample Class Size Metrics • (NIV) Number of Instance Variables [Lore94] • (NCV) Number of Class Variables (static) [Lore94] • (NOM) Number of Methods (public, private, protected) [Lore94] (E++, S++) • (LOC) Lines of Code • (NSC) Number of Semicolons [Li93] -> number of statements • (WMC) Weighted Method Count [Chid94] • WMC = SUM ci • where ci is the complexity of method i (number of exits or McCabe Cyclomatic Complexity)
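To make the WMC formula concrete, here is a minimal Python sketch (not from the lecture) that sums per-method complexities for one class. The `Method` record and the choice of McCabe complexity as the weight ci are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Method:
    name: str
    complexity: int  # c_i, e.g. McCabe cyclomatic complexity or number of exits

def wmc(methods: list[Method]) -> int:
    """Weighted Method Count: WMC = SUM c_i over all methods of the class."""
    return sum(m.complexity for m in methods)

# Example: a class with three methods of complexity 1, 3 and 2 has WMC = 6.
print(wmc([Method("open", 1), Method("parse", 3), Method("close", 2)]))
```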
Class Complexity • (RFC) Response For a Class [Chid94] • The Response Set of a Class (RS) is the set of methods that can be executed in response to a message. • RS = {M} U (Ui {Ri}), RFC = |RS| • where {Ri} is the set of methods called by method i and {M} is the set of all the methods in the class.
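A rough sketch of the RFC computation, assuming a parser has already produced, for each method of the class, the set of methods it invokes (the `calls` mapping and the method names are hypothetical):

```python
def rfc(own_methods: set[str], calls: dict[str, set[str]]) -> int:
    """RFC = |RS| with RS = {M} U (union of the sets Ri of methods
    called by each method i of the class)."""
    response_set = set(own_methods)               # {M}
    for method in own_methods:
        response_set |= calls.get(method, set())  # add each Ri
    return len(response_set)

# Example: 'draw' calls two Canvas methods, 'resize' calls 'draw' -> RFC = 4
print(rfc({"draw", "resize"},
          {"draw": {"Canvas.line", "Canvas.fill"}, "resize": {"draw"}}))
```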
Hierarchy Layout • (HNL) Hierarchy Nesting Level [Chid94], (DIT) Depth of Inheritance Tree [Li93] • HNL, DIT = maximum hierarchy level • (NOC) Number of Children [Chid94] • (WNOC) Total Number of Children (all descendants) • (NMO, NMA, NMI, NME) Number of Methods Overridden, Added, Inherited, Extended (super call) [Lore94] • (SIX) [Lore94] • SIX(C) = NMO * HNL / NOM • Weighted percentage of Overridden Methods
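A minimal sketch of SIX exactly as defined above; the three counts are assumed to have been measured already (parameter names are illustrative):

```python
def six(nmo: int, hnl: int, nom: int) -> float:
    """SIX(C) = NMO * HNL / NOM, the weighted share of overridden methods.
    Deep classes that override many of their methods score high."""
    return 0.0 if nom == 0 else nmo * hnl / nom

# Example: 3 overridden methods, nesting level 4, 20 methods -> SIX = 0.6
print(six(nmo=3, hnl=4, nom=20))
```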
Method Size • (MSG) Number of Message Sends • (LOC) Lines of Code • (MCX) Method Complexity • total complexity / total number of methods • example weights: API call = 5, assignment = 0.5, arithmetic operation = 2, message with parameters = 3, ...
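As a sketch only: a weighted method complexity in the spirit of MCX, applying the example weights above to hypothetical per-method token counts (the token categories and the input data are invented for illustration):

```python
# Weights taken from the example above; the token counts below are invented.
WEIGHTS = {"api_call": 5.0, "assignment": 0.5, "arith_op": 2.0, "msg_with_params": 3.0}

def mcx(per_method_tokens: list[dict[str, int]]) -> float:
    """Total weighted complexity divided by the total number of methods."""
    total = sum(WEIGHTS[kind] * count
                for tokens in per_method_tokens
                for kind, count in tokens.items())
    return total / len(per_method_tokens) if per_method_tokens else 0.0

print(mcx([{"api_call": 1, "assignment": 4},         # 5 + 2 = 7
           {"arith_op": 2, "msg_with_params": 1}]))  # 4 + 3 = 7 -> average 7.0
```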
Sample Metrics: Class Cohesion • (LCOM) Lack of Cohesion in Methods [Chid94] for definition, [Hitz95a] for critique • Ii = set of instance variables used by method Mi • let P = { (Ii, Ij) | Intersection(Ii, Ij) is empty } and Q = { (Ii, Ij) | Intersection(Ii, Ij) is not empty } • if all the sets Ii are empty, P is empty • LCOM = |P| - |Q| if |P| > |Q|, 0 otherwise • Tight Class Cohesion (TCC) and Loose Class Cohesion (LCC) [Biem95a] for definition • Measure method cohesion across invocations
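A small sketch of the LCOM computation as defined above; the input is a map from method name to the set of instance variables it uses, which I assume has already been extracted from the code:

```python
from itertools import combinations

def lcom(uses: dict[str, set[str]]) -> int:
    """LCOM [Chid94]: count method pairs sharing no instance variable (P)
    and pairs sharing at least one (Q); LCOM = |P| - |Q| if |P| > |Q|, else 0."""
    p = q = 0
    for (_, vars_i), (_, vars_j) in combinations(uses.items(), 2):
        if vars_i & vars_j:
            q += 1
        else:
            p += 1
    return max(p - q, 0)

# Example: 'load' and 'save' share 'path'; 'render' uses a disjoint variable.
print(lcom({"load": {"path"}, "save": {"path"}, "render": {"canvas"}}))  # -> 1
```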
Sample Metrics: Class Coupling (i) • Coupling Between Objects (CBO) [Chid94a] for definition, [Hitz95a] for a discussion • Number of other classes to which a class is coupled • Data Abstraction Coupling (DAC) [Li93a] for definition • Number of ADTs defined in a class • Change Dependency Between Classes (CDBC) [Hitz96a] for definition • Impact of changes from a server class (SC) on a client class (CC)
Sample Metrics: Class Coupling (ii) • Locality of Data (LD) [Hitz96a] for definition • LD = SUM |Li| / SUM |Ti| • Li = non-public instance variables + inherited protected variables of the superclass + static variables of the class • Ti = all variables used in Mi, except non-static local variables • Mi = the methods of the class, excluding accessors
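A sketch of LD, assuming that for each non-accessor method Mi we already have Ti (all variables it uses, minus non-static locals) and the subset Li of “local” class data among them; the example data is invented:

```python
def locality_of_data(per_method: list[tuple[set[str], set[str]]]) -> float:
    """LD = SUM |Li| / SUM |Ti| over the non-accessor methods Mi.
    Each tuple is (Li, Ti): local class data used vs. all variables used."""
    local = sum(len(li) for li, _ in per_method)
    total = sum(len(ti) for _, ti in per_method)
    return local / total if total else 1.0

# Example: (2 + 1) local accesses out of (3 + 2) variable uses -> LD = 0.6
print(locality_of_data([({"x", "y"}, {"x", "y", "other.z"}),
                        ({"x"}, {"x", "param.w"})]))
```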
Metrics? Stepping Back • The way a metric is computed has an impact • Examples: • number of attributes: should we count private attributes in NIV? Why or why not? • number of methods (private, protected, public, static, instance, operator, constructors, friends) • What to do? • Try simple metrics with simple extraction first • Be careful with absolute thresholds • Metrics are good as a differential • Metrics should be calibrated • Do not numerically combine them: what is the multiplication of oranges and apples? Jam!
20%/80% • Be careful with thresholds • The average method length in Smalltalk is about 7 lines • So what? • Expect roughly 20% outliers for 80% that are fine
“Define your own” Quality Model • Define the quality model with the development team • Team chooses the characteristics, design principles, internal product metrics... • ... and the thresholds
Conclusion: Metrics for Quality Assessment • Can internal product metrics reveal which components have good/poor quality? • Yes, but... • Not reliable • false positives: “bad” measurements, yet good quality • false negatives: “good” measurements, yet poor quality • Heavyweight approach • Requires the team to develop (customize?) a quantitative quality model • Requires definition of thresholds (trial and error) • Difficult to interpret • Requires complex combinations of simple metrics • However... • Cheap once you have the quality model and the thresholds • Good focus (roughly 20% of components are selected for further inspection) • Note: focus on the most complex components first!
Trend Analysis via Change Metrics • Change Metric • Definition: difference between two metric values for the same metric and the same component in two subsequent releases of the software system • Examples: • difference between number of methods for class “Event” • in release 1.0 and 1.1 • difference between lines of code for method “Event::process()” • in release 1.0 and 1.1 • Change Assumption • Changes in metric values indicate changes in the system
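A minimal sketch of a change metric: per-release metric values are assumed to be available as plain dictionaries keyed by component name (the system and its numbers are invented):

```python
def change_metric(release_a: dict[str, int], release_b: dict[str, int]) -> dict[str, int]:
    """For each component present in both releases, report the difference of
    one metric value (e.g. NOM or LOC) between the two releases."""
    return {name: release_b[name] - release_a[name]
            for name in release_a.keys() & release_b.keys()}

# Example: NOM per class in releases 1.0 and 1.1 of a hypothetical system.
nom_1_0 = {"Event": 12, "Scheduler": 30}
nom_1_1 = {"Event": 15, "Scheduler": 30, "Logger": 8}
print(change_metric(nom_1_0, nom_1_1))  # Event: +3 methods, Scheduler: unchanged
```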
Conclusion: Metrics for Trend Analysis Can internal product metrics reveal which components have been changed?
Identifying Refactorings via Change Metrics • Refactorings Assumption • Decreases (or Increases) in metric values indicate restructuring • Basic Principle of “Identify Refactorings” Heuristics • Use one change metric as an indicator (1) • Complement with other metrics to make the analysis more precise • Include other metrics for quicker assessment of the situation before and after • (1) Most often we look for decreases in size, as most refactorings redistribute functionality by splitting components. • See “Finding Refactorings via Change Metrics” in OOPSLA’2000 Proceedings [Deme00a]
Move to Superclass, Subclass or Sibling • Recipe • Use decreases in “# methods” (NOM), “# instance attributes” (NIA) and “# class attributes” (NCA) as the main indicator • Select only the cases where “# immediate children” (NOC) and “Hierarchy Nesting Level” (HNL) remain equal (otherwise it is a case of “split class”), as sketched below • [figure: the same hierarchy A with subclasses B, C and D before and after moving members]
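A sketch of this recipe as a filter over per-class change metrics; the record layout with NOM/NIA/NCA/NOC/HNL values per release is an assumption made for illustration:

```python
def move_candidates(before: dict[str, dict[str, int]],
                    after: dict[str, dict[str, int]]) -> list[str]:
    """Flag classes whose size metrics (NOM, NIA, NCA) decreased while the
    hierarchy shape (NOC, HNL) stayed equal: candidates for a move to a
    superclass, subclass or sibling rather than a 'split class'."""
    candidates = []
    for cls in before.keys() & after.keys():
        b, a = before[cls], after[cls]
        size_dropped = any(a[m] < b[m] for m in ("NOM", "NIA", "NCA"))
        shape_stable = all(a[m] == b[m] for m in ("NOC", "HNL"))
        if size_dropped and shape_stable:
            candidates.append(cls)
    return candidates

before = {"B": {"NOM": 20, "NIA": 6, "NCA": 1, "NOC": 0, "HNL": 2}}
after  = {"B": {"NOM": 16, "NIA": 6, "NCA": 1, "NOC": 0, "HNL": 2}}
print(move_candidates(before, after))  # ['B']: methods were moved out of B
```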
Conclusion: Identifying Refactorings • Can internal product metrics reveal which refactorings have been applied?
Conclusion Can metrics (1) help to answer the following questions? • Which components have good/poor quality? • Not reliably • Which components did change? • Yes • Which refactorings have been applied? • Yes (1) Metrics = Measure internal product attributes (i.e., size, inheritance, coupling, cohesion,...)
Avoid Metric Pitfalls • Complexity • Complex metrics require more computation and more interpretation • Prefer simple metrics that are easy to collect and interpret • Avoid thresholds • Thresholds hide the true nature of the system • Prefer browsing/visualisation as a way to filter large amounts of information • Composition • Composed metrics hide their components, which makes them difficult to interpret • Show composed metrics side by side with their components • Visualize metrics