Object metrics. CONTENTS • Definition of metric • Usefulness of metrics • Factors of quality • Object metrics: The CK Metrics Suite • Example tool • Problems & Conclusions
What is a metric? • A metric is "a quantitative measure of the degree to which a system, component, or process possesses a given attribute" (IEEE term definition) • Metric ~ a measurement of one attribute, or a value computed from several attributes • There are metrics for processes, projects and products. My focus is on product metrics • Given a software design (e.g. a UML chart) or source code, what can we say about its quality?
What is a metric? • Applying a metric yields a numeric value for some property, but that value alone is generally not useful • Every metric model should define: • how the metric is measured • limits: what the measured values mean (what is good/bad) • how the metric is used to improve software quality • its relation to quality factors • what can be done to the software to improve the result
Why should metrics be used? • Metrics give us objective information about properties of software, e.g. the structure and complexity of the design • That information can be used to: • evaluate software quality • estimate the duration, costs etc. of a software project • evaluate the development of the software engineering process within a company • Metrics should be used as thresholds or indicators of concern, not as absolute measures of quality • Metrics can help us identify modules which • would benefit from code reviews • might be difficult to test and maintain • violate standards
Internal characteristics • External characteristics are difficult to measure directly • Internal characteristics are measurable and can be used as indicators of external ones! • Some internal characteristics: • size: number of lines in a method, number of methods in a class, etc. • complexity of a method: how many test cases are needed to test it comprehensively • cohesion: do the attributes and methods in a class serve a single well-defined purpose? • coupling: number of connections between classes
Object orientation • Modeling of real-world concepts • Encapsulation: data + functions related to a given concept within the same module (class) • Information hiding: private data + public operations for using the data • Inheritance: a subclass inherits all properties of its superclass • Polymorphism: an instance of a superclass can be replaced with an instance of its subclass • Connections between concepts: message passing
Object metrics • Some traditional metrics are useful with oo software: • size measures for methods and classes • complexity measures for methods • comment percentage • But new metrics are necessary for oo features (inheritance, polymorphism, etc.) • Metrics suite: a set of metrics measuring different aspects of software. E.g.: • MOOD (Metrics for Object Oriented Design) • The CK Metrics Suite (Chidamber & Kemerer) • Metrics proposed by Lorenz and Kidd
The CK Metrics Suite (1994) • One of the most frequently referenced sets of metrics • Six metrics measuring class size and complexity, use of inheritance, coupling between classes, cohesion of a class and collaboration between classes
CK_1: Weighted Methods per Class (WMC) • Number of methods weighted by their procedural complexity (~complexity of the class). Using unity weights gives simply the number of methods in the class. • WMC is an indicator of the amount of effort required to implement and test a class. • As the number of methods in a class grows, the class is likely to become more application-specific, which limits the possibilities for reuse • A high WMC value also means a greater potential impact on children • It is difficult to set exact limits for the metric, but WMC should be kept as low as possible • High WMC values -> split the class • Easy to measure, but should inherited methods be counted too?
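The WMC definition above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `wmc`, the example class and its complexity weights are all hypothetical, and the sketch assumes only methods declared in the class itself are counted (the slide leaves the question of inherited methods open).

```python
def wmc(methods, complexity=None):
    """Weighted Methods per Class for one class.

    methods: iterable of method names declared in the class.
    complexity: optional mapping from method name to a complexity weight
        (for example a cyclomatic complexity); missing entries count as 1.
    """
    weights = complexity or {}
    return sum(weights.get(name, 1) for name in methods)


# Hypothetical class with three methods and made-up complexity weights.
account_methods = ["deposit", "withdraw", "balance"]
account_complexity = {"deposit": 2, "withdraw": 4, "balance": 1}

unit_wmc = wmc(account_methods)                          # unity weights: 3
weighted_wmc = wmc(account_methods, account_complexity)  # 2 + 4 + 1 = 7
```

With unity weights the result is just the method count, which is why the two variants can rank the same classes differently.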
CK_2: Lack of Cohesion in Methods (LCOM) • Cohesion measures the "togetherness" of a class: high cohesion means good class subdivision (encapsulation) • LCOM counts the sets of methods that are not related through the sharing of some of the class’s instance variables • LCOM* is a normalized version with values in the range 0..1 • LCOM* = 0 if every method uses all instance variables • LCOM* = 1 if every method uses only one instance variable of its own (no variable is shared between methods) • High values of LCOM indicate scatter in the functionality provided by a class, i.e. the class attempts to serve many different objectives • Classes with high LCOM can be fault-prone and are likely to behave in less predictable ways than classes with low LCOM • High LCOM -> split the class
CK_2: Lack of Cohesion in Methods (LCOM) • Problems with LCOM: • Gives equal values for very different classes • Classes with "getters & setters" (getProperty(), setProperty()) get high LCOM values although this is not an indication of a problem • Does not even attempt to measure logical cohesion
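To make the LCOM discussion concrete, here is a small sketch. It rests on two assumptions the slides do not spell out: plain LCOM is computed in the pair-based form (method pairs sharing no variable minus pairs sharing one, floored at zero), and LCOM* uses the Henderson-Sellers style normalization. All class and method names are hypothetical.

```python
from itertools import combinations


def lcom_pairs(uses):
    """Pair-based LCOM: max(|P| - |Q|, 0).

    uses: mapping method name -> set of instance variables it touches.
    P = method pairs sharing no variable, Q = pairs sharing at least one.
    """
    p = q = 0
    for a, b in combinations(uses.values(), 2):
        if a & b:
            q += 1
        else:
            p += 1
    return max(p - q, 0)


def lcom_star(uses):
    """Normalized LCOM* = (m - mean_mu) / (m - 1), an assumed formulation.

    m is the number of methods, mean_mu the average number of methods
    that use each instance variable.
    """
    m = len(uses)
    variables = set().union(*uses.values())
    if m < 2 or not variables:
        return 0.0  # not defined here; treated as cohesive in this sketch
    mean_mu = sum(len(vs) for vs in uses.values()) / len(variables)
    return (m - mean_mu) / (m - 1)


# Every method uses every variable: fully cohesive.
stack = {"push": {"items", "size"}, "pop": {"items", "size"},
         "peek": {"items", "size"}}
# Every method uses only its own variable: no cohesion at all.
grab_bag = {"load": {"path"}, "render": {"canvas"}, "send": {"socket"}}
# A plain data class with getters and setters.
bean = {"get_name": {"name"}, "set_name": {"name"},
        "get_age": {"age"}, "set_age": {"age"},
        "get_email": {"email"}, "set_email": {"email"}}

stack_scores = (lcom_pairs(stack), lcom_star(stack))
grab_bag_scores = (lcom_pairs(grab_bag), lcom_star(grab_bag))
bean_lcom = lcom_pairs(bean)  # 12 unrelated pairs - 3 related pairs = 9
```

The data class scores higher than the genuinely scattered `grab_bag` class, which illustrates the getter/setter problem listed above.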
CK_3: Response For a Class (RFC) • Number of methods that can be invoked in response to a message • Measure of potential communication between classes • Often computed simply by counting the method calls in the class’s method bodies (computing the full transitive closure of each method is slow) • Complexity of the class increases and understandability decreases as RFC grows • Testing gets harder as RFC grows (the tester needs a better understanding of the class) • RFC can be used as an indicator of required testing time • High RFC -> there could be a better class subdivision (e.g. merge classes)
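The simplified RFC computation mentioned above can be sketched as follows. The sketch assumes the common one-level reading: the response set is the class's own methods plus the distinct methods they call directly, with no transitive closure. The call map and the names in it are hypothetical.

```python
def rfc(calls):
    """Response For a Class, one call level deep.

    calls: mapping from each method of the class to the set of methods
        it calls directly, written as qualified names.
    """
    response_set = set(calls)             # the class's own methods
    for callees in calls.values():
        response_set |= set(callees)      # plus everything they call directly
    return len(response_set)


# Hypothetical call map for a class named OrderService.
order_service_calls = {
    "OrderService.place": {"Inventory.reserve", "Payment.charge",
                           "OrderService.validate"},
    "OrderService.validate": {"Inventory.check"},
    "OrderService.cancel": {"Payment.refund", "Inventory.release"},
}

order_service_rfc = rfc(order_service_calls)  # 3 own + 5 external = 8
```

Because a set is used, a method called from several places is counted once, and a call to one of the class's own methods adds nothing.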
CK_4: Depth of Inheritance Tree (DIT) • Maximum height of the inheritance tree, or the level of a particular class in the tree • As DIT grows, classes on the lower levels are likely to inherit many methods and override some of them. Predicting the behavior of an object of such a class thus becomes difficult. • Large DIT means greater design complexity • DIT also measures reuse via inheritance • Very high or very low values of DIT might indicate problems with domain analysis
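As an illustration of the "level of a particular class" reading of DIT, the sketch below walks Python's `__bases__` attribute. It assumes that the language's implicit root (`object`) is not counted, so the first application class has depth 0; tools differ on this convention. The shape classes are hypothetical.

```python
def dit(cls):
    """Depth of cls in its inheritance tree (longest path to a root).

    Classes whose only base is `object` are treated as roots with depth 0;
    with multiple inheritance the longest path is used.
    """
    parents = [base for base in cls.__bases__ if base is not object]
    if not parents:
        return 0
    return 1 + max(dit(parent) for parent in parents)


# Hypothetical hierarchy.
class Shape: pass
class Polygon(Shape): pass
class Rectangle(Polygon): pass
class Square(Rectangle): pass

depths = {cls.__name__: dit(cls) for cls in (Shape, Polygon, Rectangle, Square)}
# Shape is at depth 0 and Square, three levels down, at depth 3.
```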
CK_5: Number Of Children (NOC) • Number of immediate subclasses of a class • Indicator of the potential influence a class has on the design • A high value of NOC might indicate misuse of subclassing (=implementation inheritance instead of an is-a relationship) • A class with very high NOC might be a candidate for refactoring to create a more maintainable hierarchy • NOC is also a measure of reuse via inheritance • NOC can be used as an indicator of required testing time
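NOC only needs the direct superclass of every class, so it can be sketched from a simple parent map. The map below is a hypothetical hierarchy, and the sketch assumes single inheritance for brevity.

```python
from collections import Counter


def noc(parent_of):
    """Number Of Children for every class in a hierarchy.

    parent_of: mapping class name -> name of its direct superclass,
        or None for a root class.
    """
    counts = Counter(p for p in parent_of.values() if p is not None)
    return {name: counts[name] for name in parent_of}


# Hypothetical hierarchy.
hierarchy = {
    "Shape": None,
    "Circle": "Shape",
    "Polygon": "Shape",
    "Triangle": "Polygon",
    "Rectangle": "Polygon",
    "Square": "Rectangle",
}

children = noc(hierarchy)
# Only immediate subclasses count: Square adds to Rectangle, not to Shape.
```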
CK_6: Coupling Between Objects (CBO) • coupling = class x is coupled to class y iff x uses y’s methods or instance variables (includes inheritance-related coupling) • CBO for a class is the number of other classes to which it is coupled • High coupling between classes means the modules depend on each other too much • Independent classes are easier to reuse and extend • High coupling decreases understandability and increases complexity • High coupling makes maintenance more difficult, since changes in one class might propagate to other parts of the software • Coupling should be kept low, but some coupling is necessary for a functional system
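Following the definition above literally, CBO can be sketched from a map of which classes each class uses. The sketch counts only outgoing use (x uses y), as the slide states it; some tools also count the reverse direction. The dependency map is hypothetical.

```python
def cbo(uses):
    """Coupling Between Objects per class.

    uses: mapping class name -> set of classes whose methods or instance
        variables it uses. A class using itself does not count.
    """
    return {name: len(set(targets) - {name}) for name, targets in uses.items()}


# Hypothetical dependency map.
dependencies = {
    "Order": {"Customer", "Product", "Order"},
    "Customer": {"Address"},
    "Product": set(),
    "Address": set(),
}

coupling = cbo(dependencies)
# Order is coupled to two other classes; its self-reference is ignored.
```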
Some other oo metrics • Method Inheritance Factor (MIF): ratio of inherited methods to all methods in a system. MIF=0 means the inheritance mechanism is not used. A measure of reuse via inheritance. • Number of overridden operations (NOO). A high value of NOO is an indication of a design problem: since a subclass should be a specialization of its superclass, it should primarily extend the operations of its superclass, not replace them. • Attribute Hiding Factor (AHF): ratio of invisible (protected+private) attributes to all attributes (public+protected+private). Ideally AHF should be 1, meaning there are no public attributes. Measures the degree of information hiding. (In Java, protected attributes are visible to classes in the same package, so the formula is not valid as such)
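The two ratios above are simple enough to sketch directly. The sketch follows the simplified definitions on the slide rather than the full MOOD formulas, and every count in the example is made up for illustration.

```python
def mif(classes):
    """Method Inheritance Factor for a system.

    classes: iterable of (inherited, declared) method counts per class,
        where inherited methods are those not overridden in the class.
    """
    classes = list(classes)
    inherited = sum(i for i, _ in classes)
    total = sum(i + d for i, d in classes)
    return inherited / total if total else 0.0


def ahf(public, protected, private):
    """Attribute Hiding Factor: share of attributes that are not public."""
    total = public + protected + private
    return (protected + private) / total if total else 0.0


# Hypothetical system of three classes: a root and two descendants.
system_mif = mif([(0, 4), (4, 2), (6, 2)])           # 10 inherited of 18 methods
no_inheritance = mif([(0, 3), (0, 5)])               # 0.0: inheritance not used
system_ahf = ahf(public=1, protected=2, private=7)   # 9 hidden of 10 attributes
```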
Example tool: JStyle • Automatic code reviewer • Checks style, allows defining your own coding conventions • Computes many traditional and object metrics • size: #classes, #methods/class, #statements/method • complexity: cyclomatic complexity of methods, WMC, bugs predicted • commenting: #comment lines/file, comment density • inheritance: reuse & specialization ratio, avg. inheritance depth, DIT, NOC • Gives knowledge about a project without the need to inspect the code
Problems with current object metrics • Classes are designed for different purposes: GUI, collections, data records, etc. -> the same limits for metrics don’t apply! • Classes should be divided into categories, with a different target range of metric values for each • Difficulty with limits: they depend on the problem domain, programming language, etc. • The definition of a metric is often insufficient, e.g. it does not say what can be done to improve the results • Are we measuring what we think we are? E.g. LCOM? • Lack of standards • Many measures for the same property: which one to use?
Conclusions • Metrics can be useful indicators of unhealthy code and design, pointing out areas where problems are likely to occur • Unfortunately, good metric results don’t necessarily imply good software quality • Metrics are easily collected automatically, without laborious inspection of source code • Metrics can’t reveal semantic errors of domain analysis, but structural misuse of oo constructs can be seen
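The idea of category-specific limits can be sketched as a lookup table. Everything here is hypothetical: the category names, the function `metrics_of_concern`, and above all the numbers, which are placeholders for illustration and not recommended thresholds.

```python
# Hypothetical upper limits per class category. The numbers are placeholders
# for illustration only, not recommended values.
LIMITS = {
    "gui":         {"WMC": 50, "RFC": 100, "CBO": 20},
    "collection":  {"WMC": 40, "RFC": 50, "CBO": 5},
    "data_record": {"WMC": 30, "RFC": 30, "CBO": 5},
}


def metrics_of_concern(category, measured):
    """Return the names of metrics that exceed the limit for this category."""
    limits = LIMITS[category]
    return sorted(name for name, value in measured.items()
                  if name in limits and value > limits[name])


# The same measurements judged as two different kinds of class.
measured = {"WMC": 45, "RFC": 60, "CBO": 4}
as_gui = metrics_of_concern("gui", measured)
as_record = metrics_of_concern("data_record", measured)
```

The same values pass for a GUI class and raise two flags for a data record, which is the point of the slide: a single set of limits for all classes would misjudge one of them.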
