This talk presents an open architecture for verifying coding rules in C++. It covers topics like C++ analysis model, tool architecture, preprocessing, language issues, implementation of coding rules, and the current state of development.
C++ Code Analysis: an Open Architecture for the Verification of Coding Rules Paolo Tonella ITC-irst, Centro per la Ricerca Scientifica e Tecnologica tonella@itc.it
ITC/CERN collaboration The collaboration aims at improving the quality of the code developed at CERN, by means of: • Automatic check of coding rules. • Recovery of the design from the code. • Refactoring of the design. All objectives share a common C++ code analysis functionality.
Outline of the talk • C++ analysis model • Tool architecture • Preprocessing • Language issues • Implementation of coding rules • State of development
C++ analysis model The model of the C++ language enjoys the following properties: • Generality. • Extensibility. • Abstraction.
Tool Architecture Packages syntax and entities collaborate to generate a network of objects according to the C++ model. Package rules contains the coding conventions to be checked.
Tool architecture Adopting this architecture provides remarkable flexibility. • All rules relying on properties of entities in the C++ model can be directly encoded. • The C++ model can be extended if additional properties need to be collected. • Adding a new application package is simple.
Preprocessing C++ macros are expanded in the code by the preprocessor. Macros do not necessarily comply with the C++ syntax.
    #define BEGIN {
    #define END }
    void f() BEGIN
      int x = 0;
      ...
    END
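The point of the slide can be tried directly with a compiler. The sketch below reuses the BEGIN and END macros; the int return type and the return statement are assumptions added so that the function has an observable result.

```cpp
// Macros from the slide: they stand in for braces.
#define BEGIN {
#define END }

// Before macro expansion this is not valid C++ syntax.
// After expansion it reads: int f() { int x = 0; return x; }
// (The int return type and the return statement are assumptions
// added for illustration; the slide uses void f().)
int f() BEGIN
    int x = 0;
    return x;
END

// Undefine the macros so they do not leak into later code.
#undef BEGIN
#undef END
```

A parser working on the unexpanded text would see BEGIN and END as plain identifiers, which is why the analysis runs on preprocessed code.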
Strip filter The C++ preprocessor prepends all directly and indirectly included files. The strip filter removes those that are not user defined. Moreover, the C++ preprocessor inserts some flags that are useful for the subsequent compilation step. Examples are: __extension__, __const__ The output of the strip filter is a legal C++ module, which can be analyzed by the parser.
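A minimal sketch of such a filter, not the tool's actual code. It assumes GCC-style linemarkers of the form # line "file" flags to tell which file each line came from. The names stripFilter, replaceAll and userFiles are hypothetical, and the handling of the two flags (dropping __extension__, mapping __const__ to const) is an assumption.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>

// Replace every occurrence of `from` in `s` with `to`.
static void replaceAll(std::string& s, const std::string& from,
                       const std::string& to) {
    for (std::size_t pos = s.find(from); pos != std::string::npos;
         pos = s.find(from, pos + to.size())) {
        s.replace(pos, from.size(), to);
    }
}

// Hypothetical strip filter: keeps only the text that originates from
// user-defined files and removes compiler-specific flags.
std::string stripFilter(const std::string& preprocessed,
                        const std::set<std::string>& userFiles) {
    std::istringstream in(preprocessed);
    std::string out, line;
    bool keep = true;  // text before the first linemarker is kept
    while (std::getline(in, line)) {
        // Assumed linemarker shape: # <line> "<file>" <flags>
        if (line.size() > 1 && line[0] == '#' && line[1] == ' ') {
            std::size_t open = line.find('"');
            std::size_t close = line.rfind('"');
            if (open != std::string::npos && close > open) {
                std::string file = line.substr(open + 1, close - open - 1);
                keep = userFiles.count(file) > 0;
            }
            continue;  // the markers themselves are never emitted
        }
        if (!keep) continue;
        replaceAll(line, "__extension__", "");  // assumption: simply dropped
        replaceAll(line, "__const__", "const"); // assumption: mapped to const
        out += line + '\n';
    }
    return out;
}
```

Calling stripFilter with the preprocessor output and the set of user file names returns only the user-written text, with the markers removed.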
C++ language Is it a function declaration or a global object creation? A x(); C++ was conceived as an object-oriented evolution of C. A strong requirement in its design was total backward compatibility with C. C++ also had a controversial evolution in its more advanced features, like exception handling and generic classes.
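Under the C++ declaration rules, A x(); is read as a function declaration. A short illustration follows; the class A and its member value are placeholders invented for the example.

```cpp
// Placeholder class, invented for the example.
struct A {
    int value;
    A() : value(42) {}
};

// Parsed as the declaration of a function x that takes no arguments
// and returns an A. No object is created here.
A x();

// Without the parentheses this is a global object creation.
A y;

// Definition matching the function declared above.
A x() { return A(); }
```

An analyzer therefore has to record x as a function entity and y as a global variable entity, although the two lines differ only by a pair of parentheses.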
Language issues To deal with the complexity of C++, it is important to distinguish between the compilation perspective and the analysis perspective. • The analyzer can assume that the input program compiles with no errors. • The compiler needs to capture the statement-level semantics. • The performance expected from the compiler is substantially higher. All these considerations led to the choice of a JavaCC-based C++ grammar.
Compatibility with C • Structures and unions are reinterpreted as classes. • Although classes provide methods, plain functions are still usable. • Functions may operate on class objects, and classes may invoke functions. • Global variables violate encapsulation, but are allowed. • Types other than classes can be defined with typedef.
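The points above can be seen together in one small module. All names in the sketch (twice, Point, Counter, callCount, scaledSum) are invented for illustration.

```cpp
// Plain C-style function; classes may call it.
int twice(int v) { return 2 * v; }

// A struct is a class whose members are public by default.
struct Point {
    int x;
    int y;
    int sum() const { return x + y; }                // method
    int doubledSum() const { return twice(sum()); }  // class invoking a function
};

// typedef introduces a type name that is not a class.
typedef int Counter;

// Global variable: legal, but every function in the module can modify it.
Counter callCount = 0;

// Free function operating on a class object.
int scaledSum(const Point& p, int factor) {
    ++callCount;
    return factor * p.sum();
}
```

A model of the language has to represent all of these entities, not only classes and their members.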
Language issues (cont.) • The language model contains C as a subset. • Type equivalence affects the association between declaration and definition. Additional difficulties: • Body of methods within class definition. • Constructors, destructors, conversion functions and operators. • Encapsulation violation via friend construct. • Generic classes (template). • Exception throwing and catching.
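Why type equivalence matters for matching a declaration with its definition can be shown with a typedef. The names Length and area are invented for the example.

```cpp
// Length is another name for int, not a new type.
typedef int Length;

// Declaration written with the typedef name.
int area(Length w, Length h);

// Definition written with the underlying type. It defines the function
// declared above, because the two signatures are equivalent.
int area(int w, int h) { return w * h; }
```

An analyzer that compares parameter types only by their spelling would treat these as two different functions.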
Coding rules Adding a new coding rule involves the following steps: • A new class is defined which extends the general class Rule. • Its constructor passes the rule name and description to the superclass constructor. • A method check must be defined to implement the interface of the superclass. • The body of the method check can use the access functions of the analysis package.
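The steps above can be sketched in C++ for illustration. Only the class name Rule and the method name check come from the slide. The rule MaxNameLength, its 20-character limit, the signature of check and the use of a plain list of class names in place of the analysis package are all assumptions.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class mirroring the general class Rule.
class Rule {
public:
    Rule(std::string name, std::string description)
        : name_(std::move(name)), description_(std::move(description)) {}
    virtual ~Rule() = default;

    // Interface every rule implements. Here it receives class names and
    // returns violation messages (an assumed signature).
    virtual std::vector<std::string> check(
        const std::vector<std::string>& classNames) const = 0;

    const std::string& name() const { return name_; }
    const std::string& description() const { return description_; }

private:
    std::string name_;
    std::string description_;
};

// Invented example rule, not one of the ALICE conventions.
class MaxNameLength : public Rule {
public:
    // The constructor passes name and description to the superclass.
    MaxNameLength()
        : Rule("MaxNameLength",
               "Class names are at most 20 characters long.") {}

    std::vector<std::string> check(
        const std::vector<std::string>& classNames) const override {
        std::vector<std::string> violations;
        for (const std::string& n : classNames) {
            if (n.size() > 20) violations.push_back(name() + ": " + n);
        }
        return violations;
    }
};
```

Each new rule touches only its own class; the base class and the entity model stay unchanged.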
Coding rule example The following coding rule is taken from the Naming Rules enforced within the CERN experiment ALICE: RN3 No special characters in names are allowed (_, #, &, @, -, %).
    check() {
      classes = Module.getClasses();
      foreach (c in classes) {
        if (c.getName().hasChar(_, #, &, @, -, %))
          printViolationMessage(...);
        methods = c.getMethods();
        foreach (m in methods) {
          if (m.getName().hasChar(_, #, &, @, -, %))
            printViolationMessage(...);
          locals = m.getLocals();
          foreach (l in locals)
            ...
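A runnable approximation of the pseudocode, written in C++ for illustration. The structs Class and Method are hypothetical stand-ins for the entities of the analysis model, and the treatment of local variables is an assumption, since the slide elides it.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the entities of the analysis model.
struct Method {
    std::string name;
    std::vector<std::string> locals;
};
struct Class {
    std::string name;
    std::vector<Method> methods;
};

// Characters forbidden by RN3.
const std::string kForbidden = "_#&@-%";

bool hasForbiddenChar(const std::string& name) {
    return name.find_first_of(kForbidden) != std::string::npos;
}

// Returns one message per violation instead of printing it.
std::vector<std::string> checkRN3(const std::vector<Class>& classes) {
    std::vector<std::string> violations;
    for (const Class& c : classes) {
        if (hasForbiddenChar(c.name))
            violations.push_back("RN3: class " + c.name);
        for (const Method& m : c.methods) {
            if (hasForbiddenChar(m.name))
                violations.push_back("RN3: method " + c.name + "::" + m.name);
            // Assumption: locals are checked like the other names.
            for (const std::string& l : m.locals) {
                if (hasForbiddenChar(l))
                    violations.push_back("RN3: local " + l + " in " + m.name);
            }
        }
    }
    return violations;
}
```

The rule only walks the entity network and inspects names; it needs no knowledge of how the parser produced those entities.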
Adding new coding rules The only constraint is that a formal description of the rule can be derived, for which a procedure can be written. • It may be necessary to augment the set of entities extracted by the CPPParser. • When entities are available, rule introduction is simple. • There is a clear and sharp separation between the responsibilities of packages rules and analysis.
Current limitations Known limitations are related to the difficulty of covering the whole range of C++. • Genericity is not handled. • Exception throwing and catching is not detected. • Type equivalence is implemented only in a simplified form. Such limitations did not substantially restrict the analysis of ALICE code, which does not exploit genericity and exceptions.
State of development See: http://AliSoft.cern.ch/offline/codingconv.html
State of development (cont.) Coverage of the coding conventions for which an automatic check is feasible:
Analyzed code The RuleChecker tool was successfully executed with no errors on all the code in the current release of the ALICE experiment software. A violation report was generated for each module under analysis.
Conclusion To make the analysis independent from the applications using its outcomes: • a C++ language model was defined, • a simple query protocol was used to access code entities. Executed on ALICE code, the tool RuleChecker: • collected information about 85730 lines of code, • reported no parse errors, • produced a violation report for each input module.