Source code analysis with Columbus Quality assessment Architecture reconstruction Árpád Beszédes Department of Software Engineering, University of Szeged, Hungary
Who we are • One of the leading SE research groups in Hungary • http://www.inf.u-szeged.hu/sed/ • Competences • Software quality • Software testing • Embedded systems • Networks • Open Source • .NET • Java Scalable Program Analysis - Dagstuhl Seminar
Place of source code analysis • Software maintenance / evolution of large systems • Processes (e.g. issue m) • Source code quality • Monitor • Test management • IT operation performance • Architecture
References • Telecom, financial, other • Analyzed systems (1-30 MLOC) • Graphisoft ArchiCAD • Nuance-Scansoft Recognita • evoSoft • Erste Bank • Nokia S60 platform • Mozilla Firefox & Thunderbird • SUN, OpenOffice.org • Eclipse • NASA WorldWind • Etc.
Source code-based QA methodology architecture • Continuous measurement and monitoring is needed!
Quality decrease of software systems • From: Roger Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill
Columbus technology • Analysis of large software systems • In the scope of regular maintenance • C/C++/C#/Java/SQL • Quality measurements, auditing • Reverse Engineering, architecture reconstruction • One-shot assessment • Continuous monitoring
Some history • 1998-2001: Nokia Research Center • FrontEndART Software Ltd. • Further development • Industrial projects • Grants • 50-100 man-years
Main components • Robust source code parsers • Analysis methodologies • Representation metamodels designed for maintenance tasks (language schemas) • Programming interfaces (API) • Extensions: CFG, call graph, DU, support for dynamic analysis (testing), etc. • Back ends • Code measurement • Reverse engineering • Standalone or SDK integration, command-line, API • SourceAudit, SourceDoc (previously Columbus/CAN) • Monitoring subsystem: SourceInventory (was Monitor)
Reverse Engineering • SourceDoc tool • Creation of UML logical views of the analyzed system • Class diagrams • In standard XMI format • can be loaded e.g. into Rational Rose • Automatic generation of HTML documentation • Class-level • Hyperlinked • Design pattern usage detection • Detecting architecture-level dependencies • “Superlinking” • Among physical components (e.g. exe, dll) • E.g. function calls, includes
“Bad smell” detection • SourceAudit tool • Finds code fragments that • might be problematic • and error-prone, • so they need refactoring • e.g. Feature Envy: a function that makes little use of its own class but relies on others
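One way a Feature Envy check could be phrased is sketched below. It is only an illustration, not SourceAudit's rule: the function name `feature_envy`, the `min_foreign` threshold and the input model (one class name per member access made by the function) are all assumptions.

```python
from collections import Counter

def feature_envy(own_class, accesses, min_foreign=3):
    """Return the name of the envied class, or None.

    accesses: iterable of class names, one entry per member access made
    by the function under inspection (hypothetical input model).
    Assumed heuristic: flag the function when some other class is accessed
    at least `min_foreign` times and more often than its own class.
    """
    counts = Counter(accesses)
    own = counts.get(own_class, 0)
    foreign = [(n, cls) for cls, n in counts.items() if cls != own_class]
    if not foreign:
        return None
    n, cls = max(foreign)  # most-accessed foreign class
    if n >= min_foreign and n > own:
        return cls
    return None
```

A function of `Order` that touches `Customer` three times and `Order` once would be reported as envying `Customer`; a real checker would take the access counts from the parsed program model rather than from a hand-built list.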
Clone detection • Finds duplicated code fragments (copy/paste, clones) • Clones can cause many hard-to-find bugs • The detection can be scaled • from exact match • to similar code fragments • Uses efficient flat-tree based recognition • Language independent
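The range from exact match to similar fragments can be illustrated with a small sketch. This uses a simpler technique than the flat-tree based recognition named on the slide: it compares sliding windows of tokenized lines, and in "fuzzy" mode replaces identifiers and numbers with placeholders first. The names `normalize` and `find_clones`, the keyword list and the window size are assumptions made for the example.

```python
import re
from collections import defaultdict

# Small illustrative keyword set; a real tool would use the language grammar.
_KEYWORDS = {"if", "else", "for", "while", "return", "int", "void"}

def normalize(line, fuzzy):
    """Tokenize a line; in fuzzy mode abstract identifiers and numbers."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", line)
    if not fuzzy:
        return tuple(tokens)
    out = []
    for t in tokens:
        if re.match(r"[A-Za-z_]", t) and t not in _KEYWORDS:
            out.append("ID")
        elif t.isdigit():
            out.append("NUM")
        else:
            out.append(t)
    return tuple(out)

def find_clones(lines, window=3, fuzzy=False):
    """Return groups of 0-based start indices whose `window` lines match."""
    norm = [normalize(line, fuzzy) for line in lines]
    buckets = defaultdict(list)
    for i in range(len(norm) - window + 1):
        chunk = tuple(norm[i:i + window])
        if all(chunk):  # skip windows that contain a blank line
            buckets[chunk].append(i)
    return [idx for idx in buckets.values() if len(idx) > 1]
```

With `fuzzy=False` only textually identical windows are grouped; with `fuzzy=True` two fragments that differ only in variable names or literals fall into the same group.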
Code checking • Checking conformance to coding rules • Some of them indicate bugs • The rules • check the coding style • extend the warning capabilities of the compiler • check typical implementation errors • new rules can be added easily (C++ API) • General good practice rules • Company specific rules • Integration of other checkers’ results • Integration with MS Visual Studio and Eclipse • Command-line operation
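The idea of a rule set that can be extended easily can be sketched as a small registry. This is not the Columbus C++ API; the decorator `rule`, the two example rules, their category labels and the line-based matching are hypothetical and deliberately crude (a real checker works on the parsed program model, not on text lines).

```python
import re

RULES = []  # (rule_id, category, check_function) triples

def rule(rule_id, category):
    """Decorator that registers a line-level check under an id and category."""
    def register(fn):
        RULES.append((rule_id, category, fn))
        return fn
    return register

@rule("no-goto", "maintainability")
def no_goto(line):
    return re.search(r"\bgoto\b", line) is not None

@rule("assign-in-if", "reliability")
def assign_in_if(line):
    # Heuristic: a single '=' early in a one-line `if (...)` condition.
    return re.search(r"\bif\s*\([^=!<>]*[^=!<>]=[^=]", line) is not None

def check(lines):
    """Return (line_number, rule_id, category) for every violation found."""
    return [(n, rid, cat)
            for n, line in enumerate(lines, start=1)
            for rid, cat, fn in RULES
            if fn(line)]
```

Adding a company-specific rule then amounts to writing one more decorated function; the checking loop itself does not change.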
Code measurement – metrics • All popular metrics for procedural, OO, SQL languages • CK, size, complexity, coupling, cohesion, OO-ness, etc. • Interpretation of metrics? • E.g. different types of cyclomatic complexity • The metrics-based fault predictor model selects C++ classes that are prone to errors [1] • successfully tested on Mozilla • [1] Gyimóthy T., Ferenc R. and Siket I. Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction. IEEE Transactions on Software Engineering, vol. 31, no. 10, October 2005
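The remark about different types of cyclomatic complexity can be made concrete with a short sketch. It uses Python's standard `ast` module on Python source purely as a stand-in for a C++ front end; the function name `cyclomatic` and the choice of node types are assumptions. The two variants differ only in whether the operands of `and`/`or` count as extra decisions.

```python
import ast

# Node types treated as decision points in this sketch.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic(source, count_bool_ops=False):
    """Return 1 + the number of decision points found in `source`.

    With count_bool_ops=True, every additional operand of an `and`/`or`
    expression is counted as one more decision.
    """
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif count_bool_ops and isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
    return complexity
```

The same code can therefore receive two different values, which is why a threshold only makes sense once the exact definition used by the tool is known.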
Connectivity to other tools • ART (RSF) • {CPP|J|CS}ML (XML) • FAMIX XMI • GXL • Maisa (Prolog) • RSF • UML XMI • VCG • Machine-readable CSV • Human-readable txt
Technical details • Own and commercial front ends • Analysis of complex systems: • Compiler wrapping technology • Standard schemas • OO-style • C++ API • High-level language independent model • Cross-module dependencies: superlinking
Scalability? • Limited low-level analyses • Lightweight? • Problems of another kind • Common high-level model • Relatively easy system integration • Anything else you wanted to know?
Monitoring subsystem • SourceInventory • The source code of the system is • downloaded automatically by the framework from a configuration management system (e.g. CVS) • analyzed, and the results are stored in a database • queries can be run and diagrams can be drawn from a web-based interface, which communicates with the database
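The "analyze, store, query" part of this pipeline can be sketched with Python's standard `sqlite3` module. The table layout and the function names `open_store`, `store` and `history` are assumptions for illustration; the slides do not describe SourceInventory's actual schema.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a metric store; one row per revision and metric."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS metric ("
               "revision TEXT, name TEXT, value REAL, "
               "PRIMARY KEY (revision, name))")
    return db

def store(db, revision, metrics):
    """Save the metric values measured for one analyzed revision."""
    with db:  # commits on success, rolls back on error
        db.executemany("INSERT OR REPLACE INTO metric VALUES (?, ?, ?)",
                       [(revision, n, v) for n, v in metrics.items()])

def history(db, name):
    """Return (revision, value) pairs for one metric, ordered by revision."""
    rows = db.execute("SELECT revision, value FROM metric "
                      "WHERE name = ? ORDER BY revision", (name,))
    return rows.fetchall()
```

A web front end would issue queries like `history(db, "loc")` to draw the trend of a metric across revisions.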
Monitoring subsystem (cont.) • Automatic alerts can be issued when indicators exceed critical thresholds • metric baselines • Internally: quality assessment • Customer: continuous measurement • Public databases: Mozilla, maemo, (OpenOffice, Eclipse)
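Threshold-based alerting can be sketched in a few lines. The function name `check_thresholds`, the metric names and the limits below are hypothetical; how SourceInventory defines baselines and delivers alerts is not stated in the slides.

```python
def check_thresholds(measurements, thresholds):
    """Return sorted (metric, value, limit) triples for exceeded limits.

    measurements: dict of metric name -> measured value
    thresholds:   dict of metric name -> critical limit (assumed upper bound)
    Metrics without a configured limit are ignored.
    """
    alerts = []
    for metric, value in measurements.items():
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            alerts.append((metric, value, limit))
    return sorted(alerts)
```

For example, with a limit of 0.10 on a clone coverage indicator, a measured value of 0.18 produces one alert while metrics within their limits stay silent.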
Screenshots (1)-(9)
Nokia R&D projects • History • 1998 – TED projects: C++ to UML • 2005 – ART projects: Symbian platform • Architecture reconstruction of Symbian platform • Identification of “architectural erosion” • Quality measurement of Symbian platform • Metrics • Official Symbian coding guidelines (SourceAudit) • Quality measurement of maemo platform
Technology extensions – 1 • SourceAudit coding rules (114) • general 'good practice' C++ coding rules • Symbian OS-specific rules • Nokia recommended rules • viability – the code will not work • reliability – the code may not work • maintainability – the code may be difficult to modify • readability – the code may be difficult to understand • reusability – the code may be less usable in conjunction with other code • convention – the code will be unconventional
Technology extensions – 2 • Architecture cross-component dependencies • call: inter-component function calls • include: the same header file is included by two components • resource: the consumer’s serviceClass attribute equals the provider’s interface_uid, the consumer’s contentType attribute equals the provider's default data item, and the consumer's serviceCmd attribute equals the provider's opaque_data • publish & subscribe: if a component sets a property (category and key) and another component subscribes to this property, then there is a P&S dependency between the two components • Dependency metrics, e.g. xCallIn, xCallOut, xInclude • Visualization
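A minimal sketch of how the call and publish & subscribe dependencies could be computed from extracted facts follows. The exact definitions of xCallIn and xCallOut are not given in the slides; here they are assumed to count inter-component calls entering and leaving a component. The function names, the input shapes and the (subscriber, publisher) direction of a P&S dependency are likewise assumptions.

```python
from collections import Counter

def call_metrics(calls, owner):
    """Count inter-component calls per component.

    calls: iterable of (caller_function, callee_function)
    owner: dict mapping a function name to its component (e.g. exe, dll)
    Returns (x_call_in, x_call_out) Counters keyed by component.
    """
    x_in, x_out = Counter(), Counter()
    for caller, callee in calls:
        src, dst = owner[caller], owner[callee]
        if src != dst:  # calls inside one component are not dependencies
            x_out[src] += 1
            x_in[dst] += 1
    return x_in, x_out

def ps_dependencies(publishes, subscribes):
    """Return (subscriber, publisher) pairs that share a property.

    publishes / subscribes: iterables of (component, (category, key)).
    """
    by_prop = {}
    for comp, prop in publishes:
        by_prop.setdefault(prop, set()).add(comp)
    return {(sub, pub)
            for sub, prop in subscribes
            for pub in by_prop.get(prop, ())
            if sub != pub}
```

Both functions assume the facts (call edges, property sets and subscriptions) have already been extracted and linked across components; the sketch only covers the aggregation step.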
Technology extensions – 3 • Build process (Symbian SDK) wrapping • Hides the original compiler with wrapper programs (e.g. gcc.exe) • After activating the SDKWrapper the project can be built as usual • creating abld.bat by ‘bldmake bldfiles’ and • building the project by ‘abld build …’ • Build scripts provided for linking and inter-component dependency computation • Difficulties: compiling resource files, excluding test components
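The compiler wrapping idea can be sketched as a program that sits in place of the compiler, records how it was invoked, and then forwards the call unchanged. This is not the SDKWrapper itself: the function name `wrap`, the JSON-lines log format and the compiler path in the usage note are hypothetical.

```python
import json
import subprocess

def wrap(argv, real_compiler, log_path, run=subprocess.call):
    """Record one compiler invocation, then forward it to the real compiler.

    argv:          arguments the build system passed to the wrapper
    real_compiler: path of the hidden original compiler (assumed known)
    log_path:      file that collects one JSON record per invocation
    run:           callable that executes the command; replaceable in tests
    Returns the exit code of the forwarded command.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"compiler": real_compiler, "args": argv}) + "\n")
    return run([real_compiler] + argv)
```

Installed under the compiler's name, a script would call something like `wrap(sys.argv[1:], "/path/to/real/gcc", "build_log.jsonl")` and exit with the returned code, so the build proceeds as usual while the analyzer later reads the recorded command lines (include paths, defines, source files).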