Supporting Release Management & Quality Assurance for Object-Oriented Legacy Systems
Lionel C. Briand, Visiting Professor, Simula Research Labs
Project
• Simula, Telenor
• Erik Arisholm, Valery Buzungu
• Can we help plan and manage new releases of a legacy system?
• Can we help focus V&V activities on the high-risk parts of the legacy system?
• Can we help predict the fault correction effort of a release after delivery?
• Can we help assess the impact of process change?
• Can we predict the risk associated with a change?
=> Use error and change data, plus code analysis
Initial Study
• Fault-proneness model to focus V&V
• The first such model in the context of a legacy, OO system
• Realistic model evaluation
• Cost-effectiveness analysis
Vision
[Cycle diagram: Record release change and fault correction data → Update fault prediction models → Package in tool (tree maps) → Deployment & training → Focused V&V, perform changes → Release → Feedback: fault-prone components → Corporate learning]
COS and XRadar
• COS: large telecom system, evolved over 5 years, 30-60 developers, ~130 KLOC (Java), ~2000 application classes
• XRadar
  • Code structural metrics (e.g., control flow complexity)
  • Code quality (duplications, violations, style errors)
• Change and fault correction data over releases 12 to 15.1 (5 releases) of COS
  • # changes and # fault corrections per class
Class fault probability is a function of
• Size, complexity, inheritance, and coupling of classes (XRadar, JHawk)
• Amount of change performed on classes
• Code quality in terms of violations, style errors, duplication
• Skills and experience
• Class and fault history
• Interactions are also likely
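A generic way to write such a model, matching the logistic regression introduced on the next slide (this is a sketch of the functional form only, not the study's fitted coefficients):

```latex
\pi(\mathrm{fault}) =
  \frac{e^{\,b_0 + b_1 X_1 + \cdots + b_n X_n}}
       {1 + e^{\,b_0 + b_1 X_1 + \cdots + b_n X_n}}
```

where the X_i are the (log-transformed) class measures listed above and the b_i are estimated from the release data.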
Building & Assessing a Prediction Model
• Logistic regression
• Four releases (R1 to R4) with fault and change data
• Dependent variable: fault corrections in R3
• Explanatory variables: R2 measurements plus fault and change history from R1
• The model is applied to R4, using R3 measurements plus R2 history
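A minimal sketch of this setup, assuming a per-class pandas DataFrame per release with hypothetical column names (a binary faulty label, everything else a metric or history variable); none of these names or files come from the study itself:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical per-class frames: R2 metrics plus R1 history,
# labelled with whether the class needed a fault correction in R3,
# and the analogous frame one release later for R4.
train = pd.read_csv("r2_metrics_r1_history_r3_faults.csv")
test = pd.read_csv("r3_metrics_r2_history_r4_faults.csv")

features = [c for c in train.columns if c != "faulty"]

model = LogisticRegression(max_iter=1000)
model.fit(train[features], train["faulty"])

# Apply one release ahead: R3 measurements (plus R2 history)
# yield fault-proneness probabilities for R4 classes.
p_fault = model.predict_proba(test[features])[:, 1]
```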
Data Analysis
• Explanatory variables are log-transformed
  • Alleviates the outlier problem
  • Helps account for interactions
• PCA
• Univariate analysis
• Multivariate analysis (stepwise)
• Balanced modeling data set
  • 82 observations to build the model
• Cross-validation (R3)
• Cost-effectiveness (R4)
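A rough sketch of three of these steps (log transform, PCA, balancing), reusing the hypothetical column names from the previous sketch; stepwise selection and the cross-validation setup are omitted:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

train = pd.read_csv("r2_metrics_r1_history_r3_faults.csv")
features = [c for c in train.columns if c != "faulty"]

# log1p dampens the heavy right skew of size and change counts.
X_log = np.log1p(train[features])

# PCA on the standardized, log-transformed variables groups
# correlated metrics into orthogonal components (PC1, PC2, ...).
scores = PCA().fit_transform(StandardScaler().fit_transform(X_log))

# Balance the modeling set by sampling as many non-faulty classes
# as there are faulty ones.
faulty = train[train["faulty"] == 1]
clean = train[train["faulty"] == 0].sample(len(faulty), random_state=0)
balanced = pd.concat([faulty, clean])
```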
PCA Results
• PC1: size, import coupling, violations, duplication, style errors, change counts
• PC2: number of releases, structural change measures
• PC3: cohesion
• PC4: fan-in
• PC5: class ancestors
• …
Univariate Analysis Results
• PC1 measures very significant
• Change counts and fault corrections in the previous release also significant
• No inheritance or cohesion measure is significant
• Nor are the older (R1) history variables
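Such a univariate pass can be sketched with statsmodels: fit one single-predictor logistic model per candidate variable and read off its p-value (column names again hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

train = pd.read_csv("r2_metrics_r1_history_r3_faults.csv")
y = train["faulty"]

# One single-predictor logistic regression per variable; a small
# p-value marks the variable as individually significant.
for col in [c for c in train.columns if c != "faulty"]:
    x = sm.add_constant(np.log1p(train[col]))
    fit = sm.Logit(y, x).fit(disp=0)
    print(f"{col}: p = {fit.pvalues[col]:.4f}")
```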
Multivariate Analysis
• History variables selected
• Inheritance variables too
• Due to interactions?
Caveats and Problems
• Assumption: most of the faults in release n are related to changes in release n-1 and, to a lesser extent, n-2
• It is difficult, in general, to collect data about cause-effect relationships between changes and faults
• No change effort data: size measures as a surrogate?
• Usually several concurrent version "streams" and merges are taking place
• No centralized defect tracking system
Conclusions
• Cost-effectiveness (CE) analysis seems practical
• Model seems cost-effective (cost reduction ~29%)
• History variables are very important predictors
• Still much room for improvement
• Currently undertaking more extensive data collection through the configuration management system
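The cost-effectiveness idea behind the ~29% figure can be sketched as follows: rank classes by predicted fault probability and compare the cumulative share of faults caught against the cumulative share of code inspected, relative to a size-proportional baseline. The loc column and the p_fault predictions are the hypothetical ones from the earlier sketches, not the study's data:

```python
import pandas as pd

test = pd.read_csv("r3_metrics_r2_history_r4_faults.csv")
test["p_fault"] = p_fault  # predictions from the earlier sketch

# Inspect classes in order of predicted fault-proneness.
ranked = test.sort_values("p_fault", ascending=False)
pct_loc = ranked["loc"].cumsum() / ranked["loc"].sum()
pct_faults = ranked["faulty"].cumsum() / ranked["faulty"].sum()

# The model is cost-effective where it catches a larger fraction of
# faults than a baseline that inspects code in proportion to size.
surplus = pct_faults - pct_loc
print("max surplus over size baseline:", round(surplus.max(), 2))
```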