Software Process Evaluation: A Machine Learning Approach Ning Chen, Steven C.H. Hoi, Xiaokui Xiao Nanyang Technological University, Singapore November 10, 2011
Software Processes • Procedures • Artifacts • Organizational Structures • Technologies • Policies
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Introduction • Background • The quality of software processes is directly related to productivity and product quality. • A key challenge faced by software development organizations is software process evaluation. • Conventional methods: questionnaire / interview / artifact study
Introduction (Cont’d) • Motivation • Limitations of conventional methods • Time-consuming • Authority constraints • Subjective evaluation • Dependence on experienced evaluation experts
Introduction (Cont’d) • Contributions • We propose a novel machine learning approach that helps practitioners evaluate their software processes. • We present a new quantitative indicator to evaluate the quality and performance of software processes objectively. • We compare and explore different kinds of sequence classification algorithms for solving this problem.
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Problem Statement • Scope: We restrict the discussion of a software process to a systematic approach to the accomplishment of some software development task (i.e., a procedure). • Goal: Evaluate the quality and performance of a software process objectively. Status of a requirement change Example of a simple requirement change process
Problem Statement (Cont’d) • Key Idea: Model a set of process executions as sequential instances, and formulate the evaluation task as binary classification of these instances into either “normal” or “abnormal”. Given a set of process executions, a quantitative measure, referred to as the “process execution qualification rate”, is calculated. Example of a sequential instance: <NEW, ASSESSED, ASSIGNED, RESOLVED, VERIFIED, CLOSED> → “normal” or “abnormal”
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Related Work • Software Process Models and Methodologies • High-level, comprehensive, and general frameworks for software process evaluation. • SPICE (1998) • Agile (2002) • CMMI 1.3 (2010); SCAMPI (2011) • Software Process Validation • Measure the differences between process models and process executions. • J. E. Cook et al. Software process validation: quantitatively measuring the correspondence of a process to a model. ACM TOSEM, 1999.
Related Work (Cont’d) • Software Process Mining • Discover explicit software process models using process mining algorithms. • J. Samalikova, et al. Toward objective software process information: experiences from a case study. Software Quality Control, 2011. • Data Mining for Software Engineering • Software specification recovery (El-Ramly et al. 2002; Lo et al. 2007) • Bug assignment task (Anvik et al. 2006) • Non-functional requirements classification (Cleland-Huang et al. 2006)
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Approach (Collecting Data) • Extract raw data from related software repositories according to the characteristics of the software process that we intend to evaluate. Example of a simple requirement change process
Approach (Data Preprocessing) • Convert every process execution from the raw data into a sequence-based instance. • Determine an “alphabet” that consists of a set of symbolic values, each of which represents a status of some artifact or an action of some task in the software process. (N,S,A,R,V,C)
Approach (Data Preprocessing) (Cont’d) • Convert each process execution from the raw data into a sequence of symbolic values, e.g., <N,S,A,R,V,C>. • Obtain a sequence database containing a set of unlabelled sequences.
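The preprocessing step above can be sketched in a few lines of Python. The status names and record layout here are illustrative assumptions (the slides only give the alphabet N, S, A, R, V, C), not the actual repository schema used in the study:

```python
# Map each recorded status in a process execution to a one-letter symbol
# from the alphabet {N, S, A, R, V, C} described on the preceding slide.
ALPHABET = {
    "NEW": "N", "ASSESSED": "S", "ASSIGNED": "A",
    "RESOLVED": "R", "VERIFIED": "V", "CLOSED": "C",
}

def encode_execution(status_history):
    """Convert one process execution (a list of status names) into a
    symbolic sequence such as 'NSARVC'."""
    return "".join(ALPHABET[s] for s in status_history)

# Two hypothetical executions: one complete, one that skips steps.
executions = [
    ["NEW", "ASSESSED", "ASSIGNED", "RESOLVED", "VERIFIED", "CLOSED"],
    ["NEW", "ASSIGNED", "CLOSED"],  # skips assessment and verification
]
sequence_db = [encode_execution(e) for e in executions]
```

The resulting sequence database (`["NSARVC", "NAC"]` for this toy input) is still unlabelled; labeling a sample of it is the next step.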
Approach (Building Sequence Classifiers) • Sampling and labeling a training data set • Feature representation • Training classifiers by machine learning algorithms.
Approach (Building Sequence Classifiers)(Cont’) • Sampling and labeling a training data set • Class labels{Normal, Abnormal} • Training data size • Assign an appropriate class label
Approach (Building Sequence Classifiers) (Cont’d) • Feature representation • K-grams feature technique • Training classifiers by machine learning algorithms • Support Vector Machine (SVM) • Naïve Bayes (NB) • Decision Tree (C4.5)
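A minimal sketch of this classifier-building step, using scikit-learn (our choice of library for illustration; the slides do not prescribe one). K-grams over the symbol sequences are extracted as character n-grams, and scikit-learn's `DecisionTreeClassifier` stands in for C4.5. The toy sequences and labels are invented:

```python
# Train the three classifier families named on the slide on k-gram features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

train_seqs = ["NSARVC", "NSARVC", "NAC", "NC"]  # toy labeled sample
train_labels = [1, 1, 0, 0]                     # 1 = normal, 0 = abnormal

# k = 2: every 2 consecutive symbols become one feature dimension.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(2, 2))
X = vectorizer.fit_transform(train_seqs)

models = {
    "SVM": SVC(kernel="linear"),
    "NB": MultinomialNB(),
    "DT": DecisionTreeClassifier(),  # stand-in for C4.5
}
for name, clf in models.items():
    clf.fit(X, train_labels)

# Classify an unseen (here, previously seen) execution.
pred = models["SVM"].predict(vectorizer.transform(["NSARVC"]))
```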
Approach (Quantitative Indicator) • A New Quantitative Indicator: process execution qualification rate (P = # actual normal sequences / total # sequences) Remark: We adopt the precision and recall values on the training set as the estimated precision and recall values when calculating P.
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Experiments • Experimental testbed • Evaluation scope: Four projects from a large software development center of a commercial bank in China. • The target software process under evaluation: defect management process
Experiments (Cont’d) • Experimental setup • Defect reports collected from HP Quality Center. • Obtain a sequence database of 2,622 examples. • Collect the ground-truth labels of all the sequences.
Experiment 1 Results • Main Observations: • SVM achieves the best performance. • Very positive result (partly derived from the nature of the data)
Experiment 2 Results • Main Observations: • Increasing the size of the training data leads to classification performance improvement. • However, the improvement becomes minor once the training data exceeds 20% of the dataset.
Experiment 3 Results • Main Observations: • The indicator is able to estimate a fairly accurate value of P when the amount of training data is sufficient. • SVM achieves the best performance, particularly when the amount of training data is small.
Case Studies Results • Main Observations: • The estimated P is close to the true value. • The P indicator is able to differentiate process quality among different projects.
Limitations of Validation • Lack of empirical studies for our proposed approach. • Classification performance for unusual sequences is not systematically analyzed.
Outline • Introduction • Problem Statement • Related Work • Approach • Experiments • Conclusion & Future Work
Conclusion • We propose a novel quantitative machine learning approach for software process evaluation. • Preliminary experiments and case studies show that our approach is effective and promising.
Future Work • Apply our proposed approach to other more complicated software process evaluation tasks. • Compare conventional approaches with our proposed machine learning approach.
Contact: Chen Ning E-mail: hzzjucn@gmail.com
Appendix: Criteria for labeling the training set • Adhere to the principles of the software process methodologies adopted. • Ensure no ambiguous cases appear in the minimal evaluation unit. • Involve experts from the organization.
Appendix: K-grams feature technique • Given a long sequence, a short segment of any k consecutive symbols is called a k-gram. • Each sequence can be represented as a fixed-dimension vector of the frequencies of the k-grams appearing in the sequence.
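The technique described above can be implemented directly in a few lines. This is an illustrative sketch: the vocabulary here is hand-picked for the toy example, whereas in practice it would be collected from the whole sequence database:

```python
# Slide a window of k symbols over a sequence, count each k-gram, and
# project the counts onto a fixed k-gram vocabulary to get a
# fixed-dimension frequency vector.
from collections import Counter

def kgram_counts(seq, k):
    """Count every segment of k consecutive symbols in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def to_vector(seq, k, vocabulary):
    """Represent seq as a frequency vector over a fixed k-gram vocabulary."""
    counts = kgram_counts(seq, k)
    return [counts.get(g, 0) for g in vocabulary]

vocab = ["NS", "SA", "AR", "RV", "VC", "NA", "AC"]  # toy 2-gram vocabulary
vec = to_vector("NSARVC", 2, vocab)  # [1, 1, 1, 1, 1, 0, 0]
```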
Appendix: Performance Metrics • “Area Under ROC Curve” (AUC), another metric for evaluation. • “Root Mean Square Error” (RMSE), a widely used measure of the differences between predicted values and the actual truth values.
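Both metrics follow standard definitions, sketched here for reference (our illustration, not code from the study). AUC is computed via its rank/Wilcoxon formulation: the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting one half:

```python
import math

def rmse(predicted, actual):
    """Root Mean Square Error between predicted and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))

def auc(scores, labels):
    """Area Under the ROC Curve: fraction of positive/negative pairs
    where the positive outscores the negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

a = auc([0.9, 0.8, 0.4, 0.2], [1, 1, 0, 0])  # perfectly ranked -> 1.0
e = rmse([1.0, 0.0], [1.0, 1.0])
```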
Appendix: Formula for calculating process execution qualification rate • In the definition above, it is important to note that the true “precision” and “recall” on the test set are unknown during the software process evaluation phase in a real-world application (unless we manually label all the test sequences).
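The formula itself is not reproduced in this transcript (it appeared as an image on the slide). A reconstruction consistent with the definition on the “Quantitative Indicator” slide, derived purely from the standard definitions of precision and recall (so it may differ in notation from the original):

```latex
% precision = TP / N_{pred-normal}, recall = TP / N_{normal}
% => N_{normal} = precision \cdot N_{pred-normal} / recall, hence
P \;=\; \frac{N_{\text{normal}}}{N_{\text{total}}}
  \;\approx\; \frac{\widehat{\text{precision}} \cdot N_{\text{pred-normal}}}
                   {\widehat{\text{recall}} \cdot N_{\text{total}}}
```

Here the hats denote the values estimated on the training set, which is exactly why the remark about unknown test-set precision and recall matters.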
Appendix: Defect management process • There are two kinds of nodes: a “Status” node, which represents a possible status of a defect and the responsible role, and a “Decision” node, which represents a decision that can affect the state change of a defect.