Scalability of Coevolutionary Automated Software Correction
ISC PI: Dr. Daniel Tauritz, Computer Science
Co-PIs: Dr. Bruce McMillin, Computer Science & Dr. Thomas Weigert, Computer Science
ISC Graduate Student: Josh Wilkerson, Computer Science
Missouri S&T Natural Computation Laboratory

Objective
• Develop a system to improve the scalability of the Coevolutionary Automated Software Correction (CASC) system

Motivation
• Automated software correction in reasonable time, regardless of program size
• Search space size grows with the number of code elements that could possibly contribute to the bug
• Approximate effect of code element count on the search space:
  • Exponential if no assumptions are made
  • Polynomial if the corrected software is similar to the source software
• Fault localization techniques can be used to automatically limit the number of code elements

Approach
• Automated Relevant Code Discovery (ARCD) system
  • Preprocessor for CASC
  • Employs an ensemble of fault localization techniques to generate a suspicious line set for use in CASC
  • Exploits the fitness function
  • Automatically instruments the program to provide execution information

ARCD Technique One: Positive/Negative Traces
• Execution trace comparison
• Based on work by Stephanie Forrest et al.
  • Forrest uses an “oracle comparator” fitness function
• ARCD approach is based on the longest common substring dynamic programming algorithm
• Positive test case: correct output
• Negative test case: incorrect output
• Lines unique to negative traces contain the bug

ARCD Technique Two: Fitness Based Suspicion
• Each line of code has an associated suspicion level
• Fitness is determined for a set number of test cases
  • Ideally an even distribution of performances
• Suspicion adjustment amount is calculated based on fitness
  • Low/negative amount for high fitness
  • High/positive amount for low fitness
• Execution trace is used to adjust suspicion levels appropriately

ARCD Technique Three: Fitness Plot
• Additional instrumentation to calculate fitness after each line
• Generates a plot of fitness differentials throughout program execution
• Lines which consistently cause decreases in fitness are suspected to be buggy
• Determining the optimal frequency of fitness reports is a major challenge

Voting Ensemble
• Each technique gets one vote per program line
• Votes are applied based on technique confidence

Future Work
• Continue development/testing of fitness based techniques
• Develop ensemble method for combining technique results
• Incorporate state of the art software engineering fault localization methods

[Poster panels not reproduced here: Example Sorting Pseudocode, Results]
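The trace comparison in ARCD Technique One can be illustrated with a minimal Python sketch. It assumes an execution trace is a list of executed line numbers, which the poster does not specify. The function names `longest_common_run` and `lines_unique_to_negative` are hypothetical, and how ARCD actually combines the longest common substring step with the "unique to negative traces" rule is not stated, so the two are shown separately.

```python
from typing import List, Sequence, Set


def longest_common_run(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Longest contiguous run of line numbers shared by traces a and b
    (the classic longest-common-substring dynamic program)."""
    best_len, best_end = 0, 0
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return list(a[best_end - best_len:best_end])


def lines_unique_to_negative(positive: List[Sequence[int]],
                             negative: List[Sequence[int]]) -> Set[int]:
    """Lines executed by some negative trace and by no positive trace."""
    seen_positive = {line for trace in positive for line in trace}
    seen_negative = {line for trace in negative for line in trace}
    return seen_negative - seen_positive


# Hypothetical traces: the passing run takes line 3, the failing run line 4.
positive_trace = [1, 2, 3, 5, 6]
negative_trace = [1, 2, 4, 5, 6]

shared = longest_common_run(negative_trace, positive_trace)
suspects = lines_unique_to_negative([positive_trace], [negative_trace])
```

Here `shared` is the first longest shared run, `[1, 2]`, and `suspects` is `{4}`, the only line that appears exclusively in the failing run.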
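The suspicion update described under ARCD Technique Two might look like the following Python sketch. Several details are assumptions rather than the poster's method: fitness is taken to be normalized to [0, 1], the adjustment is the linear map `1 - 2 * fitness`, and each executed line is adjusted once per test case. The names `suspicion_adjustment` and `update_suspicion` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def suspicion_adjustment(fitness: float) -> float:
    """Assumed linear map from fitness in [0, 1] to an adjustment in [-1, 1]:
    high fitness gives a negative amount, low fitness a positive one."""
    return 1.0 - 2.0 * fitness


def update_suspicion(runs: Iterable[Tuple[float, List[int]]]) -> Dict[int, float]:
    """Accumulate suspicion per line from (fitness, execution trace) pairs."""
    suspicion: Dict[int, float] = defaultdict(float)
    for fitness, trace in runs:
        delta = suspicion_adjustment(fitness)
        # Adjust each executed line once per test case, ignoring repeats.
        for line in set(trace):
            suspicion[line] += delta
    return dict(suspicion)


# Hypothetical test cases: one fully fit run and two poorly performing runs.
runs = [
    (1.0, [1, 2, 3, 5]),
    (0.0, [1, 2, 4, 5]),
    (0.25, [1, 2, 4, 4, 5]),
]
levels = update_suspicion(runs)
most_suspicious = max(levels, key=levels.get)
```

With these made-up inputs, line 4 accumulates the highest suspicion (1.5) because it is executed only by the low-fitness runs, while line 3 ends at -1.0.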
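One possible reading of the Voting Ensemble section is a confidence-weighted sum of votes, sketched below in Python. The poster only says that each technique gets one vote per line and that votes are applied based on confidence, so the weighting scheme, the threshold, and the names `combine_votes` and `suspicious_line_set` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def combine_votes(technique_reports: List[Dict[int, float]]) -> Dict[int, float]:
    """Sum one confidence-weighted vote per technique per program line.

    Each report maps a line number to that technique's confidence in [0, 1].
    """
    totals: Dict[int, float] = {}
    for report in technique_reports:
        for line, confidence in report.items():
            totals[line] = totals.get(line, 0.0) + confidence
    return totals


def suspicious_line_set(totals: Dict[int, float], threshold: float) -> Set[int]:
    """Lines whose combined vote reaches the (assumed) threshold."""
    return {line for line, score in totals.items() if score >= threshold}


# Hypothetical reports from three fault localization techniques.
reports = [
    {4: 1.0, 5: 0.5},
    {4: 0.5},
    {4: 0.75, 7: 0.25},
]
totals = combine_votes(reports)
selected = suspicious_line_set(totals, threshold=1.0)
```

In this example line 4 collects a combined vote of 2.25 and is the only line passed on to CASC; lines 5 and 7 fall below the threshold.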