Automatic program repair using genetic programming Claire Le Goues February 12, 2013 http://www.clairelegoues.com
How do humans fix bugs?
Mike (recent undergrad)
(Mike’s project)
regression test cases
(modified code)
??!
Now what?
How do humans fix bugs?
printf transformer
Figure (diagram residue): Input; nodes numbered 1 to 12. Legend: likely faulty, maybe faulty, not faulty (by probability).
Problem: Buggy Software
• “Everyday, almost 300 bugs appear […] far too many for only the Mozilla programmers to handle.” – Mozilla Developer, 2005
• Annual cost of software errors in the US: $59.5 billion (0.6% of GDP).
• Average time to fix a security-critical error: 28 days.
• Chart: 90% Maintenance, 10% Everything Else
How bad is it?
…Really?
• Tarsnap: 125 spelling/style, 63 harmless, 11 minor, 1 major
• 75/200 = 38% true-positive (TP) rate
• $17 + 40 hours per TP
Solution: Pay Strangers
Solution: Automate
GenProg: automatic, scalable, competitive bug repair.
The plan
• Input: a program, test cases encoding required functionality, and at least one (failing!) test case encoding the bug.
• Approach: use genetic programming to conduct a biased, random search for a set of edits to the program that fixes the given bug.
Genetic programming: the application of evolutionary or genetic algorithms to program source code.
Genetic programming
• Population of variants.
• Fitness function evaluates desirability.
• Desirable individuals are more likely to be selected for iteration and reproduction.
• New variants created via:
  • Mutation: ABCDEF → ABADEF
  • Crossover: ABCDEF, ZYXWVU → ABCWVU, ZYXDEF
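The two string examples on the slide (ABCDEF becoming ABADEF, and ABCDEF with ZYXWVU becoming ABCWVU and ZYXDEF) can be reproduced with a minimal sketch. The function names, the mutated position, and the cut point are assumptions chosen to match the slide's strings; the deck itself does not give an implementation.

```python
from typing import Tuple


def mutate(variant: str, position: int, new_gene: str) -> str:
    """Return a copy of `variant` with the gene at `position` replaced."""
    return variant[:position] + new_gene + variant[position + 1:]


def crossover(parent_a: str, parent_b: str, cut: int) -> Tuple[str, str]:
    """One-point crossover: the two parents swap tails at index `cut`."""
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


# Position 2 and cut point 3 are assumed so the output matches the slide.
mutated = mutate("ABCDEF", 2, "A")
children = crossover("ABCDEF", "ZYXWVU", 3)
print(mutated)   # ABADEF
print(children)  # ('ABCWVU', 'ZYXDEF')
```

In a real search, the position, the replacement, and the cut point would be drawn at random rather than fixed.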
Figure: search loop with stages INPUT, EVALUATE FITNESS, DISCARD, ACCEPT, MUTATE, OUTPUT.
Secret Sauces
• There are many ways to fix any bug.
  • Upshot: coarse-grained edits.
• Not every line of code is equally likely to contribute to the bug.
  • Upshot: use test cases to refine mutation probabilities.
• Programs contain the seeds of bug repair; programmers know what they’re doing.
  • Upshot: do not invent new code.
Fitness
• Compile and run each candidate:
  • Fitness = weighted count of passed test cases.
  • Heuristic: passing the bug test is worth more.
  • If it fails to compile, fitness = 0.
• Selection and Crossover: higher fitness variants are (randomly) retained and combined into the next generation.
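The fitness rule above can be sketched as follows. The two weight values are hypothetical (the slide only says the bug test is worth more), and tournament selection is an assumed stand-in for "randomly retained", since the slide does not name a selection scheme.

```python
import random
from typing import Dict, List, Sequence, Set

# Hypothetical weights: the slides only say the bug test is "worth more".
REGRESSION_TEST_WEIGHT = 1.0
BUG_TEST_WEIGHT = 10.0


def fitness(compiled: bool, passed: Dict[str, bool], bug_tests: Set[str]) -> float:
    """Weighted count of passed tests; a variant that fails to compile scores 0."""
    if not compiled:
        return 0.0
    return sum(
        BUG_TEST_WEIGHT if name in bug_tests else REGRESSION_TEST_WEIGHT
        for name, ok in passed.items()
        if ok
    )


def tournament_select(population: Sequence[str], scores: Sequence[float],
                      k: int, rng: random.Random, size: int = 2) -> List[str]:
    """Assumed scheme: k times, draw `size` distinct variants and keep the fittest."""
    survivors = []
    for _ in range(k):
        contenders = rng.sample(range(len(population)), size)
        survivors.append(population[max(contenders, key=lambda i: scores[i])])
    return survivors


results = {"regression_1": True, "regression_2": True,
           "regression_3": False, "bug_1": True}
score_ok = fitness(True, results, {"bug_1"})       # 1.0 + 1.0 + 10.0
score_broken = fitness(False, results, {"bug_1"})  # did not compile

population = ["variant_a", "variant_b", "variant_c"]
scores = [12.0, 0.0, 3.0]
next_generation = tournament_select(population, scores, k=3, rng=random.Random(0))
```

Here `score_ok` is 12.0 and `score_broken` is 0.0. Because each tournament draws two distinct variants, the lowest-scoring `variant_b` can never win one.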
Mutation: Bird’s Eye View
• Search: random (GP) search through population of patches.
• Approach: compose small random edits.
  • Where to change?
  • How to change it?
http://www.cs.virginia.edu/~csl9q
Mutation: Where
• Use the test cases to associate bug with code:
  • Instrument program.
  • Record which statements are executed on failing vs. passing test cases.
  • Weight statements accordingly.
• Initial weighting heuristic:
  • High (1.0) if S is only visited on a failed test.
  • Low (0.1) if S is visited on both failed and passed tests.
  • Zero (0.0) if S is not visited on any failed test.
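The initial weighting heuristic maps directly onto two coverage sets. The sketch below assumes statements are identified by integer IDs and that coverage has already been recorded; the function name and the six-statement example are hypothetical.

```python
from typing import Dict, Iterable, Set


def localization_weights(statements: Iterable[int],
                         visited_by_passing: Set[int],
                         visited_by_failing: Set[int]) -> Dict[int, float]:
    """Assign each statement the slide's initial weight: 1.0, 0.1, or 0.0."""
    weights = {}
    for s in statements:
        if s not in visited_by_failing:
            weights[s] = 0.0   # never visited on a failed test
        elif s in visited_by_passing:
            weights[s] = 0.1   # visited on both failed and passed tests
        else:
            weights[s] = 1.0   # visited only on a failed test
    return weights


# Hypothetical coverage for a six-statement program.
weights = localization_weights(
    statements=range(1, 7),
    visited_by_passing={1, 2, 3, 6},
    visited_by_failing={1, 2, 4, 5, 6},
)
print(weights)  # {1: 0.1, 2: 0.1, 3: 0.0, 4: 1.0, 5: 1.0, 6: 0.1}
```

Statements 4 and 5 run only on the failing test, so they receive the highest weight.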
Mutation: How
Coarse! Reduces search space by at least 2-10x.
• Three statement-level edits:
  • delete X
  • replace X with Y
  • insert Y after X.
• To mutate a variant:
  • Choose a random program statement S.
  • Choose a random edit to apply to S.
  • Append edit to the variant.
• Replace/insert: pick Y from somewhere else in the program.
Biased by fault localization.
Selection probabilities set heuristically.
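The mutation steps can be sketched with a variant represented as a list of edits. This is an illustration under stated assumptions, not GenProg's actual code: the `Edit` class, the uniform edit probabilities, and the example weights are hypothetical, and the sketch assumes that fault localization biases the choice of S while the heuristic probabilities govern the choice of edit type.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Edit:
    op: str                       # "delete", "replace", or "insert"
    target: int                   # statement X (the chosen S), by ID
    source: Optional[int] = None  # statement Y; None for delete


# Hypothetical values: the slides only say these are set heuristically.
EDIT_PROBABILITIES = {"delete": 1 / 3, "replace": 1 / 3, "insert": 1 / 3}


def mutate(variant: List[Edit], weights: Dict[int, float],
           rng: random.Random) -> List[Edit]:
    """Return a new variant with one random statement-level edit appended."""
    ids = list(weights)
    # Choose S, biased by the fault-localization weights.
    target = rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]
    # Choose which edit to apply to S.
    op = rng.choices(list(EDIT_PROBABILITIES),
                     weights=list(EDIT_PROBABILITIES.values()), k=1)[0]
    # Replace/insert reuse an existing statement Y from elsewhere in the program.
    source = None
    if op != "delete":
        source = rng.choice([i for i in ids if i != target])
    return variant + [Edit(op, target, source)]


# Hypothetical localization weights for a six-statement program.
example_weights = {1: 0.1, 2: 0.1, 3: 0.0, 4: 1.0, 5: 1.0, 6: 0.1}
rng = random.Random(0)
variant: List[Edit] = []
for _ in range(50):
    variant = mutate(variant, example_weights, rng)
```

Because Y is always drawn from the existing statements, the sketch never invents new code, and a statement with weight 0.0 is never chosen as S.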