A Plethora of Paths Eric Larson May 18, 2009 Seattle University
Paths • Paths are commonly used in static analysis techniques. • Symbolic path simulation: simulate each path with symbolic data values • Issues: • Path explosion • Illegal paths • [Figure: example graph with nodes A through G]
Format of Talk • Research Questions • Implementation • Analysis framework • Program slicing • Path counting algorithm • Shortcomings • Results • Quantitative • Qualitative • Conclusion • Answers to the research questions • Future work
Research Questions: Single Run / Individual Operations • When employing high-quality static software bug detection techniques, is it better to analyze the entire program in a single run or to look at dangerous operations individually? • High-quality static software bug detection techniques: • catch most (ideally all) bugs • produce few (ideally no) false bug reports • Dangerous operation: any operation that needs to be checked for potential errors. • In this study, we consider operations that access memory to be dangerous operations.
Single Run / Individual Operations: Tradeoffs • Entire program: • Only one run • Most of the program is relevant • Big-O: 2^n • Individual operations: • Many runs • More of the program is irrelevant (can be ignored) • Big-O: s × 2^m • Key question: To what extent is m < n?
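To make the tradeoff concrete, the two bounds can be compared directly. The sketch below rests on an interpretation the slide does not spell out: n is taken to be the number of independent two-way branches in the whole program, m the number that remain relevant to a single dangerous operation, and s the number of dangerous operations. The function names are hypothetical.

```python
def single_run_bound(n):
    """Worst-case path count for one run over the whole program: 2^n."""
    return 2 ** n


def individual_runs_bound(s, m):
    """Worst-case total over s separate runs, each with 2^m paths."""
    return s * 2 ** m


def individual_runs_cheaper(n, s, m):
    """True when the s individual runs together explore fewer paths."""
    return individual_runs_bound(s, m) < single_run_bound(n)
```

Under these assumptions, n = 10, s = 8, m = 5 gives 256 paths in total for the individual runs against 1024 for the single run. If slicing removes nothing (m = n), the individual runs cost s times as much as the single run, which is why the size of the gap between m and n is the key question.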
Research Questions: Program Slicing • What is the effectiveness of program slicing in reducing the number of paths? • Program slicing removes statements not relevant to the property. • Obtain path counts with different slicing criteria: • all statements (no slicing) • all dangerous statements • all dangerous statements within a function • one individual dangerous statement
Research Questions: Path Explosion • What types of tasks lead to path explosion? Is slicing more or less effective on particular tasks? • Quantitative and qualitative analysis across 15 different programs.
Analysis Framework • Uses modified version of SUDS (SCAM 2007) • Operates on the whole program • Analyzes programs written in C • Performs traditional analyses • Simplification • Control flow graph / call graph • Pointer analysis (flow-sensitive) • Data flow analysis • Program slicing (next slide) • Path counting (slide after next)
Program Slicing • Backwards, context-insensitive slicing algorithm • Prevents the slice from propagating into a function that is clearly not in the slice • Indirect uses from control statements are not part of the slice • Path counting will follow both directions regardless of condition • No attempt to make slice executable • Used for analysis only • Slicing criterion varies by experiment: • No slicing • All dangerous statements • All dangerous statements in a function • One dangerous statement
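As a rough illustration of backward slicing, the sketch below walks data dependences backwards from the slicing criterion with a worklist. It is a simplified, single-function approximation and not the SUDS implementation: the `data_deps` map and the statement labels are hypothetical, and ignoring control dependences is only one reading of the bullet about indirect uses from control statements.

```python
def backward_slice(data_deps, criteria):
    """Return every statement the criteria transitively depend on.

    data_deps maps a statement label to the set of statements whose
    definitions it uses (hypothetical input format). Control dependences
    are deliberately not followed in this sketch.
    """
    in_slice = set()
    worklist = list(criteria)
    while worklist:
        stmt = worklist.pop()
        if stmt in in_slice:
            continue  # already visited
        in_slice.add(stmt)
        worklist.extend(data_deps.get(stmt, ()))
    return in_slice


# Hypothetical function: s5 and s6 are the dangerous (memory-accessing)
# statements, and s4 is an output statement nothing else depends on.
deps = {
    "s2": {"s1"},        # i = f(n)
    "s3": {"s1"},        # x = g(n)
    "s4": {"s3"},        # printf(..., x)
    "s5": {"s2", "s3"},  # a[i] = x
    "s6": {"s2"},        # b[i] = 0
}
one_statement = backward_slice(deps, {"s6"})
all_dangerous = backward_slice(deps, {"s5", "s6"})
```

Changing the slicing criterion only changes the starting set: slicing on one dangerous statement (`s6`) keeps `s1`, `s2`, and `s6`, while slicing on all dangerous statements also pulls in `s3` and `s5`. The output statement `s4` is sliced away in both cases.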
Path Counting • Control flow graph is collapsed after slicing • Path count is computed intraprocedurally • Total path count is the sum of the counts for each function • Loops introduce two new paths: • One for the loop not taken • One for the loop taken once • Assumes fixed-point analysis summarizes the loop • Goto statements end a path • Not too many gotos in the programs used • Functions with gotos have a lot of paths even with this simplification
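The counting rules above can be sketched on a small acyclic graph. This is a minimal illustration and not the tool's algorithm: the CFG encoding (a dict of successor lists) and the function names are assumptions. A loop is modeled by replacing its back edge with an edge to the loop exit, so the loop contributes a "not taken" path and a "taken once" path, and a goto could be modeled by listing its node among the exits.

```python
from functools import lru_cache


def count_paths(cfg, entry, exits):
    """Count entry-to-exit paths in an acyclic CFG {node: [successors]}.

    Nodes in `exits` end a path. A node with no successors that is not
    an exit contributes zero paths.
    """
    exits = frozenset(exits)

    @lru_cache(maxsize=None)
    def paths_from(node):
        if node in exits:
            return 1
        return sum(paths_from(succ) for succ in cfg.get(node, ()))

    return paths_from(entry)


def total_paths(functions):
    """Sum the per-function counts; each item is (cfg, entry, exits)."""
    return sum(count_paths(cfg, entry, exits) for cfg, entry, exits in functions)


# Two sequential if/else statements: 2 * 2 = 4 paths.
two_ifs = {
    "entry": ["a", "b"], "a": ["join"], "b": ["join"],
    "join": ["c", "d"], "c": ["exit"], "d": ["exit"],
}
# One loop, back edge replaced by body -> exit: skip it or run it once.
one_loop = {"entry": ["head"], "head": ["body", "exit"], "body": ["exit"]}

program_total = total_paths([
    (two_ifs, "entry", {"exit"}),
    (one_loop, "entry", {"exit"}),
])
```

Memoizing on the node keeps the count linear in the size of the graph even when the number of paths is exponential, which matters because the counts themselves can reach the millions.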
Shortcomings • Processing of loops and goto statements • Not all paths are equal • length of path • complexity of state • Intraprocedural path count depends on how the program is divided into functions • Amount of work to reduce the number of paths varies widely • Depends on factors such as loop depth
Results: Individual Statement Runs • One run for each dangerous operation • The runs are sorted by the number of paths from smallest to largest • Graphs show cumulative percentage of runs that have fewer than n paths
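The cumulative percentages plotted in such graphs can be computed as follows. This is a sketch only: the function name is hypothetical and the path counts in the example are made-up illustrative values, not results from the study.

```python
def cumulative_percentage(path_counts, thresholds):
    """For each threshold n, the percentage of runs with fewer than n paths."""
    if not path_counts:
        raise ValueError("need at least one run")
    total = len(path_counts)
    return {
        n: 100.0 * sum(1 for count in path_counts if count < n) / total
        for n in thresholds
    }


# Made-up path counts for four individual runs (illustration only).
example = cumulative_percentage([1, 2, 8, 1000], [10, 100, 10000])
```

Each run contributes one path count, so a curve that rises quickly at small n means most dangerous operations can be analyzed with few paths.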
Qualitative Analysis • Look deeper at each program • What tasks lead to path explosion? • What does slicing do? • Example analysis – find • Function quotearg_buffer_restyled has the most paths (21 million) • Modifies and buffers a string • Many options and special character processing • After slicing, 4 million paths remain • Function consider_visiting has the second most paths • Individual runs are effective for operations that are not in either of the above two functions • See the paper for analysis of the other 14 programs.
Qualitative Analysis • Common tasks for path explosion: • Input processing functions (often not sliced away) • Parsing functions (often not sliced away) • Stylized output functions (often sliced away) • Other program-specific tasks that suffered from path explosion: • divide in bc • finite state automata conversion in flex • finding the best move in gnuchess
Conclusions • When employing high-quality static software bug detection techniques, is it better to attempt to use the entire program in a single run or to look at dangerous operations individually? • Worst case individual run ≈ single run • But there are exceptions • Individual runs were effective for many operations • Especially those that were not from a function that suffered from path explosion
Conclusions • What is the effectiveness of program slicing in reducing the number of paths? • Slicing did reduce the number of paths. • Not enough in the worst cases of path explosion. • What types of tasks lead to path explosion? Is slicing more or less effective on particular tasks? • Input processing, parsing, and stylized output functions often suffered from path explosion. • Path explosion still existed in these functions after slicing. • Slicing was helpful for stylized output functions since little to no code was dependent on their results.
Future Work • Use the results to improve static bug detection: • Investigate task-specific techniques to address path explosion • Incorporate some level of guidance from the user • Extend the study • Address shortcomings: loops, interprocedural analysis • Programs in different languages