Data-Flow Analysis of Program Fragments • Atanas Rountev (1), Barbara G. Ryder (1), William Landi (2) • (1) Department of Computer Science, Rutgers University • (2) Siemens Corporate Research • http://www.prolangs.rutgers.edu • Funded by NSF grants CCR-9501761 and CCR-9804065, and by Siemens Corporate Research
Overview • Motivation • Theoretical model • Application for pointer alias analysis • Experimental results
Data-Flow Analysis • Information about program behavior • Defines: • Graph for the control-flow structure • Lattice L of data-flow values • Transfer functions f_i : L → L • Flow sensitivity: propagate data-flow values by respecting execution order of statements
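A minimal sketch of the three ingredients above, assuming a powerset lattice (join is set union) and a toy reaching-definitions problem. The solver name `solve_forward`, the node numbering, and the definition labels `d1`/`d2` are illustrative assumptions, not part of the original slides.

```python
from collections import deque

def solve_forward(succ, transfer, entry, entry_value):
    """Worklist solver over a control-flow graph.

    succ: node -> list of successor nodes (the graph)
    transfer: node -> function from frozenset to frozenset (the f_i)
    Lattice values are frozensets; join is union (an assumption of this sketch).
    Returns the value holding after each node.
    """
    nodes = set(succ) | {m for ms in succ.values() for m in ms}
    pred = {n: [] for n in nodes}
    for n, ms in succ.items():
        for m in ms:
            pred[m].append(n)
    out = {n: frozenset() for n in nodes}
    work = deque(sorted(nodes))
    while work:
        n = work.popleft()
        # Join the values flowing in from all predecessors.
        in_val = frozenset().union(*(out[p] for p in pred[n]))
        if n == entry:
            in_val |= entry_value
        new_out = transfer[n](in_val)
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succ.get(n, ()))
    return out

# Toy program: 1: x = 1 (d1); 2: if (...); 3: x = 2 (d2); 4: use x
X_DEFS = frozenset({"d1", "d2"})
succ = {1: [2], 2: [3, 4], 3: [4], 4: []}
transfer = {
    1: lambda s: (s - X_DEFS) | {"d1"},  # kills other definitions of x
    2: lambda s: s,
    3: lambda s: (s - X_DEFS) | {"d2"},
    4: lambda s: s,
}
reaching = solve_forward(succ, transfer, entry=1, entry_value=frozenset())
```

Flow sensitivity shows up in the result: only `d2` holds after node 3, while both definitions reach node 4 because the two paths are joined there.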
Limitations of Whole-Program Analysis • Data-flow analysis is traditionally designed as whole-program analysis • Precise analyses do not scale to large programs • Incomplete programs cannot be analyzed: e.g., programs with libraries • Information may be needed only for a small part of a large program
Fragment Data-Flow Analysis • Idea: analyze a program fragment instead of a whole program • Use summary information about the rest of the program • Advantages: • Analyze fragments of large programs • Analyze incomplete programs • Analyze only the “interesting part” of the program
Questions • What is the analysis structure? • What is the relationship to whole-program analysis? • How to define and ensure safety? • What factors affect analysis cost and precision?
Model of Whole-Program Analysis • Consider only flow-sensitive analysis • Interprocedural control-flow graph • Lattice L of data-flow values • Node transfer functions f_i : L → L • Solutions and safety • [Figure: interprocedural control-flow graph of a procedure, with entry, call, return, and exit nodes]
Fragment Analysis Structure • Input: fragment + whole-program information • Graph, fragment lattice, node transfer functions • Boundary nodes: entry, call, return • Boundary entry: summary value from the fragment lattice • Boundary call: summary function • [Figure: fragment and its boundary, with entry, call, return, and exit nodes]
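The structure above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: data-flow values are sets of alias pairs, join is union, and the boundary call summary conservatively adds a hypothetical flow-insensitive solution `FI_SOLUTION`. The function name `analyze_fragment` and all node names are invented for the example.

```python
def analyze_fragment(succ, transfer, entry_summary, call_summary):
    """Round-robin fixed-point iteration over the fragment graph only.

    entry_summary: boundary entry node -> summary value (frozenset)
    call_summary: boundary call node -> summary function; it stands in for
                  the effect of a callee that lies outside the fragment
    """
    nodes = sorted(set(succ) | {m for ms in succ.values() for m in ms})
    out = {n: frozenset() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # Boundary entries start from the summary value, not from bottom.
            in_val = entry_summary.get(n, frozenset())
            for p in nodes:
                if n in succ.get(p, ()):
                    in_val |= out[p]
            # Boundary calls use the summary function in place of the callee.
            f = call_summary.get(n) or transfer.get(n, lambda v: v)
            new = frozenset(f(in_val))
            if new != out[n]:
                out[n], changed = new, True
    return out

# Hypothetical whole-program flow-insensitive alias solution.
FI_SOLUTION = frozenset({("*p", "x"), ("*q", "y"), ("*q", "x")})

succ = {"entry": ["s1"], "s1": ["call_g"], "call_g": ["s2"], "s2": ["exit"]}
transfer = {"s1": lambda v: v | {("*p", "x")}}       # s1 is: p = &x
entry_summary = {"entry": frozenset({("*q", "y")})}  # facts assumed on entry
call_summary = {"call_g": lambda v: v | FI_SOLUTION} # g is outside the fragment

solution = analyze_fragment(succ, transfer, entry_summary, call_summary)
```

The callee `g` is never visited; everything it might do is covered by the summary function, which is where precision is traded for the ability to analyze the fragment alone.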
Fragment Analysis Safety • All possible containing programs: p ∈ Progs • Abstraction relation between whole-program values and fragment values: a related fragment value safely abstracts the whole-program value x • A safe solution safely abstracts the most precise whole-program solution for every p ∈ Progs • Sufficient requirements for analysis safety: transfer functions, boundary summaries
An Application • Initial whole-program flow-insensitive analysis • Fragment analysis input • Flow-insensitive solution • Call graph • Use flow-insensitive solution at the boundary • Two fragment pointer alias analyses
Pointer Alias Analysis • Aliases refer to the same memory location • Example: after p = &x; the pair (*p, x) is an alias • Whole-program flow- and context-sensitive analysis [Landi-Ryder] • Fixed and non-fixed locations: x, s.f, *p, p->g • Resolution of through-deref assignments • Example: *p = 0;
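A small sketch of what resolving a through-deref assignment such as `*p = 0;` involves: given a set of alias pairs, find the fixed locations the statement may modify. The helper names and the string encoding of locations are assumptions of this example, as is the rule that any name containing a dereference (`*` or `->`) is non-fixed.

```python
def is_fixed(name):
    """Assumed convention: names without a dereference are fixed locations."""
    return not name.startswith("*") and "->" not in name

def modified_fixed_locations(pointer, alias_pairs):
    """Fixed locations that the statement '*pointer = ...' may modify."""
    deref = "*" + pointer
    partners = set()
    for a, b in alias_pairs:
        # Alias pairs are unordered, so check both positions.
        if a == deref:
            partners.add(b)
        elif b == deref:
            partners.add(a)
    return {loc for loc in partners if is_fixed(loc)}

aliases = {("*p", "x"), ("*p", "*q"), ("*q", "y")}
targets = modified_fixed_locations("p", aliases)
```

Here `*p = 0;` resolves to `x` only. The pair `(*q, y)` does not make `y` a target, because the alias relation is not assumed to be transitive.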
Fragment Alias Analyses • Input: whole-program flow-insensitive solution • Flow-insensitive analysis: almost linear time [Steensgaard, Zhang-Ryder-Landi] • Basic analysis: assumptions at boundary • Extended analysis: include called procedures; no boundary calls
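To give a feel for why a flow-insensitive analysis can run in almost linear time, here is a simplified sketch in the spirit of unification-based analysis, built on union-find. It is not the cited algorithms: it handles only `p = &x` and `p = q`, ignores statement order entirely, and every class and method name is an assumption of the example.

```python
class Unifier:
    """Each name belongs to a class; each class points to at most one class."""

    def __init__(self):
        self.parent = {}
        self.pointee = {}  # class representative -> some member of pointee class

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Class that x points to, created on demand."""
        r = self.find(x)
        if r not in self.pointee:
            fresh = ("loc", len(self.parent))  # unique placeholder location
            self.pointee[r] = self.find(fresh)
        return self.find(self.pointee[r])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[a] = b
        ta, tb = self.pointee.pop(a, None), self.pointee.get(b)
        if ta is not None and tb is not None:
            self.union(ta, tb)  # merged classes must share one pointee class
        elif ta is not None:
            self.pointee[b] = ta

    def address_of(self, p, x):  # p = &x
        self.union(self.target(p), x)

    def copy(self, p, q):  # p = q
        self.union(self.target(p), self.target(q))

    def may_alias(self, p, q):  # may *p and *q be the same location?
        return self.target(p) == self.target(q)

u = Unifier()
u.address_of("p", "x")  # p = &x
u.address_of("q", "y")  # q = &y
u.copy("r", "p")        # r = p
before = (u.may_alias("p", "r"), u.may_alias("p", "q"))
u.copy("r", "q")        # r = q
after = u.may_alias("p", "q")
```

After `r = q`, the targets of `p` and `q` end up in one class even though neither was assigned to the other. That loss of precision is the price of the low cost, which is why such a solution is a plausible input for a more precise analysis of a fragment.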
Experiments • Sun Sparc-20, 75 MHz, 352 MB • 6 data programs: 8K - 25K LOC • 12 fragments: • Cohesive subsets of procedures implementing certain functionality • Size: 2%-22% of program size, median 7% • Resolved through-deref assignments • Metric: average number of modified fixed locations
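As an illustration of the metric named above, the following sketch averages the number of fixed locations per resolved through-deref assignment. The function name, statement labels, and data are invented for the example and are not measurements from the experiments.

```python
def average_modified(resolved):
    """Mean number of fixed locations a through-deref assignment may modify.

    resolved: statement label -> set of fixed locations it may modify
    """
    if not resolved:
        return 0.0
    return sum(len(locs) for locs in resolved.values()) / len(resolved)

# Hypothetical results for the same two statements under two analyses.
coarse = {"s1": {"x", "y"}, "s2": {"x", "y", "z", "w"}}
refined = {"s1": {"x"}, "s2": {"x", "y", "z"}}

coarse_avg = average_modified(coarse)
refined_avg = average_modified(refined)
```

Under this reading, a smaller average means the analysis resolved the assignments to fewer possible targets.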
Analysis Time • Flow-insensitive analysis • Range: 2-9 s • Median: 7 s • Basic analysis • Range: 18-99 s • Median: 52 s • Extended analysis • Range: 18-187 s • Median: 85 s
Summary • Fragment analysis as an alternative to whole-program analysis • Theoretical issues of safety and feasibility • Application using inexpensive whole-program analysis • Initial experiments • Extended analysis: significant precision increase at a practical cost • Ongoing work: scalability, incomplete programs
The New Lattice • What is the set of names? • Number of names should not depend on the size of the whole program • Each whole-program name is: • preserved • ignored • represented by a placeholder • One placeholder name per equivalence class
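One way to picture the three cases above is a mapping from whole-program names to fragment names. The slide does not say how each name is classified, so the rules below are assumptions made only for illustration: names used in the fragment are preserved, names with no equivalence class are ignored, and all other names share a placeholder with their class. The function name and the `ph_` prefix are hypothetical.

```python
def abstract_names(names, fragment_names, class_of):
    """Map each whole-program name to a fragment name, or None if ignored.

    fragment_names: names that occur in the fragment (preserved as-is)
    class_of: name -> equivalence class id, or missing if irrelevant
    """
    mapping = {}
    for name in names:
        if name in fragment_names:
            mapping[name] = name                      # preserved
        elif class_of.get(name) is None:
            mapping[name] = None                      # ignored
        else:
            mapping[name] = f"ph_{class_of[name]}"    # placeholder per class
    return mapping

names = ["p", "x", "g1", "g2", "g3", "tmp"]
fragment_names = {"p", "x"}
class_of = {"g1": 0, "g2": 0, "g3": 1}

mapping = abstract_names(names, fragment_names, class_of)
fragment_lattice_names = {v for v in mapping.values() if v is not None}
```

The number of resulting names is bounded by the fragment's own names plus the number of equivalence classes, which matches the goal that it should not grow with the size of the whole program.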