300 likes | 418 Views
Making Slicing Practical: The Final Mile. William Griswold University of California, San Diego in collaboration with Leeann Bent (UCSD) & Darren Atkinson (Santa Clara University) special thanks to GrammaTech Inc . The Conundrum.
E N D
Making Slicing Practical: The Final Mile William GriswoldUniversity of California, San Diegoin collaboration with Leeann Bent (UCSD) & Darren Atkinson (Santa Clara University) special thanks to GrammaTech Inc. PASTE 2001
The Conundrum • Program slicing has been an archetype of SE analysis for 20 years [Weiser 80] • reveal hidden or dispersed program relationships • assist in reasoning about and changing them • Yet program slicers are not widely used • Why not? • Then: implementations were too slow and imprecise • Now: inadequate attention to essential SE needs
Background Static Backward Program Slice • Set of expressions and statements that may affect the value of a chosen variable referenceduring execution. [Weiser 81] void setg(int a1, int a2) { int x = 1; g = a1; if (pred(a2)) { g = x; } printf(“%d”, g); } initial slicing criterion
Benefits & Applications of Slicing Key value is its subsetting property: helps a programmer focus on the parts of the system relevant to current task • Debugging: What subset of statements might have helped produce the wrong value for g? • Evolution: What components are inputs to the one I’m evolving? Does this change break that feature? • Testing: I changed this statement, what subset of the tests do I need to rerun? Underlying dependences are also of potential value
Freddie Krueger, Tool User “Your tool can solve all sorts of problems for us. But it’ll have to analyze our entire 1 MLOC program, which is written in 4 languages and doesn’t compile right now. I want the results as fastas compilation, with an intuitive graphical display linked to the source and integrated into our IDE. I want to save the results, and have them automatically updated as I change the program. By the way, I use Windows, but some of my colleagues use Unix. It’s OK if the tool misses stuff or returns lots of data, as long as we can post-process. We just need a net win.” Our most recent study involved a 500 KLOC Fortran/C app developed on SGI’s
Fast, scalable Easy to use Flexible (caps, words…) Language independent Integrated in every env. Imprecise Allows iteration & filtering Used in many activities Renaming, restructuring, finding change points… Slow, doesn’t scale Easy to use Forward, backward, chop C; Java coming Stand-alone Precise, but slices large Allows iteration & filtering Several postulated uses grepSlicer ?
Our Experience on the Final Mile • Several years designing Sprite, a fast, scalable program slicer for C • Comparative study [2000] with GrammaTech’s CodeSurferslicerfor C, a commercial product • Other tools: design, implementation, anduser studies
Slicing Criterion Sprite 1.2.2
Why We’re Still on the Final Mile Little understanding of programmers’ needs, tasks, or how they would use a slicer • Slicers suffer from inflexibility & poor usability • User interfaces are in formative stages • Several opportunities insufficiently explored
Part 1 Usability and Flexibility • Wrote few dozen “feature benchmarks” • Unstructured c-flow, context sensitivity, pointers... • Expose intervening, interacting algorithmic factors • 5 small production-quality programs • compress, wally, ispell, ed, diff • A few larger programs using Sprite • gcc, emacs, FEM app • Varied slicer options, studied results
Uninitialized Pointers int *p; // uninitialized ... *p = x; y = *p; • Neither tool included x in the slice • No warning • Misleading (say, if you’re debugging)
Undefined Functions x = f(&y,z); • Sprite: included call, but did not propagate • CodeSurfer: also propagated to &y and z • No propagation to zif slicing on y • Both printed warning to terminal (not GUI) • Easy to not notice • Easy to forget a library or have undefined function
Library Modeling • Both modeled effects of libraries with skeletal functions int write(int fd, char *s) { return (*filesystem[fd]++ = *s);} • Onlylibc and libm, both incomplete • Only a few undefined functions for our programs • CodeSurfer’s noticeably better, more complex • Missing could impact precision, effort, or perf. • What if need Xlib???
Slicing into Callers setg(a, b); ... } void setg(int a1, int a2) { int x = 1; ... • Sprite: could not slice into callers • CodeSurfer: must slice into callers Neither customizable to produce other’s results (CodeSurfer did support single-step slicing and slicing on just control- or data-dependences)
Control Dependence Sensitivity • “Spurious” dependence can bloat slice by x10 • Indirection through array of function pointers • Sprite’s “strongly typed function calls” feature fixed gcc problems, not emacs • Modelling of control dependences (CodeSurfer) switch (ch) { case ‘a’: if (pred(e)) return; g = x + 1; break; case ‘b’: g = x + 2; break; } FuncType ops[N] = {&f1, &f2, …} ... (*ops[i])(a1, a2);
Statement Inclusion & Highlighting • Sprite: no global decls, uninitialized locals, control flow keywords • Included goto’s only if option enabled • CodeSurfer: “executable slice” highlighting • Includes syntactic sugar Right choice depends on user’s task • Omitting a statement could lead to oversight • Overwhelming with highlighting can, too
Understanding Slicing Results • Hard: “Why is this statement in the slice?” • Forgotten or incomplete libraries • “Style” of slice (into callers, declarations) • Real dependence or algorithmic artifact? • Control, data, pointer, ... • Gap betw. highlighting & underlying dependences • Querying dependence edges, points-to sets unhelpful • Answer critical to correct software change • We modified: comment out pointer assignment • Must rebuild, reslice, and compare results by hand
Summary - The Bad News • Tools had “hidden” behaviors on erroneous or incomplete programs • Could mislead programmer, hampering use • Tools had rigid notion of a slice • Each suited for some tasks, but not others • Tools required significant effort to use • Completeness of libraries • Understanding reasons for inclusion in slice Implementations of algorithms, not full tools
Summary - The Good News • Both tools supported (potentially) inaccurate analyses to remove uninteresting information • Sprite: user customization of pointer properties • CodeSurfer: I/O functions that aren’t interdependent • Suggests generalizing slicing as an SE analysis • Unfortunately, manipulating analysis can be costly • Variance between tools exposes tool options • Suggests user-customizable features • E.g., highlighting options: decls versus no-decls
Part 2 The Human–Computer Interface • Good user interfaces are crucial to effectiveness • Leverage & aid programmer’s cognitive processes • A key difference between an algorithm and a tool • Example for evolution (what I understand) • Tremendous scale • Wide dispersal of information: redundancy, many references to an element • Requirement for complete and consistent changeof the dispersed elements (e.g., rename) • Invisibility of far-away information stresses recall
Example - grep & Company • Easy to use and fast • Takes a textual pattern and target for search grep ”cursor” *.java • Lists all matching lines and highlights in editor • Next/Previous operations traverse the matches • Scalability issues: limited visibility, inflexible • Can see just a few matches in context of use • One search at a time; cannot compare results
File strips Lines matched by pattern Lines with two or more matches Maps Enhance Visibility • Highly evolved artifact for coping with scale • Spatial organization, zooming, cursors, insets, indices, folding, itineraries • Idea: augment grep with Seesoft-like view • Delineated regions to represent files • Colored symbols for matched lines in files • Puts matches on an equal footing with modules • See only portion of program, little of value • Use map metaphor fully: Aspect Browser See only portion of program, little of value Use map metaphor fully: Aspect Browser
“You are here” “Folding” cue that files are hidden Views updated on save Now extending with Atlas metaphor to accommodate vast scale
Traversal operations One-file “Seesoft” summary Cue of underlying hits CodeSurfer GUI is half-way there
Part 3: Opportunities Complements • Slice explainer • High-level analysis of why statement is in slice • May/Must highlighting • Overlay slice with highlighting dynamic slices or execution coverage • Uncovered stmts, dependences hint at imprecision • Filtering w/ other analyses and customizations • CodeSurfer has chopping; Sprite set operations • grep, Ajax tools [O’Callahan] • PIM-like customizations [Field] • With small run-time impact
Emerging Challenges • Multi-threading [Hatcliff et al.], exceptions • intrinsically flow-sensitive, so flow-insensitive pointer analysis is substantial loss [Rugina] • What is a program? What is the program? • Federated client-server apps • Often multi-language • Trend is to write component glue • Less source code, but huge apps • How treat vast, numerous packages? Sans source? • Analysis via Java library byte codes costly [O’Callahan] • Vast amount of stub code to write for libraries
Long-Standing Issues • Robustness - incomplete or buggy programs • Useful for developing and evolving systems • Complicates the analysis • Harder for programmer to interpret results • Integration into IDE’s [Hayes 2000] • Work with IDE’s GUI and AST, or else bloat • Reuse analyses across IDE’s • In principle not hard • In practice, performance/precision tradeoffs can require big rewrites for “small” language change
Recommendations & Future Work • Explore non-algorithmic aspects of precision • Attend to users’ tasks and habits • Giving users what they need is “precision” • Do an observational study of programmers • Essential problems: complex tasks of programmers • Accidental problems: kludgy user interfaces… • Listen to Michael’s talk tomorrow ;)
Recap • 20 years of slicing research • Huge strides in algorithmic precision & performance • Little adoption • “Non-algorithmic” factors like libraries have significant impact on precision • Cannot ignore, now that libs dominate applications • Remaining problem is usefulness of slicers • Confusing behaviors, inflexibility, and need forcostly “programming” are barriers to adoption • Understanding programmer’s use of dependences