Connecting Task to Source Gail C. Murphy, Department of Computer Science, University of British Columbia Includes joint work with: Elisa Baniassad, University of British Columbia; David Notkin, University of Washington; Kevin Sullivan, University of Virginia
Change is inevitable... Once Upon a Time...
Overview of Talk • A Typical Estimation Task • Software Reflexion Model Technique • A Typical Reengineering Task • Conceptual Modules Technique • Partial and Approximate Techniques • Summary
A Typical Estimation Scenario • You are asked to provide, within five days, an estimate of the effort required to modify an implementation of a Unix operating system to page over a distributed network NetBSD Kernel Source Code
Boxology • Model of a Unix virtual memory subsystem drawn by a domain expert
1. State a High-Level Model • Syntactic • Multiple relations • “everyone has one or more”
2. Extract a Source Model • Use existing tools (e.g., cflow, Field, etc.) • Lightweight lexical source model extractor (Murphy/Notkin) • May contain multiple relations
3. State a Declarative Mapping
Source Model Entities        High-Level Model Entities
file=pager.c                 Pager
file=vm_map.*                VirtualAddressMaint.
dir=vm func=active           VMPolicy
• Name source model entities using: • physical and logical software structure • regular expressions • Many-to-many mapping
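The mapping step can be sketched in code. This is a minimal illustration, not the tool's map format or implementation: the class `DeclarativeMap`, its method names, and the substring-style regular expression matching are all assumptions made for the example. Each entry constrains any of dir, file, and func, and one source entity may land in several high-level entities.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical sketch of a declarative map; names and matching rules are assumptions. */
final class DeclarativeMap {

    /** A source model entity named by physical (dir, file) and logical (func) structure. */
    static final class SourceEntity {
        final String dir;
        final String file;
        final String func;

        SourceEntity(String dir, String file, String func) {
            this.dir = dir;
            this.file = file;
            this.func = func;
        }
    }

    /** One map entry; a null pattern places no constraint on that attribute. */
    private static final class Entry {
        final Pattern dir;
        final Pattern file;
        final Pattern func;
        final String target;

        Entry(String dir, String file, String func, String target) {
            this.dir = compile(dir);
            this.file = compile(file);
            this.func = compile(func);
            this.target = target;
        }

        private static Pattern compile(String regex) {
            return regex == null ? null : Pattern.compile(regex);
        }

        // Assumption: a pattern matches if it is found anywhere in the value.
        private static boolean accepts(Pattern p, String value) {
            return p == null || (value != null && p.matcher(value).find());
        }

        boolean matches(SourceEntity e) {
            return accepts(dir, e.dir) && accepts(file, e.file) && accepts(func, e.func);
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Adds an entry such as (dir=vm, func=active) mapped to VMPolicy. */
    DeclarativeMap add(String dir, String file, String func, String target) {
        entries.add(new Entry(dir, file, func, target));
        return this;
    }

    /** Many-to-many: returns every high-level entity whose entry matches. */
    Set<String> targetsOf(SourceEntity e) {
        Set<String> targets = new LinkedHashSet<>();
        for (Entry entry : entries) {
            if (entry.matches(e)) {
                targets.add(entry.target);
            }
        }
        return targets;
    }
}
```

With the three entries from the slide loaded, a function in `vm_map.c` under `vm` whose name contains `active` would map to both VirtualAddressMaint and VMPolicy, which is the many-to-many case.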
Iteration • Want to investigate the data relationships? • augment the source model • update the mapping: var=queue.*active maps to VMPolicy • recompute...
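The recompute step can be sketched as follows. In the published reflexion model work, the comparison classifies interactions as convergences (in both the high-level model and the source), divergences (in the source only), and absences (in the high-level model only). The class below is an illustrative assumption, not the tool's algorithm: the `ReflexionModel` name, the `"From -> To"` edge encoding, and the inputs are all hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of computing a reflexion model from mapped source interactions. */
final class ReflexionModel {
    final Set<String> convergences = new TreeSet<>();
    final Set<String> divergences = new TreeSet<>();
    final Set<String> absences = new TreeSet<>();

    private ReflexionModel() {
    }

    /**
     * @param highLevelEdges interactions the engineer expects, encoded as "From -> To"
     * @param calls          source model pairs {caller, callee}
     * @param mapping        source entity name to the high-level entities it maps to
     */
    static ReflexionModel compute(Set<String> highLevelEdges,
                                  List<String[]> calls,
                                  Map<String, Set<String>> mapping) {
        // Lift each source interaction to the high-level entities on both ends.
        Set<String> lifted = new TreeSet<>();
        for (String[] call : calls) {
            Set<String> froms = mapping.getOrDefault(call[0], Collections.<String>emptySet());
            Set<String> tos = mapping.getOrDefault(call[1], Collections.<String>emptySet());
            for (String from : froms) {
                for (String to : tos) {
                    lifted.add(from + " -> " + to);
                }
            }
        }

        ReflexionModel result = new ReflexionModel();
        for (String edge : lifted) {
            if (highLevelEdges.contains(edge)) {
                result.convergences.add(edge);   // expected and found
            } else {
                result.divergences.add(edge);    // found but not expected
            }
        }
        for (String edge : highLevelEdges) {
            if (!lifted.contains(edge)) {
                result.absences.add(edge);       // expected but not found
            }
        }
        return result;
    }
}
```

Source entities with no map entry are skipped in this sketch, so each iteration (augment the source model, update the mapping, recompute) changes which interactions are lifted.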
Excel: Experimental Reengineering • A Microsoft engineer computed Reflexion Models several times a day for four weeks • 120,000 calls and global variable references • map file with over 1000 entries • high-level model with 15 entities and 96 interactions • 4 minutes to compute on a 486 • Some lessons learned: • map files evolved to be larger than expected • scale places pressure on managing the information
Other Features... • Family of reflexion model systems • Parameterized by structural descriptions • Incremental computation algorithms • Typed model • Tagging and annotations to manage investigation • Used for a variety of tasks
Reengineering Scenario... Procedure main: Input Pipe Procedure sort:
Program Database • Identify variables of interest • For each variable • where is the variable declared? • where is the variable referenced? • Collate results • Repeat
Slicer • Compute backward slices on variables in pre-identified lines of code
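A backward slice can be viewed as reverse reachability over a dependence graph. The sketch below assumes the data and control dependences between lines have already been computed and are passed in as a map; a real slicer derives them from the program. The `BackwardSlicer` name and the line-number representation are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: backward slicing as reachability over precomputed dependences. */
final class BackwardSlicer {

    private BackwardSlicer() {
    }

    /**
     * @param dependsOn line number to the lines it directly depends on (data or control)
     * @param criterion the pre-identified lines to slice from
     * @return every line that can affect the criterion lines, including the criterion itself
     */
    static Set<Integer> slice(Map<Integer, Set<Integer>> dependsOn, Set<Integer> criterion) {
        Set<Integer> inSlice = new TreeSet<>(criterion);
        Deque<Integer> worklist = new ArrayDeque<>(criterion);
        while (!worklist.isEmpty()) {
            Integer line = worklist.pop();
            for (Integer dep : dependsOn.getOrDefault(line, Collections.<Integer>emptySet())) {
                // Set.add returns true only the first time a line is seen.
                if (inSlice.add(dep)) {
                    worklist.push(dep);
                }
            }
        }
        return inSlice;
    }
}
```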
Type Inferencer • Determine constraints on the representation of values • Can be used to identify abstract data types, detect abstraction violations, find unused variables, and determine where there are possible references to a value • The Lackwit [O’Callahan & Jackson 97] tool produces graphs summarizing how values are transmitted through a program
Software Reflexion Model • Difficult to ascertain interface of the module • No support for querying the source model • Syntactic comparison
Forming a Conceptual Module • Map lines of code to a logical module • Two ways to map the code: • by specifying line numbers (individual, ranges, etc.) • by specifying pieces of existing logical structure (i.e., variables or procedures) • Each module has a name • Formation can be iterative • For sort, we ended up including about 24 lines in the input pipe conceptual module.
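Forming a module from line numbers or from existing logical structure might look like the sketch below. Everything here is an assumption for illustration: the `ConceptualModule` class, the `file:line` encoding, and the `procedureExtents` table that stands in for whatever the source model records about where procedures start and end. The two set checks at the end correspond to the overlap and contains relationships that appear later in the talk.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a named conceptual module as a set of source lines. */
final class ConceptualModule {
    final String name;
    private final Set<String> lines = new TreeSet<>();   // entries look like "sort.c:248"

    ConceptualModule(String name) {
        this.name = name;
    }

    /** Map a single line into the module. */
    ConceptualModule addLine(String file, int line) {
        lines.add(file + ":" + line);
        return this;
    }

    /** Map an inclusive range of lines into the module. */
    ConceptualModule addRange(String file, int first, int last) {
        for (int line = first; line <= last; line++) {
            addLine(file, line);
        }
        return this;
    }

    /** Map a whole procedure, given a table of procedure name to {firstLine, lastLine}. */
    ConceptualModule addProcedure(String file, String procedure, Map<String, int[]> procedureExtents) {
        int[] extent = procedureExtents.get(procedure);
        if (extent == null) {
            throw new IllegalArgumentException("unknown procedure: " + procedure);
        }
        return addRange(file, extent[0], extent[1]);
    }

    int size() {
        return lines.size();
    }

    /** True if the two modules share at least one line. */
    boolean overlaps(ConceptualModule other) {
        return !Collections.disjoint(lines, other.lines);
    }

    /** True if every line of the other module is also in this one. */
    boolean containsModule(ConceptualModule other) {
        return lines.containsAll(other.lines);
    }
}
```

Because the methods return the module itself, formation can proceed iteratively: add a range, inspect the result, then add or refine further lines.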
Interface Analysis
Input Variables: sortalloc, main.ofp, main.minus, etc.
Output Variables: main.mergeonly, sort.ofp, sortalloc, etc.
Local Variables: main.files, main.nfiles, sort.files
Control Transfers: xmalloc at sort.c 1796, fillbuf at sort.c 248, etc.
• Local (interface) analysis is used to summarize how the module interacts directly with the existing code
Interface Analysis... • Interface analysis is straightforward. One twist is that the analysis is set up to be tolerant of the source model. • Source model consists of: • variable dependence relation (may be either use-def pairs or uses & defs) • control transfer relation • procedure start relation • Two-phase analysis for local variables: 1. Use-def pairs: if all uses & defs are in the module, the variable is local. 2. Uses & defs: consider the variable as input/output; promote to local if all uses and defs are in the module.
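The use-def pair case can be sketched as a classification over pairs. This is an assumed reading of the slide, not the tool's code: a variable is treated as an input when it is defined outside the module and used inside, as an output when defined inside and used outside, and as local when every pair that mentions it lies entirely inside the module. The class names and the single-file line-number representation are hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of interface analysis over use-def pairs. */
final class InterfaceAnalysis {

    /** One use-def pair: a variable defined at defLine reaches a use at useLine. */
    static final class UseDef {
        final String variable;
        final int defLine;
        final int useLine;

        UseDef(String variable, int defLine, int useLine) {
            this.variable = variable;
            this.defLine = defLine;
            this.useLine = useLine;
        }
    }

    final Set<String> inputs = new TreeSet<>();
    final Set<String> outputs = new TreeSet<>();
    final Set<String> locals = new TreeSet<>();

    private InterfaceAnalysis() {
    }

    static InterfaceAnalysis analyze(List<UseDef> pairs, Set<Integer> moduleLines) {
        InterfaceAnalysis result = new InterfaceAnalysis();
        Set<String> touched = new TreeSet<>();     // variables with a def or use in the module
        Set<String> escaping = new TreeSet<>();    // variables with a def or use outside it
        for (UseDef pair : pairs) {
            boolean defInside = moduleLines.contains(pair.defLine);
            boolean useInside = moduleLines.contains(pair.useLine);
            if (!defInside && useInside) {
                result.inputs.add(pair.variable);
            }
            if (defInside && !useInside) {
                result.outputs.add(pair.variable);
            }
            if (defInside || useInside) {
                touched.add(pair.variable);
            }
            if (!defInside || !useInside) {
                escaping.add(pair.variable);
            }
        }
        // Local: mentioned by the module and never defined or used outside it.
        result.locals.addAll(touched);
        result.locals.removeAll(escaping);
        return result;
    }
}
```

A variable can appear as both an input and an output under this reading, which matches sortalloc being listed in both groups on the earlier slide.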
Querying about Conceptual Modules • Once one or more conceptual modules are formed, the re-engineer typically needs to perform queries: • How do the Conceptual Modules relate to each other? • How do the Conceptual Modules relate to the existing source? • The tool provides both pre-coded queries as well as a programmable interface through which a user can code queries.
Conceptual Module Relationships (figure: modules A and B shown in four relationships: direct def-use, indirect def-use, overlap, contains)
Programmable Interface
    SET common = new SET();
    // Get the use-def chains for all input and local variables
    // of that module.
    Module first = (Module)Module.ModuleTable.elementAt(0);
    common = DefUse.GetFullUseDefChain(first);
    for (int i = 1; i < Module.ModuleTable.size(); i++) {
        // Get the use-def chains for the next module
        Module current = (Module)Module.ModuleTable.elementAt(i);
        SET curr_chain = DefUse.GetFullDefUseChain(current);
        // Intersect the chains to determine common definition points
        common = DefUse.INTERSECTION(common, curr_chain);
    }
    common.print();
Experience (figure legend) • SUIF = tools built on SUIF; provides use-def pairs • xrefdb = Field’s xrefdb; provides uses & defs
Query Context and Form • Two parts to expressing context: • identify region of source over which to query • restrict the region for which results are reported • Conceptual Module identifies region and interface analysis summarizes local results • Form includes both input and output: • some tasks require queries over grouped items • report results in terms of task • can use Conceptual Module structure to query against source; results are reported in terms of target structure
Partial and Approximate Techniques • Each of these characteristics can be an effective way to attack scale. • These characteristics can be combined to provide software engineers with a “smoother” means of managing source investigations. • Bottom line for most developments is that time is money. (Figure labels: conservative, approximate.)
Task Summary
Software Reflexion Model: “Definitely confirmed suspicions about the structure of Excel. Further, it allowed me to pinpoint the deviations. ... very easy to ignore stuff that is not interesting and thereby focus on the part of Excel I want to know more about.” (Microsoft Engineer)
Conceptual Module: “not only did the tool verify the independent nature of the ZDD functionality and allow me to rip out all that code, but, the process of using your tool forced me to analyze and understand the code in a way that I had not been doing and that ultimately it very quickly gave me the perspective I needed.” (Yvonne Coady)
Summary... • demonstrated benefits of task-aware program understanding techniques • current techniques are structurally task-aware • demonstrated role for approximate information • reflexion model technique makes the engineer responsible • conceptual modules technique takes on some of the responsibility • goal is to get to “what-if” tools that would allow engineers to leverage, cost-effectively, connections between design and source