Program Comprehension through Dynamic Analysis: Visualization, evaluation, and a survey Bas Cornelissen (et al.), Delft University of Technology IPA Herfstdagen, Nunspeet, The Netherlands, November 26, 2008
Context • Software maintenance • e.g., feature requests, debugging • requires understanding of the program at hand • up to 70% of maintenance effort is spent on the comprehension process ⇒ Support program comprehension
Definitions Program Comprehension • “A person understands a program when he or she is able to • explain the program, its structure, its behavior, its effects on its operational context, and its relationships to its application domain • in terms that are qualitatively different from the tokens used to construct the source code of the program.”
Definitions (cont’d) Dynamic analysis • the analysis of the properties of a running software system • Typical workflow: unknown system (e.g., open source) → instrumentation (e.g., using AspectJ) → scenario execution → (too) much data • Advantages • precision • goal-oriented • Limitations • incompleteness • scenario-dependence • scalability issues
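To illustrate the instrumentation → scenario execution → data pipeline, here is a minimal sketch in Python, where sys.settrace stands in for the bytecode-level instrumentation (the work itself targets Java, e.g., via AspectJ); all function names are invented for illustration:

```python
import sys

events = []  # call events collected during scenario execution

def tracer(frame, event, arg):
    # record the name of every Python function that is entered
    if event == "call":
        events.append(frame.f_code.co_name)
    return tracer

def helper():          # illustrative application code
    return 21

def scenario():        # the execution scenario being traced
    return helper() + helper()

sys.settrace(tracer)   # "instrument" the running program
scenario()             # execute the scenario
sys.settrace(None)     # stop tracing

print(events)          # → ['scenario', 'helper', 'helper']
```

Even this tiny scenario yields one event per call, which is how real executions quickly produce the "(too) much data" noted above.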
Outline • Literature survey • Visualization I: UML sequence diagrams • Comparing reduction techniques • Visualization II: Extravis • Current work: Human factor • Concluding remarks
Why a literature survey? • Numerous papers and subfields • last decade: many papers annually • Need for a broad overview • keep track of current and past developments • identify future directions • Existing surveys (4) do not suffice • scopes restricted • approaches not systematic • collective outcomes difficult to structure
Characterizing the literature • Four facets • Activity: what is being performed/contributed? • e.g., architecture reconstruction • Target: to which languages/platforms is the approach applicable? • e.g., web applications • Method: which methods are used in conducting the activity? • e.g., formal concept analysis • Evaluation: how is the approach validated? • e.g., industrial study
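To make the four-facet scheme concrete, a single hypothetical survey entry could be encoded as follows; every attribute value here is an invented example, not a claim about any specific paper:

```python
# One surveyed article, characterized along the four facets.
characterization = {
    "activity":   "feature location",
    "target":     "Java",
    "method":     "formal concept analysis",
    "evaluation": "industrial study",
}

# A survey row is complete when every facet has a value.
facets = ("activity", "target", "method", "evaluation")
complete = all(characterization.get(f) for f in facets)
print(complete)  # → True
```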
Characterization • (table of characterized articles not reproduced)
Survey results • Least common activities • surveys, architecture reconstruction • Least common target systems • multithreaded, distributed, legacy, web • Least common evaluations • industrial studies, controlled experiments, comparisons
UML sequence diagrams • Goal • visualize test case executions as sequence diagrams • provides insight into functionality • accurate, up-to-date documentation • Method • instrument the system and test suite • execute the test suite • abstract from “irrelevant” details • visualize as sequence diagrams
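The trace-to-diagram step can be sketched as follows: a hypothetical list of call events is rendered as PlantUML sequence-diagram text, with a toy abstraction that collapses consecutive duplicate calls (the reductions used in the actual work are more sophisticated; class and method names are invented):

```python
# Hypothetical trace of (caller, callee, method) events, as might be
# produced by an instrumented test case run.
trace = [
    ("TestPacman", "Game",  "start"),
    ("Game",       "Board", "reset"),
    ("Game",       "Board", "reset"),   # consecutive duplicate
    ("Game",       "Score", "clear"),
]

def to_plantuml(events):
    """Render call events as a PlantUML sequence diagram,
    abstracting from consecutive duplicate calls."""
    lines = ["@startuml"]
    previous = None
    for caller, callee, method in events:
        if (caller, callee, method) == previous:
            continue  # simple abstraction: drop immediate repeats
        lines.append(f"{caller} -> {callee} : {method}()")
        previous = (caller, callee, method)
    lines.append("@enduml")
    return "\n".join(lines)

print(to_plantuml(trace))
```

The resulting text can be fed to the PlantUML renderer to obtain an actual diagram.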
Evaluation • JPacman • Small program for educational purposes • 3 KLOC • 25 classes • Task • Change requests • addition of “undo” functionality • addition of “multi-level” functionality
Evaluation (cont’d) • Checkstyle • code validation tool • 57 KLOC • 275 classes • Task • Addition of a new check • which types of checks exist? • what is the difference in terms of implementation?
Results • Sequence diagrams are easily readable • intuitive due to chronological ordering • Sequence diagrams aid in program comprehension • support maintenance tasks • Proper reductions/abstractions are difficult • reducing 10,000 events to 100 is possible, but at what cost?
Results (cont’d) • Reduction techniques: issues • which one is “best”? • which are most likely to lead to significant reductions? • which are the fastest? • which actually abstract from irrelevant details?
Trace reduction techniques • Input 1: large execution trace • up to millions of events • Input 2: maximum output size • e.g., 100 for visualization through UML sequence diagrams • Output: reduced trace • was the reduction successful? • how fast was the reduction performed? • has relevant data been preserved?
Example technique: stack depth limitation [metrics-based filtering] • requires two passes: (1) determine the depth frequencies and the resulting maximum depth, (2) discard events above that depth • example: a 200,000-event trace with depth frequencies 28,450 (depth 0), 13,902 (depth 1), 58,444, 29,933, 10,004, ... and a maximum output size (threshold) of 50,000 is reduced to the 42,352 events at depth ≤ 1
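The two-pass procedure can be sketched as follows; this is a simplified reading of the technique, and the variable names and the toy trace are invented:

```python
from collections import Counter

def limit_stack_depth(trace, max_output_size):
    """Two-pass stack depth limitation: keep only events at or below
    the largest depth whose cumulative frequency fits the budget."""
    # Pass 1: determine how many events occur at each stack depth.
    freq = Counter(depth for depth, _ in trace)
    # Choose the maximum depth such that the kept events still fit.
    total, max_depth = 0, -1
    for depth in sorted(freq):
        if total + freq[depth] > max_output_size:
            break
        total += freq[depth]
        max_depth = depth
    # Pass 2: discard events deeper than the chosen maximum depth.
    return [e for e in trace if e[0] <= max_depth]

# Toy trace of (depth, event) pairs; real traces hold up to millions.
trace = [(0, "a"), (1, "b"), (2, "c"), (1, "d"), (2, "e"), (3, "f")]
print(limit_stack_depth(trace, 4))  # → [(0, 'a'), (1, 'b'), (1, 'd')]
```

With a budget of 4, depths 0 and 1 together hold 3 events while adding depth 2 would exceed the budget, so everything below depth 1 is discarded, mirroring the 200,000 → 42,352 example above.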
How can we compare the techniques? • Use: • common context • common evaluation criteria • common test set Ensures fair comparison
Approach • Assessment methodology • Context: need for high-level knowledge • Criteria: reduction success rate; performance; information preservation • Metrics: output size; time spent; preservation % per type • Test set: five open source systems, one industrial • Application: apply reductions using thresholds 1,000 through 1,000,000 • Interpretation: compare side-by-side
Techniques under assessment • Subsequence summarization [summarization] • Stack depth limitation [metrics-based] • Language-based filtering [filtering] • Sampling [ad hoc]
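A side-by-side application of three of these techniques to one toy trace, comparing output sizes (one of the criteria; time spent and information preservation are measured analogously in the actual assessment); the trace, package names, and parameter choices are all invented for illustration:

```python
# Toy trace of (depth, qualified event name) pairs.
trace = [(d, f"pkg{d % 2}.event{i}") for i, d in
         enumerate([0, 1, 2, 1, 2, 3, 1, 0, 1, 2])]

def depth_limit(t, max_depth):
    # Stack depth limitation: drop events deeper than max_depth.
    return [e for e in t if e[0] <= max_depth]

def language_filter(t, excluded_prefix):
    # Language-based filtering: drop events from an excluded package.
    return [e for e in t if not e[1].startswith(excluded_prefix)]

def sample(t, step):
    # Ad hoc sampling: keep every step-th event.
    return t[::step]

for name, reduced in [
    ("stack depth limitation", depth_limit(trace, 1)),
    ("language-based filtering", language_filter(trace, "pkg0")),
    ("sampling", sample(trace, 2)),
]:
    print(f"{name}: {len(trace)} -> {len(reduced)} events")
```

Running all techniques on a common test set like this, with common criteria, is what makes the side-by-side comparison fair.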
Extravis • Execution Trace Visualizer • collaboration with TU/e • Goal • program comprehension through trace visualization • trace exploration, feature location, ... • address scalability issues • millions of events, for which sequence diagrams are not adequate
Evaluation: Cromod • Industrial system • Regulates greenhouse conditions • 51 KLOC • 145 classes • Trace • 270,000 events • Task • Analysis of fan-in/fan-out characteristics
Evaluation: JHotDraw • Medium-size open source application • Java framework for graphics editing • 73 KLOC • 344 classes • Trace • 180,000 events • Task • feature location • i.e., relate functionality to source code or trace fragment
Evaluation: Checkstyle • Medium-size open source system • code validation tool • 73 KLOC • 344 classes • Trace: 200,000 events • Task • formulate hypothesis • “typical scenario comprises four main phases” • initialization; AST construction; AST traversal; termination • validate hypothesis through trace analysis
Motivation • Need for controlled experiments in general • measure impact of (novel) visualizations • Need for empirical validation of Extravis in particular • only anecdotal evidence thus far • Measure usefulness of Extravis in • software maintenance • does runtime information from Extravis help?
Experimental design • Series of maintenance tasks • from high level to low level • e.g., overview, refactoring, detailed understanding • Experimental group • ±10 subjects • Eclipse IDE + Extravis • Control group • ±10 subjects • Eclipse IDE
Concluding remarks • Program comprehension: important subject • make software maintenance more efficient • Difficult to evaluate and compare • due to human factor • Many future directions • several of which have been addressed by this research
Want to participate in the controlled experiment..? • Prerequisites • at least two persons • knowledge of Java • (some) experience with Eclipse • no implementation knowledge of Checkstyle • two hours to spare between December 1 and 19 • Contact me: • during lunch, or • through email: s.g.m.cornelissen@tudelft.nl