220 likes | 334 Views
Program Slicing on Java byte-code for Locating Functional Concerns. Takashi Ishio † Ryusuke Niitani † Gail Murphy ‡ Katsuro Inoue † † Osaka University, Japan ‡ University of British Columbia, Canada. Concern Location.
E N D
Program Slicing on Java byte-code for Locating Functional Concerns Takashi Ishio† Ryusuke Niitani † Gail Murphy‡ Katsuro Inoue † † Osaka University, Japan ‡ University of British Columbia, Canada
Concern Location • A functional concern is code that helps fulfill a functional requirement. • A software maintenance task usually focuses on a functional concern. • Concern location comprises “Search and Explore.” • Search “interesting” methods • grep or other feature location tools • Explore the interaction among the methods • call graph, class hierarchy tree, cross reference
Example: Autosave function in jEdit • jEdit periodically saves the contents of text area. • A user can specify the frequency. • We can easily find Autosave class, Buffer.autosave() method and BufferIORequest.autosave() method. • How the classes and methods are interacting?
Exploring Interaction among methods • Important information: control-flow and data-flow. • Which method triggers the autosave function. • Which class has a necessary data (e.g. filename). • How a method saves the contents to a text file. • We have to read following classes: Autosave, Buffer, BufferIORequest, PerspeciveManager, VFSManager, FileVFS …
Automated Concern Location • We are trying to extract a concern graph from code fragments specified by a developer. • Our approach is based on program slicing. • Our tool is based on Soot, a Java bytecode analysis framework. Code fragments related to a functionality Program Slicing with Heuristics a program slice Slice-to-ConcernGraph Translation A concern graph
Autosave concern graph Input = Autosave.*(), Buffer.autosave(), BufferIORequest.autosave()
i = 3; • if (a > 0) { • print i; • } definition data dependence control dependence <3,i> use Program Slicing • Slicing extracts statements related to criteria statements specified by a user. • A program P is converted to a program dependence graph. • vertices: statements in P • edges: control/data dependence relations • A user specifies “slicing criteria” statements in P. • The statements are translated into “criteria vertices” in the PDG. • A program slice, a set of statements that affect or depend on criteria,is extractedby graph traversal from criteria vertices.
Slice including unrelated concerns • Slicing usually extracts many statements. • A functional unit is connected to other units by control/data-flow. • 28% on average in C program† slicing autosave_dirty flag activate set/reset Autosave UndoManager reset set CompleteWord † Binkley, D., Gold, N. and Harman, M.: An Empirical Study of Static Program Slice Size. ACM TOSEM Vol.16, No.2, Article 8, April 2007.
Slicing with Barriers • A barrier is a vertex or an edge that terminate graph traversal†. A barrier blocks graph traversal. slicing autosave_dirty flag activate set/reset Autosave UndoManager reset set CompleteWord † Krinke, J.: Slicing, Chopping, and Path Conditions with Barriers. Software Quality Journal, Vol.12, No.4, pp.339-360,December 2004.
Similarity-based Barrier • The key idea is following: if two methods are contributing to the same functionality, the methods use similar methods, fields and classes. Name Set NS(m) = a set of types, classes, methods and fields referred in m. A long name is “tokenized”. e.g. “java.io.File” “java”, “io”, “File”, “java.io.File”
Example of Similarity package org.gjt.sp.util; class IntegerArray { private int[] array; private int len; public void add(int num) { if(len >= array.length) { int[] arrayN = new int[len * 2]; System.arraycopy(array,0,arrayN,0,len); array = arrayN; } array[len++] = num; } public final int getSize() { return len; } public final void setSize(int len) { this.len = len; } } NS(IntegerArray.add) org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, void, add, int, len, int[], java.lang.System, java, lang, system, arraycopy sim = 0.639 NS(IntegerArray.getSize) org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, getSize, get, size, int, len NS(IntegerArray.setSize) sim = 0.801 org.gjt.sp.util.IntegerArray, org, gjt, sp, util, integer, array, len, setSize, set, size, void, int
Identifying Barriers • Program slicing is blocked at method m if m is not related to slicing criteria Similarity(m, C) ≦ threshold • A method m is related to slicing criteria if slicing criteria includes a method n such that m is similar to n. C = a set of methods that contain slicing criteria vertices.
Slicing algorithm Code fragments related to a functionality • Slicing with summary edges and barriers • defined by Horwitz • extended by Krinke • PDG based on Jimple code • “jimple” is an intermediate representation for bytecode. • 3-address code • Simple control-flow: “if” and “goto” • Independent of JVM stack operation Calculate similarity for each method Identify barriers Slicing with Barriers a program slice
Visualizing a slice as a concern graph • Concern Graph • A vertex is a class, a method or a field. • An edge represents a relation between two vertices. • call, create, check, read, write, superclass, … • We applied rule-based translation.† Slice Concern Graph v2 in m2 call v1 in m1 m2 m1 call or parameter v1 in m1 read field m1 READ obj.field † Kameda, D. and Takimoto, M.: Building Cocnern Graph Based on Program Slicing. IPSJ Transactions on Programming, Vol.46, No.11 (Pro 26), pp.45-56. in Japanese.
A graphical output with Graphviz • We omit intra-class edges in graphical format. • Detail is provided in textual format. e.g. “Autosave.setInterval(interval) calls new Timer(interval, Autosave).”
The effectiveness of barriers • Barriers reduced concern graph size: • 1000 methods 20 methods • Printable on an A3 or A4-sized paper • Comparing extracted graphs with hand-made concern graphs (not finished yet). concern graph size on 6 maintenance tasks on jEdit and our Slicer Our previous experiment is reported in: 仁井谷竜介,石尾隆,井上克郎: プログラムスライシングを用いた機能的 関心事の抽出手法の提案と実装. PPL 2007. in Japanese.
Information extracted from Java program • To construct a dependence graph • Control dependence relation • Data dependence relation • Call Graph (with dynamic binding information) • To identify barriers • a set of types, methods, fields referred in each method m • To slice the dependence graph • Mapping source code to vertices
Slicing Tool Overview Soot Framework (http://www.sable.mcgill.ca/soot/) Annotated Jimple Jimple 3-Address Code Java Class Files Control-Flow Data-flow Analysis Jimple Translator SPARK Points-to Set Analysis Call Graph Points-to Set PDG Constructor Slicing Criteria PDG Slicer Concern Graph
Our effort to implement the system • The program size • PDG Construction: 2731 LOC (without comments) • Slicing: 9296 LOC (without comments) • slicing algorithms, heuristic functions and concern graph translation • We could implement the PDG construction phase in two weeks: • One week to understand how Soot works. • The other week to implement code. • Soot enabled us to focus on the essential part of the research idea.
Advantage of Soot • A rich analysis toolkit • Soot provides control-flow and data-flow for each method. • Jimple is simpler than source code and bytecode. • Complex Java statements are simplified during compilation. Control-flow Exceptional UnitGraph Method use 1 n 1 n Unit Value Body use Data-flow is-a is-a SmartLocalDefs Stmt (Jimple code) Local Expr
Limitation of Soot • Soot is not a program analysis framework. • Soot keeps all data in memory to compile Jimple code to bytecode after the optimization. • Soot requires 2-4GB RAM to analyze jEdit and JDK. • Soot supports only the simple workflow: whole program analysis (call-graph construction) followed by local program analysis. • We cannot implement a statistics tool (whole-program analysis) that uses the result of method-local analysis.
Summary • Concern location based on program slicing • We introduced heuristics in order to extract a functional concern of interest to a developer. • Input is the same as a traditional program slicing. • Most of graphs can be printed on an A3-sized paper. • Soot framework reduced the implementation effort. • Soot is a good framework, but we hope a framework specialized for program analysis. • easy-to-learn, extensible and scalable