Thin Slicing
Manu Sridharan, Stephen J. Fink, Rastislav Bodík

What is slicing good for?
• Identifies statements in code that affect a particular seed statement
  • Debugging
  • Code understanding
• This paper
  • Argues that traditional slicing captures too many statements in most cases
  • Focuses on static slicing for Java (but could be applied to other languages as well)

Overview
• Definition & motivation
• Thin slice expansion
• Computing slices
  • Context-sensitive algorithm
  • Context-insensitive algorithm
• Evaluation

Traditional slicing
• Traditional definition
  • “executable program subset in which the seed statement performs the same computation as in the original program”
• Too much information to be helpful
  • In the worst case, may include the entire program!
• May have irrelevant information
  • Statements that have no direct effect on the seed statement

    class Vector {
        Object[] elems;
        int count;
        Vector() { elems = new Object[10]; }
        void add(Object p) { this.elems[count++] = p; }
        Object get(int ind) { return this.elems[ind]; }
        ...
    }

    Vector readNames(InputStream input) {
        Vector firstNames = new Vector();
        while (!eof(input)) {
            String fullName = readFullName(input);
            int space = fullName.indexOf(' ');
            String firstName = fullName.substring(0, space-1);
            firstNames.add(firstName);
        }
        return firstNames;
    }

    void printNames(Vector firstNames) {
        for (int i = 0; i < firstNames.size(); i++) {
            String firstName = (String) firstNames.get(i);
            print("FIRSTNAME:" + firstName);
        }
    }

    void main(String[] args) {
        Vector firstNames = readNames(new InputStream(args[0]));
        SessionState s = getState();
        s.setNames(firstNames);
        ...;
        SessionState t = getState();
        printNames(t.getNames());
    }

Thin slicing looks for producer statements: statements that directly compute values for the seed statement

Thin slicing: definition
• Direct use
  • Statement s directly uses a location L iff s uses L for some computation other than a pointer dereference
  • e.g. y = x.f directly uses the field x.f, but does not directly use the base pointer x
• Producer
  • Statement t is a producer for a seed s iff 1) s = t or 2) t writes to a location directly used by some other producer
• Thin slice: the subset of the traditional slice containing ONLY producer statements

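The producer definition can be read as a backward closure: start from the seed and keep adding statements that write a location some producer directly uses. A minimal sketch, assuming a hypothetical `Stmt` model in which every statement lists the locations it writes and the locations it directly uses (base pointers are left out of `directUses`), and ignoring statement order and overwrites for brevity. The six-statement program and the location name `Box.val` are made up for illustration.

```java
import java.util.*;

class ProducerDemo {
    // Hypothetical model: locations a statement writes, and locations it uses directly.
    static final class Stmt {
        final String text;
        final Set<String> writes, directUses;
        Stmt(String text, Set<String> writes, Set<String> directUses) {
            this.text = text; this.writes = writes; this.directUses = directUses;
        }
    }

    static Set<String> locs(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Backward closure over "t writes a location directly used by some producer".
    static Set<Integer> thinSlice(List<Stmt> prog, int seed) {
        Set<Integer> producers = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        producers.add(seed);
        work.push(seed);
        while (!work.isEmpty()) {
            Stmt s = prog.get(work.pop());
            for (int i = 0; i < prog.size(); i++) {
                boolean feeds = !Collections.disjoint(prog.get(i).writes, s.directUses);
                if (feeds && producers.add(i)) work.push(i);
            }
        }
        return producers;
    }

    static List<Stmt> example() {
        return Arrays.asList(
            new Stmt("a = read()",    locs("a"),       locs()),
            new Stmt("b = a + 1",     locs("b"),       locs("a")),
            new Stmt("p = new Box()", locs("p"),       locs()),
            new Stmt("p.val = b",     locs("Box.val"), locs("b")),        // p is only dereferenced
            new Stmt("q = p",         locs("q"),       locs("p")),
            new Stmt("c = q.val",     locs("c"),       locs("Box.val"))); // q is only dereferenced
    }

    public static void main(String[] args) {
        List<Stmt> prog = example();
        for (int i : thinSlice(prog, 5)) System.out.println(prog.get(i).text);
    }
}
```

With `c = q.val` as the seed, the sketch keeps the statements that compute the stored value and leaves out `p = new Box()` and `q = p`, which only explain how the base pointers came to alias.
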
[Figure, two slides: dependence graph of the example program, with nodes for the statements and parameter-passing assignments of main, readNames, printNames, add, and get. The second slide shows the same graph without the this_in and this_out nodes.]

Categorizing statements
• Categorize statements in the traditional slice
  • Producer statements
  • Explainer statements
    • Heap-based value flow: value flow may occur through aliasing pointers in the heap
    • Control flow: shows the conditions under which producer statements execute

    x = new A();       // heap-based value flow
    z = x;             // heap-based value flow
    y = new B();       // producer
    w = x;             // heap-based value flow
    w.f = y;           // producer
    if (w == z) {      // control flow
        v = z.f;       // seed
    }

Expanding thin slices
• Thin slices may be too thin
  • May not contain enough information
• Idea: hierarchically expand the slices to include explainer statements
• In the limit, expansion yields the traditional slice

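One way to picture hierarchical expansion is as repeated rounds over a dependence graph whose edges are tagged by kind. The sketch below is an illustration under assumptions, not the authors' implementation: the `Kind` tags, the hand-written edge list, and the statement numbering (0 to 6, for the seven-statement `x = new A(); ... v = z.f;` example) are all hypothetical. Each round adds the direct explainers of the current slice and then their producers; the loop stops when nothing changes.

```java
import java.util.*;

class ExpansionDemo {
    enum Kind { PRODUCER, BASE_POINTER, CONTROL }

    // One dependence edge: statement `from` depends on statement `to`.
    static final class Edge {
        final int from, to;
        final Kind kind;
        Edge(int from, int to, Kind kind) { this.from = from; this.to = to; this.kind = kind; }
    }

    static final String[] STMTS = {
        "x = new A()", "z = x", "y = new B()", "w = x", "w.f = y", "if (w == z)", "v = z.f"
    };

    // Hand-written dependences for the example (an assumption of this sketch).
    static final List<Edge> EDGES = Arrays.asList(
        new Edge(6, 4, Kind.PRODUCER),      // v = z.f reads the value stored by w.f = y
        new Edge(6, 1, Kind.BASE_POINTER),  // v = z.f dereferences z
        new Edge(6, 5, Kind.CONTROL),       // v = z.f executes only if w == z
        new Edge(4, 2, Kind.PRODUCER),      // w.f = y directly uses y
        new Edge(4, 3, Kind.BASE_POINTER),  // w.f = y dereferences w
        new Edge(5, 3, Kind.PRODUCER),      // the condition directly uses w
        new Edge(5, 1, Kind.PRODUCER),      // the condition directly uses z
        new Edge(1, 0, Kind.PRODUCER),      // z = x directly uses x
        new Edge(3, 0, Kind.PRODUCER));     // w = x directly uses x

    // Closes `start` under edges whose kind is in `kinds`.
    static Set<Integer> close(Set<Integer> start, EnumSet<Kind> kinds) {
        Set<Integer> seen = new TreeSet<>(start);
        Deque<Integer> work = new ArrayDeque<>(start);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (Edge e : EDGES) {
                if (e.from == s && kinds.contains(e.kind) && seen.add(e.to)) work.push(e.to);
            }
        }
        return seen;
    }

    static Set<Integer> thinSlice(int seed) {
        return close(Collections.singleton(seed), EnumSet.of(Kind.PRODUCER));
    }

    static Set<Integer> traditionalSlice(int seed) {
        return close(Collections.singleton(seed), EnumSet.allOf(Kind.class));
    }

    // One expansion level: add direct explainers of the slice, then their producers.
    static Set<Integer> expand(Set<Integer> slice) {
        Set<Integer> next = new TreeSet<>(slice);
        for (Edge e : EDGES) {
            if (slice.contains(e.from) && e.kind != Kind.PRODUCER) next.add(e.to);
        }
        return close(next, EnumSet.of(Kind.PRODUCER));
    }

    public static void main(String[] args) {
        Set<Integer> slice = thinSlice(6);
        for (int level = 0; ; level++) {
            System.out.println("level " + level + ": " + slice);
            Set<Integer> next = expand(slice);
            if (next.equals(slice)) break;
            slice = next;
        }
    }
}
```

In this tiny program a single round already reaches every statement; on larger programs the intent is that a user expands only the explainers they actually need.
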
Expanding to include control flow
• A problem statement may only be executed under some condition
• Observation: when considering control flow, in most cases, looking at the control statements lexically close to the statement in question is enough

    readFromFile(File f) {
        boolean open = f.isOpen();
        if (!open) {
            throw new ClosedException();
        }
        …
    }

Expanding to explain heap-value flow

    class File {
        boolean open;
        File() { …; this.open = true; }
        isOpen() { return this.open; }
        close() { …; this.open = false; }
        …
    }

    readFromFile(File f) {
        boolean open = f.isOpen();
        if (!open) {
            throw new ClosedException();
        }
        …
    }

    main() {
        File f = new File();
        Vector files = new Vector();
        files.add(f);
        …;
        File g = (File) files.get(i);
        g.close();
        …;
        File h = (File) files.get(i);
        readFromFile(h);
    }

Labels in the original figure: Seed #2, Producer (1), Seed #1, Producer (2)

Computing thin slices
• Use SSA form for analysis (flow sensitivity)
• Create a subset of the system dependence graph (SDG)
• Requires precise points-to analysis
  • Call graph for interprocedural dependencies
  • Heap-based data flow
• Basic SDG construction
  • For x = e, add edges to all statements using x, excluding pointer dereferences (x.f)
  • For method calls, query the call graph to find possible targets; add an edge from the actual parameter node to the corresponding formal parameter node

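The two basic construction rules can be sketched as follows. This is a simplified illustration, not a real SDG builder: the `Stmt` class, the string-based edges, and the `callGraph` and `formalOf` maps are hypothetical stand-ins for what a points-to and call graph analysis would supply. Variables that are only dereferenced are left out of `directUses`, so they get no incoming edge.

```java
import java.util.*;

class SdgSketch {
    // Hypothetical statement: enclosing procedure, source text, the variable it
    // defines (or null), and the variables it uses directly (no base pointers).
    static final class Stmt {
        final String proc, text, def;
        final Set<String> directUses;
        Stmt(String proc, String text, String def, String... directUses) {
            this.proc = proc; this.text = text; this.def = def;
            this.directUses = new HashSet<>(Arrays.asList(directUses));
        }
    }

    // For x = e, add an edge to every statement in the same procedure that directly uses x.
    static List<String> dataEdges(List<Stmt> stmts) {
        List<String> edges = new ArrayList<>();
        for (Stmt d : stmts) {
            if (d.def == null) continue;
            for (Stmt u : stmts) {
                if (u != d && u.proc.equals(d.proc) && u.directUses.contains(d.def)) {
                    edges.add(d.text + " -> " + u.text);
                }
            }
        }
        return edges;
    }

    // For a call, ask the call graph for possible targets and link actual to formal.
    static List<String> paramEdges(String callSite, String actual,
                                   Map<String, List<String>> callGraph,
                                   Map<String, String> formalOf) {
        List<String> edges = new ArrayList<>();
        for (String target : callGraph.getOrDefault(callSite, Collections.<String>emptyList())) {
            edges.add(actual + " -> " + target + "." + formalOf.get(target));
        }
        return edges;
    }

    static List<Stmt> example() {
        return Arrays.asList(
            new Stmt("main", "Entry g = z.getEntry()", "g"),     // z is only dereferenced
            new Stmt("main", "int j = g.a", "j"),                // g is only dereferenced
            new Stmt("main", "int i = addEntry(g)", "i", "g"),   // g is passed as a value
            new Stmt("addEntry", "int y = x.a + x.b", "y"),      // x is only dereferenced
            new Stmt("addEntry", "return y", null, "y"));
    }

    public static void main(String[] args) {
        dataEdges(example()).forEach(System.out::println);
        Map<String, List<String>> callGraph =
            Collections.singletonMap("addEntry(g)", Arrays.asList("addEntry"));
        Map<String, String> formalOf = Collections.singletonMap("addEntry", "x");
        paramEdges("addEntry(g)", "g", callGraph, formalOf).forEach(System.out::println);
    }
}
```

Note that the definition of `g` gets an edge to the call `addEntry(g)` but none to `int j = g.a`, since that statement only dereferences `g`.
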
Context insensitive
• Handles heap accesses as follows:
  • For a statement x.f := e, add an edge to each statement with an expression w.f on its right-hand side, such that the pre-computed points-to analysis indicates x may alias w
• Transitive closure gives the thin slice

Context insensitive SDG construction

    class Entry {
        int a, b;
    }
    int addEntry(Entry x) {
        int y = x.a + x.b;
        return y;
    }
    main() {
        Entry f = new Entry();
        f.a = 1;
        f.b = 3;

        List z = new List();
        z.addEntry(f);
        Entry g = z.getEntry();
        int j = g.a;
        int i = addEntry(g);
    }

SDG nodes shown in the figure: y = x.a + x.b; f.a = 1; f.b = 3; int j = g.a

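The context-insensitive heap rule can be sketched on the field accesses of the Entry example. The points-to result below is an assumption made for illustration (f, g, and x all pointing to the single allocated Entry); the `Access` class and the string-based edges are hypothetical as well.

```java
import java.util.*;

class HeapEdgeSketch {
    // Hypothetical field access: statement text, base pointer variable, field name.
    static final class Access {
        final String text, base, field;
        Access(String text, String base, String field) {
            this.text = text; this.base = base; this.field = field;
        }
    }

    // x may alias w when their points-to sets share an abstract object.
    static boolean mayAlias(Map<String, Set<String>> pointsTo, String x, String w) {
        return !Collections.disjoint(
            pointsTo.getOrDefault(x, Collections.<String>emptySet()),
            pointsTo.getOrDefault(w, Collections.<String>emptySet()));
    }

    // For each write x.f := e, add an edge to each read of w.f where x may alias w.
    static List<String> heapEdges(List<Access> writes, List<Access> reads,
                                  Map<String, Set<String>> pointsTo) {
        List<String> edges = new ArrayList<>();
        for (Access w : writes) {
            for (Access r : reads) {
                if (w.field.equals(r.field) && mayAlias(pointsTo, w.base, r.base)) {
                    edges.add(w.text + " -> " + r.text);
                }
            }
        }
        return edges;
    }

    // Assumed points-to result: f, g, and x all point to the one allocated Entry.
    static Map<String, Set<String>> examplePointsTo() {
        Map<String, Set<String>> pts = new HashMap<>();
        for (String v : Arrays.asList("f", "g", "x")) {
            pts.put(v, Collections.singleton("new Entry()"));
        }
        return pts;
    }

    static List<String> exampleEdges() {
        List<Access> writes = Arrays.asList(
            new Access("f.a = 1", "f", "a"),
            new Access("f.b = 3", "f", "b"));
        List<Access> reads = Arrays.asList(
            new Access("int y = x.a + x.b", "x", "a"),
            new Access("int y = x.a + x.b", "x", "b"),
            new Access("int j = g.a", "g", "a"));
        return heapEdges(writes, reads, examplePointsTo());
    }

    public static void main(String[] args) {
        exampleEdges().forEach(System.out::println);
    }
}
```

Under that assumed points-to result, the thin slice seeded at `int j = g.a` would reach `f.a = 1` through a single heap edge, without visiting the statements that move the Entry through the list.
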
Context sensitive
• Handles heap accesses as follows:
  • For a statement x.f := e, add an edge to each statement with an expression w.f on its right-hand side in the same procedure, such that the pre-computed points-to analysis indicates x may alias w
• Compute reachability (as in traditional slicing) to construct the thin slice
• Generally can be expensive for large programs
• Does not provide much improvement over the context-insensitive algorithm

Context sensitive SDG construction

    class Entry {
        int a, b;
    }
    int addEntry(Entry x) {
        int y = x.a + x.b;
        return y;
    }
    main() {
        Entry f = new Entry();
        f.a = 1;
        f.b = 3;

        List z = new List();
        z.addEntry(f);
        Entry g = z.getEntry();
        int j = g.a;
        int i = addEntry(g);
    }

SDG nodes shown in the figure: y = x.a + x.b; f.a = 1; f.b = 3; int j = g.a

Evaluation
• Context-insensitive thin slice computation time is insignificant compared with computing the points-to and call graph information
  • 6 seconds vs. 5 minutes
• Context-sensitive computation often exhausts memory for larger programs
• Points-to analysis precision is KEY
• On average, identifying a bug requires inspecting 3.3X fewer statements than with traditional slicing