Slide 1: Decompiling Java Using Staged Encapsulation
Welcome!
• Authors:
  • Jerome Miecznikowski (decompiler writer)
  • Laurie Hendren (decompiler writer’s supervisor)
• What is Dava?
  • Dava is our Java decompiler.
  • The focus is to reclaim a Java control flow structure from Java bytecode.
Slide 2: Overview of Presentation
• General Goals.
• Why restructuring bytecode to Java is challenging.
• Overview of Restructuring algorithm.
• Walk through Example.
• Advanced issues.
• Testing and Results.
• Conclusions and future work.
Slide 3: General Goals
• Handle any and all sources of bytecode.
• “At all costs” restructuring.
• Output should look natural.
• If a restructuring is possible, there are many correct restructurings.
• Use a small number of control-flow grammar productions.
• Good runtime performance. (Results are near linear.)
Slide 4: Why restructuring to Java is interesting
• Java’s interesting properties:
  • No gotos, so there’s no easy fall-back solution.
  • Exceptions.
  • Multi-level breaks and continues.
  • Labeled control flow statements, and labeled blocks.

    label_0:
    while (a()) {
        while (b()) {
            if (c()) break label_0;
        }
        d(); // missed by break
    }
Slide 5: Why restructuring to Java is interesting
A labeled loop and a labeled block, side by side:

    label_0:
    while (a()) {
        while (b()) {
            if (c()) break label_0;
        }
        d(); // missed by break
    }

    label_0: {
        while (b()) {
            if (c()) break label_0;
        }
        d(); // missed again
    }
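The two snippets above can be made runnable with a small trace harness. This is only an illustrative sketch: the class name `LabeledBreakSketch`, the logging helpers, and the choice to make `a()`, `b()` and `c()` always return true are assumptions, not part of the slides. It records which calls execute, so the skipped `d()` becomes visible.

```java
public class LabeledBreakSketch {
    // Records the order in which a(), b(), c(), d() are called.
    static StringBuilder log;

    // Hypothetical stand-ins for the slide's a(), b(), c(), d().
    static boolean a() { log.append('a'); return true; }
    static boolean b() { log.append('b'); return true; }
    static boolean c() { log.append('c'); return true; }
    static void d() { log.append('d'); }

    // Labeled loop: the break leaves the OUTER while, so d() never runs.
    public static String labeledLoop() {
        log = new StringBuilder();
        label_0:
        while (a()) {
            while (b()) {
                if (c()) break label_0;
            }
            d();
        }
        return log.toString();
    }

    // Labeled block: the break leaves the whole block, so d() is skipped again.
    public static String labeledBlock() {
        log = new StringBuilder();
        label_0: {
            while (b()) {
                if (c()) break label_0;
            }
            d();
        }
        return log.toString();
    }

    // Plain break for contrast: only the inner loop is left, so d() runs.
    public static String unlabeledBreak() {
        log = new StringBuilder();
        while (b()) {
            if (c()) break;
        }
        d();
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(labeledLoop());    // abc
        System.out.println(labeledBlock());   // bc
        System.out.println(unlabeledBreak()); // bcd
    }
}
```

The contrast with the unlabeled variant is the point: a decompiler that can only emit plain breaks cannot express either of the first two shapes.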
Slide 6: Overview of Restructuring algorithm
Pipeline: CLASS → GRIMP → CFG → SET → JAVA

Original Java class file.

    Method int exampleMethod(int, int)
       0 goto 28
       3 iload_1
       4 iload_2
       5 idiv
       6 iconst_2
       7 if_icmpne 13
      10 goto 41
      13 iinc 1 1
      16 goto 28
      19 astore_3
      20 getstatic #3 <Field java.io.PrintStream out>
      23 ldc #4 <String "div by 0">
      25 invokevirtual #5 <Method void println(java.lang.String)>
      28 iload_1
      29 bipush 10
      31 if_icmplt 3
      34 iinc 2 -1
      37 iload_2
      38 ifgt 28
      41 iload_1
      42 ireturn
    Exception table:
       from   to  target  type
         3    16    19    <Class java.lang.ArithmeticException>
Slide 7: Overview of Restructuring algorithm
Resulting Grimp.

    public int exampleMethod(int, int)
    {
        exampleClass r0;
        int i0, i1;
        java.lang.ArithmeticException r1, $r2;

        r0 := @this;
        i0 := @parameter0;
        i1 := @parameter1;
        goto label4;

     label0:
        if i0 / i1 != 2 goto label1;
        goto label5;

     label1:
        i0 = i0 + 1;

     label2:
        goto label4;

     label3:
        $r2 := @caughtexception;
        r1 = $r2;
        java.lang.System.out.println("div by 0");

     label4:
        if i0 < 10 goto label0;
        i1 = i1 + -1;
        if i1 > 0 goto label4;

     label5:
        return i0;

        catch java.lang.ArithmeticException from label0 to label2 with label3;
    }
Slide 8: Overview of Restructuring algorithm
Resulting CFG. Each Grimp statement below is tagged with the name (a to l) of its CFG node.
[Figure: CFG over nodes a to l; the drawing is not recoverable from the extraction.]

    public int exampleMethod(int, int)
    {
        exampleClass r0;
        int i0, i1;
        java.lang.ArithmeticException r1, $r2;

     a: r0 := @this;
     b: i0 := @parameter0;
     c: i1 := @parameter1;
        goto label4;

     label0:
     d: if i0 / i1 != 2 goto label1;
        goto label5;

     label1:
     e: i0 = i0 + 1;

     label2:
        goto label4;

     label3:
     f: $r2 := @caughtexception;
     g: r1 = $r2;
     h: java.lang.System.out.println("div by 0");

     label4:
     i: if i0 < 10 goto label0;
     j: i1 = i1 + -1;
     k: if i1 > 0 goto label4;

     label5:
     l: return i0;

        catch java.lang.ArithmeticException from label0 to label2 with label3;
    }
Slide 9: Overview of Restructuring algorithm
Resulting Structure Encapsulation Tree (SET).
[Figure: the CFG over nodes a to l next to the resulting SET. SET nodes, outermost first: Top Node (a b c k i d e f g h j l), do-while Statement, while Statement, try Statement (d e f g h), if Stmt, with Stmt Seq nodes grouping the statements at each level.]
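One way to picture the tree is as nodes that each own a set of CFG statements and nest by set containment. The sketch below is an assumption for illustration, not Dava's implementation: the `Node` class and its `insert` method are hypothetical, and the node sets are the ones that appear in the walk-through frames. It inserts constructs in discovery order (cycles, then if, then try) and lets containment decide where each one lands.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetSketch {
    // Hypothetical SET node: a construct kind plus the CFG nodes it encapsulates.
    public static final class Node {
        final String kind;
        final Set<String> body;
        final List<Node> children = new ArrayList<>();

        public Node(String kind, String... stmts) {
            this.kind = kind;
            this.body = new TreeSet<>(Arrays.asList(stmts));
        }

        // Places n under the deepest existing node whose body contains n's body.
        public void insert(Node n) {
            for (Node c : children) {
                if (c.body.containsAll(n.body)) { c.insert(n); return; }
            }
            // n sits directly below this node; it adopts children it encloses.
            Iterator<Node> it = children.iterator();
            while (it.hasNext()) {
                Node c = it.next();
                if (n.body.containsAll(c.body)) { n.children.add(c); it.remove(); }
            }
            children.add(n);
        }

        public String show() {
            if (children.isEmpty()) return kind;
            List<String> parts = new ArrayList<>();
            for (Node c : children) parts.add(c.show());
            return kind + "(" + String.join(", ", parts) + ")";
        }
    }

    public static Node buildExample() {
        Node top = new Node("Top", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l");
        top.insert(new Node("do-while", "k", "i", "d", "e", "f", "g", "h", "j"));
        top.insert(new Node("while", "i", "d", "e", "f", "g", "h"));
        top.insert(new Node("if", "d", "e"));
        // Found after the if, yet it ends up between the while and the if.
        top.insert(new Node("try", "d", "e", "f", "g", "h"));
        return top;
    }

    public static void main(String[] args) {
        System.out.println(buildExample().show()); // Top(do-while(while(try(if))))
    }
}
```

The sketch only handles bodies that nest cleanly; partially overlapping bodies, which the exception slides discuss, would need the splitting described there.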
Slide 10: Overview of Restructuring algorithm
Resulting Java Source

    public int exampleMethod(int i0, int i1)
    {
        label_0:
        do {
            while (i0 < 10) {
                try {
                    if (i0 / i1 != 2) {
                        i0 = i0 + 1;
                    } else {
                        break label_0;
                    }
                } catch (ArithmeticException e) {
                    System.out.println("div by 0");
                }
            }
            i1 = i1 + -1;
        } while (i1 > 0);
        return i0;
    }

[Figure: the SET from Slide 9, shown beside the source.]
Slide 11: Walk through Example
Stages: Cycles → if & switch → Exceptions → Stmt Seq. → breaks
[Figure: the CFG over nodes a to l, and a SET that so far holds only the Top Node (a b c k i d e f g h j l).]
Slide 12: Walk through Example
Control Flow Graph, informal definitions:
• Dominators. If every path from the start of the program to “B” passes through “A”, then “A” dominates “B”. (Note that dominance is transitive.)
• Strongly Connected Component. A set of nodes in the control flow graph such that every node is reachable from every other node. Multiple hops to reach a node are allowed.
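Both definitions can be checked directly on the example method. The sketch below is a naive, definition-based version and makes several assumptions: the edge list is read off the Grimp listing by hand (including exception edges from d and e to the handler f), and the names `CfgSketch`, `sccOf` and `dominates` are hypothetical. A real decompiler would likely use faster standard algorithms.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class CfgSketch {
    // Assumed edges of the example CFG; d->f and e->f are exception edges.
    static final String[][] EDGES = {
        {"a", "b"}, {"b", "c"}, {"c", "i"}, {"i", "d"}, {"i", "j"}, {"d", "e"},
        {"d", "l"}, {"d", "f"}, {"e", "f"}, {"e", "i"}, {"f", "g"}, {"g", "h"},
        {"h", "i"}, {"j", "k"}, {"k", "i"}, {"k", "l"}
    };

    public static Map<String, List<String>> successors() {
        Map<String, List<String>> succ = new TreeMap<>();
        for (String[] e : EDGES) {
            succ.computeIfAbsent(e[0], x -> new ArrayList<>()).add(e[1]);
            succ.computeIfAbsent(e[1], x -> new ArrayList<>());
        }
        return succ;
    }

    // All nodes reachable from start (start included), any number of hops.
    static Set<String> reach(Map<String, List<String>> g, String start) {
        Set<String> seen = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String n = work.pop();
            if (seen.add(n)) for (String s : g.get(n)) work.push(s);
        }
        return seen;
    }

    // SCC of node: everything it can reach that can also reach it.
    public static Set<String> sccOf(Map<String, List<String>> succ, String node) {
        Map<String, List<String>> pred = new TreeMap<>();
        for (String n : succ.keySet()) pred.put(n, new ArrayList<>());
        for (String n : succ.keySet()) for (String s : succ.get(n)) pred.get(s).add(n);
        Set<String> scc = reach(succ, node);
        scc.retainAll(reach(pred, node));
        return scc;
    }

    // a dominates b if b becomes unreachable from entry once a is removed.
    public static boolean dominates(Map<String, List<String>> succ, String entry, String a, String b) {
        if (a.equals(b)) return true;
        Map<String, List<String>> cut = new TreeMap<>();
        for (String n : succ.keySet()) {
            List<String> out = new ArrayList<>();
            if (!n.equals(a)) for (String s : succ.get(n)) if (!s.equals(a)) out.add(s);
            cut.put(n, out);
        }
        return !reach(cut, entry).contains(b);
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = successors();
        System.out.println(sccOf(g, "i"));               // [d, e, f, g, h, i, j, k]
        System.out.println(dominates(g, "a", "i", "d")); // true
        System.out.println(dominates(g, "a", "e", "f")); // false: d reaches f directly
    }
}
```

Under these assumed edges, the component containing i is exactly the node set that the walk-through turns into the do-while Statement.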
Slides 13 to 30: Walk through Example
[Animation frames: each slide shows the CFG over nodes a to l beside the SET as it grows. The drawings are not recoverable from the extraction; the SET contents per frame are:]
• Slide 13: the SET holds only the Top Node (a b c k i d e f g h j l).
• Slides 14 to 16: the node set k i d e f g h j is picked out and added to the SET as a do-while Statement.
• Slides 17 to 20: the node set i d e f g h is added below it as a while Statement.
• Slides 21 to 25: an if Stmt (d e) is added.
• Slides 26 to 27: a try Statement (d e f g h) is added; it sits below the while Statement and encloses the if Stmt.
• Slides 28 to 30: Stmt Seq nodes are added at each level, completing the tree.
Slide 31: Walk through Example
The completed SET and the Java source generated from it:

    public int exampleMethod(int i0, int i1)
    {
        label_0:
        do {
            while (i0 < 10) {
                try {
                    if (i0 / i1 != 2) {
                        i0 = i0 + 1;
                    } else {
                        break label_0;
                    }
                } catch (ArithmeticException e) {
                    System.out.println("div by 0");
                }
            }
            i1 = i1 + -1;
        } while (i1 > 0);
        return i0;
    }
Slide 32: Advanced issues
• Multi-entry point loops
• Exceptions
• Synchronized blocks
• Arbitrary labeled blocks
• Others …
Slides 33 to 37: Advanced issues. Multi-entry point loops
[Animation frames: a CFG over nodes a, b, c, d that is extended across the frames with nodes e, f and g. The drawings and edges are not recoverable from the extraction.]
Slide 38: Advanced issues. Exceptions
• The “try” block may be non-contiguous.
• The exception’s SET node may not nest.
• Solution: Divide the “try” block into as many parts as needed, such that every part is contiguous and nests well in the SET.
• Caveat: Every “try” block must have its own unique catch clause. Therefore, at every divide, the catch clause must be cloned.
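The dividing step can be sketched for the contiguity requirement alone. This is a simplified illustration under stated assumptions: statements are identified by hypothetical integer positions, the handler is a plain string, and the requirement that each part also nests well in the SET is not modelled. The names `TrySplitSketch`, `contiguousRuns` and `emit` are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class TrySplitSketch {
    // Splits the protected positions into maximal runs of consecutive positions.
    public static List<List<Integer>> contiguousRuns(SortedSet<Integer> positions) {
        List<List<Integer>> runs = new ArrayList<>();
        List<Integer> current = null;
        int previous = 0;
        for (int p : positions) {
            if (current == null || p != previous + 1) {
                current = new ArrayList<>();   // gap found: start a new part
                runs.add(current);
            }
            current.add(p);
            previous = p;
        }
        return runs;
    }

    // Every part becomes its own try block with a cloned copy of the handler.
    public static List<String> emit(SortedSet<Integer> positions, String handler) {
        List<String> out = new ArrayList<>();
        for (List<Integer> run : contiguousRuns(positions)) {
            out.add("try " + run + " catch { " + handler + " }");
        }
        return out;
    }

    public static void main(String[] args) {
        SortedSet<Integer> covered = new TreeSet<>(Arrays.asList(3, 4, 7, 8, 9));
        for (String block : emit(covered, "println(\"div by 0\")")) {
            System.out.println(block);
        }
        // try [3, 4] catch { println("div by 0") }
        // try [7, 8, 9] catch { println("div by 0") }
    }
}
```

Cloning the handler text once per part mirrors the caveat above: each resulting try block carries its own catch clause.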
Slide 39: Advanced issues. Synchronized Blocks
• Java provides object locking and monitors with synchronized() blocks.
• To detect one, we need a contiguous sub-graph in the CFG that is:
  • Entered by monitorenter and exited by monitorexit instructions.
  • Covered by an exception handler that handles all exceptions. Further, the exception handler must call monitorexit and rethrow the original exception.
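The source-level behaviour that this bytecode pattern implements can be observed from plain Java. The sketch below is an illustration only; the class and method names are hypothetical. It uses `Thread.holdsLock` to show that the monitor is held inside the block and released again when an exception escapes it, which is the job of the catch-all handler described above.

```java
public class SyncShapeSketch {
    // Inside the block the current thread owns the monitor (after monitorenter).
    public static boolean heldInside(Object lock) {
        synchronized (lock) {
            return Thread.holdsLock(lock);
        }
    }

    // An exception leaving the block still releases the monitor: the compiler's
    // catch-all handler runs monitorexit and rethrows before we reach the catch.
    public static boolean heldAfterThrow(Object lock) {
        try {
            synchronized (lock) {
                throw new IllegalStateException("boom");
            }
        } catch (IllegalStateException e) {
            return Thread.holdsLock(lock);
        }
    }

    public static void main(String[] args) {
        Object lock = new Object();
        System.out.println(heldInside(lock));     // true
        System.out.println(heldAfterThrow(lock)); // false
    }
}
```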
Slide 40: Advanced issues. Synchronized Blocks
Sometimes synchronized() blocks cannot represent all monitorenter and monitorexit instructions.

    monitorenter a
    monitorenter b
    monitorexit a
    monitorexit b
Slide 41: Advanced issues. Synchronized Blocks
Sometimes synchronized() blocks cannot represent all monitorenter and monitorexit instructions.

    monitorenter a
    monitorenter b
    monitorexit a
    monitorexit b

• Solution:
  • Create monitor library in Java.
  • Replace monitor instructions with method calls to the monitor library.
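What such a monitor library might look like is not shown in the slides. The following is a minimal sketch under stated assumptions, not Dava's actual library: `MonitorSketch`, `enter`, `exit` and `isHeld` are hypothetical names, and locking is emulated with one `ReentrantLock` per object identity. It does not interoperate with real JVM monitors (`synchronized`, `wait`, `notify`), and the lock table is never cleaned up.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

public class MonitorSketch {
    // One re-entrant lock per object identity (hypothetical design choice).
    private static final Map<Object, ReentrantLock> LOCKS = new IdentityHashMap<>();

    private static ReentrantLock lockFor(Object o) {
        synchronized (LOCKS) {
            return LOCKS.computeIfAbsent(o, x -> new ReentrantLock());
        }
    }

    // Stands in for a monitorenter instruction.
    public static void enter(Object o) {
        lockFor(o).lock();
    }

    // Stands in for a monitorexit instruction.
    public static void exit(Object o) {
        ReentrantLock lock = lockFor(o);
        if (!lock.isHeldByCurrentThread()) {
            throw new IllegalMonitorStateException("monitor not held");
        }
        lock.unlock();
    }

    public static boolean isHeld(Object o) {
        return lockFor(o).isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        // The non-nested order from the slide, which synchronized() cannot express.
        enter(a);
        enter(b);
        exit(a);
        System.out.println(isHeld(a) + " " + isHeld(b)); // false true
        exit(b);
        System.out.println(isHeld(a) + " " + isHeld(b)); // false false
    }
}
```

Because enter and exit are ordinary method calls, they can appear in any order in the generated source, which is what the non-nested sequence requires.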
Slides 42 to 43: Advanced issues. Arbitrary Labeled Blocks
[Figures: the original SET over nodes A, B, C, D, E, and a modified SET over the same nodes in which a new labeled block has been introduced. The tree shapes are not recoverable from the extraction.]
Slide 44: Testing
• Types of Testing
  • Simple stress testing suite.
  • Decompilation and recompilation of small and mid-sized applications (up to 10,000 lines of code)
    • Java sourced applications
    • Other sources ...
      • Ada: JGNAT
      • Eiffel: GNU SmallEiffel
      • Haskell: D. Wakeling’s Haskell to JVM-code compiler
      • SML: MLJ
Slide 45: Results
[Results table not recoverable from the extraction.]
* Based on tests with a Mobile Pentium III with 128 MB RAM.
Slide 46: Conclusions and future work
• The Structure Encapsulation Tree performs very well.
• It is useful to find control flow constructs in an order determined by their features rather than their locations.
• Restructuring speed is nearly linear in input program size.
• Results are readable and recompilable regardless of source language.
• Improvements may include recognizing opportunities to create commonly used programming idioms.
• It’s available now! You can download it today from: http://www.sable.mcgill.ca/~jerome/public (Email: [jerome, hendren]@sable.mcgill.ca)