Joshua Cranmer Java 6 Decompiler
Why decompile? • Source code may be lost while the compiled code survives • Examples • Accidentally deleted source code (happened to me!) • Need to patch abandonware (happened to me!) • Security analysis (has not happened to me)
Myths of decompilation • Decompilers are illegal • Just as legal as BitTorrent • If so, then why does IDA Pro exist? • Decompilation is impossible • Undecidable step is actually pre-disassembly (code vs. data) • Decompilation is impractical • Based on the notion that it merely undoes the steps a compiler does
Steps of decompilation • Signature recovery • Simple parser • Newer features make this more difficult • Stack analysis and variable recovery • “simple” without optimization or arbitrary scoping • Trivial decompilation • Example: fadd -> + • Control flow graph recovery • Most difficult portion • Direct translation impossible in some circumstances • Post-decompilation transformation • Changes legal syntax to sensible syntax
Signature Recovery • Signatures are stored like (Ljava/lang/Object;I)V • Generics use a syntax like (TE;)V • Proposed Java 7 features are crazier • Enums, annotations, etc. use specific bits or binary JVM attributes (relatively simple) • Completed Q1
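As a rough illustration of the "simple parser" for plain (non-generic) descriptors, the sketch below renders a descriptor such as (Ljava/lang/Object;I)V as Java-like text. The class name DescriptorSketch, the render method, and its methodName parameter are hypothetical and not taken from the decompiler itself; generic signatures such as (TE;)V are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: turns a JVM method descriptor into Java-like source text. */
final class DescriptorSketch {
    private final String desc;
    private int pos;

    private DescriptorSketch(String desc) {
        this.desc = desc;
    }

    /** Renders a descriptor with a caller-supplied (hypothetical) method name. */
    static String render(String methodName, String descriptor) {
        DescriptorSketch p = new DescriptorSketch(descriptor);
        if (descriptor.isEmpty() || p.desc.charAt(p.pos++) != '(') {
            throw new IllegalArgumentException("Descriptor must start with '('");
        }
        List<String> params = new ArrayList<>();
        while (p.desc.charAt(p.pos) != ')') {
            params.add(p.type());
        }
        p.pos++; // skip ')'
        String returnType = p.type();
        return returnType + " " + methodName + "(" + String.join(", ", params) + ")";
    }

    /** Reads one field type starting at pos and advances past it. */
    private String type() {
        char tag = desc.charAt(pos++);
        switch (tag) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case 'V': return "void";
            case '[': return type() + "[]"; // one array dimension per '['
            case 'L': {
                int end = desc.indexOf(';', pos);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated class name");
                }
                String name = desc.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("Unknown type tag: " + tag);
        }
    }
}
```

For example, DescriptorSketch.render("m", "(Ljava/lang/Object;I)V") yields void m(java.lang.Object, int).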
Stack Analysis • Used to infer variables and undo some optimizations • Uses Static Single Assignment (a “variable” can only be assigned once) • Variables are not presently unified, which makes the output ugly • Most work done in late Q1 and Q2
Control Flow Graph Reconstruction • Hardest part of decompiling • Worked on during Q2, Q3, and Q4 • Basic algorithm: create blocks and unify • Only unifications currently supported are if-else blocks • Could not be completed because loops proved too difficult to get working
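To make the idea concrete, here is a minimal sketch of stack analysis by symbolic execution: each instruction pushes or pops expression strings instead of values, and every store creates a fresh SSA-style name. It assumes a toy textual instruction format ("iload 1", "iadd", and so on) rather than real class-file bytes, and the class StackSketch and its naming scheme (v<slot>_<version>) are hypothetical, not the decompiler's own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: symbolic execution of a few arithmetic bytecodes. */
final class StackSketch {

    /** Returns one recovered assignment per store instruction. */
    static List<String> decompile(String... instructions) {
        Deque<String> stack = new ArrayDeque<>();          // expression stack
        Map<Integer, Integer> version = new HashMap<>();   // slot -> latest SSA version
        List<String> statements = new ArrayList<>();
        for (String insn : instructions) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iload":
                case "fload": {
                    int slot = Integer.parseInt(parts[1]);
                    // Version 0 stands for the value the slot held on entry.
                    stack.push(name(slot, version.getOrDefault(slot, 0)));
                    break;
                }
                case "istore":
                case "fstore": {
                    int slot = Integer.parseInt(parts[1]);
                    int next = version.merge(slot, 1, Integer::sum);
                    statements.add(name(slot, next) + " = " + stack.pop());
                    break;
                }
                case "iadd":
                case "fadd":
                    binary(stack, "+"); // the "trivial decompilation" step
                    break;
                case "imul":
                case "fmul":
                    binary(stack, "*");
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported: " + insn);
            }
        }
        return statements;
    }

    /** Pops two operands and pushes the combined expression. */
    private static void binary(Deque<String> stack, String op) {
        String right = stack.pop();
        String left = stack.pop();
        stack.push("(" + left + " " + op + " " + right + ")");
    }

    private static String name(int slot, int version) {
        return "v" + slot + "_" + version;
    }
}
```

Calling StackSketch.decompile("iload 1", "iload 2", "iadd", "istore 3") returns the single statement v3_1 = (v1_0 + v2_0). Because each store gets a new name, a later pass would still have to unify v3_1, v3_2, and so on back into one source variable.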
Example of CFG Reconstruction • Following is an if-else-block recovery • Graph: block A branches to blocks B and C, which both rejoin at block D • Recovered code: <block A> if <expression> { <block B> } else { <block C> } <block D>
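The if-else recovery above can be sketched as a pattern match on the graph: a head block with exactly two successors, each of which has the head as its only predecessor and the same single successor. This is only an illustration under those assumptions; the class DiamondSketch and its methods are hypothetical, and it does nothing about loops, which is where the real difficulty lies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: recognizes an if-else "diamond" in a control flow graph. */
final class DiamondSketch {
    /** Successor lists keyed by block name. */
    private final Map<String, List<String>> successors = new LinkedHashMap<>();

    /** Adds a directed edge and returns this graph for chaining. */
    DiamondSketch edge(String from, String to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        successors.computeIfAbsent(to, k -> new ArrayList<>());
        return this;
    }

    /** Counts the blocks that have an edge into the given block. */
    private long predecessorCount(String block) {
        return successors.values().stream().filter(s -> s.contains(block)).count();
    }

    /** Returns structured text if head starts an if-else diamond, else null. */
    String unifyIfElse(String head) {
        List<String> arms = successors.getOrDefault(head, Collections.emptyList());
        if (arms.size() != 2 || arms.get(0).equals(arms.get(1))) {
            return null;
        }
        String thenBlock = arms.get(0);
        String elseBlock = arms.get(1);
        List<String> thenNext = successors.get(thenBlock);
        List<String> elseNext = successors.get(elseBlock);
        boolean diamond = thenNext.size() == 1
                && thenNext.equals(elseNext)              // both arms rejoin at one block
                && predecessorCount(thenBlock) == 1       // arms are entered only from head
                && predecessorCount(elseBlock) == 1;
        if (!diamond) {
            return null;
        }
        String join = thenNext.get(0);
        return "<block " + head + "> if <expression> { <block " + thenBlock
                + "> } else { <block " + elseBlock + "> } <block " + join + ">";
    }
}
```

With edges A→B, A→C, B→D, and C→D, unifyIfElse("A") produces the structured form shown on the slide; any other shape returns null and would need a different unification rule.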
Post-decompilation Transformation • Not implemented • Idea is to take certain recognizable blocks of code and refactor them into common expressions • Examples: • Object.class (before Java 5) • Inner class private accessors • Bridge code • String concatenation
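Since this stage was not implemented, the following is purely an illustration of the string-concatenation case: compilers of that era typically turned a + b on strings into a StringBuilder append chain, and a post-pass could fold the chain back. The sketch works on source text for brevity; the class ConcatSketch is hypothetical, and a real pass would more plausibly match on an expression tree. It does not handle parentheses inside string literals, and it does not re-parenthesize operands that themselves contain operators.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: folds a StringBuilder append chain back into "+" form. */
final class ConcatSketch {
    private static final String PREFIX = "new StringBuilder()";
    private static final String SUFFIX = ".toString()";
    private static final String APPEND = ".append(";

    /** Returns the simplified expression, or the input unchanged if it does not match. */
    static String simplify(String expr) {
        if (!expr.startsWith(PREFIX) || !expr.endsWith(SUFFIX)
                || expr.length() < PREFIX.length() + SUFFIX.length()) {
            return expr;
        }
        String chain = expr.substring(PREFIX.length(), expr.length() - SUFFIX.length());
        List<String> operands = new ArrayList<>();
        int pos = 0;
        while (pos < chain.length()) {
            if (!chain.startsWith(APPEND, pos)) {
                return expr; // something other than .append(...) in the chain
            }
            int start = pos + APPEND.length();
            int depth = 1;
            int end = start;
            // Scan to the parenthesis that closes this append call.
            while (end < chain.length() && depth > 0) {
                char c = chain.charAt(end++);
                if (c == '(') {
                    depth++;
                } else if (c == ')') {
                    depth--;
                }
            }
            if (depth != 0) {
                return expr; // unbalanced parentheses
            }
            operands.add(chain.substring(start, end - 1));
            pos = end;
        }
        return operands.isEmpty() ? expr : String.join(" + ", operands);
    }
}
```

For instance, ConcatSketch.simplify("new StringBuilder().append(a).append(f(b)).toString()") returns a + f(b), while any expression that does not fit the pattern comes back untouched.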
Future work • Code is a horrible internal mess • Probably switch to building off of other open-source projects • Better type analysis and unification (especially generics) • Generalize the code, especially CFG recovery, for other types of decompilation • ??? • Profit • Send any and all questions to Pidgeot18+jbca@gmail.com