The Superdiversifier: Peephole Individualization for Software Protection. Mariusz H. Jakubowski Prasad Naldurg Chit Wei (Nick) Saw Ramarathnam Venkatesan Microsoft Research. Matthias Jacob Nokia. International Workshop on Security: IWSEC ’08 Kagawa, Japan November 25-27, 2008.
Introduction • Software individualization • “Different-looking” but functionally equivalent code • Diversity as a defense against attacks • Important role in both biological and man-made systems • Superoptimization • Brute-force search for shortest code sequences that implement a given function • Compiler optimization introduced by Massalin ’87 • Goals of our work: • Leverage and extend superoptimization to individualize instruction sequences • Study superdiversification in the context of more comprehensive protection frameworks
What Does This Do?
unsigned __int64 nInput = _atoi64(argv[1]);
__int64 n;
n = nInput - ((nInput >> 1) & 033333333333333333333LL);
n = n - ((nInput >> 2) & 011111111111111111111LL);
n = n + (n >> 3);
n = n & 07070707070707070707LL;
n = n % 077;
printf("%d\n", n);
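The slide leaves the question open. The snippet has the shape of the classic HAKMEM bit-counting idiom (subtract shifted, masked copies to get per-octal-digit bit counts, fold adjacent digits, then reduce modulo 63). A minimal 32-bit sketch of that idiom follows; the function name `hakmem_popcount32` is an assumption for illustration, and this is not a transcription of the 64-bit constants on the slide.

```python
def hakmem_popcount32(x: int) -> int:
    """Count the set bits of a 32-bit value using the HAKMEM octal trick."""
    x &= 0xFFFFFFFF
    # Each 3-bit field 4a+2b+c becomes a+b+c (the number of set bits in it).
    n = x - ((x >> 1) & 0o33333333333) - ((x >> 2) & 0o11111111111)
    # Add neighbouring 3-bit fields so that every 6-bit field holds one sum.
    n = (n + (n >> 3)) & 0o30707070707
    # Summing base-64 digits is the same as reducing modulo 63; this is
    # valid here because a 32-bit word has at most 32 set bits (< 63).
    return n % 63


for value in (0, 1, 0b1011, 0x80000000, 0xDEADBEEF, 0xFFFFFFFF):
    print(f"{value:#010x} -> {hakmem_popcount32(value)}")
```

The result matches a straightforward bit count, yet nothing in the code looks like counting, which is the point the slide makes about convoluted but equivalent code.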
Overview • Introduction • Background • Individualization • Superoptimization • Superdiversification (instruction-level diversity via guided search) • Experimental results • Applications • Conclusion
Software Individualization • Element of software security • Defends against BORE attacks (Break Once/Run Everywhere) • Forces duplication of effort to break systems • Alleviates “software monoculture” problem • Many practical uses: • ASLR (Address Space Layout Randomization) • Secure DRM clients • Self-mutating malware • …
Individualization Schemes • Static: Individualization of program code • Algorithmic • Bubble sort → quicksort • Red-black trees → splay trees • Syntactic • MOV EAX,0 → XOR EAX,EAX • MOV EAX,5; MOV EBX,1 → MOV EBX,1; MOV EAX,5 • Dynamic: Individualization of runtime behavior • Varying paths at runtime • Variable data encoding • Self-modifying code • Byte-codes with variable semantics • …
Superoptimization • Brute-force search for shortest equivalent instruction sequence • [Massalin ‘87]: • “Startling programs have been generated, many of them engaging in convoluted bit fiddling bearing little resemblance to the source programs which defined the functions.” • “… like a typical superoptimized program, the logic is really convoluted.”
Superoptimization • Input: Instruction sequence implementing a function • Algorithm outline: • Enumerate all possible sequences up to a given length (e.g., 10 instructions). • Check for equivalence to input sequence: • Quick test: Test candidate sequence on several random inputs. • Slow test: Check Boolean equivalence of sequences (if quick test passes). • Skip sequences longer than current shortest sequence. • Quick test takes most of the computation time. • Slow test guarantees equivalence to input sequence.
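The outline above can be sketched on a toy machine. This is a minimal illustration, not Massalin's implementation: the 4-bit single-register instruction set and the names `OPS`, `run`, and `superoptimize` are assumptions made for the example, and an exhaustive check over all 16 inputs stands in for the Boolean equivalence test.

```python
import itertools
import random

WIDTH = 4
MASK = (1 << WIDTH) - 1

# Hypothetical toy instruction set acting on one 4-bit register.
OPS = {
    "NOT": lambda x: ~x & MASK,
    "NEG": lambda x: -x & MASK,
    "INC": lambda x: (x + 1) & MASK,
    "DEC": lambda x: (x - 1) & MASK,
    "SHL": lambda x: (x << 1) & MASK,
    "SHR": lambda x: x >> 1,
}


def run(seq, x):
    """Execute a sequence of instruction names on the input value x."""
    for op in seq:
        x = OPS[op](x)
    return x


def quick_test(seq, target, samples):
    """Cheap filter: compare the two sequences on a few random inputs."""
    return all(run(seq, x) == run(target, x) for x in samples)


def slow_test(seq, target):
    """Full check: compare on every input (stand-in for Boolean equivalence)."""
    return all(run(seq, x) == run(target, x) for x in range(MASK + 1))


def superoptimize(target, seed=0):
    """Return the first strictly shorter equivalent sequence, else the target."""
    rng = random.Random(seed)
    samples = [rng.randrange(MASK + 1) for _ in range(4)]
    target = tuple(target)
    # Enumerating by increasing length means longer candidates are never
    # visited once a shorter equivalent has been found.
    for length in range(len(target)):
        for seq in itertools.product(OPS, repeat=length):
            if quick_test(seq, target, samples) and slow_test(seq, target):
                return seq
    return target


print(superoptimize(("NOT", "INC")))  # two's-complement negation
```

Only candidates that survive the quick test reach the slow test, which mirrors the two-stage filter in the outline.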
The Superdiversifier • Adapt and extend superoptimization to diversify code: • Restrict set of instructions and operands allowed in search. • Guide search based on instruction frequencies occurring in real-life programs. • Use pruning techniques to cut down search time. • Accept a secret key to control the above operations. • Output any equivalent sequences, not necessarily only the shortest. • Secret key determines order of search. • Different keys may yield dramatically different equivalent sequences.
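A rough sketch of how a key might steer the search, under stated assumptions: the toy 4-bit instruction set, the function `superdiversify`, and its parameters `allowed`, `max_len`, and `limit` are hypothetical, the key simply seeds a shuffle of the candidate order, and frequency-based guidance and pruning are omitted.

```python
import itertools
import random

WIDTH = 4
MASK = (1 << WIDTH) - 1

# Hypothetical toy instruction set acting on one 4-bit register.
OPS = {
    "NOT": lambda x: ~x & MASK,
    "NEG": lambda x: -x & MASK,
    "INC": lambda x: (x + 1) & MASK,
    "DEC": lambda x: (x - 1) & MASK,
    "SHL": lambda x: (x << 1) & MASK,
    "SHR": lambda x: x >> 1,
}


def run(seq, x):
    """Execute a sequence of instruction names on the input value x."""
    for op in seq:
        x = OPS[op](x)
    return x


def equivalent(a, b):
    """Exhaustive equivalence check over all 4-bit inputs."""
    return all(run(a, x) == run(b, x) for x in range(MASK + 1))


def superdiversify(target, key, allowed=None, max_len=3, limit=3):
    """Return up to `limit` sequences equivalent to `target`, in keyed order."""
    target = tuple(target)
    ops = sorted(allowed if allowed is not None else OPS)  # restricted set
    candidates = [
        seq
        for length in range(1, max_len + 1)
        for seq in itertools.product(ops, repeat=length)
    ]
    random.Random(key).shuffle(candidates)  # the key fixes the search order
    found = []
    for seq in candidates:
        if seq != target and equivalent(seq, target):
            found.append(seq)  # any equivalent is accepted, not just the shortest
            if len(found) == limit:
                break
    return found


print(superdiversify(("NEG",), key=2008))
print(superdiversify(("NEG",), key=7, allowed={"NOT", "INC", "DEC"}))
```

The same key always reproduces the same variants, while a different key visits the candidates in a different order and so tends to return different ones.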
Equivalence Test Using a SAT Solver • Input: Two Boolean functions, F(x) and G(x). • Goal: Determine whether F(x) ≡ G(x). • F(x) ≡ G(x) iff ∀x, F(x) = G(x). • F(x) ≡ G(x) iff ∄x │ F(x) ≠ G(x). • Thus, simply run a SAT solver on F(x) ≠ G(x) represented as a Boolean (CNF) formula. • F(x) ≡ G(x) iff F(x) ≠ G(x) is unsatisfiable.
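The reduction above can be illustrated without a real SAT solver. In this sketch a brute-force search over all assignments stands in for the solver, which is only practical for a handful of variables; the names `find_counterexample` and `equivalent` are assumptions for the example.

```python
from itertools import product


def find_counterexample(f, g, n_vars):
    """Search for an assignment x with f(x) != g(x).

    Returns the assignment if "F(x) != G(x)" is satisfiable, else None.
    """
    for x in product((False, True), repeat=n_vars):
        if f(*x) != g(*x):
            return x
    return None


def equivalent(f, g, n_vars):
    """F and G are equivalent iff no satisfying assignment exists."""
    return find_counterexample(f, g, n_vars) is None


# De Morgan's law: NOT (a AND b) is the same function as (NOT a) OR (NOT b).
f = lambda a, b: not (a and b)
g = lambda a, b: (not a) or (not b)
h = lambda a, b: (not a) and (not b)  # deliberately different function

print(equivalent(f, g, 2))
print(find_counterexample(f, h, 2))
```

A SAT solver answers the same question symbolically on the CNF encoding of F(x) ≠ G(x), so it does not need to enumerate every input.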
Experimental Results • Function: Swap registers • [Table not recovered: input code and sample equivalent versions]
Experimental Results • Function: Swap registers • Only arithmetic and logical instructions allowed in search. • [Table not recovered: input code and sample equivalent versions]
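The sequences found in this experiment are not recoverable here. As a generic illustration of what a register swap built only from logical instructions can look like, the well-known XOR swap is simulated below; the function name `swap_xor` is an assumption, and this is not claimed to be one of the reported outputs.

```python
def swap_xor(eax: int, ebx: int) -> tuple[int, int]:
    """Swap two 32-bit register values using three XOR instructions."""
    eax ^= ebx  # XOR EAX, EBX  -> EAX = a ^ b
    ebx ^= eax  # XOR EBX, EAX  -> EBX = b ^ (a ^ b) = a
    eax ^= ebx  # XOR EAX, EBX  -> EAX = (a ^ b) ^ a = b
    return eax, ebx


print(swap_xor(0xDEADBEEF, 0x12345678))
```

No temporary register and no data-movement instruction is needed, which is the kind of variant a search restricted to arithmetic and logical instructions can reach.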
Experimental Results • Function: Fragment of compiler-generated code • Small set of constants allowed in search (may be harvested from real-life programs). • [Table not recovered: input code and sample equivalent versions]
Some Applications An element of comprehensive individualization systems • Defense against signature-based attacks • Patch obfuscation • Patches reveal location of vulnerabilities. • “Patch Tuesdays” often followed by exploits. • Diffing tools locate vulnerable code quickly. • Superdiversification helps to hide patches. • Maximize size of diff between unpatched and patched applications. • For best results, diversify large sections of the patched binary, not just the patch code.
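To make the patch-hiding point concrete, the sketch below measures how many instructions a diffing tool would flag. Everything in it is hypothetical: the instruction listings, the `fail` and `copy` labels, and the `diff_size` metric (built on Python's `difflib`) are assumptions for illustration, not the measurement used in the work.

```python
import difflib


def diff_size(old, new):
    """Count instructions that fall outside the matching blocks of a diff."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return sum(
        max(i2 - i1, j2 - j1)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    )


unpatched = ["MOV EAX,0", "MOV ECX,[ESI]", "CALL copy", "RET"]

# Plain patch: only the new bounds check differs, so the diff points at it.
patched_plain = [
    "MOV EAX,0", "MOV ECX,[ESI]", "CMP ECX,64", "JA fail", "CALL copy", "RET",
]

# Diversified patch: unrelated code is also rewritten (MOV EAX,0 -> XOR EAX,EAX),
# so the changed region is larger and the fix stands out less.
patched_diversified = [
    "XOR EAX,EAX", "MOV ECX,[ESI]", "CMP ECX,64", "JA fail", "CALL copy", "RET",
]

print(diff_size(unpatched, patched_plain))
print(diff_size(unpatched, patched_diversified))
```

Diversifying more of the surrounding binary drives this count up further, which is why the slide recommends rewriting large sections rather than only the patch itself.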
Conclusion • Main contribution: Guided search for instruction sequences to individualize binaries. • Future work • Extend range of superdiversified code. • Other types of instructions • Control-flow constructs • Optimize for better speed. • Adapt to custom byte-codes. • Modern instruction sets are geared towards generality and performance. • Custom byte-codes may be designed for individualization and obfuscation. • Instructions may perform arbitrary operations, not just serve as elementary building blocks.