EECS 354 Network Security • Reverse Engineering
Reverse Engineering • Reversing Basics • Preventing Reverse Engineering • Reversing High Level Languages • Reversing an ELF Executable
Anything is possible • There is no computer system in existence that cannot be reverse engineered • Most important limiting factors • Complexity • Time
Reversing by Language • Ruby, JavaScript, HTML, etc. • Not compiled (distributed as source) • Python, Java, C#, VB.NET, etc. • Byte compiled • Easier to decompile/inspect • Many symbols still exist in bytecode • Code obfuscators can obstruct decompilation • C, C++ • Compiled into native machine instructions • Much harder to decompile • Still possible to reverse engineer with a debugger and disassembler
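To make the "many symbols still exist in bytecode" point concrete, here is a minimal Python sketch. The function `check_password` and its contents are hypothetical, chosen only for illustration. The code object is serialized with `marshal` (the same serialization a `.pyc` file uses for its code object) and loaded back, and the names and string constants are still readable.

```python
import marshal


def check_password(guess):
    # Hypothetical function used only for illustration.
    secret = "hunter2"
    return guess == secret


# Serialize the compiled code object the way a .pyc file stores it,
# then load it back as if only the compiled form were available.
blob = marshal.dumps(check_password.__code__)
recovered = marshal.loads(blob)

print(recovered.co_name)      # function name
print(recovered.co_varnames)  # argument and local variable names
print(recovered.co_consts)    # constants, including string literals
```

None of this needs the original source file, which is why byte-compiled languages are comparatively easy to inspect.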
Scalability of techniques • Basic reversing techniques work for small code bases • It’s possible to determine what the assembly code for a 100-line C program does without too much difficulty • Do they scale up to larger code bases? • If you can crash an application or if it leaks an error message, maybe you'll get lucky
Windows • Is it possible to reverse engineer Windows? • How many lines of code does it have? • How long would it take?
Wine’s reverse engineering • The Wine project attempts to implement the Windows API • Project began in 1993, still unstable and incomplete • Has over 1.4 million lines of code (written by 700 contributors) • Does not cover all of Windows (core OS, windowing, etc.) • On the other hand, Samba (reverse engineering Windows file sharing) has been pretty successful
Wine's reverse engineering • Note the difference between reverse engineering and reimplementation • Wine reimplements the Windows API • Programs will expect the API to perform certain actions, return certain codes in certain conditions • Challenge is in exactly implementing this, otherwise you'll end up with undefined behavior
Why Reverse Engineering? • Defense • Security companies often reverse malware binaries • Protocol reversing for botnet analysis • Working with proprietary APIs or protocols • Hacking • Finding vulnerabilities is easier with the code
Preventing reverse engineering • Obfuscation • Translate code into something unreadable or unnatural • Must trick a human reader without tricking the machine interpreter/loader • Beyond its most basic form, reverse engineering is largely a fight against software obfuscation
Obfuscation Techniques • Renaming functions/variables • Adding bogus code with no side-effects • Removing whitespace • Encoding strings/numbers as hex values • Using “dynamic” code • JavaScript: eval • Java: GetName, GetAttribute • Python: getattr, setattr • Most of these are reversible • Except function/variable names can’t be recovered
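A small Python sketch of two techniques from this list, hex-encoded values and "dynamic" lookup through `getattr`. The use of `math.sqrt` is an arbitrary choice for illustration. The second half shows the readable form a reverser arrives at by decoding the hex and inlining the lookup.

```python
import math

# Obfuscated form: the function name never appears as a plain identifier.
# It is stored as hex and resolved at runtime with getattr.
hidden_name = bytes.fromhex("73717274").decode("ascii")
obfuscated_result = getattr(math, hidden_name)(0x51)

# Readable form recovered by decoding the hex and inlining the lookup.
plain_result = math.sqrt(81)

print(hidden_name, obfuscated_result, plain_result)
```

Both forms compute the same value, so the obfuscation is fully reversible here. Only a renamed identifier would lose information.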
Obfuscation Techniques • Packing • Storing an executable as a string (or otherwise) within an executable • Can make use of compression and encryption to hide contents • Decompression or decryption code must be packed in the executable as well • Complex packers exist for most languages
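The packing idea can be sketched in a few lines of Python. This is a toy under stated assumptions: `pack` and `unpack` are hypothetical helper names, compression uses `zlib`, and base64 stands in for the "string within an executable". Real packers are far more elaborate, but the structure is the same: the payload plus the code that restores it.

```python
import base64
import zlib


def pack(source: str) -> str:
    """Return a one-line Python stub that unpacks and runs `source`."""
    payload = base64.b64encode(zlib.compress(source.encode("utf-8"))).decode("ascii")
    return (
        "import base64, zlib; "
        f"exec(zlib.decompress(base64.b64decode('{payload}')).decode('utf-8'))"
    )


def unpack(stub: str) -> str:
    """Recover the original source from a stub produced by pack()."""
    payload = stub.split("b64decode('", 1)[1].split("'", 1)[0]
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")


original = "answer = 6 * 7\n"
stub = pack(original)

namespace = {}
exec(stub, namespace)  # running the stub executes the hidden code
print(namespace["answer"])
print(unpack(stub))
```

Because the decompression code has to ship with the payload, a reverser can reuse it, which is what `unpack` does.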
JavaScript Obfuscation • <script>eval(unescape('%3C%64%69%76%20%73%74'))</script> • <script>a = 't'; b = 'er'; c = 'a'; d = eval; e = '\"XSS\"'; d(c+'l'+b+a+'('+e+')');</script>
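Both snippets can be undone by hand without running them. The Python sketch below decodes the percent-escapes from the first snippet and rebuilds the string concatenation from the second. The encoded string as shown on the slide decodes to an incomplete fragment, and it is left that way here.

```python
from urllib.parse import unquote

# First snippet: unescape() decodes %XX sequences, so decode them
# instead of letting eval run the result.
decoded = unquote("%3C%64%69%76%20%73%74")

# Second snippet: rebuild the concatenation d(c+'l'+b+a+'('+e+')') by hand.
a, b, c, e = "t", "er", "a", '"XSS"'
rebuilt = c + "l" + b + a + "(" + e + ")"

print(decoded)
print(rebuilt)
```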
What is byte code? • Byte code is compiled code that cannot be executed directly by the processor • Distinct from machine code • Architecture independent • Executed by a software interpreter: a VM, a JIT compiler, etc. • Byte code is often dynamic • Symbols can be referenced at runtime • This means the program structure still exists and can be rebuilt
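As an illustration, Python exposes its own byte code through the standard `dis` module. The function `add` below is hypothetical. Exact opcode names differ between Python versions, so the sketch prints them rather than assuming specific ones.

```python
import dis


def add(a, b):
    # Hypothetical function used only for illustration.
    return a + b


# Raw byte code: a bytes object interpreted by the Python VM, not the CPU.
raw = add.__code__.co_code

# Decode the same bytes into named opcodes.
opnames = [ins.opname for ins in dis.get_instructions(add)]

print(raw.hex())
print(opnames)
```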
Decompilers • Decompilers reverse the steps taken by a compiler • Opcode translation • Abstract Syntax Tree construction • Python • Uncompyle2, decompyle, unpyc • Java • Jad, JD
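The Abstract Syntax Tree step can be seen in isolation with Python's standard `ast` module. This sketch assumes Python 3.9 or later (for `ast.unparse`) and an arbitrary one-line source string. It only covers tree construction and the tree-to-source step, not the opcode translation a real decompiler performs first.

```python
import ast

source = "total = price * quantity + tax"

# Forward direction (compiler front end): source text -> AST.
tree = ast.parse(source)
print(ast.dump(tree.body[0].value))

# Reverse direction (final decompiler step): AST -> source text.
regenerated = ast.unparse(tree)
print(regenerated)
```

Once a decompiler has rebuilt a tree like this from opcodes, emitting readable source is the easy part.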
Executables • Machine code is changed significantly from the original source code • Variables have been allocated to registers or somewhere in memory • Optimization steps have changed the program structure • No way to decompile this back to the original source • Machine instructions translate directly to assembly code • Disassembly analysis can be effective
Reversing Executables • We will be focusing on x86 32-bit LSB ELF executables • Run on x86 Linux • Contain an ELF header, program header, section table • ELF header contains program entry point, basic identifying information • Program header describes memory segments (e.g. where’s the stack? what parts of memory are r/w/x?) • Section table describes section layout (e.g. where’s the .rodata? .text? .bss?) • May contain a symbol table
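The ELF header fields mentioned above can be read with nothing more than `struct`. The sketch below follows the standard 52-byte ELF32 header layout. The helper name `parse_elf32_header` and the sample values packed into `sample` are assumptions made for illustration, not taken from a real binary.

```python
import struct

# ELF32 header, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (52 bytes total).
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")


def parse_elf32_header(data: bytes) -> dict:
    """Parse a 32-bit LSB ELF header into a dict of selected fields."""
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF32_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("not a 32-bit little-endian ELF")
    return {"type": e_type, "machine": e_machine, "entry": e_entry,
            "phoff": e_phoff, "phnum": e_phnum,
            "shoff": e_shoff, "shnum": e_shnum}


# Synthetic header with made-up values (ET_EXEC = 2, EM_386 = 3).
ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
sample = ELF32_HEADER.pack(ident, 2, 3, 1, 0x08048350, 52, 4416, 0,
                           52, 32, 9, 40, 30, 27)
header = parse_elf32_header(sample)
print(hex(header["entry"]), header["phnum"], header["shnum"])
```

For a real binary, the first 52 bytes of the file can be passed in instead of `sample`. `readelf -h` reports the same fields.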
x86 Assembly • mov • add, sub, shl, shr, sar, mul, div • and, or, xor • jmp, je, jne, jl, jg, jle, jge • cmp, test • call, push, pop, ret, nop • Memory operands (AT&T syntax): 0x8(%esp), -0xc(%ebp)
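Operands such as `0x8(%esp)` and `-0xc(%ebp)` are AT&T-syntax memory references meaning "the address in the register plus the displacement". The sketch below resolves that simple form only (no index or scale). The helper name and the register values are hypothetical.

```python
import re

# Matches an optional hex displacement followed by a base register.
OPERAND = re.compile(r"^(-?0x[0-9a-fA-F]+)?\(%([a-z]+)\)$")


def effective_address(operand: str, registers: dict) -> int:
    """Resolve an AT&T-syntax disp(%reg) operand to a 32-bit address."""
    match = OPERAND.match(operand)
    if match is None:
        raise ValueError(f"unsupported operand: {operand}")
    displacement = int(match.group(1), 16) if match.group(1) else 0
    base = registers[match.group(2)]
    return (base + displacement) & 0xFFFFFFFF


# Hypothetical register values, chosen only for illustration.
regs = {"esp": 0xBFFFEFE0, "ebp": 0xBFFFF000}
esp_slot = effective_address("0x8(%esp)", regs)
ebp_slot = effective_address("-0xc(%ebp)", regs)
print(hex(esp_slot), hex(ebp_slot))
```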
Reversing Basics • Basic tools: • file • strings • strace (and ltrace) • objdump • readelf • tcpdump • gdb • You can reverse anything with a good debugger, but…
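To show how simple some of these tools are at heart, here is a rough Python approximation of what `strings` does by default: report runs of at least four printable ASCII bytes. The function name and the sample bytes are made up for illustration. The real tool has more options, such as other encodings and offsets.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least `min_length` bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [m.decode("ascii") for m in pattern.findall(data)]


# Made-up bytes resembling a fragment of a binary.
sample_bytes = b"\x7fELF\x01\x01\x00/lib/ld-linux.so.2\x00\x03ab\x00Password: \x00"
found = extract_strings(sample_bytes)
print(found)
```

Hard-coded prompts, file paths, and error messages found this way are often the fastest first clue about what a binary does.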
Reversing Frameworks • For more advanced reversing, it may help to have more than just a debugger • IDA • https://www.hex-rays.com/products/ida/pix/idalarge.gif • Radare • http://radare.org/doc/html/contents.html
ELF Obfuscation • There are some additional techniques for obfuscating executable formats: • Storing data in unusual sections: .ctors, .dtors, .init, etc • “Corrupting” the ELF header • Stripping the symbol table • Checking ptrace to prevent debuggers • Packing • Code is unpacked dynamically during execution
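A related debugger check can be sketched in Python. Note that this is a different mechanism from calling `ptrace` directly. It reads the `TracerPid` field that Linux exposes in `/proc/<pid>/status`, which is nonzero while a tracer such as gdb or strace is attached. The function names are hypothetical.

```python
def tracer_pid(status_text: str) -> int:
    """Return the TracerPid value from /proc/<pid>/status contents."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1])
    raise ValueError("TracerPid field not found")


def is_being_traced(status_path: str = "/proc/self/status") -> bool:
    """Linux only: True if a tracer is attached to this process."""
    with open(status_path) as handle:
        return tracer_pid(handle.read()) != 0


# Synthetic status text so the parser can be exercised on any platform.
example_status = "Name:\tdemo\nState:\tS (sleeping)\nTracerPid:\t1234\nUid:\t0\n"
print(tracer_pid(example_status))
```

A reverser can defeat checks like this by patching the comparison or by intercepting the read, which is why they slow analysis down rather than prevent it.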
Demo... Source: http://crackmes.de/users/synamics/xrockmr/