Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits. Authors: Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar. Published in: USENIX Security Symposium 2003. Presented by: Hua Zhang.
Contributions • It systematically protects against a wide range of attacks that exploit memory programming errors • It can be easily applied to existing legacy code without modifying the source code or the underlying operating system • It can be applied selectively to protect security-critical applications without needing to change the rest of the system • The transformation is fast and introduces only a low runtime overhead
Outline • Introduction • Stack Smashing • Address Obfuscation – Transformations • Address Obfuscation – Implementation • Address Obfuscation – Effectiveness • Conclusion
Introduction • Attacks that exploit memory programming errors are one of today’s most serious security threats • Such attacks require the attacker to have an in-depth understanding of the internal details of a victim program • Program Obfuscation • A technique to prevent such an understanding • Code Obfuscation • Prevents such an understanding and reverse engineering • Address Obfuscation • Each time the transformed code is executed, the virtual addresses of the code and data are randomized
Memory Layout of a Typical Binary • Code • Machine code • Read-only • Data and BSS • Global variables • Initialized or uninitialized • Not executable • Stack • Local variables • Parameters, return addresses • Heap • Memory allocated dynamically during execution, e.g. by malloc()
Stack Smashing • A stack-allocated buffer can be intentionally overflowed to overwrite the return address • The attacker must • Guess the right value to put into the faked return address • Guess the location of the return address on the stack relative to the overflowed buffer
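The overwrite described on this slide can be modeled in a few lines. The sketch below is a deliberately simplified illustration, not code from the paper: the frame holds only a buffer followed by a 4-byte little-endian return address (a real frame also holds a saved frame pointer and other data), and the names `build_frame`, `unchecked_copy`, and the two addresses are hypothetical.

```python
import struct

BUFFER_SIZE = 16  # hypothetical size of the stack-allocated buffer


def build_frame(buffer_size: int, return_address: int) -> bytearray:
    """Model a stack frame: a local buffer followed by a 4-byte return address."""
    return bytearray(buffer_size) + bytearray(struct.pack("<I", return_address))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of the frame with no bounds check, like strcpy()."""
    frame[0:len(data)] = data


def read_return_address(frame: bytearray, buffer_size: int) -> int:
    """Read back the 4-byte return address stored after the buffer."""
    return struct.unpack_from("<I", frame, buffer_size)[0]


frame = build_frame(BUFFER_SIZE, 0x08048000)  # legitimate return address
# The attacker fills the buffer, then appends a guessed address of injected code.
payload = b"A" * BUFFER_SIZE + struct.pack("<I", 0xBFFFF000)
unchecked_copy(frame, payload)
hijacked = read_return_address(frame, BUFFER_SIZE)
```

The payload only works because the attacker knows both the distance from the buffer to the return address (`BUFFER_SIZE` here) and a usable absolute address to write there. These are the two guesses that address obfuscation tries to make unreliable.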
How Address Obfuscation Works • Two ways to exploit a memory error • Overwriting a pointer • Code pointer or data pointer • Made to point to the address of data or code chosen by the attacker • Requires the attacker to know the absolute address of such data or code • Overwriting non-pointer data • Code is protected and cannot be overwritten • An example is overwriting the arguments to chmod and execve • Requires the attacker to know the relative distance between a buffer and the location of the data to be overwritten
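A small model can show why the two attack classes need different knowledge, and which kind of randomization defeats each. This is only a sketch under assumed values: the `layout` helper, the 64-byte buffer, the base address, and the random ranges are all hypothetical.

```python
import random

BUFFER_SIZE = 64  # hypothetical size of the vulnerable buffer


def layout(base: int, gap: int = 0) -> dict:
    """Place a buffer at `base` and a sensitive variable after it."""
    return {"buffer": base, "target": base + BUFFER_SIZE + gap}


# What the attacker learned from an untransformed copy of the program.
reference = layout(base=0x0804A000)
known_absolute = reference["target"]
known_distance = reference["target"] - reference["buffer"]

rng = random.Random(1)
# Base randomization: every address moves, the relative distance does not.
shifted = layout(base=0x0804A000 + rng.randrange(1, 10**8))
# Random padding: the relative distance itself changes.
padded = layout(base=0x0804A000, gap=rng.randrange(1, 256))

absolute_guess_works = shifted["target"] == known_absolute
relative_guess_survives_shift = (
    shifted["target"] - shifted["buffer"] == known_distance
)
relative_guess_survives_padding = (
    padded["target"] - padded["buffer"] == known_distance
)
```

In this model, shifting the base breaks the absolute-address guess but leaves the relative distance intact. That is why base randomization alone is not enough against non-pointer overwrites, and why permutation or padding is also needed.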
Transformations in Address Obfuscation - 1 • Randomize the base address of memory regions • Randomize the base address of the stack • All addresses on the stack are randomized • Makes it very difficult to find the address of injected code and point the return address to it • Randomize the base address of the heap • Protects against attacks where code is injected into the heap and a buffer overflow then redirects a pointer to this address • Randomize the starting address of dynamically-linked libraries • Randomize the locations of routines and static data in the executable
Transformations in Address Obfuscation - 2 • Permute the order of variables/routines • Make it difficult to overwrite data without corrupting other data that is critical for continued execution of the program • Three possible ways • Permute the order of local variables in a stack frame • Permute the order of static variables • Permute the order of routines in shared libraries or the routines in the executable
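Permuting the order of variables can be sketched as shuffling a declaration list before assigning offsets. The function `assign_offsets`, the variable names, and their sizes are hypothetical, and alignment is ignored for brevity.

```python
import random


def assign_offsets(variables, rng):
    """Return {name: offset} after shuffling the declaration order.

    `variables` is a list of (name, size_in_bytes) pairs.
    """
    order = list(variables)
    rng.shuffle(order)
    offsets, cursor = {}, 0
    for name, size in order:
        offsets[name] = cursor
        cursor += size
    return offsets


frame_vars = [("buf", 64), ("is_admin", 4), ("counter", 4), ("path", 32)]

# Collect the distinct layouts produced by 20 different seeds.
layouts = {
    tuple(sorted(assign_offsets(frame_vars, random.Random(seed)).items()))
    for seed in range(20)
}
```

Each seed stands in for one transformed copy of the program. Whether `is_admin` sits directly after `buf`, or before it, differs between copies, so an overflow of `buf` no longer reliably reaches the same neighbor.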
Transformations in Address Obfuscation - 3 • Introduce random gaps between objects • Locations of objects can be randomized • Four possible ways • Introduce random padding into stack frames • Introduce random padding between successive malloc allocation requests • Introduce random padding between variables in the static area • Introduce gaps within routines, and add jump instructions to skip over these gaps
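Random gaps between objects can be sketched the same way for stack frames, the static area, or successive heap allocations. In this minimal illustration, `place_with_gaps` and the `max_gap` bound are assumptions chosen for the example, not values from the paper.

```python
import random


def place_with_gaps(sizes, rng, max_gap=64):
    """Lay objects out in order, inserting a random gap before each one.

    Returns the list of start offsets. `max_gap` is an illustrative bound.
    """
    offsets, cursor = [], 0
    for size in sizes:
        cursor += rng.randrange(0, max_gap + 1)  # random padding before the object
        offsets.append(cursor)
        cursor += size
    return offsets


object_sizes = [64, 4, 4, 32]
offsets_run_a = place_with_gaps(object_sizes, random.Random(11))
offsets_run_b = place_with_gaps(object_sizes, random.Random(12))
```

The order of objects is unchanged, but the distance between any two of them depends on the random gaps, so a relative offset learned from one copy need not hold in another.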
Implementation Issues - 1 • When to perform the transformations • Compile-time, link-time, installation-time, or load-time • Compile-time transformation gives better performance • Load-time transformation does not require special compilers or linkers, and can be applied to binary programs without source code • The paper chooses link-time
Implementation Issues - 2 • When to determine the transformation amounts • Transformation time • Best performance • But the randomization will be the same every time the program is executed; a possible solution is periodic re-transformation • Beginning of program execution • Continuously changing during execution • Most difficult to attack • Not good for performance • The paper chooses transformation time
Implementation Approach - 1 • Approach • At binary level • Inserting additional code with the LEEL binary-editing tool • Only rewriting routines that can be completely analyzed • Safe rewriting of machine code requires understanding the complete control-flow graph, which is difficult because of • Data intermixed with code • Indirect jumps and calls
Implementation Approach - 2 • Stack base address randomization • Done by adding extra code to the text segment of the program • The code is spliced into execution by inserting a jump instruction at the beginning of the main routine • It decrements the stack pointer by a random number between 1 and 10^8 • This gap is write-protected using the mprotect system call • An overflow beyond the base of the stack into this area will cause a crash
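The effect of the inserted startup code can be simulated as follows. This is a sketch, not the paper's implementation: it assumes the upper bound of the random decrement is 10^8, treats the protected gap as an exact address range (real `mprotect` works on whole pages), and signals a faulting write with a Python exception. `SimulatedStack` and the base address are hypothetical.

```python
import random

MAX_SHIFT = 10**8  # assumed upper bound of the random decrement


class SimulatedStack:
    """Toy model of a downward-growing stack with a randomized base."""

    def __init__(self, original_base: int, rng: random.Random):
        shift = rng.randint(1, MAX_SHIFT)
        self.original_base = original_base
        # The program's frames now start below this shifted stack pointer.
        self.stack_pointer = original_base - shift

    def write(self, address: int) -> None:
        """Reject writes into the gap [stack_pointer, original_base)."""
        if self.stack_pointer <= address < self.original_base:
            raise MemoryError("write into protected gap (a real process would fault)")


stack = SimulatedStack(0xBFFFF000, random.Random(3))
shift_used = stack.original_base - stack.stack_pointer
```

A write just below `stack.stack_pointer` (inside the program's own frames) is accepted, while a write that runs upward past the randomized base lands in the gap and is rejected, which mirrors the crash described on the slide.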
Implementation Approach - 3 • DLL base address randomization • To prevent attacks that jump to library code • Two options • Dynamically randomize library addresses using mmap • Implemented by a wrapper to mmap • Location of shared memory will be different for every execution • Statically randomize library addresses at link-time • Implemented by dynamically linking the executable with a dummy shared library • No change to the loader or rest of the system
Implementation Approach - 4 • Text/Data segment randomization • Prevents attacks that • Modify a static variable • Jump to existing program code • Two approaches • Compile to a shared library and create a new main to load this library and call the old main • Code in a shared library is position-independent • Less efficient than its address-dependent counterpart • Relocate the program’s code and data at link-time • No performance overhead
Implementation Approach - 5 • Random stack frame padding • Pushing extra storage onto the stack during the initialization phase of each subroutine • Two issues • Padding size • Static – no runtime overhead • Dynamic • Placement of padding • Between the base pointer and local variables • Before parameters to function
Implementation Approach - 6 • Heap randomization • Code that allocates a randomly-sized large chunk of memory is added • Wrapper functions are used to intercept calls to malloc • Dynamic memory allocation requests are randomly increased by 0 to 25%
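Both heap measures on this slide can be sketched with a toy allocator. `RandomizedHeap` is a hypothetical bump allocator standing in for a wrapped `malloc`, and the bound on the initial chunk size is an assumption made for this example.

```python
import random


def padded_request(size: int, rng: random.Random) -> int:
    """Return the request size after adding random padding of 0 to 25%."""
    extra = rng.randint(0, size // 4)
    return size + extra


class RandomizedHeap:
    """Toy bump allocator standing in for malloc(); not a real allocator."""

    def __init__(self, rng: random.Random, max_initial_chunk: int = 1 << 20):
        self.rng = rng
        # A randomly sized first allocation shifts every later heap address.
        self.cursor = rng.randint(1, max_initial_chunk)

    def malloc(self, size: int) -> int:
        """Return the address of a block and advance past the padded request."""
        address = self.cursor
        self.cursor += padded_request(size, self.rng)
        return address


heap = RandomizedHeap(random.Random(5))
first = heap.malloc(100)
second = heap.malloc(100)
```

The random initial chunk changes the absolute address of every block, while the per-request padding changes the distance between consecutive blocks, here somewhere between 100 and 125 bytes for a 100-byte request.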
Effectiveness • It is critical to have an estimate of the increase in attacker workload, because • Address obfuscation is not foolproof • It is a probabilistic technique • The paper gives a mathematical analysis of its effectiveness against different kinds of attacks • For example, the success rate of a single attack • Stack smashing: 4/(2.5 × 10^4) • Existing code attacks: 4 × 10^-5
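To turn a per-attempt success rate into attacker workload, the figures on this slide can be plugged into standard formulas for repeated independent trials. The sketch reads the slide's values as 4/(2.5 × 10^4) and 4 × 10^-5, and it assumes attempts are independent, which holds only if the layout is re-randomized between attempts. The function names are hypothetical.

```python
def expected_attempts(p_success: float) -> float:
    """Mean number of independent attempts until the first success."""
    return 1.0 / p_success


def p_success_within(p_success: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - p_success) ** attempts


p_stack = 4 / (2.5 * 10**4)   # stack smashing, as read from the slide
p_existing_code = 4e-5        # existing code attacks, as read from the slide

attempts_stack = expected_attempts(p_stack)
attempts_existing_code = expected_attempts(p_existing_code)
```

Under these assumptions the expected number of attempts is about 6,250 for stack smashing and about 25,000 for existing code attacks. Since a failed attempt usually crashes the target, that many failures are also likely to be noticed.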
Conclusion • Strong Points • A comprehensive study of approaches for address obfuscation • A real tool is implemented • Mathematical analysis of the effectiveness • Weak Points • Insufficient background information • Only some routines can be transformed • Transformed binaries or the memory map of these binaries can be accessed by the attacker to extract random values from the binary