Annual Computer Security Applications Conference (ACSAC) 2012. Using Memory Management to Detect and Extract Illegitimate Code for Malware Analysis. Carsten Willems 1 , Thorsten Holz 1 , Felix Freiling 2 1 Ruhr-University Bochum, Germany 2 University of Erlangen, Germany. 左昌國
左昌國, Seminar @ ADLab, NCU-CSIE
Outline • Introduction • Related Work • Model and Definitions • Approach • Implementation • Application to the Analysis of PDF Documents • Detection Evaluation • Extraction Evaluation • Conclusion
Introduction • Exploiting Software • The ultimate aim is to perform malicious computation • Executing illegitimate code (a.k.a. shellcode) • Countermeasures in OSs • Data Execution Prevention (DEP) • Address Space Layout Randomization (ASLR) • These countermeasures do not help in the analysis of illegitimate code
Introduction • CWXDetector • A tool for the analysis of malware for the Windows OS • Dynamic analysis for detecting and extracting illegitimate code • Using memory management techniques • For analysis • CWXDetector is not meant to protect a system • But to monitor and analyze the illegitimate code
Introduction • Limitations • Dynamic analysis • Could get incomplete results • Cannot detect malicious code embedded in arbitrary data (only code that is actually executed) • Malicious computation does not always imply the existence of illegitimate code • Return Oriented Programming (ROP) • JIT-spraying
Related Work • Preventive Measures • Detection of Illegitimate Code • Extraction of Illegitimate Code
Related Work – Preventive Measures • Microsoft EMET tool (Enhanced Mitigation Experience Toolkit) • http://blogs.technet.com/b/srd/archive/2012/05/15/introducing-emet-v3.aspx • CFI • Cons: • Self-modifying or dynamically created code • No assistance to further analysis
Related Work – Detection of Illegitimate Code • Static signatures • “sled component” searching • Heuristics-based approaches • Cons: • Not generic • Have to be extended whenever new anti-detection techniques come up
Related Work – Extraction of Illegitimate Code • OllyBone • OmniUnpack • Renovo • Cons: • OllyBone: debugger-driven malware analysis • OmniUnpack: relies on signatures, false positives (FP) • Renovo: requires an emulation environment
Model and Definitions • Attacker Model • A remote attacker provides some malicious piece of data to exploit a vulnerable application • Resulting in the execution of shellcode • Not single-staged full-ROP/JIT-sprayed attacks • Illegitimate code (ILC) • Code that is not legitimate (it would not execute if the application functioned properly)
Approach • Enforcing an Invariant • When a vulnerability is exploited, the control flow is redirected to one of the following locations: • ILC on the stack (buffer overflow) • ILC in the heap (heap-spraying) • ILC in a static data area (exploiting a static data buffer) • The approach enforces the following invariant • All ILC resides in non-executable memory • All attempts to execute ILC result in a page fault exception
Approach • Trusted Files and Functions • All files that exist before the analysis are trusted. • All files created or modified during later operation are untrusted. • Trusted memory modification functions • Trusted callers
Approach • Memory Protection Modifications • Only trusted callers can allocate executable memory. • Only trusted callers can modify existing memory to be executable. • Only trusted files can be mapped into executable memory. • All attempts that violate these rules are intercepted (hooked) and the resulting memory regions become non-executable. • Exception: • If the target memory belongs to a mapped trusted file and is writable, the executable right has to be removed. • Enforcing W ^ X
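The rules on this slide can be read as a small decision function. The following is a minimal Python sketch of that reading, not the tool's kernel code: the `Request` type, its field names, and `grant_execute` are hypothetical, and the exception is applied narrowly (only to writable memory backed by a trusted file), which is an assumption about the slide's wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical model of one intercepted memory request."""
    kind: str                    # "allocate", "protect", or "map"
    wants_execute: bool          # requested protection includes execute
    wants_write: bool            # requested protection includes write
    caller_trusted: bool         # outcome of the caller check
    backed_by_trusted_file: bool = False  # target is a mapped trusted file


def grant_execute(req: Request) -> bool:
    """Return True if the execute right may remain on the resulting region."""
    if not req.wants_execute:
        return False
    if req.kind in ("allocate", "protect"):
        allowed = req.caller_trusted          # only trusted callers
    elif req.kind == "map":
        allowed = req.backed_by_trusted_file  # only trusted files
    else:
        raise ValueError(f"unknown request kind: {req.kind}")
    # Exception from the slide: writable memory of a mapped trusted file
    # loses the execute right (W ^ X).
    if req.backed_by_trusted_file and req.wants_write:
        allowed = False
    return allowed
```

A request for which `grant_execute` returns `False` would still succeed in this model, only without the execute right, so a later execution attempt raises a page fault.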
Approach • Custom Page Fault Handler • When a page fault is triggered • Check whether it is related to the system • If so, • Dump the memory • Modify the memory to be executable and continue • Multi Version Dumping • Different versions of each executed page may be created • Compare the dumped records
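The handler logic and multi-version dumping can be illustrated with a user-space simulation. This is a sketch under stated assumptions: `Page`, `Monitor`, the `forced_nx` flag, and the use of SHA-256 to compare dumped versions are all hypothetical choices; the slides only say that dumped records are compared.

```python
import hashlib
from dataclasses import dataclass, field

PAGE_SIZE = 4096


@dataclass
class Page:
    data: bytes
    executable: bool = False
    forced_nx: bool = False  # True if the monitor removed the execute right


@dataclass
class Monitor:
    # page number -> list of (digest, contents), one entry per distinct version
    dumps: dict = field(default_factory=dict)

    def on_execute_fault(self, memory: dict, address: int) -> bool:
        """Handle an execute fault; return True if the monitor handled it."""
        page_no = address // PAGE_SIZE
        page = memory.get(page_no)
        if page is None or not page.forced_nx:
            return False  # unrelated fault: leave it to the regular handler
        versions = self.dumps.setdefault(page_no, [])
        digest = hashlib.sha256(page.data).hexdigest()
        # Multi-version dumping: store the page only if its contents differ
        # from every version dumped so far.
        if all(old_digest != digest for old_digest, _ in versions):
            versions.append((digest, page.data))
        page.executable = True  # let execution continue
        return True
```

If a page is rewritten and executed again after the execute right was removed once more, the second fault produces a second dumped version of the same page.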
Implementation • CWXDetector • For the x86 version of Windows XP • Memory Function Hooks • NtAllocateVirtualMemory • NtProtectVirtualMemory • NtMapViewOfSection • Checking the Caller • Trace the user-mode call stack • Custom Page Fault Handler • Hook MmAccessFault • Check whether the fault was caused by an execute operation • Check whether the fault address resides in user space • Additional • NtCreateFile • NtCreateProcess
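The caller check can be pictured as a lookup of call-stack return addresses in a whitelist of trusted function ranges. The slides do not state the exact matching rule, so the sketch below assumes that a request counts as trusted when at least one frame of the user-mode call stack lies inside a whitelisted range; `is_trusted_caller` and the range representation are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

AddressRange = Tuple[int, int]  # half-open interval [start, end)


def is_trusted_caller(return_addresses: Sequence[int],
                      trusted_ranges: Iterable[AddressRange]) -> bool:
    """Return True if any stack frame lies inside a trusted function range.

    Assumption: one matching frame is enough; the original tool may use a
    stricter rule.
    """
    ranges = list(trusted_ranges)
    return any(
        start <= address < end
        for address in return_addresses
        for start, end in ranges
    )
```

An empty call stack yields `False`, so a request whose origin cannot be traced is treated as untrusted in this model.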
Application to the Analysis of PDF Documents • 32-bit Windows XP SP2 • Adobe Acrobat Reader versions 6.0.0, 7.0.0, 7.0.7, 8.1.1, 8.1.2, 8.1.6, 9.0.0, 9.2.0, and 9.3.0 • Foxit Reader version 3.0.0 • 1. installed the customized page fault handler and the system hooks • 2. started the particular viewer application • 3. disabled DEP for the viewer application • 4. opened the PDF document • 5. enforced the invariant: newly allocated memory and modified memory
Application to the Analysis of PDF Documents • 6. if the execution of ILC was detected, dumped the memory page to a file, created a log entry, and modified the related PTE to be executable. • Checked the dumped file for patterns; if found, marked as “PATTERN” • 7. if a new process was created by the PDF viewer • marked as “PROCESS” and prevented the process from spawning • 8. if a dialog window was shown • marked as “DIALOG” and simulated a user input to close the window • 9. if the viewer crashed, marked as “CRASH” • 10. on timeout, marked as “NOTHING”
Application to the Analysis of PDF Documents • The log file contains information about: • All attempts to allocate executable memory invoked by untrusted callers • All attempts to modify existing memory to be executable invoked by untrusted callers • All attempts to execute memory that contains ILC • All created files • All created processes • All shown user dialog windows
Application to the Analysis of PDF Documents • Every PDF file ended up with a tuple (d, c) • d: whether illegitimate code was detected • c: one of {PATTERN, CRASH, PROCESS, DIALOG, NOTHING} • (d, c) > (d’, c’) iff either ILC was detected for d but not for d’, or d = d’ and c > c’ • PATTERN > CRASH > PROCESS > DIALOG > NOTHING
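The ordering on result tuples is lexicographic: detection first, then the outcome class. A minimal Python sketch follows; the names `Outcome`, `result_key`, and `best_result` are hypothetical, and using the ordering to pick the strongest of several runs is an assumption about its purpose.

```python
from enum import IntEnum


class Outcome(IntEnum):
    # Larger value = higher rank, matching
    # PATTERN > CRASH > PROCESS > DIALOG > NOTHING.
    NOTHING = 0
    DIALOG = 1
    PROCESS = 2
    CRASH = 3
    PATTERN = 4


def result_key(detected: bool, outcome: Outcome):
    """Sort key: ILC detection dominates, the outcome class breaks ties."""
    return (int(detected), int(outcome))


def best_result(results):
    """Return the greatest (d, c) tuple under the ordering above."""
    return max(results, key=lambda result: result_key(*result))
```

For example, `(True, Outcome.NOTHING)` ranks above `(False, Outcome.PATTERN)`, because a run that detected ILC always outranks one that did not.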
Application to the Analysis of PDF Documents • Determining Trusted Callers • Identifying all the functions from all trusted files that are used to produce executable memory • Loader-related functions in ntdll.dll • Tested benign PDF documents and manually inspected the function calls
Detection Evaluation • Benign PDF Sampleset • Retrieved the URLs of the top 5,000 sites from Alexa • Queried Google for the first 10 PDF documents on each site • Using the tool pdfid, selected all documents that contained JavaScript, OpenActions or some other extended PDF features • Uniformly picked random samples from the other files, for a total of 7,218 samples • Running the benign set on the system • No ILC execution detected (FP 0%)
Detection Evaluation • Malicious PDF Sampleset • 7,278 known malicious PDF documents from a well-known AV vendor • All of their valid PDF samples from Jan. 2011
Detection Evaluation • 15 “CRASH”: • All performed invalid memory accesses (before executing ILC) • 33 “PROCESS”: • Used regular built-in features to start IE or OE with specially crafted parameters • 295 “DIALOG”: • Invalid embedded JavaScript code • Social engineering • 154 “NOTHING”: • Required a specific environment
Extraction Evaluation • Adobe Acrobat Reader 9.0.0 • 4,869 samples • Quality: • Valid x86 instructions (code ratio) • Contained strings • Partitions
Extraction Evaluation • Analyzed all 4,869 PDF samples and got 2 versions of partitions • Initial partition • Final partition • Code ratio determined by IDA Pro • Valid strings
Extraction Evaluation • Initial partitions • 7,807 strings • 1,866 URLs • Final partitions • 8,676 strings • 2,280 URLs
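Counting strings and URLs in dumped pages can be approximated with a `strings`-style scan. The slides do not say how the strings were extracted, so the sketch below is an assumption throughout: the minimum length of 4, the printable-ASCII definition, the URL pattern, and the function names are hypothetical.

```python
import re

# Runs of at least four printable ASCII bytes (assumed threshold).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Simple http/https URL pattern applied to the extracted strings.
URL = re.compile(r"https?://[^\s\"'<>]+")


def extract_strings(dump: bytes) -> list:
    """Return all printable ASCII runs found in a memory dump."""
    return [match.group().decode("ascii") for match in PRINTABLE_RUN.finditer(dump)]


def extract_urls(dump: bytes) -> list:
    """Return all URLs contained in the printable strings of a dump."""
    urls = []
    for text in extract_strings(dump):
        urls.extend(URL.findall(text))
    return urls
```

Applied to each partition's dumped pages, the lengths of the two result lists would give string and URL counts comparable in kind to those on the slide.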
Conclusions • A generic and automatic method to detect and extract illegitimate code • Effective in supporting malware analysis • Also good detection rates