Annual Computer Security Applications Conference (ACSAC) 2012. Down to the Bare Metal: Using Processor Features for Binary Analysis. Carsten Willems¹, Ralf Hund¹, Andreas Fobian¹, Thorsten Holz¹, Amit Vasudevan². ¹Ruhr-University Bochum, Germany; ²Carnegie Mellon University.
Annual Computer Security Applications Conference (ACSAC) 2012 • Down to the Bare Metal: Using Processor Features for Binary Analysis • Carsten Willems¹, Ralf Hund¹, Andreas Fobian¹, Thorsten Holz¹, Amit Vasudevan² • ¹Ruhr-University Bochum, Germany • ²Carnegie Mellon University • Presented by 左昌國, 2013/02/25, Seminar @ ADLab, NCU-CSIE
Outline • Introduction • Software Emulators • Delusion Attacks • Binary Analysis with Branch Tracing • Experiments • Limitations • Conclusion
Introduction • Binary (malware or vulnerable software) analysis • Static • Dynamic • Number of execution paths • (For behavior analysis) Monitor every instruction or only critical points • Native machine or emulation/virtualization
Introduction • Native Machine • The analysis result must be unaffected by malicious code • Reverting to clean states • Lack of monitoring abilities • Emulator • Artificial environment detection • Delusion attacks • No explicit test
Introduction • Contributions: • Introducing several delusion attacks • An approach to perform behavior analysis • Branch tracing feature of x86 CPU • Implementing a prototype that shows the usefulness of this approach
Software Emulators • BOCHS • QEMU • Dynamic Translation • Guest code block (before branch) → intermediate code → optimization → translated to a host instruction code block (Translation Block, TB) → TBs saved in the code cache • Isolated Memory • BitBlaze and Anubis • Taint Propagation Tracking
Delusion Attacks - Motivation • Current emulator detection techniques consist of 2 steps: • (1) Probing for the existence of a non-native system environment • (2) Performing different actions depending on the outcome of (1) • These techniques are easy to spot and mitigate • For example, with powerful analysis methods like multi-path execution • This paper proposes detection methods that have no explicit check and contain no conditional branch
Delusion Attacks – Basic Principle • Self-Modifying Code (SMC) • On a native system, handling SMC correctly is complicated • Instruction prefetch • Multi-processor environment • Modern CPUs handle these problems correctly • In an emulator, the CPU facilities for SMC detection cannot be used • SMC detection must be implemented in software • Keeping a list of the addresses of all instructions → huge overhead • Most emulators (like QEMU) use page fault handling for SMC detection • All executable memory pages are set read-only • A memory write to executable memory triggers the page fault handler • (In the handler) If the target memory should be writable (it is writable in the guest OS): • The memory protection is changed to writable • The memory write instruction is executed again • The memory protection is changed back to read-only
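The page-fault scheme in the bullets above can be sketched as a toy model. This is only an illustration, not QEMU's actual implementation: the class `ToyEmulatorMemory`, its fields, and the `invalidated` list (standing in for discarded translated code) are assumptions introduced here.

```python
class ToyEmulatorMemory:
    """Toy model of page-fault-based SMC detection (hypothetical, not QEMU code)."""

    PAGE_SIZE = 4096

    def __init__(self, size, executable_pages, guest_writable_pages):
        self.data = bytearray(size)
        # All executable pages start out read-only so writes to code are noticed.
        self.read_only = set(executable_pages)
        # Pages the guest OS itself considers writable.
        self.guest_writable = set(guest_writable_pages)
        self.faults = 0
        self.invalidated = []  # assumption: pages whose translated code is stale

    def write(self, addr, value):
        page = addr // self.PAGE_SIZE
        if page in self.read_only:
            self._page_fault(page, addr, value)
        else:
            self.data[addr] = value

    def _page_fault(self, page, addr, value):
        self.faults += 1
        if page not in self.guest_writable:
            raise PermissionError("guest page is really read-only")
        self.read_only.discard(page)   # 1. make the page writable
        self.data[addr] = value        # 2. execute the write again
        self.invalidated.append(page)  #    code on this page has changed
        self.read_only.add(page)       # 3. make the page read-only again


mem = ToyEmulatorMemory(8192, executable_pages={0}, guest_writable_pages={0})
mem.write(10, 0x90)    # write into the code page: goes through the fault handler
mem.write(5000, 0x01)  # write into a data page: no fault
```

In this sketch every write to the code page pays for one fault and two protection changes, which is the window the REP MOVS attack on the following slides relies on.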
Delusion Attacks – REP MOVS • rep movs instruction • Copies a number of bytes, words, or double words within an implicit loop • esi: source memory location • edi: destination memory location • ecx: loop counter, decremented by 1 in each iteration; the loop stops at 0 • On a real machine, the copy loop executes atomically • In an emulator, if the destination is a code address: • The first loop iteration triggers the page fault handler • The handler makes the page writable, re-executes the write operation, and makes the page read-only again • The instruction is re-read from memory (second loop iteration) • …
Delusion Attacks – REP MOVS (on a real machine) • Setup: lea eax, BENIGNCODE; lea ebx, MALICIOUSCODE; lea esi, NEW; lea edi, OLD; mov ecx, 2 • Code at OLD: OLD+0x0: rep movsd; OLD+0x2: nop; OLD+0x3: nop; OLD+0x4: call eax //BENIGNCODE; OLD+0x6: nop; OLD+0x7: nop; ret • Data at NEW: NEW+0x0: nop; NEW+0x1: nop; NEW+0x2: nop; NEW+0x3: nop; NEW+0x4: call ebx //MALICIOUSCODE; NEW+0x6: nop; NEW+0x7: nop • The loop copies both double words: ecx = 2 → 1 → 0 • eip moves from OLD+0x0 to OLD+0x2, and the copied code runs: call ebx //MALICIOUSCODE
Delusion Attacks – REP MOVS (in QEMU) • Setup: lea eax, BENIGNCODE; lea ebx, MALICIOUSCODE; lea esi, NEW; lea edi, OLD; mov ecx, 2 • Code at OLD: OLD+0x0: rep movsd; OLD+0x2: nop; OLD+0x3: nop; OLD+0x4: call eax //BENIGNCODE; OLD+0x6: nop; OLD+0x7: nop; ret • Data at NEW: NEW+0x0: nop; NEW+0x1: nop; NEW+0x2: nop; NEW+0x3: nop; NEW+0x4: call ebx //MALICIOUSCODE; NEW+0x6: nop; NEW+0x7: nop • First iteration: the page is read-only → page fault → page made writable → first double word written → page made read-only again • The instruction is re-read from memory: OLD+0x0 now holds nop, so the loop ends early (ecx = 2 → 1 → 1) • eip moves from OLD+0x0 to OLD+0x1; the second double word is never copied, so the original code runs: call eax //BENIGNCODE
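The difference between the two REP MOVS slides can be replayed with a small simulation. It is a sketch under simplifying assumptions: only the handful of opcodes from the slides are decoded, addresses are offsets relative to OLD and NEW, and the emulator is modeled simply as "re-fetch the instruction after each loop iteration". The function name `run_rep_movs_demo` is made up for this example.

```python
REP_MOVSD = bytes([0xF3, 0xA5])
CALL_EAX = bytes([0xFF, 0xD0])   # eax = BENIGNCODE
CALL_EBX = bytes([0xFF, 0xD3])   # ebx = MALICIOUSCODE
NOP, RET = 0x90, 0xC3


def run_rep_movs_demo(refetch_each_iteration):
    """Return (called target, final ecx) for the snippet on the slides."""
    # Code at OLD: rep movsd; nop; nop; call eax; nop; nop; ret
    mem = bytearray(REP_MOVSD + bytes([NOP, NOP]) + CALL_EAX + bytes([NOP, NOP, RET]))
    # Data at NEW: nop x4; call ebx; nop; nop
    new = bytes([NOP] * 4) + CALL_EBX + bytes([NOP, NOP])
    eip, ecx, esi, edi = 0, 2, 0, 0
    while True:
        op = bytes(mem[eip:eip + 2])
        if op == REP_MOVSD:
            while ecx > 0:
                mem[edi:edi + 4] = new[esi:esi + 4]  # copy one double word
                esi, edi, ecx = esi + 4, edi + 4, ecx - 1
                # Emulator model: the instruction is decoded again from memory.
                if refetch_each_iteration and bytes(mem[eip:eip + 2]) != REP_MOVSD:
                    break  # the loop overwrote itself; eip stays where it is
            else:
                eip += 2   # loop ran to completion
        elif op == CALL_EAX:
            return "BENIGNCODE", ecx
        elif op == CALL_EBX:
            return "MALICIOUSCODE", ecx
        elif mem[eip] == NOP:
            eip += 1
        elif mem[eip] == RET:
            return None, ecx
        else:
            raise ValueError(f"unknown opcode at offset {eip}")


native = run_rep_movs_demo(refetch_each_iteration=False)
emulated = run_rep_movs_demo(refetch_each_iteration=True)
```

Under these assumptions the native run ends in MALICIOUSCODE with ecx = 0 and the emulated run ends in BENIGNCODE with ecx = 1, matching the register values annotated on the slides. No conditional branch in the sample decides between the two outcomes.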
Delusion Attacks - INVD • Many kinds of caches are available on a contemporary system • In an emulator, there is no explicit cache support, and all cache-related instructions have no effect • On a real machine • The modification in cache will not be written back to memory immediately • On an emulated machine • The modification is written directly to RAM
Delusion Attacks - INVD (on a real machine) • Code: lea eax, BENIGNCODE; lea ebx, MALICIOUSCODE; lea esi, A; inc esi; wbinvd; mov byte ptr [esi], 0xD0; invd; A: call ebx • Encodings: FF D3 = call ebx, FF D0 = call eax • esi = A+0x0 after lea, A+0x1 after inc esi • The modification is done in the cache and is not yet written back to memory • invd: the cache is now invalidated • A still holds FF D3: call ebx → MALICIOUSCODE
Delusion Attacks - INVD (in QEMU) • Code: lea eax, BENIGNCODE; lea ebx, MALICIOUSCODE; lea esi, A; inc esi; wbinvd; mov byte ptr [esi], 0xD0; invd; A: call ebx • Encodings: FF D3 = call ebx, FF D0 = call eax • esi = A+0x0 after lea, A+0x1 after inc esi • The modification is written directly to memory • A now holds FF D0: call eax → BENIGNCODE
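The two INVD slides can be contrasted with a toy write-back cache model. This is a deliberately crude sketch: real caches work on lines, evict on their own schedule, and are far less predictable. The names `ToyMachine` and `call_target` are hypothetical.

```python
class ToyMachine:
    """Byte-granular toy model: RAM plus an optional write-back cache."""

    def __init__(self, ram, has_cache):
        self.ram = bytearray(ram)
        self.has_cache = has_cache
        self.dirty = {}  # addr -> byte modified in cache, not yet in RAM

    def write(self, addr, value):
        if self.has_cache:
            self.dirty[addr] = value  # real machine: stays in the cache for now
        else:
            self.ram[addr] = value    # emulator: goes straight to RAM

    def read(self, addr):
        return self.dirty.get(addr, self.ram[addr])

    def wbinvd(self):
        # Write back all modified data, then invalidate the cache.
        for addr, value in self.dirty.items():
            self.ram[addr] = value
        self.dirty.clear()

    def invd(self):
        # Invalidate the cache WITHOUT writing modified data back.
        self.dirty.clear()


def call_target(has_cache):
    machine = ToyMachine(bytes([0xFF, 0xD3]), has_cache)  # A: call ebx
    esi = 0 + 1                  # lea esi, A ; inc esi
    machine.wbinvd()
    machine.write(esi, 0xD0)     # mov byte ptr [esi], 0xD0
    machine.invd()
    second_byte = machine.read(1)
    return {0xD0: "BENIGNCODE", 0xD3: "MALICIOUSCODE"}[second_byte]
```

With the cache modeled, `call_target(True)` yields MALICIOUSCODE because the patched byte is discarded by `invd`; without it, `call_target(False)` yields BENIGNCODE because the patch already reached RAM.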
Delusion Attacks - LEAVE • leave is equivalent to: mov esp, ebp; pop ebp
Binary Analysis with Branch Tracing • On x86/64 architectures from Intel and AMD, the branch tracing (BT) facilities can record all pairs of the source address and the destination address of branch operations • The information can be used to reconstruct the execution/decision path taken during execution
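As a rough illustration of how source/destination pairs recover the executed path, the sketch below turns a list of branch records into the straight-line address ranges that ran between them. It assumes that code between two recorded branches executes sequentially and that the source is the address of the branch instruction; the function name and the sample addresses are invented for the example.

```python
def reconstruct_path(entry, branches):
    """Rebuild executed address ranges from (source, destination) branch records.

    Returns (segments, resume_address): each segment is (first_address,
    branch_address) of a straight-line run, and resume_address is where
    execution continued after the last recorded branch.
    """
    segments = []
    current = entry
    for src, dst in branches:
        if src < current:
            raise ValueError("branch source lies before the current segment start")
        segments.append((current, src))  # code from `current` up to the branch ran
        current = dst                    # execution continues at the branch target
    return segments, current


# Hypothetical trace: entry at 0x1000, three recorded branches.
trace = [(0x1010, 0x2000), (0x2008, 0x1012), (0x1020, 0x3000)]
segments, resume = reconstruct_path(0x1000, trace)
```

The resulting sequence of segments is the kind of control path that the binning experiment compares across samples.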
Experiment 1: Binning of Malicious PDF Documents • “Fuzzing” is a kind of automated vulnerability analysis that produces a large number of crash reports • Binning: a technique to group crash reports by similar root cause • This technique can also be used to group a set of exploits by the category of the exploited vulnerability • Comparing the control paths reconstructed from BT logs makes binning straightforward
Experiment 1: Binning of Malicious PDF Documents • CWXDetector • A tool that is capable of detecting exploitation attempts and extracting shellcode • It does not become active before the execution of the first shellcode instruction → no information can be gained about the underlying vulnerability • Combining BT with CWXDetector makes it possible to trace back from the execution of the first shellcode instruction to the root cause of the vulnerability • The experiment • 4,869 malicious PDF documents • Each file exploits some kind of vulnerability in Acrobat Reader 9.00
Experiment 1: Binning of Malicious PDF Documents • Normalization • Because of ASLR, the branch addresses are recorded in the form of relative addresses • Collapsing loops • Removing internal exception handling of the Windows system • Ignoring the shellcode part • Clustering algorithm • DBSCAN • Jaro-Winkler distance • Measures the difference between two strings • Similar strings → higher score • Similar prefix → higher score
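A minimal sketch of two of the steps above: rebasing addresses so that ASLR does not matter, and scoring two traces with the standard Jaro-Winkler formula. The helper names, the default prefix weight 0.1 and the prefix cap of 4 are the usual textbook choices, not values taken from the paper; how the authors encode traces as strings is not shown on the slides.

```python
def rebase(trace, module_base):
    """Turn absolute branch addresses into module-relative ones (ASLR-neutral)."""
    return [addr - module_base for addr in trace]


def jaro(s, t):
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit, t_hit = [False] * len(s), [False] * len(t)
    matches = 0
    for i, item in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_hit[j] and t[j] == item:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched items that appear in a different order in s and t.
    out_of_order, k = 0, 0
    for i, item in enumerate(s):
        if s_hit[i]:
            while not t_hit[k]:
                k += 1
            if item != t[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s, t, prefix_weight=0.1, max_prefix=4):
    """Similarity in [0, 1]; a shared prefix raises the score."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s, t):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_weight * (1 - base)


def jaro_winkler_distance(s, t):
    return 1.0 - jaro_winkler(s, t)
```

The functions work on any sequence, so rebased address lists can be compared directly. For clustering, the similarity is turned into a distance as `1 - similarity`, which is an assumption of this sketch.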
Experiment 1: Binning of Malicious PDF Documents • k: minimum cluster size • ε: maximum distance of two objects to belong to the same cluster
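To make the roles of k and ε concrete, here is a compact, unoptimized DBSCAN over an arbitrary distance function. It is a generic textbook version, not the authors' implementation; in particular, treating k as the minimum neighborhood size (counting the point itself) is an assumption of this sketch.

```python
NOISE = -1


def dbscan(items, distance, eps, k):
    """Return one cluster label per item; NOISE (-1) marks unclustered items."""
    labels = [None] * len(items)

    def neighbors(i):
        # All items within eps of item i (including i itself).
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = 0
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < k:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= k:    # core point: keep growing the cluster
                queue.extend(reachable)
        cluster += 1
    return labels


# Toy data: two dense groups and one outlier, with absolute difference as distance.
points = [0, 1, 2, 50, 51, 52, 90]
labels = dbscan(points, lambda a, b: abs(a - b), eps=1, k=2)
```

For the experiment, `items` would be the normalized branch traces and `distance` a string distance such as one derived from Jaro-Winkler.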
Experiment 1: Binning of Malicious PDF Documents • Comparing with Wepawet • 5 different vulnerability signatures (only addressing exploits of Acrobat Reader 9.00) • A small number of samples were not detected as exploiting Acrobat Reader 9.00 → manually verified that Wepawet is wrong • Some samples are labeled incorrectly → manually verified that Wepawet is wrong • Performance • Time from opening the document to the execution of shellcode • Min: 11s (2s w/o BT) • Max: 406s (117s w/o BT) • Avg: 129s (11s w/o BT)
Experiment 3: Practical Delusion Attack with a PDF File • See Appendix B of the technical report • When run in Anubis, this sample behaved normally
Limitations • The data from BT logs is coarse • The prototype could be detected by timing measurements • An attacker in ring 0 is capable of disabling BT • The approach could be combined with a hardware-assisted hypervisor
Conclusion • Many analysis techniques utilize software emulators. • Attackers still have methods to evade the analysis under the emulation environment • A new approach for dynamic code analysis that uses CPU-assisted branch tracing offers a granularity between instruction- and function-level monitoring with reasonable overhead • Practical results show that the BT traces contain enough information to assist some tasks in malware and vulnerability analysis