Hybrid Analysis and Control of Malware
Kevin A. Roundy (roundy@cs.wisc.edu)
Barton P. Miller (bart@cs.wisc.edu)
Computer Science Department
Hybrid Analysis of Program Binaries
Need for forensic analysis
• Malware attacks cost billions of dollars annually [1]
• 65% of users feel the effect of cyber crime [2]
• 28 days to resolve an average cybercrime [2]
• 90% of malware resists analysis [3]
Our approach
• analyze code before executing it
• CFG-based interface for instrumentation
• bring malware under analyst’s control
[Figure: malware binary shown as raw bytes]
[1] Computer Economics. 2007
[2] Norton. 2010
[3] McAfee. 2008
Malware analysis factory
[Figure: malware binary (raw bytes) fed to SD-Dyninst with code coverage instrumentation and network call instrumentation]
• Control flow graph showing code coverage
• Stack trace at 1st network communication
• Trace of Win API calls
• Defensive tactics report
  • unpacked code
  • overwritten code
  • control flow obfuscations
Obfuscated control flow
[Figure: storm worm, starting at the Entry Point. Labels: obfuscated control flow (40d002 CALL ptr[eax]), handler-based ctrl flow (XOR eax,eax; MOV ecx,*[eax]; exception handler), unpacked code, overwritten code]
Unpacked code
[Figure: storm worm, starting at the Entry Point. Labels: obfuscated control flow, handler-based ctrl flow, unpacked code, overwritten code; packed regions shown as raw bytes]
Overwritten code
[Figure: Upack packer, starting at the Entry Point. Labels: obfuscated control flow, handler-based ctrl flow, unpacked code, overwritten code; packed regions shown as raw bytes]
Factory results for Conficker A
[Figure: control flow graph. Labels: packed payload, initial bootstrap code]
Factory results for Conficker A
[Figure: control flow graph. Legend: unpacked block, static block, API func, non-executed block]
Factory results for Conficker A
Instrument select and perform a stack-walk
Stack-walk of Conficker’s communications thread:
Frame pc=0x7c901231 func: DbgBreakPoint at 0x7c901230 [Win DLL]
Frame pc=0x10003c83 func: DYNbreakPoint at 0x10003c70 [instrument.]
Frame pc=0x100016f7 func: DYNstopThread at 0x10001670 [instrument.]
Frame pc=0x71ab2dc0 func: select at 0x71ab2dc0 [Win DLL]
Frame pc=0x401f34 func: nosym1f058 at 0x41f058 [Conficker]
Outline
• Related work (R.W.)
• Hybrid analysis algorithm (H.A.)
• Parsing (P.)
• Dynamic analysis components (D.A.)
• Results (Res.)
Non-Defensive Binary Analysis (R.W.)
[Figure: program binary (static code). Pre-execution: a static tool produces a CFG. Un-controlled execution: a dynamic instrumenter works on the process]
Static tool
• parsing
• value-set analysis
• binary slicing
• e.g., Dyninst, CodeSurfer-x86
Dynamic instrumenter
• CFG-based API for instrumentation
• e.g., ATOM, Vulcan (static), Dyninst (dynamic)
Non-Defensive Binary Analysis (R.W.)
[Figure: analysis resistant binary (static code, obfuscated code, dynamic code). Pre-execution: a static tool produces a CFG. Un-controlled execution: a dynamic instrumenter works on the process]
Static tool
• parsing
• value-set analysis
• binary slicing
• e.g., Dyninst, CodeSurfer-x86
Dynamic instrumenter
• CFG-based API for instrumentation
• e.g., ATOM, Vulcan (static), Dyninst (dynamic)
Non-Defensive Binary Analysis (R.W.)
[Figure: analysis resistant binary (static code, obfuscated code, dynamic code). Phases: pre-execution, un-controlled execution, post-execution analysis. A dynamic instrumenter records a trace of the process; trace analysis builds a CFG]
Dynamic instrumenter
• Instruction-filter based API for instrumentation
• e.g., PIN, Valgrind, DynamoRIO, DIOTA
Trace analysis
• e.g., Madou et al. 2005; Quist, Liebrock. 2009
Our approach (R.W.)
[Figure: SD-Dyninst. Labels: analysis resistant binary (static code, obfuscated code, dynamic code), Parser, CFG, Dynamic instrumenter, (source,dest), Process. Phases: pre-execution, un-controlled execution]
• CFG-based API for instrumentation
Code discovery algorithm (H.A.)
Hybrid algorithm:
• Parse from known entry points
• Instrument control flow that may lead to new code
• Resume execution
[Figure: CFG with unresolved targets marked "?". Labels: instrument, overwrite, exception; example instructions CALL ptr[eax] and DIV eax, 0]
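The three steps of the hybrid algorithm can be sketched as a small simulation. This is an illustrative model only, not SD-Dyninst code: the `Block` type, the `parse` and `hybrid_discover` functions, and the toy `program` are all hypothetical, and the sketch assumes that every instrumented site eventually executes.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    """Toy basic block (hypothetical model, not a Dyninst type)."""
    direct: list = field(default_factory=list)    # successors visible to static parsing
    runtime: list = field(default_factory=list)   # targets only observable at run time

def parse(program, entry, cfg):
    """Follow statically visible control flow from `entry`.

    Adds discovered block addresses to `cfg` and returns the blocks whose
    outgoing transfers could not be resolved statically.
    """
    unresolved, work = [], [entry]
    while work:
        addr = work.pop()
        if addr in cfg or addr not in program:
            continue
        cfg.add(addr)
        work.extend(program[addr].direct)
        if program[addr].runtime:
            unresolved.append(addr)
    return unresolved

def hybrid_discover(program, entry):
    """Parse, instrument unresolved transfers, resume, and repeat."""
    cfg = set()
    instrumented = parse(program, entry, cfg)       # parse, then instrument
    while instrumented:
        site = instrumented.pop(0)                  # execution reaches the site
        for target in program[site].runtime:        # callback reports the target
            if target not in cfg:
                instrumented.extend(parse(program, target, cfg))
        # execution resumes here until the next instrumented site is hit
    return cfg

# Toy program: 0x401010 ends in an indirect call whose target is only
# known at run time; 0x403000 is never reached.
program = {
    0x401000: Block(direct=[0x401010]),
    0x401010: Block(runtime=[0x402d8a]),
    0x402d8a: Block(direct=[0x402da0]),
    0x402da0: Block(),
    0x403000: Block(),
}
static_only = set()
parse(program, 0x401000, static_only)
discovered = hybrid_discover(program, 0x401000)
```

In this model, static parsing alone stops at the indirect call, while the hybrid loop also reaches the blocks behind it.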
Accurate parsing (P.)
• Standard control-flow traversal [1]
  • start from known entry points
  • follow control flow to find code
• New conservative assumption
  • un-analyzed calls (pointer-based) may not return
• New stack tamper detection
  • backwards slice at return instruction
[Figure: call 40d00a; garbage; pop ebp; inc ebp; push ebp; ret]
[1] Sites et al., Binary Translation. 1993.
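A minimal sketch of control-flow traversal with the conservative assumption about pointer-based calls follows. The one-slot-per-instruction toy encoding and the opcode names (`jmp`, `jcc`, `call`, `icall`, `ret`) are assumptions made for illustration; direct calls are simply treated as returning, and stack tamper detection is not modeled.

```python
def parse(code, entry):
    """Traverse control flow from `entry` over a toy instruction map.

    `code` maps address -> (opcode, operand). Returns the set of parsed
    addresses and the set of unresolved indirect call sites.
    """
    seen, unresolved, work = set(), set(), [entry]
    while work:
        addr = work.pop()
        if addr in seen or addr not in code:
            continue
        seen.add(addr)
        op, arg = code[addr]
        if op == "jmp":                 # unconditional jump: target only
            work.append(arg)
        elif op == "jcc":               # conditional jump: target and fall-through
            work += [arg, addr + 1]
        elif op == "call":              # direct call: callee and fall-through
            work += [arg, addr + 1]
        elif op == "icall":             # pointer-based call: may not return,
            unresolved.add(addr)        # so do NOT parse the fall-through
        elif op == "ret":
            pass                        # end of this path
        else:
            work.append(addr + 1)       # ordinary instruction: fall through
    return seen, unresolved

# The bytes after the indirect call may be garbage, so they stay unparsed.
code = {
    0: ("nop", None),
    1: ("icall", "eax"),
    2: ("garbage", None),
    3: ("ret", None),
}
parsed, pending = parse(code, 0)
```

Skipping the fall-through keeps junk bytes after an indirect call out of the CFG; the call site is left for run-time resolution instead.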
Instrumentation-based discovery (D.A.)
• Invalid control transfers
• Indirect jumps/calls
• Abnormal return instructions
[Figure: examples: call 401000 into an Invalid Region; jmp eax; call ptr [eax]; push eax, ret]
Instrumentation-based discovery (D.A.)
[Figure: findTarget(ptr[eax]) instrumentation at call ptr[eax] in the process reports the new target 0x402d8a to SD-Dyninst, which then resumes execution]
Overwritten code discovery (D.A.)
• Overwrite detection
  • Possible strategies
    • Check each executed instruction for changes [1]
    • Monitor writes to code
  • Page-level write detection [2]
    • Remove write permissions from code pages
    • Write to code causes exception
    • Handle exception
[Figure: SD-Dyninst code write handler intercepts a write; code page permissions change from RWE to R E]
[1] Royal et al. PolyUnpack. ACSAC ’06
[2] Maebe, De Bosschere. AADEBUG ’03
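Page-level write detection can be illustrated with a simulated memory whose pages carry permission sets. Everything below is a hypothetical model (`Memory`, `WriteFault`, `protect_code_pages`, `monitored_write`); a real tool would change page protections through the operating system and catch the resulting access violation.

```python
PAGE_SIZE = 0x1000

class WriteFault(Exception):
    """Raised by the simulated memory on a write to a non-writable page."""
    def __init__(self, addr):
        super().__init__(hex(addr))
        self.addr = addr

class Memory:
    def __init__(self):
        self.data = {}    # address -> byte value
        self.perms = {}   # page number -> set of "R", "W", "E"

    def map_page(self, page, perms):
        self.perms[page] = set(perms)

    def write(self, addr, value):
        page = addr // PAGE_SIZE
        if "W" not in self.perms.get(page, set()):
            raise WriteFault(addr)
        self.data[addr] = value

def protect_code_pages(mem, code_pages):
    """Remove write permissions from code pages."""
    for page in code_pages:
        mem.perms[page].discard("W")

def monitored_write(mem, addr, value, code_pages, overwritten):
    """Perform a write; a write to a protected code page is detected."""
    try:
        mem.write(addr, value)
    except WriteFault as fault:
        page = fault.addr // PAGE_SIZE
        if page not in code_pages:
            raise                      # a genuine fault, not our protection
        overwritten.add(page)          # handle exception: record the overwrite
        mem.perms[page].add("W")       # let the program's write go through
        mem.write(addr, value)

mem = Memory()
mem.map_page(0x401, "RWE")             # code page
mem.map_page(0x500, "RW")              # data page
code_pages = {0x401}
protect_code_pages(mem, code_pages)
overwritten = set()
monitored_write(mem, 0x401010, 0x90, code_pages, overwritten)  # detected
monitored_write(mem, 0x500000, 0x41, code_pages, overwritten)  # plain data write
```

Only the first write to a protected page faults in this sketch; writes to data pages run at full speed, which is the appeal of the page-level strategy over checking every instruction.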
Overwritten code discovery (D.A.)
• When to update
  • Cases to consider
    • large incremental overwrites
    • writes to data
    • writes to own page
  • Delaying the update
    • until write routine terminates
[Figure: SD-Dyninst code write handler and CFG update routine; a write targets code pages marked R E]
Overwritten code discovery (D.A.)
• Delayed updates
  • Two components
    • Handle overwrite signal
      • instrument write loop
      • copy overwritten page
      • restore write permissions
    • Update CFG when writes end
      • remove overwritten and unreachable blocks
      • parse at entry points to overwritten regions
      • remove write permissions
[Figure: SD-Dyninst code write handler and CFG update routine with callbacks (cb); the overwritten page is RWE while writes are in progress and R E afterwards]
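The two components of the delayed update can be sketched as two methods on a small helper. This is a simplified model under stated assumptions: `DelayedUpdater`, its data layout (pages as `bytearray`, the CFG as `start -> (end, successors)`), and the tiny `PAGE_SIZE` are hypothetical, and the caller invokes `on_writes_end` explicitly where a real tool would trigger it from instrumentation at the exit of the write loop.

```python
PAGE_SIZE = 16  # tiny pages keep the example readable

class DelayedUpdater:
    def __init__(self, memory, perms, cfg, entry):
        self.memory, self.perms = memory, perms   # page -> bytearray / permission set
        self.cfg, self.entry = cfg, entry         # start -> (end, successors)
        self.shadow = {}                          # page -> copy taken before writes

    def on_overwrite(self, page):
        """Component 1: handle the overwrite signal for a code page."""
        self.shadow.setdefault(page, bytes(self.memory[page]))  # copy page
        self.perms[page].add("W")                               # restore write permission

    def on_writes_end(self):
        """Component 2: update the CFG; returns addresses to re-parse."""
        changed = set()
        for page, old in self.shadow.items():
            new = self.memory[page]
            changed |= {page * PAGE_SIZE + i
                        for i, (a, b) in enumerate(zip(old, new)) if a != b}
            self.perms[page].discard("W")         # remove write permission again
        self.shadow.clear()

        # Blocks containing at least one changed byte are overwritten.
        dead = {start for start, (end, _) in self.cfg.items()
                if any(start <= addr < end for addr in changed)}
        # Entry points into overwritten regions: targets of surviving edges.
        reparse = {t for start, (_, succs) in self.cfg.items()
                   if start not in dead for t in succs if t in dead}
        if self.entry in dead:
            reparse.add(self.entry)
        for start in dead:
            del self.cfg[start]

        # Drop blocks that are no longer reachable from the entry point.
        reachable, work = set(), [self.entry]
        while work:
            start = work.pop()
            if start in reachable or start not in self.cfg:
                continue
            reachable.add(start)
            work.extend(self.cfg[start][1])
        for start in set(self.cfg) - reachable:
            del self.cfg[start]
        return reparse

memory = {0: bytearray(16), 1: bytearray(16)}
perms = {0: {"R", "E"}, 1: {"R", "E"}}
cfg = {0: (4, [4]), 4: (8, [16]), 16: (20, [])}
updater = DelayedUpdater(memory, perms, cfg, entry=0)

updater.on_overwrite(0)        # first write to page 0 was trapped
memory[0][5] = 0x90            # the write loop modifies block [4, 8)
writable_during = "W" in perms[0]
to_reparse = updater.on_writes_end()
```

Deferring the comparison until the writes end means a long incremental overwrite costs one page copy and one CFG update rather than one update per written byte.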
Handler-based CF obfuscations [1] (D.A.)
[Figure: in the monitored program, xor eax,eax; mov ecx,*[eax] raises an access violation. The operating system invokes the program's access violation handler (push eax ... mov *[ebp+10],eax; mov 402d8a,edx; mov edx,*[eax+b8]), and execution continues with eip 402d8a]
[1] Popov, Debray, Andrews. Usenix 2007. Danekhar. http://www.codeproject.com/KB/system/inject2exe.aspx 2005.
Resolving handler-based CF (D.A.)
[Figure: same scenario as the previous slide. SD-Dyninst analyzes the access violation handler (push eax ... mov *[ebp+10],eax; mov 402d8a,edx; mov edx,*[eax+b8]), instruments its exit, and analyzes code at the new target, eip 402d8a]
[1] Popov, Debray, Andrews. Usenix 2007. Danekhar. http://www.codeproject.com/KB/system/inject2exe.aspx 2005.
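The idea of instrumenting the handler's exit can be shown with a small simulation of exception delivery. The function and variable names (`deliver_exception`, `malware_handler`, `report_new_target`, `known_code`, `to_parse`) and the dictionary used as a context record are assumptions made for this sketch; they do not correspond to an operating system or SD-Dyninst API.

```python
def deliver_exception(context, handler, on_handler_exit):
    """Model of the OS dispatching a fault to the program's own handler."""
    handler(context)            # the handler may rewrite the saved eip
    on_handler_exit(context)    # instrumentation placed at the handler's exit
    return context["eip"]       # the OS resumes the thread at this address

known_code = {0x40d004}         # addresses already covered by the CFG
to_parse = []                   # new targets handed to the parser

def malware_handler(context):
    # Obfuscation: redirect control flow by editing the saved context.
    context["eip"] = 0x402d8a

def report_new_target(context):
    # Analysis callback: read the resume address before the OS uses it.
    if context["eip"] not in known_code:
        to_parse.append(context["eip"])
        known_code.add(context["eip"])

resumed_at = deliver_exception({"eip": 0x40d004}, malware_handler, report_new_target)
```

Reading the context at the handler's exit, rather than at its entry, captures the address after the handler has finished modifying it.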
Fully analyzed packed programs (Res.)
Columns: Packer | Malware market share [1] | Obfuscated | Self-modifying | Exception-based ctrl | Self check-summing
UPX 9.45%
PolyEnE 6.21% yes
EXECryptor 4.06% yes yes yes yes yes yes yes yes
Themida 2.95% yes yes yes yes
PECompact 2.59%
Upack 2.08% yes yes
nPack 1.74%
Aspack 1.29% yes yes
FSG 1.26% yes yes
Nspack 0.89% yes
Asprotect 0.43% yes yes yes
Armadillo 0.37% yes yes yes yes
Yoda's Protector 0.33% yes yes yes yes
WinUPack 0.17% yes yes
MEW 0.13% yes
[1] Packer (r)evolution. Panda Research, 2008. Two-month average Feb-March 2008.
Fully analyzed packed programs (Res.)
Columns: Packer | Malware market share [1] | SD-Dyninst
UPX 9.45% yes
PolyEnE 6.21% yes
EXECryptor 4.06%
Themida 2.95%
PECompact 2.59% yes
Upack 2.08% yes
nPack 1.74% yes
Aspack 1.29% yes
FSG 1.26% yes
Nspack 0.89% yes
Asprotect 0.43%
Armadillo 0.37%
Yoda's Protector 0.33%
WinUPack 0.17% yes
MEW 0.13% yes
Time to unpack: 0.5 1.2 3.2 23.5 1.5 4.4 1.4 2.7 23.6 3.9
Uninstrumented times are about .02 secs
Annotations: Self-checksumming techniques; unoptimized overwrite detection; expensive overwrite detection
[1] Packer (r)evolution. Panda Research, 2008. Two-month average Feb-March 2008.
Instrumentation costs (Res.)
Conclusion
• Analysis before execution allows for
  • Understanding & control of malware before execution
  • Selective monitoring
  • Build-your-own analysis factory
• Ongoing work
  • Handling self-checksumming code
  • Releasing Dyninst w/ SD-Dyninst inside
  • http://www.paradyn.org/