
Finding Diversity in Remote Code Injection Exploits

Justin Ma, John Dunagan, Helen J. Wang, Stefan Savage, Geoffrey M. Voelker. University of California, San Diego; Microsoft Research.




  1. Finding Diversity in Remote Code Injection Exploits Justin Ma, John Dunagan, Helen J. Wang, Stefan Savage, Geoffrey M. Voelker *University of California, San Diego *Microsoft Research

  2. Encountering new malware • Have I seen this before? • How closely related is it to what I have seen before?

  3. Practical considerations • New defense?

  4. Theoretical considerations • Evolutionary relationship?

  5. Grouping similar malware together… • Ultimately, construct malware families • Anti-virus industry is active in this area

  6. Motivation • 710 new families • 40,000 new variants • Family and variant defined in ad-hoc fashion… • Is there a systematic way to determine the nature of this diversity?

  7. Exploit diversity [Figure labels: attacker, MS RPC request, exploit]

  8. Polymorphism • Encrypted

  9. Behind the encryption…

  10. Differing constants • Different IP address

  11. Functional differences • Waiting for a connection

  12. Different code base • Calling “tftp.exe”

  13. ISystemActivator vulnerability • How different are they? • 1,561 exploit attempts • 90 unique payloads

  14. Our goal • Automatically construct phylogeny, or family tree of exploits

  15. Outline for this talk • On classifying shellcodes • Steps for systematically studying shellcodes • Trace collection • Shellcode extraction • Shellcode decryption • Comparing samples • Cluster analysis • Post-hoc manual inspection to validate • Look at the code!

  16. Why shellcodes? • Our study focuses on exploits • They are packaged with the exploit • First foreign code that executes on a newly infected machine • Part of exploit with most leeway for variation • Primary challenge: collecting and analyzing shellcodes

  17. Remote code injection attacks [Figure: the victim’s stack memory, from low to high addresses. Labels: MS RPC request, exploit, vulnerable buffer, flow of execution, shellcode, decrypted shellcode.]

  18. Trace collection • Studying 5 vulnerabilities • Residential trace: 2 days; Windows XP SP2; 29 unused DSL IP addresses; 4,400 exploit samples • Enterprise trace: 1 hour; active responders; 5 /24 subnets; 1,500 exploit samples

  19. Shellcode extraction • Shield (SIGCOMM ’04) • Framework for specifying network-based protocols and vulnerabilities • Extracts shellcodes from raw network packets

  20. Shellcode decryption • Shellcode is encrypted • Use shellcode’s own decryption loop! • Limited emulation • Similar to generic decryption technique used for viruses
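A minimal sketch of the "limited emulation" idea on this slide: instead of reverse-engineering the cipher, run the payload's own decoder loop under a step budget and read the decoded bytes out of memory. Everything here is hypothetical and chosen for illustration: the five-opcode toy instruction set, the single-byte XOR key, and the names `emulate` and `DECODER`. The actual system emulates real x86 code, which this does not attempt.

```python
def emulate(program, memory, regs, max_steps=10_000):
    """Interpret a toy decoder program over `memory`, stopping after max_steps."""
    pc = 0
    steps = 0
    while steps < max_steps:
        op = program[pc]
        steps += 1
        name = op[0]
        if name == "xor_byte":      # memory[regs[r]] ^= constant
            memory[regs[op[1]]] ^= op[2]
            pc += 1
        elif name == "inc":
            regs[op[1]] += 1
            pc += 1
        elif name == "dec":
            regs[op[1]] -= 1
            pc += 1
        elif name == "jnz":         # jump to target while register is non-zero
            pc = op[2] if regs[op[1]] != 0 else pc + 1
        elif name == "halt":
            return bytes(memory)
        else:
            raise ValueError(f"unknown opcode: {name}")
    raise RuntimeError("step budget exhausted before the decoder finished")


# Hypothetical decoder loop: XOR each byte with 0x5A, then stop.
DECODER = [
    ("xor_byte", "ptr", 0x5A),
    ("inc", "ptr"),
    ("dec", "count"),
    ("jnz", "count", 0),
    ("halt",),
]

plaintext = b"tftp.exe"
encoded = bytes(b ^ 0x5A for b in plaintext)
decoded = emulate(DECODER, bytearray(encoded), {"ptr": 0, "count": len(encoded)})
```

The step budget is what keeps the emulation "limited": a payload whose loop never terminates is abandoned rather than followed indefinitely.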

  21. Comparing samples: Candidate metrics • Edit distance: too specific; non-code portions of the payload made related exploits unnecessarily distant • Structural distance: control flow graph over basic blocks, with each basic block summarized by a color/hash; too general, since it did not capture subtle instruction variations between exploit families
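To make the first candidate concrete, here is a sketch of plain edit distance (Levenshtein) over raw payload bytes. It covers only the edit-distance candidate, not the structural one. The two payloads are invented for illustration: they share the same four "code" bytes and differ only in filler, yet the metric reports them as far apart, which is the "too specific" problem the slide describes.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, using one DP row at a time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


# Hypothetical payloads: identical code bytes, different non-code filler.
code = b"\x31\xc0\x50\x68"
payload_a = code + b"A" * 8
payload_b = code + b"B" * 8
raw_distance = edit_distance(payload_a, payload_b)
```

Here `raw_distance` is 8 out of 12 bytes even though the executable portion is byte-for-byte identical.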

  22. Comparing samples: Final metric • Exedit distance metric • Edit distance over executed parts of shellcode • Distinguishes code from data • Maintains instruction-level details • Canonical string for shellcode
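A rough sketch of how an exedit-style comparison might look, assuming each sample has already been reduced to the ordered list of instructions that actually executed. The slide does not say how the canonical string is built or how distances are scaled, so the instruction strings below and the normalization by the longer trace are assumptions made for illustration.

```python
def seq_edit_distance(a, b):
    """Levenshtein distance where each element is a whole instruction."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def exedit_distance(trace_a, trace_b):
    """Edit distance over executed instructions, scaled to the range [0, 1].

    Dividing by the longer trace is an assumption, not taken from the slides.
    """
    longest = max(len(trace_a), len(trace_b))
    if longest == 0:
        return 0.0
    return seq_edit_distance(trace_a, trace_b) / longest


# Hypothetical executed-instruction traces for two related payloads.
bind_version = ["xor eax,eax", "push eax", "call bind", "call listen", "call accept"]
connect_back = ["xor eax,eax", "push eax", "call connect"]
distance = exedit_distance(bind_version, connect_back)
```

Because only executed instructions enter the comparison, filler bytes and embedded data never contribute to the distance, while a single changed instruction still does.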

  23. Cluster analysis • Need to group samples using the exedit distance metric • Agglomerative clustering • Each iteration, merge the closest pair of clusters • Cluster distance = distance between the two furthest samples, one from each cluster
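The procedure on this slide is agglomerative clustering with complete linkage, which can be sketched as follows. Stopping once the closest pair of clusters is further apart than a threshold is an assumption here (a later slide mentions a 10% clustering threshold but not exactly how it is applied), and the numeric samples merely stand in for shellcodes compared by exedit distance.

```python
from itertools import combinations


def complete_linkage(c1, c2, dist):
    """Cluster distance: the distance between the two furthest samples."""
    return max(dist(a, b) for a in c1 for b in c2)


def agglomerate(samples, dist, threshold):
    """Repeatedly merge the closest pair of clusters until none is within threshold."""
    clusters = [[s] for s in samples]
    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), complete_linkage(clusters[i], clusters[j], dist))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index i stays valid
        del clusters[j]
    return clusters


# Hypothetical one-dimensional samples with absolute difference as the distance.
samples = [0.0, 0.05, 0.5, 0.52, 0.9]
families = agglomerate(samples, lambda a, b: abs(a - b), threshold=0.1)
```

Recording the distance at which each merge happens, rather than stopping early, would give the full merge tree instead of a flat list of families.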

  24. Results • Caught exploits for 5 vulnerabilities over traces • Summary for residential trace

  25. ISystemActivator • 10% clustering threshold • 6 families • Need to manually verify this…

  26. ISystemActivator • 4-byte decoding key • Kernel-address loading function • Function-finding block

  27. ISystemActivator • 4-byte encoding key, kernel base loader, function finder • 4-byte decoding key, kernel-address loading function, function-finding block

  28. ISystemActivator Longest payload Many function blocks in middle of payload

  29. ISystemActivator Command-line call to “tftp.exe”

  30. ISystemActivator Different instructions in parts, otherwise very similar

  31. ISystemActivator “Connect-back” version “Bind” version

  32. Conclusions • Systematic method for classifying exploits • Exploit collection • Shellcode extraction and decryption • Shellcode comparison using exedit distance • Group exploits with clustering • Similarity between samples in computed phylogenies corresponded well with observed differences • Useful step toward automating malware classification
