
Detection of ASCII Malware

Detection of ASCII Malware. Parbati Kumar Manna, Dr. Sanjay Ranka, Dr. Shigang Chen. Internet Worm and Malware: huge damage potential; infects hundreds of thousands of computers; costs millions of dollars in damage. Examples: Melissa, ILOVEYOU, Code Red, Nimda, Slammer, SoBig, MyDoom.


Presentation Transcript


  1. Detection of ASCII Malware • Parbati Kumar Manna • Dr. Sanjay Ranka • Dr. Shigang Chen

  2. Internet Worm and Malware • Huge damage potential • Infects hundreds of thousands of computers • Costs millions of dollars in damage • Melissa, ILOVEYOU, Code Red, Nimda, Slammer, SoBig, MyDoom • Mostly uses Buffer Overflow • Propagation is automatic (mostly)

  3. Recent Trends • Shift in hackers’ mindset • Malware becoming increasingly evasive and obfuscated • Emergence of zero-day worms • Arrival of script kiddies

  4. Motivation for ASCII Attacks • Prevalence of servers expecting text-only input • Text-based protocols • Presumption of text being benign • Deployment of ASCII filters that let text pass through uninspected

  5. IDS Detecting ASCII Attack? • Disassembly-based IDS • All jump instructions are ASCII • Higher proportion of branches • Exponential disassembly cost • High processing overhead for IDS • Frequency-based IDS • PAYL evaded by ASCII worm

  6. Buffer Overflow

  7. Constraints of ASCII Malware • Opcode Unavailability • Shellcode requires binary opcodes • Only opcodes that fall in the ASCII range (xor, and, sub, cmp, etc.) are available • Must generate the remaining opcodes dynamically • Difficulty in Encryption • No backward jump • Can’t use same decrypter routine for each encrypted block • No one-to-one correspondence between ASCII and binary
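The opcode constraint on slide 7 can be illustrated with a small lookup. This is a sketch, not the authors' tooling: the table lists a handful of one-byte IA-32 opcode ranges, the grouping and names are chosen for illustration, and "printable ASCII" is assumed to mean bytes 0x20 through 0x7E.

```python
# Printable ASCII bytes (assumption: 0x20..0x7E inclusive).
PRINTABLE = range(0x20, 0x7F)

# A few one-byte IA-32 opcode ranges; the selection is illustrative, not exhaustive.
OPCODES = {
    "and": range(0x20, 0x26),
    "sub": range(0x28, 0x2E),
    "xor": range(0x30, 0x36),
    "cmp": range(0x38, 0x3E),
    "inc/dec r32": range(0x40, 0x50),
    "push/pop r32": range(0x50, 0x60),
    "jcc rel8": range(0x70, 0x80),
    "mov r/m": range(0x88, 0x8C),
    "nop": range(0x90, 0x91),
    "int imm8": range(0xCD, 0xCE),
    "call rel32": range(0xE8, 0xE9),
    "jmp rel8": range(0xEB, 0xEC),
}


def printable_opcodes(opcodes=OPCODES):
    """Map each mnemonic to the subset of its opcode bytes that are printable."""
    return {name: [b for b in codes if b in PRINTABLE]
            for name, codes in opcodes.items()}


def rel8_displacement(byte):
    """Interpret one byte as a signed 8-bit jump displacement."""
    return byte - 256 if byte >= 0x80 else byte


available = printable_opcodes()
# Mnemonics with no printable encoding in this table.
unavailable = [name for name, codes in available.items() if not codes]
# Every printable displacement byte is positive, so short jumps only go forward.
forward_only = all(rel8_displacement(b) > 0 for b in PRINTABLE)

print("unavailable:", unavailable)
print("short jumps forward only:", forward_only)
```

In this table, mov, nop, int, call and the unconditional short jmp have no printable encoding, which is why such opcodes have to be produced at run time, and the forward-only displacements are one way to read the "no backward jump" bullet.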

  8. Creation of ASCII Malware

  9. Buffer Overflow using ASCII Overflowing a buffer using an ASCII string:

  10. Detection of ASCII Malware • Opcode Unavailability • Dynamic generation of opcodes needs more ASCII instructions for each binary instruction • Difficulty in Encryption • No backward jump means decrypter block for each encrypted block must be hardcoded • Long sequence of contiguous valid instructions likely → high MEL • What is this MEL?

  11. Maximum Executable Length • Indicates maximum length of an execution path • Need to disassemble (and execute) from all possible entry points • All branching must be considered • Abstract payload execution (APE) • Used for binary worms with sled • Its effectiveness has since dwindled

  12. Benign Text has Low MEL • Contains characters that correspond to invalid instructions • Privileged Instruction (I/O) • Arbitrary Segment Selector • More Memory-accessing instructions – may use uninitialized registers • Long sequence of contiguous valid instructions unlikely → low MEL
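To make slide 12 concrete, the sketch below counts the printable bytes that coincide with IA-32 string I/O opcodes and segment-override prefixes. The byte values come from the IA-32 opcode map; treating every occurrence as a potential invalid instruction is a simplifying assumption, since a byte only acts as an opcode or prefix when it falls at the start of an instruction, and a segment override is only a problem when the selected segment is not usable.

```python
# String I/O opcodes that are also printable characters: 'l', 'm', 'n', 'o'.
IO_OPCODES = {0x6C: "insb", 0x6D: "insd", 0x6E: "outsb", 0x6F: "outsd"}

# Segment-override prefixes that are also printable: '&', '.', '6', '>', 'd', 'e'.
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss",
                    0x3E: "ds", 0x64: "fs", 0x65: "gs"}


def count_suspect_bytes(data: bytes):
    """Return (io_count, prefix_count) for a byte string.

    This is a character-level count, not a disassembly.
    """
    io_count = sum(1 for b in data if b in IO_OPCODES)
    prefix_count = sum(1 for b in data if b in SEGMENT_PREFIXES)
    return io_count, prefix_count


sample = b"hello world"  # hypothetical benign input
io_count, prefix_count = count_suspect_bytes(sample)
print(io_count, prefix_count, len(sample))
```

Common letters such as 'l', 'n', 'o', 'd' and 'e' land in these two sets, which gives some intuition for why ordinary English text rarely decodes into a long run of valid instructions.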

  13. Proposed Solution • Find the maximum length of a valid instruction sequence • If it is long enough, the stream contains malware • Question: • How long is “long”?

  14. Probabilistic Analysis • Toss a coin n times • What is the probability that the max distance between two consecutive heads is τ? • Head (H) = Invalid Instruction (I) • Tail (T) = Valid Instruction (V) • THTTHTTTTTHTTT corresponds to VIVVIVVVVVIVVV
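The proposed check and the coin-toss analogy reduce to finding the longest run of valid instructions and comparing it with a threshold. The sketch below does this on the V/I string from slide 14. It assumes the stream has already been decoded into per-instruction validity flags along a single linear path; the function names are illustrative, and a real MEL computation would also have to consider every entry point and branch.

```python
def max_valid_run(flags):
    """Length of the longest run of consecutive True (valid) flags."""
    best = current = 0
    for is_valid in flags:
        current = current + 1 if is_valid else 0
        best = max(best, current)
    return best


def is_suspicious(flags, tau):
    """Flag the stream when the longest valid run exceeds the threshold tau."""
    return max_valid_run(flags) > tau


# The example sequence from slide 14: V = valid (tail), I = invalid (head).
stream = [c == "V" for c in "VIVVIVVVVVIVVV"]
longest = max_valid_run(stream)
print(longest)
```

Comparing with a strict "greater than" matches the definition of the false positive rate as the probability that the maximum distance exceeds τ.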

  15. Probabilistic Analysis • $n$ = number of coin tosses • $p$ = probability of a head • $X_i$ = random variables for the inter-head distances • $X_{\max}$ = maximum inter-head distance • C.D.F. of $X_{\max}$: $\Pr[X_{\max} \le x] = [1 - p(1-p)^x]^n$ • False positive rate $= 1 - \Pr[X_{\max} \le \tau] = 1 - [1 - p(1-p)^\tau]^n$
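The two expressions on slide 15 translate directly into code. This is a minimal sketch; the function names and the example values of n, p and τ are hypothetical and not taken from the presentation.

```python
def cdf_xmax(x, n, p):
    """Pr[X_max <= x] = [1 - p * (1 - p)**x] ** n, as given on slide 15."""
    return (1.0 - p * (1.0 - p) ** x) ** n


def false_positive_rate(tau, n, p):
    """Pr[X_max > tau]: chance that benign input exceeds the threshold tau."""
    return 1.0 - cdf_xmax(tau, n, p)


# Hypothetical values: 1000 instructions, 20% of them invalid.
for tau in (20, 40, 60):
    print(tau, false_positive_rate(tau, n=1000, p=0.2))
```

The loop shows the false positive rate falling as τ grows, which is the trade-off the threshold has to settle.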

  16. Probabilistic Analysis For a fixed N = k (exactly k invalid instructions)

  17. Probabilistic Analysis For all possible values of N:
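The formulas for slides 16 and 17 did not survive in this transcript. One reconstruction that agrees with the C.D.F. on slide 15 is sketched below. It assumes that the number of invalid instructions N is Binomial(n, p) and that each inter-head distance independently satisfies Pr[X_i ≤ x] = 1 − (1 − p)^x; both are assumptions made for this sketch, not statements recovered from the slides.

```latex
% Fixed N = k: the maximum of k independent distances
\[
\Pr[X_{\max} \le x \mid N = k] = \bigl[1 - (1-p)^x\bigr]^k
\]

% All possible values of N: total probability, then the binomial theorem
\[
\Pr[X_{\max} \le x]
  = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} \bigl[1 - (1-p)^x\bigr]^k
  = \bigl[(1-p) + p\,\bigl(1 - (1-p)^x\bigr)\bigr]^n
  = \bigl[1 - p(1-p)^x\bigr]^n
\]
```

The last expression is the closed form quoted on slide 15, so the two-step argument (condition on N, then sum over N) is at least consistent with the rest of the deck.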

  18. Threshold Calculation • Known: n, p, and the target false positive rate • Unknown: τ (max inter-head distance), the threshold

  19. Independence Assumption • Validity of an instruction is an independent event • All the Xi’s are independent (while Σ Xi = n)

  20. Threshold Calculation • With increasing n, we must choose a larger τ to keep the same rate of false positives

  21. Threshold Calculation • With decreasing p, we must choose a larger τ to keep the same rate of false positives
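The threshold can be obtained by inverting the false positive expression 1 − [1 − p(1 − p)^τ]^n. The sketch below solves for the smallest integer τ that keeps the rate at or below a target; the name `alpha` for the target rate, the function names, and the example values are assumptions made for illustration.

```python
import math


def false_positive_rate(tau, n, p):
    """1 - [1 - p * (1 - p)**tau] ** n, the false positive expression."""
    return 1.0 - (1.0 - p * (1.0 - p) ** tau) ** n


def threshold(n, p, alpha):
    """Smallest integer tau whose false positive rate is at most alpha."""
    if not (n >= 1 and 0.0 < p < 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("need n >= 1, 0 < p < 1 and 0 < alpha < 1")
    # Solve (1 - p)**tau <= (1 - (1 - alpha)**(1/n)) / p for tau.
    # expm1/log1p keep precision when alpha is small and n is large.
    target = -math.expm1(math.log1p(-alpha) / n) / p
    tau = max(0, math.ceil(math.log(target) / math.log(1.0 - p)))
    # Guard against floating-point rounding right at the boundary.
    while false_positive_rate(tau, n, p) > alpha:
        tau += 1
    return tau


# Hypothetical settings to show the two trends.
print(threshold(1000, 0.2, 0.01), threshold(10000, 0.2, 0.01))
print(threshold(1000, 0.2, 0.01), threshold(1000, 0.1, 0.01))
```

With these sample inputs the computed τ grows when n increases and when p decreases, in line with the trends on slides 20 and 21.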

  22. Determine n • E[I] = E[prefix chain length] + E[core instruction length] • Obtained from character frequency of input data

  23. Determine p • Invalid instructions: 1. Privileged instructions 2. Wrong segment prefix selector 3. Uninitialized memory access • Only 1 and 2 can be determined on a standalone basis

  24. Experimental Setup

  25. Implementation

  26. Experimental Setup • Benign data setup • ASCII stream captured from live CISE network using Ethereal • Malicious data setup • Existing framework used to generate ASCII worms by converting binary worms • Promising experimental results for max valid instruction length • Benign: all max values below threshold τ • Malicious: values significantly higher than τ

  27. Experimental Results (DAWN)

  28. Experimental Results (APE-L)

  29. Contrasting with APE • Full content examination • Threshold calculation • Sled Vs. malware • Exploiting text-specific properties

  30. Multilevel Encryption • Encryption: binary → ASCII → ASCII • Only visible decrypter • Decryption: ASCII → ASCII → binary

  31. Multilevel Encryption [Figure: mapping between binary and three text ranges: 0x20 – 0x3F, 0x40 – 0x5F, 0x60 – 0x7E]

  32. Questions

  33. Thank you
