DAWN: A Novel Strategy for Detecting ASCII Worms in Networks • Parbati Kumar Manna, Sanjay Ranka, Shigang Chen • Department of Computer and Information Science and Engineering, University of Florida • IEEE INFOCOM 08
Outline • Introduction • ASCII Worm • Detection Strategies • Probabilistic Analysis • Implementation • Evaluation • Conclusions
Introduction • Almost any ASCII string translates into a syntactically correct sequence of instructions • The proportion of branch instructions in ASCII data is significantly higher than in binary data • Prune the number of paths to be inspected
ASCII Worm • ASCII data: bytes 0x20 to 0x7E • Maximal valid instruction sequence • LMVI: Length of the Maximal Valid Instruction sequence
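The two definitions on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a disassembler has already produced one validity flag per instruction, and the names `is_printable_ascii` and `lmvi` are hypothetical.

```python
def is_printable_ascii(data: bytes) -> bool:
    """True if every byte lies in the printable range 0x20..0x7E."""
    return all(0x20 <= b <= 0x7E for b in data)


def lmvi(valid_flags) -> int:
    """Length of the longest run of consecutive valid instructions.

    valid_flags is assumed to hold one boolean per disassembled
    instruction (True = valid, False = invalid).
    """
    longest = current = 0
    for is_valid in valid_flags:
        current = current + 1 if is_valid else 0
        longest = max(longest, current)
    return longest


print(is_printable_ascii(b"Hello, world!"))          # True
print(lmvi([True, True, False, True, True, True]))   # 3
```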
ASCII Worm • Intel opcodes in ASCII • Dual-operand register/memory manipulation • sub, xor, and, imul • Single-operand register manipulation • inc, dec • Stack-manipulation • push, pop, popa • Jump • jo, jno, jb, jae, je, jne, jbe, ja, js, jns, jp, jnp, jnge, jnl, jng
ASCII Worm • I/O operation • insb, insd, outsb, outsd • Miscellaneous • aaa, daa, das, bound, arpl • Operand and Segment override prefixes • cs, ds, es, fs, gs, ss, a16, o16 • Emulating mov eax, ebx (mov has no ASCII opcode): push ebx, then pop eax
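The push/pop substitution for mov can be checked at the byte level. The sketch below assumes 32-bit mode and the standard one-byte encodings (push r32 = 0x50 + register number, pop r32 = 0x58 + register number); the helper names are illustrative only.

```python
# Register numbers in the standard 32-bit x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def push_reg(reg: str) -> bytes:
    """One-byte encoding of `push reg` (opcodes 0x50..0x57)."""
    return bytes([0x50 + REG32.index(reg)])


def pop_reg(reg: str) -> bytes:
    """One-byte encoding of `pop reg` (opcodes 0x58..0x5F)."""
    return bytes([0x58 + REG32.index(reg)])


def emulate_mov(dst: str, src: str) -> bytes:
    """Emulate `mov dst, src` as `push src; pop dst`."""
    return push_reg(src) + pop_reg(dst)


print(emulate_mov("eax", "ebx"))  # b'SX'
```

Both bytes fall inside 0x20 to 0x7E, so the two-instruction sequence is itself printable text.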
ASCII Worm • Both the decrypter and the encrypted payload should be ASCII • The size of the decrypter should be small • There should not be a significant size discrepancy between the encrypted payload and the cleartext
Detection Strategies • Constraints of an ASCII Worm • Opcode Unavailability • Difficulty in Encryption • Control Flow Constraints • Self-mutation is a mandatory constraint • Decrypting n bytes of instructions requires an O(n)-byte decrypter
Detection Strategies • Prevalence of Privileged Instructions • The characters l, m, n, o decode to insb, insd, outsb, outsd • Illegal Memory Access • Uninitialized register • Wrong segment selector • Explicit memory address
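The character-to-opcode correspondence for the privileged I/O instructions can be written as a small lookup table. This is only an illustration: a byte acts as an opcode only when it falls at the start of an instruction, which a real disassembler decides, and the names `IO_OPCODES` and `io_mnemonic` are hypothetical.

```python
# One-byte x86 string I/O opcodes that coincide with lowercase letters.
IO_OPCODES = {
    0x6C: "insb",   # 'l'
    0x6D: "insd",   # 'm'
    0x6E: "outsb",  # 'n'
    0x6F: "outsd",  # 'o'
}


def io_mnemonic(byte: int):
    """Return the I/O mnemonic for an opcode byte, or None if it is not one."""
    return IO_OPCODES.get(byte)


for ch in "lmno":
    print(ch, hex(ord(ch)), io_mnemonic(ord(ch)))
```

Because these four letters are common in English text, benign ASCII data disassembles into privileged instructions frequently, which is what keeps its valid instruction sequences short.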
Probabilistic Analysis • Assumptions: • The characters in the traffic are independently distributed • Bernoulli trial
Probabilistic Analysis • Invalid instruction • Privileged instruction • Memory-accessing instructions
Probabilistic Analysis • Notation: • p: the probability that an instruction is invalid • n: the total number of instructions • N: the total number of invalid instructions (equivalently, the number of valid instruction sequences) • Instruction stream: S1 S2 S3 … SN • Xi: the length of Si • Xmax: max{X1, X2, …, XN}
Probabilistic Analysis • p.m.f of N: • p.m.f of Xi: • c.d.f of Xi:
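The formulas for this slide did not survive extraction. Under the stated assumptions (each of the n instructions is invalid independently with probability p), the standard forms would be a binomial distribution for N and a geometric distribution for each Xi. The block below is a reconstruction from those assumptions, not a copy of the original slide, and it assumes Xi may be zero when two invalid instructions are adjacent.

```latex
\begin{align*}
P[N = k]      &= \binom{n}{k}\, p^{k} (1-p)^{n-k}, & k &= 0, 1, \dots, n \\
P[X_i = x]    &= (1-p)^{x}\, p,                     & x &= 0, 1, 2, \dots \\
P[X_i \le x]  &= \sum_{j=0}^{x} (1-p)^{j}\, p = 1 - (1-p)^{x+1}
\end{align*}
```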
Probabilistic Analysis • For an instance with exactly N sequences
Probabilistic Analysis • The c.d.f of Xmax
Probabilistic Analysis • The p.m.f of Xmax
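The distribution of Xmax can be derived as follows, assuming N is binomial with parameters n and p, the Xi are independent, and each Xi is geometric with P[Xi ≤ x] = 1 − (1−p)^(x+1). This is a reconstruction from the stated assumptions rather than the slides' own equations; it treats the sequence lengths as independent and ignores the constraint that they must sum to at most n.

```latex
\begin{align*}
P[X_{\max} \le x \mid N = k]
  &= \bigl(1 - (1-p)^{x+1}\bigr)^{k} \\
P[X_{\max} \le x]
  &= \sum_{k=0}^{n} \binom{n}{k} p^{k} (1-p)^{n-k}
     \bigl(1 - (1-p)^{x+1}\bigr)^{k} \\
  &= \bigl(1 - p\,(1-p)^{x+1}\bigr)^{n} \\
P[X_{\max} = x]
  &= \bigl(1 - p\,(1-p)^{x+1}\bigr)^{n}
   - \bigl(1 - p\,(1-p)^{x}\bigr)^{n}, \qquad x \ge 1
\end{align*}
```

The second step uses the binomial theorem: the sum collapses to ((1−p) + p·a)^n with a = 1 − (1−p)^(x+1).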
Probabilistic Analysis • Verifying Model • Using Monte-Carlo Simulation
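A Monte-Carlo check of this kind can be sketched as follows. The simulation draws n independent Bernoulli trials and records the longest run of valid instructions; `model_cdf` is an assumed closed form, (1 − p(1−p)^(x+1))^n, derived from a binomial N and geometric Xi, and is not taken from the slides. All function names are hypothetical.

```python
import random


def simulate_xmax(n: int, p: float, rng: random.Random) -> int:
    """Longest run of valid instructions among n simulated instructions."""
    longest = current = 0
    for _ in range(n):
        if rng.random() < p:      # invalid instruction ends the current run
            current = 0
        else:
            current += 1
            if current > longest:
                longest = current
    return longest


def empirical_cdf(n: int, p: float, x: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated streams whose longest valid run is at most x."""
    rng = random.Random(seed)
    hits = sum(simulate_xmax(n, p, rng) <= x for _ in range(trials))
    return hits / trials


def model_cdf(n: int, p: float, x: int) -> float:
    """Assumed closed form for P[Xmax <= x]."""
    return (1 - p * (1 - p) ** (x + 1)) ** n


for x in (25, 40):
    print(x, empirical_cdf(1540, 0.227, x, trials=1000), model_cdf(1540, 0.227, x))
```

The values n = 1540 and p = 0.227 are the ones reported in the Evaluation slides; the simulated and modelled probabilities should agree up to sampling noise.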
Probabilistic Analysis • Threshold τ
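One way to pick τ, assuming it is defined as the smallest x for which a benign input exceeds it with probability at most α, is a direct search over the c.d.f. of Xmax. The closed form used here, (1 − p(1−p)^(x+1))^n, is an assumption derived from a binomial N and geometric Xi; the function name is hypothetical.

```python
def threshold(n: int, p: float, alpha: float) -> int:
    """Smallest x with P[Xmax <= x] >= 1 - alpha under the assumed model."""
    x = 0
    while (1 - p * (1 - p) ** (x + 1)) ** n < 1 - alpha:
        x += 1
    return x


# Values reported in the Evaluation slides: n = 1540, p = 0.227, alpha = 0.01.
print(threshold(1540, 0.227, 0.01))  # 40
```

With the parameters from the Evaluation slides, this search returns 40, matching the threshold reported there.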
Implementation • Instruction Disassembly • Instruction Sequence Analysis
Evaluation • Creation of the Test Data • Benign data: 100 cases, each containing nearly 4K printable ASCII characters
Evaluation • Determining Appropriate Thresholds for the Test Data • Determining p • 0.227 • Determining n • 1540 • Determining the threshold τ • 40 (when α = 0.01)
Evaluation • Experimental Results and Assessing the Effectiveness of the Detection Method
Conclusions • An ASCII worm must self-mutate to generate binary opcodes • This mutation requires many memory-writing instructions • The decrypter of an ASCII worm is relatively large
Conclusions • Benign ASCII data does not have such a long executable instruction sequence • The length of the maximal valid instruction sequence can be used to differentiate between benign and malicious data
Determining p • Prob[I/O instruction] + Prob[wrong-segment-override memory-accessing instruction] = 18.5% + 4.2% = 22.7%
Determining n • E[length of instruction] = E[length of prefix chain] + E[length of actual instruction] = 2.6 • n = total number of input characters / E[instruction size] = 4000 / 2.6 ≈ 1540
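The arithmetic behind p and n can be reproduced directly from the figures on these two slides. The variable names are illustrative; note that 4000 / 2.6 is about 1538.5, which the slides round to 1540.

```python
# Probabilities taken from the "Determining p" slide.
p_io = 0.185            # Prob[I/O instruction]
p_wrong_segment = 0.042  # Prob[wrong-segment-override memory-accessing instruction]
p = p_io + p_wrong_segment

# Figures taken from the "Determining n" slide.
total_chars = 4000            # roughly 4K printable characters per test case
mean_instruction_len = 2.6    # expected bytes per instruction, prefixes included
n_exact = total_chars / mean_instruction_len

print(round(p, 3))        # 0.227
print(round(n_exact, 1))  # 1538.5
```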