160 likes | 322 Views
Anagram: A Content Anomaly Detector Resistant to Mimicry Attack. Ke Wang; Janak J. Parekh; Salvatore J. Stolfo; Proc. Recent Advances in Intrusion Detection, 2006. Reporter: Luo Sheng-Yuan 2009/08/06. Outline. Introduction Related Work Proposed Scheme Experiments Result Conclusion.
E N D
Anagram: A Content Anomaly Detector Resistant to Mimicry Attack Ke Wang; Janak J. Parekh; Salvatore J. Stolfo; Proc. Recent Advances in Intrusion Detection, 2006 Reporter: Luo Sheng-Yuan 2009/08/06
Outline • Introduction • Related Work • Proposed Scheme • Experiments Result • Conclusion
Introduction • Generality for broad application to any service • Detect for zero-day attacks • Against mimicry attacks • High-order n-gram analysis
Related Work • Byte Frequency Distribution • Wang, K. and S.J. Stolfo. Anomalous Payload-based Network Intrusion Detection. in Symposium on Recent Advances in Intrusion Detection. 2004.
Related Work • PAYL’s Scheme Normal Packet Incoming Packet Training Normal Abnormal Compute Mahalanobis Distance
Related Work • Euclidean Distance & Mahalanobis Distance
Related Work • Evading PAYL • Kolesnikov, O., D. Dagon, and W. Lee, Advanced Polymorphic Worms: Evading IDS by Blending in with Normal Traffic, in USENIX Security Symposium. 2006.
Proposed Scheme • N-gram Analysis • An n-gram is a subsequence of n items from a given sequence. • 5-gram example Given a sequence of letters(“worl”), what is the next letter? (a=0.001, b=0.001, c=0.001, d=0.8, ......)
Proposed Scheme • N-gram Analysis • Frequency-based • All element's value is probability • Binary-based • All element's value is zero or one • N-gram model size • 256^N in ASCII
Proposed Scheme • Training phase • Storing all of the distinct n-grams observed during training. • Test phase
Proposed Scheme • Bloom Filter • BF is a convenient tool to represent the binary model.
Proposed Scheme • Randomization against mimicry attack
Experiments Result • Train for 500 hours of traffic data
Experiments Result • False positive rate
Conclusion • The core hypothesis is that any new, zero-day exploit will contain a portion of data that has never before been delivered to the application. • Anagram raises the bar for attackers making mimicry attacks harder.
Comment • The binary-based approach is not tolerant of noisy training. • Computation time is longer than PAYL.