This thesis proposal outlines the design of a RAIDM system for network-based attack defense, focusing on router-based anomaly detection and mitigation. The proposed work includes research questions on achieving online anomaly detection, responding to zero-day polymorphic worms, protecting high-speed networks from exploits, and providing network situational awareness.
RAIDM: Router-based Anomaly/Intrusion Detection and Mitigation Zhichun Li EECS Department Northwestern University 2008-04-29 Thesis Proposal
Outline Motivation RAIDM System Design Finished Work Proposed Work Research Plan
Motivation Attackers Botnets Worms
Motivation • In a survey of 395 senior executives conducted by AT&T, network security was recognized as the single most important attribute of their networks. • Many newly emerging threats make the situation even worse. RAIDM Network-based attack defense system
Network Level Defense • Network gateways/routers are the vantage points for detecting large-scale attacks • Host-based detection/prevention alone is not enough for modern enterprise networks. • Enterprises might not want to rely only on their end users for security protection • Users might not want to stop their work to reboot machines or applications in order to apply patches.
Research Questions • How can we achieve online anomaly detection for high-speed networks? • How can we respond to zero-day polymorphic worms in their early stage? • Given vulnerabilities, how can we protect high-speed networks from exploits accurately and efficiently? • How can we provide quality information for network situational awareness?
Current Status • Part I: Sketch based monitoring & detection • Results in [Infocom06,ToN,ICDCS06] • Part II: Polymorphic worm signature generation • Results in [Oakland06,ICNP07] • Part III: Signature matching engines • Work in progress, will be the focus of this talk • Part IV: Network Situational Awareness • Work in progress
Part I: Sketch based monitoring & detection • Reversible Sketches (included for completeness) • Use intelligent hash function design to recover the aggregated value of a series of (key, value) updates for the popular keys. • Publications: • Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007 • Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400=18%)
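To illustrate the (key, value) update model that sketches operate on, here is a minimal count-min-style sketch. It is not the reversible sketch from the publications above: it can only estimate the aggregate for a key you already know, and cannot recover the popular keys. The class name, hash choice, and table sizes are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Plain count-min sketch (illustrative; NOT the reversible sketch)."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent-looking hash per row, derived by salting BLAKE2b.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, key, value=1):
        # Add `value` to one counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += value

    def estimate(self, key):
        # With non-negative updates, the minimum never underestimates.
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


sketch = CountMinSketch()
sketch.update("10.0.0.1", 5)
sketch.update("10.0.0.1", 3)
print(sketch.estimate("10.0.0.1"))
```

The memory used is depth × width counters regardless of how many distinct keys appear, which is the property that makes sketches attractive on high-speed links.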
Part I: Sketch based monitoring & detection • Sketch-based Anomaly Detection • Build anomaly detection engines based on reversible sketches to detect horizontal scan, vertical scan, and TCP SYN flooding attacks. • Further proposed 2D sketches to distinguish between the different types of attacks. • Publications • Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536=14%) (Alphabetical order)
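The difference between a horizontal scan (one source probing one port on many hosts) and a vertical scan (one source probing many ports on one host) can be sketched with exact per-key sets. This is only an illustration of the two definitions: it keeps exact state rather than sketches, and the function name, input format, and threshold are assumptions, not the method of the ICDCS paper.

```python
from collections import defaultdict


def classify_scanners(flows, threshold=3):
    """flows: iterable of (src_ip, dst_ip, dst_port) tuples (hypothetical input).

    Returns (horizontal, vertical): sets of source IPs that touched at least
    `threshold` distinct hosts on one port, or distinct ports on one host.
    """
    hosts_per_port = defaultdict(set)   # (src, dst_port) -> distinct dst hosts
    ports_per_host = defaultdict(set)   # (src, dst_ip)   -> distinct dst ports
    for src, dst, dport in flows:
        hosts_per_port[(src, dport)].add(dst)
        ports_per_host[(src, dst)].add(dport)
    horizontal = {src for (src, _), h in hosts_per_port.items() if len(h) >= threshold}
    vertical = {src for (src, _), p in ports_per_host.items() if len(p) >= threshold}
    return horizontal, vertical


flows = [
    ("a", "h1", 445), ("a", "h2", 445), ("a", "h3", 445),  # same port, many hosts
    ("b", "h1", 1), ("b", "h1", 2), ("b", "h1", 3),        # same host, many ports
]
print(classify_scanners(flows))
```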
Part II: Polymorphic worm signature generation • TOSG (Token-Based Signature Generation) • Uses a conjunction of tokens (substrings) as the signature for polymorphic worms • Advantage • Does not require protocol knowledge or information about the vulnerable program • Fast and noise tolerant • Has an analytical attack resilience bound under certain assumptions. • Limitation • Does not have good attack resilience to the deliberate noise injection attack [Perdisci 2006] • Publication: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251=9%)
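Matching a token-conjunction signature can be sketched as follows: a payload matches only if every token appears somewhere in it. This shows matching only, not how Hamsa selects the tokens; the function name and the sample tokens are hypothetical.

```python
def matches_token_conjunction(payload: bytes, tokens: list) -> bool:
    """Return True if every token (a bytes substring) occurs in the payload."""
    return all(token in payload for token in tokens)


# Hypothetical signature: both tokens must be present, in any order.
signature = [b"GET ", b"Host:"]
print(matches_token_conjunction(b"GET /x HTTP/1.1\r\nHost: a\r\n", signature))
print(matches_token_conjunction(b"POST /x HTTP/1.1\r\n", signature))
```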
Part II: Polymorphic worm signature generation • LESG (Length-Based Signature Generation) • Proposes using a set of field lengths of the vulnerable program's protocol as signatures. • Mainly works for buffer overflow worms • Advantage: • Fast and noise tolerant • Has an analytical attack resilience bound under certain assumptions • The bound holds under all the recently proposed attacks. • Publication: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220=14%)
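A length-based signature can be pictured as a set of (field, length bound) pairs. The sketch below assumes that a message matches when any listed field is longer than its bound, which fits the buffer overflow case; that matching rule, the function name, and the field names are assumptions made for illustration, not details taken from the ICNP paper.

```python
def matches_length_signature(field_lengths: dict, signature: dict) -> bool:
    """field_lengths: parsed field name -> observed length in bytes.
    signature: field name -> maximum benign length (hypothetical bounds).
    Assumed rule: match if ANY signature field exceeds its bound."""
    return any(
        field_lengths.get(field, 0) > bound for field, bound in signature.items()
    )


signature = {"query_name": 255, "filename": 128}   # hypothetical bounds
print(matches_length_signature({"query_name": 40, "filename": 900}, signature))
print(matches_length_signature({"query_name": 40, "filename": 12}, signature))
```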
Proposed Work • Part III: Signature Matching Engine • NetShield, a protocol semantic vulnerability signature matching engine. (focus of this talk) • Report: Zhichun Li, Gao Xia, Yi Tang, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching
Proposed Work • Part IV: Network Situational Awareness • Botnet Inference: • Infer scan properties based on honeynet traffic: trend, uniform, hitlist, and collaboration • Extrapolate the global scan scope and global number of bots based on limited local observation. Can be used to detect targeted attacks. • Report: Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets • P2P Misconfiguration Diagnosis • Found that P2P misconfiguration traffic is one of the major sources of Internet background radiation • eMule P2P misconfiguration is due to byte ordering • For BitTorrent, we found that anti-P2P companies deliberately inject bogus peers • Report: Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic
NetShield Overview • Goal • Feasibility Study: a Measurement Approach • High Speed Parsing • High Speed Matching for Large Rulesets • Preliminary Evaluation • Discussion
Signature Matching Engine • Accuracy (especially for IPS) • False positive • False negative • Speed • Coverage: Large ruleset
Reason • Regular expressions are not powerful enough to capture the exact vulnerability condition! [Figure: Shield vs. RE; regular expressions cannot express the exact condition, Shield-style signatures can]
Feasibility Study • Protocol semantics can help (Shield project [SIGCOMM04]) • How much for NIDS/IPS? • Given a NIDS/NIPS with a large ruleset • What percentage of the rules can be improved by protocol semantic vulnerability signatures?
Measure Snort rules • Semi-manually classify the rules. • First by CVE ID • Manually look at each vulnerability • Results • 86.7% of rules can be improved by protocol semantic vulnerability signatures. • 9.9% of rules are related to web DHTML and scripts, which are not suitable for a signature-based approach. • On average 4.5 Snort rules reduce to one vulnerability signature • Binary protocols have a larger reduction ratio than text-based protocols.
Towards high speed parsing • Protocol parsing problem formulation • Given a PDU and the state carried over from previous PDUs, output the set of fields required by matching. • Observation • Parsing State Machine
Observation • PDU parse tree • Leaf nodes (basic fields) are integers or strings • Vulnerability signatures are mostly based on basic fields • Only need to parse out the fields related to signatures [Figure: PDU parse tree; node labels include PDU and array]
Parsing State Machine • Studied eight popular protocols (HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS) and their vulnerability signatures. • Protocol semantics are context sensitive • Common relationships among basic fields.
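The idea of parsing out only the fields that signatures need can be sketched on a made-up PDU layout: a sequence of fields, each with a 1-byte id, a 2-byte big-endian length, and a value. The layout and the function name are assumptions for illustration, not a real protocol or the NetShield parser. Unwanted fields are skipped by advancing the offset without copying their bytes.

```python
import struct


def extract_fields(pdu: bytes, wanted: set) -> dict:
    """Return {field_id: value} for the wanted ids only (hypothetical layout)."""
    out = {}
    offset = 0
    while offset + 3 <= len(pdu):
        # Header: 1-byte field id, 2-byte length, network byte order.
        field_id, length = struct.unpack_from("!BH", pdu, offset)
        offset += 3
        if field_id in wanted:
            out[field_id] = pdu[offset:offset + length]
        # Skip the value either way; unwanted fields are never copied.
        offset += length
    return out


pdu = struct.pack("!BH", 1, 3) + b"abc" + struct.pack("!BH", 2, 2) + b"zz"
print(extract_fields(pdu, {2}))
```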
Example for WINRPC • Nodes • States: S1 .. Sn • 0.61 instruction/byte for BIND PDU
High speed matching • Problem formulation • Observation • Candidate Selection Algorithm • Algorithm Refinement
Matching Problem Formulation • Data representation • For all the vulnerability signatures we studied, we need only integers and strings • Integer operators: ==, >, < • String operators: ==, match_re(.,.), len(.) • Buffer constraint • The string fields could be too long to buffer. • Influences whether we can change the matching order • Field dependency • Array (e.g., DNS_questions, or RR records) • Associative array (e.g., HTTP headers) • Mutually exclusive fields.
Matching Problem Formulation (2) • PDU level protocol state machine • For complex stateful protocols • For most stateful protocols the state machine is quite simple WINRPC example
Matching problems (cont.) • Example signature for Blaster worm • Single PDU matching problem (SPM) • Multiple PDU matching problem (MPM)
Single PDU Matching • Suppose we have n signatures, each defined on k matching dimensions (matchers) • A matcher is a two-tuple (field, operation), or a four-tuple for associative array elements. • For example: • (Filename, RE) • (Version, Range_check) • Version > 3 • Version == 1 • k is the number of all possible matchers for the n signatures.
Table Representation • We use an n×k table to represent the rules: n rows (signatures) by k columns (matchers).
Requirement for SPM • Large number of signatures n • Large number of matchers k • Large number of “don’t cares” • Cannot reorder the matchers arbitrarily (buffer constraint) • Field dependency • Array • Associative array • Mutually exclusive fields.
Compare to packet classification • Similarity: both problems are defined on k matching dimensions and allow wildcards • Differences: • Large k and large number of “don’t cares” • Buffer constraint • Regular expression matcher • Field dependency • Related work on packet classification • Exhaustive search • Decision tree • Tuple space • Divide and Conquer (Decomposition)
Difficulty • A more complex problem than packet classification • Packet classification theoretical worst case bound • Based on computational geometry • $O((\log N)^{k-1})$ worst-case time or $O(N^k)$ worst-case memory • Solution: use the characteristics from real traces
Observation • Observation 1: most matchers are good. • After matching against them, only a small number of signatures can pass (candidates). • String matchers are all good, most integer matchers are good. • We can buffer the bad matchers to change the matching order • Observation 2: real-world traffic mostly does not match any signature. The actual situation is even stronger: in most cases no matcher will match any rule. • Observation 3: the NIDS/IPS will report all the matched rules regardless of the ordering. This differs from firewall rules.
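One possible in-memory form of the n×k table is shown below, with `None` standing for a "don't care" cell. The matchers, the three signatures, and the row-by-row checker are hypothetical; the checker is only a naive baseline that makes the table semantics concrete, not the candidate selection algorithm of the talk.

```python
import re

DONT_CARE = None  # wildcard cell: the signature ignores this matcher

# k = 3 matchers (columns): (field, operation). Names are hypothetical.
MATCHERS = [
    ("Version", lambda value, arg: value > arg),
    ("Version", lambda value, arg: value == arg),
    ("Filename", lambda value, arg: re.search(arg, value) is not None),
]

# n = 3 signatures (rows); each cell is the matcher argument or DONT_CARE.
TABLE = [
    [3,         DONT_CARE, DONT_CARE],   # Version > 3
    [DONT_CARE, 1,         r"\.exe$"],   # Version == 1 and Filename ends in .exe
    [DONT_CARE, DONT_CARE, r"^cmd"],     # Filename starts with cmd
]


def matching_rows(table, fields):
    """Naive baseline: check every non-wildcard cell of every row."""
    hits = []
    for row_id, row in enumerate(table):
        ok = True
        for (field, op), arg in zip(MATCHERS, row):
            if arg is DONT_CARE:
                continue
            if field not in fields or not op(fields[field], arg):
                ok = False
                break
        if ok:
            hits.append(row_id)
    return hits


print(matching_rows(TABLE, {"Version": 1, "Filename": "setup.exe"}))
```

The baseline costs time proportional to the number of non-wildcard cells for every PDU, which is what a column-wise approach tries to avoid when n and k are large.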
Basic idea • Decide the matcher order at pre-computation, buffer the bad ones to the end if possible • When a PDU comes, match against each matcher (column) for all the signatures simultaneously and get the possible candidates for the next step • Combine the candidate sets together to get the final matched signatures
Match single matcher • Integer range checking: Binary search tree • String exact matching: Trie • String regular expression matching: DFA. • String length checking: Binary search tree
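Matching one column for all signatures at once can be sketched for integer conditions of the form "field > threshold". The slide names a binary search tree; this sketch swaps in a sorted list with Python's `bisect` module, which gives the same logarithmic lookup. The class name and the sample thresholds are assumptions.

```python
import bisect


class GreaterThanMatcher:
    """One matcher column holding many 'value > threshold' conditions."""

    def __init__(self, conditions):
        # conditions: list of (threshold, signature_id), sorted at pre-computation.
        self._sorted = sorted(conditions)
        self._thresholds = [threshold for threshold, _ in self._sorted]

    def match(self, value):
        # All thresholds strictly below `value` are satisfied.
        cut = bisect.bisect_left(self._thresholds, value)
        return {sig_id for _, sig_id in self._sorted[:cut]}


matcher = GreaterThanMatcher([(3, "sigA"), (5, "sigB"), (7, "sigC")])
print(matcher.match(5))
```

A single lookup returns the whole set of signatures that pass this column, which is the per-column set the candidate selection step combines.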
Candidate Selection for SPM • Basic algorithm: pre-computation
Matching Illustration [Figure: candidate sets A2 and B2]
Matching Illustration • Compute the operations • Explicit calculation • Based on an n×k bitmap, decide whether an element in S_i requires the next matcher. • For those that require the next matcher, check whether they are also in A_{i+1} • Implicit calculation (for bad matchers) • Do not calculate A_{i+1}, since it could be large • Check sequentially whether the candidates in S_i can match matcher (i+1) • When bad matchers are buffered to the end, B will be small.
Refinement • SPM improvement • Allow negative conditions • Handle the array case • Handle the associative array case • Handle the mutually exclusive case • Report the matched rules as early as possible • Extend to MPM • Allowing checkpoints.
Results • Traces from Tsinghua Univ. (TH) and Northwestern Univ. (NU) • After TCP reassembly, the PDUs are preloaded in memory • For DNS we only evaluate parsing. • For WINRPC we have 45 vulnerability signatures which cover 3,519 Snort rules • For HTTP we have 791 vulnerability signatures which cover 941 Snort rules.
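The explicit calculation can be sketched as a loop over the matcher columns. This is a simplified reading of the slides, not the NetShield implementation: `requires[s][i]` plays the role of the n×k bitmap, `column_hits[i]` is the set of signatures whose condition on matcher i is satisfied by the current PDU, and newly admitted candidates are the signatures whose first required matcher is i. The function name and the assumption that every signature requires at least one matcher are mine.

```python
def select_candidates(requires, column_hits):
    """requires[s][i]: True if signature s has a condition on matcher i.
    column_hits[i]: set of signature ids satisfied on matcher i (A_i).
    Returns the set of signatures whose required matchers all passed."""
    n = len(requires)
    k = len(requires[0]) if n else 0
    candidates = set()
    for i in range(k):
        a_i = column_hits[i]
        # Keep a candidate if it does not care about matcher i, or passes it.
        survivors = {s for s in candidates if not requires[s][i] or s in a_i}
        # Admit signatures whose first required matcher is i and that pass it.
        new = {s for s in a_i if requires[s][i] and not any(requires[s][:i])}
        candidates = survivors | new
    return candidates


requires = [
    [True,  False, True],   # signature 0 needs matchers 0 and 2
    [False, True,  False],  # signature 1 needs matcher 1
    [True,  True,  False],  # signature 2 needs matchers 0 and 1
]
column_hits = [{0, 2}, {1}, {0}]
print(select_candidates(requires, column_hits))
```

When no matcher hits, which the observations say is the common case, every `column_hits[i]` is empty and the candidate set stays empty throughout, so the loop does almost no work.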
Discussion • So far we have found that the candidate selection algorithm works well in practice • Further thoughts • How to rely more on hardware assistance? • TCAM? • Use bitmaps to express set operations? • Can we use traffic statistics to further improve efficiency?
Publications • Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP 2007. • Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible sketches: Enabling monitoring and analysis over high speed data streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007 • Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 • Zhichun Li, Yan Chen and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD 2006 • Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS 2006 • Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006
Research Time Plan • Apr 2008 – Jun 2008: • Finish remaining experiments of network situational awareness • Sep 2008 – Mar 2009: • Refine the vulnerability signature matching algorithm • Fully implement, deploy and evaluate the NetShield prototype • Prepare job applications and interviews • Apr 2009 – Jun 2009: • PhD dissertation writing • Thesis Defense
Q & A Thanks!
Outline Motivation Feasibility Study: a measurement approach Problem Statement High Speed Parsing High Speed Matching for massive vulnerability Signatures. Evaluation Conclusions