This paper discusses the evolution of network intrusion detection signatures and introduces protomatching as a more efficient approach for signature enforcement.
Protomatching Network Traffic for High Throughput Network Intrusion Detection
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. NIDS Attacker Network Signature database
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. GET <URL>/cmd.exe HTTP/1.1\n NIDS Attacker Network • “cmd.exe” is the attack pattern Signature database cmd.exe
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. Be aware of the “cmd.exe” attack NIDS Shai Network • “cmd.exe” is the attack pattern Signature database cmd.exe
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. GET <URL>/cmd.exe HTTP/1.1\n NIDS Attacker Network • “cmd.exe” is the attack pattern, • but only if it is part of a URL Signature database cmd.exe
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. POST <URL>/cmd.exe HTTP/1.1\n NIDS Attacker Network • “cmd.exe” is the attack pattern, • but only if it is part of a URL, • and the HTTP method is GET Signature database cmd.exe
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. GET <URL>/CMD.exe HTTP/1.1\n NIDS Attacker Network • “cmd.exe” is the attack pattern, • but only if it is part of a URL, • and the HTTP method is GET, • and takes into account upper-lower case characters, Signature database cmd.exe
Signature evolution • Informally, a signature is usually defined as “a characteristic pattern of the attack”. GET <URL>/%43MD.exe HTTP/1.1\n NIDS Attacker Network • “cmd.exe” is the attack pattern, • but only if it is part of a URL, • and the HTTP method is GET, • and takes into account upper-lower case characters, • and takes into account HTTP encodings Signature database cmd.exe
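The progression on the slides above can be illustrated with a small sketch. It contrasts a plain substring search with a check that also requires the GET method, the URL context, case folding, and %XX decoding. The function names and sample requests are hypothetical, and `urllib.parse.unquote` stands in for HTTP decoding (it does not handle the %uXXXX form).

```python
from urllib.parse import unquote

SIGNATURE = "cmd.exe"

def naive_match(raw: str) -> bool:
    """Plain substring search over the raw request text."""
    return SIGNATURE in raw

def context_aware_match(raw: str) -> bool:
    """Match only when the pattern is in the URL of a GET request,
    ignoring case and %XX escapes (the %uXXXX form is not decoded here)."""
    request_line = raw.split("\n", 1)[0]
    parts = request_line.split(" ")
    if len(parts) != 3 or parts[0] != "GET":
        return False
    url = unquote(parts[1]).lower()
    return SIGNATURE in url

requests = [
    "GET /scripts/cmd.exe HTTP/1.1\n",       # plain attack
    "GET /scripts/%43MD.exe HTTP/1.1\n",     # encoded attack: naive search misses it
    "POST /form HTTP/1.1\n\nnote=cmd.exe",   # pattern outside the URL: naive search flags it
]
for r in requests:
    print(naive_match(r), context_aware_match(r))
```

The second request is a false negative for the naive search and the third is a false positive, which is the gap the added context is meant to close.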
Problem in This Talk • What we specify: a traditional signature that exposes: • false negatives • false positives • What we enforce: a signature that inherently fits the attack. • Goal: Develop a signature that is cheaper to enforce [Figure: two diagrams of TCP streams, each showing the cmd.exe attack and the traffic matched by a traditional signature]
Contributions • Conceptual: Protomatching signature • Practical: Superset Protomatcher • Real world impact: 25% improvement in Snort performance
Protomatching Signature • It is a regular expression with two properties: • First, it ensures that the characteristic pattern of an attack appears in the context that is necessary for the attack to succeed. • Second, it matches both normalized and encoded versions of an attack.
Superset protomatcher • It recognizes a superset of the traffic matched by a full-coverage protomatcher. • Three properties: • A superset protomatcher consumes less memory. • Traffic that matches the superset protomatcher may not match any NIDS signature. • Traffic that does not match the superset protomatcher also does not match any signature in the NIDS database.
Related work • Protocol analysis and traffic normalization • Modern NIDS are based on the ANM methodology. • Ptacek and Newsham were the first to recognize that a NIDS that does not perform normalization is susceptible to evasion. • The problem of alternate encodings is particularly painful for HTTP traffic.
Related Work II • Fast pattern matching for NIDS • Previous work does not solve the encodings problem and does not consider protocol analysis in the matching algorithm • Researchers have proposed using regular expression matching • To match regular expressions, Sommer and Paxson used a DFA. However, they performed matching on already-normalized traffic.
Related Work III • Dealing with high-speed links. • To deal with high-speed links, researchers have suggested a distributed NIDS that balances the network traffic such that each sensor monitors a different portion of the protected network • Our work focuses on the performance of a single sensor. It is complementary to a distributed design and can be combined with one.
Analyze-normalize-match (ANM) approach • First, a NIDS encodes its signatures in a normalized form • At runtime, the NIDS parses the traffic according to the protocol the attack uses and normalizes the traffic • Last, the NIDS matches the normalized traffic against its normalized signatures.
Current conversion and signature matching • Naively, each phase requires traversing the input • In practice (e.g., Snort) two traversals: • Protocol analysis + normalization • Matching • Notice that all traffic, benign and malicious, requires all three phases GET <…>/%43MD.exe HTTP/1.1\n Protocol analysis Method = GET URL = <…>/%43MD.exe Version = HTTP/1.1 Normalization Sig=CMD.EXE URL=CMD.EXE String matching No Yes Benign Malicious
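A minimal sketch of the three ANM phases as separate functions, assuming a well-formed HTTP request line and a single normalized signature. The function names are hypothetical and this is not Snort's implementation; it only shows that every request, benign or malicious, passes through all three phases.

```python
from urllib.parse import unquote

NORMALIZED_SIGNATURES = ["CMD.EXE"]  # signatures stored in normalized (upper-case) form

def analyze(raw: str) -> dict:
    """Phase 1, protocol analysis: split the request line into its fields.
    Assumes a well-formed request line with exactly three fields."""
    method, url, version = raw.split("\n", 1)[0].split(" ", 2)
    return {"method": method, "url": url, "version": version}

def normalize(fields: dict) -> dict:
    """Phase 2, normalization: decode %XX escapes and fold case in the URL."""
    return {**fields, "url": unquote(fields["url"]).upper()}

def match(fields: dict) -> bool:
    """Phase 3, matching: search the normalized URL for each normalized signature."""
    return any(sig in fields["url"] for sig in NORMALIZED_SIGNATURES)

def anm(raw: str) -> bool:
    return match(normalize(analyze(raw)))

print(anm("GET /scripts/%43MD.exe HTTP/1.1\n"))  # malicious
print(anm("GET /index.html HTTP/1.1\n"))         # benign, yet it paid for all three phases
```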
Protomatching GET <…>/%43MD.exe HTTP/1.1\n GET <…>/%43MD.exe HTTP/1.1\n Protocol analysis Sig=???? Method = GET URL = <…>/%43MD.exe Version = HTTP/1.1 • Goal: • Single traversal on the input • Protomatching= • Protocol analysis+ Normalization+ • Matching Normalization Sig=CMD.EXE URL=CMD.EXE Pattern matching No No Yes Yes Benign Malicious Benign Malicious
Protomatching GET <…>/%43MD.exe HTTP/1.1\n GET <…>/%43MD.exe HTTP/1.1\n Protocol analysis Sig=Regular expression Method = GET URL = <…>/%43MD.exe Version = HTTP/1.1 Single pass implies: use a Deterministic Finite State Machine Normalization Sig=CMD.EXE URL=CMD.EXE Pattern matching No No Yes Yes Benign Malicious Benign Malicious
Converting a traditional signature into a protomatching signature • Let S be a traditional signature • Expand S to conform to the protocol specification
Traditional signature • *[c|C][m|M][d|D].[e|E][x|X][e|E] • 8 states • size = 8*256=2048 bytes
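The state count and table size on this slide can be reproduced with a small sketch. It builds a KMP-style DFA for a case-insensitive search for "cmd.exe" and assumes one byte per transition entry, which is where 8*256 = 2048 bytes comes from. The construction below is a standard textbook one and is not claimed to be the one used in the talk.

```python
def build_dfa(pattern: bytes) -> list:
    """Build a transition table table[state][byte] for a case-insensitive
    substring search. State i means the last i bytes match pattern[:i];
    the final state is accepting and absorbing."""
    pat = pattern.lower()
    m = len(pat)
    table = [[0] * 256 for _ in range(m + 1)]
    fallback = 0  # state to resume from after a mismatch
    for state in range(m + 1):
        for byte in range(256):
            folded = bytes([byte]).lower()[0]  # fold ASCII upper case to lower case
            if state == m:
                table[state][byte] = m
            elif folded == pat[state]:
                table[state][byte] = state + 1
            elif state > 0:
                table[state][byte] = table[fallback][byte]
        if 0 < state < m:
            fallback = table[fallback][pat[state]]
    return table

def dfa_match(table: list, data: bytes) -> bool:
    """Single pass over the input: one table lookup per byte."""
    state = 0
    for b in data:
        state = table[state][b]
    return state == len(table) - 1

table = build_dfa(b"cmd.exe")
print(len(table), len(table) * 256)  # 8 2048
print(dfa_match(table, b"GET /scripts/CMD.exe HTTP/1.1\n"))
```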
Add a little bit of context • *”GET”*[c|C][m|M][d|D].[e|E][x|X][e|E] • 12 states • size = 12*256=3072 bytes
And even more context • (*\n\n)*”GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E] • 18 states • size = 18*256=4608 bytes • SP denotes white space characters, and PN denotes characters • that can appear in a URL according to the HTTP specification • (e.g., ‘\n’ cannot appear in a URL).
Converting a traditional signature into a protomatching signature • Let S be a traditional signature • Expand S to conform to the protocol specification, obtaining S’ • Expand S’ to account for all possible encodings, obtaining S’’
Representing encodings The character c can be represented as: C, c, %43, %63, %U0043, %U0063, %u0043, %u0063 Replace every instance of the small machine with the large machine
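The eight representations listed on this slide can be generated mechanically. The sketch below uses a hypothetical helper, `encodings`, that emits the literal, %XX, %UXXXX and %uXXXX forms for both cases of a character. It emits upper-case hex digits only, so variants such as %2e versus %2E are not covered.

```python
def encodings(ch: str) -> list:
    """Return the literal, hex-escaped and U-escaped forms of a character,
    for both its upper-case and lower-case variants."""
    forms = []
    for variant in (ch.upper(), ch.lower()):
        code = ord(variant)
        forms += [variant, f"%{code:02X}", f"%U{code:04X}", f"%u{code:04X}"]
    # Remove duplicates while keeping order (non-letters have a single case).
    return list(dict.fromkeys(forms))

print(encodings("c"))
```

Each character of the signature is then replaced by an alternation over these forms, which is why the state machine grows.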
And even more context • (*\n\n)*”GET”[SP]+(PN)*[c|C][m|M][d|D].[e|E][x|X][e|E] • 18 states • size = 18*256=4608 bytes
*\n\n”GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and HEX encoding and U encoding • 53 states • size = 53*256=13,568 bytes
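As a rough illustration of the expanded signature, the sketch below assembles a comparable pattern with Python's `re` module. Several simplifications are assumptions: the pattern is anchored at the start of a single request instead of handling a stream of requests, "any non-whitespace character" stands in for PN, only upper-case hex digits are accepted, and `re` is a backtracking engine, not the DFA the talk relies on.

```python
import re

def char_alternatives(ch: str) -> str:
    """Regex fragment matching every literal, %XX and %uXXXX form of ch."""
    alts = []
    for variant in sorted({ch.upper(), ch.lower()}):
        code = ord(variant)
        alts += [re.escape(variant), f"%{code:02X}", f"%[Uu]{code:04X}"]
    return "(?:" + "|".join(alts) + ")"

def protomatching_regex(pattern: str) -> "re.Pattern":
    body = "".join(char_alternatives(c) for c in pattern)
    # "GET", one or more spaces, any run of URL characters, then the encoded pattern.
    return re.compile(r"^GET +[^\s]*" + body)

sig = protomatching_regex("cmd.exe")
print(bool(sig.match("GET /scripts/%43MD.exe HTTP/1.1\n")))
print(bool(sig.match("GET /a/%u0063%6D%44.exe HTTP/1.1\n")))
print(bool(sig.match("POST /scripts/cmd.exe HTTP/1.1\n")))
```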
Building a protomatcher • Let S be a traditional signature • Expand S to conform to the protocol specification, obtaining S’ • Expand S’ to account for all possible encodings, obtaining S’’ • Perform steps 1-3 for every traditional signature in your database, obtaining S1’’, S2’’, …, Sn’’ • Build the protomatcher: an FSM that identifies S1’’, S2’’, …, Sn’’ Problem: we increased each signature by a factor of 7 (at least). A full protomatcher does not fit into 2GB (or 4GB) of memory
Superset protomatching signature • Assumption: the majority of the benign traffic is not only benign, but also not even similar to malicious traffic. • For example, most benign traffic not only does not contain “cmd.exe”, but also does not contain “cmd.” • Note that if a request does not contain “cmd.”, then it also does not contain “cmd.exe” • “cmd.” is a superset signature because it matches the attack and more
Full protomatching signature for cmd.exe • *\n\n”GET”[SP]+(PN)*[c-C][m-M][d-D].[e-E][x-X][e-E] and HEX encoding and U encoding • 53 states • size = 53*256=13,568 bytes
Superset protomatching signature for cmd.exe • *\n\n”GET”[SP]+(PN)*[c-C][m-M][d-D]. and HEX encoding and U encoding • 29 states • size = 29*256=7,424 bytes
Building a superset protomatcher • Let S be a traditional signature • Trim S into a superset signature (e.g., “cmd.exe” into “cmd.”), obtaining S’ • Expand S’ to conform to the protocol specification, obtaining S’’ • Expand S’’ to account for all possible encodings, obtaining S’’’ • Perform steps 1-4 for every traditional signature in your database, obtaining S1’’’, S2’’’, …, Sn’’’ • Build the protomatcher: an FSM that identifies S1’’’, S2’’’, …, Sn’’’
Superset Protomatching GET <…>/%43MD.exe HTTP/1.1\n GET <…>/%43MD.exe HTTP/1.1\n Sig=superset protomatching signature Protocol analysis Method = GET URL = <…>/%43MD.exe Version = HTTP/1.1 Superset Protomatcher: match a superset protomatching signature Yes Normalization Sig=CMD.EXE URL=CMD.EXE Pattern matching No No Yes Yes Benign Malicious Benign Malicious
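The two-stage flow in this diagram can be sketched as follows. Both stage functions are simplified stand-ins with hypothetical names: the first uses the trimmed pattern "cmd." in place of a real superset protomatcher, and the second plays the role of the traditional analyze-normalize-match path. The counters show how often the expensive second stage runs.

```python
from urllib.parse import unquote

def superset_stage(raw: str) -> bool:
    """Stand-in for the superset protomatcher: trimmed pattern "cmd."."""
    line = raw.split("\n", 1)[0]
    return "cmd." in unquote(line).lower()

def full_stage(raw: str) -> bool:
    """Stand-in for the traditional matcher: full pattern, in a GET URL."""
    parts = raw.split("\n", 1)[0].split(" ")
    return (len(parts) == 3 and parts[0] == "GET"
            and "cmd.exe" in unquote(parts[1]).lower())

def classify(raw: str, stats: dict) -> bool:
    stats["total"] += 1
    if not superset_stage(raw):
        return False               # fast path: no signature can match
    stats["second_stage"] += 1     # suspicious: run the full check
    return full_stage(raw)

stats = {"total": 0, "second_stage": 0}
traffic = [
    "GET /index.html HTTP/1.1\n",         # benign, rejected by the first stage
    "GET /docs/cmd.txt HTTP/1.1\n",       # benign, but matches the superset signature
    "GET /scripts/%43MD.exe HTTP/1.1\n",  # attack
]
print([classify(r, stats) for r in traffic])  # [False, False, True]
print(stats)  # {'total': 3, 'second_stage': 2}
```

The scheme pays off when, as the talk assumes, most benign traffic does not resemble any attack and therefore never reaches the second stage.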
Implementation • Implemented a compiler that converts a traditional signature into a protomatching signature • The compiler also builds the protomatcher • Incorporated the protomatcher into Snort • Used traditional Snort as the second phase of a superset protomatcher
Two ways to implement a protomatcher • Using a deterministic FSM. That is what we do in the examples used. • Using a hierarchical FSM. It has two parts: a matcher and a normalizer. • The matcher is responsible for protocol analysis and pattern matching. • The normalizer is responsible for processing multiple encodings. • Unlike ANM, which first normalizes the whole HTTP request, it uses the normalizer only when necessary. • This can help reduce the memory needed.
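One way to picture the hierarchical design is a matcher loop that hands control to a small normalizer only when it sees a '%'. The sketch below is an assumption about how such a split could look, not the talk's implementation. It omits protocol analysis (the GET and URL context) for brevity, and all names are hypothetical.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"

def decode_escape(data: bytes, i: int):
    """Normalizer: decode %XX or %uXXXX starting at data[i].
    Returns (code point, next index); a malformed escape yields a literal '%'."""
    if data[i + 1:i + 2] in (b"u", b"U"):
        digits, end, need = data[i + 2:i + 6], i + 6, 4
    else:
        digits, end, need = data[i + 1:i + 3], i + 3, 2
    if len(digits) == need and all(d in HEX_DIGITS for d in digits):
        return int(digits, 16), end
    return data[i], i + 1

def prefix_function(pat: list) -> list:
    """KMP failure function, so the matcher never re-reads input."""
    pi, k = [0] * len(pat), 0
    for i in range(1, len(pat)):
        while k and pat[i] != pat[k]:
            k = pi[k - 1]
        if pat[i] == pat[k]:
            k += 1
        pi[i] = k
    return pi

def hierarchical_match(data: bytes, pattern: str = "cmd.exe") -> bool:
    """Matcher: single pass over the input, calling the normalizer on '%' only."""
    pat = [ord(c) for c in pattern.lower()]
    pi = prefix_function(pat)
    state, i = 0, 0
    while i < len(data):
        if data[i] == 0x25:              # '%': hand off to the normalizer
            code, i = decode_escape(data, i)
        else:                            # common case: no decoding work
            code, i = data[i], i + 1
        if 65 <= code <= 90:             # fold ASCII upper case
            code += 32
        while state and code != pat[state]:
            state = pi[state - 1]
        if code == pat[state]:
            state += 1
        if state == len(pat):
            return True
    return False

print(hierarchical_match(b"GET /scripts/%43MD.exe HTTP/1.1\n"))
print(hierarchical_match(b"GET /index.html HTTP/1.1\n"))
```

Because the encoding logic lives in one shared normalizer, the matcher needs states only for the normalized pattern, which is the memory saving the slide refers to.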
Performance improvement ApPPT: Average per Packet Processing Time (cycles)
Sensitivity to Cache Poisoning Attack • We assumed that the attack would have a larger effect on a protomatcher-based Snort than on vanilla Snort. • But the result contradicts the assumption. There might be two reasons for this result: • First, the attack was ineffective in increasing the number of cache misses. It means that a more sophisticated cache poisoning attack is needed. • Second, the attack was effective, but cache performance is only a minor component of the ApPPT.
Conclusion • Optimizing for the common case is a known method • In this talk we presented a technique that uses this method to improve matching efficiency • Our technique is based on formal methods • These methods enable automation, and therefore efficiency, and facilitate accuracy
Discussion on shortcomings • Failure due to cache-poisoning attacks • Converting a protomatching signature into a superset signature currently has to be done manually. Are there better methods?