ECE 526 – Network Processing Systems Design

Network Security 11/10-12/2008

What is network security
• Confidentiality: only sender, intended receiver should “understand” message contents
• Authentication: sender, receiver want to confirm identity of each other
• Message integrity: sender, receiver want to ensure message not altered (in transit, or afterwards) without detection
• Access and availability: services must be accessible and available to valid users only

Attack Types
• eavesdrop: intercept messages
• actively insert messages into connection
• impersonation: can fake (spoof) source address in packet (or any field in packet)
• hijacking: “take over” ongoing connection by removing sender or receiver, inserting himself in place
• denial of service: prevent service from being used by others (e.g., by overloading resources)

Security in Existing Internet
“At least we understand cryptography now…”

Cryptography for Confidentiality
• symmetric key crypto: sender, receiver keys identical
-- Pros and cons:
• public-key crypto: encryption key public, decryption key secret (private)
-- Pros and cons:
[Figure: plaintext -> encryption algorithm (Alice’s encryption key K_A) -> ciphertext -> decryption algorithm (Bob’s decryption key K_B) -> plaintext]

Cryptography for Message Integrity
• Integrity using the hash function H
• The signature is made with the signer's private key; encrypt H(m) instead of the whole message.
• Bob signs: large message m -> H: hash function -> H(m) -> digital signature (encrypt) with Bob’s private key K_B^- -> encrypted msg digest K_B^-(H(m))
• Alice verifies signature and integrity of the digitally signed message:
-- large message m -> H: hash function -> H(m)
-- encrypted msg digest K_B^-(H(m)) -> digital signature (decrypt) with Bob’s public key K_B^+ -> H(m)
-- equal?
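The sign-and-verify flow on this slide can be sketched with textbook RSA. This is a toy illustration under stated assumptions, not a production scheme: the two Mersenne primes, the public exponent 65537, and the names `digest`, `sign`, and `verify` are all chosen here for the example, and real systems use padded signatures from a vetted library.

```python
import hashlib

# Toy key pair (assumption for illustration): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # Bob's public exponent, K_B^+
D = pow(E, -1, (P - 1) * (Q - 1))    # Bob's private exponent, K_B^- (Python 3.8+)


def digest(message: bytes) -> int:
    """H(m): SHA-256 digest reduced to an integer below N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Bob encrypts H(m), not m itself, with his private key: K_B^-(H(m))."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Alice applies K_B^+ to the signature and compares with her own H(m)."""
    return pow(signature, E, N) == digest(message)
```

Signing the short digest rather than the large message keeps the expensive private-key operation independent of the message length.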

Cryptography for Authentication
• Nonce: number (R) used only once-in-a-lifetime
• Protocol:
-- Alice: “I am Alice”
-- Bob: R
-- Alice: K_A^-(R)
-- Bob: “send me your public key”
-- Alice: K_A^+
• Bob computes K_A^+(K_A^-(R)) = R and knows only Alice could have the private key that encrypted R such that K_A^+(K_A^-(R)) = R
• Potential problem: man-in-the-middle attack
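A minimal sketch of the nonce exchange, again with a toy textbook-RSA key pair. The primes, the exponent, and the function names `new_nonce`, `alice_respond`, and `bob_accepts` are assumptions made for this example. The sketch does not address the man-in-the-middle problem, because Bob has no way here to confirm that the public key really belongs to Alice.

```python
import secrets

# Toy key pair for Alice (assumption for illustration only).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # Alice's public exponent, K_A^+
D = pow(E, -1, (P - 1) * (Q - 1))    # Alice's private exponent, K_A^- (Python 3.8+)


def new_nonce() -> int:
    """Bob picks a fresh random R that is used only once."""
    return secrets.randbelow(N)


def alice_respond(nonce: int) -> int:
    """Alice proves possession of her private key: K_A^-(R)."""
    return pow(nonce, D, N)


def bob_accepts(nonce: int, response: int) -> bool:
    """Bob checks that K_A^+(K_A^-(R)) equals the R he sent."""
    return pow(response, E, N) == nonce
```

Because R is fresh for every run, a response recorded from an earlier exchange does not verify against a new nonce, which is what defeats replay.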

What NP systems can Do to improve Access and availability

Firewalls
• firewall: isolates an organization’s internal net from the larger Internet, allowing some packets to pass, blocking others.
[Figure: administered network, firewall, public Internet]

Firewalls: Why
• prevent denial of service attacks:
-- SYN flooding: attacker establishes many bogus TCP connections, no resources left for “real” connections
• prevent illegal modification/access of internal data.
-- e.g., attacker replaces CIA’s homepage with something else
• allow only authorized access to inside network (set of authenticated users/hosts)
• three types of firewalls:
-- stateless packet filters
-- stateful packet filters
-- application gateways
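As an illustration of the first firewall type, a stateless packet filter can be sketched as a first-match rule table over header fields. The `Rule` fields, the `decide` function, and the example policy (built on documentation-only address prefixes) are hypothetical and not taken from any real firewall product.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                # "allow" or "deny"
    src_net: str               # source prefix in CIDR notation
    protocol: Optional[str]    # "tcp", "udp", or None for any
    dst_port: Optional[int]    # destination port, or None for any


# Hypothetical policy for illustration.
RULES = [
    Rule("deny", "203.0.113.0/24", None, None),    # block a known-bad prefix
    Rule("allow", "0.0.0.0/0", "tcp", 80),         # public web server
    Rule("allow", "198.51.100.0/24", "tcp", 22),   # SSH from an authorized network only
]


def decide(src_ip: str, protocol: str, dst_port: int) -> str:
    """Return the action of the first matching rule; the default is deny."""
    for rule in RULES:
        if ip_address(src_ip) not in ip_network(rule.src_net):
            continue
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"
```

Each packet is judged on its own headers, with no memory of earlier packets. That is what separates this type from a stateful filter, which would also track connection state.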

Credential-based Networks to improve Access and availability

Setup Credentials

Credentials Data Structure
• m: # of bits in the array
• n: # of hash functions
• r: # of hops
Two steps of Bloom filter
• Programming
• Query
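The two Bloom filter steps can be sketched as follows, using the slide's m (bits in the array) and n (hash functions). Deriving the n hash functions from salted SHA-256 and using per-hop byte strings as credentials are assumptions made for this example. The slides do not say how the credential hashes are actually computed.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, n: int) -> None:
        self.m = m                  # number of bits in the array
        self.n = n                  # number of hash functions
        self.bits = [False] * m

    def _positions(self, item: bytes):
        # Assumption: the i-th hash function is SHA-256 salted with the byte i.
        for i in range(self.n):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def program(self, item: bytes) -> None:
        """Programming step: set the n bits selected by the hash functions."""
        for pos in self._positions(item):
            self.bits[pos] = True

    def query(self, item: bytes) -> bool:
        """Query step: the item may be present only if all n bits are set."""
        return all(self.bits[pos] for pos in self._positions(item))
```

A programmed item always passes the query, since its bits are never cleared. An item that was never programmed can still pass if other items happened to set all of its bits.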

False Positive Probability
• a false negative is impossible -> a legal packet will always be forwarded
• a false positive is possible -> how big is the chance?
• Refer to http://en.wikipedia.org/wiki/Bloom_filter
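With the notation of the credentials data structure (m bits, n hash functions), and assuming one credential is programmed per hop so that r items are stored in total, the standard Bloom filter analysis gives the estimate below. Mapping r to the number of programmed items is an assumption made here, and the formula relies on the usual simplification that hash outputs are independent and uniform.

```latex
p_{\mathrm{fp}} = \left(1 - \left(1 - \frac{1}{m}\right)^{n r}\right)^{n}
\approx \left(1 - e^{-n r / m}\right)^{n}
```

For a fixed r, a larger bit array m lowers the false positive probability, at the cost of more bits carried per packet.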

Intrusion detection systems
• multiple IDSs: different types of checking at different locations
[Figure: Internet, firewall, application gateway, internal network with IDS sensors; demilitarized zone with Web server, FTP server, DNS server]

Intrusion detection systems
• packet filtering:
-- operates on TCP/IP headers only
-- no correlation check among sessions
• IDS: intrusion detection system
-- deep packet inspection: look at packet contents (e.g., check character strings in packet against database of known virus, attack strings)
-- examine correlation among multiple packets: port scanning, network mapping, DoS attack

NIDS Techniques
• Signature-based
• Anomaly-based
• Stateful detection
• Application-level detection

Signature-based NIDS
• Similar to the traditional anti-virus applications
• Example: Martin Overton, “Anti-Malware Tools: Intrusion Detection Systems”, European Institute for Computer Anti-Virus Research (EICAR), 2005
• Signature found in a W32.Netsky.p binary sample
• Rules for Snort:

Signature matching
• Used in intrusion prevention/detection, application classification, load balancing
• Input: byte string from the payload of packet(s)
-- Hence the name “deep packet inspection”
• Output: the positions at which various signatures match.
• Challenges
-- thousands of possible signatures
-- high performance requirement
-- easy to update with new patterns

DFA construction
• Example: P = {he, she, his, hers}
[Figure: DFA for P with states 0-9; legend: Initial State, State, Transition Function, Accepting State]

DFA Searching
• Matching String
• Input stream: h x h e r s
• Scanning input stream only once
• Complexity: linear time
[Figure: the DFA for P = {he, she, his, hers}, states 0-9, traversed by the input stream]
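A multi-pattern matcher for P = {he, she, his, hers} can be sketched in the Aho-Corasick style. This is one well-known way to build such an automaton and is not necessarily the construction used in the slides. It stores goto and failure links instead of a fully expanded 256-entry row per state, and the names `build_automaton` and `search` are chosen here.

```python
from collections import deque


def build_automaton(patterns):
    goto = [{}]        # goto[state][char] -> next state
    output = [set()]   # patterns recognized on entering each state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                output.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].add(pattern)

    # Breadth-first pass: the failure link points to the longest proper
    # suffix of the current prefix that is also a prefix of some pattern.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] |= output[fail[nxt]]
    return goto, fail, output


def search(text, automaton):
    """Scan the input once; return sorted (start position, pattern) pairs."""
    goto, fail, output = automaton
    state = 0
    matches = []
    for pos, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            matches.append((pos - len(pattern) + 1, pattern))
    return sorted(matches)
```

For the four example patterns the trie has ten states, numbered 0 through 9. Each input character is consumed exactly once, so the scan is linear in the length of the input.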

Network Attack Patterns

DFA mapped to Traditional Memory
• 256 entries for each state
• Snort Dec. 2005 has 2733 patterns
• Needs 27000 states
• Memory size – 13 MB
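The 13 MB figure is consistent with a flat transition table if each entry takes 2 bytes. The entry width is an assumption here, since the slide does not state it; 2 bytes is the smallest whole-byte width that can index 27000 states.

```python
def dfa_table_bytes(states: int, entries_per_state: int = 256,
                    bytes_per_entry: int = 2) -> int:
    """Size of a flat next-state table: one entry per (state, input byte)."""
    return states * entries_per_state * bytes_per_entry


size = dfa_table_bytes(27000)
print(f"{size} bytes, about {size / 2**20:.1f} MiB")
```

Under that assumption the table takes 13,824,000 bytes, which matches the quoted 13 MB.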

SAM-FSM • Traditional – 13 MB; Ours – 16KB

Overall System

Anomaly-based NIDS
• Signature-based NIDS can’t detect zero-day attacks
• Anomaly: operations deviate from normal behavior.
• What could cause an anomaly?
-- Malfunction of network devices
-- Network overload
-- Malicious attacks, like DoS/DDoS attacks
-- Other network intrusions
• Two main kinds of network anomalies:
1. Related to network failures and performance problems.
2. Security-related problems: (1) Resource depletion (2) Bandwidth depletion

Key Technical Challenges
“Mining needle in a haystack. So much hay and so little time”
• Large data size
-- Millions of network connections are common for commercial network sites
• High dimensionality
-- Hundreds of dimensions are possible
• Temporal nature of the data
-- Data points close in time are highly correlated
• Skewed class distribution
-- Interesting events are very rare: looking for the “needle in a haystack”
• High Performance Computing (HPC) is critical for on-line analysis and scalability to very large data sets

Anomaly detection meets troubles
• There are many schemes based on checking abrupt traffic changes.
-- E.g. apply signal processing techniques to detect abrupt changes in traffic
• However, this kind of anomaly is not always illegitimate.
-- An abrupt change of traffic does not necessarily mean an attack has happened
• We call this case a legitimately-abrupt-change (LAC)

Legitimately abrupt changes
• Example 1:
-- Famous information gateway websites, e.g. Yahoo.
-- When sensational news is announced, an abrupt change appears.
• Example 2:
-- Special information announcement centers, e.g. the website of a national meteorological agency
-- When a natural disaster is said to be coming, an abrupt change occurs.
-- Typhoon, Earthquake, Tsunami
-- Important outdoor holidays

Anomaly Detection
• Already used by industry
-- Protocol Anomaly
-- Statistical/Threshold based
• In Research
-- Data mining

Protocol Anomaly Detection
• Based on the well-established RFCs
• Focus on the packet header
• Example:
-- All SMTP commands have a fixed maximum size. If the size exceeds the limit, it could be a buffer overflow or a malicious code insertion attack
-- SYN flood attack: attacker sends SYN with fake source address
-- Teardrop attack: fragmented IP packets with overlapping offsets
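Two of the examples above reduce to simple checks against protocol rules. The sketch below is illustrative only. The 512-octet command line limit is the base limit in RFC 5321 (extensions may raise it), and the function names and the `(offset, length)` fragment representation are assumptions made here.

```python
SMTP_MAX_COMMAND_LINE = 512  # octets including CRLF, base limit in RFC 5321


def smtp_command_too_long(line: bytes) -> bool:
    """Flag an SMTP command line that exceeds the protocol's maximum size."""
    return len(line) > SMTP_MAX_COMMAND_LINE


def fragments_overlap(fragments) -> bool:
    """Teardrop-style check on (offset, length) pairs of one IP datagram.

    Returns True when any fragment starts before an earlier one has ended.
    """
    end_of_previous = 0
    for offset, length in sorted(fragments):
        if offset < end_of_previous:
            return True
        end_of_previous = max(end_of_previous, offset + length)
    return False
```

For instance, fragments `(0, 36)` and `(24, 4)` overlap, because the second one starts inside the first.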

Threshold based
• Use training data to generate a statistical model, then select proper thresholds for the network environment (traffic volume, TCP packet count, IP fragment count, etc.)
-- usually used as a complementary tool
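One simple way to turn training data into a threshold is mean plus k standard deviations. This is a minimal sketch with a hypothetical per-interval packet count as the feature and k = 3 as an assumed default; commercial products may use different statistical models.

```python
import statistics


def fit_threshold(training_counts, k: float = 3.0) -> float:
    """Threshold = mean + k * standard deviation of the training data."""
    mean = statistics.fmean(training_counts)
    std = statistics.pstdev(training_counts)
    return mean + k * std


def is_anomalous(count: float, threshold: float) -> bool:
    """Raise an alert when the observed count exceeds the learned threshold."""
    return count > threshold
```

The value of k trades false positives against missed attacks, so it has to be tuned for each network environment.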

Stateful IDS
• No practical solutions
• Very simple implementations
• Example: Snort uses pattern matching across consecutive packets.
-- Traditional signature rules: “pattern1”, “pattern1 || pattern2”
-- The rule can now be defined as: “pattern1.*pattern2”
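A rule of the form “pattern1.*pattern2” only fires across packets if the matcher keeps state between them. The sketch below does this by buffering the recent byte stream of a flow. It is an illustration under assumptions (the `StreamMatcher` class, the buffer size limit, in-order payloads) and is not how Snort implements stream matching.

```python
import re


def make_rule(pattern1: bytes, pattern2: bytes):
    """Compile "pattern1.*pattern2" so that '.' also matches newlines."""
    return re.compile(re.escape(pattern1) + b".*" + re.escape(pattern2), re.DOTALL)


class StreamMatcher:
    """Keeps per-flow state: the most recent bytes seen on the flow."""

    def __init__(self, rule, max_buffer: int = 65536) -> None:
        self.rule = rule
        self.max_buffer = max_buffer
        self.buffer = b""

    def feed(self, payload: bytes) -> bool:
        """Append one packet payload; return True once the rule matches."""
        self.buffer = (self.buffer + payload)[-self.max_buffer:]
        return self.rule.search(self.buffer) is not None
```

A stateless matcher that checks each packet on its own misses the case where pattern1 and pattern2 arrive in different packets.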

Application-level IDS
• Focus on specific services or programs (Web Server, Database, etc.)
• Example
-- Monitoring all invocations of Microsoft RPCs
-- Analyzing HTTP requests for malicious query strings
• Products:
-- mod_security: an optional IDS component for Apache Web Server

Current NIDS Challenges
• High false positives
-- An FP rate of 0.1% means one in every 1000 normal packets is misclassified as an alert, which is about one false alert per minute on a 100M network
• Zero-day attacks (unknown attacks)
-- Most current products rely on signature-based detection, which makes new attacks difficult to detect.
• Poor automatic prevention ability
-- Human interaction is required when an attack is detected

IDS Products Today
• Snort
• McAfee Intrushield
• ISS RealSecure
• Cisco IPS
• Symantec IDS

Snort
• Open Source, since 1998
• Used by many major network security products
• Signature-based (more than 3000 signatures)
• Simple IP header protocol anomaly detection
• Simple stateful pattern matching

McAfee
• Profile-based anomaly detection
-- Manually create profile
-- Create profile by self-learning through a training period
• Using profile plus threshold for defending against DoS and DDoS
• Inspect encrypted traffic by collecting the server-side private keys

ISS RealSecure
• About 2000 signatures
• Application-based approach
-- identifying any possible exploit of the published vulnerabilities of MS RPC, IIS, Apache, Lotus, etc.
• Additional support for P2P, Instant Messengers
• Virtual Prevention System
-- a virtual environment to examine the execution of a file in order to find any possible malicious behaviors
• Support for IPv6
-- Detect possible backdoors which enable the IPv6 of a system (usually off)

Cisco IPS products
• Protocol decoding
• Threshold based property checking
• Signature matching
• Protocol Anomaly Detection
• Checking file behaviors by intercepting all calls to the system resources

Symantec
• Multi-step checking (protocol, vulnerability, signature, DoS, traffic, evasion check)
• Unique feature: evasion check
-- e.g. the request “/index.html” can be replaced with “/%69nd%65x.html” to evade signature matching
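The evasion in this example relies on percent-encoding, so a matcher can defeat it by normalizing the request path before comparing it with the signature. This is a minimal sketch with assumed function names. It handles a single level of percent-encoding only and says nothing about how Symantec's product actually performs the check.

```python
from urllib.parse import unquote


def normalize_path(raw_path: str) -> str:
    """Decode percent-encoded characters, e.g. %69 -> 'i', %65 -> 'e'."""
    return unquote(raw_path)


def matches_signature(raw_path: str, signature: str) -> bool:
    """Match the signature against the normalized form of the request path."""
    return signature in normalize_path(raw_path)
```

Without normalization, a plain substring search for “/index.html” does not find the encoded request.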

Summary of Current Products

Academia on Anomaly Detection
• Columbia University
-- Data mining based (since 1997)
• University of California at Santa Barbara
-- Service Specific (HTTP)
-- Stateful IDS
• Florida Institute of Technology
-- Protocol Anomaly (Statistical based)
• University of Minnesota
-- MIND (Minnesota Intrusion Detection System)

Columbia Univ. IDS
• 1997: applied the RIPPER rule learning algorithm to UNIX system call monitoring for malicious event detection
• 1998: applied the algorithm to off-line network traffic data (clean training data)
• 2000: applied EM and clustering algorithms to deal with noisy datasets
• 2001: developed a complete experimental NIDS based on those algorithms
• 2004: new approach towards payload anomaly detection

Implementing Procedure
• Wenke Lee, Sal Stolfo, and Kui Mok, “A Data Mining Framework for Building Intrusion Detection Models”, Proceedings of the 1999 IEEE Symposium on Security and Privacy, Oakland, CA, May 1999
• Pre-Processing
-- Process raw packet data
• Feature construction
-- Create statistical features
• Apply RIPPER algorithm
-- Rule learning

Pre Processing • SYN flood attack

Feature Construction
(service=http, flag=S0, dst_host=victim), (service=http, flag=S0, dst_host=victim) -> (service=http, flag=S0, dst_host=victim) [0.93, 0.03, 2]
• 93% of the time, after two http connections with the S0 flag are made to host victim, a third similar connection is made within 2 seconds of the first of these two; this pattern occurs in 3% of the data

RIPPER Rules
• smurf :- service=ecr_i, host_count >= 5, host_srv_count >= 5
-- (if the service is ICMP echo request, and connections with the same destination host are at least 5, and connections with the same service are at least 5, then it is a smurf/DoS attack)
• satan :- host_REJ_% >= 83%, host_diff_srv_% >= 87%
-- (for connections with the same destination host, if the rejection rate is at least 83%, and the percentage of different services is at least 87%, then it is a satan/PROBING attack)
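The two learned rules can be read as predicates over per-connection features. The sketch below encodes them directly. The dictionary keys `host_rej_pct` and `host_diff_srv_pct` are renamed stand-ins for `host_REJ_%` and `host_diff_srv_%`, and the `classify` function and its "normal" default are assumptions made for illustration.

```python
def classify(conn: dict) -> str:
    """Apply the two example rules in order; anything else is 'normal'."""
    # smurf :- service=ecr_i, host_count >= 5, host_srv_count >= 5
    if (conn.get("service") == "ecr_i"
            and conn.get("host_count", 0) >= 5
            and conn.get("host_srv_count", 0) >= 5):
        return "smurf"
    # satan :- host_REJ_% >= 83%, host_diff_srv_% >= 87%
    if (conn.get("host_rej_pct", 0) >= 83
            and conn.get("host_diff_srv_pct", 0) >= 87):
        return "satan"
    return "normal"
```

A full RIPPER rule set is an ordered list of such rules per class, ending in a default class. Only the two rules shown on the slide are encoded here.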

Experiment Results • Applied on DARPA’98 Intrusion Detection Evaluation Data Set

Payload based Approach
• K. Wang, S. J. Stolfo, “Anomalous Payload-based Network Intrusion Detection”, RAID 2004
• Construct the statistical model for all bytes in the payload
• Use Mahalanobis distance to measure the difference
Problems:
• Clean training data is required
• False positives (unacceptable)
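The idea can be sketched with a byte-frequency model and a simplified, per-byte form of the Mahalanobis distance that ignores covariance between byte values. This is a hedged illustration and not the paper's implementation: the smoothing factor `alpha`, the function names, and the use of one global model (rather than separate models per port or payload length) are assumptions made here.

```python
from collections import Counter


def byte_frequencies(payload: bytes):
    """Relative frequency of each of the 256 byte values in the payload."""
    counts = Counter(payload)
    total = max(len(payload), 1)
    return [counts.get(b, 0) / total for b in range(256)]


def train(payloads):
    """Per-byte mean and standard deviation over clean training payloads."""
    profiles = [byte_frequencies(p) for p in payloads]
    n = len(profiles)
    columns = list(zip(*profiles))
    mean = [sum(col) / n for col in columns]
    std = [(sum((v - m) ** 2 for v in col) / n) ** 0.5
           for col, m in zip(columns, mean)]
    return mean, std


def distance(payload: bytes, model, alpha: float = 0.001) -> float:
    """Simplified Mahalanobis distance: sum of |x_i - mean_i| / (std_i + alpha)."""
    mean, std = model
    freqs = byte_frequencies(payload)
    return sum(abs(x - m) / (s + alpha) for x, m, s in zip(freqs, mean, std))
```

A payload whose byte distribution matches the training data scores near zero, while unusual content, such as a long run of one byte value, scores high. The alert threshold then controls the false positive rate.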
