Survey of Intrusion Detection Systems
Motivation • The worldwide impact of malicious code attacks is estimated to be over $10 Billion annually. • The CERT center at CMU reported 73,359 security incidents between 1/1/02 and 9/30/02, equal to all of the security incidents reported in 2000-2001 combined. • Novice attackers can easily acquire and use automated denial-of-service attack software. • Human security analysts can't keep up with it all.
Intrusion Detection • Attempts to detect unauthorized or malicious activities in a network or on a host system • Signature-based - looks for patterns that are known to be intrusive in packets or audit logs • Anomaly-based - looks for 'abnormal' activity, usually requires a template of 'normal' activity • Determining 'who' is much harder than just detecting that an intrusion occurred.
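The contrast between the two approaches can be sketched in a few lines of Python. This is a toy illustration only: the byte patterns in `SIGNATURES`, the z-score rule, and the threshold of 3.0 are assumptions chosen for the example, not taken from any system named in these slides.

```python
from statistics import mean, stdev

# Hypothetical byte patterns treated as "known intrusive" for this example.
SIGNATURES = [b"/etc/passwd", b"\x90\x90\x90\x90"]

def signature_alert(payload: bytes) -> bool:
    """Signature-based: alert when any known pattern occurs in the payload."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_alert(value: float, normal_samples: list, threshold: float = 3.0) -> bool:
    """Anomaly-based: alert when a measurement lies far from the 'normal' template.

    The template here is simply the mean and standard deviation of past
    attack-free measurements (an assumed, deliberately simple model).
    """
    mu = mean(normal_samples)
    sigma = stdev(normal_samples)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Usage: packets-per-second samples from a period assumed to be attack-free.
normal_rate = [100, 110, 90, 105, 95]
print(signature_alert(b"GET /etc/passwd HTTP/1.0"))
print(anomaly_alert(500, normal_rate))
```

The anomaly check needs the `normal_rate` template while the signature check does not, which is why the availability of attack-free training data matters for anomaly-based systems.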
Early Work on Security • Saltzer and Schroeder (1974) - established security design principles and mechanisms • Orange Book (1985) - DoD specifications • Formal Models • Bell-LaPadula (1976) - supported formal proofs of conformance to security policies • Denning (1987) - described the requirements for designing an intrusion detection system
Early Systems • IDES - statistical anomaly detection • Haystack - also added signature detection • Wisdom & Sense - automatically created a profile of 'normal' behavior from past user and host activities • ISOA - uses both real-time monitoring and post-session analysis to detect suspicious behavior, developed profiles at both levels
Recent Research in ID • NIDES - distributed collection of host data, centralized analysis (extension of IDES) • NSM - network traffic monitoring for anomalous packets • DIDS - combines host-based (Haystack) and network monitoring (NSM) • CSM - peer-to-peer distributed analysis
Recent Research (continued) • Bro - analyzes packet contents • GrIDS - builds graphs of network activity and looks for anomalies • STAT and NetSTAT - model an attack as a state machine; if the observed events drive the machine to an accepting state, the attack has occurred • EMERALD - framework for building an ID system with distributed collection and analysis, modular design (extended NIDES)
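The state-machine idea can be illustrated with a minimal sketch. The class `AttackScenario`, the state names, and the event names are hypothetical and only show the principle; they are not the scenario language or API of STAT or NetSTAT.

```python
class AttackScenario:
    """Toy attack model: a finite state machine driven by observed events."""

    def __init__(self, transitions, start, final):
        self.transitions = transitions  # {(state, event): next_state}
        self.state = start
        self.final = final

    def feed(self, event):
        """Advance on a matching event; unrelated events leave the state unchanged.

        Returns True once the accepting (final) state has been reached.
        """
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state == self.final

# Usage: a hypothetical two-step scenario (probe, then exploit).
scenario = AttackScenario(
    transitions={("idle", "scan"): "probed", ("probed", "exploit"): "compromised"},
    start="idle",
    final="compromised",
)
results = [scenario.feed(e) for e in ["login", "scan", "login", "exploit"]]
print(results)
```

Ignoring unrelated events, as this sketch does, lets the steps of an attack be interleaved with normal activity and still be recognized.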
Additional IDS Projects • Data-mining for ID - numerous projects mining host audit data, captured packets • Autonomous Agents - independent agents monitor specific activities/resources and report to hierarchy of analyzers • Open source projects - (e.g. SHADOW and Snort) - performance comparable to commercial and research systems
Major Problems • High False-Alarm Rates - real-world tests show overwhelming numbers of false alarms, little success in filtering them out • Availability of Training Data - most anomaly-based ID systems need attack-free datasets. Currently, no clear way to create or certify realistic attack-free data
New Methods of Intrusion Detection William Allen
Proposed Work • a new intrusion detection technique that has been shown to detect certain types of denial-of-service attacks without a traffic template • a new intrusion detection technique based on an attack's effect on the correlation fractal dimension of network traffic. This method, while requiring a template, appears to be more sensitive to attacks than the first technique. • an attack generation tool that will aid researchers in testing intrusion detection systems
Intrusion Detection Research • Early systems used statistical analysis and pattern recognition techniques to detect misuse or intrusions on individual systems. • Later, network traffic analysis was added. • Later still, systems added distributed data collection and analysis and more sophisticated detection algorithms. • The most recent ID systems provide a modular framework that can employ a range of sensor types and detection techniques and analyze both host data and network traffic.
The Most Important Problems Faced by ID System Designers • High False-Alarm Rates - real-world tests show overwhelming numbers of false alarms, little success in filtering them out • Availability of Training Data - most anomaly-based ID systems need attack-free datasets. Currently, no clear way to create or certify realistic attack-free data
The Lincoln Labs Evaluation of DARPA-Funded ID Systems • The Lincoln Labs evaluation ran 1998-99 • modeled real traffic and collected attack code • generated attack-free traffic from the model • inserted attacks and logged their location • The 1999 evaluation produced five weeks of data: • Two weeks of attack-free traffic and one week of attack training data • Two weeks (4 and 5) were the evaluation datasets, which contained many different types of attacks • Attack databases listed attack occurrences
The Lincoln Labs Data and Self-similarity • Leland et al. (1994) demonstrated that LAN traffic tends to be self-similar • Weeks 1 and 3 of the Lincoln Labs dataset were consistently self-similar between 8am and 6pm • Only 50% of the evening data were self-similar • See "On the Self-similarity of Synthetic Traffic for the Evaluation of ID Systems" (SAINT ’03) • Question: can the loss of self-similarity be a reliable indicator of an attack?
[Figure: packet arrivals over time, counted per interval as X1, X2, X3, X4, …, XN] • Prior to 1994, traffic was modeled as Poisson with independent increments, which implies: • Self-similarity implies: • Historically, we use the Hurst parameter H to characterize the degree of self-similarity
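The formulas for this slide are not present in the text. As a hedged reference, the standard statements from the self-similar traffic literature are sketched below. The notation is an assumption and may differ from the slide: r(k) is the autocorrelation of the interval counts X_1, …, X_N at lag k, and β is the decay exponent.

```latex
% Poisson arrivals with independent increments: counts in disjoint
% intervals are uncorrelated at every nonzero lag.
r(k) = 0, \qquad k \ge 1

% Second-order self-similarity (long-range dependence): correlations
% decay as a power law so slowly that they are not summable.
r(k) \sim c\,k^{-\beta} \quad (k \to \infty), \qquad 0 < \beta < 1,
\qquad \sum_{k} r(k) = \infty

% The Hurst parameter summarizes the decay rate.
H = 1 - \frac{\beta}{2}, \qquad \tfrac{1}{2} < H < 1
```

Under these definitions H = 1/2 corresponds to no long-range dependence, and values closer to 1 indicate a higher degree of self-similarity.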
Qualitative Characteristics of Self-Similar Data (Beran) • Long periods where observations remain at high level AND long periods at low level. • Looking at long periods there is no apparent “persisting” trend; rather, “cycles of all frequencies occur, superimposed and in random sequence.” • The arrival pattern looks the same over time
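These qualitative traits are usually quantified by estimating the Hurst parameter H. The sketch below uses the variance-time (aggregated variance) method, one common estimator; the slides do not say which estimator was used, so the method, the function names, and the block sizes are all assumptions. It relies on the variance of the block-averaged series falling off as m^(-β) with block size m, so that H = 1 - β/2.

```python
import math
import random
from statistics import pvariance

def aggregate(counts, m):
    """Average the series over non-overlapping blocks of m samples."""
    n_blocks = len(counts) // m
    return [sum(counts[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def hurst_variance_time(counts, block_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate H from the slope of log(variance) against log(block size)."""
    log_m, log_var = [], []
    for m in block_sizes:
        blocks = aggregate(counts, m)
        if len(blocks) < 2:
            continue
        var = pvariance(blocks)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("need at least two usable block sizes")
    # The slope equals -beta, and H = 1 - beta / 2.
    return 1.0 + fit_slope(log_m, log_var) / 2.0

# Usage: independent samples have no long-range dependence,
# so the estimate is expected to land near 0.5.
rng = random.Random(0)
independent = [rng.random() for _ in range(8192)]
h = hurst_variance_time(independent)
print(round(h, 2))
```

For real traffic, `counts` would be packet counts per fixed-length interval, and an estimate well above 0.5 would be read as evidence of self-similarity.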
Types of Denial-of-Service Attack • Traffic Exploit (DoS-TE) - uses a continuous stream of traffic to overwhelm the target; examples: mailbomb, neptune, apache • System Exploit (DoS-SE) - takes advantage of a software vulnerability to cause a system to fail, usually only needs a few packets; examples: land, teardrop, ping o' death
Accountability Criteria • We developed two criteria for DoS-TE attacks • attack duration must be at least 5 min • attack intensity = attack traffic / background traffic, and must be at least 1.0 over the attack’s duration • 6 attacks in the LL data met these criteria and we detected 5 of them (83% success) • There were 8 additional DoS-TE attacks, but they failed to meet the above criteria
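A check of this kind can be sketched as follows. The slide's comparison symbols and the exact intensity formula are missing, so this sketch assumes that both thresholds are minimums and that intensity is the ratio of attack packets to background packets over the attack's duration. The function names and the one-minute bin size are hypothetical.

```python
def attack_intensity(attack_counts, background_counts):
    """Ratio of attack packets to background packets over the same time bins."""
    background_total = sum(background_counts)
    if background_total == 0:
        raise ValueError("background traffic is empty; intensity is undefined")
    return sum(attack_counts) / background_total

def meets_criteria(attack_counts, background_counts, bin_seconds=60,
                   min_duration_s=300, min_intensity=1.0):
    """Assumed criteria: duration of at least 5 minutes and intensity of at least 1.0."""
    duration_s = len(attack_counts) * bin_seconds
    if duration_s < min_duration_s:
        return False
    return attack_intensity(attack_counts, background_counts) >= min_intensity

# Usage: six one-minute bins, attack traffic at twice the background level.
print(meets_criteria([200] * 6, [100] * 6))
```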
Further Analysis (1) Can we detect similar attacks in a different background dataset? • we extracted each attack from its original 'day' • each extracted attack was rescaled to have the same attack/background intensity as before • the attacks were merged into new (self-similar, non-attack) background data and they were detected at the same rate as before
Further Analysis (2) How sensitive is the detector to attack intensity? • we scaled the intensity of the extracted attacks from 0% to 200% and attempted detection • all 5 of the originally detected attacks were detected at 60% or lower intensity and the 6th attack was detected at 150% of its original level
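The intensity sweep can be sketched as below, assuming the traffic is held as packet counts per time bin. `scale_attack`, `merge`, and `sweep` are hypothetical helper names, and `detector` stands in for whatever detection method is being evaluated; the threshold rule in the usage example is a toy, not the detector from this work.

```python
def scale_attack(attack_counts, factor):
    """Scale an extracted attack's per-bin packet counts by `factor`."""
    return [round(c * factor) for c in attack_counts]

def merge(background_counts, attack_counts, start_bin):
    """Add the attack's counts onto a copy of the background, starting at start_bin."""
    if start_bin < 0 or start_bin + len(attack_counts) > len(background_counts):
        raise ValueError("attack does not fit inside the background trace")
    merged = list(background_counts)
    for offset, count in enumerate(attack_counts):
        merged[start_bin + offset] += count
    return merged

def sweep(background, attack, start_bin, detector, percents=range(0, 201, 10)):
    """Run `detector` on the merged trace at each intensity from 0% to 200%."""
    return {
        p: detector(merge(background, scale_attack(attack, p / 100), start_bin))
        for p in percents
    }

# Usage with a toy detector that fires when any bin exceeds 150 packets.
background = [100] * 10
attack = [100] * 3
result = sweep(background, attack, start_bin=2, detector=lambda s: max(s) > 150)
lowest_detected = min(p for p, hit in result.items() if hit)
print(lowest_detected)
```

The lowest percentage at which the detector fires gives a per-attack sensitivity figure comparable to the 60% and 150% levels quoted above.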
Fractals and Fractal Dimension • Mandelbrot developed the concept of fractals to describe non-Euclidean shapes • the correlation fractal dimension is a measure of the local structure of the fractal curve • it is defined (Grassberger 1983) through the correlation integral, the probability that two points $x_i$ and $x_j$ are within distance $r$: $C(r) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i \neq j} \Theta(r - \lVert x_i - x_j \rVert)$, where $\Theta(x) = 0$ if $x < 0$ and $\Theta(x) = 1$ otherwise
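A direct, brute-force way to estimate this quantity from a finite set of points is sketched below. It counts pairs within distance r and takes the correlation dimension as the slope of log C(r) against log r over a chosen range of radii. This is a generic Grassberger-Procaccia style estimate, not the specific algorithm used in the experiments; the function names and the choice of radii are assumptions.

```python
import math
from itertools import combinations

def correlation_integral(points, r):
    """Fraction of ordered pairs (i != j) whose distance is at most r, over N^2."""
    n = len(points)
    close = sum(1 for p, q in combinations(points, 2) if math.dist(p, q) <= r)
    return 2 * close / (n * n)

def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def correlation_dimension(points, radii):
    """Slope of log C(r) against log r over the given radii."""
    log_r, log_c = [], []
    for r in radii:
        c = correlation_integral(points, r)
        if c > 0:
            log_r.append(math.log(r))
            log_c.append(math.log(c))
    if len(log_r) < 2:
        raise ValueError("need at least two radii with a nonzero pair count")
    return fit_slope(log_r, log_c)

# Usage: points spread evenly along a line segment should give a dimension near 1.
line = [(i / 500,) for i in range(500)]
d = correlation_dimension(line, radii=[0.011, 0.021, 0.041, 0.081])
print(round(d, 2))
```

The pair count is quadratic in the number of points, so this form is only practical for short windows of data.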
Intrusion Detection Based on the Correlation Fractal Dimension • We performed an experiment which shows that the correlation fractal dimension is affected by the presence of a network attack. • We compared the fractal dimension of data containing an attack with a traffic template that did not contain any attacks • We used the algorithm given by Ayedemir but intend to also investigate other algorithms for calculating the correlation fractal dimension
Experimental Procedure • We extracted attacks from their original location and merged them with an attack-free, self-similar background • Using a 5-minute sliding window that advanced in 1-minute steps, we calculated the correlation fractal dimension of both the template data and the merged attack+background • We used the F-test to determine whether the difference between the two correlation dimensions was significant
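The windowing step can be sketched as follows, assuming the traffic is binned per second so that a 5-minute window is 300 samples and a 1-minute step is 60 samples. The slides do not say which form of the F-test was applied, so `f_statistic` below is only the generic variance-ratio statistic, and turning it into a significance decision would still need the F distribution for the relevant degrees of freedom. All function names are hypothetical.

```python
from statistics import variance

def sliding_windows(series, width, step):
    """Yield (start_index, window) pairs; only full-width windows are produced."""
    for start in range(0, len(series) - width + 1, step):
        yield start, series[start:start + width]

def windowed_statistic(series, width, step, statistic):
    """Apply `statistic` (e.g. a correlation-dimension estimator) to every window."""
    return [statistic(window) for _, window in sliding_windows(series, width, step)]

def f_statistic(sample_a, sample_b):
    """Variance-ratio F statistic, larger sample variance over the smaller."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if min(var_a, var_b) == 0:
        raise ValueError("F statistic is undefined when a sample has zero variance")
    return max(var_a, var_b) / min(var_a, var_b)

# Usage: ten minutes of per-second data gives six 5-minute windows at 1-minute steps.
template = list(range(600))
starts = [start for start, _ in sliding_windows(template, width=300, step=60)]
print(starts)
```

In the procedure described above, `windowed_statistic` would be run once on the template and once on the merged attack+background trace, and the two resulting sets of per-window dimensions would then be compared.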