Intrusion Detection Using Data Mining. By Anshu Veda (04329022), Prajakata Kalekar (04329008), Anirudha Bodhankar (04329003). Under the guidance of Prof. Sunita Sarawagi.
Problem Definition • An Intrusion Detection System (IDS) is an important part of the security management system for computers and networks; it tries to detect break-ins or break-in attempts. • Approaches to Solution • Signature-Based • Anomaly-Based
Types of Intrusion Detection • Classification I • Real Time • After-the-fact (offline) • Classification II • Network Based • Host Based
Recommended Approach • None of these types alone provides a complete solution • A hybrid approach: host-based IDS (HIDS) on local machines together with a powerful network-based IDS (NIDS) on switches
Attack Simulation • Types of attacks • NIDS • SYN-Flood Attack • HIDS • ssh Daemon attack.
NIDS – Data Preprocessing • Input data • tcpdump trace • Huge • One data record per packet • Features extracted (using Perl scripts) • Content-Based: group packet records and construct new features corresponding to a single connection • Time-Based: add time-window based information to the connection records (Param: time window) • Connection-Based: add connection-window based information to the connection records (Param: connection window)
Preprocessing on tcpdump • From the tcpdump data we extracted the following fields • src_ip, dst_ip • src_port, dst_port • num_packets_src_dest / num_packets_dest_src • num_ack_src_dst / num_ack_dst_src • num_bytes_src_dst / num_bytes_dst_src • num_retransmit_src_dst / num_retransmit_dst_src • num_pushed_src_dst / num_pushed_dst_src • num_syn_src_dst / num_syn_dst_src • num_fin_src_dst / num_fin_dst_src • connection status
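The content-based grouping step can be sketched as follows. This is a minimal illustration, not the authors' Perl scripts: the packet dictionary keys (`nbytes`, `flags`) and the function name `build_connection_records` are hypothetical, the counter names use a uniform `_src_dst` / `_dst_src` suffix for simplicity, and retransmission counts and connection status are left out.

```python
from collections import defaultdict

# Hypothetical mapping from TCP flag to the counter prefix used in the record.
FLAG_COUNTERS = (("SYN", "num_syn_"), ("ACK", "num_ack_"),
                 ("FIN", "num_fin_"), ("PSH", "num_pushed_"))

def build_connection_records(packets):
    """Group per-packet records into one record per connection.

    Each packet is a dict with assumed keys: src_ip, src_port, dst_ip,
    dst_port, nbytes, and flags (a set such as {"SYN", "ACK"}).
    The first packet seen for a connection fixes the src -> dst direction.
    """
    records = {}
    for p in packets:
        fwd = (p["src_ip"], p["src_port"], p["dst_ip"], p["dst_port"])
        rev = (p["dst_ip"], p["dst_port"], p["src_ip"], p["src_port"])
        if fwd in records:
            key, direction = fwd, "src_dst"
        elif rev in records:
            key, direction = rev, "dst_src"
        else:
            # New connection: this packet's sender becomes the source.
            key, direction = fwd, "src_dst"
            records[key] = defaultdict(int)
        rec = records[key]
        rec["num_packets_" + direction] += 1
        rec["num_bytes_" + direction] += p["nbytes"]
        for flag, prefix in FLAG_COUNTERS:
            if flag in p["flags"]:
                rec[prefix + direction] += 1
    return records
```

A connection that shows many SYN packets from the source and few or no ACKs is the kind of record a SYN-flood would be expected to produce.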
Preprocessing on tcpdump cont… • Time-Window Based Features • Count_src/count_dst • Count_serv_src/ count_serv_dest • Connection-Window Based • Count_src1 /count_dst1 • Count_serv_src1/ count_serv_dest1
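The slides do not define these counters, so the sketch below assumes one plausible reading: `count_src` is the number of earlier connections in the window from the same source IP, and `count_dst` the number to the same destination IP. The function name `window_counts` and the record keys are hypothetical, and the service-based counters are omitted.

```python
def window_counts(conns, time_window=None, conn_window=None):
    """Return (count_src, count_dst) for each connection record.

    conns: list of dicts with assumed keys 'time', 'src_ip', 'dst_ip',
    sorted by time. Pass time_window (seconds) for time-window features,
    or conn_window (number of preceding connections) for
    connection-window features.
    """
    out = []
    for i, c in enumerate(conns):
        if time_window is not None:
            # Earlier connections that started within the last time_window seconds.
            prev = [p for p in conns[:i] if c["time"] - p["time"] <= time_window]
        else:
            # The conn_window connections immediately preceding this one.
            prev = conns[max(0, i - conn_window):i]
        count_src = sum(1 for p in prev if p["src_ip"] == c["src_ip"])
        count_dst = sum(1 for p in prev if p["dst_ip"] == c["dst_ip"])
        out.append((count_src, count_dst))
    return out
```

Under this reading, a connection that arrives long after the previous ones gets zero time-window counts but can still get non-zero connection-window counts, which is one reason to keep both feature groups.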
NIDS – Data Mining Technique • Outlier Detection • Clustering-Based Approach (K-Means) • Outlier threshold • Preprocessed dataset • K-NN Based Approach • Distance threshold • Preprocessed dataset • Results • Clustering did not give good results • Limited data • K-NN • Raised alarms
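A K-NN outlier test with a distance threshold can be sketched as below. The slides only name the approach and its threshold, so the scoring rule (distance to the k-th nearest training point), the function name `knn_outliers`, and the default values are assumptions. The training set is assumed to hold feature vectors of normal connections.

```python
import math

def knn_outliers(train, test, k=3, threshold=1.0):
    """Flag each test point whose k-th nearest training point is farther
    away than threshold (Euclidean distance)."""
    flags = []
    for x in test:
        dists = sorted(math.dist(x, t) for t in train)
        # Fall back to the farthest point if there are fewer than k training points.
        kth = dists[min(k, len(dists)) - 1]
        flags.append(kth > threshold)
    return flags
```

In practice the features would need scaling before distances are compared, since byte counts and flag counts live on very different ranges.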
HIDS – Data Preprocessing • Input data • strace system call logs for a particular process (sshd) • One data record per system call • Sliding-window size for grouping • Features extracted (using Perl scripts) • Slide the window over the trace to generate the possible sequences of system calls
HIDS – Data Preprocessing cont… • System call trace: a d f g a e d a e b s d e a • Sequences generated with a sliding window of size 4 • a d f g • d f g a • f g a e • g a e d • a e d a • e d a e • d a e b • a e b s • e b s d • b s d e • s d e a
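The windowing step on this slide can be reproduced in a few lines. The window size of 4 is inferred from the example sequences; the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(trace, size):
    """Return every run of `size` consecutive system calls in the trace."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]

# The example trace from the slide, one letter per system call.
trace = "a d f g a e d a e b s d e a".split()
windows = sliding_windows(trace, 4)
```

For this 14-call trace the function yields 11 windows, starting with `('a', 'd', 'f', 'g')` and ending with `('s', 'd', 'e', 'a')`.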
Data Mining Technique Used • Learning to predict system calls • Predict the i-th system call for each test record <p1, p2, p3> • Done using classification (decision trees) • Anomaly Detection • Use the misclassification score to detect anomalies
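The idea of scoring a trace by how often the next system call is mispredicted can be sketched as follows. To stay self-contained, this sketch swaps the decision tree for a simple majority-vote lookup table keyed on <p1, p2, p3>; it is a stand-in, not the classifier used in the project. The function names and the definition of the score as a miss fraction are assumptions.

```python
from collections import Counter, defaultdict

def train_predictor(normal_windows):
    """Map each prefix <p1, p2, p3> to its most frequent next call
    in windows taken from normal traces."""
    table = defaultdict(Counter)
    for *prefix, target in normal_windows:
        table[tuple(prefix)][target] += 1
    return {prefix: counts.most_common(1)[0][0]
            for prefix, counts in table.items()}

def misclassification_score(model, test_windows):
    """Fraction of test windows whose last call differs from the prediction.
    Prefixes never seen in training count as misses."""
    if not test_windows:
        return 0.0
    misses = sum(1 for *prefix, target in test_windows
                 if model.get(tuple(prefix)) != target)
    return misses / len(test_windows)
```

A trace whose score is well above the score observed on held-out normal traces would be reported as anomalous; the slides do not give the cut-off, so it would have to be chosen empirically.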
Literature Survey • Types of attacks (host-based and network-based) • Techniques • Association rules and frequent episode rules over host-based and network-based data • Outlier detection using clustering • Classification
Future Work • NIDS • Make the threshold distance a configurable parameter of the K-Means algorithm used • HIDS • Try out meta-learning algorithms for classification • A small user interface for configuring parameters