Anomaly Detection using Curious Agents

Anomaly Detection using Curious Agents: A Case Study in Network Intrusion Detection • Kamran Shafi, Research Associate, DSARC • Kathryn Merrick, Lecturer, ITEE

Presentation Outline • Anomaly Detection • Curious Agents • Intrusion detection using curious agents • Metrics, results and analysis • Future work

Anomaly Detection • The process of finding patterns that deviate from the known or expected behaviour of a monitored system. • Assumptions: • Normal is prevalent • Anomalous is significantly different • A word on novelty detection…

Anomaly Detection Challenges • A burglar alarm detects anomalies… • Computational models face a much more difficult task • What is ‘normal’ anyway? • Listing all possibilities is infeasible • Concepts change over time • Labelled data may be unavailable or noisy • Anomalies can be very similar to normal • Anomalies are domain dependent

Anomaly Detection Techniques • Classification • Nearest Neighbour (NN) • Clustering • Statistical • Information Theoretic Approaches • Spectral Analysis (based on Chandola et al., 2009)

Research Questions in Anomaly Detection • Streaming data • Concept drift • Data labelling • Contextual anomaly detection

Curious Agents • Online, single-pass, unsupervised learners • Currently used in robotics and character animation • Programmed to seek out and focus on ‘curious’ stimuli UNSW@ADFA Sony CSL, Paris

Curiosity • In humans and animals: • A motivation to seek an ‘optimal’ level of stimulation • Stimuli that are similar-yet-different to what we already know • In artificial systems: • A scalar value for environmental stimuli based on: • Similarity • Frequency of similar stimuli • Recency of similar stimuli

Curious Agent Models • Curious reflex agents: proposed for anomaly detection in networks • Curious reinforcement learning agents: robotics and character animation • Curious supervised learning agents: intelligent sensed environments (Merrick and Maher, 2009)

Case Study: Network Intrusion Detection • The two fundamental approaches to ID: • Misuse detection • Anomaly detection • Anomaly detection for ID classified in two ways: • How normal data is interpreted and modelled • Host based, network based… • How similarity is measured • Statistical profiling, pattern matching, classification, clustering…

Intrusion Detection Challenges • Intrusions need to be detected in real time, before they can damage the system • Concepts change over time • New (legitimate) users • New applications • Novel attacks • Attacks are stealthy and disguised as normal

Advantages of Curious Agents for Network Intrusion Detection • Curious agents combine three measures to analyse stimuli (network data): • Similarity: clustering layer • Recency: habituating layer • Frequency: interest layer • Online, single-pass learners: • Potential for real-time operation • Unsupervised learners: • Potential to adapt to changes in network usage • Don’t require labelled data

Curious Reflex Agents for Intrusion Detection • Approaches tested: • Self-organising map • K-means clustering • Simplified ART network
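
To make the layered idea concrete, here is a minimal sketch of a curious reflex agent. It is not one of the three approaches tested in the study (self-organising map, K-means, simplified ART). It swaps in a hypothetical nearest-centroid clusterer that opens a new cluster when a record is farther than `radius` from every centroid, and a simple habituation rule in which the winning cluster's novelty decays while the others recover. The slides do not specify the interest layer, so the habituated novelty is reported directly as curiosity. The class name and all parameters (`radius`, `lr`, `decay`, `recover`) are assumptions for illustration.

```python
import math


class CuriousReflexAgent:
    """Hypothetical sketch: online clustering plus habituation."""

    def __init__(self, radius=0.5, lr=0.1, decay=0.3, recover=0.05):
        self.radius = radius    # assumed distance beyond which a new cluster is opened
        self.lr = lr            # assumed centroid learning rate
        self.decay = decay      # assumed habituation rate for the winning cluster
        self.recover = recover  # assumed recovery rate for all other clusters
        self.centroids = []     # one centroid (list of floats) per cluster
        self.novelty = []       # per-cluster novelty in [0, 1]; 1.0 means unseen

    def observe(self, x):
        """Process one record in a single pass and return its curiosity value."""
        # Similarity (clustering step): find the nearest centroid, if any.
        winner, dist = None, math.inf
        for i, c in enumerate(self.centroids):
            d = math.dist(x, c)
            if d < dist:
                winner, dist = i, d
        if winner is None or dist > self.radius:
            # Nothing similar is known yet: open a new, fully novel cluster.
            self.centroids.append(list(x))
            self.novelty.append(1.0)
            winner = len(self.centroids) - 1
        else:
            # Move the winning centroid a small step towards the record.
            c = self.centroids[winner]
            for j in range(len(c)):
                c[j] += self.lr * (x[j] - c[j])
        # Curiosity is the winner's novelty before this observation habituates it.
        curiosity = self.novelty[winner]
        # Recency (habituation step): the winner habituates, the others recover.
        for i in range(len(self.novelty)):
            if i == winner:
                self.novelty[i] *= 1.0 - self.decay
            else:
                self.novelty[i] = min(1.0, self.novelty[i] + self.recover)
        return curiosity


if __name__ == "__main__":
    agent = CuriousReflexAgent()
    stream = [(0.0, 0.0)] * 5 + [(5.0, 5.0)]  # repeated record, then an outlier
    for record in stream:
        print(record, round(agent.observe(record), 3))
```

In this sketch, repeated records drive curiosity down and a record unlike anything seen so far returns the maximum value, so comparing the returned value against a threshold C yields an alarm decision for each record as it arrives.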

Experimental Data: KDD Cup Dataset • 1999 KDD Cup intrusion detection dataset: • 38 attack categories (14 only in the test data) + normal data • Approx. 40 features • Used commonly to test algorithms for intrusion detection • Critiques of KDD Cup data: • Parent dataset contains simulation artefacts; the effect on KDD Cup data is not known • Not suitable for evaluating supervised learning methods; the curious agents approach overcomes this problem • Labelling issues, outdated

Experimental Design • Data pre-processing • Mapping categorical features • Normalisation (Formula ??) • Number of runs • Validation on training set • Validation on test set • Algorithm parameters • ?
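
The two pre-processing steps named on this slide can be sketched as follows. The normalisation formula is not given in the slides, so min-max scaling to [0, 1] is used here purely as an assumption, and the integer coding of categorical values is likewise one hypothetical choice. The function names and the example records are illustrative only.

```python
def fit_preprocessor(records, categorical):
    """Learn category codes and per-feature min/max from sample records.

    `categorical` is the set of column indices holding categorical values.
    """
    codes = {j: {} for j in categorical}
    numeric = []
    for rec in records:
        row = []
        for j, v in enumerate(rec):
            if j in codes:
                # Map each distinct category to the next unused integer code.
                v = codes[j].setdefault(v, len(codes[j]))
            row.append(float(v))
        numeric.append(row)
    lo = [min(col) for col in zip(*numeric)]
    hi = [max(col) for col in zip(*numeric)]
    return codes, lo, hi


def transform(rec, codes, lo, hi):
    """Encode one record and scale every feature to [0, 1] (assumed min-max)."""
    out = []
    for j, v in enumerate(rec):
        if j in codes:
            # A category not seen during fitting gets the next unused code.
            v = codes[j].get(v, len(codes[j]))
        v = float(v)
        span = hi[j] - lo[j]
        scaled = 0.0 if span == 0 else (v - lo[j]) / span
        # Clip values that fall outside the fitted range.
        out.append(min(1.0, max(0.0, scaled)))
    return out


if __name__ == "__main__":
    sample = [("tcp", 0, 10.0), ("udp", 5, 20.0), ("icmp", 10, 30.0)]
    codes, lo, hi = fit_preprocessor(sample, categorical={0})
    print(transform(("udp", 5, 20.0), codes, lo, hi))
```

Fitting the ranges needs a pass over some sample of the data, which sits outside the single-pass learning itself; an online variant would have to update the ranges as records arrive.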

Metrics • True positive rate: • A weighted measure • Percent of first instances of an attack that trigger agent curiosity above some threshold C • The higher the better • False positive rate: • Percent of all normal data that triggers agent curiosity above threshold C • The lower the better
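
A rough sketch of how these two rates could be computed from a labelled stream and the agent's curiosity values. Two assumptions are made that the slides do not confirm: a "first instance" is taken to be the first record of each contiguous run of records carrying the same attack label, and the weighting mentioned for the true positive rate is left out because it is not described. The function name and the label strings are illustrative.

```python
def detection_rates(labels, curiosity, threshold=0.7):
    """Return (true_positive_rate, false_positive_rate) as percentages.

    Assumes the first record of every contiguous same-label attack run is
    that attack's "first instance"; the result is unweighted.
    """
    first_hits = first_total = 0
    false_alarms = normal_total = 0
    prev = "normal"
    for label, c in zip(labels, curiosity):
        if label == "normal":
            normal_total += 1
            if c > threshold:
                false_alarms += 1
        elif label != prev:
            # First record of a new attack run.
            first_total += 1
            if c > threshold:
                first_hits += 1
        prev = label
    tpr = 100.0 * first_hits / first_total if first_total else 0.0
    fpr = 100.0 * false_alarms / normal_total if normal_total else 0.0
    return tpr, fpr


if __name__ == "__main__":
    labels = ["normal", "smurf", "smurf", "normal", "neptune", "normal"]
    values = [0.2, 0.9, 0.1, 0.8, 0.5, 0.3]
    print(detection_rates(labels, values, threshold=0.7))
```

Calling the function over a range of `threshold` values gives one way to explore the trade-off between detections and false alarms.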

Results and Analysis: Overview C = 0.7

Results in Detail

Aggregative Detection Rates

Strengths and Limitations • Strengths: • High detection rates for rare attack types • Potential for real-time intrusion detection • Allowance for alarm aggregation • Allowance for tuning the trade-off between detection and false alarms • Limitations: • False positive rate is too high to be practical • Parameter settings are fixed, but could be made adaptive

Conclusions • Curious agents show potential as an approach to ID: • Online, single-pass, unsupervised learning • High detection rate for attacks in KDD data set • False positive rate needs to be decreased before practical application is possible • Future Directions • Parameter adaptation • Semi-supervised feedback • Evaluation with other datasets • Real-time implementation
