Anomaly Detection Using “Normal” Data

Anomaly Detection Using “Normal” Data • Lynn Jones, Stottler Henke Associates, Seattle, WA • Lockheed Martin, Gaithersburg, MD • 12/20/01

“Table of Contents” • Introduction • Related work • Project overview • SHAI’s anomaly detection • Key component: CVFDT • Why this will work • What else we need • Other application areas

ChAD: Change and Anomaly Detection • Model-based Change and Anomaly Detection system. • Has nothing to do with voting in Florida. • Models normal behavior by observing normal behavior. • Detects departures from normal. • Does not require profiles or signatures of abnormal behavior (faults, attacks).

ChAD: Change and Anomaly Detection • Learns a model unique to the monitored network and host. • Robust when faced with noisy data. • Adapts to fluctuations in network usage. • Detects when characteristics change. • Reports on the rate and significance of observed changes.

Related work by SHAI: network management and security • Athena: Mixed-initiative Defensive Information Warfare. • ICE: Intelligent Correlation of Evidence (for network intrusion detection).

Related work by SHAI: data mining • CASAD: Clustering Activity Streams for Anomaly Detection. • MediMiner, IKODA (Intelligent Knowledge Discovery Assistant): data mining algorithms and frameworks.

Related commercial work • Aprisma’s SPECTRUM suite: event correlation and model-based reasoning, using SNMP MIB and other data. • SRI’s EMERALD (eBayes): hybrid signature-based / anomaly detection monitoring, using tcpdump data and derived events.

Related research • Cabrera et al.: look for differences in the behavior of selected “key variables.” • INBOUNDS: statistical modeling using “abnormality factors” and “standardization factors.” • Eskin et al.: automatic outlier partitioning and learned model replacement.

ChAD is a component of MASRR • MASRR: Multi-Agent System for Network Resource Reliability • Decentralized monitoring and response. • Prediction and detection of attacks, faults, misconfigurations, etc. • Network steering to maintain performance. • Funded as a DARPA SBIR Phase II.

MASRR goals • Detection of events not previously seen. • Adaptation to changing usage characteristics. • Operation in heterogeneous environments. • Real-time performance. • Scalability in deployment and operation. • Autonomous / semi-autonomous operation. • Robustness.

MASRR current focus • We have chosen to focus our efforts on Anomaly Detection using normal data.

SHAI’s anomaly detection • Use data mining methods to build a descriptive model that detects changes in the data stream. • We believe we can overcome specific issues and problems...

SHAI’s anomaly detection

Key component: CVFDT • Concept-adaptive Very Fast Decision Tree: an on-line decision tree model that does not have to see “all” the data first; its accuracy converges to that of offline models. • Network usage changes over time. Rather than a stationary concept, data is “generated by a series of concepts.” • Hulten, Spencer, & Domingos, “Mining Time-Changing Data Streams,” KDD ’01.
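
In the cited Hulten, Spencer, and Domingos paper, the ability to grow a tree without seeing “all” the data rests on the Hoeffding bound: after n independent observations of a quantity with range R, the true mean lies within ε = sqrt(R² ln(1/δ) / (2n)) of the observed mean with probability 1 − δ. The Python sketch below illustrates that test only. The function names and the parameter values are assumptions chosen for illustration, not taken from ChAD, and the tie-breaking rule used in the paper is omitted.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that, with probability 1 - delta, the true mean lies
    within epsilon of the mean observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: commit to the best split attribute once its
    observed lead over the runner-up exceeds the Hoeffding bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values: information gain over two classes has range 1 bit.
epsilon = hoeffding_bound(value_range=1.0, delta=0.05, n=1000)
```

With these illustrative values ε is about 0.039, so a lead of 0.10 in gain would justify a split after 1000 examples, while a lead of 0.02 would mean waiting for more data.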

Decision Trees, in general • A decision tree built from some engine data shows that the life of oil seals depends on the operating temperature, and, less definitively, the pressure. • This model might be used in making a maintenance schedule.

Adaptive Decision Trees • The company changes the supplier of its oil seals, and begins seeing early failures of seals when operating pressure is around 15, with a wide variance in temperature. The adaptive tree starts an alternate tree...

Adaptive Decision Trees • New records are processed by both trees. • As the alternate tree grows, it eventually becomes more accurate than the original. • The alternate is promoted and the original tree is pruned.

CVFDT - more detail • Each node keeps “sufficient statistics” on the examples seen. • Sliding window of examples. • Nodes maintain statistics, forget examples as the window slides. • Structure of the tree is periodically evaluated, using statistics.
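
The bookkeeping described on this slide can be sketched for a single node as follows. This is a simplified illustration, not the ChAD or CVFDT implementation: the class name `WindowedStats`, the feature names, and the labels are hypothetical, and a real tree would keep such counts at every node an example passes through.

```python
from collections import Counter, deque


class WindowedStats:
    """Counts of (attribute, value, label) over the most recent examples."""

    def __init__(self, window: int):
        self.window = window
        self.examples = deque()   # examples currently inside the window
        self.counts = Counter()   # the "sufficient statistics"

    def add(self, features: dict, label: str) -> None:
        """Count a new example, then forget the oldest if the window is full."""
        self.examples.append((features, label))
        for attr, value in features.items():
            self.counts[(attr, value, label)] += 1
        if len(self.examples) > self.window:
            self._forget(*self.examples.popleft())

    def _forget(self, features: dict, label: str) -> None:
        """Remove an expired example's contribution from the counts."""
        for attr, value in features.items():
            key = (attr, value, label)
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]


# Hypothetical usage with a window of two examples.
stats = WindowedStats(window=2)
stats.add({"proto": "tcp"}, "normal")
stats.add({"proto": "udp"}, "normal")
stats.add({"proto": "tcp"}, "normal")   # the first tcp example slides out
```

Because only counts are stored, split quality can be re-evaluated at any time from `counts` without rescanning the examples.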

CVFDT - more detail • Alternate tree is started using different split attribute. • After every n examples, trees are tested for accuracy. • If alternate is better, replace original. • If alternate fails to improve, it is pruned.
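
The promote-or-prune cycle on this slide might look roughly like the sketch below. It is an illustration under stated assumptions: both trees are stood in for by plain prediction functions, and the names `run_contest`, `test_every`, and `patience` (the number of test rounds an alternate may go without improving) are hypothetical rather than taken from CVFDT or ChAD.

```python
from typing import Callable, Iterable, Tuple

Model = Callable[[dict], str]


def run_contest(original: Model, alternate: Model,
                stream: Iterable[Tuple[dict, str]],
                test_every: int, patience: int) -> str:
    """Return 'promoted', 'pruned', or 'undecided' for the alternate."""
    hits_orig = hits_alt = seen = 0
    best_alt = 0.0
    stale_rounds = 0
    for features, label in stream:
        # New records are processed by both trees.
        hits_orig += int(original(features) == label)
        hits_alt += int(alternate(features) == label)
        seen += 1
        if seen < test_every:
            continue
        # After every test_every examples, compare accuracy.
        acc_orig, acc_alt = hits_orig / seen, hits_alt / seen
        if acc_alt > acc_orig:
            return "promoted"          # alternate replaces the original
        if acc_alt > best_alt:
            best_alt, stale_rounds = acc_alt, 0
        else:
            stale_rounds += 1
            if stale_rounds >= patience:
                return "pruned"        # alternate failed to improve
        hits_orig = hits_alt = seen = 0
    return "undecided"


# Hypothetical oil-seal models: the alternate reacts to pressure near 15.
always_long = lambda f: "long"
pressure_aware = lambda f: "short" if 14 <= f["pressure"] <= 16 else "long"
early_failures = [({"pressure": 15}, "short")] * 10
outcome = run_contest(always_long, pressure_aware, early_failures,
                      test_every=5, patience=2)
```

In this toy stream every record is an early failure near pressure 15, so the alternate wins the first comparison and is promoted.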

CVFDT reveals: • That system behavior is changing. • How it’s changing – which variable(s). • The degree to which it’s changing – how dramatically, how rapidly, whether transient or permanent. • ChAD applies CVFDT in a novel way to perform anomaly detection using normal, unlabeled data.

MASRR agents use ChAD • Segment usage and model different periods of normal activity. • Manage the library of normal models. • Interpret results of ChAD models. • Share their observations. • Adjust parameters to tune model sensitivity.
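
One way an agent could segment usage and manage a library of normal models is to key each model by a recurring time period. The slides do not say how ChAD segments activity, so everything in this sketch is an assumption: the weekday/weekend split, the six-hour blocks, and the names `period_key` and `ModelLibrary` are hypothetical.

```python
from datetime import datetime
from typing import Callable, Dict, Tuple


def period_key(ts: datetime, block_hours: int = 6) -> Tuple[str, int]:
    """Map a timestamp to a (day type, block of the day) period."""
    day_type = "weekend" if ts.weekday() >= 5 else "weekday"
    return (day_type, ts.hour // block_hours)


class ModelLibrary:
    """Keeps one model of normal behavior per usage period."""

    def __init__(self, make_model: Callable[[], object]):
        self._make_model = make_model
        self._models: Dict[Tuple[str, int], object] = {}

    def model_for(self, ts: datetime) -> object:
        """Return the model for this period, creating it on first use."""
        key = period_key(ts)
        if key not in self._models:
            self._models[key] = self._make_model()
        return self._models[key]


# Hypothetical usage; `list` stands in for a real model constructor.
library = ModelLibrary(make_model=list)
thursday_morning = library.model_for(datetime(2001, 12, 20, 9, 0))
saturday_morning = library.model_for(datetime(2001, 12, 22, 9, 0))
```

Records from a Thursday morning and a Saturday morning then train and test separate models, so routine weekly fluctuation is not reported as change.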

Why this will work • Utilizes routine fluctuations to create more precise periodic models. • Each agent is sensitive to small changes (slow changes, changes across few variables) on the element(s) it monitors.

Why this will work • When the network is compromised in some area, absence of data or agent response is also used as information. • Combines general anomaly detection with root cause analysis.

Why this will work • More general than eBayes: can detect various kinds of anomalies across different variables. • “Key variable” signatures are not required, as they are in Cabrera et al. (similar rules might be used for fault/attack identification). • Decentralized analysis is more sensitive than INBOUNDS’ centralized system.

Known issues • Overhead - processing, disk space. • Getting the sensitivity parameters right. • Are parameters universal? Or do they depend on the data? • Amount of data needed. • What about pre-existing conditions? • Feature selection.

What (else) will it take? • Testing and refinement with real data. • Implementation of the agent reasoning system. • Implementation of heuristics. • Feature selection experiments.

Other applications • Manufacturing processes monitoring. • Condition-based monitoring (military and commercial) - e.g., fault and wear prediction for maintenance scheduling.

Conclusion • SHAI is developing an anomaly detection system that we believe: • is scalable, • works in real-time, • detects attacks or faults not previously observed, • learns in-place using normal, unlabeled data.

General info on SHAI • Artificial Intelligence R&D firm, founded in 1988. • Extensive experience • Hundreds of fielded systems. • Variety of AI techniques and application areas.

Contact info • Lynn Jones • lwjones@shai-seattle.com • http://64.81.14.30/ReliabilityWeb/ • SHAI, 1107 NE 45th St., Suite 427, Seattle, WA 98105
