URCA: Pulling out Anomalies by their Root Causes. Fernando Silveira and Christophe Diot. Presenter: Fernando Silveira, UPMC and Technicolor. Joint work with Christophe Diot. Presented at INFOCOM 2010 – San Diego, USA.
URCA: Pulling out Anomalies by their Root Causes
• Presenter: Fernando Silveira
• UPMC and Technicolor
• Joint work with Christophe Diot
• Presented at INFOCOM 2010 – San Diego, USA
Traffic Anomaly Detection
[Figure: Traffic Data feeding an Anomaly Detector that raises an Alarm; plot of packet counts over time with the anomalous traffic marked.]
Root Cause Analysis of Traffic Anomalies
• Obtaining information about an anomaly’s cause.
• Automating root cause analysis is important…
  • Manual analysis is tedious and error-prone
  • Study from Arbor Networks with 67 ISPs: the average ISP observes ~19 anomalies/day
• … but it is also a hard problem.
  • Most detectors do not provide any information beyond an alarm
Related Work
• Anomaly detection methods with properties that facilitate root cause analysis tasks
• Anomaly classification
  • Lakhina et al. - SIGCOMM’05
  • Based on clustering entropy residuals
  • Limited to anomalies found in entropy
• Anomalous flow identification
  • Schweller et al. - IMC’04, Li et al. - IMC’06
  • Based on reversible sketches
  • Complexity of choosing and computing sketches
  • Limited to anomalies found in sketches
Our Contribution
• URCA (Unsupervised Root Cause Analysis)
  • a tool that finds an anomaly’s root cause
  • can be used with different anomaly detectors
• It provides accurate and fast results:
  • anomalies are analyzed as fast as they are detected (1-5 minutes)
Outline
• Algorithms for URCA
• Performance Evaluation
Our Approach
URCA has two steps:
• anomalous flow identification
• root cause classification
Our methods rely on flow features
Step 1: Anomalous Flow Identification
[Figure: block diagram with Traffic Data, Anomaly Detector, Alarm, and Filter; the flow feature shown is Destination Port.]
Flow Identification - Example
[Figure: packet counts over time. Candidate flows are split by Destination AS (3 values: AS 2108, AS 3354, AS 1277) and by Output Interface (2 values: eth0, eth1) into anomalous flows and normal flows.]
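The narrowing shown in this example can be sketched in a few lines. This is only one plausible reading of the slide, not the algorithm from the paper: the flow records, the field names (`dst_as`, `out_if`, `before`, `during`), the choice of which AS and interface carry the anomaly, and the fixed threshold that stands in for re-running the anomaly detector are all assumptions made for illustration.

```python
from collections import defaultdict

def identify_anomalous_flows(flows, features, threshold):
    """Narrow the candidate flows one feature at a time.

    flows: list of dicts holding feature values plus hypothetical
    'before' and 'during' packet counts for the alarmed time bin.
    """
    candidates = list(flows)
    for feature in features:
        # Sum the packet-count change for each value of this feature.
        change = defaultdict(int)
        for flow in candidates:
            change[flow[feature]] += flow["during"] - flow["before"]
        # Stand-in for the detector: a value is anomalous if its change is large.
        anomalous_values = {v for v, delta in change.items() if abs(delta) > threshold}
        # If no single value stands out, this feature does not help; skip it.
        if anomalous_values:
            candidates = [f for f in candidates if f[feature] in anomalous_values]
    return candidates

# Made-up flows using the feature values named on the slide.
flows = [
    {"dst_as": "AS 2108", "out_if": "eth0", "before": 100, "during": 900},
    {"dst_as": "AS 2108", "out_if": "eth1", "before": 120, "during": 125},
    {"dst_as": "AS 3354", "out_if": "eth0", "before": 200, "during": 210},
    {"dst_as": "AS 1277", "out_if": "eth1", "before": 150, "during": 140},
]
result = identify_anomalous_flows(flows, ["dst_as", "out_if"], threshold=500)
```

With this toy data, the Destination AS pass keeps the two AS 2108 flows and the Output Interface pass then keeps only the eth0 flow.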
Visualizing Root Cause Flows
[Figure: two example visualizations, a network scan and a routing change.]
Step 2: Root Cause Classification
[Figure: clusters of anomalies labeled a, b, and c, plus one unlabeled anomaly (?).]
• We compute metrics from each anomaly
  • number of source IPs, ASNs, flow sizes, packet sizes, etc.
• Hierarchical clustering
  • known anomalies + 1 unknown
• Bootstrapping labels
  • helped by visualization
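A minimal sketch of the "known anomalies + 1 unknown" idea follows. The slide does not state the distance, the linkage rule, or how the cluster tree is cut, so Euclidean distance, average linkage, and a fixed number of clusters `k` are assumptions here, and the two-dimensional metric vectors are made up.

```python
import math
from collections import Counter

def average_linkage(points, a, b):
    # Mean pairwise Euclidean distance between two clusters of point indices.
    return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def cluster(points, k):
    # Agglomerative clustering: repeatedly merge the two closest clusters.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        _, x, y = min(
            (average_linkage(points, clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def classify_unknown(known, unknown, k):
    """known: list of (metric_vector, label); unknown: one metric_vector."""
    points = [vec for vec, _ in known] + [unknown]
    labels = [label for _, label in known]
    unknown_idx = len(points) - 1
    for members in cluster(points, k):
        if unknown_idx in members:
            votes = Counter(labels[i] for i in members if i != unknown_idx)
            # None means the unknown ended up alone: no known label applies.
            return votes.most_common(1)[0][0] if votes else None

known = [
    ((1.0, 1.0), "a"), ((1.2, 0.9), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.8), "b"),
    ((9.0, 1.0), "c"), ((9.2, 1.1), "c"),
]
label = classify_unknown(known, (5.2, 5.1), k=3)
```

In this sketch an unknown anomaly that lands in a cluster by itself gets no label, which is the case where a first occurrence of an event type would need manual labeling.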
Outline
• Algorithms for URCA
• Performance Evaluation
Experimental Methodology
• Traces from links in GEANT2
• Anomalies obtained with the ASTUTE anomaly detector
Identification Accuracy - Traces B-F
* 90th percentile, averaged across traces
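One reading of the footnote is that a 90th percentile is computed within each trace and the per-trace values are then averaged. The sketch below shows that arithmetic only; the trace names and numbers are invented, and the slide does not say which quantity the percentile is taken over.

```python
from statistics import mean, quantiles

def p90(values):
    # quantiles(..., n=10) returns the 9 decile cut points;
    # the last one is the 90th percentile.
    return quantiles(values, n=10, method="inclusive")[-1]

def averaged_p90(per_trace_values):
    # Average the per-trace 90th percentiles.
    return mean(p90(values) for values in per_trace_values.values())

# Hypothetical per-anomaly measurements for two traces.
traces = {"B": list(range(1, 12)), "C": list(range(11, 22))}
summary = averaged_p90(traces)
```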
Classification Accuracy - Trace A
• 80% correct
• 5% require visualization
• 15% misclassified = 5% first occurrences of an event type + 10% routing changes mistaken for link failures
Wrapping Up
• What you’ll find in the paper:
  • Algorithms for both identification and classification
  • Experimental evaluation with 6 traces
  • URCA can be applied to other anomaly detectors
• Ongoing and Future Work:
  • URCA with an EWMA-based detector
  • Using other sources of data (e.g., routing data)
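For readers unfamiliar with EWMA-based detection, a generic sketch is given below. It is not the detector the authors plan to use: the smoothing factor `alpha`, the fixed residual threshold, and the packet counts are illustrative assumptions.

```python
def ewma_alarms(counts, alpha=0.3, threshold=100.0):
    """Return the indices of time bins that deviate from the EWMA forecast."""
    alarms = []
    forecast = float(counts[0])
    for t in range(1, len(counts)):
        if abs(counts[t] - forecast) > threshold:
            # Raise an alarm and keep the anomalous bin out of the forecast.
            alarms.append(t)
        else:
            # Exponentially weighted moving average update.
            forecast = alpha * counts[t] + (1 - alpha) * forecast
    return alarms

# Made-up packet counts per time bin with one spike.
counts = [100, 102, 98, 101, 99, 500, 100, 101]
alarm_bins = ewma_alarms(counts)
```

Each alarmed bin is the kind of input URCA would take from a detector before identifying the responsible flows.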
The End
• Special thanks to:
  • DANTE / GEANT2 - http://www.geant2.net/
  • Ricardo Oliveira @ UCLA - http://irl.cs.ucla.edu/~rveloso/
• More information at:
  • http://www.thlab.net/~fernando/papers/urca.pdf
  • http://www.thlab.net/~fernando/papers/astute.pdf