
Taming Internet Traffic: Some notes on modeling the wild nature of OD flows

Explore the study and architecture of an OD flow modeling system for efficient network monitoring. Learn about traffic matrices, Kalman filtering, traffic dynamics, modeling techniques, and recalibration for improved performance and anomaly detection.

tkohler




Presentation Transcript


  1. Taming Internet Traffic: Some notes on modeling the wild nature of OD flows • Augustin Soule (Univ. Paris VI) • Kavé Salamatian (Univ. Paris VI) • Antonio Nucci (Sprintlabs) • Nina Taft (Intel Berkeley)

  2. What’s next • Definition of the problem • Overview of the approach • Study of the modeling part • Study of the Tracking part

  3. Network monitoring (1) • Network state results from • Traffic demand • OD matrix • Capacity offer • Routing matrix, link capacity, traffic engineering, etc. • Objective of the network operator • To drive the equilibrium point to the most beneficial state • By managing the capacity offer • Traffic engineering is the art of managing the capacity offer

  4. Network monitoring (2) • Monitoring • Capacity offer • Pings, failure monitoring, SNMP reports • Traffic demand? • Is not observable per se • At least not in real time • Has to be inferred indirectly • From traffic counts

  5. Network monitoring (3) • Monitoring? • Being able to separate • What is predicted • Expected, under control, normal, … • What is unpredicted • Unexpected, out of range, abnormal, … • Occam's razor view • Express what is predictable by a short model • Describe fully what is unpredictable • Interpretation view • Only what is unpredictable has to be interpreted • What is predictable gives no information

  6. Architecture of a network monitoring system

  7. Overview of the solution • Model the normal behavior of traffic demand • At a sufficient granularity level • Relevant granularity for the operator? • Compare observations with the predictions made by the model • Raise an alarm if a divergence is seen • Wow, I just rediscovered the Kalman filter!

  8. What’s a traffic matrix? • [Figure: origin/destination matrix over City A, City B, City C; one example entry is 25 Mbps] • Can define a variety of matrices • Select timescale • Select node granularity: router, prefix, PoP, etc. • Application-wise!

  9. Problem Formulation • Have linear system: Y = A X • Notation: Y = link counts (from SNMP); A = routing matrix (from IGP link weights); X = OD flow volumes (unknown) • Routing matrix: one row per link (Link1, Link2, Link3, ..., Link L), one column per OD pair (ODAB, ODAC, ODAD, ...), with entries such as 0, 1, 1/2 • Issue: # links << # OD pairs => underconstrained system => infinite # of solutions
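The underconstrained nature of Y = A X can be illustrated with a toy example. This is only a sketch: the 3-link, 4-OD-pair routing matrix and the traffic volumes below are made up for illustration and do not come from the presentation.

```python
import numpy as np

# Hypothetical toy network: 3 links, 4 OD pairs (illustrative values only).
# A[l, k] is the fraction of OD pair k's traffic carried by link l.
A = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.0, 0.5, 0.0, 1.0],
])
x_true = np.array([25.0, 10.0, 5.0, 40.0])  # OD volumes in Mbps (unknown in practice)
y = A @ x_true                               # link counts, as SNMP would report them

# Fewer independent equations (rank) than unknowns (columns).
rank = np.linalg.matrix_rank(A)

# The pseudo-inverse picks the minimum-norm solution, one among infinitely many.
x_min_norm = np.linalg.pinv(A) @ y

# Adding any null-space vector of A gives another solution with the same link counts.
_, _, vt = np.linalg.svd(A)
null_vec = vt[rank:][0]
x_other = x_min_norm + 3.0 * null_vec
```

Both `x_min_norm` and `x_other` reproduce the link counts exactly, which is why link counts alone cannot pin down the OD flows and a traffic model is needed.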

  10. OD Traffic Dynamics (1)

  11. OD traffic dynamics (2) • Temporal correlations • Diurnal, weekly, monthly, etc. • Spatial correlation • Same origin PoP • Same destination PoP • Create a dynamic LTI model for OD flows capturing temporal and spatial dependencies • X(t+1) = C*X(t)+W(t) • W(t) accounts for model imprecision
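A minimal simulation of the LTI model X(t+1) = C*X(t) + W(t) might look as follows. The function name, the 2x2 matrix C, and the Gaussian white-noise assumption for W(t) are illustrative choices, not taken from the slides.

```python
import numpy as np

def simulate_od_flows(C, x0, steps, noise_std, seed=0):
    """Simulate X(t+1) = C X(t) + W(t) with W(t) ~ N(0, noise_std^2 I) (assumed)."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    X = np.empty((steps + 1, n))
    X[0] = x0
    for t in range(steps):
        w = rng.normal(0.0, noise_std, size=n)  # model imprecision W(t)
        X[t + 1] = C @ X[t] + w
    return X

# Hypothetical 2-flow example: off-diagonal terms couple the two OD flows,
# standing in for the spatial correlation between flows sharing a PoP.
C = np.array([[0.90, 0.05],
              [0.05, 0.90]])
X = simulate_od_flows(C, x0=np.array([10.0, 20.0]), steps=50, noise_std=0.1)
```

The diagonal of C carries each flow's own temporal dependence, while the off-diagonal entries carry the cross-flow dependence.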

  12. Traffic Model • State space model: • How to calibrate C, Q and R? • EM method • Find the values of C, Q and R such that the observations are most likely to be observed • Observations might be the OD traffic itself or the link counts • OD traffic is better, but sometimes there is no other choice than link counts • A good initial point is needed • Use OD traffic first, link counts next • Multi-linear method • X(t+1) is expressed as a multi-linear relation of X(t) • Leads to a diagonal matrix Q
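The state space equations did not survive in the transcript. A standard form that is consistent with slide 9 (Y = A X) and slide 11 (X(t+1) = C X(t) + W(t)) is sketched below; the observation noise symbol V(t) and the zero-mean Gaussian assumption are assumptions, with Q and R read as the covariances of the state noise and the observation noise.

```latex
\begin{aligned}
X(t+1) &= C\,X(t) + W(t), & W(t) &\sim \mathcal{N}(0, Q) \\
Y(t)   &= A\,X(t) + V(t), & V(t) &\sim \mathcal{N}(0, R)
\end{aligned}
```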

  13. Raw data • Let’s suppose we have gathered the full OD matrix over one day • Sampled Aggregate NetFlow (Cisco) used on all routers inside Sprint’s European network • Flow = 5-tuple (@src, @dst, port src, port dst, proto) • Each flow is sampled every 250th packet • Downloaded BGP tables and configuration files from all routers: used to determine egress points within Sprint’s AS => yielding the FULL traffic matrix • Three weeks of data from August 2003 • Many thanks to Anukool Lakhina for collecting/processing the raw data :)

  14. Inside the model: Impulse response of the filter • At time t=1 • OD 1 is set to 1 • See the propagation of this impulse on all the other OD pairs • 24 h periodicity • Exponentially decreasing sinusoid
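The impulse-response experiment can be sketched as below. The matrix C here is a hypothetical damped rotation chosen so that, with hourly samples, the response is an exponentially decreasing sinusoid with a 24-sample period; it is not the matrix fitted in the presentation.

```python
import numpy as np

def impulse_response(C, source, steps):
    """Set one state to 1, all others to 0, and iterate x <- C x without noise."""
    x = np.zeros(C.shape[0])
    x[source] = 1.0
    out = [x.copy()]
    for _ in range(steps):
        x = C @ x
        out.append(x.copy())
    return np.array(out)

# Hypothetical C with eigenvalues rho * exp(+/- i*omega): decay rho per step,
# one full oscillation every 24 steps (24 h if one step is one hour).
rho, omega = 0.98, 2 * np.pi / 24
C = rho * np.array([[np.cos(omega), -np.sin(omega)],
                    [np.sin(omega),  np.cos(omega)]])
h = impulse_response(C, source=0, steps=48)
```

Column 0 of `h` is the response of the excited flow and column 1 shows how the impulse propagates to the other flow.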

  15. Inside the model: Pole diagram • Radius r: amplitude of the eigenvalue • Angle θ: frequency of the eigenvalue
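The radius and angle of each pole can be read directly from the eigenvalues of C. The sketch below assumes a fixed sampling step `dt_hours` (a hypothetical parameter) to convert an angle into an oscillation period, and reuses an illustrative damped-rotation matrix.

```python
import numpy as np

def poles(C, dt_hours=1.0):
    """Return eigenvalues of C with their radius, angle, and oscillation period."""
    eig = np.linalg.eigvals(C)
    radius = np.abs(eig)      # decay per step: closer to 1 means longer memory
    angle = np.angle(eig)     # radians per step: larger means faster oscillation
    with np.errstate(divide="ignore"):
        period_hours = np.where(angle != 0,
                                2 * np.pi * dt_hours / np.abs(angle),
                                np.inf)  # purely real poles do not oscillate
    return eig, radius, angle, period_hours

# Hypothetical example: a damped rotation with a 24-step period.
rho, omega = 0.98, 2 * np.pi / 24
C = rho * np.array([[np.cos(omega), -np.sin(omega)],
                    [np.sin(omega),  np.cos(omega)]])
eig, radius, angle, period_hours = poles(C)
```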

  16. Inside the model: Filtering the eigenvalues • Filter out the overlearning • Remove small-timescale fluctuations • Remove fast oscillations • Keep the white area
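One way to filter eigenvalues is to zero out the unwanted ones and rebuild C from its eigendecomposition. This is a sketch under assumptions: the thresholds `min_radius` and `max_angle` are hypothetical stand-ins for the "white area" of the pole diagram, small radius is read as small-timescale fluctuation, large angle as fast oscillation, and C is assumed to be diagonalizable.

```python
import numpy as np

def filter_eigenvalues(C, min_radius, max_angle):
    """Zero eigenvalues outside the kept region and rebuild the matrix."""
    vals, vecs = np.linalg.eig(C)
    keep = (np.abs(vals) >= min_radius) & (np.abs(np.angle(vals)) <= max_angle)
    filtered = np.where(keep, vals, 0.0)
    # The keep rule is symmetric for conjugate pairs, so the result is real
    # up to rounding error; .real drops the residual imaginary part.
    C_f = vecs @ np.diag(filtered) @ np.linalg.inv(vecs)
    return C_f.real, keep

# Hypothetical example: one slow mode (0.95) and one fast-decaying mode (0.2).
C = np.diag([0.95, 0.2])
C_f, keep = filter_eigenvalues(C, min_radius=0.5, max_angle=np.pi / 4)
```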

  17. Kalman filtering • Separate what is compatible with the model from what is incompatible • Do it by comparing what is predicted by the model with what is observed • Innovation process: • Two steps • Prediction • Correction
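The two steps can be sketched with the textbook Kalman filter equations. This assumes the observation model Y(t) = A X(t) + V(t) with state noise covariance Q and observation noise covariance R; the innovation is then the observed link counts minus the predicted ones, y - A x_pred. The function name and argument layout are illustrative.

```python
import numpy as np

def kalman_step(x_est, P, y, A, C, Q, R):
    """One predict/correct cycle; returns new state, covariance, innovation, and S."""
    # Prediction: propagate the previous estimate through the traffic model.
    x_pred = C @ x_est
    P_pred = C @ P @ C.T + Q
    # Innovation: the part of the observation the model did not predict.
    innov = y - A @ x_pred
    S = A @ P_pred @ A.T + R          # innovation covariance
    # Correction: blend prediction and observation using the Kalman gain.
    K = P_pred @ A.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innov
    P_new = (np.eye(len(x_est)) - K @ A) @ P_pred
    return x_new, P_new, innov, S

# Scalar sanity check with hypothetical values.
one = np.array([[1.0]])
x_new, P_new, innov, S = kalman_step(
    x_est=np.array([0.0]), P=one, y=np.array([1.0]),
    A=one, C=one, Q=np.array([[0.0]]), R=one,
)
```

In the scalar check, prediction and observation are equally uncertain, so the corrected estimate lands halfway between them.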

  18. Example of fitting

  19. Monitoring information • A confidence interval can be built on the innovation process • If the innovation falls outside this interval, then something unpredicted has happened • Raise an alarm! • Is every change a problem? • Same approach for OD pairs • Ability to track changes on each OD • Might be useful for DDoS attack detection and management
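A simple alarm rule on the innovation could look like the sketch below. It assumes the innovation covariance S from the Kalman filter is available and uses a per-component threshold of z standard deviations; the value z = 3 is an arbitrary illustrative choice, not one stated in the slides.

```python
import numpy as np

def innovation_alarms(innov, S, z=3.0):
    """Flag components whose innovation exceeds z standard deviations."""
    sigma = np.sqrt(np.diag(S))       # per-link (or per-OD) standard deviation
    return np.abs(innov) > z * sigma

# Hypothetical values: the second component is far outside its interval.
alarms = innovation_alarms(np.array([0.5, 10.0]), np.diag([1.0, 4.0]))
```

The same function applies to OD-level innovations if the corresponding covariance is used instead of the link-level one.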

  20. Innovation on the link

  21. Innovation on the OD • Need to recalibrate the model for these OD pairs

  22. Recalibration! • Need to find out the new model! • Several ways • Do a NetFlow acquisition for all changing OD flows. Mix with previous OD flows. Recalibrate the model • Use traffic counts for recalibrating the model using the EM method with the previous model as starting point • Develop a continuous-time adaptive mechanism • Use LMS or RMS algorithm • Use a sliding window
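As an illustration of the adaptive option, a normalized LMS update of C is sketched below. It assumes that consecutive OD flow vectors x(t) and x(t+1) are available (for example from a NetFlow acquisition); the step size `mu`, the regularizer `eps`, and the function name are hypothetical, and this is not claimed to be the mechanism the authors used.

```python
import numpy as np

def nlms_update(C, x_t, x_next, mu=0.5, eps=1e-12):
    """One normalized LMS step reducing the one-step prediction error of C."""
    error = x_next - C @ x_t                          # prediction error
    return C + mu * np.outer(error, x_t) / (eps + x_t @ x_t)

# Hypothetical single update starting from an empty model.
C0 = np.zeros((2, 2))
x_t = np.array([1.0, 0.0])
x_next = np.array([0.5, 0.2])
C1 = nlms_update(C0, x_t, x_next, mu=1.0)
```

With `mu=1.0` the updated matrix reproduces the latest sample exactly; smaller step sizes trade tracking speed for robustness to noise.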

  23. Example of fitting: After recalibrations

  24. Innovation After Recalibrations

  25. L2-Norm over time

  26. Contributions • New tracking approach for network monitoring • Using temporal and spatial correlation • OD flow model • Able to detect deviations from the model • Thanks to the Kalman filter • Fast and scalable • Whole process takes less than 2 minutes for 14 days of data • Validated using real traces
