
The Sprint IP Monitoring Project and Traffic Dynamics at a Backbone POP

The Sprint IP Monitoring Project and Traffic Dynamics at a Backbone POP. Supratik Bhattacharyya Sprint ATL http://www.sprintlabs.com. The IP Group at Sprintlabs. Charter : Investigate IP technologies for robust, efficient, QOS-enabled networks


Presentation Transcript


  1. The Sprint IP Monitoring Project and Traffic Dynamics at a Backbone POP Supratik Bhattacharyya Sprint ATL http://www.sprintlabs.com

  2. The IP Group at Sprintlabs Charter : • Investigate IP technologies for robust, efficient, QOS-enabled networks • Anticipate and evaluate new services and applications Major Projects : • Monitoring Sprint’s IP Backbone • Service Platform

  3. Talk Overview • The IPMon Project • Routing and Traffic Dynamics

  4. IP Backbone: POP-to-POP view [Figure: backbone map of POPs connected by OC-48, OC-12, and OC-3 links] POP: Point of Presence, typically a metropolitan area

  5. Motivation: Need for Monitoring Current network is over-provisioned, over-engineered, best-effort… • Diagnosis: • detect and report problems at IP level • Management • configuration problems, traffic engineering • resource provisioning, network dimensioning • Value-added service • feedback to customers (performance, traffic characteristics) • Detect attacks and anomalies

  6. Existing Measurement Efforts • Passive measurements • SNMP-based tools • Netflow (Cisco proprietary) • OC3MON, OC12MON • Active Measurements • ping, traceroute, NIMI, MINC, Surveyor • Skitter, Keynote, Matrix • Integrated Approach • AT&T Netscope • Network topology and routes • Traffic at flow level granularity • Delay and loss statistics

  7. Our approach • Passive monitoring • Capture header (44 bytes) from every packet • full TCP/IP headers, no HTTP information • Use GPS time stamping, which allows accurate correlation of packets on different links • Day-long traces • Simultaneously monitor multiple links and sites • Collect routing information along with packet traces • Traces archived for future use
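
As a rough illustration of what a 44-byte header capture can yield, the Python sketch below parses a truncated snapshot. A minimal IPv4 header (20 bytes) plus a minimal TCP header (20 bytes) fits within 44 bytes. The slide does not describe the IPMON record layout, so the sketch assumes the snapshot starts at the first byte of the IPv4 header; `parse_snapshot` and its field names are hypothetical.

```python
import ipaddress
import struct

SNAPLEN = 44  # bytes kept per packet, as stated on the slide


def parse_snapshot(snap: bytes) -> dict:
    """Parse IPv4 fields (and TCP/UDP ports, if present) from a truncated capture.

    Assumption: snap begins at the IPv4 header, with no link-layer bytes.
    """
    snap = snap[:SNAPLEN]
    if len(snap) < 20:
        raise ValueError("snapshot shorter than a minimal IPv4 header")
    (ver_ihl, _tos, total_len, _ident, _frag,
     _ttl, proto, _cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", snap[:20])
    if ver_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes
    record = {
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "protocol": proto,
        "total_length": total_len,  # length of the original packet, not the snapshot
    }
    # TCP (6) and UDP (17) both begin with source and destination ports.
    if proto in (6, 17) and len(snap) >= ihl + 4:
        record["src_port"], record["dst_port"] = struct.unpack(
            "!HH", snap[ihl:ihl + 4])
    return record
```

Because the IP total-length field survives truncation, byte counts per packet can still be recovered from header-only traces.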

  8. Applications • Data from a commercial Tier-1 IP backbone • Applications of data: • traffic modeling • traffic engineering • provisioning • pricing, SLAs • hardware design in collaboration with vendors • denial-of-service

  9. Measurement Facilities • IPMON System • Collects packet traces by passively tapping the fiber using optical splitters • Supports OC-3 to OC-48 data rates • Data Repository • Large tape library to archive data • Analysis Platform • Initially a 17-node computing cluster • SAN under deployment

  10. IPMON Architecture Linux PC with multiple PCI buses

  11. Monitoring links at a POP

  12. Current Status of IPMONs • Currently operational in one major west coast POP on OC3 links • Under way in two major east coast POPs for OC3 and OC12 -- (we hope by July 2001) • OC48 in preparation for 1 east coast POP and 1 west coast POP -- summer 2001 • Future: Sprint Dial-Up Network, more POPs, European network

  13. Practical Constraints • Difficult to monitor operational network : • Complex procedure for deploying equipment  • POPs evolve too fast  • Too costly to be ubiquitous • Technology limitations (PCs, disks, etc.) • Only off-line analysis is possible • Are 44 bytes enough?

  14. Ongoing Projects • Routing and Traffic Dynamics • Delay measurement across a router • TCP flow analysis • Denial of service • Bandwidth provisioning and pricing

  15. Routing and Traffic Dynamics Project • Part 1: what are the traffic demands between pairs of POPs? • How stable is this demand? • Part 2: what are the paths taken by those demands? • Are link utilization levels similar throughout the backbone? • Part 3: is there a better way to spread the traffic across paths? • At what level of traffic granularity should traffic be split up?

  16. Motivation Understand traffic demands between POP pairs

  17. POP-to-POP Traffic Matrix [Figure: matrix with rows and columns labeled City A, City B, City C] • For every ingress POP: • Identify total traffic to each egress POP • Further analyze this traffic: • Measure traffic over different timescales • Divide traffic per destination prefix, protocol, etc.
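
A minimal Python sketch of the aggregation this slide describes: total bytes per (ingress POP, egress POP) pair, binned over a chosen timescale. The record layout, the function names, and the one-hour default bin are assumptions made for illustration; the talk does not specify them.

```python
from collections import defaultdict


def build_traffic_matrix(records, bin_seconds=3600):
    """Sum bytes per (time bin, ingress POP, egress POP).

    records: iterable of (timestamp_s, ingress_pop, egress_pop, nbytes),
    a hypothetical layout chosen for this sketch.
    """
    matrix = defaultdict(int)
    for ts, ingress, egress, nbytes in records:
        matrix[(int(ts // bin_seconds), ingress, egress)] += nbytes
    return dict(matrix)


def fanout(matrix, ingress, time_bin):
    """Fraction of one ingress POP's traffic going to each egress POP."""
    row = {e: b for (t, i, e), b in matrix.items()
           if t == time_bin and i == ingress}
    total = sum(row.values())
    return {e: b / total for e, b in row.items()} if total else {}
```

Changing `bin_seconds` gives the "different timescales" view; adding a destination prefix or protocol to the key would give the finer breakdowns the slide mentions.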

  18. Applications • Intra-domain routing • Analyzing routing anomalies • Verify BGP Peering • Capacity planning and dimensioning • POP architecture

  19. Generating POP-POP traffic matrices

  20. The Mapping Problem What is the egress POP for a packet entering a given ingress POP?

  21. Mapping BGP destinations to POPs [Flow diagram; labels: BGP table; (Dst, Next-Hop); Find best Next-Hop; Get Unique Next-Hops; Unique Next-Hops; Recursive BGP lookup to find last Sprint hop; (Next-Hop, Last Sprint Hop); Map to POP; (Next-Hop, POP map); Map Dst to POP; (BGP Dst, POP)]
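
The recursive lookup on this slide could be sketched in Python roughly as follows. This is one interpretation, not the project's actual tool: it assumes a routing table given as `{prefix: next_hop}`, a hypothetical `router_pop` dictionary mapping backbone router addresses to POP names, and a linear-scan longest-prefix match that is only suitable for small tables.

```python
import ipaddress


def lookup(table, addr):
    """Longest-prefix match of addr in table {prefix: next_hop} (linear scan)."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop)
    return best[1] if best else None


def last_internal_hop(table, router_pop, next_hop, max_depth=8):
    """Follow next hops until one is a known backbone router, or give up."""
    hop = next_hop
    for _ in range(max_depth):  # bound the recursion in case of loops
        if hop in router_pop:
            return hop
        hop = lookup(table, hop)
        if hop is None:
            return None
    return None


def prefix_to_pop(table, router_pop):
    """Build the (BGP destination, POP) map from the two tables."""
    result = {}
    for prefix, next_hop in table.items():
        hop = last_internal_hop(table, router_pop, next_hop)
        if hop is not None:
            result[prefix] = router_pop[hop]
    return result
```

Resolving each unique next hop once and caching the result, as the diagram's "Get Unique Next-Hops" label suggests, would avoid repeating the recursion for every prefix.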

  22. Data Processing • Step 1: Use BGP tables to generate [prefix, egress POP] map • Step 2: Run IP lookup software on packet trace using above map • Output : single trace file for each egress-POP, e.g. all packets headed to POP k from monitored POP • Step 3: Use our traffic analysis tool for statistics evaluation.
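
Step 2 above amounts to a longest-prefix match per packet followed by bucketing by egress POP. The Python sketch below shows one way to do that for IPv4, assuming the [prefix, egress POP] map is a plain dictionary and packets are (timestamp, destination, bytes) tuples. All names here are illustrative, and the in-memory lists stand in for the per-POP trace files.

```python
import ipaddress
from collections import defaultdict


def build_lookup(prefix_pop):
    """Index {prefix: pop} by prefix length for longest-prefix matching (IPv4)."""
    by_len = defaultdict(dict)
    for prefix, pop in prefix_pop.items():
        net = ipaddress.ip_network(prefix)
        by_len[net.prefixlen][int(net.network_address)] = pop
    return by_len


def egress_pop(by_len, dst):
    """Return the POP of the most specific prefix covering dst, or None."""
    addr = int(ipaddress.IPv4Address(dst))
    for plen in sorted(by_len, reverse=True):  # most specific prefix first
        mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
        pop = by_len[plen].get(addr & mask)
        if pop is not None:
            return pop
    return None


def split_by_egress(packets, by_len):
    """Group (timestamp, dst, nbytes) packets into one list per egress POP."""
    out = defaultdict(list)
    for pkt in packets:
        out[egress_pop(by_len, pkt[1])].append(pkt)
    return dict(out)
```

Packets whose destination matches no prefix end up under the key `None`, which makes unmapped traffic easy to measure separately.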

  23. Monitored links at a single POP [Figure labels: Access, Core, Peer 1, Peer 2, ISP, web hosting]

  24. Data • 5 traces collected on Aug 9, 2000 [Table: Access Link Type vs. Trace Length (hours); links: Webhost 1, Webhost 2, Peer 1, Peer 2, ISP; lengths as extracted: 19, 13, 24, 15, 8; row pairing not recoverable]

  25. Traffic Fanout: POP level granularity

  26. Fanout: web host links

  27. Time-of-Day for POP level granularity

  28. Day-Night Variation: Webhost #1 • Traffic reduction at night is between 20% and 50%, depending on the access link

  29. Summary • Wide disparity in “traffic demands” among egress POPs • POPs can be roughly categorized as : small, medium, large; and they maintain their rank during the day. • Traffic is heterogeneous in space yet stable in time. • Traffic varies by (access link, egress POP pair) • Hard to characterize time-of-day behaviour • 20-50% reduction at night

  30. Routing and Traffic Dynamics Project • Part 1: what are the traffic demands between pairs of POPs? • How stable is this demand? • Part 2: what are the paths taken by those demands? • Are link utilization levels similar throughout the backbone? • Part 3: is there a better way to spread the traffic across paths? • At what level of traffic granularity should traffic be split up?

  31. IS-IS Routing Practices

  32. Is backbone traffic balanced?

  33. What we’ve seen so far • Wide disparity in traffic demands between (ingress, egress) POP pairs • Wide disparity in link utilization levels, plus many underutilized routes • Routing policies concentrate traffic on few paths • Question: Can we divert some traffic to the lightly loaded paths?

  34. Routing and Traffic Dynamics Project • Part 1: what are the traffic demands between pairs of POPs? • How stable is this demand? • Part 2: what are the paths taken by those demands? • Are link utilization levels similar throughout the backbone? • Part 3: is there a better way to spread the traffic across paths? • At what level of traffic granularity should traffic be split up?

  35. Creating traffic aggregates • To address issues of splitting traffic over multiple paths, need to define “streams” within traffic • How should packets be aggregated into streams? • Coarse granularity: POP-to-POP • Very fine granularity: use 5-tuple • Initial criterion : destination address prefix

  36. Elephants and Mice among /8 streams • Traffic grouped by egress POPs • Stream: all packets in a group with the same /8 destination address prefix • Ingress: Webhost link
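
To make the stream definition concrete, the Python sketch below sums bytes per /8 destination prefix within one egress-POP group and then picks out the heaviest streams. The slide gives no numeric definition of an "elephant", so the 80% byte-share cutoff used here is purely an assumption for illustration, as are the function names.

```python
import ipaddress
from collections import Counter


def stream_key(dst, prefix_len=8):
    """Destination prefix of the given length that contains dst."""
    return str(ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False))


def stream_volumes(packets, prefix_len=8):
    """Total bytes per stream; packets is an iterable of (dst, nbytes)."""
    volumes = Counter()
    for dst, nbytes in packets:
        volumes[stream_key(dst, prefix_len)] += nbytes
    return volumes


def elephants(volumes, share=0.8):
    """Smallest set of top streams carrying at least `share` of all bytes.

    The threshold is a hypothetical choice, not taken from the talk.
    """
    total = sum(volumes.values())
    picked, running = [], 0
    for stream, nbytes in volumes.most_common():
        if running >= share * total:
            break
        picked.append(stream)
        running += nbytes
    return picked
```

Calling `stream_volumes` with `prefix_len=16` or `prefix_len=24` on the packets of a single elephant gives the finer-grained breakdown needed to look for elephants within elephants.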

  37. Stability of prefix-based aggregates

  38. Observations about prefix-based streams • Recursive: a /8 elephant has a few /16 elephants and many mice; likewise at the /24 level • The phenomenon is less pronounced at the /24 level • Question: Are elephants stable? • Definition: • $R_i(n)$ = the rank of flow $i$ at time slot $n$ • $D_{i,n,k} = |R_i(n) - R_i(n+k)|$ • Each time slot corresponds to 30 minutes
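
The rank-change measure defined on this slide (the rank of flow i in slot n, and the absolute difference from its rank k slots later) can be computed as in the Python sketch below. The input layout (one `{flow: bytes}` dictionary per 30-minute slot), the tie-breaking rule, and the choice to skip flows missing from either slot are assumptions, since the slide does not address them.

```python
def ranks(volumes):
    """Map each flow to its rank by volume (1 = largest); ties broken by name."""
    ordered = sorted(volumes, key=lambda flow: (-volumes[flow], flow))
    return {flow: rank for rank, flow in enumerate(ordered, start=1)}


def rank_changes(slots, k=1):
    """Compute D[i, n, k] = |R_i(n) - R_i(n + k)|.

    slots: list of {flow: bytes}, one dictionary per time slot.
    Returns {(flow, n): D}, only for flows present in both slots n and n + k.
    """
    per_slot = [ranks(slot) for slot in slots]
    changes = {}
    for n in range(len(per_slot) - k):
        later = per_slot[n + k]
        for flow, rank in per_slot[n].items():
            if flow in later:
                changes[(flow, n)] = abs(rank - later[flow])
    return changes
```

A histogram of the returned values for a fixed `k` gives a frequency-of-rank-changes view; values concentrated near zero would indicate stable elephants.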

  39. Frequency of Rank Changes • Conclusion: For load balancing, route elephants along different paths

  40. Conclusions • Monitoring and measurement are key to better network design • IPMon: a passive monitoring system for packet-level information • We have used our data to build components of traffic matrices for traffic engineering • Backbone traffic can be better load-balanced: destination prefix is a possible (simple) criterion

  41. Ongoing Work • Intra-domain Routing: • Choosing IS-IS link weights • Load balancing in the backbone • Flow Characterization • Building Traffic Matrices • POP modeling
