On a New Internet Traffic Matrix (Completion) Problem
Local Traffic Matrices • At an individual router • Gives traffic volumes (number of bytes per time unit: 5 min, 1 hour, 1 day) between every input port and output port on a router • Typical routers have a small number of ports, from 16 to at most 256 • Available measurements • NetFlow-enabled routers provide direct measurements • Routing data • No need for inference!
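A minimal sketch of how a local TM could be tabulated from per-flow measurements at one router. The record layout (input port, output port, bytes), the function name `local_traffic_matrix`, and the sample values are illustrative assumptions, not the actual NetFlow export format.

```python
import numpy as np

def local_traffic_matrix(records, num_ports):
    """Sum bytes per (input port, output port) pair over one time bin."""
    tm = np.zeros((num_ports, num_ports), dtype=np.int64)
    for in_port, out_port, n_bytes in records:
        tm[in_port, out_port] += n_bytes
    return tm

# Hypothetical flow records for one 5-minute bin on a 4-port router.
records = [(0, 1, 1500), (0, 1, 500), (2, 3, 4000), (1, 0, 250)]
tm = local_traffic_matrix(records, num_ports=4)
```

Because every flow is observed directly, each entry is a plain sum and no inference step is involved.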
Intra-Domain Traffic Matrices • For an individual network • Gives traffic volumes (number of bytes per time unit: 5 min, 1 hour, 1 day) between every ingress router/PoP and egress router/PoP in a network • Some of the larger networks can have 1000’s of routers or 100’s of PoPs • Available measurements • SNMP data provide indirect measurements (per link) • Routing data
Intra-Domain TM Inference Problem • Network-wide availability of SNMP data (link loads) • Relying only on SNMP data, solve AX = Y (A: routing matrix; X: unknown traffic matrix, written as a vector of origin-destination demands; Y: link measurements) • In real networks, this is a massively underconstrained problem • Active area of research in 2000-2010 • Zhang, Roughan, Duffield, and Greenberg (2003) • Zhang, Roughan, Lund, and Donoho (2003, 2005)
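To make "underconstrained" concrete, here is a toy sketch on an assumed three-router chain A - B - C with four directed links and six origin-destination demands. The topology, the demand values, and the use of the minimum-norm solution are assumptions for illustration only; this is not the method of the cited papers.

```python
import numpy as np

# Columns: demands A->B, A->C, B->A, B->C, C->A, C->B (hypothetical chain A - B - C).
# Rows: directed links A->B, B->A, B->C, C->B. Entry is 1 if the demand uses the link.
A = np.array([
    [1, 1, 0, 0, 0, 0],   # link A->B carries A->B and A->C
    [0, 0, 1, 0, 1, 0],   # link B->A carries B->A and C->A
    [0, 1, 0, 1, 0, 0],   # link B->C carries A->C and B->C
    [0, 0, 0, 0, 1, 1],   # link C->B carries C->A and C->B
], dtype=float)

x_true = np.array([10.0, 5.0, 3.0, 8.0, 2.0, 7.0])  # made-up demands
y = A @ x_true                                       # what SNMP link counters would report

rank = np.linalg.matrix_rank(A)   # number of independent equations
x_hat = np.linalg.pinv(A) @ y     # minimum-norm solution of A x = y
```

With four independent equations and six unknowns, `x_hat` reproduces the link loads exactly while differing from `x_true`, which is why extra assumptions or measurements are needed to pin down the TM.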
Intra-Domain TM Inference Problem • Applications • Network engineering (capacity planning) • Traffic engineering (what-if scenarios) • Anomaly detection • Enormously useful for daily network operations • Textbook example of theory impacting practice • Things changed around 2010 … • NetFlow-enabled routers are now deployed network-wide and provide direct measurements • Can measure the intra-domain TM directly! • Inference approach is no longer needed!
Example: Abilene Network • High-speed education network • 28 links • 10 Gbps capacity on each link • 11 Points of Presence (PoPs) with NetFlow measurement capabilities
Intra-Domain TM: Open Problems • Synthesis of realistic TMs • Can’t be agnostic about the underlying network! • What information about the underlying network is needed? • Network-related root causes for observed properties of measured TMs • Low-rank, deviations from low-rank • Sparsity • Which measurements are more critical than others for my network?
What can Intra-Domain TMs tell us? • How much of the traffic that enters my network in NYC is destined for ATL (per hour, per day)? • How much of the daily traffic on my network is coming from (which) CDNs? • How much of the hourly traffic that enters my network in NYC and is destined to ATL is coming from Netflix? • How much traffic does my network carry (per hour, per day)?
A Different Set of Questions • How much traffic do Sprint and Verizon exchange with one another (per hour, day)? • How much traffic does Verizon get from Netflix (per day, month)? • What are the networks that exchange the most traffic with Google? • How much does Facebook’s traffic increase on a monthly basis? • How much traffic does the Internet carry per day?
New Problem: Inter-Domain TM • The Internet is a “network of networks” • Individual networks are also called Autonomous Systems (ASes) • Today’s Internet consists of about 30K-40K actively routed ASes • We are getting a clearer picture of the AS-level topology (i.e., which networks exchange routing information with one another and hence presumably also IP traffic) • Inter-domain (or AS-level) traffic matrix • Gives traffic volumes between ASes • Completely unknown …
Inter-Domain TM: Highly Structured • Some numbers … • In 2010 the Internet carried some 20 EB/month • In late 2009, AT&T carried some 20 PB/day • There are some 20 AT&T-like large transit providers in today’s Internet • Some caveats … • Large transit providers use multiple networks to run their business (e.g., Verizon has some 230 ASes) • Need to know how to map ASes to companies
On Inter-Domain TM Completion • Today’s formulation • About 1% of the inter-domain TM elements are responsible for a majority of all the traffic • Inter-domain TM has low rank (does it?) • (Non)standard TM completion problem • Towards tomorrow’s formulation • How to insist on strong validation criteria? • What sort of new measurements are feasible and can be used to check the validity of a solution to today’s formulation of the inter-domain TM completion problem?
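For readers unfamiliar with the standard matrix completion setup, here is a minimal sketch under the low-rank assumption. It alternates a truncated SVD with re-imposing the measured entries; the function name `complete_low_rank`, the 4x4 rank-1 matrix, and the choice of missing entries are hypothetical, and this is one simple completion heuristic, not the formulation proposed in the talk.

```python
import numpy as np

def complete_low_rank(observed, mask, rank=1, num_iters=500):
    """Fill unmeasured entries by alternating a rank-`rank` SVD truncation
    with resetting the measured entries to their observed values."""
    estimate = np.where(mask, observed, 0.0)   # start with zeros in the gaps
    for _ in range(num_iters):
        u, s, vt = np.linalg.svd(estimate, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        estimate = np.where(mask, observed, low_rank)
    return estimate

# Hypothetical 4x4 rank-1 "traffic matrix": outer product of per-AS weights.
weights = np.array([1.0, 2.0, 3.0, 4.0])
true_tm = np.outer(weights, weights)

# Pretend two entries were never measured (mask is True where measured).
mask = np.ones_like(true_tm, dtype=bool)
mask[0, 3] = False
mask[2, 1] = False

completed = complete_low_rank(true_tm, mask, rank=1)
max_error = np.abs(completed - true_tm).max()
```

The sketch only recovers the gaps because the toy matrix is exactly rank 1. Whether the real inter-domain TM is close to low rank is the open question raised above, and it is what validation against new measurements would have to check.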
Internet eXchange Points (IXPs) • [Figure: diagram of an IXP, labeled with Content Provider 1, Content Provider 2, AS1 through AS5, and a layer-2 switch]
Inter-Domain TM and IXPs • Some numbers … • There are some 300 IXPs worldwide that see some 10-20% of all Internet traffic • They involve some 4K ASes • Most IXPs publish their hourly/daily total traffic volume • We are getting more and more accurate peering matrices for these 300 IXPs • New Twist … • How to infer the local TM at each IXP? • How to measure the local TM at each IXP?
Back to Inter-Domain TM Completion • Tomorrow’s formulation • Start with today’s formulation • Accounts for large transit providers • Incorporate IXP-specific information • Accounts for large content providers • New (non)standard TM completion problem • … and repeat • What other sources of new measurements? • Promising candidates: CDNs (Akamai & co.) • What types of measurements are more critical than others?
Summary • Intra-domain TM research • Beautiful example of innovative research with enormous practical benefits for network operators • The intra-domain TM of an AS is a basic ingredient for a first-principles approach to understanding the AS’s router-level topology (forget “Network Science” …) • Reminder that “change changes things” • Inter-domain TM research • Enormous practical value • Adds new twist to generic matrix completion problem • The inter-domain TM as critical ingredient for a first-principles approach to understanding the Internet’s AS-level topology (TBD)