Locating Internet Bottlenecks: Algorithms, Measurement, and Implications Ningning Hu (CMU) Li Erran Li (Bell Lab) Zhuoqing Morley Mao (U. Mich) Peter Steenkiste (CMU) Jia Wang (AT&T) SIGCOMM’04
Goal
• Locate network bottlenecks along end-to-end paths
• With this information, network operators can improve routing
Difficulties
• End users cannot obtain information about network internals
• High measurement overhead
Proposed algorithm – Pathneck
• Pathneck is an active probing tool
• Low overhead (on the order of tens to hundreds of KB)
• Fast (on the order of seconds)
• Single-end control (sender only)
• High accuracy
Outline • Algorithm • Internet validation • Testbed validation • Internet measurement • Applications • Conclusion
Definition
• Bottleneck link: the link with the smallest available bandwidth on the path
• Available bandwidth: the residual (unused) bandwidth of a link
• Choke link: a link with lower available bandwidth than the partial path from the source to that link
• Choke point: the upstream router of a choke link
Definition
• The last choke link on the path is the bottleneck link
• (Figure: a path R1–R7 with links L1–L6; two choke links are marked, and the last one is the bottleneck)
Recursive Packet Train (RPT) in Pathneck
• Load packets: 60 UDP packets of 500 B each with TTL 255, used to measure available bandwidth
• Measurement packets: 30 UDP packets of 60 B each on both sides of the load packets, with TTLs 1–30 at the head mirrored as 30–1 at the tail, used to obtain location information
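The train layout above can be sketched in code. This is a minimal illustration, not the actual Pathneck implementation; the function name and the (ttl, size) tuple representation are invented here.

```python
# Sketch of the Recursive Packet Train (RPT) layout: 30 measurement
# packets (60 B) with TTL 1..30 at the head, 60 load packets
# (500 B, TTL 255) in the middle, and 30 mirrored measurement
# packets (TTL 30..1) at the tail. Router i on the path therefore
# drops one packet from each end and returns two ICMP messages.

def build_rpt(num_measure=30, num_load=60,
              measure_size=60, load_size=500):
    """Return the train as a list of (ttl, size_in_bytes) tuples."""
    head = [(ttl, measure_size) for ttl in range(1, num_measure + 1)]
    load = [(255, load_size)] * num_load
    tail = [(ttl, measure_size) for ttl in range(num_measure, 0, -1)]
    return head + load + tail

train = build_rpt()
```

The mirrored tail TTLs are what let a single router generate the pair of ICMP messages whose spacing becomes the gap value.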
Gap value
• The sender launches the packet train toward the destination
• Each router drops the head and tail measurement packets whose TTL expires and sends an ICMP time-exceeded message for each
• The gap value at a router is the time between the sender's receipt of the two ICMP messages from that router
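The gap computation above can be sketched as follows. The data structure and function names are hypothetical; Pathneck itself works on raw ICMP timestamps.

```python
# Sketch: derive per-hop gap values from the sender's ICMP arrival
# timestamps. For hop i, the gap is the time between the ICMP
# triggered by the head measurement packet (TTL = i) and the ICMP
# triggered by the matching tail packet.

def gap_values(icmp_times):
    """icmp_times maps hop number -> (t_head_icmp, t_tail_icmp)
    in seconds; returns hop number -> gap value in seconds."""
    return {hop: tail - head for hop, (head, tail) in icmp_times.items()}

# Toy data: the gap widens between hop 1 and hop 2, which would
# suggest a choke link at hop 2.
times = {1: (0.0100, 0.0140), 2: (0.0200, 0.0260), 3: (0.0300, 0.0360)}
gaps = gap_values(times)
```

A gap sequence that steps up at some hop and stays up afterwards is the signature Pathneck looks for.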
Train length
• Link capacity
  • train_rate > a_bw: train_length increases
  • train_rate ≤ a_bw: train_length stays the same
• Traffic load
  • Heavily loaded link: train_length increases
  • Lightly loaded link: train_length stays the same
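A toy model of this behavior (an assumption-laden sketch, not the paper's analysis): a link forwards the train no faster than its available bandwidth, so the train stretches whenever its arrival rate exceeds a_bw, and is otherwise preserved.

```python
# Illustrative model of how one link changes the train length:
# if the train's arrival rate exceeds the link's available
# bandwidth, the link stretches the train to match that bandwidth;
# otherwise the length passes through unchanged.

def train_length_after_link(length_s, train_bytes, avail_bw_bps):
    train_rate = train_bytes * 8 / length_s
    if train_rate > avail_bw_bps:
        # Link can only drain the train at avail_bw: train stretches.
        return train_bytes * 8 / avail_bw_bps
    return length_s  # lightly loaded link keeps the length

# A 33,600-byte train (60 x 500 B load + 60 x 60 B measurement)
# entering at 100 Mb/s, then crossing a 50 Mb/s and an 80 Mb/s link:
length0 = 33600 * 8 / 100e6                              # ~2.7 ms
length1 = train_length_after_link(length0, 33600, 50e6)  # stretches
length2 = train_length_after_link(length1, 33600, 80e6)  # unchanged
```

Once stretched by the 50 Mb/s link, the train is slow enough that the later 80 Mb/s link leaves it alone, which is exactly why only links tighter than the path so far (choke links) show up as gap increases.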
Transmission of RPT
• (Figure: as the train travels S → R1 → R2 → R3, each router decrements every TTL, drops the head and tail packets whose TTL reaches 0, and the sender records gap values g1, g2, g3)
• The gap values are the raw measurement
Inference Model – Step 1
• Label the gap sequence
  • Discard a hop's data if both of its ICMP responses are not received
  • Discard the entire probing if fewer than half the routers on the path respond
  • Fix hill and valley points
• Given a number of steps, minimize the total distance between the individual gap values and the average value of their step
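The step-minimization objective can be illustrated with a brute-force fit. Pathneck's own fitting procedure differs; this sketch only shows the objective of choosing boundaries so each gap value is close to its step's average.

```python
# Brute-force sketch of the Step-1 fitting objective: choose segment
# boundaries so that the total distance between each gap value and
# its segment average is minimized.
from itertools import combinations

def fit_steps(values, num_steps):
    """Return (boundaries, cost); boundaries are segment edges
    (0, ..., len(values)) minimizing the total absolute distance."""
    n = len(values)
    best_cost, best_bounds = float("inf"), None
    for splits in combinations(range(1, n), num_steps - 1):
        bounds = (0,) + splits + (n,)
        cost = 0.0
        for a, b in zip(bounds, bounds[1:]):
            seg = values[a:b]
            avg = sum(seg) / len(seg)
            cost += sum(abs(v - avg) for v in seg)
        if cost < best_cost:
            best_cost, best_bounds = cost, bounds
    return best_bounds, best_cost

gaps = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
bounds, cost = fit_steps(gaps, 2)
# The best 2-step fit splits after the third gap value.
```

Brute force is exponential in the number of steps; it is fine here because the point is only to make the "minimize total distance to step averages" criterion concrete.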
Inference Model – Step 2
• Confidence threshold (conf)
  • The percentage change in available bandwidth implied by a gap change
  • Filters out gap measurement noise
  • Default: conf ≥ 10% available-bandwidth change
• Detection rate (d_rate)
  • # positive probings / # total probings
  • A hop must appear as a choke point at least M times out of N probings (d_rate ≥ M/N)
  • Selects the most frequent choke points
  • Default: d_rate ≥ 5/10 = 50%
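The d_rate filter can be sketched as follows. Names are illustrative, and the per-probing conf ≥ 10% filtering is assumed to have already produced each probing's candidate set.

```python
# Sketch of the Step-2 detection-rate filter: a hop is reported as a
# choke point only if it passed the per-probing confidence filter in
# at least d_rate of the probings.

def choke_points(per_probe_candidates, num_probes,
                 d_rate_threshold=0.5):
    """per_probe_candidates: one set of candidate hop numbers per
    probing (hops that passed the conf filter in that probing).
    Returns the hops whose detection rate meets the threshold."""
    counts = {}
    for candidates in per_probe_candidates:
        for hop in candidates:
            counts[hop] = counts.get(hop, 0) + 1
    return {hop for hop, c in counts.items()
            if c / num_probes >= d_rate_threshold}

# Toy run of 10 probings: hop 3 appears 8/10 times, hop 7 only 3/10.
probes = [{3, 7}, {3}, {3, 7}, {3}, {7}, {3}, {3}, {3}, set(), {3}]
stable = choke_points(probes, 10)
```

With the default 50% threshold, only hop 3 survives; hop 7 is treated as noise.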
Inference Model – Step 3
• Rank the choke points
• The bottleneck is the choke point with the largest gap value
Pathneck – configuration
• Each probing set contains 30–100 packets
• Probe the same destination 6–10 times
• Each probing takes one RTT (the sender waits up to 3 seconds, the assumed maximum RTT)
• Filter with conf ≥ 10%
• Filter with d_rate ≥ 50%
Output from Pathneck
• Bottleneck location (the last choke point)
• Upper or lower bounds on link available bandwidth
  • Based on the gap values from each router (details in the paper)
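One simple way to turn a gap value into a rate, as a back-of-the-envelope conversion: treat the train as being drained at rate = train size / gap. This is only illustrative; the paper's exact bounding rules are more involved.

```python
# Illustrative conversion from a gap value to an implied rate:
# a train of train_bytes spread over gap_seconds at a router
# corresponds to a forwarding rate of train_bytes * 8 / gap_seconds.

def rate_from_gap(train_bytes, gap_seconds):
    """Rate in bits/s implied by the train spanning gap_seconds."""
    return train_bytes * 8 / gap_seconds

# A 33,600-byte train spread over 5.376 ms implies ~50 Mb/s.
rate = rate_from_gap(33600, 0.005376)
```

Whether such a figure is an upper or a lower bound for a given link depends on where the gap change happened along the path, which is what the per-router analysis in the paper resolves.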
Limitations
• Cannot measure the last hop
• Routers rate-limit ICMP generation
• ICMP packet generation time and reverse-path congestion can introduce measurement error
  • Generation time is insignificant
  • Measurement outliers are filtered out
• (Figure: the measured gap value, taken between ICMP arrivals at the sender, can differ from the true gap value at the router)
Limitations
• Packet loss and route changes invalidate a measurement
  • Multiple probings help
• Cannot pass firewalls
  • Shared with most other tools; firewalls are usually not the bottleneck
• Biased toward early choke points
  • Insignificant changes at later hops are filtered out by the confidence threshold
Validation • Internet validation • Abilene network • Testbed validation • Emulab, a fully controlled environment
Internet validation (Abilene)
• Sources: CMU and University of Utah
• 22 probing destinations for each source
• Each of the 11 major routers on the Abilene backbone is included in at least one probing path
• Each destination is probed 100 times, with a 2-second interval between consecutive probings
Internet validation (Abilene)
• Only 5 non-first-hop bottlenecks detected
  • Abilene paths are over-provisioned
• The detected bottlenecks are outside the Abilene network, so they could not be verified
Testbed validation (Emulab)
• 100 probing sets
• Use only probings for which all ICMP responses were received
• The entire probing run takes about 1 minute
Comparing the impact of capacity and load
• Left figure
  • Fix X at 50 Mbps
  • Vary Y from 21 to 30 Mbps in steps of 1 Mbps
• Right figure
  • Set both X and Y to 50 Mbps
  • Vary the CBR load on Y from 29 to 20 Mbps
  • The bottleneck available bandwidth thus changes from 21 to 30 Mbps
Testbed validation (Emulab)
• Every probing set identifies Y as the bottleneck
• Of 86 individual probings: 7 X (correct), 65 Y (correct), 14 X (incorrect)
  • Errors are due to the small bandwidth difference
Testbed validation (Emulab)
• Of the individual probings: 67 X (correct), 2 Y (correct), 8 X (incorrect)
  • Errors are due to the small bandwidth difference
Measurement Methodology
• Probing sources
  • 58 probing sources (from PlanetLab and RON)
• Probing destinations
  • Over 3,000 destinations for each source
  • Chosen to cover as many distinct AS paths as possible
• 10 probings for each destination
• conf ≥ 10%, d_rate ≥ 50%
• The whole measurement takes under 2 days
Popularity
• Fewer than 2% of paths report more than 3 choke links
• Popularity = # positive probings of link b / # probings that traverse link b
• Half of the choke links are detected in 20% or fewer of the probings that traverse them
• Detection sometimes fails due to bursty traffic (such cases are filtered out)
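The popularity metric above can be computed directly; the link labels in this sketch are invented for illustration.

```python
# Sketch of the popularity metric:
#   popularity(b) = (# probings in which link b tested positive as a
#                    choke link) / (# probings that traverse link b)

def popularity(positive_counts, traverse_counts):
    """positive_counts / traverse_counts map link -> probing counts;
    links never traversed are skipped to avoid dividing by zero."""
    return {link: positive_counts.get(link, 0) / n
            for link, n in traverse_counts.items() if n > 0}

# Toy data: L1 is a popular choke link, L2 is occasional, L3 never is.
pos = {"L1": 8, "L2": 2}
trav = {"L1": 10, "L2": 10, "L3": 5}
pop = popularity(pos, trav)
```

Normalizing by traversals rather than by total probings is what makes rarely-traversed links comparable with heavily-traversed ones.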
Bottleneck Distribution
• Common assumption: bottlenecks are most likely to appear on peering and access links, i.e., on inter-AS links
• Identifying inter-/intra-AS links
  • Using AS numbers alone is not enough (Mao et al. [SIGCOMM'03])
  • We define intra-AS links as links at least one hop away from links where the AS number changes
  • Two types of inter-AS links: Inter0-AS and Inter1-AS links
  • This identifies a subset of the real intra-AS links
Bottleneck Distribution (cont.)
• Up to 40% of bottleneck links are intra-AS
• Consistent with earlier results [Akella et al., IMC'03]
Stability
• Randomly sample 30 destinations
• Divide a 3-hour measurement into 9 epochs of 20 minutes each
• In each epoch, run 5 probing trains
Conclusion
• Pathneck locates bottlenecks effectively and efficiently
  • Requires only sender-side modification; low overhead
• Up to 40% of bottleneck links are intra-AS
• 54% of bottlenecks can be inferred correctly
• Can guide overlay routing and multihoming
References
• http://www.cs.cmu.edu/~hnn/pathneck
• Ningning Hu et al., "Locating Internet Bottlenecks: Algorithms, Measurements, and Implications," SIGCOMM'04
• Related technical report
Inference
• Helps reduce measurement overhead
• 54% of inferences are successful for the 12,212 paths with "enough information"
• (Figure: a path S → R → R → R → R → R → D)
Inference
• Take the lowest upper bound and the highest lower bound
  • Include an upper bound only if its standard deviation is less than 20% of its average
• Divide the data into a training set and a testing set
  • Exclude paths for which the testing set cannot identify the bottleneck