Performance Measurement on Large Bandwidth-Delay Product Networks
3rd e-VLBI Workshop, October 6, 2004, Makuhari, Japan
Masaki Hirabaru, masaki@nict.go.jp
NICT Koganei
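An Example: How much speed can we get?
a-1) [Diagram: Sender and Receiver connected through an L2/L3 SW and a High-Speed Backbone; link labels GbE, 100M, GbE; RTT 200 ms]
a-2) [Diagram: Sender and Receiver connected through two SWs and a High-Speed Backbone; link labels GbE, GbE, 100M, 100M; RTT 200 ms]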
This is TCP's fundamental behavior.
Average TCP throughput: less than 20 Mbps
In case we limit the sending rate at 100 Mbps
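As a rough illustration of why a single loss is so costly on a 200 ms path, the sketch below estimates how long Reno-style TCP needs to regain its full window after one loss. The 1500-byte packet size and the textbook AIMD rule (halve on loss, then grow by one packet per round trip) are assumptions for illustration, not figures taken from the slides.

```python
# Rough estimate of Reno-style AIMD recovery time after a single loss.
# Assumptions (not from the slides): 1500-byte packets; the window halves
# on loss and then grows by one packet per round trip.

def bdp_packets(rate_bps: float, rtt_s: float, pkt_bytes: int = 1500) -> float:
    """Bandwidth-delay product expressed in packets."""
    return rate_bps * rtt_s / (8 * pkt_bytes)


def recovery_time_s(rate_bps: float, rtt_s: float, pkt_bytes: int = 1500) -> float:
    """Seconds needed to grow from half the full window back to the full window."""
    window = bdp_packets(rate_bps, rtt_s, pkt_bytes)
    return (window / 2) * rtt_s  # one packet regained per RTT


if __name__ == "__main__":
    for rate in (100e6, 1e9):
        w = bdp_packets(rate, 0.2)
        t = recovery_time_s(rate, 0.2)
        print(f"{rate / 1e6:.0f} Mbps, RTT 200 ms: "
              f"window ~{w:.0f} packets, recovery ~{t:.0f} s")
```

Under these assumptions the 100 Mbps, 200 ms path needs a window of roughly 1,667 packets and close to three minutes to recover from one loss, so repeated losses keep the average rate far below the link speed.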
An Example (2)
b) [Diagram: Sender and Receiver connected by GbE through a High-Speed Backbone; only 900 Mbps available; RTT 200 ms]
Purposes
• Measure, analyze and improve end-to-end performance in high bandwidth-delay product, packet-switched networks
  • to support networked science applications
  • to help operations in finding a bottleneck
  • to evaluate advanced transport protocols (e.g. Tsunami, SABUL, HSTCP, FAST, XCP, [ours])
• Improve TCP under easier conditions
  • with a single TCP stream
  • memory to memory
  • bottleneck but no cross traffic
• Consume all the available bandwidth
TCP on a path with bottleneck
[Diagram: a bottleneck whose queue overflows, causing loss]
The sender may generate burst traffic.
The sender recognizes the overflow only after a delay greater than RTT/2.
The bottleneck may change over time.
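To put a number on the feedback delay, the following sketch computes a lower bound on how much data a sender has already emitted before any sign of a loss can reach it. The constant sending rate is an assumption, and the 1 Gbps and 200 ms values are borrowed from the earlier example slides.

```python
# Data a sender emits before it can possibly learn of a loss at the bottleneck.
# Assumption (not from the slides): the sender transmits at a constant rate
# and the loss signal needs at least half a round trip to come back.

def bytes_before_feedback(rate_bps: float, rtt_s: float) -> float:
    """Lower bound on bytes sent during the RTT/2 feedback delay."""
    return rate_bps * (rtt_s / 2) / 8


if __name__ == "__main__":
    sent = bytes_before_feedback(1e9, 0.2)
    print(f"{sent / 1e6:.1f} MB, about {sent / 1500:.0f} packets of 1500 bytes")
```

At 1 Gbps and 200 ms RTT this is 12.5 MB, far more than a switch buffer of a few hundred packets can hold.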
• Web100 (http://www.web100.org)
  • A kernel patch for monitoring/modifying TCP metrics in the Linux kernel
  • We need to know TCP behavior to identify a problem.
• Iperf (http://dast.nlanr.net/Projects/Iperf/)
  • TCP/UDP bandwidth measurement
• bwctl (http://e2epi.internet2.edu/bwctl/)
  • Wrapper for iperf with authentication and scheduling
• tcpplot
  • visualizer for web100 data
1st Step: Tuning a Host with UDP
• Remove any bottlenecks on a host
  • CPU, Memory, Bus, OS (driver), …
• Dell PowerEdge 1650 (*not enough power)
  • Intel Xeon 1.4 GHz x1(2), Memory 1 GB
  • Intel Pro/1000 XT onboard PCI-X (133 MHz)
• Dell PowerEdge 2650
  • Intel Xeon 2.8 GHz x1(2), Memory 1 GB
  • Intel Pro/1000 XT PCI-X (133 MHz)
• Iperf UDP throughput 957 Mbps
  • GbE wire rate: headers: UDP(8B)+IP(20B)+EthernetII(38B)
• Linux 2.4.26 (RedHat 9) with web100
  • PE1650: TxIntDelay=0
2nd Step: Tuning a Host with TCP
• Maximum socket buffer size (TCP window size)
  • net.core.wmem_max net.core.rmem_max (64MB)
  • net.ipv4.tcp_wmem net.ipv4.tcp_rmem (64MB)
• Driver descriptor length
  • e1000: TxDescriptors=1024 RxDescriptors=256 (default)
• Interface queue length
  • txqueuelen=100 (default)
  • net.core.netdev_max_backlog=300 (default)
• Interface queue discipline
  • fifo (default)
• MTU
  • mtu=1500 (IP MTU)
• Iperf TCP throughput 941 Mbps
  • GbE wire rate: headers: TCP(32B)+IP(20B)+EthernetII(38B)
• Linux 2.4.26 (RedHat 9) with web100
• Web100 (incl. High Speed TCP)
  • net.ipv4.web100_no_metric_save=1 (do not store TCP metrics in the route cache)
  • net.ipv4.WAD_IFQ=1 (do not send a congestion signal on buffer full)
  • net.ipv4.web100_rbufmode=0 net.ipv4.web100_sbufmode=0 (disable auto tuning)
  • net.ipv4.WAD_FloydAIMD=1 (HighSpeed TCP)
  • net.ipv4.web100_default_wscale=7 (default)
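The wire-rate figures quoted for Iperf (957 Mbps for UDP on the previous slide, 941 Mbps for TCP here) can be checked from the listed header sizes. The sketch below does the arithmetic; the breakdown of the 38-byte Ethernet overhead and the reading of the 32-byte TCP header as 20 bytes plus a 12-byte timestamp option are assumptions.

```python
# Maximum payload throughput on Gigabit Ethernet for a given header overhead.
# Header sizes and the 1500-byte IP MTU are the ones listed on the slides.
# Assumption: the 38-byte Ethernet II overhead is preamble (8) + header (14)
# + FCS (4) + inter-frame gap (12).

LINK_BPS = 1_000_000_000  # GbE line rate
MTU = 1500                # IP MTU in bytes
IP_HDR = 20               # IPv4 header without options
ETH_OVERHEAD = 38         # per-frame Ethernet overhead on the wire


def goodput_mbps(transport_hdr: int) -> float:
    """Payload rate in Mbps when every frame carries a full MTU."""
    payload = MTU - IP_HDR - transport_hdr
    on_wire = MTU + ETH_OVERHEAD
    return LINK_BPS * payload / on_wire / 1e6


if __name__ == "__main__":
    print(f"UDP (8-byte header):  {goodput_mbps(8):.1f} Mbps")
    print(f"TCP (32-byte header): {goodput_mbps(32):.1f} Mbps")
```

Both results round to the measured values, which suggests the tuned hosts are running at line rate and are no longer the bottleneck.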
TransPAC/I2 Test: High Speed TCP (60 mins) From Tokyo to Indianapolis
Test in a Laboratory – with Bottleneck
[Diagram: hosts PE 2650 and PE 1650 (Sender, Receiver) connected through an L2SW (FES12GCF) and a Network Emulator; link labels GbE/T, GbE/T, GbE/SX]
Network Emulator: Bandwidth 800 Mbps, Delay 88 ms, Loss 0
2*BDP = 16MB
BDP: Bandwidth Delay Product
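The bandwidth-delay product for this setup can be worked out as follows. This is a sketch that assumes the 88 ms emulator delay is the round-trip time and uses 1500-byte packets; the slide does not state either. It also shows what fraction of the BDP a 100-packet or 1000-packet bottleneck buffer (the sizes used in the later tests) would cover.

```python
# Bandwidth-delay product for the emulated path.
# Assumptions (not stated on the slide): the 88 ms delay is the round-trip
# time, and packets are 1500 bytes.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes in flight needed to keep the path full."""
    return bandwidth_bps * rtt_s / 8


if __name__ == "__main__":
    bdp = bdp_bytes(800e6, 0.088)
    print(f"BDP   = {bdp / 1e6:.1f} MB ({bdp / 2**20:.1f} MiB)")
    print(f"2*BDP = {2 * bdp / 1e6:.1f} MB ({2 * bdp / 2**20:.1f} MiB)")

    bdp_pkts = bdp / 1500
    for buf in (100, 1000):
        print(f"{buf}-packet buffer = {100 * buf / bdp_pkts:.1f}% of BDP")
```

Under these assumptions 2*BDP comes to 17.6 MB, or about 16.8 MiB, which is in line with the 16MB figure on the slide.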
Laboratory Tests: 800 Mbps Bottleneck
[Plots: TCP NewReno (Linux); HighSpeed TCP (Web100)]
BIC TCP
[Plots: buffer size 100 packets; buffer size 1000 packets]
FAST TCP
[Plots: buffer size 100 packets; buffer size 1000 packets]
Identify the Bottleneck
• existing tools: pathchar, pathload, pathneck, etc.
  • Available bandwidth along the path
• How large is the bottleneck (router) buffer?
• pathbuff (under development)
  • measuring buffer size at the bottleneck
  • sending a packet train, then detecting loss and delay
A Method of Measuring Buffer Size
[Diagram: the Sender sends a packet train of n packets to the Receiver across a network with a bottleneck; labels: T, Capacity C]
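One way to see how a packet train can reveal a buffer size is the idealized model below. It is only a sketch under stated assumptions, not the pathbuff algorithm: equally sized packets arrive at a constant rate above the capacity C, the bottleneck is a single drop-tail FIFO, and there is no cross traffic. The function names `first_loss_index` and `estimate_buffer` are hypothetical.

```python
# Idealized packet-train model: a train sent faster than the bottleneck
# capacity fills a drop-tail queue, and the position of the first lost
# packet indicates the buffer size. Illustrative only; not pathbuff itself.

def first_loss_index(n, rate_in, capacity, buffer_pkts, pkt_bits=12000):
    """Index of the first dropped packet in an n-packet train, or None."""
    gap_in = pkt_bits / rate_in    # spacing between arrivals (s)
    service = pkt_bits / capacity  # time to forward one packet (s)
    busy_until = 0.0               # time at which the queue would drain
    for i in range(n):
        t = i * gap_in
        queued = max(0.0, busy_until - t) / service  # backlog in packets
        if queued + 1 > buffer_pkts:                 # no room for this packet
            return i
        busy_until = max(busy_until, t) + service
    return None


def estimate_buffer(loss_index, rate_in, capacity):
    """Invert the model: backlog grows by (1 - C/R) packets per arrival."""
    return loss_index * (1 - capacity / rate_in) + 1


if __name__ == "__main__":
    k = first_loss_index(2000, 1e9, 800e6, buffer_pkts=100)
    print("first lost packet:", k)
    print("estimated buffer (packets):", round(estimate_buffer(k, 1e9, 800e6)))
```

A train that is too short never fills the queue and shows no loss, so in this model the train length n has to exceed roughly B / (1 - C/R) packets. A real tool would also have to cope with cross traffic and sender burstiness, and would use the delay measurements as well as loss.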
Typical cases of congestion points
[Diagram labels: Switch, Router, Router]
• Congestion point with small buffer (~100 packets)
  • Inexpensive, but…
  • Poor TCP performance for high BW delay path
• Congestion point with large buffer (>=1000 packets)
  • Better TCP performance for high BW delay path
Summary
• Performance measurement to get a reliable result and identify a bottleneck
• Bottleneck buffer size impact on the result
Future Work
• Performance measurement platform in cooperation with applications
Network Diagram for e-VLBI and test servers
[Diagram labels: Korea (KOREN): Seoul XP, Daejon, Taegu, Kwangju, Busan; Japan (JGNII): Kashima, Koganei, Tokyo XP, Kitakyushu, Fukuoka, Genkai XP; US (Abilene): Chicago, MIT Haystack, Indianapolis, Los Angeles, Washington DC; international links: APII/JGNII, TransPAC / JGN II 10G, GEANT 2.5G SONET; servers: bwctl server, perf server, e-vlbi server; link speeds from 1G to 10G; distances from 100 km to 9,000 km]
*Performance Measurement Point Directory: http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html