Network Measurements

Network Measurements Les Cottrell – SLAC Lecture # 3 presented at the Workshop on Scientific Information in the Digital Age: Access and Dissemination ICTP, Trieste, Italy October , 2009 www.slac.stanford.edu/grp/scs/net/talk09/ictp-measure.ppt

Overview • Why is measurement difficult yet important? • LAN vs WAN • SNMP • Effects of measurement interval • Passive • Active • Tools • Trouble shooting • Tools, how to find things & who to tell

Why is measurement difficult? • Internet's evolution as a composition of independently developed and deployed protocols, technologies, and core applications • Diversity, highly unpredictable, hard to find “invariants” • Rapid evolution & change, no equilibrium so far • Findings may be out of date • Measurement not high on vendors' list of priorities • Resources/skills focus on more interesting and profitable issues • Tools lacking or inadequate • Implementations are flaky & not fully tested with new releases • ISPs worried about providing access to core, making results public, & privacy issues • The connection-oriented phone model (Poisson distributions of session length etc.) does not work for Internet traffic (heavy tails, self-similar behavior, multi-fractals etc.)

Why is measurement important? • End users & network managers need to be able to identify & track problems • Choosing an ISP, setting a realistic service level agreement, and verifying it is being met • Choosing routes when more than one is available • Setting expectations: • Deciding which links need upgrading • Deciding where to place collaboration components such as a regional computing center, software development • How well will an application work (e.g. VoIP)

LAN vs WAN • Measuring the LAN • Network admin has control so: • Can read MIBs from devices • Can within limits passively sniff traffic • Know the routes between devices • Manually for small networks • Automated for large networks • Measuring the WAN • No admin control, unless you are an ISP • Can't read information out of routers • May not be able to sniff/trace traffic due to privacy/security concerns • Don’t know route details between points: they may change, are not under your control, though you may be able to deduce some of them • So typically have to make do with what can be measured from end to end, with very limited information from intermediate equipment hops.

SNMP (Simple Network Management Protocol) • Example of an Application, usually built on UDP • De facto standard for network management • Created by IETF to address short term needs of TCP/IP • Consists of: • Management Information Bases (MIBs) • Store information about managed objects (host, router, switch etc.): system & status info, performance & configuration data • Remote Network Monitoring (RMON) is a management tool for passively watching line traffic • SNMP communication protocol to read out data and set parameters • Polling protocol, manager asks questions & agent responds

SNMP Model • NMS (Network Management Station) contains manager software to send & receive SNMP messages to Agents • Agent is a software component residing on a managed node, responds to SNMP queries, performs updates & reports problems • MIB resides on nodes and at NMS and is a logical description of all network management data. • [Diagram: Network Management Station (NMS) and several managed nodes, each with an Agent and MIB, attached to a TCP/IP net]

SNMP Examples • Using MRTG to display Router bits/s MIB variable CERN trans- Atlantic traffic
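A bits/s plot of this kind is derived from successive readings of an interface octet counter. The following is a minimal Python sketch of that arithmetic, not MRTG's actual code: the function name, the sample counter values and the 32-bit counter width are assumptions chosen for illustration.

```python
def bits_per_second(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bit rate between two polls of an octet counter.

    Assumes the counter wrapped at most once between the two polls.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 2 ** counter_bits
    # Modular subtraction gives the right delta even after one wrap.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


# Two hypothetical polls taken 300 s (5 minutes) apart.
normal_rate = bits_per_second(1_000_000, 38_500_000, 300)
# The counter wrapped past 2**32 between these two polls.
wrapped_rate = bits_per_second(2**32 - 1000, 2000, 300)
print(normal_rate, wrapped_rate)
```

On a fast link a 32-bit counter can wrap more than once in 5 minutes, in which case this calculation silently under-reports the rate.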

Averaging intervals • Typical measurements of utilization are made for 5 minute intervals or longer in order not to create much impact. • Human interactions require second or sub-second response • So it is interesting to see the difference between measurements made with different time frames.

Averages vs maxima • Maximum of all 5 sec samples can be factor of 2 or more greater than the average over 5 minutes
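The gap between average and maximum can be illustrated with a small Python sketch. The sample values below are invented (a mostly quiet link with a short burst), so the exact ratio is only an example of the effect, not a measurement.

```python
def utilization_summary(samples_bps):
    """Return (average, peak, peak/average) for a list of rate samples."""
    avg = sum(samples_bps) / len(samples_bps)
    peak = max(samples_bps)
    return avg, peak, peak / avg


# Hypothetical data: 60 five-second samples = one 5 minute interval.
# 50 samples at 10 Mbits/s and a 50 second burst at 90 Mbits/s.
samples = [10e6] * 50 + [90e6] * 10
avg, peak, ratio = utilization_summary(samples)
print(f"5 min average {avg / 1e6:.1f} Mbits/s, "
      f"5 s maximum {peak / 1e6:.1f} Mbits/s, ratio {ratio:.1f}")
```

A 5 minute average would report this link as about a quarter loaded even though it ran at the burst rate for 50 seconds.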

Passive vs. Active Monitoring • Active injects traffic on demand • Passive watches things as they happen • Network device records information • Packets, bytes, errors … kept in MIBs retrieved by SNMP • Devices (e.g. probe) capture/watch packets as they pass • Router, switch, sniffer, host in promiscuous mode (tcpdump) • Complementary to one another: • Passive: • does not inject extra traffic, measures real traffic • Polling to gather data generates traffic, also gathers large amounts of data • Active: • provides explicit control on the generation of packets for measurement scenarios • testing what you want, when you need it. • Injects extra artificial traffic • Can do both, e.g. start active measurement and look at it passively

Passive tools • SNMP • Hardware probes: e.g. Sniffer, can be stand-alone or remotely accessed from a central management station • Software probes: snoop, WireShark, tcpdump, require promiscuous access to the NIC, i.e. root/sudo access • Flow measurement: SFlow, OCxMon/CoralReef, Cisco/Netflow

Example: Passive site border monitoring • Use Cisco Netflow in Catalyst 6509 on SLAC border • Gather about 200MBytes/day of flow data • The raw data records include source and destination addresses and ports, the protocol, packet, octet and flow counts, and start and end times of the flows • Much less detailed than saving headers of all packets, but good compromise • Top talkers history and daily (from & to), tlds, vlans, protocol and application utilization • Use for network & security

E.g. SLAC Traffic by collaboration site • [Figure: IN and OUT traffic in Gbits/s (axis 0.0 to 1.0 in each direction), last 2 weeks in May 2009; labeled sites: IN2P3, CNAF, MPI, BNL (LHC ATLAS)]

E.g. Top talkers by protocol • Volume dominated by single application: bbftp • [Figure: MBytes/day by hostname, log scale from 1 to 10000]

Flow sizes • Heavy tailed, in ~ out, UDP flows shorter than TCP, packet ~ bytes • 75% TCP-in < 5 kBytes, 75% TCP-out < 1.5 kBytes (< 10 pkts) • UDP 80% < 600 Bytes (75% < 3 pkts), ~10 times more TCP than UDP • Top UDP = AFS (>55%), Real (~25%), SNMP (~1.4%) • Just 2 parameters, power law slope & intercept, characterize traffic flows • [Figure: flow size distributions with labeled features: SNMP, Real A/V, AFS file server]

Flow lengths • 60% of TCP flows less than 1 second • Would expect TCP streams longer lived • But 60% of UDP flows over 10 seconds, maybe due to heavy use of AFS

Some Active Measurement Tools • Ping: connectivity, RTT, loss, jitter, reachability • flavors of ping, fping • but blocking & rate limiting • Alternative synack, but can look like DoS attack • Traceroute • How it works, what it provides • Reverse traceroute servers • Traceroute archives • Combining ping & traceroute: • traceping, pingroute • Pathchar, pchar, pipechar, bprobe etc. • Iperf, netperf, ttcp, FTP …

Ping • ICMP client/server application built on IP • Client sends ICMP echo request, server sends reply • Server usually in kernel, so reliable & fast • User can specify number of data bytes. Client puts timestamp in data bytes. Compares timestamp with time when echo comes back to get RTT • Many flavors (e.g. fping) and options • packet length, number of tries, timeout, separation … • Ping localhost (127.0.0.1) first, then gateway IP address etc. • [Packet layout, bits 0 to 31: Type=8 (bits 0-7), Code (bits 8-15), Checksum (bits 16-31); then Identifier (bits 0-15), Sequence number (bits 16-31); then Optional data]
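The echo request layout can be sketched in Python as follows. This only builds the packet bytes (sending them needs a raw socket and usually root privileges, which is left out); the function names, the identifier value and the 56-byte payload split are assumptions for illustration. The checksum is the standard Internet one's-complement checksum.

```python
import struct
import time


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """ICMP echo request: Type=8, Code=0, Checksum, Identifier, Sequence, data."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload


# The client puts a timestamp in the data bytes; the rest is padding.
payload = struct.pack("!d", time.time()) + b"\x00" * 48  # 56 data bytes
packet = build_echo_request(identifier=0x1234, sequence=0, payload=payload)
print(len(packet), "bytes; checksum verifies:", internet_checksum(packet) == 0)
```

Recomputing the checksum over a packet that already contains a correct checksum gives zero, which is the check a receiver applies.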

Ping example
syrup:/home$ ping -c 6 -s 64 thumper.bellcore.com
PING thumper.bellcore.com (128.96.41.1): 64 data bytes
72 bytes from 128.96.41.1: icmp_seq=0 ttl=240 time=641.8 ms
72 bytes from 128.96.41.1: icmp_seq=2 ttl=240 time=1072.7 ms
72 bytes from 128.96.41.1: icmp_seq=3 ttl=240 time=1447.4 ms
72 bytes from 128.96.41.1: icmp_seq=4 ttl=240 time=758.5 ms
72 bytes from 128.96.41.1: icmp_seq=5 ttl=240 time=482.1 ms
--- thumper.bellcore.com ping statistics ---
6 packets transmitted, 5 packets received, 16% packet loss
round-trip min/avg/max = 482.1/880.5/1447.4 ms
• Annotations: -c 6 = repeat count, -s 64 = packet size, thumper.bellcore.com = remote host, time = RTT, icmp_seq=1 is the missing seq #, last two lines = summary with loss
30cottrell@pinger:~>ping www.whitehouse.gov
PING e2561.g.akamaiedge.net (72.246.106.135) 56(84) bytes of data.
64 bytes from a72-246-106-135.deploy.akamaitechnologies.com (72.246.106.135): icmp_seq=0 ttl=54 time=1.51 ms
• Names differ (why)
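The summary lines of the first example can be reproduced from the reply lines. Below is a small Python sketch that does so; the regular expression and function name are assumptions that match this particular output format, and sequence numbers are assumed to start at 0 as they do here.

```python
import re

# Reply lines copied from the example above.
PING_OUTPUT = """\
72 bytes from 128.96.41.1: icmp_seq=0 ttl=240 time=641.8 ms
72 bytes from 128.96.41.1: icmp_seq=2 ttl=240 time=1072.7 ms
72 bytes from 128.96.41.1: icmp_seq=3 ttl=240 time=1447.4 ms
72 bytes from 128.96.41.1: icmp_seq=4 ttl=240 time=758.5 ms
72 bytes from 128.96.41.1: icmp_seq=5 ttl=240 time=482.1 ms
"""


def summarize_ping(output, sent):
    """Compute loss, missing sequence numbers and min/avg/max RTT."""
    found = re.findall(r"icmp_seq=(\d+) .*?time=([\d.]+) ms", output)
    replies = {int(seq): float(rtt) for seq, rtt in found}
    rtts = list(replies.values())
    return {
        "missing": sorted(set(range(sent)) - set(replies)),
        "loss_pct": 100.0 * (sent - len(replies)) / sent,
        "min": min(rtts),
        "avg": sum(rtts) / len(rtts),
        "max": max(rtts),
    }


summary = summarize_ping(PING_OUTPUT, sent=6)
print(summary["missing"], int(summary["loss_pct"]),
      summary["min"], round(summary["avg"], 1), summary["max"])
```

One lost packet out of 6 is 16.7%, which ping reports truncated as 16%.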

How to Diagnose with Ping • Ping to localhost (127.0.0.1), • ping to gateway (use route or traceroute (tracert on Windows) to find gateway), • ping to well known host • & to relevant remote host • Use IP address to avoid nameserver problems • Look for connectivity, loss, RTT, jitter, dups • May need to run for a long time to see some pathologies (e.g. bursty loss due to DSL loss of sync) • Try flood pings if you suspect rate limiting • Use telnet to see if blocked; use synack if ICMP blocked • www-iepm.slac.stanford.edu/tools/synack/

Main Ping Unreachable Messages • Not an ICMP message: DNS failing to resolve the name gives “unknown host”

12cottrell@pinger:~>ping www.lbl.gov
PING www.lbl.gov (128.3.41.105) 56(84) bytes of data.
--- www.lbl.gov ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3018ms
Yet:
15cottrell@pinger:~>synack -p 80 -k 5 www.lbl.gov
SYN-ACK to www.lbl.gov (128.3.41.105), 5 Packets
connected to www.lbl.gov : Seq = 0 , RTT = 2.241 ms
Also:
16cottrell@pinger:~>telnet www.lbl.gov 80
Trying 128.3.41.105...
Connected to www.lbl.gov.
Escape character is '^]'.
9cottrell@pinger:~>ping mail.unza.zm
PING impala.unza.zm (196.46.196.4) 56(84) bytes of data.
64 bytes from 196.46.196.4: icmp_seq=0 ttl=38 time=404 ms
Yet
10cottrell@pinger:~>ping impala.unzm.za
ping: unknown host impala.unzm.za
• Anomalies: pings blocked (first example), confused DNS (second example)
• Try> ping 198.46.223.4 and ping www.cern.ch: explain
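When ICMP is blocked, timing a TCP connection gives a rough substitute for ping. The Python sketch below is not the synack tool: it uses an ordinary connect() call, whose duration approximates the SYN / SYN-ACK round trip (and includes name lookup time if a hostname rather than an IP address is passed). The function name is an assumption, and the demo connects to a local listener so it runs without Internet access.

```python
import socket
import time


def tcp_connect_rtt(host, port, timeout=5.0):
    """Return the TCP connect time in milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None  # refused, timed out, unreachable or unresolved


# Demo against a listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
rtt_ms = tcp_connect_rtt("127.0.0.1", demo_port)
listener.close()
print("connected" if rtt_ms is not None else "no connection", rtt_ms)
```

Against a real server one would call, for example, tcp_connect_rtt with the server's IP address and port 80. Repeated connection attempts can look like an attack, so keep the rate low.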

Ping from your own host to the world • www-iepm.slac.stanford.edu/tools/pingworld • Linux: • Windows: • Unless paranoid push Run on certificate warning

Traceroute • UDP/ICMP tool to show route packets take from local to remote host
17cottrell@flora06:~>traceroute -q 1 -m 20 lhr.comsats.net.pk
traceroute to lhr.comsats.net.pk (210.56.16.10), 20 hops max, 40 byte packets
 1 RTR-CORE1.SLAC.Stanford.EDU (134.79.19.2) 0.642 ms
 2 RTR-MSFC-DMZ.SLAC.Stanford.EDU (134.79.135.21) 0.616 ms
 3 ESNET-A-GATEWAY.SLAC.Stanford.EDU (192.68.191.66) 0.716 ms
 4 snv-slac.es.net (134.55.208.30) 1.377 ms
 5 nyc-snv.es.net (134.55.205.22) 75.536 ms
 6 nynap-nyc.es.net (134.55.208.146) 80.629 ms
 7 gin-nyy-bbl.teleglobe.net (192.157.69.33) 154.742 ms
 8 if-1-0-1.bb5.NewYork.Teleglobe.net (207.45.223.5) 137.403 ms
 9 if-12-0-0.bb6.NewYork.Teleglobe.net (207.45.221.72) 135.850 ms
10 207.45.205.18 (207.45.205.18) 128.648 ms
11 210.56.31.94 (210.56.31.94) 762.150 ms
12 islamabad-gw2.comsats.net.pk (210.56.8.4) 751.851 ms
13 *
14 lhr.comsats.net.pk (210.56.16.10) 827.301 ms
• Annotations: -q 1 = probes/hop, -m 20 = max hops, lhr.comsats.net.pk = remote host, * at hop 13 = no response (lost packet or router ignores)

Traceroute technical details • Rough traceroute algorithm:
ttl=1; #To 1st router
port=33434; #Starting UDP port
while we haven’t got UDP port unreachable & ttl<max {
  send UDP packet to host:port with ttl
  get response
  if time exceeded note roundtrip time
  else if UDP port unreachable quit
  print output
  ttl++; port++
}
• Can appear as a port scan • SLAC gets about one complaint every 2 weeks.
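The rough algorithm above can be written as a runnable Python sketch if the packet sending is factored out. Here the probe is a function passed in by the caller, and the demo uses a simulated path rather than real UDP/ICMP traffic (which would need raw sockets and privileges). The names trace, fake_probe and PATH, and the addresses, are hypothetical.

```python
def trace(probe, max_ttl=30, base_port=33434):
    """Walk the path hop by hop.

    probe(ttl, port) returns (kind, router, rtt_ms), where kind is
    "time_exceeded" or "port_unreachable", or None if nothing came back.
    """
    hops = []
    ttl, port = 1, base_port
    while ttl <= max_ttl:
        reply = probe(ttl, port)
        if reply is None:
            hops.append((ttl, "*", None))  # lost packet or silent router
        else:
            kind, router, rtt_ms = reply
            hops.append((ttl, router, rtt_ms))
            if kind == "port_unreachable":  # reached the destination host
                break
        ttl += 1
        port += 1
    return hops


# Simulated 4-hop path; None marks a router that does not respond.
PATH = ["10.0.0.1", "10.0.1.1", None, "192.0.2.10"]


def fake_probe(ttl, port):
    assert port == 33434 + ttl - 1  # port advances together with ttl
    if ttl > len(PATH) or PATH[ttl - 1] is None:
        return None
    kind = "port_unreachable" if ttl == len(PATH) else "time_exceeded"
    return kind, PATH[ttl - 1], 1.0 * ttl


hops = trace(fake_probe)
for ttl, router, rtt_ms in hops:
    print(ttl, router, "" if rtt_ms is None else f"{rtt_ms:.3f} ms")
```

The incrementing destination port is what makes the probes resemble a port scan to an intrusion detector at the far end.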

Reverse traceroute servers • Reverse traceroute server runs as CGI script in web server • Allow measurement of route from other end. Important for asymmetric routes. See e.g. • www.slac.stanford.edu/comp/net/wan-mon/traceroute-srv.html • Also cities.lk.net/trlist.html#Lists • Visual Traceroute server: visualroute.visualware.com/ • Map at www.caida.org/research/routing/reversetrace/

Warning • Some Linux versions have a bug that incorrectly reports a checksum error on MPLS links. Make packet length >= 140 • traceroute <host> 140

Ping along the route • Run traceroute, then ping each router n times • helps identify where in route the problems start to occur • Routers may not respond to pings, or may treat pings directed at them differently to other packets • Get Matt’s TraceRoute MTR from http://www.bitwizard.nl/mtr/ or pathping (built into Windows)

How is my host doing? • www.speedtest.net, also www.bandwidth-test.net • For problem diagnosis also: • netspeed.stanford.edu • Special TCP kernel on server, Java on client • Up & down link speeds + IDs: • Duplex mismatch, excessive loss from faulty cables, checks for middle boxes, FWs • Also hints on setting TCP buffer sizes

Path characterization • Pathchar • sends multiple packets of varying sizes to each router along route • measures minimum response time • plot min RTT vs packet size to get bandwidth • calculate differences to get individual hop characteristics • measures for each hop: BW, queuing, delay/hop • can take a long time • Pipechar (many derivatives) • Also sends back-to-back packets and measures separation on return • Much faster • Finds bottleneck • [Diagram: back-to-back packets get a minimum spacing at the bottleneck; the spacing is preserved on higher speed links]
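Both estimates reduce to simple arithmetic, sketched here in Python under idealized assumptions: the minimum RTT is free of queuing, only the outgoing probe size varies (replies are small and fixed size), and the packet-pair spacing is set by the bottleneck alone. Function names and the synthetic two-hop path (100 Mbits/s then 10 Mbits/s) are invented for illustration and are not taken from pathchar or pipechar.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hop_bandwidths(sizes_bytes, min_rtts_by_hop):
    """Per-hop bandwidth (bits/s) from min RTT vs packet size at each hop.

    The slope at hop n (seconds per byte) sums the serialization cost of
    hops 1..n, so differences of successive slopes isolate single hops.
    """
    bandwidths, prev_slope = [], 0.0
    for rtts in min_rtts_by_hop:
        slope, _ = fit_line(sizes_bytes, rtts)
        bandwidths.append(8 / (slope - prev_slope))
        prev_slope = slope
    return bandwidths


def packet_pair_bandwidth(size_bytes, spacing_s):
    """Bottleneck bandwidth (bits/s) from back-to-back packet spacing."""
    return size_bytes * 8 / spacing_s


sizes = [100, 500, 1000, 1500]
hop1 = [0.001 + s * 8 / 100e6 for s in sizes]
hop2 = [0.005 + s * 8 / 100e6 + s * 8 / 10e6 for s in sizes]
bws = hop_bandwidths(sizes, [hop1, hop2])
pair_bw = packet_pair_bandwidth(1500, 0.0012)
print([round(b / 1e6, 3) for b in bws], round(pair_bw / 1e6, 3))
```

With real measurements the slope differences are noisy, which is one reason many probes per size are needed and the pathchar approach takes a long time.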

Network throughput • Iperf (& thrulay, netperf, ttcp…) • Client generates & sends UDP or TCP packets • Server receives packets • Can select port, maximum window size, duration, Mbytes to send etc. • Client/server communicate packets seen etc. • Reports on throughput • Requires server to be installed at remote site, i.e. friendly administrators or logon account and password

Iperf example
25cottrell@flora06:~>iperf -p 5008 -w 512K -P 3 -c sunstats.cern.ch
------------------------------------------------------------
Client connecting to sunstats.cern.ch, TCP port 5008
TCP window size: 512 KByte
------------------------------------------------------------
[ 6] local 134.79.16.101 port 57582 connected with 192.65.185.20 port 5008
[ 5] local 134.79.16.101 port 57581 connected with 192.65.185.20 port 5008
[ 4] local 134.79.16.101 port 57580 connected with 192.65.185.20 port 5008
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.3 sec 19.6 MBytes 15.3 Mbits/sec
[ 5] 0.0-10.3 sec 19.6 MBytes 15.3 Mbits/sec
[ 6] 0.0-10.3 sec 19.7 MBytes 15.3 Mbits/sec
• Total throughput = 3 * 15.3 Mbits/s = 45.9 Mbits/s
• Annotations: -P 3 = 3 parallel streams, -p 5008 = TCP port, -w 512K = max window size, -c sunstats.cern.ch = remote host
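The total for parallel streams is the sum of the per-stream bandwidth columns. A minimal Python sketch of that sum, using the three result lines from the example; the regular expression is an assumption that fits this output format only.

```python
import re

# Per-stream result lines copied from the example above.
IPERF_OUTPUT = """\
[ 4] 0.0-10.3 sec 19.6 MBytes 15.3 Mbits/sec
[ 5] 0.0-10.3 sec 19.6 MBytes 15.3 Mbits/sec
[ 6] 0.0-10.3 sec 19.7 MBytes 15.3 Mbits/sec
"""


def total_mbits_per_sec(output):
    """Sum the Bandwidth column over all stream lines."""
    rates = re.findall(r"([\d.]+) Mbits/sec", output)
    return sum(float(rate) for rate in rates)


total = total_mbits_per_sec(IPERF_OUTPUT)
print(f"{total:.1f} Mbits/s over 3 streams")
```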

PingER • Monitors >40 in 23 countries • 1 @ ICTP, 4 in Africa: • Algeria, Burkina Faso, South Africa, Zambia • Beacons ~ 90 • Remote sites (~740) • 50 African Countries • ~ 99% of world’s population, >160 countries • Measurements go back to Jan-95 • Reports on RTT, loss, reachability, jitter, reorders, duplicates … • Uses ubiquitous “ping”

PingER Methodology very Simple • Uses ubiquitous ping: >ping remhost • Monitoring host sends 10 ping request packets each 30 mins over the Internet to the Remote Host (typically a server) • Remote Host returns ping response packets • Measure Round Trip Time & Loss • Once a day: data to the Data Repository @ SLAC

Example PingER Output ICTP>Kenya • Uses Smokeping • Blue = median RTT, background color = loss • Smokiness = jitter • Median RTT drops from 780ms to 225ms, i.e. cut by about 2/3rds (3.5 times improvement)
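The two ways of stating the improvement come from the same pair of numbers, as this short Python calculation shows (variable names are illustrative only).

```python
before_ms, after_ms = 780.0, 225.0

# Fraction of the original RTT that was removed.
reduction = (before_ms - after_ms) / before_ms
# How many times smaller the new RTT is.
improvement = before_ms / after_ms

print(f"{reduction:.0%} reduction, {improvement:.1f}x improvement")
```

This prints a 71% reduction (roughly two thirds) and a 3.5x improvement.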

Trouble shooting • Ping to localhost, ping to gateway & to remote host • Use IP address to avoid nameserver problems • Look for connectivity, loss & RTT • May need to run for a long time to see some pathologies (e.g. bursty loss due to DSL loss of sync) • Use synack if ICMP blocked • Traceroute to remote host • Reverse traceroute from remote host to you • Ping routers along route (pingroute helps) • Look at history plots (PingER, AMP, Surveyor): when did problem start, how big an effect is it? • Look at own connectivity with NDT (netspeed.stanford.edu)

“Where is” a host – cont. • Find the Autonomous System (AS) administering • Use reverse traceroute server with AS identification, e.g.: • www.slac.stanford.edu/cgi-bin/nph-traceroute.pl … 14 lhr.comsats.net.pk (210.56.16.10) [AS7590 - COMSATS] 711 ms (ttl=242) • Get contacts for ISPs (if know ISP or AS): • http://puck.nether.net/netops/nocs.cgi • Gives ISP name, web page, phone number, email, hours etc. • Review list of AS's ordered by Upstream AS Adjacency • www.telstra.net/ops/bgp/bgp-as-upsstm.txt • Tells what AS is upstream of an ISP • Look at real-time information about the global routing system from the perspectives of several different locations around the Internet • Use route views at www.antc.uoregon.edu/route-views/

“Where is” a host • Look at name (use dig or nslookup if have IP addr)
47cottrell@netflow:~>nslookup 210.56.16.10
Server: localhost
Address: 127.0.0.1
Name: lhr.comsats.net.pk
Address: 210.56.16.10
• Look at TLD and use www.iana.org/domain-names.htm • Sometimes name (e.g. construct.haifa.ac.il) gives hint • Use www-wanmon.slac.stanford.edu/cgi-bin/nph-traceroute.pl to traceroute to www.ictp.it, what cities does the route go thru? • Visit site’s www server, often location in home page • Use whois, e.g. www.allwhois.com • May be able to get lat & long (GeoIPTool): • www.geoiptool.com/ or via: geotool.flagfox.net/

Who are you gonna tell • Local network support people • Internet Service Provider (ISP) usually done by local networker • Usually will know immediate one, e.g. trouble@es.net • Use puck.nether.net/netops/nocs.cgi to find ISP • Use www.telstra.net/ops/bgp/bgp-as-upsstm.txt to find upstream ISPs • Well managed sites and ISPs maintain a list of email addresses such as abuse@ or postmaster@, that one can send email to, for example to complain about spam etc. • This follows an Internet recommendation (RFC 2142). • Some less helpful sites do not provide such services, for more on these, see RFC-ignorant.org • Give them the ping & traceroute results

More Information • Tutorial on monitoring (getting a bit dusty) • www.slac.stanford.edu/comp/net/wan-mon/tutorial.html • RFC 2151 on Internet tools • www.freesoft.org/CIE/RFC/Orig/rfc2151.txt • Network monitoring tools • www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html • Ping • http://www.ping127001.com/pingpage.htm • IEPM/PingER home site • www-iepm.slac.stanford.edu/pinger • IEEE Communications, May 2000, Vol 38, No 5, pp 130-136

IP Addresses pingable June 2003 • Grey= not allocated • Black= not pingable • Companies own class A

Growth 2003-2006 • More areas allocated • Existing areas more colorful • [Figures: pingable address maps for June 2003 and Nov 2006]

Not your normal Internet site Ames IXP: approximately 60-65% was HTTP, about 13% was NNTP Uwisc: 34% HTTP, 24% FTP, 13% Napster

SNMP version 1 limitations • Authentication is inadequate: • Password (community string) placed in clear in SNMP messages • MIB variables must be polled separately, i.e. entire MIB cannot be fetched with single command • SNMPv2 and v3 attempt to address these and other limitations • Despite limitations, SNMP has been a big success • Provides device and link utilization (byte, packets) and errors • Lot of facilities/tools built around SNMP to provide reports for sites • Security concerns limit access typically to very limited set of owner/admins • E.g. ISPs won’t let you poll their devices

Lot of heavy FTP activity • The difference depends on traffic • Only 20% difference in max & average

Simplified SLAC DMZ Network, 2001 • [Network diagram. Labels: Dial up & ISDN; 2.4 Gbps OC48 link; NTON (#); rtr-msfc-dmz; 155 Mbps OC3 link (*); Stanford; swh-dmz; ESnet; Internet2; slac-rt1.es.net; OC12 link 622 Mbps; swh-root; Etherchannel 4 Gbps; SLAC Internal Network. Link legend: 1 Gbps Ethernet, 100 Mbps Ethernet, 10 Mbps Ethernet] • (*) Upgrade to OC12 has been requested • (#) This link will be replaced with an OC48 POS card for the 6500 when available

Time series • [Figure: incoming and outgoing traffic time series for UDP and TCP; annotation: Cat 4000 802.1q vs. ISL]

Flow lengths • Distribution of netflow lengths for SLAC border • Log-log plots, linear trendline = power law • Netflow ties off flows after 30 minutes • TCP, UDP & ICMP “flows” are ~log-log linear for longer (hundreds to 1500 seconds) flows (heavy-tails) • There are some peaks in TCP distributions, timeouts? • Web server CGI script timeouts (300s), TCP connection establishment (default 75s), TIME_WAIT (default 240s), tcp_fin_wait (default 675s) • [Figure panels: ICMP, TCP, UDP]

Power law fit parameters by time Just 2 parameters provide a reasonable description of the flow size distributions
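The two parameters are the slope and intercept of a straight line fitted on log-log axes. A minimal Python sketch of such a fit follows; it uses plain least squares on log10 values and synthetic data generated from a known power law, so the numbers are illustrative and not SLAC measurements.

```python
import math


def fit_power_law(xs, ys):
    """Fit log10(y) = intercept + slope * log10(x) by least squares."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic distribution: count = 1e6 * size ** -1.5
sizes = [1e2, 1e3, 1e4, 1e5, 1e6]
counts = [1e6 * s ** -1.5 for s in sizes]
slope, intercept = fit_power_law(sizes, counts)
print(f"slope {slope:.2f}, intercept {intercept:.2f}")
```

A simple log-log regression like this is sensitive to binning and to sparse counts in the tail, so treat fitted values on real data as descriptive rather than precise.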
