IP Network Performance Measurements Bruce Morgan AARNet Pty Ltd
Just checking… • Why metrics? • Metrics are important to identify network related issues especially performance • Metrics can be diverse • No one metric is suitable for all needs
Types of Measurement • Active Measurement • Injecting measurement data into the network • E.g. UDP, TCP, ICMP packets • Passive Measurement • Measuring what is there already
The Problem • Measurement of the network cloud is difficult – but is essential if we are to gauge user perception of the internet
The World Wide Wait Some problems are host based, while others are network based: • Physical latency • Network queuing and delays • Server processing delay • Timeouts and packet loss • TCP protocol delays
The Dark Cloud • Diverse network paths • Asymmetric paths • Policy routing • Committed Access Rates • Firewalls and filters
IP Performance Metrics • Framework spelt out in RFC 2330 from the IPPM Working Group • Goal: “to achieve a situation in which users and providers of Internet transport service have an accurate common understanding of the performance and reliability of the Internet component 'clouds' that they use/provide.”
On the Standards track… • RFC 2678 IPPM Metrics for Measuring Connectivity • RFC 2679 A One-way Delay Metric for IPPM. • RFC 2680 A One-way Packet Loss Metric for IPPM. • RFC 2681 A Round-trip Delay Metric for IPPM.
A One-way Delay Metric • Type-P-One-way-Delay • The P stands for the packet type (e.g. the protocol) • Test packets are injected at times drawn from a Poisson process • Both source and destination require time synchronisation
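The Poisson sampling and one-way delay calculation above can be sketched as follows. This is a minimal illustration, not the IPPM reference method: the function names, the probe rate and the seed are assumptions, and the delay calculation is only meaningful if both clocks are synchronised.

```python
import random


def poisson_send_times(rate_per_s, duration_s, seed=None):
    """Return probe send times (seconds from start) for a Poisson stream.

    Gaps between probes are drawn from an exponential distribution,
    which is what makes the arrival process Poisson.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


def one_way_delay(send_ts, recv_ts):
    """One-way delay, assuming both timestamps come from synchronised clocks."""
    return recv_ts - send_ts


# Hypothetical usage: about 2 probes per second for one minute.
schedule = poisson_send_times(rate_per_s=2.0, duration_s=60.0, seed=1)
```

Random spacing avoids locking the probes onto any periodic behaviour in the network, which a fixed interval could do.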
A Round-trip Delay Metric • Many applications do not perform well with large end to end delays • Ease of deployment compared to one-way metrics • Ease of interpretation
Ping • Two-way path measurement based on RTTs (round-trip times) • Choice of monitored address • Host • Router interface • Router loopback address
Packet Loss on ICMP • Loss Asymmetry • $\mathrm{Loss} = 1 - (1 - \mathrm{Loss}_{\mathrm{fwd}})(1 - \mathrm{Loss}_{\mathrm{rcv}})$ • Path Asymmetry • Internet Service Providers (ISPs), sites or even hosts may rate limit (or completely block) ICMP echo, giving rise to invalid packet loss measurements.
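A small worked example of the loss formula on this slide, assuming forward and reverse losses are independent. The function name and the sample loss rates are illustrative only.

```python
def round_trip_loss(loss_fwd, loss_rcv):
    """Combine forward and reverse loss rates into a round-trip loss rate.

    A ping succeeds only if the request survives the forward path AND
    the reply survives the return path, so the survival rates multiply.
    """
    for value in (loss_fwd, loss_rcv):
        if not 0.0 <= value <= 1.0:
            raise ValueError("loss rates must be between 0 and 1")
    return 1.0 - (1.0 - loss_fwd) * (1.0 - loss_rcv)


# Hypothetical rates: 2% loss outbound, 1% loss on the return path.
print(round_trip_loss(0.02, 0.01))
```

A ping-based measurement sees only the combined figure, so it cannot tell which direction is losing packets.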
PingER • PingER (Ping End-to-end Reporting) is the name given to the Internet End-to-end Performance Measurement (IEPM) project to monitor the end-to-end performance of Internet links • Uses ICMP RTT for measurement
Surveyor • Dedicated PC running Unix at key sites • GPS for clock synchronization • One-way delay & loss measurements • Community is Internet2 clients • HEP sites collaborating with Surveyor
PingER/Surveyor Comparison • PingER uses the ICMP echo facility (ping) and thus only makes round trip measurements. • Surveyor uses a GPS system to synchronise time between sites and makes one way measurements.
PingER/Surveyor Comparison • Surveyor requires a dedicated platform (PC) to be installed at each site that is monitored, whereas PingER uses an existing host with no special software installed at the monitored site. • PingER cheaper!
PingER/Surveyor Comparison • Surveyor is more accurate and better for short-term measurement, especially for sites which have good connectivity. • PingER is a more lightweight solution: it requires less management, uses less bandwidth, requires less storage, and needs nothing installed at the remotely monitored sites. • PingER is good for remote sites with poor connectivity.
PingER - Surveyor Complementarity • Agree well • Surveyor has one-way measurements, PingER only round-trip • Surveyor has dedicated platforms & strong central management • experience with PingER shows this has benefits. • PingER is more parsimonious/lightweight (bandwidth, disk space, CPU) • but necessarily less accurate, especially at small (hourly) time resolution on low-loss links. • PingER is good for looking at long-term trends & grouping, where statistics are less of a problem
TCP SYN / ACK tools • In order to truly measure Web traffic, which is almost entirely TCP/IP traffic, it is best to probe using TCP/IP rather than ICMP • SYN/ACK mechanism proves useful for this purpose
TCP SYN/ACK tools: 3-way handshake • Client: send SYN, seq=x • Server: receive SYN • Server: send SYN seq=y, ACK x+1 • Client: receive SYN+ACK • Client: send ACK y+1 • Server: receive ACK
TCP SYN/ACK • The tool sends a connection request (SYN) and measures the time taken by the target to respond with a SYN+ACK • The connection is then promptly cleared by another exchange of packets, this time containing the FIN control flag.
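The SYN/ACK timing idea can be approximated from user space without crafting packets. The sketch below is an assumption about how one might do it, not the method of any tool named in these slides: it times how long the operating system's `connect()` takes, which returns once the SYN+ACK has arrived. The function name and timeout are illustrative.

```python
import socket
import time


def tcp_connect_time(host, port, timeout=3.0):
    """Return the seconds taken to complete a TCP connect, or None on failure.

    connect() completes when the SYN+ACK is received, so the elapsed time
    approximates one round trip (plus name resolution if host is a name).
    """
    start = time.perf_counter()
    try:
        # Leaving the 'with' block closes the socket, which clears the
        # connection with a FIN exchange.
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None
```

Passing an IP address rather than a hostname keeps DNS lookup time out of the measurement.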
Sting • Sting is a TCP-based network measurement tool that measures end-to-end network path characteristics. Sting is unique because it can estimate one-way properties, such as loss rate, through careful manipulation and observation of TCP behaviour. • Avoids the increasing problems with ICMP-based network measurement (blocking, spoofing, rate limiting, etc.). • http://www.cs.washington.edu/homes/savage/sting/
Current AARNet Measurements • MRTG • Perf • ICMP RTT measurements • ICMP packet loss measurements • WA • Host/endpoint reachability • TCP HTTP file transfer measurements • Netflow data
MRTG • Uses SNMP interface statistics • Provides multi-functionality from router temperature to throughput • Visualisation package • Lacks granularity with time • Deployed at each RNO
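As an illustration of how throughput is derived from SNMP interface statistics, the sketch below converts two readings of an octet counter into an average bit rate. It is not MRTG's own code. The function name, the 32-bit counter width and the 300 second polling interval in the example are assumptions, and it handles at most one counter wrap between polls.

```python
def counter_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Average bits per second between two readings of an octet counter.

    The modulo handles a single wrap of the counter between readings.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


# Hypothetical readings taken 300 seconds apart.
print(counter_rate_bps(0, 37_500_000, 300))
```

The result is an average over the whole polling interval, which is why this kind of measurement lacks granularity with time: short bursts are smoothed away.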
MRTG graphs • [Graph: WARNO / International traffic on June 18] • [Graph: WARNO / VRNO traffic on June 18]
Perf Tool • Perfd – uses a BSD-based ping for RTT and packet loss calculation • Perf – web display tool for the data • Deployed at each RNO to measure all points of the mesh • Used to check SLA compliance with Cable and Wireless Optus
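The RTT and packet loss calculation can be sketched as below. This is not Perfd's implementation: the function name and the input convention (a list of RTTs in milliseconds, with `None` for each probe that got no reply) are assumptions made for illustration.

```python
from statistics import mean


def summarise_probes(rtts_ms):
    """Summarise one batch of ping results.

    rtts_ms holds one entry per probe sent: the RTT in milliseconds,
    or None if no reply was received.
    """
    sent = len(rtts_ms)
    if sent == 0:
        raise ValueError("no probes were sent")
    received = [r for r in rtts_ms if r is not None]
    summary = {"sent": sent, "loss": 1 - len(received) / sent}
    if received:
        summary.update(min=min(received), avg=mean(received), max=max(received))
    return summary


# Hypothetical batch: four probes, the second one lost.
print(summarise_probes([10.0, None, 12.0, 14.0]))
```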
WA • “What’s alive” is based on NOCOL • Checks reachability of hosts/endpoints • Uses ICMP echo, but could easily be extended to check service-level availability • Frequent check of all hosts
TCP based Measurements • Uses an active http file transfer • Measure at host • Measure from Netflow records • Can detect retransmissions • These may occur from packet loss/out of sequence packets in either direction
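One simple way to spot retransmissions in a captured transfer is to look for data segments whose sequence number has been seen before. The sketch below assumes the segments of one direction of one connection are already available as `(seq, length)` pairs in capture order. The function name and input format are hypothetical, and sequence number wrap and resegmented retransmissions are ignored.

```python
def count_retransmissions(segments):
    """Count data segments that repeat an earlier (seq, length) pair.

    segments: iterable of (sequence_number, payload_length) tuples for
    one direction of a single TCP connection, in capture order.
    """
    seen = set()
    retransmissions = 0
    for seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no data and are not counted
        if (seq, length) in seen:
            retransmissions += 1
        else:
            seen.add((seq, length))
    return retransmissions


# Hypothetical capture: the segment starting at 101 is sent twice.
capture = [(1, 100), (101, 100), (101, 100), (201, 50)]
print(count_retransmissions(capture))
```

A repeat seen at the sender does not say which direction lost a packet: the original data segment or its acknowledgement may have been the one dropped.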
Load balancing impacts • Can use contiguous IP addresses on the monitoring machine to monitor per-destination load balancing • The monitoring machine can determine performance on a link but cannot determine which link is used. • If a link fails then traffic will divert to other links
Flows… • A flow is taken to be either a bidirectional or unidirectional communication between a source and destination host. The communication shares an address/port correspondence. • The biggest indicator of scan/DoS attacks is generally the flow records!
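The flow definition above can be illustrated by grouping packets on their address/port correspondence. This is a sketch under assumed names and an assumed packet tuple layout, not the format of any particular flow exporter.

```python
from collections import defaultdict


def flow_key(src, sport, dst, dport, proto, bidirectional=True):
    """Build a key identifying the flow a packet belongs to.

    For bidirectional flows the two endpoints are put in a fixed order,
    so both directions of a conversation map to the same key.
    """
    a, b = (src, sport), (dst, dport)
    if bidirectional and b < a:
        a, b = b, a
    return (proto, a, b)


def group_into_flows(packets, bidirectional=True):
    """Aggregate (src, sport, dst, dport, proto, size) packets into flows."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src, sport, dst, dport, proto, size in packets:
        entry = flows[flow_key(src, sport, dst, dport, proto, bidirectional)]
        entry["packets"] += 1
        entry["bytes"] += size
    return dict(flows)


# Hypothetical packets: a request and its reply.
packets = [
    ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp", 60),
    ("10.0.0.2", 80, "10.0.0.1", 40000, "tcp", 1500),
]
print(len(group_into_flows(packets)))                       # one flow
print(len(group_into_flows(packets, bidirectional=False)))  # two flows
```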
Netflow Records • We keep detailed Flow records • Timestamps and durations • Source/destination addresses • Protocol Types • Cumulative IP Flags • ICMP control types
Netflow Records • Useful for determining metric targets, e.g. the top 100 WWW hosts • Can derive useful measurements from the Netflow data itself • Be wary of derived throughput – flows can take a long time.
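A sketch of deriving throughput from a flow record, with a guard reflecting the caution above. The function name, the millisecond timestamps and the one second minimum duration are assumptions for illustration, not fields or thresholds taken from these slides.

```python
def flow_throughput_bps(octets, start_ms, end_ms, min_duration_ms=1000):
    """Average throughput of a flow in bits per second.

    Returns None for flows shorter than min_duration_ms, where dividing
    by a tiny duration would give a meaninglessly large figure.
    """
    duration_ms = end_ms - start_ms
    if duration_ms < min_duration_ms:
        return None
    return octets * 8 * 1000 / duration_ms


# Hypothetical record: 1.25 MB transferred over 10 seconds.
print(flow_throughput_bps(1_250_000, 0, 10_000))
```

The figure is an average over the whole life of the flow, so a long-lived flow that sits idle for most of its duration will show a low rate regardless of what the path could carry.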
What are the choices? • Various tools and methods are available • No one tool is good for everything • Combinations of tools, both passive and active, lead to interesting and more detailed analysis