This case study discusses the measurement of IP performance on a large scale. It covers the setting, packet generation, measurement system and methodology, and drawing conclusions.
Measuring the Internet: A case study by Bob Mandeville and Andrew Corlett bob@iometrix.com andrew@iometrix.com
Agenda • PART 1: IP Performance Measurement Case Study (What we did) • PART 2: Measurement System and Methodology (How we did it) • PART 3: Drawing Conclusions (and going on to next steps)
PART 1: IP Performance Measurement Case Study (What we did)
The setting • Large-scale test of seven of the world’s biggest ISPs • 28 measurement nodes (cNodes) on backbone core of: • Cable & Wireless (C&W) • Level 3 Communications • Qwest Communications • Savvis Communications • Sprint Corp. • Verio • Williams Communications
Measurement packet generation • Test ran 30 days; total project took more than a year to complete • cNodes generated 4,558,388,076 packets during the month of August 2002 • All told, we collected 156,050,656 discrete measurements • cNodes record more than 70 IP metrics but in this test we focused on just three: uptime, jitter, and packet loss
Packet types • The cNodes generated vectors of both 1,518-byte TCP and 256-byte UDP packets • With each cNode sending packets to three other cities, there were a total of six vectors per cNode • cNodes were configured so that the aggregate transmit rate across all vectors did not exceed 512 kbit/s
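A rough budget for the 512 kbit/s cap can be sketched as follows. The slide does not say how the cap was divided among the six vectors or whether the byte counts include Ethernet framing, so the even split and the use of the quoted packet lengths as-is are assumptions made only for illustration.

```python
# Figures quoted on the slide.
TCP_PACKET_BYTES = 1518
UDP_PACKET_BYTES = 256
VECTORS_PER_CNODE = 6
AGGREGATE_CAP_BITS_PER_S = 512_000  # 512 kbit/s, taking 1 kbit = 1000 bits


def max_packets_per_second(packet_bytes: int, share_bits_per_s: float) -> float:
    """Highest packet rate that keeps one vector within its bandwidth share."""
    return share_bits_per_s / (packet_bytes * 8)


# Assumption: the cap is split evenly across the six vectors.
per_vector_share = AGGREGATE_CAP_BITS_PER_S / VECTORS_PER_CNODE
tcp_pps = max_packets_per_second(TCP_PACKET_BYTES, per_vector_share)
udp_pps = max_packets_per_second(UDP_PACKET_BYTES, per_vector_share)

print(f"TCP vector: up to {tcp_pps:.2f} packets/s")
print(f"UDP vector: up to {udp_pps:.2f} packets/s")
```

Under these assumptions a 1,518-byte vector could send roughly 7 packets per second and a 256-byte vector roughly 42.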
PART 2: Measurement System and Methodology (How we did it)
System Architecture [diagram; labeled components: Browser, Database, Service-Daemon, cNodes, OSS, Traffic Engineering, Application #2, Application #3]
Service-Daemon • Central hub of the Measurement System • Configures cNodes for measurements • Retrieves results and stores them in the database • Sophisticated state machines maintain the measurement system automatically. For example: • Downloads results stored in cNodes but not yet stored in the database • Configures cNodes that may have been power-cycled • cNodes continue to measure and store results internally if connectivity to the Service-Daemon is interrupted • CLI/Scripting engine allows for external and bulk configuration • Runs on Windows, Solaris, and Linux
Terminology • Vector • Basis of all measurements • Defines measurements from one cNode to another cNode • All packets are formatted the same (Service-Type) • Many different vectors can be executed simultaneously • HTTP, VoIP, FTP, etc. • Service-Type/Packet-Types • Defines the format of measurement packets • Example: TELNET, TCP Port 23, 1500 byte packets
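The relationship between a Vector and its Service-Type can be pictured with a small data model. This is only an illustrative sketch: the class names, field names, and cNode names are hypothetical and do not come from the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceType:
    """Format shared by every measurement packet of a vector (hypothetical fields)."""
    name: str
    protocol: str      # "TCP" or "UDP"
    port: int
    packet_bytes: int


@dataclass(frozen=True)
class Vector:
    """A one-way measurement from one cNode to another, using one Service-Type."""
    source: str
    destination: str
    service_type: ServiceType


# The slide's example: TELNET, TCP port 23, 1500-byte packets.
TELNET = ServiceType(name="TELNET", protocol="TCP", port=23, packet_bytes=1500)

# Each direction is its own vector; the cNode names are made up.
vectors = [
    Vector("cnode-a", "cnode-b", TELNET),
    Vector("cnode-b", "cnode-a", TELNET),
]
```

Because a vector is directional, measuring both directions between two cNodes takes two vectors, and several vectors with different Service-Types can run between the same pair at once.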
Terminology • Vector Handler • Computes and stores measurement results • Located on the destination cNode • Measurement Period • Interval of time representing results data • 5-minute intervals • Can be combined to report or alarm on larger intervals • 10, 15, or 30 minutes, 1 hour, 1 day
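Grouping 5-minute Measurement Periods into a larger reporting interval might look like the sketch below. The function names are hypothetical, and the requirement that a reporting window be a whole multiple of 5 minutes is an assumption that happens to hold for every interval the slide lists.

```python
from collections import defaultdict

BASE_PERIOD_S = 300  # the 5-minute Measurement Period


def window_start(period_start_s: int, window_s: int) -> int:
    """Start of the reporting window that contains a given 5-minute period."""
    if window_s % BASE_PERIOD_S != 0:
        raise ValueError("reporting window must be a multiple of 5 minutes")
    return period_start_s - period_start_s % window_s


def group_periods(period_starts_s, window_s):
    """Map each reporting-window start to the 5-minute periods inside it."""
    groups = defaultdict(list)
    for start in period_starts_s:
        groups[window_start(start, window_s)].append(start)
    return dict(groups)


# Six periods starting at 12:00, 12:05, ... 12:25, as seconds since midnight.
periods = [12 * 3600 + i * BASE_PERIOD_S for i in range(6)]
quarter_hours = group_periods(periods, 15 * 60)
```

Here the six periods fall into two 15-minute windows of three periods each.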
Service-Type/Packet-Type • Optional UDP or TCP headers • Port numbers • TCP fields: Flags, Window Size, MSS option, Urgent Pointer • DSCP settings • Packet Length • Payload Type (all 0’s, all 1’s or Random) • TTL • Loose, Strict, and/or Record Route options • VLAN tags
Continuous Measurements [diagram; labels: Database, cNodes, Computed Results, Measurement Period (5 minutes), timeline from 12:00 to 12:25 in 5-minute steps]
Results • Every 5 minutes, all of the packets received for a vector are processed through sophisticated algorithms, and a results packet of about 1 KB is created representing all of the metrics • The results packet is automatically sent to the Service-Daemon and also stored in the internal memory of the cNode • Results packets can be combined so reports and alarms can be generated over time periods other than 5-minute intervals: e.g. 1 hour, 1 day, 1 week or even 1 year.
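One way such results could be combined is to carry raw counters and sums in each 5-minute result and add them up, deriving ratios only at the end. The real results packet format is not described in these slides, so the fields below are hypothetical and the code is a sketch of the general idea rather than the cNode algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodResult:
    """A few counters from one measurement period (hypothetical subset)."""
    sent: int
    received: int
    delay_sum_ns: int
    delay_max_ns: int

    @property
    def loss_ratio(self) -> float:
        return 0.0 if self.sent == 0 else (self.sent - self.received) / self.sent

    @property
    def mean_delay_ns(self) -> float:
        return 0.0 if self.received == 0 else self.delay_sum_ns / self.received


def combine(results) -> PeriodResult:
    """Merge several periods by adding counters, never by averaging ratios."""
    results = list(results)
    return PeriodResult(
        sent=sum(r.sent for r in results),
        received=sum(r.received for r in results),
        delay_sum_ns=sum(r.delay_sum_ns for r in results),
        delay_max_ns=max((r.delay_max_ns for r in results), default=0),
    )


quiet = PeriodResult(sent=100, received=100,
                     delay_sum_ns=100 * 20_000_000, delay_max_ns=25_000_000)
busy = PeriodResult(sent=900, received=810,
                    delay_sum_ns=810 * 40_000_000, delay_max_ns=90_000_000)
combined = combine([quiet, busy])
```

Adding counters keeps the combined loss ratio weighted by traffic: the two periods above combine to 9% loss, whereas averaging their individual ratios (0% and 10%) would give a misleading 5%.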
Measurement Packet [diagram; labeled fields: Ethernet Header, IP Header, Optional IP Header, Optional Header (UDP/TCP), Metric Header, Timestamp, Payload (zeros/ones/random), Ethernet CRC] • Optional UDP or TCP headers • Source/Destination Port numbers • TCP fields: Flags, Window Size, MSS option, Urgent Pointer • DSCP settings • Packet Length • Payload Type (all 0’s, all 1’s or Random) • TTL • Loose, Strict, and/or Record Route options • VLAN tags
Metric Header • Allows measurement packets to be formed as any protocol without interfering with manageability of cNodes • E.g. cNodes can measure Telnet traffic while Telnet sessions are in process on the cNode • Header Identifier and Version • Hardware Timestamp • UTC, 64-bit, 1ns units • Packet ID • 64-bit • Initial TTL, TOS, and IP Protocol fields • Payload Checksum • Metric Header Checksum • Vector and Measurement Period Identification
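To make the Metric Header fields concrete, here is a hypothetical packing of them with Python's `struct` module. Only the 64-bit timestamp and 64-bit Packet ID widths come from the slide; the field order, every other width, the identifier value, and the use of the RFC 1071 Internet checksum are assumptions chosen for illustration, not the actual cNode wire format.

```python
import struct

# Assumed layout (network byte order): identifier, version, timestamp (ns),
# packet ID, initial TTL, TOS, IP protocol, vector ID, period ID, payload checksum.
_BODY = struct.Struct("!HBQQBBBIIH")
_CSUM = struct.Struct("!H")
HEADER_ID = 0xC0DE  # made-up identifier value
FIELDS = ("header_id", "version", "timestamp_ns", "packet_id", "initial_ttl",
          "tos", "ip_protocol", "vector_id", "period_id", "payload_checksum")


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones-complement checksum, used here only as a stand-in."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pack_metric_header(version, timestamp_ns, packet_id, initial_ttl, tos,
                       ip_protocol, vector_id, period_id, payload: bytes) -> bytes:
    """Build the header body, then append a checksum over that body."""
    body = _BODY.pack(HEADER_ID, version, timestamp_ns, packet_id, initial_ttl,
                      tos, ip_protocol, vector_id, period_id,
                      internet_checksum(payload))
    return body + _CSUM.pack(internet_checksum(body))


def unpack_metric_header(header: bytes) -> dict:
    """Verify the header checksum and return the fields by name."""
    body = header[:_BODY.size]
    (stored,) = _CSUM.unpack(header[_BODY.size:_BODY.size + _CSUM.size])
    if internet_checksum(body) != stored:
        raise ValueError("metric header checksum mismatch")
    return dict(zip(FIELDS, _BODY.unpack(body)))
```

Recording the initial TTL and TOS inside the header is what would let a receiver detect hop-count and DSCP changes, by comparing them with the values seen in the arriving IP header.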
One-Way Measurements • Accurate • 64-bit hardware timestamps • 12.5 ns clock synchronized by GPS (internal), 1 PPS and IRIG-B, and/or NTP • All counters are 64-/128-/256-bit • Continuous • Send active measurements continuously • Calculate results every 5 minutes • Comprehensive • Over 65 IP Metrics • Delay (latency), jitter, loss, outages • Out-of-order, loss patterns, fragmentation, hop count and hop changes, DSCP changes, duplicates, corruptions
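A few of these metrics can be derived from per-packet transmit and receive timestamps, as in the sketch below. It assumes both cNodes' clocks are synchronized (one-way delay is meaningless otherwise), treats packet IDs as increasing in send order, and defines jitter as the mean absolute difference between consecutive one-way delays. These definitions are assumptions; the slides do not give the cNodes' exact formulas.

```python
from __future__ import annotations


def one_way_metrics(sent: dict[int, int], received: list[tuple[int, int]]) -> dict:
    """sent maps packet ID -> transmit time (ns); received lists
    (packet ID, receive time in ns) in arrival order."""
    delays: dict[int, int] = {}
    duplicates = 0
    out_of_order = 0
    highest_id = None
    for packet_id, rx_ns in received:
        if packet_id not in sent:
            continue  # not part of this vector
        if packet_id in delays:
            duplicates += 1  # only the first copy contributes a delay
            continue
        delays[packet_id] = rx_ns - sent[packet_id]
        if highest_id is not None and packet_id < highest_id:
            out_of_order += 1  # arrived after a later-sent packet
        highest_id = packet_id if highest_id is None else max(highest_id, packet_id)

    ordered = [delays[i] for i in sorted(delays)]  # delays in send order
    diffs = [abs(b - a) for a, b in zip(ordered, ordered[1:])]
    lost = len(sent) - len(delays)
    return {
        "lost": lost,
        "loss_ratio": lost / len(sent) if sent else 0.0,
        "duplicates": duplicates,
        "out_of_order": out_of_order,
        "mean_delay_ns": sum(ordered) / len(ordered) if ordered else None,
        "jitter_ns": sum(diffs) / len(diffs) if diffs else None,
    }


sent = {1: 0, 2: 1000, 3: 2000, 4: 3000}
received = [(1, 500), (3, 2700), (2, 2900), (3, 2750)]
metrics = one_way_metrics(sent, received)
```

In this toy trace, packet 4 is lost, packet 3 arrives twice, and packet 2 arrives after packet 3, so the sketch reports one loss, one duplicate, and one out-of-order arrival.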
One-Way Measurements • Scalable • Highly distributed system • Results computed at cNodes • cProt allows minimal communication with cNodes for configuration and data gathering • Operationally: system designed to be self-maintaining • Scientific • Methodology designed from years of test and measurement experience • Statistical accuracy – Pullin papers (Caltech) • Accountable • Event lists account for power failures, link failures, time synchronization changes, etc. • Comparable • Over time and topology
PART 3: Drawing Conclusions (and going to next steps)
Some conclusions drawn from the experience… • We disagree with the NetworkWorld article conclusion: outages were too significant to qualify providers as ‘telco grade’ • One-way measurements were hampered by the lack of GPS clock sources at 85% of the sites under test • Full set of 70 IP metrics used successfully to analyze anomalous behavior • Currently, the majority of ISPs do not have advanced IP measurement capabilities deployed on their networks