NANOG panel, 6/11/2002 Mike Lloyd CTO
Who Buys This, and Why? Enterprise Web Sites & Hosters
• Performance gains
• Brownout avoidance
• Transit ISP cost savings w/o performance sacrifice
• Operational cost savings
[Diagram labels: Headquarters; ISP A, ISP B, ISP C, ISP D, ISP E, ISP F; Internet]
Who Buys This, And Why? Enterprise VPN Customers
• IP VPN deployment stalled over security and performance
• Security has been “solved”; Smart Routing provides the performance
[Diagram labels: Headquarters; Regional Office; Regional Office; ISP A, ISP B, ISP C, ISP D, ISP E, ISP F, ISP G; Internet]
Who Buys This, And Why? Service Providers
• Higher quality service at lower cost
• Brownout reduction => lower operating costs
• Efficient load control => delay in purchase of next b/w upgrade
• Particularly attractive to Tier 2 & 3 (transit costs) and Hosters (higher quality service)
[Diagram labels: End-users; ISP A, ISP B, ISP C; Internet; Destinations]
Is There a Performance Problem?
• Keynote time (avg): 3.9 sec
• Porivo time (avg): 8.2 sec
• Worst mean time: 34.0 sec
• Worst 20 Percent (avg): 48.5 sec
[Chart: Typical Web Page circa 2001, Keynote 40 (115k). Vertical axis: Probability (.000 to .048). Horizontal axis: Business User Load Time (seconds), 5.31 to 92.33, running from Good to Worst. Region label: Brownouts.]
When she was good, She was very good indeed, But when she was bad she was horrid. -- Henry Wadsworth Longfellow
Source: NetForecast model of a major Web site with a specific distribution of users.
“The Answer, My Friend, Lies In Measurement” (apologies to kc & CAIDA)
• The edges of the network have the strongest motivation to optimize
• They also have the best data for it: their existing traffic!
• It’s the end to end principle
• “The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system.” -- Saltzer et al, quoted in RFC 1958
RouteScience’s Approach to Measurement
We have a unique approach, based on Web traffic
• HTML isn’t all traffic, but it’s ubiquitous
• For sources and sinks of content, there’s more than enough data in existing traffic
• Just watching traffic come and go is fine, but what about alternative paths?
• HTML has the curious property that clients ask servers where to find things
• Therefore reserve one existing object (usu. a “spacer GIF”) to study performance
• As users request content, direct them to the measurement device for the test object, and serve it over alternate paths
Why This Approach?
Inline measurements give you:
• Real time comparison of active and inactive paths, before any changes are made
• Visibility into the benefits (and shortcomings) of each ISP
• Save money, but without compromising performance
• No Traceroute
  • If you can measure through firewalls, or test locations which accept probes, why use traceroute?
• No Automatic Pings
  • We do not respond to observed events by increasing testing
Per Prefix, Per Link, Real Time Differences
[Screenshot callouts]
• Original BGP
• Prefixes
• Sorted by Diff (in ms): shows largest fixable problems
• Code “C”: Update sent, confirmed by router
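
The "sorted by Diff" view can be approximated as follows. This is a minimal sketch under stated assumptions: the input shapes, the function name `rank_fixable`, and the definition of Diff (time on the BGP-selected link minus time on the best measured link) are illustrative guesses, not taken from the product.

```python
def rank_fixable(measurements, bgp_choice):
    """Rank prefixes by how much faster the best link is than BGP's link.

    measurements: {prefix: {link: response_time_ms}}  (assumed shape)
    bgp_choice:   {prefix: link currently selected by BGP}
    Returns (prefix, bgp_link, best_link, diff_ms) tuples, largest diff first.
    """
    rows = []
    for prefix, by_link in measurements.items():
        current = bgp_choice[prefix]
        best = min(by_link, key=by_link.get)  # link with the lowest time
        diff = by_link[current] - by_link[best]
        if diff > 0:  # keep only prefixes where a better link exists
            rows.append((prefix, current, best, diff))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Made-up measurements for three documentation prefixes.
measurements = {
    "198.51.100.0/24": {"ISP A": 180, "ISP B": 95},
    "203.0.113.0/24": {"ISP A": 70, "ISP B": 72},
    "192.0.2.0/24": {"ISP A": 140, "ISP B": 110},
}
bgp_choice = {prefix: "ISP A" for prefix in measurements}

for row in rank_fixable(measurements, bgp_choice):
    print(row)
```

Prefixes where BGP already uses the fastest link are dropped, so the top of the list corresponds to the "largest fixable problems".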
Published Study of ISP Price/Performance
[Charts: Traffic Distribution, BGP (before PathControl); Traffic Distribution, PathControl (based on performance)]
• With PathControl, every pair of providers (even the cheapest) can perform significantly better than any pair (even the most expensive) with BGP
• Conclusion: Transit buyers armed with this technology can get more performance for less money
Paper at http://www.routescience.com/cgi-bin/isp.cgi
Why Should ISPs Care?
• Carrot:
  • It’s good for you if people use this technology
  • Self-sufficient customers => fewer trouble tickets
  • More VPN traffic on the Internet
  • Get your fair share of traffic, not the BGP share
  • Our customers increase ISP diversity, because of reduced risk
  • At last, a reward for running a better network!
• Stick:
  • Customers gain control they don’t exercise today
  • Customers armed with a new benchmark, based on what they want
Use the technology!
BGP Beats Chance, But Not By Much
• BGP is only slightly better than chance at selecting performance winners
• Research presented at ISMA ’01
• (What causes the small advantage?)
Slides at http://www.caida.org/outreach/isma/0112/talks/mike/
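
One way to make "better than chance" concrete is to compare how often BGP's selected link is the measured winner against the rate a uniformly random pick would achieve. The sketch below is an assumed formulation, not the methodology of the ISMA talk; the function name `win_rates`, the input shapes, and the sample numbers are hypothetical, and ties between links are ignored.

```python
def win_rates(measurements, bgp_choice):
    """Return (bgp_win_rate, chance_win_rate) over all measured prefixes.

    measurements: {prefix: {link: response_time_ms}}  (assumed shape)
    bgp_choice:   {prefix: link selected by BGP}
    """
    total = len(measurements)
    # Count prefixes where BGP's link is also the fastest measured link.
    bgp_wins = sum(
        1
        for prefix, by_link in measurements.items()
        if bgp_choice[prefix] == min(by_link, key=by_link.get)
    )
    # A uniformly random pick wins with probability 1/(number of links).
    chance = sum(1 / len(by_link) for by_link in measurements.values()) / total
    return bgp_wins / total, chance


# Made-up data: four prefixes, two candidate links each.
measurements = {
    "192.0.2.0/25": {"ISP A": 90, "ISP B": 120},
    "192.0.2.128/25": {"ISP A": 60, "ISP B": 85},
    "198.51.100.0/24": {"ISP A": 150, "ISP B": 100},
    "203.0.113.0/24": {"ISP A": 40, "ISP B": 75},
}
bgp_choice = {prefix: "ISP A" for prefix in measurements}

bgp_rate, chance_rate = win_rates(measurements, bgp_choice)
print(f"BGP picks the winner {bgp_rate:.0%} of the time; chance would {chance_rate:.0%}")
```

The sample numbers are chosen only to exercise the function and say nothing about the size of the real gap.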
[Figure, numbered panels: 1. Raw HRTTs (ms) for ISP A, ISP B, ISP C, ISP D; 2. User Experience; 3. MOS Improvement for ISP A, ISP B, ISP C, ISP D; 4. User Experience. Time axis: 7pm, midnight, 5am, 10am, 3pm EST. Series: BGP, PathControl.]
• Internal Problem: ISP A not the best way to reach ISP A’s own address space!
Brownout Example: t - 1 hour
[Charts: BGP Distribution; Performance Distribution (adjusted for cost)]
• Steady state prior to event
• Top 250 prefixes
Brownout Example: t - zero
• Event begins!
• BGP unchanged
• One provider drops to 3 prefixes
• 7.48 Times Faster than BGP
Brownout Example: t + 1 hour
• Times Faster reduced
• Provider NOC intervention? If so, not good enough yet
Brownout Example: t + 2 hours
• Two hours on
• Slight increase in traffic
Brownout Example: t + 4 hours
• Event over
• Routing is back to steady state
Per Prefix History Of Brownout Event
• Up to 40x speedup per prefix
• Some prefixes on-net
Real World Application Result
• User experience before and after use of our device
• Results from customer’s own app monitoring, not just network time
• Network well tuned by operators in advance, but application still gets significantly faster