NANOG Panel on Smart Routing June 11th, AM
Smart Routing • I) What is Smart Routing or Route control? • II) Why do Enterprise Customers want it? • III) What BGP issues has Smart Routing uncovered?
Panel Introductions • Netvmg: Jeremy Johnson • Opnix: Aaron Britt • Proficient: Robert Bays • Route Science: Mike Lloyd • Sockeye: Brandon Ross
Introduction to Smart Routing • Enterprise myths and reality • What does Smart Routing do? • Traffic flow measurement • Databases • Algorithms • What might change for ISPs
Enterprise Myths and reality • Myth: “The Internet is designed to be highly reliable and efficient even in the event of a nuclear attack” • Reality (no surprise) • Routing problems occur • Failures are noticed by customers • Bandwidth is wasted due to over-provisioning • (Diagram: ISP 1, ISP 2, ISP 3, ISP 4, ISP 5)
Enterprise Reality • Most Enterprise routing does not use BGP Policy • BGP left to tie breakers • Routing in ISPs impacts customers • Performance is in the eye of the user • Function(metrics) = goodness measure • Shortest path does not mean best performance
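The point that BGP tie breakers and shortest paths do not track performance can be illustrated with a minimal sketch. It assumes a simplified decision order (local_pref, then AS path length, then MED, then a final name-based tie breaker); real BGP has more steps and only compares MED between routes from the same neighbor AS. The `Path` class, the ISP labels, and the RTT figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    next_hop: str           # hypothetical label for the upstream ISP
    local_pref: int         # higher is preferred
    as_path_len: int        # shorter is preferred
    med: int                # lower is preferred
    measured_rtt_ms: float  # measured out of band; BGP never sees this


def bgp_preference_key(p: Path):
    # Simplified ordering: local_pref, AS path length, MED, then the
    # next-hop name as a stand-in for the final router-ID tie breaker.
    return (-p.local_pref, p.as_path_len, p.med, p.next_hop)


def bgp_best(paths):
    """Path a policy-free BGP speaker would pick under this simplified order."""
    return min(paths, key=bgp_preference_key)


def fastest(paths):
    """Path with the lowest measured round-trip time."""
    return min(paths, key=lambda p: p.measured_rtt_ms)


# Hypothetical measurements: the shorter AS path is also the slower one.
paths = [
    Path("isp1", local_pref=100, as_path_len=2, med=0, measured_rtt_ms=180.0),
    Path("isp2", local_pref=100, as_path_len=3, med=0, measured_rtt_ms=45.0),
]

print("BGP picks:", bgp_best(paths).next_hop)
print("Fastest is:", fastest(paths).next_hop)
```

With no policy configured, both paths share the default local_pref, so the choice falls to AS path length, and the measured RTT plays no part in it.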
I) What is Smart Routing? • Introduction to the basics of Smart routing • 6 Basic Questions • Each Vendor’s technical approach
Smart Routing: • Measures and probes the network • Routes and traffic flows at “important points” • UDP/TCP active probes (“ping-like” class, “traceroute” class) • Traffic flows (snooped, sniffed, served) • Traffic level measurement (NetFlow, SNMP) • BGP route data • Stores information (what you measure determines what you keep) • Selects optimal routes • Routes and algorithms are the magic • Inserts better routes in IBGP • Reports
Quote from Bill Woodcock • “What to measure? Loss, latency, jitter and path length and changes are obvious metrics, but where do you measure to and from? Do you measure from the desktop machine of whoever buys your software, or do you measure from somewhere or some large set of somewheres which might be more representative of the Internet overall, at the risk of being less representative of the customer themselves? Do you measure to some set of generic frequently-viewed web sites, although this is likely to annoy the proprietors of those sites, if the tool becomes popular? Or to some set of routers within the backbone infrastructure, although someone may get wise and put them on private addresses or cause them to stop wasting cycles responding to your tool? Is there even a right answer to this? It may be that one size doesn't fit all.” • From a recent NANOG email thread (6/6/02) on diagnostic tools
Path Monitoring Continuum
Path Evaluation Continuum
Monitoring mechanisms – 1 • Basic active tests • Send ICMP, get ICMP: gives you an RTT or a loss • Send UDP, get UDP: gives you an RTT or a loss, assuming the far end is a UDP responder • Send ICMP (or any packet) with a low TTL, get ICMP TTL exceeded: gives you an RTT, a loss, or a failure • Synthetic transaction tests • Send TCP handshake and close • Send full HTML transaction • (Diagram: packet timelines for ICMP RTT, UDP RTT, and HRTT over a full HTML transaction: SYN, SYN/ACK, ACK, html, ACK, data, FIN/ACK, FIN/ACK, ACK)
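The "send TCP handshake and close" test can be sketched as follows. This is a minimal illustration, not any vendor's probe: it times `connect()`, which returns once the three-way handshake completes, and treats any socket error or timeout as a loss. The function names and the loopback demo are hypothetical; the demo uses a local listener so the sketch runs without probing anyone's hosts.

```python
import socket
import time


def tcp_handshake_rtt(host, port, timeout=2.0):
    """Time a TCP connect (SYN, SYN/ACK, ACK) and close immediately.

    Returns the elapsed time in seconds, or None if the connection
    fails or times out (counted as a loss).
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


def demo_local():
    """Probe a throwaway loopback listener instead of a real site."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)                # kernel completes handshakes for us
    port = listener.getsockname()[1]
    try:
        return tcp_handshake_rtt("127.0.0.1", port)
    finally:
        listener.close()


rtt = demo_local()
print("handshake RTT (s):", rtt)
```

A real deployment would repeat the probe at intervals and track loss rate and RTT distribution rather than rely on a single sample.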
Monitoring mechanisms – 2 • Basic passive tests • monitor flows on span port • monitor flows on servers • Beyond the simple query/response • Netvmg – traffic flows • Opnix – network map queries • Proficient – packet train • Route Science – HRTT trip (web site, TCP) • Sockeye – 1K pull from web site
Select the best route • Select best route based on quality measures • Metrics you measure • Total quality measure = goodness-function(metrics measured) • Thresholds • Global threshold • goodness(metrics) > threshold: do something • Specific threshold • goodness(quality1) > threshold1: do something
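One way to read the goodness function and the two kinds of threshold is sketched below. Everything here is an assumption for illustration: the metric names, the weights, the choice to score a path as the negated weighted sum of its impairments, and the rule of switching only when the gain over the current path exceeds the global threshold. The slides do not say how any vendor actually combines metrics.

```python
# Hypothetical weights: how much each unit of impairment costs.
WEIGHTS = {"loss_pct": 10.0, "latency_ms": 1.0, "jitter_ms": 2.0}


def goodness(metrics, weights=WEIGHTS):
    """Higher is better: the negated weighted sum of measured impairments."""
    return -sum(weights[name] * metrics[name] for name in weights)


def pick_path(current, measurements, global_threshold=20.0):
    """Global threshold: move only if the best path beats the current one
    by more than global_threshold, which damps route flapping."""
    best = max(measurements, key=lambda name: goodness(measurements[name]))
    gain = goodness(measurements[best]) - goodness(measurements[current])
    return best if gain > global_threshold else current


def exceeded_limits(metrics, limits):
    """Specific thresholds: list each metric that is over its own limit."""
    return [name for name, limit in limits.items() if metrics[name] > limit]


# Hypothetical per-path measurements.
measurements = {
    "isp1": {"loss_pct": 2.0, "latency_ms": 120.0, "jitter_ms": 10.0},
    "isp2": {"loss_pct": 0.0, "latency_ms": 60.0, "jitter_ms": 5.0},
}

print(pick_path("isp1", measurements))
print(exceeded_limits(measurements["isp1"], {"loss_pct": 1.0}))
```

Comparing the gain to a threshold, rather than always taking the top score, is a design choice that trades some optimality for route stability.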
Steering traffic • Options on the route sent back: • Local_pref lower, no local_pref • Same prefix or more specific • AS path zeroed or same • Do or do not send to originator • Healthy habits: No Export community, zero AS path • (Diagram: AS 1 with three EBGP sessions; NLRI 1 is learned with local_pref 10 on Path 1, local_pref 10 on Path 2, and local_pref 12 on Path 3; routes for NLRI 1 via Path 1 are sent back with local_pref 5 and No Export)
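The "more specific" option can be sketched under a simplified forwarding model: longest prefix match first, then highest local_pref among routes for the same prefix. The `Route` class, the documentation prefix 192.0.2.0/24, and the path labels are hypothetical; the local_pref values mirror the diagram. The sketch shows why an injected more-specific route can attract traffic even with a lower local_pref, and why marking it No Export keeps it from leaking to neighbors.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network
    path: str
    local_pref: int
    no_export: bool = False  # models the No Export community


def select(routes, destination):
    """Longest prefix match, then highest local_pref (simplified)."""
    dest = ipaddress.ip_address(destination)
    matching = [r for r in routes if dest in r.prefix]
    if not matching:
        return None
    return max(matching, key=lambda r: (r.prefix.prefixlen, r.local_pref))


def exportable(routes):
    """Routes that may be advertised to EBGP neighbors."""
    return [r for r in routes if not r.no_export]


net = ipaddress.ip_network("192.0.2.0/24")
learned = [
    Route(net, "path1", 10),
    Route(net, "path2", 10),
    Route(net, "path3", 12),
]

# Hypothetical injected route: half of the prefix steered onto path1.
injected = Route(ipaddress.ip_network("192.0.2.0/25"), "path1", 5, no_export=True)
table = learned + [injected]

print(select(learned, "192.0.2.10").path)   # before injection
print(select(table, "192.0.2.10").path)     # covered by the /25
print(select(table, "192.0.2.200").path)    # outside the /25
```

Sending back the same prefix instead would require the injected route to win the normal BGP comparison, which is where the local_pref and AS path options come in.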
Impact on Service Providers • Enterprises will route around problems • Complaints may get specific • SLAs may tighten • Enterprises may tune down their over-provisioning
Panel Answers 6 Questions • What is your product offering? • How do you define “quality” for traffic flows for the Enterprise customer? • What tools do you use to measure traffic flow quality? • What type of BGP routing data do you keep? • How does your product steer traffic in the right direction? • How does your product interact with other products in the Enterprise (routers, firewalls, web caches, and packet optimizers)?
3) Measurements and Metrics • Tools • Active probes: pings, traceroutes (ICMP, UDP), UDP proprietary probes, synthetic TCP probes • Passive probes: data flow monitors, participating in html applications, port sniffing • Additional data sources: NetFlow, SNMP • Databases • Now casting (traffic statistics) • Trending databases • BGP route information
Active Probes – “ping like” (1) “Ping like probes” Locations automatically probed
Active Probes – “ping like” (2) “ping like” Locations able to probe
Active Probes: “traceroute like” (1) Locations automatically probed Probe mechanism Customer sites Internet sites Vendor
Active Probes: “traceroute like” (2) Locations that can be probed Probe mechanism Customer sites Internet sites Vendor
Data flow monitoring Quantity Quality Locations monitored
Event and Quality Threshold Path Evaluation Active Probe sent Probe Query Interval Number
4) BGP Routing Data Use in the product or service R & D use Long term Now casting *1 user selected
5) Route steering Configurable Default
6) Interacts with other Enterprise Devices • Routers • Gets routes and gives better routes • SNMP, NetFlow data • Firewalls • Snoops, sniffs, listens on wire • Other devices – no interaction • Web caches • Packet optimizers • DoS detection
Why Do Enterprises Buy Smart Routing? • What improvements to traffic flow can be obtained using the product? • How do you prove to customers that you are improving traffic? • Using the box, will the customer reduce the number of required connections?
Vendor Presentations • Netvmg • Opnix • Proficient • Route Science • Sockeye
III) BGP Issues that Smart Routing Highlights • What % of the time does BGP pick the best route? • What BGP Policy does the Enterprise use to pick the best route? • What BGP tie breaking is used? • Why doesn’t BGP pick the best paths? • Where are the problems in the Internet? • Edge of the Enterprise • Tier 1 provider has interior problems? • IBGP convergence (or lack thereof) problems? • Exchange points • Others? • What impact does Multi-Protocol BGP have on Enterprise BGP?
Enterprise Configurations • Routing Policy • Those who use policy, beat it up • Those who don’t use policy, use tie breaking • Routers and Links to Internet • Common topology 1 • 2 routers, 1 link each • Grows until many links (1-8 links) • Common topology 2 • 4-6 routers • 5 links per router
Vendor slides follow • Netvmg • Opnix • Proficient • Route Science • Sockeye