Internet Routing Instability and its Origins. Ilia Ferdman, Lilia Tsvetinovich. Abstract. Problems discussed: Internet Routing Instability; Origins of Internet Routing Instability. Internet Routing Instability. Defined as rapid fluctuation of network reachability and topology information
Internet Routing Instability and its Origins • Ilia Ferdman • Lilia Tsvetinovich
Abstract • Problems discussed • Internet Routing Instability • Origins of Internet Routing Instability
Internet Routing Instability • Defined as rapid fluctuation of network reachability and topology information • Also referred to as “route flap”
Origins of Routing Instability • Router configuration errors • Transient physical and data link problems • Software bugs
Primary Effects • Instability can lead to • Increased packet loss • Delays in network convergence • Additional resource overhead (memory, CPU) • Imminent “death of the Internet”
Internet Structure • Comprised of interconnected regional and national backbones • Large public exchange points are the “core” of the Internet
Internet Structure (cont.) • BSP – Backbone service provider • EP – Exchange point • [Figure: BSP 1 to BSP 7 and EP 1 to EP 4]
Internet Structure (cont.) • Backbone service providers exchange • Traffic • Routing information • Backbones in the core maintain default-free routing tables
Internet Structure (cont.) • Autonomous systems (AS) • Distinct routing policies • Connect to private or public exchange points • Peer border routers in each AS exchange reachability information for prefixes • Prefixes – IP address blocks • Exchange information through BGP
Border Gateway Protocol • BGP vs. IGRP & OSPF • BGP: incremental protocol; uses TCP; limits distribution of routing information • IGRP, OSPF, etc.: interior protocols; use datagram service; flood network with all known routing table entries
BGP (cont.) • Allows configuration for policy (MED) • MED – Multi-Exit Discriminator • ASPATH – list of AS numbers • BGP updates • Announcements • Withdrawals
BGP updates - Withdrawals • Explicit Withdrawals • Implicit Withdrawals
BGP updates - Withdrawals • Explicit Withdrawals • [Figure: routers R1, R2 and R3]
BGP updates - Withdrawals • Implicit Withdrawals • [Figure: routers R1 and R2]
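The two withdrawal kinds can be illustrated with a toy per-peer route table. This is a minimal sketch, not router code: `Route`, `PeerRib` and the example prefix and AS numbers are hypothetical names chosen for illustration, and it assumes the usual BGP behaviour that a new announcement for a prefix replaces the previous route from the same peer.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Route:
    """Attributes a peer announced for one prefix (hypothetical minimal model)."""
    as_path: Tuple[int, ...]
    next_hop: str


class PeerRib:
    """Routes learned from a single BGP peer, keyed by prefix."""

    def __init__(self) -> None:
        self.routes: Dict[str, Route] = {}

    def announce(self, prefix: str, route: Route) -> Optional[Route]:
        """Install a route; return the route it implicitly withdrew, if any."""
        replaced = self.routes.get(prefix)
        self.routes[prefix] = route
        return replaced

    def withdraw(self, prefix: str) -> Optional[Route]:
        """Explicit withdrawal: remove the prefix; return what was removed."""
        return self.routes.pop(prefix, None)


rib = PeerRib()
rib.announce("192.0.2.0/24", Route((64500, 64501), "10.0.0.1"))
# A second announcement for the same prefix implicitly withdraws the first.
old = rib.announce("192.0.2.0/24", Route((64500, 64502), "10.0.0.2"))
# An explicit withdrawal leaves the prefix unreachable via this peer.
gone = rib.withdraw("192.0.2.0/24")
```

Here `old` holds the route that was implicitly withdrawn, and `gone` holds the route removed by the explicit withdrawal.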
BGP (cont.) • Allows configuration for policy (MED) • ASPATH • BGP updates • Announcements • Withdrawals • Performance expectations for stable wide-area networks
Methodology • Nine months of data, starting in January 1996 • Routing Arbiter project • Public exchange points: AADS, Mae-East, Mae-West, PacBell, Sprint
Methodology • Mae-East backbone service providers: ANS, BBN, MCI, Sprint and UUNet • RAP – Routing Arbiter Project • Route Servers used to collect information • 12 gigabytes of compressed data
Types of Routing Instability • BGP updates Instability rate • Forwarding instability • Routing Policy Fluctuations • Pathological updates • Instability – instance of forwarding instability or policy fluctuations
Possible impacts • Increase in cache misses • CPU & memory problems • Route “flap storm” • Forwarding loops
Route Caching Architecture • Routing table cache of destination and next-hop lookups • Routing table is too big to keep in main memory • Instability causes an increase in cache misses • Increased load on CPU
Route Caching Architecture • Possible solution: • Full routing table in main memory
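Why instability raises the miss rate can be seen in a small simulation. This is a sketch under stated assumptions: the slides do not name a cache policy, so the LRU eviction and the invalidate-on-update rule below are illustrative choices, and `RouteCache` is a hypothetical class.

```python
from collections import OrderedDict


class RouteCache:
    """Small LRU cache in front of a full (slow-path) routing table."""

    def __init__(self, full_table, capacity):
        self.full_table = dict(full_table)  # destination -> next hop
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, dest):
        if dest in self.cache:
            self.cache.move_to_end(dest)  # mark as most recently used
            self.hits += 1
            return self.cache[dest]
        # Cache miss: fall back to the full table (the CPU-expensive path).
        self.misses += 1
        next_hop = self.full_table[dest]
        self.cache[dest] = next_hop
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return next_hop

    def route_update(self, dest, next_hop):
        """A routing change invalidates the cached entry (assumed policy)."""
        self.full_table[dest] = next_hop
        self.cache.pop(dest, None)


cache = RouteCache({"10.0.0.0/8": "peer-a"}, capacity=2)
cache.lookup("10.0.0.0/8")                  # miss: first lookup
cache.lookup("10.0.0.0/8")                  # hit: served from cache
cache.route_update("10.0.0.0/8", "peer-c")  # a flap invalidates the entry
after_flap = cache.lookup("10.0.0.0/8")     # miss again
```

Each route update forces the next lookup for that destination back onto the slow path, so a flapping route keeps generating misses.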
CPU & Memory Problems • Under normal conditions the CPU can manage the router’s computational needs • Instability places large demands on a router’s CPU • Keep-Alive packets are delayed
Route “flap storm” • Overloaded router marked as unreachable • Peer routers choose alternative paths • Peers update their peers • “Down” router recovers and tries to re-initiate peering sessions • Large state dump transmissions are generated • Increased load causes more routers to fail
Route “flap storm” (cont.) • Possible solution: • Higher priority to Keep-Alive messages
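The idea of prioritising Keep-Alive messages can be sketched with a priority queue. This is a minimal illustration that assumes only two message kinds; `MessageQueue` and the payload strings are hypothetical, and real routers schedule their work differently.

```python
import heapq
from itertools import count

# Lower number = served first. The two kinds are an illustrative assumption.
KEEPALIVE, UPDATE = 0, 1


class MessageQueue:
    """Serve Keep-Alives before updates; keep FIFO order within a kind."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def push(self, kind, payload):
        heapq.heappush(self._heap, (kind, next(self._seq), payload))

    def pop(self):
        kind, _, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)


queue = MessageQueue()
queue.push(UPDATE, "update-1")
queue.push(UPDATE, "update-2")
queue.push(KEEPALIVE, "keepalive-1")  # arrives last, behind a backlog

order = [queue.pop()[1] for _ in range(len(queue))]
```

Even with a backlog of updates, the Keep-Alive is processed first, so the peer is less likely to declare the overloaded router unreachable.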
Forwarding loops • Defined as steady-state cyclic transmission of user data between a set of peers • Loop verification by checking ASPATH • Unconstrained routing policies
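Loop verification through ASPATH can be sketched in a few lines: a router rejects any route whose ASPATH already contains its own AS number. The function names and AS numbers below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def has_as_loop(as_path: Sequence[int], local_as: int) -> bool:
    """True if the local AS already appears in the received ASPATH."""
    return local_as in as_path


def accept_route(as_path: Sequence[int], local_as: int) -> bool:
    """Accept a route only if taking it cannot create an AS-level loop."""
    return not has_as_loop(as_path, local_as)


LOCAL_AS = 64500  # hypothetical local AS number

loop_free = accept_route([64501, 64502], LOCAL_AS)
looping = accept_route([64501, 64500, 64502], LOCAL_AS)
```

The second route is rejected because it has already passed through AS 64500.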
BGP Update Types • WA Different – WADiff • AA Different – AADiff • WA Duplicate – WADup • AA Duplicate – AADup • WW Duplicate – WWDup
BGP Update Types - WADiff • Explicit withdrawal • Unreachable route is replaced by alternative route • ASPATH or next-hop attribute differs • Forwarding instability
BGP Update Types - AADiff • Implicit withdrawal • Route is unreachable • Alternative path becomes available • Forwarding instability
WADiff and AADiff • Route is replaced by alternative one • WADiff: explicit withdrawal; forwarding instability • AADiff: implicit withdrawal; forwarding instability
BGP Update Types - WADup • Explicit withdrawal • Route explicitly withdrawn and then re-announced as reachable • Transient topological problems (link or router) • Forwarding instability or pathological behavior
BGP Update Types - AADup • Implicit withdrawal • Route is implicitly withdrawn and replaced by its duplicate • Duplicate route does not differ in ASPATH or next-hop attribute information • Policy fluctuations and pathological behavior
WADup and AADup • WADup: explicit withdrawal; forwarding instability; pathological behavior • AADup: implicit withdrawal; policy fluctuations; pathological behavior
BGP Update Types - WWDup • Repeated BGP withdrawals for a prefix that is unreachable • Pathological behavior
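BGP Update Types - Summary • WADiff: explicit withdrawal; forwarding instability • AADiff: implicit withdrawal; forwarding instability • WADup: explicit withdrawal; forwarding instability; pathological behavior • AADup: implicit withdrawal; policy fluctuations; pathological behavior • WWDup: pathological behavior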
WW Duplicate • Transmitted by routers of an AS that never previously announced reachability for the withdrawn prefixes
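The five update types can be told apart by tracking, per prefix, the last announced route and whether the prefix is currently reachable. This is a minimal sketch for a single peer: `UpdateClassifier` is a hypothetical class, routes are compared only on ASPATH and next hop, and a withdrawal for a never-announced prefix is counted as WWDup (an assumption based on the WW Duplicate slide).

```python
from typing import Dict, Optional, Sequence, Tuple

Route = Tuple[Tuple[int, ...], str]  # (ASPATH, next hop)


class UpdateClassifier:
    """Label each update from one peer with a type from the slides, or None."""

    def __init__(self) -> None:
        self.last_route: Dict[str, Route] = {}
        self.reachable: Dict[str, bool] = {}

    def announce(self, prefix: str, as_path: Sequence[int],
                 next_hop: str) -> Optional[str]:
        route = (tuple(as_path), next_hop)
        previous = self.last_route.get(prefix)
        was_reachable = self.reachable.get(prefix, False)
        self.last_route[prefix] = route
        self.reachable[prefix] = True
        if previous is None:
            return None  # first announcement: nothing to compare against
        same = route == previous
        if was_reachable:
            # Announcement after announcement: implicit withdrawal.
            return "AADup" if same else "AADiff"
        # Announcement after an explicit withdrawal.
        return "WADup" if same else "WADiff"

    def withdraw(self, prefix: str) -> Optional[str]:
        was_reachable = self.reachable.get(prefix, False)
        self.reachable[prefix] = False
        # Withdrawing an already unreachable prefix is a duplicate withdrawal.
        return None if was_reachable else "WWDup"


c = UpdateClassifier()
p = "192.0.2.0/24"
labels = [
    c.announce(p, [1, 2], "h1"),  # first announcement
    c.announce(p, [1, 2], "h1"),  # identical re-announcement
    c.announce(p, [1, 3], "h1"),  # different ASPATH
    c.withdraw(p),                # ordinary withdrawal
    c.withdraw(p),                # repeated withdrawal
    c.announce(p, [1, 3], "h1"),  # same route comes back
    c.withdraw(p),                # ordinary withdrawal
    c.announce(p, [1, 4], "h2"),  # different route comes back
]
```

For this sequence `labels` is `[None, "AADup", "AADiff", None, "WWDup", "WADup", None, "WADiff"]`.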
Instability Origins • Hardware configuration problems • Software bugs • Multi-homed sites • BGP implementation problems
Instability Origins – Hardware configuration • Internet growth -> traffic growth -> need for new hardware • Old hardware -> increase in number of updates: • CPU overload • Link failures • Small service providers use old hardware
Instability Origins – Hardware configuration • Cache architecture • Not all of the prefix table is kept in memory • Increase in number of updates -> increase in number of cache misses
Instability Origins – Software bugs • Use of old or improperly configured software is a cause of routing instability • Small service providers use old software
Instability Origins – Multi-homed sites • End-sites connect to the Internet via multiple Service Providers (SP) • Multi-homed customer prefixes require global visibility • Routers maintain longer prefixes • [Figure: a site connected to SP1, SP2 and SP3]
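Why longer prefixes must be kept can be shown with longest-prefix matching: a more specific entry overrides a covering aggregate, so it cannot be folded into it. This is a sketch using Python's standard `ipaddress` module; the table contents and the `longest_prefix_match` helper are hypothetical examples, and the linear scan is for clarity only.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(table: Dict[str, str],
                         destination: str) -> Optional[str]:
    """Return the next hop of the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    best_net = None
    best_hop = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best_net is None
                            or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop


# Hypothetical table: SP1 announces an aggregate, while a multi-homed
# customer's longer prefix is also reachable through SP2.
table = {
    "10.0.0.0/8": "SP1",
    "10.1.2.0/24": "SP2",
}

to_customer = longest_prefix_match(table, "10.1.2.7")
to_aggregate = longest_prefix_match(table, "10.9.9.9")
unknown = longest_prefix_match(table, "192.0.2.1")
```

Traffic for the customer follows the /24 through SP2, while the rest of the /8 still goes to SP1.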
Instability Origins – BGP implementation • Stateless BGP • Announcements or withdrawals are sent without checking • O(N*U) additional updates • N – number of routers • U – number of updates • There are better implementations
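The O(N*U) cost can be made concrete by counting messages. This is a sketch under an assumption the slide leaves open: that the missing "check" is whether a prefix was ever advertised to a given peer. Both speaker classes are hypothetical.

```python
class StatelessSpeaker:
    """Forwards every withdrawal to every peer, remembering nothing."""

    def __init__(self, peers):
        self.peers = list(peers)
        self.sent = []

    def withdraw(self, prefix):
        for peer in self.peers:
            self.sent.append((peer, "W", prefix))


class StatefulSpeaker:
    """Remembers what it advertised to each peer and withdraws only that."""

    def __init__(self, peers):
        self.peers = list(peers)
        self.advertised = {peer: set() for peer in self.peers}
        self.sent = []

    def announce(self, prefix):
        for peer in self.peers:
            self.advertised[peer].add(prefix)
            self.sent.append((peer, "A", prefix))

    def withdraw(self, prefix):
        for peer in self.peers:
            if prefix in self.advertised[peer]:
                self.advertised[peer].remove(prefix)
                self.sent.append((peer, "W", prefix))


peers = ["r1", "r2", "r3"]            # N = 3
never_announced = ["p1", "p2", "p3", "p4"]  # U = 4

stateless = StatelessSpeaker(peers)
stateful = StatefulSpeaker(peers)

stateful.announce("p0")
stateful.withdraw("p0")               # legitimate: 3 "A" + 3 "W" messages

for prefix in never_announced:
    stateless.withdraw(prefix)        # 3 peers x 4 withdrawals = 12 messages
    stateful.withdraw(prefix)         # suppressed: never advertised
```

The stateless speaker emits N*U = 12 redundant withdrawals, while the stateful one sends only the six messages for the prefix it actually advertised.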
Instability Origins – BGP implementation • Misconfigured interaction between different gateway protocols • [Figure: R1 (BGP) and R2 (OSPF)]