Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding
Andrey Ermolinskiy, Scott Shenker
University of California – Berkeley and ICSI
What’s the problem?
• One of the central goals of the Internet is continuous end-to-end connectivity
• BGP convergence is a major cause of connectivity disruption
• Routers operate on potentially inconsistent local views
• Temporary inconsistencies give rise to anomalies such as loops and black holes that disrupt end-to-end packet delivery
Example: transient routing loop with BGP
[Figure: AS-level topology with domains A–G and each domain's ranked routes to A. F: 1. ECBA, 2. GA. E: 1. CBA, 2. DBA. C: 1. BA, 2. DBA. D: 1. BA, 2. CBA. B withdraws route BA.]
• The routing loop between C and D incurs a temporary loss of connectivity between {B, C, D, E, F} and A.
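The loop in this example can be reproduced with a small sketch. It assumes the ranked route lists shown in the figure, adds direct routes for B and G (an assumption implied by the paths "BA" and "GA"), and models the inconsistent moment at which C and D have each dropped their route via B but have not yet heard the other's update. The names `ROUTES`, `next_hop` and `trace` are illustrative only.

```python
# Ranked routes to destination A, as read from the example figure.
# Each route is the AS path that follows the local domain.
ROUTES = {
    "F": ["ECBA", "GA"],
    "E": ["CBA", "DBA"],
    "C": ["BA", "DBA"],
    "D": ["BA", "CBA"],
    "B": ["A"],  # assumption: B reaches A directly before the withdrawal
    "G": ["A"],  # assumption: G reaches A directly, as implied by "GA"
}


def next_hop(domain, withdrawn):
    """Return the next hop of the best route whose next hop has not withdrawn."""
    for route in ROUTES[domain]:
        if route[0] not in withdrawn:
            return route[0]
    return None


def trace(src, dst, withdrawn, ttl=8):
    """Follow next hops from src until dst is reached, no route is left, or ttl hops pass."""
    path = [src]
    while path[-1] != dst and len(path) <= ttl:
        hop = next_hop(path[-1], withdrawn)
        if hop is None:
            break
        path.append(hop)
    return path


before = trace("E", "A", withdrawn=set())   # consistent state before the withdrawal
after = trace("E", "A", withdrawn={"B"})    # C and D have both fallen back to route 2
print(before)
print(after)
```

Before the withdrawal the packet from E follows E, C, B, A. Afterwards it bounces between C and D until the hop limit is reached, which is the transient loop the slide describes.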
Related Work
• Shrinking the convergence time window through BGP protocol extensions
  • Ghost flushing
  • Consistency assertions
• Protecting end-to-end packet delivery from adverse effects of convergence
  • R-BGP
    • Forward packets on pre-computed failover paths
    • Propagate root cause information to prevent loops
  • Consensus Routing
    • Enforce a globally consistent view via distributed snapshots and strategically delay adoption of incoming BGP updates
  • Anomaly-Cognizant Forwarding
Anomaly-Cognizant Forwarding (ACF)
• Approach
  • Accept routing anomalies as an unavoidable fact
  • Protect end-to-end packet delivery by detecting and recovering from anomalies on the forwarding path
• Main hypothesis
  • Several simple and lightweight extensions to conventional IP forwarding enable us to sustain packet delivery during periods of BGP instability
    • without the use of pre-computed backup paths
    • without modifying the core routing protocol or altering its timing dynamics
ACF Overview
• Domain S has anomalous forwarding state for destination D if S’s outgoing packets destined for D arrive back at S as a result of a routing loop.
• Main idea of ACF:
  • Detect occurrences of anomalous state
  • Avoid forwarding packets via domains that are known to have anomalous state
• Packet header fields:
  • pathTrace: each packet carries a list of prior AS-level hops
  • blackList: each packet carries a list of domains with anomalous state
[Figure: a packet travelling from S toward D, with a region of anomalous forwarding state on the path.]
ACF Overview
Forward (packet p) {
    if (localASNum in p.pathTrace)
        Move loop elements from p.pathTrace to p.blackList
    nextHop ← lookupNextHop(p.destAddr)
    if (nextHop in p.blackList)
        Invoke the control plane, look for alternate non-blacklisted routes in the RIB
    if (nextHop != NONE) {
        Append localASNum to p.pathTrace
        SendPacket(p, nextHop)
    } else
        Initiate recovery-mode forwarding for p
}
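The forwarding procedure above can be sketched in runnable form. This is a minimal illustration under stated assumptions, not the authors' implementation: the RIB is modelled as a dictionary from destination to a ranked list of next-hop ASes, "loop elements" are taken to be the hops recorded after the local AS's earlier appearance in pathTrace, and the function returns `None` where the pseudocode initiates recovery-mode forwarding. `Packet`, `forward` and the RIB layout are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dest: str
    path_trace: list = field(default_factory=list)   # prior AS-level hops
    black_list: set = field(default_factory=set)     # domains with anomalous state


def forward(p, local_as, rib):
    """Return the next-hop AS for p, or None if recovery-mode forwarding is needed.

    rib maps a destination to a ranked list of candidate next hops (assumption).
    """
    # Loop detection: the packet has already visited this AS.
    if local_as in p.path_trace:
        i = p.path_trace.index(local_as)
        # Assumption: the hops after our earlier visit form the loop.
        p.black_list.update(p.path_trace[i + 1:])
        del p.path_trace[i:]

    candidates = rib.get(p.dest, [])
    next_hop = candidates[0] if candidates else None

    # Preferred next hop is blacklisted: look for an alternate route in the RIB.
    if next_hop in p.black_list:
        next_hop = next((h for h in candidates if h not in p.black_list), None)

    if next_hop is None:
        return None  # caller initiates recovery-mode forwarding

    p.path_trace.append(local_as)
    return next_hop


# C and D point at each other after B's withdrawal.
p = Packet(dest="A")
hop1 = forward(p, "C", {"A": ["D"]})
hop2 = forward(p, "D", {"A": ["C"]})
hop3 = forward(p, "C", {"A": ["D"]})
print(hop1, hop2, hop3, p.black_list)
```

On the third call C finds itself in pathTrace, blacklists D, and is left without a usable route, so the sketch hands the packet to recovery-mode forwarding.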
ACF Recovery-mode forwarding
• If a router is unable to forward a packet because it does not have a valid non-blacklisted route (nextHop = NONE), it initiates recovery forwarding.
  • Chooses a recovery destination R from a static and well-known set of highly-connected Tier-1 domains.
  • Detours the packet through R.
• Intuition: R or some router along the path to R may know a working alternate route to the original destination.
[Figure: recovery destinations R1 and R2; the packet switches from normal-mode forwarding to recovery-mode forwarding at the point where nextHop = NONE.]
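A possible shape for the mode switch is sketched below. The header fields `dst` and `origDst` follow the walkthrough slides; everything else is an assumption made for illustration: the slides do not say how R is selected, so the sketch simply takes the first candidate that is not blacklisted, and it adds the local AS to blackList and clears pathTrace when the detour starts, mirroring the header states shown in the walkthrough. `RECOVERY_DESTINATIONS`, `start_recovery` and `resume_normal` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional

# Placeholder names for the static, well-known set of Tier-1 recovery destinations.
RECOVERY_DESTINATIONS = ["R1", "R2"]


@dataclass
class Packet:
    dst: str
    orig_dst: Optional[str] = None
    path_trace: list = field(default_factory=list)
    black_list: set = field(default_factory=set)


def start_recovery(p, local_as):
    """Detour p through a recovery destination; return False if none is usable."""
    choices = [r for r in RECOVERY_DESTINATIONS
               if r not in p.black_list and r != local_as]
    if not choices:
        return False
    if p.orig_dst is None:          # keep the original destination across re-detours
        p.orig_dst = p.dst
    p.dst = choices[0]              # assumption: first usable candidate is chosen
    p.black_list.add(local_as)      # assumption: the stuck domain blacklists itself
    p.path_trace.clear()
    return True


def resume_normal(p, local_as):
    """At the recovery destination, restore the original destination."""
    if p.orig_dst is not None and p.dst == local_as:
        p.dst, p.orig_dst = p.orig_dst, None
        p.path_trace.clear()
        return True
    return False


p = Packet(dst="A", black_list={"D"})
started = start_recovery(p, "C")
detour = p.dst
resumed = resume_normal(p, "R1")
print(started, detour, resumed, p.dst)
```

Because both modes only rewrite header fields, the same forwarding table lookup can serve the detour and the resumed path.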
Anomaly-Cognizant Forwarding
[Figure: animation on the topology of the transient routing loop example (domains A–G with the same ranked routes). Packet p is destined for A; successive frames show the following states of p.Header.]
• dst = A, origDst = (empty), pathTrace = [ C ], blackList = { }
• dst = A, origDst = (empty), pathTrace = [ C D ], blackList = { }
• dst = A, origDst = (empty), pathTrace = [ C D ], blackList = { D }. C initiates recovery forwarding through domain F.
• dst = F, origDst = A, pathTrace = [ ], blackList = { C D }
• dst = F, origDst = A, pathTrace = [ C ], blackList = { C D }
• dst = F, origDst = A, pathTrace = [ C ], blackList = { C D E }
• dst = F, origDst = A, pathTrace = [ C E ], blackList = { C D E }
• dst = F, origDst = A, pathTrace = [ ], blackList = { C D E }. F resumes normal-mode forwarding.
• dst = F, origDst = A, pathTrace = [ F ], blackList = { C D E }
• dst = F, origDst = A, pathTrace = [ F G ], blackList = { C D E }. In the final frame p is shown at A.
ACF: Observations
• ACF does not use pre-computed failover paths
  • Discovers alternate routes dynamically using state in the packet header
• The two forwarding modes make use of the same forwarding table
• Paths to recovery destinations are not assumed to be stable and anomaly-free
  • We protect recovery-mode forwarding using the same mechanism (pathTrace and blackList)
ACF: Preliminary Evaluation
• Evaluation metrics
  • Effectiveness in eliminating transient disconnectivity
  • Efficiency of alternate paths
  • Packet header overhead
ACF: Preliminary Evaluation
• Simulation methodology
  • CAIDA AS-level topology (27969 nodes) annotated with inferred inter-AS relationships
  • 12937 multihomed edge domains, 29426 adjacent provider links
• Provider link failure experiment
  • For each multihomed domain D and each provider link L
    • Fail L and simulate packet delivery from every other domain to D during convergence
  • Packet TTL = 32 hops
  • Recovery destinations = 10 highly-connected Tier-1 ISPs
[Figure: source domains S1–S4 sending packets to domain D.]
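The enumeration of failure cases in this experiment can be sketched as follows. The topology format (a dictionary from edge domain to its list of providers) and the name `failure_cases` are assumptions for illustration; the convergence and packet-delivery simulation itself is not shown.

```python
def failure_cases(providers):
    """Yield (domain, failed_link) for every provider link of every multihomed domain.

    providers maps an edge domain to the list of its provider ASes (assumed format).
    """
    for domain, provider_list in providers.items():
        if len(provider_list) < 2:
            continue  # the experiment covers multihomed domains only
        for provider in provider_list:
            yield domain, (domain, provider)


# Toy topology: D1 and D3 are multihomed, D2 is single-homed.
toy_providers = {"D1": ["P1", "P2"], "D2": ["P1"], "D3": ["P1", "P2", "P3"]}
cases = list(failure_cases(toy_providers))
print(len(cases))
```

Each yielded case would then drive one run: fail the link, let BGP reconverge, and record which sources lose connectivity to the domain in the meantime.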
ACF: Preliminary Evaluation
• Transient disconnection after a link failure
  • BGP with conventional forwarding
    • 51% of failure cases produce unwarranted disconnection
    • Widespread disconnection (>50% of ASes) in 17% of cases
  • BGP with ACF
    • No disconnection in 92% of failure cases
    • <1% of ASes see disconnection in 98% of failure cases
ACF: Preliminary Evaluation
• Transient path efficiency
  • Causes of path dilation in ACF
    • Transient loops
    • Detouring via a recovery destination
  • In 65% of failure cases that produce disconnectivity, ACF recovers packets using ≤ 2 extra hops
  • 9% of cases require 7 hops or more
[Figure legend: F – failure cases that produce transient disconnection with conventional forwarding]
ACF: Preliminary Evaluation
• Packet header overhead
  • Worst-case pathTrace – 20 entries
    • 40 bytes of overhead assuming 16-bit AS numbers
  • Worst-case blackList – 16 entries
    • 10 bytes of overhead for a Bloom filter with 1% error rate
[Figure: maximum number of pathTrace and blackList entries in a representative sample of failure cases.]
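The overhead arithmetic can be checked with a short sketch. The pathTrace figure follows directly from the slide (entries times AS-number width). For the Bloom filter the sketch uses the textbook sizing formula for an optimally configured filter, m = -n ln p / (ln 2)^2 bits, which is an assumption about how such a filter might be dimensioned; the slides do not state how their 10-byte figure was derived, and the textbook formula gives a larger value for 16 entries at a 1% false-positive rate.

```python
import math


def path_trace_bytes(entries, as_number_bits=16):
    """Bytes needed to list `entries` AS numbers of the given width."""
    return math.ceil(entries * as_number_bits / 8)


def bloom_filter_bytes(entries, false_positive_rate):
    """Bytes for an optimally sized Bloom filter (textbook formula, an assumption here)."""
    bits = math.ceil(-entries * math.log(false_positive_rate) / (math.log(2) ** 2))
    return math.ceil(bits / 8)


trace_overhead = path_trace_bytes(20)          # worst-case pathTrace from the slide
bloom_overhead = bloom_filter_bytes(16, 0.01)  # worst-case blackList, textbook sizing
print(trace_overhead, bloom_overhead)
```

The first value reproduces the 40 bytes quoted for pathTrace. The second is what the standard formula yields, so the quoted blackList overhead presumably rests on different sizing assumptions.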
Challenges / Concerns
• Feasibility of deployment
  • ACF adds fields to the packet header and modifies core IP forwarding logic.
• Packet processing overhead
  • Control plane is invoked only during periods of instability
  • Common case: check pathTrace and blackList. Both operations admit efficient implementation in hardware and parallelization.
• ACF and routing policies