Seamless Detection of Link and Node Failures for Local Protection in MPLS
Zartash Afzal Uzmi
Computer Science and Engineering
Lahore University of Management Sciences (LUMS)
Outline
• Background
• Forwarding and Routing in IP and MPLS Networks
• Network Service Requirements
• Protection Routing in MPLS
• Terminology: Types of Backup Paths
• Backup Bandwidth Sharing
• Activation Sets
• Failures and Backup Path Activation
• Distinguishable Failure Events: Ideal Case
• Actual Failures
• Control Plane Mechanism
• Outline of Proof
Forwarding and Routing
• Forwarding:
  • Passing a packet to the next-hop router
• Routing:
  • Computing the “best” path to the destination
• IP routing – includes routing and forwarding
  • Each router makes the routing decision
  • Each router makes the forwarding decision
  • IP routing is hop-by-hop
• MPLS routing
  • Only one router (the source) makes the routing decision
  • Intermediate routers make the forwarding decision
  • An MPLS path, or “virtual circuit”, from source to destination is created and is called an LSP (label switched path)
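The contrast between the two forwarding models can be sketched in a few lines of Python. This is only an illustration under assumed names: the topology S → 1 → 2 → D, the label values 17 and 42, and the table layouts are hypothetical and do not come from the slides.

```python
# Toy topology S -> 1 -> 2 -> D (hypothetical, for illustration only).

# IP: every router keeps its own routing result (destination -> next hop).
ip_tables = {
    "S": {"D": "1"},
    "1": {"D": "2"},
    "2": {"D": "D"},
}

def ip_forward(src, dst):
    """Hop-by-hop: each router looks up the destination on its own."""
    path = [src]
    while path[-1] != dst:
        path.append(ip_tables[path[-1]][dst])
    return path

# MPLS: only the source chooses the path (the LSP); it pushes the first label.
ingress = {"D": ("1", 17)}  # at S: destination -> (next hop, first label)

# Intermediate routers only swap labels: incoming label -> (next hop, outgoing label).
label_tables = {
    "1": {17: ("2", 42)},
    "2": {42: ("D", None)},  # None: the label is removed before the last hop
}

def mpls_forward(src, dst):
    """Label switching: intermediate routers never inspect the destination."""
    next_hop, label = ingress[dst]
    path = [src, next_hop]
    while label is not None:
        next_hop, label = label_tables[path[-1]][label]
        path.append(next_hop)
    return path
```

Both functions yield the same node sequence here; the difference is where the routing decision is made.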
Network Service Requirements
• Bandwidth-Guaranteed Primary Paths
  • MPLS can establish bandwidth-guaranteed paths
• Bandwidth-Guaranteed Backup Paths
  • Bandwidth remains provisioned in case of network failure
  • Two options for recovery from network failure:
    • Compute backup paths AFTER failures occur
    • Compute and install PRESET backup paths
• Minimal “Recovery Latency”
  • Recovery latency is the time that elapses between:
    • “the occurrence of a failure”, and
    • “the diversion of network traffic on a new path”
• Preset backup paths are needed for minimal latency
Protection in MPLS: Preset Backup Paths
[Figure: nodes S, 1, 2, 3, D; legend: primary path, backup path; labels: local protection, path protection]
• This type of “path protection” takes 100s of ms
• We need “local protection” to quickly switch onto backup paths!
nhop and nnhop paths
LOCAL PROTECTION (showing one LSP only)
[Figure: nodes A, B, C, D, E; legend: primary path, backup path; one nhop and one nnhop backup path are labeled]
• PLR: Point of Local Repair
• nhop protects a link only, e.g., (D,E)
• nnhop protects a link and a node, e.g., link (C,D) and node D
• All links and all nodes are protected!
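A minimal sketch of how nhop and nnhop backups cover a primary path. It assumes the primary LSP runs A, B, C, D, E (the figure's exact layout is not recoverable from the text), and the function name and record fields are hypothetical.

```python
def local_backups(primary):
    """For each PLR on the primary path, describe its local backup path.

    Assumption: a PLR gets an nnhop backup whenever a next-next hop exists,
    and an nhop backup on the last link, where no node is left to bypass.
    """
    plan = []
    for i, plr in enumerate(primary[:-1]):
        next_hop = primary[i + 1]
        if i + 2 < len(primary):
            # nnhop: rejoins the primary two hops downstream,
            # so it bypasses both the link and the next node.
            plan.append({"plr": plr, "kind": "nnhop",
                         "merge": primary[i + 2],
                         "protects": [(plr, next_hop), next_hop]})
        else:
            # nhop: rejoins at the next hop, so it bypasses the link only.
            plan.append({"plr": plr, "kind": "nhop",
                         "merge": next_hop,
                         "protects": [(plr, next_hop)]})
    return plan

plan = local_backups(["A", "B", "C", "D", "E"])
```

With this input, the entry for PLR C protects link (C,D) and node D, and the entry for PLR D protects link (D,E) only, matching the two examples on the slide.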
Opportunity cost of backup paths
• Protection requires that backup paths are set up in advance
  • Upon failure, traffic is promptly switched onto preset backup paths
• Bandwidth must be reserved for all backup paths
  • This reduces the number of primary LSPs that could otherwise be placed on the network
• Can we reduce the amount of “backup bandwidth” but still provide guaranteed backups?
  • YES: Try to share the bandwidth along backup paths
BW Sharing in Backup Paths
[Figure: nodes A, B, C, D, E, F, G; legend: primary path, backup path; LSP1 (BW: X) and LSP2 (BW: Y); link labels: X, Y, X+Y, max(X, Y)]
• Example: LSP1 has bandwidth X and LSP2 has bandwidth Y
• Sharing is possible IF links (A,B) and (C,D) do not fail simultaneously!
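The saving from sharing can be shown with a small calculation. This is a sketch only: the function name and the values X = 10, Y = 4 are hypothetical, and it considers a single link that carries the backup paths of both LSPs.

```python
def shared_link_reservation(x, y, can_fail_together):
    """Backup bandwidth to reserve on a link that carries both backup paths.

    If the two protected links can fail at the same time, both backups may be
    active together and the reservations add up. Otherwise at most one backup
    is active at a time, and the larger of the two is enough.
    """
    return x + y if can_fail_together else max(x, y)

X, Y = 10, 4  # hypothetical bandwidths of LSP1 and LSP2
without_sharing = shared_link_reservation(X, Y, can_fail_together=True)
with_sharing = shared_link_reservation(X, Y, can_fail_together=False)
```

Here sharing reserves max(X, Y) = 10 instead of X + Y = 14 on the common link.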
Activation Sets
• Can backup paths always share the bandwidth?
[Figure: two copies of a network with nodes A, B, C, D, E, showing the activation set for node B and the activation set for link (A,B)]
• Backup paths in the same activation set MUST NOT share bandwidth!
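One way to read the rule above: on each link, backup paths in the same activation set add up, while different activation sets can share. The sketch below assumes a single failure at a time; the data layout, path names, and bandwidths are hypothetical.

```python
from collections import defaultdict

def backup_reservations(backup_paths):
    """Return the backup bandwidth to reserve on each link.

    Each backup path is a dict with:
      'bw'           - its bandwidth
      'links'        - the links it traverses
      'activated_by' - the failures (link or node) that activate it
    The activation set of a failure is the set of paths it activates.
    """
    load = defaultdict(lambda: defaultdict(int))  # link -> failure -> bandwidth
    for path in backup_paths:
        for link in path["links"]:
            for failure in path["activated_by"]:
                # Same activation set: bandwidths must add up.
                load[link][failure] += path["bw"]
    # Different activation sets: share, so reserve only the largest one.
    return {link: max(per_failure.values()) for link, per_failure in load.items()}

paths = [
    {"bw": 5, "links": [("E", "F")], "activated_by": {"link A-B", "node B"}},
    {"bw": 3, "links": [("E", "F")], "activated_by": {"node B"}},
    {"bw": 6, "links": [("E", "F")], "activated_by": {"link C-D"}},
]
reservations = backup_reservations(paths)
```

On link (E,F) the activation set of node B needs 5 + 3 = 8, which exceeds the other two sets (5 and 6), so 8 is reserved rather than the unshared total of 14.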
Distinguishable Failure Events
• The point of local repair (PLR) somehow knows the type of failure!
• Focus on link (I,J) and node J, and recall:
  • nhop protects the link only, i.e., (I,J)
  • nnhop protects link (I,J) and node J
[Figure: nodes A, I, J, K, L; legend: primary path, backup path; nnhop: p1, nhop: p2, and a further backup path p3; PLR: Point of Local Repair]
• If node I finds that link (I,J) has failed: p1 and p2 are activated
• If node I finds that node J has failed: ONLY p1 is activated
  • p2 may share bandwidth with other nnhops that protect node J
Actual Failures
• Consider the failure of link (I,J)
  • Both p1 and p2 need to be activated anyway!
  • Knowing that this is a link failure will not save anything
• Consider the failure of node J
  • Only p1 needs to be activated (if the failure type is known!)
• What if node I doesn’t know the type of failure?
  • Two options:
    • Wait to “discover” whether it was a link or a node failure
      • High recovery latency (BAD!)
    • Activate both p1 and p2 instantaneously
      • Now p2 will not be able to share with p3 (BAD!)
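The activation choice at PLR node I can be written as a small decision function. This is a sketch: the function name and the use of `None` for an unknown failure type are assumptions, and it models the second option (activate immediately rather than wait).

```python
from typing import Optional, Set

def paths_to_activate(failure_type: Optional[str]) -> Set[str]:
    """Backup paths that PLR node I activates when it loses contact with J.

    'link' - link (I,J) failed: both p1 (nnhop) and p2 (nhop) are needed.
    'node' - node J failed: only p1 (nnhop) is needed.
    None   - type unknown: activate both at once to keep recovery latency low.
    """
    if failure_type == "node":
        return {"p1"}
    if failure_type in ("link", None):
        return {"p1", "p2"}
    raise ValueError(f"unknown failure type: {failure_type!r}")
```

The unknown case returns the same set as the link case, which is why an undetected node failure ends up activating p2 unnecessarily.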
Control Plane Mechanism
• Routing strategy
  • Do not oversubscribe
  • Use sharing as if adjacent nodes can distinguish node failures from link failures
    • That is, provide sharing between p2 and p3
• In reality
  • PLRs will not be able to disambiguate link/node failures
  • Activate p1 and p2 (assuming a link failure – the worst case!)
  • If the link had failed:
    • p1 and p2 really needed to be activated – we are okay!
  • If the node had failed:
    • p2 (nhop) has been activated by mistake
    • You may notice a reservation violation at some nodes (where the backup paths p2 and p3 were sharing)
    • Abort all nhop paths that are violating the reservations
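The mechanism can be simulated on one link that p1, p2, and p3 all traverse. This is a sketch under stated assumptions: the bandwidth values, the function name, and the simplification that all three paths cross the same link are hypothetical; the slides do not specify how a violation is signaled.

```python
def recover(actual_failure, bw, reserved):
    """Simulate recovery on a link carrying p1 (nnhop), p2 (nhop), and p3.

    actual_failure is 'link' or 'node'; the PLR itself does not know which.
    Returns (paths left active, nhop paths aborted).
    """
    # The PLR cannot tell the failure type, so it assumes a link failure.
    active = {"p1", "p2"}
    if actual_failure == "node":
        # A node failure also activates p3, the path that shared with p2.
        active.add("p3")

    aborted = set()
    if sum(bw[p] for p in active) > reserved:
        # Reservation violation: abort the nhop path activated by mistake.
        aborted = {"p2"}
        active -= aborted
    return active, aborted

bw = {"p1": 5, "p2": 3, "p3": 4}  # hypothetical bandwidths
# Reservation computed as if link and node failures were distinguishable.
reserved = max(bw["p1"] + bw["p2"], bw["p1"] + bw["p3"])
link_case = recover("link", bw, reserved)
node_case = recover("node", bw, reserved)
```

With these numbers the reservation is 9. A link failure loads the link with 8 and nothing is aborted; a node failure briefly loads it with 12, p2 is aborted, and the load drops to 9.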
Outline of Proof
• Define:
  • G_uv: Bandwidth reserved on link (u, v) for all backup LSPs
  • I_uv: Actual backup bandwidth that falls on link (u, v) after the occurrence of a failure
  • A reservation violation happens if I_uv > G_uv
• No oversubscription – sharing between p2 and p3:
  • G_uv = max(bw(p1)+bw(p2), bw(p1)+bw(p3)) – worst case
• When a failure occurs, activate p1 and p2
  • If it was link (I, J) that had failed, we are okay
  • If it was node J that had failed, p3 also gets activated
    • Worst case I_uv would have been bw(p1)+bw(p2)+bw(p3)
    • Our control plane mechanism ensures I_uv ≤ bw(p1)+bw(p3)
    • This implies that G_uv ≥ I_uv in the worst case
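The argument above can be laid out case by case. The derivation below is a sketch that assumes p1, p2, and p3 all traverse link (u, v) and that all bandwidths are positive.

```latex
\begin{align*}
G_{uv} &= \max\bigl(bw(p_1)+bw(p_2),\; bw(p_1)+bw(p_3)\bigr) \\[4pt]
\text{Link } (I,J) \text{ fails:}\quad
I_{uv} &= bw(p_1)+bw(p_2) \;\le\; G_{uv} \\[4pt]
\text{Node } J \text{ fails, before abort:}\quad
I_{uv} &= bw(p_1)+bw(p_2)+bw(p_3) \;>\; G_{uv} \\[4pt]
\text{Node } J \text{ fails, after aborting } p_2\text{:}\quad
I_{uv} &= bw(p_1)+bw(p_3) \;\le\; G_{uv}
\end{align*}
```

In both failure cases the load that remains after the mechanism has acted is one of the two terms inside the maximum, so it cannot exceed G_uv.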
Questions & Answers