Routing Stability in Congested Networks: Experimentation and Analysis Aman Shaikh, Anujan Varma Computer Engineering Department, University of California Lampros Kalampoukas, Rohit Dube High Speed Networks Research, Bell Laboratories Presented By Jinghua Hu 10/26/2000
Outline • Introduction • Experimental Setup • Analytical Models • Protocols: OSPF, BGP • Measurement: U2D, D2U • Experimental Results • Conclusion Dept of ECE, University of Massachusetts at Amherst
Introduction • Routers exchange control packets to • Disseminate routing information • Determine liveness of peering sessions • Control packets • Share resources with data traffic • Are subject to loss due to congestion • Purpose of the paper • Analyze and quantify the effect of congestion on the stability of OSPF and BGP
Motivation • Related work • Most prior studies assume reliable, loss-free delivery of control packets • Our Contribution • Build analytical models of routing stability as a function of • traffic overload factor, queuing delay, packet size and propagation delay • Analyze the dynamics of routing protocols on congested networks • The first to show the dynamics of BGP for various round-trip times, which can be used for traffic engineering
Experimental Setup
Experimental Setup • Traffic Generator • 10.4.4.2 • Transmission rate set according to desired traffic overload level • IP over ATM by using AAL5 encapsulation • Destination • 192.168.64.1
Experimental Setup • Three Routers: HR1, HR2, HR0 • Two distinct forwarding paths • primary path • 10.4.4.2 -> HR1 -> HR2 -> 192.168.64.1 • secondary path • 10.4.4.2 -> HR1 -> HR0 -> HR2 -> 192.168.64.1 • When both exist, the shortest (primary) path is chosen • If a link on the primary path fails, the secondary path is used
Experimental Setup • Queue scheduling policy -- FIFO • Route Flaps • Link-down events as seen and reported by the routing protocol • Assume links are always operational at the physical layer and continue forwarding packets that have already been queued even after the routing protocol reports a link-down event • Routers infer a link-down event from the loss of “keepalive” or “hello” messages
Experimental Setup • When a link-down event happens, the routers withdraw any routes in the forwarding table associated with the failed link • Packets with IP lookup failure are dropped • Dropping Policy in ATM • Drop-Tail vs. Drop-from-Front • The choice does not affect drop probability • It does affect the average queuing delay experienced by packets that are eventually forwarded
Experimental Setup • For a given buffer size and transmission rate, the queuing delay experienced by packets in a system using the Drop-from-Front policy is inversely proportional to the traffic overload factor. Thus the queuing delay for packets that are eventually transmitted decreases as congestion grows • The dropping process at the routers is “packet-aware”. No cell loss occurs at the ATM layer • Data collector 192.168.63.1 • Only the ATM egress links from HR1 are overloaded
Experimental Methodology • Routing Platform: 4.3BSD-like TCP/IP stack • Notations • r -- transmission rate of the corresponding VC • r’ -- rate of the generated traffic • f -- Overload Factor • p -- packet dropping probability
Experimental Methodology • Buffer size • 4M or 16M bytes • static buffer threshold on a per-physical-interface basis • Data Packet size: 64, 256, 1500 bytes • Usage of ATM CBR VC • allows egress traffic to be shaped to arbitrarily low rates • makes a given overload factor easy to configure and reduces the demands on the traffic generator
Experimental Methodology • Purpose • show that the robustness of routing protocols to congestion is mainly determined by loss rate and queuing delay • and is independent of the actual transmission rate • Two quantities to characterize the robustness • U2D: time from overload to route flap • D2U: time for a link adjacency to be re-established once a failure has occurred
Experimental Methodology • Experimental results come from • repeating each experiment 10-16 times • computing the average and confidence intervals • Note • U2D is independent of the alternate forwarding path • D2U depends on the alternate path • Two configurations • 2-node: single static link between HR1 and HR2 • 3-node: two forwarding paths
Analytical Models • Two Assumptions • The overload factor remains constant • Every packet has the same dropping probability irrespective of its size and source. So the dropping probability depends only on the overload factor. Also assume the decision of dropping a particular packet is made independently for each packet.
Route Flap for OSPF (U2D) • Absorbing Markov Chain
Route Flap for OSPF (U2D) • Notations • HelloInterval: t_HI = 10 sec in HR1 • RouterDeadInterval: t_RDI = 40 sec in HR2 • HR2 declares a Route Flap if • timer T_RDI expires (after the fixed t_RDI sec) • i.e. four consecutive Hello packets are lost • This assumes clock synchronization between routers • Timer jitter may result in the adjacency going down even when fewer than 4 Hello packets are lost
Route Flap for OSPF (U2D) • HR1 jitters T_HI • The jitter is chosen to guarantee the transmission of at least three Hello packets within t_RDI sec • From state S2, two branches • The branch depends on the total time spent until the fourth Hello packet is transmitted • if larger than 40 sec, one more drop leads to S4 • if smaller than 40 sec, two more drops lead to S4 • Each branch is taken with probability p/2, at a different cost
Route Flap for OSPF (U2D) • Expected duration of a U2D cycle
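The first-step analysis behind an absorbing chain of this kind can be sketched as follows. This is a deliberately simplified model, not the exact chain from the slides: it assumes Hello packets are sent every t_HI = 10 sec with no jitter, each Hello is dropped independently with probability p, and the adjacency goes down once four consecutive Hellos are lost. The function name and the state encoding (state i = i consecutive losses so far) are illustrative choices.

```python
def expected_u2d_no_jitter(p, hello_interval=10.0, losses_needed=4):
    """Expected time (sec) until `losses_needed` consecutive Hellos are lost.

    Simplified model (assumption): fixed Hello spacing, no timer jitter,
    independent drops with probability p. State i means that the last i
    Hellos were all lost; state `losses_needed` is absorbing (route flap).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    # Let E_i be the expected number of further Hellos sent from state i.
    # E_i = 1 + p * E_{i+1} + (1 - p) * E_0, with E_k = 0 at absorption.
    # Write E_i = a + b * E_0 and sweep backwards from the absorbing state.
    a, b = 0.0, 0.0
    for _ in range(losses_needed):
        a, b = 1.0 + p * a, p * b + (1.0 - p)
    expected_hellos = a / (1.0 - b)  # solves E_0 = a + b * E_0
    return hello_interval * expected_hellos


if __name__ == "__main__":
    for p in (0.5, 0.8, 1.0):
        print(f"p={p}: E[U2D] ~ {expected_u2d_no_jitter(p):.1f} sec")
```

Under these assumptions, p = 0.5 gives 30 Hello intervals (300 sec), and p = 1 reduces to the 40 sec RouterDeadInterval. The jittered model on the slides splits the transition out of S2 into two branches, so its values will differ from this sketch.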
Adjacency Recovery for OSPF (D2U) • Differs for 2-node and 3-node experiments • For the 2-node case • The route is recovered when the router receives a new Hello packet over the link that was down • The single static link HR1->HR2 still works after it is declared down, and it is still overloaded • During a D2U cycle, Hello packets may also be dropped with probability p • The expected number of Hello transmission attempts until the route comes up is 1/(1-p)
Adjacency Recovery for OSPF (D2U) • Expected duration of a D2U cycle = t_HI/(1-p) = 10/(1-p) sec
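As a quick numerical check of the 2-node D2U expression, the sketch below evaluates t_HI/(1-p) and compares the 1/(1-p) attempt count against a truncated geometric series. The function names and the truncation length are illustrative assumptions.

```python
def expected_d2u_ospf_two_node(p, hello_interval=10.0):
    """Expected D2U time (sec): Hello interval times expected attempts."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return hello_interval / (1.0 - p)


def expected_attempts_by_series(p, terms=10_000):
    """Truncated sum of n * P(first success on attempt n).

    With independent drops, P(first success on attempt n) = (1 - p) * p**(n - 1),
    so the sum approaches 1 / (1 - p) as `terms` grows.
    """
    return sum(n * (1.0 - p) * p ** (n - 1) for n in range(1, terms + 1))


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9):
        print(p, expected_d2u_ospf_two_node(p), expected_attempts_by_series(p))
```

For example, with half of the Hello packets dropped (p = 0.5), two attempts are needed on average, so the expected recovery time is 20 sec.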
Route Flap for BGP (U2D)
Route Flap for BGP (U2D) • Route Flap for BGP • BGP transmits messages over TCP • TCP achieves reliable delivery by retransmission • t_KT: KeepaliveTime in HR1, = 60 sec • t_HT: HoldTime in HR2, = 180 sec • RTT: Round trip time • For the adjacency to be refreshed, HR2 must receive at least one Keepalive message within t_HT sec of the last message arrival
Route Flap for BGP (U2D) • RTO: TCP retransmission interval (<= 64 sec) • a function of the RTT estimate and its standard deviation • a backoff factor of two is applied to RTO for every unsuccessful retransmission attempt for a packet • Assumptions • clock synchronization for T_HT • initial value of RTT = the actual queuing delay + link propagation delay, standard deviation = 0 • These assumptions result in overestimating the number of retransmission attempts before link down
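The interaction between RTO backoff and the 180 sec HoldTime can be sketched as below. This is a simplified illustration, not the Markov model from the slides: it follows a single segment first sent at t = 0, takes the initial RTO as a given input, doubles it after every attempt up to the 64 sec cap, and ignores both the minimum-RTO clamp and the retransmission-count limit of real TCP stacks. The function names are hypothetical.

```python
def retransmission_times(initial_rto, window, max_rto=64.0):
    """Send times (sec) of one segment and its retransmissions inside `window`.

    Assumptions: first transmission at t = 0, RTO doubles after every
    attempt and is capped at `max_rto`, every attempt is lost.
    """
    if initial_rto <= 0:
        raise ValueError("initial_rto must be positive")
    times = []
    t, rto = 0.0, initial_rto
    while t < window:
        times.append(t)
        t += rto
        rto = min(2.0 * rto, max_rto)
    return times


def probability_all_lost(p, attempts):
    """Probability that every attempt is dropped, assuming independent drops."""
    return p ** attempts


if __name__ == "__main__":
    hold_time = 180.0  # t_HT from the slides
    for rto in (1.0, 4.0):
        sends = retransmission_times(rto, hold_time)
        print(f"initial RTO {rto}: {len(sends)} attempts at {sends}")
```

With an initial RTO of 1 sec the segment gets 8 attempts inside the hold time (at 0, 1, 3, 7, 15, 31, 63 and 127 sec); with an initial RTO of 4 sec it gets only 6. Fewer attempts before the hold timer expires means a higher chance that all of them are lost, which is one way to see why a larger RTT makes the session less robust.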
Route Flap for BGP (U2D) • # of states is the max int for • Left Over time: • Expected flap time depends on RTT • When RTT=1 • When RTT increases, backoff times decrease, so route flap time decreases
Route Flap for BGP (U2D) • Note that BGP becomes more robust when the Drop-from-Front policy is in effect, which results in shorter queuing delay than that with Drop-Tail policy
Route Flap for BGP (U2D)
Adjacency Recovery for BGP (D2U) • BGP Adjacency Recovery: two stages • TCP connection establishment • BGP session establishment • BGP session establishment • Bidirectional in nature • But congestion traffic flows only in one direction • The cost of initiating a TCP connection therefore differs by direction
Adjacency Recovery for BGP (D2U) • Scenario • [Figure: message sequence between Client and Server: SYN, SYNACK, ACK for the TCP handshake, then OPEN and KEEPALIVE between the BGP peers]
Adjacency Recovery for BGP (D2U)
Adjacency Recovery for BGP (D2U)
Adjacency Recovery for BGP (D2U) • Note • TCP handshake messages are retried up to 3 times • Fig. 5: congestion in C->S • for SYN segment: cST1, cST2, cSYNOK • for ACK segment: sST1, sST2, sSYNOK • no state for SYNACK because there is no congestion in S->C • Fig. 6: congestion in S->C • for SYNACK segment: cST1, cST2, cSYNOK • no state for SYN or ACK segments • Other states capture events for the BGP OPEN message
Adjacency Recovery for BGP (D2U) Drop-from-Front: RTT depends on overload factor as a result of the change of queuing delay
Adjacency Recovery for BGP (D2U) Shorter D2U time for the same overload factor as compared with Table 4
Adjacency Recovery for BGP (D2U)
Experimental Results: OSPF U2D 2-node • OSPF (2-node): results closely match the analytical model
Experimental Results: OSPF D2U 2-node
Experimental Results: OSPF U2D 3-node • OSPF (3-node): consider buffer fill-up time
Experimental Results: OSPF D2U 3-node • D2U in the 3-node case differs from the 2-node case • It depends on buffer size and is smaller than D2U in the 2-node case
Experimental Results: OSPF D2U 3-node
Experimental Results: OSPF • Explanation of OSPF D2U in 3-node case • The paper does not give an analytical model for this case • Once the route flaps, link HR1->HR2 does not remain overloaded because the traffic is diverted over the secondary link HR1->HR0 • This allows the adjacency to recover almost immediately, as soon as the first Hello packet already queued reaches HR2 • The recovery time depends on queue length, so it is affected by buffer size
Experimental Results: BGP U2D 2-node • BGP U2D: analytical values are higher than the average values in most of the cases
Experimental Results: BGP D2U 2-node • BGP D2U: analytical values are lower than the mean values in most of the cases
Experimental Results: BGP • Explanations for BGP D2U results • Interference of connections initiated by both ends -- more “destructive” • BGP state machine implementation on our routing platform deviates slightly from that described in the BGP specification, especially in the part dealing with recovery from failed connection establishment
Conclusion • OSPF • Behavior depends mainly on the traffic overload factor • Insensitive to the packet size distribution, buffer size, or packet dropping policy • This makes it possible to derive a closed-form expression that accurately captures its stability properties
Conclusion • BGP • More complicated models • Stability depends on • Traffic overload factor • RTT ( or queuing delay, or buffer size, or packet dropping policy ) • We need to isolate routing control messages from data traffic to improve the stability of routing protocols
Questions?
Appendix • Derivation of E[U2D] for OSPF • :expected time to reach S4 from S0 • :expected time to reach S4 from Si,
Appendix