On the Interaction between Dynamic Routing in the Native and Overlay Layers. INFOCOM 2006. Srinivasan Seetharaman, Mostafa Ammar. College of Computing, Georgia Institute of Technology.
Inter-Layer Interaction Problem • Infrastructure overlay networks offer better services by deploying intelligent routing schemes. • Uncoordinated dynamic routing in the two layers leads to many problems. • We focus on the effect of native link failures, as they trigger each layer to reroute independently. • Dual Rerouting
Temporal Dynamics Consider a native link failure in CE • Only one overlay link is affected. • The native path AE is rerouted over F (ACE → ACFDE).
[Figure: overlay and native topologies, and path cost plotted against time. Cost labels: Original: 2, Overlay rerouting: 4, Overlay recovery: 8, Native rerouting: 2. Time-axis events: Native Failure, Native Recovery, Native Repair.]
Downside to Dual Rerouting • Overlap of functionality between layers causes a large number of route flaps (oscillations) • Unawareness of the other layer’s decisions leads to • resource overloading • multiple simultaneous failures • a low success rate in rerouting • sub-optimal paths after rerouting • Lack of flexibility and control
Problem Statement I • Assume the two ends of each link (native & overlay) use a keepAlive protocol for link verification. • 3 keepAlive messages lost ⇒ Failure • Understand the effects of different parameters on the rerouting performance: • KeepAlive-time: time between two keepAlive messages • Hold-time: time window after which a link is declared down • Overlay link cost scheme (e.g., native hops, overlay hops)
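The two timers above can be illustrated with a minimal sketch. It rests on assumptions that are not in the slides: keepAlive messages are sent at times 0, keepAlive-time, 2 × keepAlive-time, and so on; delivery is instantaneous; and the receiver declares the link down once hold-time has passed since the last keepAlive it received. The function name `detection_delay` is hypothetical.

```python
import math


def detection_delay(failure_time: float, keepalive_time: float, hold_time: float) -> float:
    """Time from a link failure until the receiver declares the link down.

    Assumed model (not from the slides): keepAlives leave the sender at
    t = 0, keepalive_time, 2 * keepalive_time, ... and arrive instantly;
    the receiver restarts a hold timer of length hold_time on every receipt.
    """
    if keepalive_time <= 0 or hold_time <= 0:
        raise ValueError("timers must be positive")
    # Last keepAlive that was still delivered before (or at) the failure.
    last_heard = math.floor(failure_time / keepalive_time) * keepalive_time
    # The hold timer started at last_heard and expires hold_time later.
    return last_heard + hold_time - failure_time


# A failure halfway between two keepAlives, with a 1 s keepAlive-time
# and a 3 s hold-time.
print(detection_delay(2.5, 1.0, 3.0))
```

Under this model the delay always lies between hold-time minus keepAlive-time and hold-time, which is one way to see why both timers matter for detection.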
Performance Metrics
• Hit-time: time taken for traffic to be recovered. Hit-time = Detection time (depends on timers) + Convergence time (protocol specific) + Device time (negligible)
• Success rate of recovery: Success rate of a layer = (Number of paths recovered) / (Number of failed overlay paths)
• Number of route flaps: Average route flaps = (Number of route flaps) / (Number of failed overlay paths)
• Peak & stabilized inflation (before repair): Path cost inflation = (Path cost after rerouting) / (Path cost before failure)
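The metric definitions above translate directly into arithmetic. The following is a small sketch with hypothetical function names; the sample values 8, 4 and 2 are the path costs quoted for overlay path AE elsewhere in these slides.

```python
def hit_time(detection: float, convergence: float, device: float = 0.0) -> float:
    """Hit-time = detection time + convergence time + device time."""
    return detection + convergence + device


def success_rate(paths_recovered: int, failed_overlay_paths: int) -> float:
    """Fraction of failed overlay paths that a layer recovered."""
    return paths_recovered / failed_overlay_paths


def average_route_flaps(route_flaps: int, failed_overlay_paths: int) -> float:
    """Route flaps per failed overlay path."""
    return route_flaps / failed_overlay_paths


def path_cost_inflation(cost_after_rerouting: float, cost_before_failure: float) -> float:
    """Ratio of the path cost after rerouting to the cost before the failure."""
    return cost_after_rerouting / cost_before_failure


# Overlay path AE: original cost 2, cost 8 at the peak, cost 4 once stabilized.
peak = path_cost_inflation(8, 2)
stabilized = path_cost_inflation(4, 2)
print(peak, stabilized)
```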
Temporal Dynamics: overlay path AE. Overlay detects first; 100% success rate; 3 route flaps; peak inflation = 8/2; stabilized inflation = 4/2.
[Figure: path cost plotted against time, with the hit time marked. Cost labels: ∞, Original: 2, Overlay rerouting: 4, Overlay recovery: 8, Native rerouting: 2. Time-axis events: Native Failure, Native Recovery, Native Repair.]
Performance Evaluation – ns2 • Using GT-ITM, we randomly generate 25 topologies = (5 overlay networks) × (5 native networks) • Two scenarios: • Inspect intra-domain failures in a single-domain native network • Inspect inter-domain failures in a multi-domain native network • In each scenario, tabulate failure recovery statistics of all overlay paths by breaking one native link at a time
Effect of Routing Parameters Observations: By varying the overlay keepAlive-time, hold-time and cost scheme, we observe that: • Overlay hold-time affects hit-time (only until overlay hold-time < native hold-time) • Overlay hold-time affects the number of route flaps • Overlay hold-time affects sub-optimality • KeepAlive-time, relative to hold-time, affects hit-time
Conclusion I Dual rerouting can be made optimal by adopting the following recommendations: • Set the overlay hold-time very close to the native hold-time. • Set the overlay keepAlive-time to half the hold-time, as this leads to earlier detection.
Problem Statement II • Main observation from previous simulations: • “Native-rerouting yields the optimal path, albeit a bit later” • Make the overlay layer aware of this observation and give higher precedence to native rerouting attempts • Improve overlay routing performance by adjusting the overlay layer functioning
Three Levels of Layer Awareness • No awareness • Dual rerouting • Awareness of native layer’s existence: • Probabilistically Suppressed Overlay Rerouting (PSOR): Suppress overlay rerouting attempt with probability ‘p’ • Deferred Overlay Rerouting (DOR): Delay overlay recovery by time ‘d’
Three Levels of Layer Awareness (contd.) • Awareness of native layer’s parameters: • Follow-on Suppressed Overlay Rerouting (FSOR): If follow-on time < threshold ‘f’, then suppress overlay rerouting
[Figure: timeline marking Failure, Native layer detects failure, Overlay layer detects failure, and the follow-on time.]
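The overlay-side schemes (dual rerouting, PSOR, DOR, FSOR) can be sketched as one decision function that runs when the overlay layer detects a failure. This is an illustrative sketch, not the authors' implementation: the function name, the string labels and the return convention are hypothetical, and the follow-on time is taken as an input because the slides do not spell out how it is computed.

```python
import random
from typing import Optional


def overlay_reroute_delay(
    scheme: str,
    *,
    p: float = 0.0,               # PSOR: suppression probability 'p'
    d: float = 0.0,               # DOR: deferral time 'd'
    f: float = 0.0,               # FSOR: follow-on threshold 'f'
    follow_on_time: float = 0.0,  # FSOR: supplied by the caller (assumption)
    rng=random,
) -> Optional[float]:
    """Return how long the overlay waits before rerouting, or None if it
    suppresses its own rerouting and leaves recovery to the native layer."""
    if scheme == "dual":
        return 0.0                                # no awareness: reroute at once
    if scheme == "PSOR":
        return None if rng.random() < p else 0.0  # suppress with probability p
    if scheme == "DOR":
        return d                                  # delay overlay recovery by d
    if scheme == "FSOR":
        return None if follow_on_time < f else 0.0
    raise ValueError(f"unknown scheme: {scheme}")


print(overlay_reroute_delay("DOR", d=2.0))
print(overlay_reroute_delay("FSOR", f=1.0, follow_on_time=0.4))
```

Returning `None` for suppression keeps "do not reroute" distinct from "reroute with zero delay", which matters when comparing the schemes' hit-times.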
Effect of Adjusting Overlay • All three schemes are simple and offer significant control over the tradeoffs between hit-time and the other metrics. • PSOR: • Least number of route flaps • Least peak inflation • DOR and FSOR behave similarly (FSOR has slightly better hit-time): • Better success rate • Lower stabilized inflation
Conclusion II By appropriately tuning • keepAlive-time • hold-time • suppression probability • delay • follow-on threshold …we can improve results for: • Hit-time • # Route flaps • Path cost inflation • Stabilization time • Success rate
Problem Statement III • Main observation from previous simulations: • “It is not possible to improve all metrics simultaneously. Hence, performance is still bounded!” • As overlay applications proliferate, the native layer should gradually evolve to suit them • Improve overlay routing performance by adjusting the native layer functioning
Tuning the Native keepAlive-time • We adopt a non-invasive procedure to advance the native layer rerouting: • Tuning of the native layer keepAlive-time • Constraints: • Tuning should not generate any extra overhead • Effective detection time should remain the same
Tuning the Native keepAlive-time (contd.) Consider the following scenarios for tuning: • Scenario A is the layer-aware overlay rerouting scheme • Scenario B is vanilla dual rerouting • Scenario C is the tuning we recommend here
Conclusions III • The native layer tuning we propose achieves the best performance on all our metrics
Summary • We propose means to mitigate the problems associated with inter-layer interaction • We explore two directions: • Adjusting the overlay layer functioning • Adjusting the native layer functioning