
Configurable restoration in overlay networks

Matthew Caesar, Takashi Suzuki


Presentation Transcript


  1. Configurable restoration in overlay networks
     Matthew Caesar, Takashi Suzuki

  2. Motivation
     • Today’s Internet core has bursty losses
       • Backbones have low average loss rates (<0.2%) but experience large bursts of loss
       • Loss durations vary from 10 ms to 33.72 s
       • 6 out of 7 providers experienced large outage periods of 10-220 s, 1-2 times per day
       • Multimedia applications have difficulty recovering from repeated loss (e.g. with FEC)
     • Commonly used restoration techniques are insufficient
       • Link-layer recovery and MPLS are not yet uniformly deployed
       • RON is too slow (20 s) and not scalable
     • → Real-time recovery is desired
     • Source: “Assessment of VoIP Quality over Internet Backbones,” Markopoulou, Tobagi, Karam (INFOCOM 2002)

  3. Approach
     • Technique: overlay-based, real-time recovery
       • Use link-state routing
       • Determine link cost from packet receipt delay
       • Adaptively dampen and localize route advertisements
     • Desirable properties:
       • Speed: low end-to-end failure time
       • Stability: few route oscillations
       • Accuracy: avoid reacting to transient failures
       • Scalability: low probing/communication overhead
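To make the link-state idea concrete, here is a minimal sketch of how an overlay node might reroute once a link's advertised cost rises. It is plain Dijkstra over a small hypothetical overlay; the node names, the cost values, and the `shortest_path` helper are illustrative assumptions, not taken from the slides.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over an overlay given as {node: {neighbor: cost}}."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical four-node overlay; costs are arbitrary illustrative units.
overlay = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.0, "D": 2.0},
    "D": {"B": 1.0, "C": 2.0},
}
before = shortest_path(overlay, "A", "D")

# Assume probing reports that link B-D has degraded, so its cost is raised.
overlay["B"]["D"] = overlay["D"]["B"] = 10.0
after = shortest_path(overlay, "A", "D")
```

In this sketch the route moves from A-B-D to A-C-D once the degraded link becomes more expensive than the alternative, which is the behavior the cost advertisements are meant to trigger.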

  4. Phase 1: Detection Mechanism
     • Goal: calculate link cost
     • Link Cost Estimation (LCE):
       • Estimates failure probability from packet loss
       • “Delay-deficit algorithm”
     • Adaptive Tracking (AT):
       • Filters noise
       • Reacts quickly to changes
     • Route Stabilization (RS):
       • Dampens route flaps
     • [Figure: diagram with stages labeled LCE, AT, RS]
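The three detection stages can be pictured as a small filter chain. The sketch below is only an illustration under stated assumptions: the slides do not specify the delay-deficit algorithm, so `lce_stand_in` is a simple substitute based on consecutive missed probes, the tracker is an ordinary two-gain exponential average, and the stabilizer is a two-threshold hysteresis that advertises only "failed" or "not failed". All names and constants are hypothetical.

```python
def lce_stand_in(missed_in_a_row, scale=3.0):
    """Stand-in for LCE (NOT the delay-deficit algorithm): maps the number
    of consecutive missed probes to a failure score in [0, 1)."""
    return missed_in_a_row / (missed_in_a_row + scale)

class AdaptiveTracker:
    """Assumed AT behavior: smooth small changes, follow large ones quickly."""
    def __init__(self, slow=0.1, fast=0.6, jump=0.3):
        self.slow, self.fast, self.jump = slow, fast, jump
        self.value = 0.0

    def update(self, x):
        # Use the fast gain only when the input moved far from the estimate.
        gain = self.fast if abs(x - self.value) > self.jump else self.slow
        self.value += gain * (x - self.value)
        return self.value

class RouteStabilizer:
    """Assumed RS behavior: hysteresis so the advertised state does not flap."""
    def __init__(self, up=0.5, down=0.2):
        self.up, self.down = up, down
        self.failed = False

    def update(self, x):
        if not self.failed and x >= self.up:
            self.failed = True
        elif self.failed and x <= self.down:
            self.failed = False
        return self.failed

# Hypothetical probe trace: 20 probes received, 10 missed, 20 received.
trace = [True] * 20 + [False] * 10 + [True] * 20
tracker, stabilizer = AdaptiveTracker(), RouteStabilizer()
missed, states = 0, []
for received in trace:
    missed = 0 if received else missed + 1
    states.append(stabilizer.update(tracker.update(lce_stand_in(missed))))
```

The split mirrors the slide: the first stage produces a raw, spiky score, the second smooths it while still reacting to large jumps, and the third decides what is actually advertised.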

  5. Phase 1: Example
     • [Figure: plots of LCE output, AT output, and RS output]
     • Shows detailed actions of the layers
       • Link Cost Estimation (LCE): metric representing the probability that the link has failed; spikes result from packet losses
       • Adaptive Tracking (AT): metric with noise filtered
       • Route Stabilization (RS): advertised value for the link
     • Setup
       • Link failure at t = 150-170 s
       • Probe every 300 ms, 10% loss
     • Results
       • First detection in 0.92 s, next at 5.42
       • A false positive due to cold start; stabilizes in 5 seconds

  6. Phase 1: Comparison
     • Compared LCA, RON, and “Oracle-based” routing
     • Results:
       • RON requires 4 to 10x more advertisements than LCA
       • RON’s overhead increases exponentially with probe speed; LCA’s overhead increases linearly
       • Packet loss has an extreme effect on RON and a moderate effect on LCA
       • LCA can achieve subsecond reactions on most Internet links

  7. Phase 2: Routing
     • Goals:
       • Limit scope of link effects
       • Configurable tradeoff between availability and overhead
     • Related work:
       • → No existing scheme autonomously scopes dissemination with respect to links

  8. Phase 2: Solution
     • Solution idea: autonomously localize LSA propagation
       • Unstable cost changes are propagated locally
       • Stable changes are propagated further
       • The application can control the propagation distance based on its requirements
     • Mechanism: Localized Reaction
       • Global level propagates link existence
       • Local level propagates up-to-date link state
       • Create a boundary around each link
       • Only nodes inside the boundary receive LSAs from the link
       • Dynamically resize the boundary based on link characteristics
     • [Figure: diagram with nodes labeled S and D]
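The boundary idea can be sketched as a hop-limited flood around the link. The example below is an assumption-laden illustration, not the authors' mechanism: `nodes_within` uses breadth-first search to find which nodes fall inside a boundary of a given radius, and `choose_radius` is a hypothetical rule that gives stable links a wide boundary and unstable links a narrow one. The topology, thresholds, and radii are made up for the example.

```python
from collections import deque

def nodes_within(adjacency, link, radius):
    """Nodes at most `radius` hops from either endpoint of `link` (BFS)."""
    hops = {link[0]: 0, link[1]: 0}
    queue = deque(link)
    while queue:
        u = queue.popleft()
        if hops[u] == radius:
            continue  # boundary reached; do not propagate further
        for v in adjacency[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return set(hops)

def choose_radius(recent_costs, min_radius=1, max_radius=4, tolerance=0.1):
    """Hypothetical resizing rule: a link whose recent costs vary little is
    treated as stable and advertised further than an unstable one."""
    spread = max(recent_costs) - min(recent_costs)
    return max_radius if spread <= tolerance else min_radius

# Hypothetical chain topology n0 - n1 - n2 - n3 - n4 - n5.
chain = {
    "n0": ["n1"],
    "n1": ["n0", "n2"],
    "n2": ["n1", "n3"],
    "n3": ["n2", "n4"],
    "n4": ["n3", "n5"],
    "n5": ["n4"],
}
link = ("n2", "n3")

unstable = nodes_within(chain, link, choose_radius([1.0, 3.0, 1.2]))
stable = nodes_within(chain, link, choose_radius([1.0, 1.05, 1.0]))
```

With these inputs the unstable link's updates stay within one hop of its endpoints, while the stable link's updates reach the whole chain.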

  9. Phase 2: Simulation Results
     • Compared:
       • Localized Reaction (LR)
       • Stability-sensitive LR (LRS)
       • Cost-sensitive LR (LRC)
       • Random Hierarchies (RH)
     • Flooding smaller, more stable costs further improves performance
     • However, LR is not appropriate for all environments
       • Works best for large, heterogeneous

  10. Conclusions
     • Can achieve sub-second reactions on most links with reasonable stability
       • Congested links increase reaction time
       • Can react well on most Internet links
     • Localization of route updates
       • Scoping state dissemination with respect to links improves performance in heterogeneous networks
       • Can achieve resiliency close to link-state with a fraction of the overhead
