Configurable restoration in overlay networks
Matthew Caesar, Takashi Suzuki
Motivation
• Today’s internet core has bursty losses
  • Backbones have low average loss rates (<0.2%) but experience large bursts of loss
  • Loss durations vary from 10 ms to 33.72 s
  • 6 out of 7 providers experienced large outages of 10 to 220 s, 1 to 2 times per day
• Repeated loss is difficult for multimedia applications to recover from (e.g. with FEC)
• Commonly used restoration techniques are insufficient
  • Link-layer recovery and MPLS are not yet uniformly deployed
  • RON is too slow (20 s) and not scalable
• Real-time recovery is desired
• “Assessment of VoIP Quality over Internet Backbones,” Markopoulou, Tobagi, Karam (INFOCOM 2002)
Approach
• Technique: overlay-based, real-time recovery
  • Use link-state routing
  • Determine link cost from packet receipt delay
  • Adaptively dampen and localize route advertisements
• Desirable properties:
  • Speed: low end-to-end failure time
  • Stability: few route oscillations
  • Accuracy: avoid reacting to transient failures
  • Scalability: low probing/communication overhead
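The link-state idea above can be illustrated with a minimal sketch: every node knows the cost of every overlay link and recomputes shortest paths when an advertised cost changes. The four-node topology, the cost values, and the function name `shortest_path` are hypothetical and chosen only for illustration; the slides do not specify the route computation used.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts graph {node: {neighbor: cost}}."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == dst:
            break
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None, float("inf")
    # Walk the predecessor chain back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], dist[dst]

# Hypothetical overlay: two disjoint paths from A to D.
overlay = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.0, "D": 2.0},
    "D": {"B": 1.0, "C": 2.0},
}
path_before, cost_before = shortest_path(overlay, "A", "D")

# A failure detector raises the advertised cost of link B-D (assumed value).
overlay["B"]["D"] = overlay["D"]["B"] = 10.0
path_after, cost_after = shortest_path(overlay, "A", "D")

print(path_before, cost_before)
print(path_after, cost_after)
```

In this sketch, raising the advertised cost of one link is enough to move traffic onto the alternate path, which is why the accuracy and stability of the advertised cost matter so much for the properties listed above.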
Failure Detection Mechanism
[Figure: LCE, AT, and RS stages]
• Goal:
  • Calculate link cost
• Link Cost Estimation (LCE):
  • Estimates failure probability from packet loss
  • “Delay-deficit algorithm”
• Adaptive Tracking (AT):
  • Filters noise
  • Reacts quickly to changes
• Route Stabilization (RS):
  • Dampens route flaps
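A rough sketch of how three such stages could be chained is given below. It is not the authors' algorithm: the slides do not define the delay-deficit algorithm, so the LCE stage is replaced here by a plain loss indicator (1.0 for a lost probe, 0.0 otherwise), AT is approximated by a smoothing filter that switches to a larger gain after consecutive large deviations, and RS by a two-threshold hysteresis. All class names, gains, and thresholds are assumptions.

```python
class AdaptiveTracker:
    """Stand-in for AT: slow smoothing normally, fast gain on a sustained jump."""

    def __init__(self, slow=0.1, fast=0.6, jump=0.5, streak=2):
        self.slow, self.fast = slow, fast
        self.jump, self.streak = jump, streak
        self.value = 0.0
        self.run = 0  # consecutive samples far from the current estimate

    def update(self, sample):
        if abs(sample - self.value) > self.jump:
            self.run += 1
        else:
            self.run = 0
        gain = self.fast if self.run >= self.streak else self.slow
        self.value += gain * (sample - self.value)
        return self.value


class RouteStabilizer:
    """Stand-in for RS: hysteresis so the advertised state does not flap."""

    def __init__(self, up=0.5, down=0.2):
        self.up, self.down = up, down
        self.failed = False

    def update(self, value):
        if not self.failed and value >= self.up:
            self.failed = True
        elif self.failed and value <= self.down:
            self.failed = False
        return self.failed


def run(loss_samples):
    """Feed per-probe loss indicators (assumed LCE output) through AT and RS."""
    tracker, stabilizer = AdaptiveTracker(), RouteStabilizer()
    return [stabilizer.update(tracker.update(s)) for s in loss_samples]


isolated = run([0, 0, 1, 0, 0])          # one lost probe
burst = run([1, 1, 1, 1])                # sustained loss
recovered = run([1, 1, 1, 1] + [0] * 10) # loss followed by recovery
print(isolated, burst, recovered[-1])
```

With these assumed parameters a single lost probe is absorbed by the slow gain, while a second consecutive loss triggers the fast gain and pushes the value over the upper threshold. The gap between the two thresholds keeps the advertised state from toggling while the filtered value decays.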
Example
[Figure: LCE output, AT output, and RS output]
• Shows the detailed actions of each layer
• Link Cost Estimation (LCE):
  • metric representing the probability that the link has failed
  • spikes result from packet losses
• Adaptive Tracking (AT):
  • metric with noise filtered
• Route Stabilization (RS):
  • advertised value for the link
• Setup
  • Link failure at t = 150 s to 170 s
  • Probe every 300 ms, 10% loss
• Results
  • First detection in 0.92 s, next at 5.42
  • A false positive due to cold start; stabilizes in 5 seconds
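To put the setup numbers in perspective, the short calculation below relates the 0.92 s first detection to the 300 ms probe interval, and shows how unlikely runs of consecutive probe losses are at 10% loss. It assumes independent losses, which is a simplification (the motivation slide stresses that real losses are bursty); the constant and function names are illustrative only.

```python
PROBE_INTERVAL_S = 0.3     # probe every 300 ms (from the setup)
LOSS_RATE = 0.10           # 10% background loss (from the setup)
FIRST_DETECTION_S = 0.92   # first detection time (from the results)

# Number of probe intervals that fit into the first detection time.
intervals_to_detect = FIRST_DETECTION_S / PROBE_INTERVAL_S

def p_consecutive_losses(k, loss_rate=LOSS_RATE):
    """Probability that k given probes in a row are all lost (independence assumed)."""
    return loss_rate ** k

probes_per_hour = 3600 / PROBE_INTERVAL_S

print(f"intervals to first detection: {intervals_to_detect:.2f}")
for k in (1, 2, 3):
    print(f"P({k} consecutive losses) = {p_consecutive_losses(k):.4f}")
print(f"probes per hour: {probes_per_hour:.0f}")
```

Under these assumptions the first detection corresponds to roughly three probe intervals. A detector that reacted to every single lost probe would see about one loss in ten probes on a healthy link, which illustrates why the noise filtering in AT is needed.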
Comparison
• Compared LCA, RON, and “Oracle-based” routing
• Results:
  • RON requires 4 to 10x more advertisements than LCA
  • RON’s overhead increases exponentially with probe speed; LCA’s overhead increases linearly
  • Packet loss has an extreme effect on RON and a moderate effect on LCA
  • LCA can achieve sub-second reactions on most internet links
Conclusions
• Can achieve sub-second reactions on most links with reasonable stability
• Congested links increase reaction time
• Can react well on most internet links
• Future work:
  • Wide-area deployment and evaluation
  • More complete analysis