The Impact of False Sharing on Shared Congestion Management

The Impact of False Sharing on Shared Congestion Management Srinivasa Aditya Akella Joint work with Srini Seshan and Hari Balakrishnan 28 Feb, 2001

Introduction • Predominant model for congestion control • Slow-start • AIMD • Not always optimal • Multiple concurrent flows from Src to Dest may share a bottleneck • Compete for resources rather than co-operate • Especially visible in the context of Web transfers

Sharing Congestion Information... • Solution - share congestion information • Granularity of sharing • Common destination host (network interface) • All destination hosts on the same IP subnet • Set of flows sharing congestion info - macroflow

False Sharing • Flows sharing congestion state might not share the same bottleneck • Sender has no knowledge • False sharing in the Internet • Flows are treated differently: Service Differentiation • Flows take different paths: Path Diversity

False Sharing • Service Differentiation • Integrated Services • Differentiated Services (DiffServ) • Path Diversity • Network Load Balancers • Network Address Translators (NATs)

Questions... • Impact on performance and correctness • Compromise to end-to-end congestion control? • Degradation in performance of individual flows? • Detection • Under what conditions can false-sharing be detected? • Response • How should congestion sharing systems be modified? • What effect do these modifications have? • What should be the default behavior?

Quantifying the Penalty XXX needs to be fixed • Analysis • False sharing reduces observed flow throughput • l_share = l_1 l_2 / (l_1 + l_2) • False sharing increases observed flow loss rate • r_noshare = sqrt(r_1 r_2) • r_share = (r_1 + r_2)/2
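
As a worked illustration of the three expressions on this slide, the sketch below evaluates them for made-up inputs. It assumes that l_1 and l_2 are the throughputs the two flows would obtain on their own and that r_1 and r_2 are their individual loss rates; the slide is marked as unfinished, so both that reading and the numbers are illustrative assumptions only.

```python
import math

def shared_throughput(l1, l2):
    """l_share = l_1 * l_2 / (l_1 + l_2), as written on the slide."""
    return l1 * l2 / (l1 + l2)

def loss_rate_noshare(r1, r2):
    """r_noshare = sqrt(r_1 * r_2), as written on the slide."""
    return math.sqrt(r1 * r2)

def loss_rate_share(r1, r2):
    """r_share = (r_1 + r_2) / 2, as written on the slide."""
    return (r1 + r2) / 2

# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
l_share = shared_throughput(10.0, 2.0)      # 20 / 12
r_noshare = loss_rate_noshare(0.01, 0.04)   # sqrt(0.0004)
r_share = loss_rate_share(0.01, 0.04)       # 0.05 / 2

print(l_share, r_noshare, r_share)
```

Because an arithmetic mean is never smaller than the geometric mean of the same two non-negative values, r_share is at least r_noshare for any inputs, which matches the slide's statement that false sharing increases the observed loss rate.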

Service Differentiation • Network treats different flows differently • Bandwidth allocation and buffer resources • IETF DiffServ architecture • Three PHBs: Assured Forwarding, Expedited Forwarding, Best Effort • Nortel's implementation of DiffServ • Experiments with two traffic classes: AF and BE • WRR for bandwidth sharing • RIO (for AF) and RED (for BE) for buffer management • Styles of buffer management • Shared and unshared

Topology for DiffServ

Results... • Predicted throughput = XXX need to fill • The faster connection is slowed down by the slower one • Slower connection is never persistently overloaded • Loss rate for the slower connection does not increase appreciably with sharing

Path Diversity • Two flows taking different routes may not share a bottleneck • Two scenarios where path diversity leads to false sharing • Dispersity Routing • NATs • Three distinct categories • Unshared bottleneck • No shared bottleneck link • Semi-shared bottleneck • One of the unshared paths has a bottleneck • Fully shared bottleneck • No bottlenecks in the unshared portions • RTTs would be different

Topology for Unshared Bottleneck

Results for Unshared-Bottleneck • Bandwidth is close to the prediction • Loss rates followed similar pattern as with the DiffServ case

Delays and Losses... • Delays vary independently of each other • Losses are uncorrelated • Variations in delays and losses within one flow are more correlated than those across flows

Path Diversity, Other Cases

Fully Shared Bottleneck - How is it Different? • Variations in delay seem correlated • The two flows share a common point of congestion • The flows should not share congestion information

Detection • Test description • Rubenstein's Delay and Loss Correlation tests • Need modifications to be a part of the architecture • Flows might undergo false-sharing if even one of their bottlenecks is unshared • Two differentially served flows might observe statistically dependent delays • Scheduler at the sender might apportion bandwidths non-uniformly • Congestion control schemes depend on RTTs • Aggregating flows with different RTTs would lead to false sharing

Loss-correlation Test • Idea -- Losses are likely to come in bursts • This should hold across flows from the same source when a bottleneck is shared • Rubenstein's tests compare the auto and cross correlation metrics for pairs of flows • Do not detect unshared bottlenecks • Need a test to detect whether all bottlenecks are shared • New test - Symmetric Loss Correlation • Defining the loss and cross correlation metrics in a manner independent of the flows solves the problem • However, packets across flows are assumed to be spaced closer than those within a flow -- Not always true • A fix -- Schedule transmissions appropriately
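
The comparison of auto and cross correlation metrics can be sketched roughly as follows. This is a simplified illustration, not the published definition of Rubenstein's test or of the Symmetric Loss Correlation test: it assumes both metrics are conditional loss probabilities, and the trace format, function names, and sample data are all hypothetical.

```python
def conditional_loss(pairs):
    """P(current packet lost | previous packet lost); None if no previous loss."""
    after_loss = [cur for prev, cur in pairs if prev]
    if not after_loss:
        return None
    return sum(after_loss) / len(after_loss)

def loss_metrics(trace):
    """trace: list of (flow_id, lost) tuples in the order packets were sent.

    Returns (cross, auto):
      cross uses adjacent packets that belong to different flows,
      auto uses consecutive packets within the same flow.
    """
    cross_pairs = [
        (a_lost, b_lost)
        for (fa, a_lost), (fb, b_lost) in zip(trace, trace[1:])
        if fa != fb
    ]
    auto_pairs = []
    for flow in {f for f, _ in trace}:
        losses = [lost for f, lost in trace if f == flow]
        auto_pairs.extend(zip(losses, losses[1:]))
    return conditional_loss(cross_pairs), conditional_loss(auto_pairs)

# Hypothetical trace: two interleaved flows hit by one burst of three losses.
trace = [
    ("A", False), ("B", False), ("A", True), ("B", True),
    ("A", True), ("B", False), ("A", False), ("B", False),
]
cross, auto = loss_metrics(trace)
# In this sketch, a cross metric above the auto metric hints at a shared bottleneck.
looks_shared = cross is not None and auto is not None and cross > auto
print(cross, auto, looks_shared)
```

The sketch relies on packets from different flows being adjacent in the send order, which is the spacing assumption the slide notes is not always true and proposes to address by scheduling transmissions.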

Delay-correlation Test • Delay = f(propagation time, queueing delay) • Queueing delay (Q) can vary significantly with time • Current Q is strongly related to recent values • Challenges with measuring delay • Clocks cannot be easily synchronized • Use change in delay or the relative delay • Methodology of the tests • Use timestamps to compute delays • Compute correlations • Correlation is independent of constant differences
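
The last point, that correlation is unaffected by constant differences such as an unknown clock offset, can be checked with a small sketch. It uses a plain Pearson correlation as a stand-in for whatever correlation measure the tests actually use; the timestamps, delays, and clock offsets below are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def measured_delays(send_ts, recv_ts):
    """Delay computed from timestamps taken on unsynchronized clocks."""
    return [r - s for s, r in zip(send_ts, recv_ts)]

# Hypothetical true one-way delays (seconds) for two flows.
true_a = [0.010, 0.012, 0.015, 0.011, 0.009]
true_b = [0.020, 0.024, 0.031, 0.022, 0.018]
send = [0.0, 1.0, 2.0, 3.0, 4.0]

# Receiver clocks are off by unknown constants (assumed values).
offset_a, offset_b = 1000.0, -250.0
recv_a = [s + d + offset_a for s, d in zip(send, true_a)]
recv_b = [s + d + offset_b for s, d in zip(send, true_b)]

corr_true = pearson(true_a, true_b)
corr_measured = pearson(measured_delays(send, recv_a),
                        measured_delays(send, recv_b))
print(corr_true, corr_measured)
```

The two coefficients agree up to floating-point rounding, which is why the tests can work with relative delays instead of requiring synchronized clocks.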

Out-of-Order Test • Flows might have fundamentally different delays • DelayCorr does not identify this • Loss and Delay tests might help detect false-sharing • MultiPath Routing where bottleneck is shared • Out-of-Order test handles this well • Look at packet reordering from a source • Reordering by more than 3 packets => No sharing • Limitation: Packets must be delivered to the same physical destination • Cannot be applied to situations like NAT • Rely on RTTs in such situations
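
A minimal sketch of the reordering check follows. The slide does not define how reordering is counted, so this assumes a packet is "reordered by k" when k packets sent after it arrive before it; the function names and sample arrival sequences are hypothetical.

```python
def max_reorder_extent(arrival_order):
    """arrival_order: send-sequence numbers listed in the order they arrived.

    Returns the largest number of later-sent packets that overtook any
    single packet.
    """
    worst = 0
    for i, seq in enumerate(arrival_order):
        overtaken_by = sum(1 for earlier in arrival_order[:i] if earlier > seq)
        worst = max(worst, overtaken_by)
    return worst

def likely_not_shared(arrival_order, threshold=3):
    """Apply the slide's rule: reordering by more than 3 packets => no sharing."""
    return max_reorder_extent(arrival_order) > threshold

in_order = [0, 1, 2, 3, 4, 5]
mild = [1, 0, 2, 3, 4, 5]      # packet 0 overtaken by one packet
severe = [1, 2, 3, 4, 5, 0]    # packet 0 overtaken by five packets

print(likely_not_shared(in_order), likely_not_shared(mild), likely_not_shared(severe))
```

The sequence numbers must come from a single send order across both flows and be observed at one receiver, which reflects the limitation noted on the slide that packets must be delivered to the same physical destination.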

Genuine Sharing is Harder to Detect

Evaluation of the Tests • Two metrics for each test • Detection time • Probability of correct decision • Which test is the best? • Out-of-order tests are mostly accurate • Loss tests are neither timely nor accurate • Delay tests are timely but not as accurate • Symmetric Loss test outputs correct result much more often than the asymmetric test

Response to False Sharing • Design Issues • Default behavior: share information and detect false-sharing • Scheduling • False sharing detected more easily than genuine sharing • Default of no-sharing makes no sense with out-of-order tests • Upon detection, stop sharing • In CM, associate the different flows to different macroflows • Relatively small confidence intervals can be used • No significant penalty due to an incorrect decision
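
The default of sharing first and separating flows once false sharing is detected could be modeled along these lines. This is a toy bookkeeping sketch, not the Congestion Manager API: the class, its methods, and the flow names are hypothetical.

```python
class MacroflowTable:
    """Toy mapping from flows to macroflows (groups sharing congestion state)."""

    def __init__(self):
        self._macroflow_of = {}
        self._next_id = 0

    def _new_id(self):
        mf = self._next_id
        self._next_id += 1
        return mf

    def add_flow(self, flow, share_with=None):
        """Default behavior: join the macroflow of an existing flow if one is given."""
        if share_with is not None and share_with in self._macroflow_of:
            mf = self._macroflow_of[share_with]
        else:
            mf = self._new_id()
        self._macroflow_of[flow] = mf
        return mf

    def split(self, flow):
        """On detecting false sharing, move the flow into a fresh macroflow."""
        self._macroflow_of[flow] = self._new_id()
        return self._macroflow_of[flow]

    def shares_state(self, a, b):
        return self._macroflow_of[a] == self._macroflow_of[b]

table = MacroflowTable()
table.add_flow("flow1")
table.add_flow("flow2", share_with="flow1")   # shared by default
shared_before = table.shares_state("flow1", "flow2")
table.split("flow2")                           # a detection test reported false sharing
shared_after = table.shares_state("flow1", "flow2")
print(shared_before, shared_after)
```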

Performance • How good can restoration possibly be? • False sharing may penalize flows significantly • It might take time to restore performance • However, the greater the penalty, the easier it is to detect • Approach to performance evaluation -- multiple, de-randomized, offline runs • Performance is restored in less than 3 times the time taken to detect
