Detecting Shared Congestion of Flows Via End-to-end Measurement (and other inference problems)
Dan Rubenstein
joint work with Jim Kurose and Don Towsley, UMass Amherst
Network Inference
• What's going on in there?
• Where are packets getting lost / delayed?
• Where is congestion occurring?
• Where are the network hot spots?
• What are routers doing (WFQ, RED)?
• What version of TCP are end-hosts using?
Multiple Autonomous Systems
• What routing capabilities does your ISP provide? "That's proprietary info"
• Who's to blame for poor service? "Somebody else!"
• Consequence: who has to figure out what and where the problem is, and how to fix it?
Overview
• Overview of other inference work:
  • Identifying bottleneck capacities
  • Multicast inference of loss (MINC)
  • TCP inference (TBIT)
• Detecting shared points of congestion
Identifying bottleneck bandwidths
• Links have different capacities
• The "skinniest" link processes packets slowest: creates a rate bottleneck
• Can the bottleneck rate be identified?
• Lots of work here [Carter'96, Jacobson'97, Downey'99, Lai'99, Melander'99, Lai'00]
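As a minimal sketch of the packet-pair idea that underlies much of this work: back-to-back probes are spread apart by the skinniest link, so the receiver-side gap reveals its capacity. The function name, the numbers, and the median filtering below are illustrative, not taken from any of the cited tools.

    # Minimal sketch of the packet-pair idea behind bottleneck-capacity estimation.
    # Real estimators (bprobe, pathchar, etc.) filter cross-traffic noise far more carefully.

    def packet_pair_capacity(recv_gaps_s, packet_size_bytes):
        """Estimate bottleneck capacity (bits/s) from receiver-side gaps between
        back-to-back probe packets: the skinniest link spaces each pair out to
        roughly packet_size / capacity, so capacity ~ packet_size / gap."""
        # Robust estimators typically take the mode of the per-pair estimates;
        # the median is used here as a simple stand-in.
        estimates = sorted(packet_size_bytes * 8 / g for g in recv_gaps_s if g > 0)
        return estimates[len(estimates) // 2]

    # Example: 1500-byte pairs arriving ~8 ms apart suggest a ~1.5 Mb/s bottleneck.
    gaps = [0.0081, 0.0079, 0.0080, 0.0150, 0.0080]   # seconds, one cross-traffic outlier
    print(packet_pair_capacity(gaps, 1500))           # ~1.5e6 bits/s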
Multicast Inference
[Figure: multicast tree from source S to receivers R, with points of loss marked]
• Infer loss points on a multicast tree via correlation patterns of receivers within a multicast group [Ratnas'99, Caceres'99 (3), LoPresti'99, Adler'00]
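For the simplest instance of this idea, here is a hedged sketch: two receivers behind one shared link, with independent Bernoulli loss on each link, where the shared link's pass probability falls out of the receipt counts. The function and variable names are mine; the cited MINC work generalizes this to arbitrary trees and analyzes the estimators properly.

    # Toy MINC-style estimator for a two-receiver tree: one shared link feeding
    # two receiver links, independent loss on each link.

    def infer_two_receiver_tree(r1_got, r2_got):
        """r1_got, r2_got: lists of 0/1 receipt indicators, one entry per multicast packet.
        Returns estimated pass probabilities (shared link, link to R1, link to R2)."""
        n = len(r1_got)
        p1  = sum(r1_got) / n                                   # P(R1 receives)
        p2  = sum(r2_got) / n                                   # P(R2 receives)
        p12 = sum(a & b for a, b in zip(r1_got, r2_got)) / n    # P(both receive)
        a = p1 * p2 / p12      # shared-link pass prob: (a*b1)(a*b2) / (a*b1*b2) = a
        return a, p1 / a, p2 / a

    # Example: 5% loss on the shared link, 10% / 20% on the receiver links.
    import random
    random.seed(0)
    r1, r2 = [], []
    for _ in range(10000):
        shared = random.random() < 0.95
        r1.append(int(shared and random.random() < 0.90))
        r2.append(int(shared and random.random() < 0.80))
    a, b1, b2 = infer_two_receiver_tree(r1, r2)
    print(round(1 - a, 3), round(1 - b1, 3), round(1 - b2, 3))  # ~0.05, 0.10, 0.20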
TCP Inference (TBIT)
• Many versions of TCP exist: RENO, TAHOE, VEGAS
• Many "optional" components: SACK, ECN compliance
• Are specification requirements being met? (initial window sizes, slow start)
• TBIT: TCP Behavior Identification Tool [Padhye'00]
  • stress-tests a server's TCP by intentionally delaying / dropping various ACKs
  • different TCPs / TCP options respond differently to the delayed / dropped ACKs
Detecting Shared Pts of Congestion: Why bother?
[Figure: flows from a server to clients passing through one or more points of congestion]
• When flows share a common point of congestion (POC), bandwidth can be "transferred" between flows w/o impacting other traffic
• Applications: WWW servers, multi-flow (multi-media) sessions, multi-sender multicast
• Can limit "transfer" to flows w/ identical e2e data paths [Balak'99]
  • ensures flows have a common bottleneck
  • but limits applicability
Detecting Shared POCs
Q: Can we identify whether two flows share the same Point of Congestion (POC)?
Network Assumptions:
• routers use FIFO forwarding
• the two flows' POCs are either all shared or all separate
Techniques for detecting shared POCs
[Figure: two topologies — co-located senders (S1, S2 at the same site, sending to R1, R2) and co-located receivers (R1, R2 at the same site)]
• Requirement: flows' senders or receivers are co-located
• Packet ordering through a potential shared POC (SPOC) is the same as that at the co-located end-system
• Good SPOC candidates
Simple Queueing Models of POCs for two flows
[Figure: two queueing models — Separate POCs, where each foreground flow (FG Flow 1, FG Flow 2) shares its own queue with background (BG) traffic, vs. A Shared POC, where both foreground flows share a single queue with background traffic]
Approach (High level)
• Idea: packets passing through the same POC close in time experience loss and delay correlations [Moon'98, Yajnik'99]
• Using either loss or delay statistics, compute two measures of correlation:
  • Mc: cross-measure (correlation between flows)
  • Ma: auto-measure (correlation within a flow)
• such that
  • if Mc < Ma, then infer POCs are separate
  • else Mc > Ma, and infer POCs are shared
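A rough sketch of the loss-based version of this rule for co-located senders, under one simplified reading of the pairings defined on the next slide (each packet i is paired with the immediately preceding packet when it belongs to the other flow for Mc, and with the previous packet of its own flow for Ma). The trace format and names are assumptions, not the paper's exact procedure.

    def loss_measures(trace):
        """trace: time-ordered list of (flow_id, lost) records, flow_id in {1, 2},
        lost = 1 if the probe packet was lost (e.g. inferred from sequence-number gaps)."""
        mc_joint = mc_cond = ma_joint = ma_cond = 0
        last = None                      # (flow_id, lost) of the previous packet overall
        last_same = {1: None, 2: None}   # lost status of the previous packet of each flow
        for flow, lost in trace:
            # Cross-measure pairs: previous packet overall, when it is from the other flow.
            if last is not None and last[0] != flow and last[1]:
                mc_cond += 1
                mc_joint += lost
            # Auto-measure pairs: previous packet of the same flow.
            if last_same[flow]:
                ma_cond += 1
                ma_joint += lost
            last = (flow, lost)
            last_same[flow] = lost
        return mc_joint / mc_cond, ma_joint / ma_cond   # (Mc, Ma)

    # Decision rule from this slide: infer a shared POC iff Mc > Ma.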
The Correlation Statistics...

C(X,Y) = (E[XY] - E[X]E[Y]) / sqrt( (E[X^2] - E[X]^2) (E[Y^2] - E[Y]^2) )

[Figure: merged timeline of Flow 1 and Flow 2 packets, labeled ..., i-4, i-3, i-2, i-1, i, i+1, ...]

Loss-Corr for co-located senders:
• Mc = Pr(Lost(i) | Lost(i-1))
• Ma = Pr(Lost(i) | Lost(prev(i)))
Loss-Corr for co-located receivers: a bit more complex

Delay (either co-located topology):
• Mc = C(Delay(i), Delay(i-1))
• Ma = C(Delay(i), Delay(prev(i)))
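A companion sketch for the delay-based measures, computing C(X,Y) as reconstructed above over (Delay(i), Delay(i-1)) pairs across flows and (Delay(i), Delay(prev(i))) pairs within a flow. The pairing rule and trace format are the same simplification as in the loss sketch above.

    from math import sqrt

    def corr(xs, ys):
        """C(X,Y) = (E[XY] - E[X]E[Y]) / sqrt((E[X^2]-E[X]^2)(E[Y^2]-E[Y]^2))."""
        n = len(xs)
        ex, ey = sum(xs) / n, sum(ys) / n
        exy = sum(x * y for x, y in zip(xs, ys)) / n
        vx = sum(x * x for x in xs) / n - ex * ex
        vy = sum(y * y for y in ys) / n - ey * ey
        return (exy - ex * ey) / sqrt(vx * vy)

    def delay_measures(trace):
        """trace: time-ordered list of (flow_id, delay_seconds) records."""
        cross, auto = [], []             # (Delay(i), Delay(i-1)) and (Delay(i), Delay(prev(i)))
        last = None                      # (flow_id, delay) of the previous packet overall
        last_same = {1: None, 2: None}   # delay of the previous packet of each flow
        for flow, delay in trace:
            if last is not None and last[0] != flow:
                cross.append((delay, last[1]))
            if last_same[flow] is not None:
                auto.append((delay, last_same[flow]))
            last = (flow, delay)
            last_same[flow] = delay
        mc = corr([d for d, _ in cross], [d for _, d in cross])
        ma = corr([d for d, _ in auto],  [d for _, d in auto])
        return mc, ma                    # infer a shared POC iff Mc > Ma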
Intuition: Why the comparison works
• Recall: packets closer together exhibit higher correlation
• E[Tarr(i-1, i)] < E[Tarr(prev(i), i)]
• On avg, i is "more correlated" with i-1 than with prev(i)
• True for many arrival distributions, e.g.:
  • deterministic, any
  • Poisson, Poisson
• Rest of talk: assume Poisson, Poisson
[Figure: timeline marking the gaps Tarr(i-1, i) and Tarr(prev(i), i)]
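A small Monte Carlo check of this intuition for the Poisson, Poisson case (rates and seed are arbitrary; the pairing follows the merged-ordering reading used in the sketches above): the gap back to the immediately preceding packet averages roughly half the gap back to the previous packet of the same flow.

    import random
    random.seed(1)

    def poisson_times(rate, horizon):
        """Arrival times of a Poisson process of the given rate over [0, horizon]."""
        t, times = 0.0, []
        while True:
            t += random.expovariate(rate)
            if t > horizon:
                return times
            times.append(t)

    # Two independent 20 pkt/s Poisson probe flows, merged into one timeline.
    merged = sorted([(t, 1) for t in poisson_times(20.0, 1000.0)] +
                    [(t, 2) for t in poisson_times(20.0, 1000.0)])

    gap_prev_any, gap_prev_same = [], []
    last_any, last_same = None, {1: None, 2: None}
    for t, flow in merged:
        if last_any is not None and last_same[flow] is not None:
            gap_prev_any.append(t - last_any)            # Tarr(i-1, i)
            gap_prev_same.append(t - last_same[flow])    # Tarr(prev(i), i)
        last_any = t
        last_same[flow] = t

    print(sum(gap_prev_any) / len(gap_prev_any),     # ~ 1/(20+20) = 0.025 s
          sum(gap_prev_same) / len(gap_prev_same))   # ~ 1/20      = 0.050 s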
Analytical Results (as # samples → ∞)
• Loss-Correlation technique: assume POC(s) are M+M/M/1/K queues
  • Thm: co-located senders: Mc > Ma iff flows share POCs
  • Co-located receivers: Mc > Ma iff flows share POCs, shown via extensive tests using recursive solutions of Mc and Ma
• Delay-Correlation technique: assume POC(s) are M+G/G/1/∞ queues
  • Thm: both co-located topologies: Mc > Ma iff flows share POCs
Simulation Setup
• Co-located senders: Shared POCs
[Figure: simulation topology — co-located probe sources S1, S2 at 20 pkts/s each, sending to R1, R2; background traffic from on/off sources and TCP flows; a shared 1.5 Mb/s bottleneck link, other links 1000 Mb/s; per-link delays of 10-30 ms]
2nd Simulation Setup
• Co-located senders: Independent POCs
[Figure: simulation topology — co-located probe sources S1, S2 at 20 pkts/s each, sending to R1, R2; each flow crosses its own 1.5 Mb/s bottleneck with on/off and TCP background traffic; the link common to both flows is 1000 Mb/s; per-link delays of 10-30 ms]
Simulation results
[Plots: results for the Independent POCs and Shared POCs topologies]
• Delay-corr an order of magnitude faster than loss-corr
• The shared loss-corr dip: bias due to delayed Mc samples
• Similar results on co-located receiver topology simulations
Internet Experiments
• Goal: verify techniques using real Internet traces
• Experimental Setup:
  • Choose topologies where the POC status (shared or unshared) can be anticipated
  • Use traceroute to assess shared links and approximate per-link delays
[Figure: example topology among UMass, UCL, and ACIRI, with approximate delays of 264 ms, 30 ms, and 193 ms — separate POCs (?)]
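A rough helper in the spirit of the traceroute step above: run traceroute from the co-located end toward both remote ends and report the hops the two paths share. Hostnames are placeholders, and traceroute output parsing varies across platforms, so treat this as a sketch only.

    import subprocess

    def hops(dest):
        """Return the list of router addresses reported by `traceroute -n dest`."""
        out = subprocess.run(["traceroute", "-n", dest],
                             capture_output=True, text=True, check=False).stdout
        path = []
        for line in out.splitlines():
            fields = line.split()
            if fields and fields[0].isdigit():   # hop lines look like "12  128.119.40.1  30.1 ms ..."
                path.append(fields[1])           # router address, or "*" on a timeout
        return path

    def shared_prefix(dest_a, dest_b):
        """Common leading hops on the paths toward the two destinations: candidate shared links."""
        common = []
        for ha, hb in zip(hops(dest_a), hops(dest_b)):
            if ha != hb or ha == "*":
                break
            common.append(ha)
        return common

    # e.g. shared_prefix("remote-end-1.example.net", "remote-end-2.example.net")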
Experimental Results
• Sites: UMass (MA), Columbia (NY), UCL (UK), AT&T (Calif.), ACIRI (Calif.)
[Table: inference outcomes per experiment, classified as Correct / Inconclusive / Wrong; detailed counts not recoverable]
Summary
• E2E shared-POC detection techniques
• Delay-based techniques more accurate, take less time (order of magnitude)
• Future Directions:
  • Experiment with non-Poisson foreground traffic
  • Focus on making techniques more practical (e.g., see Byers @ BU CS for a recent TR)
• Paper available (SIGMETRICS'00)