Thomas J. Hacker. May 8, 2002. Internet 2 Member Meeting. Problem ... There are concerns about the effectiveness, fairness and efficiency of parallel flows ...
1. The Effects of Systemic Packet Loss on Aggregate TCP Flows Thomas J. Hacker
May 8, 2002
Internet 2 Member Meeting
2. Problem High performance computing community is making use of parallel TCP sockets to increase end-to-end throughput
There are concerns about the effectiveness, fairness and efficiency of parallel flows
This research uses simulation to investigate the effectiveness, fairness and efficiency questions
Based on simulations with empirically based loss model, parallel TCP is effective and efficient, but not always fair
May be possible to improve fairness
3. Outline Introduction
Motivation
Background
Simulation
Evaluation
Conclusion
4. Introduction HPC community needs high speed bulk throughput
Using parallel TCP flows to increase throughput
Examples
Bbcp - Stanford Linear Accelerator (SLAC)
Globus - Argonne National Lab
GridFTP – Grid Forum and ANL
Storage Resource Broker – San Diego Supercomputer Center
PSockets Library – University of Illinois at Chicago
SLAC has extensive measurements that demonstrate successful use
5. Introduction Actual end-to-end network throughput is much less than expected
Host and Network tuning helps a little
Infrastructure upgrades help a little
But after tuning, throughput still much less than expected
Network measurements gathered from infrastructure show available unused bandwidth (“head room”)
Observed packet loss rates from transfers are too high to support high throughput bulk data transfers
6. Introduction Networking community discourages use of Parallel TCP flows
May cause congestion collapse at worst
Unfair to single stream flows at best
This is based on the belief that packet losses are due exclusively to network overload
7. Motivation This research examines the use of parallel TCP flows on shared networks
Goals of the research are to determine if parallel TCP is
Effective
Fair
Efficient
8. Motivation Effective
Does the use of parallel TCP flows increase aggregate throughput?
Fair
Does the use of parallel TCP flows steal bandwidth from competing TCP flows?
Efficient
Does the use of parallel TCP flows improve the overall efficiency of the network bottleneck?
9. Outline Introduction
Motivation
Background
Simulation
Evaluation
Conclusion
10. Background Factors that affect TCP throughput
Maximum Segment Size (MSS)
Maximum TCP segment size
Limited by maximum frame size supported by network
Round Trip Time (RTT)
Depends on
Length of network
Load on network (queueing delays)
Packet Loss Rate
Number of packets dropped / Number of packets transmitted
Packet losses considered a sign of overload
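The way these three factors combine can be sketched with the widely used Mathis et al. steady-state approximation, in which throughput is bounded by (MSS / RTT) * C / sqrt(p). This is a minimal sketch, not taken from the slides: the function name, the default constant C = sqrt(3/2), and the example values are assumptions for illustration.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3.0 / 2.0)):
    """Approximate upper bound on steady-state TCP throughput, in bits per second.

    mss_bytes: maximum segment size in bytes
    rtt_s:     round trip time in seconds
    loss_rate: packets dropped / packets transmitted, in (0, 1]
    c:         model constant; sqrt(3/2) is the value usually quoted for
               periodic loss without delayed ACKs (an assumption here)
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("mss_bytes and rtt_s must be positive")
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8.0 / rtt_s) * (c / math.sqrt(loss_rate))


# Hypothetical path: 1460-byte MSS, 70 ms RTT, one loss per 10,000 packets.
example_bps = mathis_throughput_bps(1460, 0.070, 1e-4)
```

Under this approximation throughput is proportional to MSS, inversely proportional to RTT, and falls with the square root of the loss rate, so a fourfold increase in loss halves the bound.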
11. Background Packet Loss
Most dynamic factor of the three
High rates of packet loss limit throughput
Cause assumed to be exclusively from overload
Statistical distribution of packet loss is important
12. Background Sources of Packet Loss
Network bottleneck overload
Other sources
Hardware and Software Bugs
Faulty Hardware
Others…
13. Background Implication
When there is no congestion, packet loss from other sources limits throughput
Evidence of non congestion packet loss
Lack of recorded drops in routers
Underutilized network links
Packet drops present in TCP sessions that are not due to overload
14. Background Parallel TCP flows
Overcomes effects of packet loss on throughput
Recovers from loss faster than single stream
Averages out effects of non-congestion related packet losses
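One way to see why parallel flows help is to apply the Mathis et al. steady-state bound to each of the n flows separately. This is an assumption for illustration, not a formula from the slides: p_i is the loss rate seen by flow i, all flows share the same MSS and RTT, and C is a model constant near 1.22.

```latex
\[
  BW_{\mathrm{agg}} \;\le\; \frac{MSS \cdot C}{RTT} \sum_{i=1}^{n} \frac{1}{\sqrt{p_i}}
\]
```

If non-congestion loss gives every flow roughly the same loss rate p regardless of n, the bound grows to about n times the single-flow bound.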
15. Outline Introduction
Motivation
Background
Simulation
Evaluation
Conclusion
16. Simulation NS2 simulation built to investigate the effectiveness, fairness, and efficiency of parallel TCP flows
17. Simulation Loss Model in simulator is critical
Measurements from real transfers used to build loss model
153 data transfers from U-M to Caltech
Performed over 3 days
Packet traces from experiments analyzed to extract losses
Source of Loss
Network operations centers certified no router drops during test
Bandwidth graph for network bottleneck showed underutilization
18. Simulation Observed Loss Characteristics
19. Simulation Right hand side of histogram
20. Simulation Left Hand Side of Histogram
Intraburst Losses
Collection of exponential distributions
Between 61% and 78% of analyzed intrabursts fit an exponential distribution
Right Hand Side of Histogram
Interburst Losses
Fits a normal distribution
21. Simulation Loss Models Considered
Constant Loss Probability
Random I.I.D.
Poisson Loss Arrival
Unconditional and Conditional Loss
A.k.a. 2-state Markov or Gilbert model
Kth Order Markov Loss Model
Extended Gilbert Model
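As a point of reference for the models listed above, the 2-state (Gilbert) model can be sketched as follows. This is a minimal sketch under a simplifying assumption: every packet sent while in the "bad" state is lost and none are lost in the "good" state. The function and parameter names are hypothetical.

```python
import random


def gilbert_losses(n_packets, p_good_to_bad, p_bad_to_good, seed=0):
    """Return a list of booleans, True where the packet is lost.

    The chain starts in the good state. Before each packet it may switch
    state with the given per-packet transition probability, and the packet
    is lost exactly when the chain is in the bad state.
    """
    rng = random.Random(seed)
    bad = False
    lost = []
    for _ in range(n_packets):
        if bad:
            if rng.random() < p_bad_to_good:
                bad = False
        else:
            if rng.random() < p_good_to_bad:
                bad = True
        lost.append(bad)
    return lost


# Hypothetical parameters: rare entry into the bad state, quick recovery.
trace = gilbert_losses(10_000, p_good_to_bad=0.001, p_bad_to_good=0.3)
observed_loss_rate = sum(trace) / len(trace)
```

Unlike a constant loss probability, this model produces bursts of consecutive losses, with mean burst length 1 / p_bad_to_good packets.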
22. Simulation 6-state Markov Model selected
6 states were enough to simulate throughput equivalent to observed
Markov chain used to drive a Markov Modulated Poisson Process (MMPP)
1 state is the loss state, 5 states no-loss
Sojourn time and transition probabilities from observed data
Poisson Loss Model used for the Loss State
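The structure described on this slide can be sketched as a Markov chain with one loss state, where losses arrive as a Poisson process only while the chain is in that state. This is a minimal sketch, not the ns2 implementation: sojourn times are assumed to be exponentially distributed, and every parameter value below is hypothetical because the slides do not list the fitted values.

```python
import random

# Hypothetical parameters for illustration only.
# State 0 is the loss state; states 1-5 are no-loss states.
EXAMPLE_MEAN_SOJOURN = [0.05, 1.0, 2.0, 4.0, 8.0, 16.0]  # seconds per visit
EXAMPLE_TRANSITIONS = [
    [0 if i == j else 1 for j in range(6)] for i in range(6)
]  # equal weight to every other state


def mmpp_loss_times(duration, mean_sojourn, transitions, loss_rate,
                    loss_state=0, start_state=1, seed=0):
    """Return sorted loss timestamps (seconds) in [0, duration).

    mean_sojourn[s]: mean time spent in state s per visit
    transitions[s]:  weights for choosing the next state when leaving s
    loss_rate:       Poisson loss arrivals per second while in loss_state
    """
    rng = random.Random(seed)
    n_states = len(mean_sojourn)
    state = start_state
    t = 0.0
    losses = []
    while t < duration:
        stay = rng.expovariate(1.0 / mean_sojourn[state])
        end = min(t + stay, duration)
        if state == loss_state:
            # Poisson process: exponentially distributed gaps between losses.
            u = t + rng.expovariate(loss_rate)
            while u < end:
                losses.append(u)
                u += rng.expovariate(loss_rate)
        t = end
        state = rng.choices(range(n_states), weights=transitions[state])[0]
    return losses


example_losses = mmpp_loss_times(
    600.0, EXAMPLE_MEAN_SOJOURN, EXAMPLE_TRANSITIONS, loss_rate=50.0
)
```

Having several no-loss states with different mean sojourn times lets the gaps between loss bursts vary over a wide range, which a 2-state model cannot reproduce.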
23. Simulation MultiState Loss Model in ns2 used to implement MMPP loss model
Extension made to ns2 to support MultiState Loss Model on multiple links in the simulator
Each simulation instance was run 10 times with different random seeds for the Loss Model
Total number of all simulations was over 3000
24. Outline Introduction
Motivation
Background
Simulation
Evaluation
Conclusion
25. Evaluation Effectiveness
Fairness
Efficiency
26. Evaluation Effectiveness Question
Does the use of parallel TCP flows increase aggregate throughput?
Addressing the Question
Between 1 and 6 parallel flows simulated
No Cross Traffic
27. Evaluation Effectiveness Results
28. Evaluation Effectiveness Conclusion
Parallel flows improve aggregate throughput in the presence of systemic non-congestion related packet loss
Corroboration of simulation results with observed results
29. Evaluation Effectiveness
Fairness
Efficiency
30. Evaluation Fairness Question
Does the use of parallel TCP flows steal bandwidth from competing TCP flows?
Addressing the Question
Between 1 and 12 parallel flows
Between 1 and 5 cross streams of competing single stream traffic
31. Evaluation Reading the Graphs
32. Evaluation
33. Evaluation
34. Evaluation
35. Evaluation
36. Evaluation
37. Evaluation
38. Evaluation Fairness Conclusions
Fair when there is more than approximately 10% unused bandwidth
Unfair when there is no available bandwidth
Parallel TCP flows steal bandwidth from competing single stream flows to increase throughput when there is no unused bandwidth
39. Evaluation Improving Fairness
Parallel flow aggressiveness due to
Increased recovery rate over single stream
Fractional response to packet drops
If we could make parallel flows only as aggressive as a single stream, can we preserve effectiveness and efficiency while improving fairness?
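The "fractional response" can be made concrete with a short calculation. It assumes, for illustration only, that the n flows have equal congestion windows w and that a single drop halves the window of just one flow.

```latex
\[
  W_{\text{before}} = n\,w,
  \qquad
  W_{\text{after}} = (n-1)\,w + \frac{w}{2} = n\,w\left(1 - \frac{1}{2n}\right)
\]
```

A single stream (n = 1) gives up half of its window on a drop, while an aggregate of n flows gives up only a fraction 1/(2n) of its combined window.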
40. Evaluation Slight modification to the TCP congestion avoidance algorithm
If n parallel flows are used, increase the congestion window by one packet for every n packets successfully transmitted, rather than by one packet for every packet successfully transmitted
Overall aggressiveness of n parallel flows is then the same as one single TCP flow
Simulations for 1 and 5 cross streams were run with 1 to 20 parallel streams to investigate boundaries
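The effect of the modification on the increase side can be sketched as follows. This is one reading of the slide, not the author's implementation: it assumes the usual congestion avoidance increment of 1/cwnd packets per ACK and scales it by 1/n. All names are hypothetical.

```python
def congestion_avoidance_increment(cwnd, n_flows=1):
    """Window growth (in packets) applied for one ACK in congestion avoidance.

    n_flows=1 gives the usual 1/cwnd increment; n_flows=n divides it by n,
    which is the assumed form of the modification.
    """
    if cwnd <= 0:
        raise ValueError("cwnd must be positive")
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    return 1.0 / (n_flows * cwnd)


def aggregate_growth_per_rtt(cwnd, n_flows, modified):
    """Approximate packets added to the combined window in one loss-free RTT.

    Each of the n flows is assumed to hold the same window cwnd and to
    receive about cwnd ACKs per round trip.
    """
    per_ack = congestion_avoidance_increment(cwnd, n_flows if modified else 1)
    return n_flows * cwnd * per_ack


unmodified = aggregate_growth_per_rtt(cwnd=10.0, n_flows=4, modified=False)
modified = aggregate_growth_per_rtt(cwnd=10.0, n_flows=4, modified=True)
```

With these assumptions the unmodified aggregate grows by about n packets per round trip and the modified aggregate by about one packet, the same as a single flow. The sketch covers only the increase rate; the response to packet drops is unchanged.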
41. Evaluation
42. Evaluation
43. Evaluation
44. Evaluation
45. Evaluation
46. Evaluation
47. Evaluation
48. Evaluation Parallel flows with modification are about ½ as aggressive as parallel flows with no modification
Also found some asymptotic behavior as the number of parallel flows increased
49. Evaluation Asymptotic behavior
Derived aggregate throughput of parallel flow with modified TCP
50. Evaluation
51. Evaluation
52. Evaluation Fairness Conclusions
Fair when there is more than 10% available bandwidth in bottleneck
Parallel flows steal from single stream flows when bottleneck is over 90% utilized
TCP modification
Reduces aggressiveness
Curbs ability of parallel flows to steal bandwidth as the number of flows increases
53. Evaluation Effectiveness
Fairness
Efficiency
54. Evaluation Efficiency Results
Efficiency is increased when parallel flows used if there is unused bandwidth in bottleneck
When all nodes use same number of parallel flows
Efficiency maintained
Fairness maintained
55. Outline Introduction
Motivation
Background
Simulation
Evaluation
Conclusion
56. Conclusions Parallel flows are
Effective
Fair when bottleneck is utilized less than 90%
Unfair when bottleneck is near saturation
Efficient
TCP congestion avoidance algorithm can be modified to
Reduce aggressiveness by approximately 1/2
Maintain effectiveness and efficiency
57. Future Work Implement modified algorithm for assessment
Further investigate loss models
Parameterization of loss models
Assessment of end-to-end network loss characteristics
Investigate optimal TCP response to observed loss characteristics
Investigate stochastic analysis of parallel TCP over wide area networks