LT-TCP: End-to-End Framework to Improve TCP Performance over Networks with Lossy Channels Omesh Tickoo, Vijay Subramanian, Shiv Kalyanaraman (Rensselaer Polytechnic Institute) K. K. Ramakrishnan (AT&T)
Overall Motivation • TCP response to errors and congestion is the same: • drop the window, and thus reduce load on the network • In the worst case, time out when particular sequences of packets are lost (retransmissions, an entire window) • TCP was designed for congestion, with loss rates of at most 1-2%. • TCP suffers significant timeout penalties with erasure rates > 5%. • Wireless channels are becoming more pervasive • With mesh networks (infrastructure or community), it is likely that more than the last hop will be wireless. • Wireless links: • individual links can experience high loss (even 10-15%) in transient situations, until power and link rate adjustments kick in • interference can also result in high loss rates. • E.g., ad-hoc networks, mesh networks.
Approach • Tools available to us: • Method of getting congestion indication that is separate from packet loss due to errors: Explicit Congestion Notification (ECN) • Use error recovery methods beyond retransmission and timeouts to overcome packet loss, so that TCP’s performance is retained. • Use FEC on an end-to-end basis: • Dynamic knowledge of the loss information can be exploited by the end-system. • Track short-term loss rates. • Protect data by using FEC proactively and reactively. • FEC can work in a coordinated fashion with TCP’s window mechanisms to optimize the usage of FEC within a window (an option that is not available at the link level).
Goals We pose the following questions: • Dynamic Range: • Can we extend the dynamic range of TCP into high loss regimes? • Can TCP perform close to the theoretical capacity achievable under high loss rates? • Congestion Response: • How should TCP respond to notifications due to congestion… • … but not respond to packet erasures that do not signal congestion? • Mix of Reliability Mechanisms: • What mechanisms should be used to extend the operating point of TCP to packet loss rates from 0% to 50%? • How can Forward Error Correction (FEC) help? • How should the FEC be split between sending it proactively (insuring the data in anticipation of loss) and reactively (sending FEC in response to a loss)? • Timeout Avoidance: • Timeouts: Useful as a fall-back mechanism but wasteful otherwise, especially under high loss rates. • How can we add mechanisms to minimize timeouts?
TCP uses Loss Feedback to Estimate Available Capacity. LT-TCP: Adaptive Mechanisms to Reinstate Performance (adaptive MSS, proactive and reactive FEC, erasure recovery, loss estimation). [Figure: sender and receiver over a lossy path, showing capacity used against available capacity, with loss feedback returned through acknowledgements; X = packet erasure.]
Building Blocks… • ECN-Only: We infer congestion solely from ECN markings. Window is cut in response to • ECN signals, which means that hosts/routers have to be ECN-capable. • Timeouts: The response to a timeout is the same as before. • Window Granulation and Adaptive MSS: We ensure that the window always has at least G segments. • Window size in bytes initially is the same as normal SACK TCP. • Initial segment size is small to accommodate G segments. • Packet size is continually adjusted so that we have at least G segments. Once we have G segments, packet size increases with window size. • Loss Estimation: The receiver continually tracks the loss rate and provides a running estimate of perceived loss back to the TCP sender through ACKs. An adaptive EWMA approach to estimating loss is used.
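The ECN-only response can be illustrated with a small sketch. It is not the authors' implementation: the function name `next_cwnd`, the event labels, halving on ECN, and collapsing to one segment on timeout are assumptions modeled on conventional TCP behavior. The point it shows is that a packet erasure alone leaves the window untouched.

```python
def next_cwnd(cwnd_bytes: int, event: str, mss_bytes: int) -> int:
    """Return the congestion window after one event (hypothetical helper).

    event is one of:
      "ecn"     - an ECN congestion mark was echoed by the receiver
      "timeout" - the retransmission timer expired
      "erasure" - a packet was lost without any congestion signal
    """
    if event == "ecn":
        # Assumption: halve the window, as conventional TCP does on congestion.
        return max(cwnd_bytes // 2, mss_bytes)
    if event == "timeout":
        # Assumption: fall back to one segment, as conventional TCP does.
        return mss_bytes
    if event == "erasure":
        # ECN-only: loss by itself is not treated as congestion.
        return cwnd_bytes
    raise ValueError(f"unknown event: {event!r}")
```

With an 8000-byte window and 1000-byte segments, an erasure keeps the window at 8000 bytes, while an ECN mark cuts it to 4000 bytes.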
Building Blocks … • Proactive FEC: TCP sender sends data in blocks where the block contains K data segments and R FEC packets. The amount of FEC protection (R) is determined by the current loss estimate. • Proactive FEC based upon estimate of per-window loss rate (Adaptive) • Reactive FEC: Upon receipt of 1 or 2 dupacks, Reactive FEC packets are sent based on the following criteria. • Number of Proactive FEC packets already sent. • Number of holes still left in the decoding block. • Loss rate currently estimated. • Reactive FEC to complement retransmissions
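The slides do not give the exact rule that maps the loss estimate to the number of proactive FEC packets. One plausible sketch, under the assumption that the sender wants the expected number of surviving packets in a block to be at least K, is shown below. The function name `proactive_fec_count` and the sizing rule itself are illustrative assumptions.

```python
import math


def proactive_fec_count(k_data: int, loss_estimate: float) -> int:
    """Number of FEC packets R to add to K data segments (assumed rule).

    Picks the smallest block size N with N * (1 - loss_estimate) >= K,
    so that on average at least K packets of the block survive.
    """
    if k_data < 0:
        raise ValueError("k_data must be non-negative")
    if not 0.0 <= loss_estimate < 1.0:
        raise ValueError("loss_estimate must be in [0, 1)")
    n_block = math.ceil(k_data / (1.0 - loss_estimate))
    return n_block - k_data
```

Under this rule a loss estimate of zero adds no FEC at all, and a 50% estimate doubles the block (10 data segments get 10 FEC packets).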
Block Behavior: Per-Block Loss Estimator for P-FEC. EWMA estimator: $E = \alpha E_{\text{latest}} + (1-\alpha) E$, with $\alpha = E_{\text{latest}} / (E_{\text{latest}} + E)$. Estimation is done at the receiver and fed back to the sender. The estimate is fairly accurate within small erasure rate variations. Trade-off: over-estimation leads to overhead. [Figure: packet erasure rate and its estimate over time; annotations mark the overestimate after spikes and the resulting inefficiency period.]
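A minimal sketch of the adaptive EWMA update, where the weight on the newest sample is the sample's share of (sample + current estimate). The function name `update_loss_estimate` and the handling of the all-zero case are assumptions. The symbol for the weight is lost in the slide text, so it is called `alpha` here.

```python
def update_loss_estimate(estimate: float, latest: float) -> float:
    """One step of the adaptive EWMA loss estimator (sketch).

    alpha = latest / (latest + estimate)
    new   = alpha * latest + (1 - alpha) * estimate
    """
    total = latest + estimate
    if total == 0.0:
        # Assumption: with no loss seen so far the estimate stays at zero
        # (the formula itself would divide by zero here).
        return 0.0
    alpha = latest / total
    return alpha * latest + (1.0 - alpha) * estimate
```

Algebraically the update equals (latest² + estimate²) / (latest + estimate), which is never below the plain average of the two values. The estimate therefore jumps quickly after a loss spike and comes down more slowly afterwards, which is consistent with the over-estimation noted on the slide.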
Loss Tracking at Sender: Sender can quickly and accurately track the loss rate based on feedback from the receiver. [Figure: packet error rate over time (axis 0 to 400), stepping through 0%, 0%, 20%, 20%, 30%, 0%, 20%, 10%.]
Reed-Solomon FEC: RS(N,K). A block of size N carries K data packets and N-K FEC packets across the lossy network. Recovery of all K data packets is possible if we receive at least K packets out of N.
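The "at least K of N" condition can be turned into a recovery probability. The sketch below assumes each packet is erased independently with probability p, which is a simplification (the slides also discuss burst errors). The function name `block_recovery_probability` is hypothetical.

```python
from math import comb


def block_recovery_probability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n packets arrive.

    Assumes independent erasures with per-packet probability p.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    q = 1.0 - p  # per-packet delivery probability
    # Sum the binomial probabilities of receiving exactly i packets, i >= k.
    return sum(comb(n, i) * q**i * p**(n - i) for i in range(k, n + 1))
```

For example, an RS(4,2) block at 50% erasure is recoverable with probability 11/16 under this model.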
Timeout Cause #1: Burst Errors + Large MSS. [Figure: a window of a few large segments in transmission; a burst of transmission losses (X) hits every segment.] Complete window lost!
Window Granulation Reduces the Risk of Losing the Complete Window. [Figure: a window of seven smaller segments in transmission; some segments are lost (X), while the surviving segments produce an ACK stream that drives retransmissions (rexmits).]
Timeout Cause #2: Insufficient Dupacks => SACK not triggered. [Figure: a window of six segments in transmission; transmission losses (X) leave the ACK stream with only a single duplicate ACK (DUPACK-1).] Timeout because of insufficient dupacks.
Proactive FEC. [Figure: a window of data segments 1-4 is sent together with proactive FEC (P-FEC) packets; some packets are lost in transmission (X); the receiver's FEC decoder combines the surviving data and P-FEC packets to recover the data packets.]
Timeout Cause #3: Loss of Retransmissions. [Figure: a window of six segments in transmission with losses (X); the ACK stream carries DUPACK1, DUPACK2 and DUPACK3; the retransmission of segment 2 is then itself lost (X).] Rexmits are ESPECIALLY vulnerable!
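Reactive FEC: Complements Rexmits. [Figure: a window of six segments in transmission with losses (X); the ACK stream returns DUPACK1, DUPACK2 and DUPACK3 with selective acknowledgements; the sender sends reactive FEC (R-FEC) packets, which the receiver's FEC decoder combines with the received data to rebuild the missing packets.]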
Putting it Together… [Figure: block diagram with the components application data, MSS adaptation (granulated window size), window size, loss estimation (loss estimate), FEC computation (n,k), and the resulting window of data plus P-FEC.]
Performance Results. [Figures: SACK (Multiple Sources); LT-TCP (Multiple Sources).]
Contribution of Components (20% PER case, Single Source) • LT-TCP is able to • reduce timeouts drastically • keep the queue non-empty, maximizing throughput and capacity utilization. • minimize use of FEC to the level needed
Comparison w/ Link Layer FEC, HARQ • LL FEC: FEC based upon average PER • HARQ: 10% FEC; ARQ persistence = 3 • LT-TCP: end-to-end
Summary • TCP performance over wireless with residual erasure rates 0-50% (short- or long-term). • E2E FEC: • Granulation ensures better flow of ACKs, especially in the small-window regime. • Adaptive FEC (proactive and reactive) can protect critical packets appropriately • Adaptive => No overhead when there is no loss. • ECN used to distinguish congestion from loss. • Near-optimal performance for a wide range: from low to high loss rates. • Future Work: • Optimal division of reliability functions between PHY, MAC, and E2E • Study of interaction between LT-TCP and link-layer schemes.
Thanks! Researchers: Omesh Tickoo: tickoo@rpi.edu Vijay Subramanian: subrav@rpi.edu Shiv Kalyanaraman: shivkuma@rpi.edu K.K. Ramakrishnan, kkrama@research.att.com
Building Block Behavior: Adaptive MSS (Window Granulation) • Adaptive MSS behavior. • Congestion window (in segments) kept above G = 10 • MSS increases when CWND grows, • MSS shrinks when CWND shrinks to maintain G
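The adaptive MSS behavior can be sketched as a single sizing rule: pick the largest segment size that still leaves at least G segments in the window. G = 10 comes from the slide. The function name `adaptive_mss` and the `min_mss` and `max_mss` bounds are assumptions added for illustration.

```python
def adaptive_mss(cwnd_bytes: int, g: int = 10,
                 min_mss: int = 64, max_mss: int = 1460) -> int:
    """Largest MSS that keeps at least g segments in the window (sketch).

    min_mss and max_mss are hypothetical bounds. If the window is smaller
    than g * min_mss, the floor wins and fewer than g segments fit.
    """
    if cwnd_bytes <= 0 or g <= 0:
        raise ValueError("cwnd_bytes and g must be positive")
    mss = cwnd_bytes // g  # grows and shrinks with the congestion window
    return max(min_mss, min(max_mss, mss))
```

A 5000-byte window gives 500-byte segments, and once the window is large enough the MSS saturates at `max_mss`, after which the segment count grows instead.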
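Shortened Reed-Solomon FEC (per-Window): RS(N,K) with K = d + z, where d is the number of data packets and z the number of zero-padding packets. [Figure: a block of size N made up of zeros (Z), data (D), proactive FEC (F) and reactive FEC (R), with the window (W) marked.]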
Performance Results: Drop in Performance from 40% to 50%, LT-TCP (Single Source) • At a 50% error rate, timeouts increase drastically because: • Few Proactive FEC packets are received. • Proactive FEC cannot counter variation in error patterns. • Reactive FEC is insufficient in this case to avoid timeouts. • This effect can be mitigated by increasing FEC protection.
Changes w.r.t. submitted paper • FEC is now done on a block-by-block basis. • Proactive protection is determined solely by the loss estimate (no arbitrary constants). • Reactive FEC packets may be wasted if they belong to the wrong block. • Conditions under which Reactive FEC packets are sent are restricted (discussed earlier). • Window granulation is done using the following rule: • “Send as big a packet as possible while maintaining granularity” • Throughput and goodput are measured at the receiver for better accuracy. • On partial dupacks, we make sure that retransmissions are not duplicated. We send new TCP data instead. • Loss tracking is now done whenever we receive an ACK. • Loss estimation at the receiver has changed to accommodate block-by-block decoding.