CSE 561 – Reliability David Wetherall djw@cs.washington.edu Spring 2000
This Lecture • Routers, continued. • End to End argument • Retransmission timers djw // CS 561, Spring 2000
E2E Paper • Saltzer, Reed, Clark 1984 • Captures folklore in a now classic systems paper • Design requires deciding what functions to implement and where to place them; E2E guides the latter • E2E is a powerful design principle, but not a law
E2E Argument • A function might be placed in the application, the network, or both. • Place it where it can be done correctly and completely; put a partial (incomplete) version elsewhere only if it provides a significant performance gain. • Rationale for moving functions out of the network and into end-systems. Acts as an Occam’s razor.
Example: Careful File Transfer • File transfer is exposed to many kinds of errors other than network corruption (disk, software) • An E2E check and retry is required for correctness • It reduces the complexity of handling low-probability failures • Strong network checks are not sufficient • So avoid them as wasteful • Q: How does file transfer work today?
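The check-and-retry idea above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's protocol: `careful_transfer` and its `send` parameter are hypothetical names, `send` stands in for the whole path (read from disk, cross the network, write to disk, read back), and SHA-256 is just one possible end-to-end check.

```python
import hashlib


def careful_transfer(data, send, max_attempts=3):
    """Send `data` through `send` and verify the result end to end.

    `send` is a hypothetical callable that models the entire path and
    returns the bytes that finally landed at the destination.
    Returns (received_bytes, attempts_used).
    """
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_attempts + 1):
        received = send(data)
        # The end-to-end check: compare digests computed at the two ends.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
        # Mismatch: some stage (disk, software, network) corrupted the data,
        # so retry the whole transfer.
    raise IOError("transfer failed after %d attempts" % max_attempts)
```

The check does not care which stage caused the corruption, which is why per-hop network checks alone cannot replace it.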
Simplicity vs. Complexity • Downsides of low-level implementation • Duplication of effort can lower performance • Optimizes network for one type of use • Upsides of low-level implementation • Partial implementation can improve performance • Avoid repeated implementation by each app
Tensions • Non-performance aspects of network implementation • Bandwidth enforcement, firewalls, AUPs • Need to take administrative regions into account • System evolution • Transparent caching, NAT boxes • Want to administer end-systems in a scalable manner • Generic vs. per-application network support • Multicast, content distribution, active networks • Value in using network location, provided it does not impose a cost on all users
Retransmission Timers • Timeouts (RTO) are used to decide that a packet has been lost and should be retransmitted. • Based on an estimate of the RTT, and hence of the maximum likely RTT • SRTT = alpha × sample + (1 − alpha) × SRTT (the SRTT on the right is the previous estimate) • RTO = beta × SRTT • Exponential backoff of the RTO for successive losses
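The two formulas and the backoff rule above can be sketched as a small Python class. The class name `RtoEstimator`, the default values `alpha=0.125` and `beta=2.0`, and the `max_rto` cap are illustrative assumptions, not values given in the lecture.

```python
class RtoEstimator:
    """Smoothed RTT estimate with a multiplicative RTO and exponential backoff."""

    def __init__(self, first_sample, alpha=0.125, beta=2.0, max_rto=60.0):
        self.alpha = alpha          # weight given to each new sample (assumed)
        self.beta = beta            # safety factor applied to SRTT (assumed)
        self.max_rto = max_rto      # cap so backoff cannot grow forever (assumed)
        self.srtt = first_sample    # seed the estimate with the first sample
        self.backoff = 1            # multiplier doubled on each timeout

    def on_sample(self, sample):
        # SRTT = alpha x sample + (1 - alpha) x previous SRTT
        self.srtt = self.alpha * sample + (1 - self.alpha) * self.srtt
        self.backoff = 1            # a fresh sample ends the backoff

    def on_timeout(self):
        # Exponential backoff for successive losses.
        self.backoff *= 2

    @property
    def rto(self):
        # RTO = beta x SRTT, scaled by the current backoff and capped.
        return min(self.beta * self.srtt * self.backoff, self.max_rto)
```

With `alpha=0.5` and a first sample of 1.0 s, a second sample of 2.0 s moves SRTT to 1.5 s and the RTO to 3.0 s; each subsequent timeout doubles the RTO until a new sample arrives.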
The Value of a Good Timer • Q: Does any of this matter? • A: Yes. Critical to performance (protocol and network) • If too large: • Detection of losses is delayed, the window doesn’t advance, and the result is low throughput, esp. on error-prone links • If too small: • Early retransmissions (before the original ack arrives) • Seriously bad for the network: spurious retransmissions add load exactly when it may already be congested
Karn and Partridge (SIGCOMM’87) • Deals with retransmission ambiguity • Is the ack for the original or the retransmitted packet? • Problem: timers were failing, and this ambiguity was one piece of the cause. • Can’t assume acks are always for the new packet or always for the old one, and can’t simply ignore acks for retransmitted packets.
Karn’s Algorithm • Insight: Use backoff as part of RTT estimation • 1. Don’t use the ack for a retransmitted packet to calculate RTT • 2. On loss, back off the RTO and keep using the backed-off value for subsequent packets until step 3. • 3. When an ack for a singly-transmitted packet arrives, use the sample to update RTT and reset RTO • Step 1 avoids retransmission ambiguity, step 2 ensures good samples will arrive, and step 3 tells us when we get one.
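The three rules of Karn’s algorithm can be sketched as follows. This is a simplified illustration rather than the paper's code: the class name `KarnTimer`, the `transmissions` counter passed to `on_ack`, and the `alpha`/`beta` defaults are assumptions made for the example.

```python
class KarnTimer:
    """Sketch of Karn's rules on top of a simple smoothed RTT estimate."""

    def __init__(self, initial_rtt, alpha=0.125, beta=2.0):
        self.alpha = alpha                  # smoothing weight (assumed value)
        self.beta = beta                    # RTO safety factor (assumed value)
        self.srtt = initial_rtt
        self.rto = beta * initial_rtt

    def on_timeout(self):
        # Rule 2: back off the RTO; it stays backed off for later packets
        # until a clean sample arrives.
        self.rto *= 2

    def on_ack(self, sample, transmissions):
        """Process an ack; `transmissions` is how many times the packet was sent.

        Returns True if the sample was used to update the estimate.
        """
        # Rule 1: an ack for a retransmitted packet is ambiguous, so skip it.
        if transmissions > 1:
            return False
        # Rule 3: an unambiguous sample updates SRTT and resets the RTO.
        self.srtt = self.alpha * sample + (1 - self.alpha) * self.srtt
        self.rto = self.beta * self.srtt
        return True
```

Keeping the backed-off RTO is what lets a later packet get through without being retransmitted, which in turn produces the clean sample that rule 3 waits for.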
TCP Timestamps • Several TCP options were added in the early 1990s as part of the extensions for high performance • Round-Trip Time Measurement • Want more than one RTT sample per window • Send a timestamp with each packet; the receiver echoes it in the ack • Resolves retransmission ambiguity, since the echoed timestamp identifies which transmission is being acked
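A minimal sketch of timestamp echoing shows why it removes the ambiguity. The `Segment` and `Ack` types and their field names are hypothetical, loosely modeled on the timestamp value and echo fields of the TCP option; real TCP uses 32-bit clock ticks and cumulative byte acks, both of which are simplified away here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int
    ts_val: float   # sender's clock when this particular copy was sent


@dataclass
class Ack:
    ack: int
    ts_echo: float  # timestamp copied from the segment being acknowledged


def make_ack(segment):
    # Receiver side: echo the timestamp of the segment that triggered the ack.
    return Ack(ack=segment.seq + 1, ts_echo=segment.ts_val)


def rtt_sample(ack, now):
    # Sender side: the echoed timestamp says which transmission was acked,
    # so the sample is valid even if the packet was retransmitted.
    return now - ack.ts_echo
```

For example, if a segment is first sent at time 10.0 and retransmitted at 12.0, an ack arriving at 12.5 that echoes 12.0 yields a 0.5 s sample, while one echoing 10.0 yields 2.5 s. Either way the sender knows which transmission it measured.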