TCP/UDP/IP Courtesy of Kevin Fall at UC Berkeley & Raghupathy Sivakumar at GATECH
TCP/IP Protocol Suite • Physical layer • Data-link layer: ARP, RARP • Network layer: IP, ICMP, IGMP • Transport layer: TCP, UDP, RTP • Application layer: HTTP, SMTP, FTP • Figure: layer stack, top to bottom: Application, Transport, IP, DataLink, Physical
TCP/IP Protocol Suite • IP runs on every network node, including routers; the Transport and Application layers run only on the end hosts • Figure: the source and destination hosts each have the full stack (Application, Transport, IP, DataLink, Physical); each of the two routers between them has only IP, DataLink, and Physical
Internet Protocol (IP) service model • best-effort datagram model • error detection in header only • addressing, routing • signaling (ICMP) • Fragmentation and reassembly • Multiplexing and Demultiplexing
Addressing • Need a unique identifier for every host in the Internet (analogous to a postal address) • IP (IPv4) addresses are 32 bits long • Hierarchical addressing scheme • Conceptually … • IP address = (NetworkAddress, HostAddress)
Address Classes • Class A: leading bit 0, 7-bit netId, 24-bit hostId • Class B: leading bits 10, 14-bit netId, 16-bit hostId • Class C: leading bits 110, 21-bit netId, 8-bit hostId
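The class layouts above can be checked with a few bit operations. The following is a minimal sketch, not a definitive implementation; the function names `address_class` and `split_classful` are hypothetical, and classes D and E (not covered on the slide) are simply reported as "other".

```python
import ipaddress

# (netId bits, hostId bits) per class, as listed on the slide.
CLASS_LAYOUT = {"A": (7, 24), "B": (14, 16), "C": (21, 8)}


def address_class(value):
    """Classify a 32-bit IPv4 address by its leading bits."""
    if value >> 31 == 0b0:
        return "A"
    if value >> 30 == 0b10:
        return "B"
    if value >> 29 == 0b110:
        return "C"
    return "other"  # classes D and E are not covered on the slide


def split_classful(dotted):
    """Return (class, netId, hostId) for a dotted-quad class A/B/C address."""
    value = int(ipaddress.IPv4Address(dotted))
    cls = address_class(value)
    if cls == "other":
        raise ValueError("not a class A, B, or C address: " + dotted)
    net_bits, host_bits = CLASS_LAYOUT[cls]
    net_id = (value >> host_bits) & ((1 << net_bits) - 1)  # drop the class bits
    host_id = value & ((1 << host_bits) - 1)
    return cls, net_id, host_id
```

For example, `split_classful("10.1.2.3")` reports class A with netId 10, because the first octet supplies the leading 0 bit plus the 7-bit netId.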
Addresses and Hosts • Since netId is encoded into IP address, each host will have a unique IP address for each of its network connections • Hence, IP addresses refer to network connections and not hosts • Why will hosts have multiple network connections?
Exceptions to Addressing • Subnetting • Splitting hostId into subnetId and hostId • Achieved using subnet masks • Supernetting (Classless Inter-Domain Routing, or CIDR) • Combining multiple lower-class address ranges into one range • Achieved using 32-bit masks and longest-prefix-match routing
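Both ideas can be tried out with Python's standard `ipaddress` module. This is only an illustrative sketch; the address ranges are made-up examples (private address space), not taken from the slides.

```python
import ipaddress

# Subnetting: split the 16-bit hostId of a class B network into an
# 8-bit subnetId and an 8-bit hostId (that is, a /24 subnet mask).
class_b = ipaddress.ip_network("172.16.0.0/16")
subnets = list(class_b.subnets(new_prefix=24))

# Applying the subnet mask to a host address yields its subnet.
host = ipaddress.ip_address("172.16.5.9")
mask = ipaddress.ip_address("255.255.255.0")
subnet_of_host = ipaddress.ip_address(int(host) & int(mask))

# Supernetting (CIDR): four contiguous class C ranges collapse into one /22.
blocks = [ipaddress.ip_network("192.168.%d.0/24" % i) for i in range(4)]
summary = list(ipaddress.collapse_addresses(blocks))

print(len(subnets), subnet_of_host, summary)
```

The four /24 blocks merge only because they are contiguous and aligned on a /22 boundary; arbitrary ranges would not combine into a single prefix.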
Examples • Subnetting (class B) • Before: (Network, Host) • After: (Network, Subnet, Host)
IP Routing • Direct: used when the source and destination hosts are connected directly (same network); still requires IP-address-to-physical-address translation (ARP) • Indirect: table-driven routing; each entry is (NetId, RouterId); a default router handles destinations with no matching entry; host-specific routes are also possible
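Table-driven routing with a default router and a host-specific route can be sketched as a small lookup. The table contents and router names below are hypothetical, and picking the most specific (longest-prefix) entry is an assumption borrowed from CIDR rather than something this slide states.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next-hop router).
ROUTES = [
    ("10.0.0.0/8", "router-a"),
    ("10.1.0.0/16", "router-b"),
    ("10.1.2.3/32", "router-c"),       # host-specific route
    ("0.0.0.0/0", "default-router"),   # default router matches everything
]


def next_hop(dst, routes=ROUTES):
    """Return the router of the most specific entry that contains dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, router in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, router)
    return best[1] if best else None
```

With this table, `next_hop("10.1.2.3")` picks the host-specific route, while an address outside 10.0.0.0/8 falls through to the default router.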
IP Fragmentation • The physical network layers of different networks in the Internet might have different maximum transmission units (MTUs) • The IP layer performs fragmentation when the next network has a smaller MTU than the current network • Figure: a datagram crosses from a network with MTU = 1500 to a network with MTU = 500 and is fragmented
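As a rough illustration of the MTU = 1500 to MTU = 500 case, the sketch below computes fragment sizes and offsets. It assumes a 20-byte IP header with no options and uses the IPv4 rule that fragment offsets are expressed in 8-byte units; the function name `fragment` is hypothetical.

```python
def fragment(payload_len, mtu, header_len=20):
    """Split an IP payload into (offset_in_8_byte_units, size, more_fragments).

    Every fragment except the last must carry a multiple of 8 data bytes,
    because the fragment offset field counts 8-byte units.
    """
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len   # the "more fragments" flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# A full 1500-byte datagram carries 1480 data bytes after its 20-byte header.
for frag in fragment(1480, 500):
    print(frag)
```

Note that each fragment carries its own IP header, which is the "many headers" inefficiency mentioned on the reassembly slide.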
IP Reassembly • Fragmented packets need to be put together • Where does reassembly occur? • Option 1: the router at the other end of the smaller-MTU network; drawbacks: router overhead (complexity, buffering), and fragments may take more than one path • Option 2: the final destination (this is what IP does); drawbacks: many fragments travel the rest of the path, so there is more chance of a missing fragment, and utilization is inefficient (many headers)
IP Header • Used for conveying information to peer IP layers • Figure: the IP layers at the source, at each of the two routers, and at the destination are peers; Transport and Application layers exist only on the end hosts
IP Header (contd.) • Word 1: 4-bit version, 4-bit header length, 8-bit TOS, 16-bit total length • Word 2: 16-bit identification, 3-bit flags, 13-bit fragment offset • Word 3: 8-bit TTL, 8-bit protocol, 16-bit header checksum • Word 4: 32-bit source IP address • Word 5: 32-bit destination IP address • Options (if any; maximum 40 bytes) • Data
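The fixed 20-byte part of the header maps directly onto a `struct` format string. The sketch below is illustrative only: the function names and the sample header bytes are assumptions made for the example, and options are ignored.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,      # field counts 32-bit words
        "tos": tos,
        "total_len": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,  # in 8-byte units
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def internet_checksum(data):
    """One's-complement sum of 16-bit words, then complemented."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                            # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative header bytes (not from the slides): a UDP datagram, TTL 64.
SAMPLE = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
fields = parse_ipv4_header(SAMPLE)
```

Running the checksum over a header that already contains a correct checksum field yields 0, which is how a receiver can verify the header.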
Multiplexing • Figure: on each host, the applications (Web, Email, MP3) sit above TCP and UDP, which sit above IP; the two hosts exchange IP datagrams
Endpoint identification • how to identify a remote application/service on the Internet? • [IP_address, port number, protocol] • expect to find a process listening for incoming packets
Port numbers • port numbers are 16 bits, in the range [0..65535] • ports below 1024 are known as well-known ports and are reserved by IANA • ports in the range [1024..65535] may be registered with IANA, but registration is not enforced
UDP • provides a datagram service model • Additional intelligence is built at the application layer if needed • Error detection (checksum) • header is 8 bytes
Sending a UDP datagram • application needs the dest IP address and port number in order to send • application chooses message size, requests send using API (e.g. sockets) • API allocates OS-level buffer, leaving room for headers, copies data from user-level buffer to OS-level buffer, gives it to the UDP module
Sending a UDP datagram • UDP module receives data and prepends IP and UDP headers • fills in IP header info: proto, len, src, dst,… • fills in UDP header: src_port, dst_port, len,… • sets TTL and TOS • sends UDP/IP packet to IP module • Figure: resulting frame layout: Ethernet header, IP header, UDP header, application data, Ethernet trailer
Sending a UDP datagram • IP module receives packet • inserts options if enabled • sets IP version, IHL, offset, ID fields • determines an interface/MTU • fragments if needed and sends to link layer
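The UDP header step in the sending sequence above can be sketched with `struct`. This is a simplified illustration, not how an OS implements it: the function name is hypothetical, and the checksum field is left at 0, which in IPv4 means that no checksum was computed.

```python
import struct

UDP_HEADER_LEN = 8  # src_port, dst_port, len, checksum: 2 bytes each


def build_udp_datagram(src_port, dst_port, payload):
    """Prepend an 8-byte UDP header to the application data."""
    length = UDP_HEADER_LEN + len(payload)   # length covers header + data
    checksum = 0                              # assumption: checksum not computed
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


datagram = build_udp_datagram(5000, 53, b"hello")
print(len(datagram), struct.unpack("!HHHH", datagram[:UDP_HEADER_LEN]))
```

In a real stack, the IP module would then add the IP header in front of these bytes and fragment the result if it exceeded the MTU.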
Receiving a UDP datagram • network adapter receives a frame, interrupts processor • device driver determines that the frame contains IP data, strips the link-layer header, and gives the packet to the IP module • IP checks IP header, processes options • IP checks IP address (unicast, multicast, …) • IP reassembles if necessary, then gives the whole packet to UDP based on the protocol field
Receiving a UDP datagram • UDP receives IP/UDP packet • checks length and checksum • locates the OS PCB (protocol control block) based on dest port, which identifies the receiving process; generates ICMP port unreachable if no process is listening • copies data to the receiving process’ buffer • wakes up the receiving process so that it can read the data
Why use UDP? • Downsides: no error correction; no flow control; no congestion control; app picks packet size • Upsides: no connection establishment; stateless; broadcast/multicast more straightforward; app picks packet size
TCP • End-to-end transport protocol • Responsible for reliability, congestion control, flow control, and sequenced delivery • Applications that use TCP: http (web), telnet, ftp (file transfer), smtp (email), chat • Applications that don’t: multimedia (typically) – use UDP instead
Ports, End-points, & Connections • Figure: applications (http, ftp, smtp, telnet, A1, A2, A3) attach to TCP or UDP through ports; the transport protocol ID selects TCP or UDP above the IP layer, which is identified by the IP address • Thus, an end-point is represented by (IP address, Port) • Ports can be re-used between transport protocols • A connection is (SRC IP address, SRC port, DST IP address, DST port) • The same end-point can be used in multiple connections
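The last two bullets can be made concrete with a lookup table keyed by the full 4-tuple. This is a toy sketch with hypothetical names and documentation-range addresses; it only shows why one server end-point can serve many connections at once.

```python
from collections import namedtuple

Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")

# One server end-point (IP address, port) shared by two connections.
server_endpoint = ("192.0.2.10", 80)
conn_a = Connection("198.51.100.1", 40001, *server_endpoint)
conn_b = Connection("198.51.100.2", 40001, *server_endpoint)

# Demultiplexing table: the full 4-tuple identifies the connection.
table = {conn_a: "socket for client A", conn_b: "socket for client B"}


def demux(src_ip, src_port, dst_ip, dst_port):
    """Find the socket for an incoming segment, or None if no connection."""
    return table.get(Connection(src_ip, src_port, dst_ip, dst_port))
```

Both clients happen to use source port 40001, yet the connections stay distinct because their source IP addresses differ.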
TCP • Connection Establishment • Connection Maintenance • Reliability • by acknowledgement packet (ACK) • Congestion control • Flow control • Sequencing • Connection Termination
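Fundamental Mechanism • Simple stop-and-wait protocol: send data, wait for the ACK, then send the next data • Timeout-based reliability (loss recovery): if no ACK arrives within the retransmission timeout (RTO), retransmit (retx) • Sliding window protocol: allow multiple unacknowledged packets (W) in flight over the packet sequence 1 2 3 4 5 6 7 8 9 10 11 12 ….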
Sliding window • The sender cannot send more data once W packets are unacknowledged; it must wait for an ACK to slide the window forward
Active and Passive Open • How do applications initiate a connection? • One end (server) registers with the TCP layer instructing it to “accept” connections at a certain port • The other end (client) initiates a “connect” request which is “accept”-ed by the server
Reliability (Loss Recovery) • Sequence numbers • TCP uses cumulative acknowledgments (ACKs) • Each ACK carries the sequence number of the next expected in-sequence packet • Pros and cons? • Piggybacking • Timeout calculation • RTT_avg = k * RTT_avg + (1 - k) * RTT_sample • RTO = RTT_avg + 4 * RTT_deviation • Figure: data/ACK timeline for packets 1 to 4
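The timeout calculation above can be sketched as a small estimator. The slide gives no value for k and does not say how the deviation is updated, so the defaults below (k = 0.875, a smoothed absolute error with k_dev = 0.75, and an initial deviation of half the first sample) are assumptions in the style of common TCP implementations; the class name is hypothetical.

```python
class RtoEstimator:
    """Tracks a smoothed RTT and deviation, per the slide's two formulas."""

    def __init__(self, first_sample, k=0.875, k_dev=0.75):
        self.k = k
        self.k_dev = k_dev                 # assumption: not given on the slide
        self.rtt_avg = first_sample
        self.rtt_dev = first_sample / 2    # assumption: initial deviation

    def update(self, sample):
        """Fold in one RTT sample and return the new RTO."""
        error = abs(sample - self.rtt_avg)
        self.rtt_dev = self.k_dev * self.rtt_dev + (1 - self.k_dev) * error
        # Rttavg = k*Rttavg + (1-k)*Rttsample
        self.rtt_avg = self.k * self.rtt_avg + (1 - self.k) * sample
        return self.rto()

    def rto(self):
        # RTO = Rttavg + 4*Rttdeviation
        return self.rtt_avg + 4 * self.rtt_dev


est = RtoEstimator(first_sample=100.0)   # milliseconds
for sample in (100.0, 200.0):
    print(sample, est.update(sample))
```

A large k makes the average react slowly to a single outlier, while the deviation term widens the timeout when samples vary.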
Retransmission (fast retransmit) • after 3 duplicate ACKs, the TCP sender infers that the packet is lost and retransmits it without waiting for the timeout
Congestion control: slow start (can be bottleneck!) • Initial window size W = 1 • Each ACK will increase W by 1, so W doubles every round-trip time
Congestion Control • Slow Start • Start with W=1 • For every ACK, W=W+1 • Congestion Avoidance (linear increase) • For every ACK, W = W+1/W • Congestion Control (multiplicative decrease) • ssthresh = W/2 • W = 1 • Alternative: fall to W/2 and start congestion avoidance directly
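The three rules above fit in a few lines. In this sketch the switch from slow start to congestion avoidance happens when W reaches ssthresh, which the slide implies but does not spell out; the class name and the initial ssthresh value are assumptions.

```python
class CongestionWindow:
    """Window W in packets, updated per ACK and per loss."""

    def __init__(self, ssthresh=16.0):
        self.w = 1.0                 # slow start begins with W = 1
        self.ssthresh = ssthresh     # assumption: initial threshold

    def on_ack(self):
        if self.w < self.ssthresh:
            self.w += 1.0            # slow start: W = W + 1 per ACK
        else:
            self.w += 1.0 / self.w   # congestion avoidance: W = W + 1/W

    def on_loss(self):
        self.ssthresh = self.w / 2   # multiplicative decrease
        self.w = 1.0                 # restart from slow start


cw = CongestionWindow(ssthresh=4.0)
for _ in range(4):
    cw.on_ack()
print(cw.w)        # three slow-start steps, then one avoidance step
cw.on_loss()
print(cw.w, cw.ssthresh)
```

Because roughly W ACKs arrive per round trip, adding 1/W per ACK in congestion avoidance grows the window by about one packet per round trip.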
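Why LIMD? (fairness) • Two flows start with windows 100 and 10; each row below lists (flow 1, flow 2, diff) • Option 1: on loss, W=1 • 100, 10, diff = 90 • 1, 1, diff = 0 • Problem? Fair immediately, but inefficient • Option 2: on loss, W=W/2 • 100, 10, diff = 90 • 50, 5, diff = 45 • 51, 6, diff = 45 • 52, 7, diff = 45 • .. • 73, 28, diff = 45 • 36.5, 14, diff = 22.5 • .. • 61.5, 39, diff = 22.5 • 30.75, 19.5, diff = 11.25 • .. • Linear increase keeps the difference constant; each multiplicative decrease halves it, so the flows converge toward a fair share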
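The W=W/2 trace above can be reproduced with a tiny simulation. It assumes a shared capacity of 100 packets, that both flows see a loss whenever their combined window exceeds it, and that each flow otherwise grows by 1 per round; these are simplifications made for the sketch, and the function name is hypothetical.

```python
def simulate(w1, w2, capacity=100, rounds=200):
    """Return the (w1, w2) pair after each round of LIMD."""
    history = []
    for _ in range(rounds):
        if w1 + w2 > capacity:
            w1, w2 = w1 / 2, w2 / 2      # multiplicative decrease on loss
        else:
            w1, w2 = w1 + 1, w2 + 1      # linear increase otherwise
        history.append((w1, w2))
    return history


history = simulate(100, 10)
for w1, w2 in (history[0], history[23], history[-1]):
    print(w1, w2, "diff =", abs(w1 - w2))
```

The difference between the two windows never grows: it stays constant during linear increase and halves at every loss, which is the fairness argument on the slide.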
Flow Control • Prevent sender from overwhelming the receiver • Receiver in every ACK advertises the available buffer space at its end • Window calculation • MIN(congestion control window, flow control window)
Sequencing • Byte sequence numbers • TCP receiver buffers out-of-order segments and reassembles them later • Starting sequence number randomly chosen during connection establishment • Why? • Figure (receiver timeline when segment 3 is lost): 1 given to app • 2 given to app • 3 lost • 4 buffered (not given to app) • retransmitted 3 arrives: 3 & 4 given to app • duplicate 4 discarded
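The receiver behaviour in the timeline above can be sketched as follows. This is a simplified model with hypothetical names: it assumes segments never partially overlap and ignores sequence-number wraparound.

```python
class Receiver:
    """Buffers out-of-order segments and delivers bytes in order."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq   # next in-sequence byte expected
        self.buffered = {}            # seq -> data held back from the app
        self.delivered = b""          # bytes already given to the app

    def on_segment(self, seq, data):
        """Accept one segment; return the cumulative ACK number."""
        if seq < self.next_seq or seq in self.buffered:
            return self.next_seq      # duplicate: discard
        self.buffered[seq] = data
        # Deliver every segment that is now contiguous.
        while self.next_seq in self.buffered:
            chunk = self.buffered.pop(self.next_seq)
            self.delivered += chunk
            self.next_seq += len(chunk)
        return self.next_seq


rx = Receiver(initial_seq=1000)
print(rx.on_segment(1000, b"aa"))   # in order: delivered
print(rx.on_segment(1004, b"cc"))   # gap at 1002: buffered, ACK unchanged
print(rx.on_segment(1002, b"bb"))   # fills the gap: both delivered
print(rx.on_segment(1004, b"cc"))   # duplicate: discarded
```

The returned value is the cumulative ACK, so a gap shows up to the sender as repeated ACKs for the same sequence number.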
Connection Establishment & Termination • 3-way handshake used for connection establishment • Server does a passive open • Client does an active open and sends a connection request (SYN) • Server accepts the connection request and sends its acceptance (SYN+ACK) • Client replies with ACK; the connection starts and DATA can flow • Delay! • Randomly chosen sequence number (why?) is conveyed to the other end • A similar FIN, FIN+ACK exchange is used for connection termination
TCP Segment Format • 16-bit SRC port, 16-bit DST port • 32-bit sequence number • 32-bit ACK number • HL (header length), Rsv’d (reserved), flags, 16-bit window size • 16-bit TCP checksum, 16-bit urgent pointer • Options (if any) • Data • Flags: URG, ACK, PSH, RST, SYN, FIN
Silly window syndrome (SWS) • TCP is a window-based protocol • If the TCP receiver advertises only a small window, the TCP sender transmits only a short packet each time • Inefficient utilization of network bandwidth • So what? • Save up enough window (or data) before sending
Nagle’s algorithm • Buffer all new user data while any unacknowledged data is outstanding • OK to send once everything is ACK’d or an MSS worth of data is ready • If low delay is wanted, Nagle’s algorithm should be disabled • MSS: maximum TCP payload size • MTU: maximum PDU size supported by the link layer • MTU = MSS + 20 (TCP header) + 20 (IP header)
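The send rule and the MSS arithmetic above can be written down directly. This is a minimal sketch of the decision only, with hypothetical function names; it assumes 20-byte TCP and IP headers with no options, as on the slide.

```python
IP_HEADER = 20    # bytes, no options
TCP_HEADER = 20   # bytes, no options


def mss_for_mtu(mtu):
    """MTU = MSS + 20 (TCP header) + 20 (IP header), solved for MSS."""
    return mtu - TCP_HEADER - IP_HEADER


def nagle_can_send(pending_bytes, unacked_bytes, mss):
    """Send if a full MSS is ready or nothing is awaiting an ACK."""
    return pending_bytes >= mss or unacked_bytes == 0


mss = mss_for_mtu(1500)
print(mss)
print(nagle_can_send(pending_bytes=1, unacked_bytes=0, mss=mss))    # idle link
print(nagle_can_send(pending_bytes=1, unacked_bytes=10, mss=mss))   # must wait
```

With the sockets API, an application that wants low delay can disable Nagle's algorithm by setting the `TCP_NODELAY` socket option.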
Interactive applications: Telnet • Remote terminal applications (e.g., Telnet) send characters to a server. The server interprets each character and sends the resulting output back to the client. • For each character typed, you see three packets: • Client -> Server: send typed character • Server -> Client: echo of character (or user output) and acknowledgement for first packet • Client -> Server: acknowledgement for second packet
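Why 3 packets per character? • We would expect four packets per character: the typed character, its ACK, the echo, and the ACK of the echo • However, tcpdump shows only three packets • What has happened? TCP has delayed the transmission of the ACK for the typed character, so that ACK is piggybacked on the echo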