This text provides an overview of congestion control techniques, TCP delay modeling, network protocols, routing in the Internet, and approaches towards congestion control. It also discusses the basics of TCP congestion control, fair sharing of network resources, and delay modeling in TCP.
Plan Ahead 5th week: • Congestion control, TCP delay modeling • Network protocols: IPv4, IPv6 6th week: • network routing, routing in the Internet 7th week: • Midterm • Broadcast and multicast routing Before final: • Data link layer, Ethernet, switches, wireless networking CS118/Spring05
Congestion Control Congestion: “too many sources sending too much data too fast for network to handle” Scenario 1 • 2 identical senders, 2 receivers, one router w/infinite buffer, no retransmission • when congested: • large delays; • maximum achievable throughput CS118/Spring05
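Congestion: scenario 2
• one router, finite buffers
• senders retransmit on timeout
• λin: original data; λ'in: original data plus retransmitted data; λout: delivered data
• with no loss: λin = λout
• data loss leads to λ'in > λout
• retransmission of delayed (not lost) packets makes λ'in much larger than R/2
[Figure: three plots of λout versus λin, axes marked at R/2, with one curve levelling off at R/4]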
Congestion: scenario 3
• λin: original data
• λ'in: original data, plus retransmitted data
• finite shared output link buffers
Q: what happens as λin and λ'in increase?
• long delays
• superfluous retransmissions
• when a packet is dropped, any “upstream transmission capacity” used for that packet was wasted!
[Figure: Host A and Host B sending through the shared buffers, with λout measured at the receiver]
Approaches towards congestion control Network-assisted congestion control: routers provide feedback to end hosts • Single bit congestion indication • Explicit rate sender should send at End-end congestion control: no explicit feedback from network • congestion inferred from end-system observed loss, delay • approach taken by TCP CS118/Spring05
TCP Congestion Control
• Add a “congestion control window” CongWin on top of the flow-control window
• Sender limits: LastByteSent - LastByteAcked ≤ CongWin
How to adjust CongWin
• CongWin initialized to 1 mss, increased quickly until loss (= congestion)
• Upon loss: decrease CongWin, then begin probing (increasing) again
• two “phases”: (1) slow start, (2) congestion avoidance
• threshold defines the boundary between the two
How the sender infers congestion: timeout, or 3 duplicate ACKs
[Figure: CongWin and RecvWin]
Basic idea: learn from observations • when congwin < threshold, increase congwin exponentially • when congwin ≥ threshold, increase congwin linearly • if packet lost, have gone too far • threshold = congwin / 2 • If 3 dup. ACKs: network capable of delivering some packets, congwin cut in half • If timeout: slow-start again (congwin = 1 mss) • Additive Increase, Multiplicative Decrease (AIMD) CS118/Spring05
TCP SlowStart & Congestion Avoidance
initialize: Congwin = 1 mss, threshold = RcvWindow
/* slow start */
while (Congwin < threshold and no loss event) {
    for every segment ACKed: Congwin++
}
/* slowstart is over: congestion avoidance */
until (loss event) {
    for every w segments ACKed: Congwin++      /* w = current Congwin */
}
/* loss detected */
threshold = Congwin / 2
if (3 dup. ACKs) Congwin = threshold           /* continue in congestion avoidance */
else Congwin = 1 mss                           /* timeout: slow start again */
[Figure: sender/receiver timeline; one segment in the first RTT, then two segments, then four segments]
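The pseudocode above can be traced round by round. The following is a minimal sketch, not TCP itself: it assumes one update per RTT (doubling in slow start, +1 in congestion avoidance), caps slow-start growth at the threshold, and takes the loss rounds as inputs. The function name `simulate_congwin` and its parameters are hypothetical.

```python
def simulate_congwin(rounds, threshold, dup_ack_rounds=(), timeout_rounds=()):
    """Return the congestion window (in MSS) used in each round (one round = one RTT)."""
    congwin = 1
    history = []
    for r in range(rounds):
        history.append(congwin)
        if r in timeout_rounds:
            # Timeout: remember half the window, restart slow start at 1 MSS.
            threshold = max(congwin / 2, 1)
            congwin = 1
        elif r in dup_ack_rounds:
            # 3 duplicate ACKs: halve the window and stay in congestion avoidance.
            threshold = max(congwin / 2, 1)
            congwin = threshold
        elif congwin < threshold:
            # Slow start: +1 per ACKed segment, i.e. doubling per RTT
            # (capped at the threshold, a simplifying assumption).
            congwin = min(congwin * 2, threshold)
        else:
            # Congestion avoidance: +1 MSS per RTT.
            congwin += 1
    return history


if __name__ == "__main__":
    print(simulate_congwin(8, threshold=8, dup_ack_rounds={5}))
    print(simulate_congwin(8, threshold=8, timeout_rounds={5}))
```

With a threshold of 8 MSS the window doubles for three rounds and then grows by one per round; a triple-duplicate-ACK loss halves it, while a timeout sends it back to 1 MSS.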
TCP sender congestion control CS118/Spring05
Is TCP fair?
Fairness: if N TCP sessions share the same bottleneck link, each should get 1/N of the link capacity
Example: 2 competing connections, same RTT
• additive increase gives slope of 1
• multiplicative decrease decreases throughput proportionally
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]
[Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
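The fairness argument can be checked numerically. This is a rough sketch under simplifying assumptions that are not in the slides: both connections see every loss at the same time, they add one unit per round, and a loss occurs whenever their combined rate exceeds the capacity. The name `aimd_two_flows` is hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck; return their final rates."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: multiplicative decrease halves both rates (and their gap).
            x1, x2 = x1 / 2, x2 / 2
        else:
            # No loss: additive increase keeps the gap unchanged (slope 1).
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2


if __name__ == "__main__":
    a, b = aimd_two_flows(1, 40, capacity=50, rounds=200)
    print(f"flow 1: {a:.2f}, flow 2: {b:.2f}, gap: {abs(a - b):.4f}")
```

Additive increase leaves the difference between the two rates untouched and every loss halves it, so the pair drifts toward the equal-share line regardless of the starting point.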
Fairness (more)
Fairness and UDP
• Multimedia apps often do not use TCP: they do not want their rate throttled by congestion control
• Instead they use UDP: pump audio/video at a constant rate, tolerate packet loss
• Research area: TCP friendly
Fairness and parallel TCP connections
• Nothing prevents an app from opening parallel connections between 2 hosts; Web browsers do this
• Example: link of rate R supporting 9 connections
  • new app asks for 1 TCP connection, gets rate R/10
  • new app asks for 11 TCP connections, gets 11R/20, more than R/2!
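The arithmetic behind the parallel-connection example is short enough to verify directly. The sketch below assumes every TCP connection on the link ends up with an equal share, which is the idealized fairness model from the previous slide; `share_of_link` is a hypothetical helper.

```python
from fractions import Fraction


def share_of_link(existing_connections, new_connections):
    """Fraction of the link rate R obtained by an app opening `new_connections`."""
    total = existing_connections + new_connections
    return Fraction(new_connections, total)


if __name__ == "__main__":
    print(share_of_link(9, 1))    # one connection among ten
    print(share_of_link(9, 11))   # eleven connections among twenty
```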
Delay modeling Q:How long does it take to receive an object from a Web server after sending a request? Ignoring congestion, delay is influenced by: TCP connection establishment data transmission delay slow start Assumptions: Assume one link between client and server of rate R no retransmissions (no loss, no corruption) Window size: First assume: fixed congestion window, W segments Then dynamic window, modeling slow start CS118/Spring05
Fixed congestion window (1)
Notation:
• S: # bits in one segment
• O: # bits in one object
• R: bandwidth
• W: window size (# segments)
• K: O/WS
• Q: # times server idles if O = ∞
• P = min(Q, K-1)
First case: WS/R > S/R + RTT
• ACK for first segment in window returns before a window's worth of data is sent
• delay = 2RTT + O/R
Fixed congestion window (2)
Second case: WS/R < RTT + S/R
• server waits for an ACK after sending a window's worth of data
• server's waiting time per window: S/R + RTT - WS/R
• delay = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
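Both fixed-window cases fit in one small function. This is a sketch of the model on these two slides, with one added assumption: K is rounded up to a whole number of windows when O is not a multiple of WS. The name `fixed_window_delay` is hypothetical.

```python
import math


def fixed_window_delay(O, S, R, W, rtt):
    """Delay to fetch an object of O bits with a fixed window of W segments.

    O, S in bits, R in bits per second, rtt in seconds.
    """
    K = math.ceil(O / (W * S))          # number of windows covering the object
    stall = S / R + rtt - W * S / R     # idle time after each window, if positive
    if stall <= 0:
        # First case: ACKs return before the window is exhausted.
        return 2 * rtt + O / R
    # Second case: the server stalls after each of the first K-1 windows.
    return 2 * rtt + O / R + (K - 1) * stall


if __name__ == "__main__":
    # Illustrative numbers: S/R = 1 s, RTT = 2 s, object of 8 segments.
    print(fixed_window_delay(O=8000, S=1000, R=1000, W=4, rtt=2))
    print(fixed_window_delay(O=8000, S=1000, R=1000, W=2, rtt=2))
```

With W = 4 the window outlasts the round trip and the delay is just 2RTT + O/R; with W = 2 the server idles for one second after each of the first three windows.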
TCP Delay Modeling: Slow Start (1) • Delay components: • 2 RTTs for connection establish and request • O/R to transmit object • Server's idle time • Server idles: P =min{K-1,Q} times • Example: • O/S = 15 segments • K = 4 windows • Q = 2 • P = min{K-1,Q} = 2 • Server idles P=2 times CS118/Spring05
TCP Delay Modeling: Slow Start (2) Now suppose window grows according to slow start The delay for one object is: CS118/Spring05
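The formula itself did not survive in the slide text. The closed form usually given for this model, stated here as an assumption, is delay = 2RTT + O/R + P[RTT + S/R] - (2^P - 1)S/R, with K = ceil(log2(O/S + 1)) and Q = floor(log2(1 + RTT/(S/R))) + 1. The sketch below evaluates it; `slow_start_delay` is a hypothetical name, and the inputs are chosen to reproduce the O/S = 15 example from the previous slide.

```python
import math


def slow_start_delay(O, S, R, rtt):
    """Return (delay, K, Q, P) for the slow-start delay model.

    Assumes windows of 1, 2, 4, ... segments, no loss, and a single link of rate R.
    """
    K = math.ceil(math.log2(O / S + 1))                 # windows covering the object
    Q = math.floor(math.log2(1 + rtt / (S / R))) + 1    # idle periods if O were infinite
    P = min(Q, K - 1)                                   # idle periods that actually occur
    delay = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return delay, K, Q, P


if __name__ == "__main__":
    # 15 segments, S/R = 1 s, RTT = 2 s: windows of 1, 2, 4, 8 segments.
    print(slow_start_delay(O=15000, S=1000, R=1000, rtt=2))
```

For these inputs the server idles 2 s after the first window and 1 s after the second, which matches the 3 s of stall time that the closed form adds to 2RTT + O/R.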
HTTP Modeling • Assume Web page consists of: • 1 base HTML page (of size O bits) • M images (each of size O bits) • Non-persistent HTTP: • M+1 TCP connections in series • Response time = (M+1)O/R + (M+1)2RTT + sum of idle times • Persistent HTTP: • 2 RTT to request and receive base HTML file • 1 RTT to request and receive M images • Response time = (M+1)O/R + 3RTT + sum of idle times • Non-persistent HTTP with X parallel connections • Suppose M/X integer. • 1 TCP connection for base file • M/X sets of parallel connections for images. • Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times CS118/Spring05
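The three response-time expressions can be compared side by side. The sketch below deliberately drops the "sum of idle times" term, which depends on the slow-start analysis, so it gives lower bounds only. The function name and the 1 Mbps link rate in the example are assumptions; O = 5 Kbytes is taken as 40,000 bits.

```python
def http_response_times(O, R, rtt, M, X):
    """Response times (seconds) for the three HTTP models, ignoring idle times."""
    if M % X != 0:
        raise ValueError("the model assumes M/X is an integer")
    transmit = (M + 1) * O / R
    return {
        "non_persistent": transmit + (M + 1) * 2 * rtt,
        "persistent": transmit + 3 * rtt,
        "parallel": transmit + (M // X + 1) * 2 * rtt,
    }


if __name__ == "__main__":
    times = http_response_times(O=40_000, R=1_000_000, rtt=0.1, M=10, X=5)
    for name, seconds in times.items():
        print(f"{name}: {seconds:.2f} s")
```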
HTTP Response time (in seconds) RTT = 100 msec, O = 5 Kbytes, M=10 and X=5 For low bandwidth, connection & response time dominated by transmission time. Persistent connections only give minor improvement over parallel connections. CS118/Spring05
HTTP Response time (in seconds) RTT = 1 sec, O = 5 Kbytes, M = 10 and X = 5 For larger RTT, response time is dominated by TCP establishment and slow start delays. Persistent connections now give an important improvement, particularly in networks with a high delay-bandwidth product.
Network layer
• transport segment from sending to receiving host
• Source host: encapsulates segments into packets
• Destination host: delivers segments to transport layer
• network layer protocols in every host and router
• Each router examines header fields in all packets passing through it
• Routing: calculate the best path to each destination
• Forwarding: move packets from input to output
[Figure: source S sends a segment, wrapped in a network protocol header, through routers R to destination D, which hands the segment to the transport protocol]
Makeup lectures on Monday June 6 There will be no class on Thursday June 9 To make it up: • 8-9:50am Boelter 5422, or • 6-7:50pm Boelter 5419 Pick the lesser evil one • Additional office hours on the final exam day: Saturday June 11: 10:00AM - 1:00PM And the Final exam is: 3:00 - 6:00PM CS118/Spring05
Always keep the big picture in mind
[Figure: two hosts connected through two routers. An HTTP message travels between the hosts' HTTP layers and a TCP segment between their TCP layers; an IP packet is carried on each hop between IP layers. The nodes attach through Ethernet interfaces and SONET interfaces.]
Network layer: Connection vs. connection-less service • Virtual Circuit network provides connection-oriented service • source-to-dest path works in a way much like telephone circuit • Datagram network provides connectionless service • The two services analogous to TCP vs. UDP at transport-layer, but: • Network delivery service: host-to-host • No choice: a given network provides one or the other but not both (as in transport layer) CS118/Spring05
Virtual circuit Network
• Use a signaling protocol to set up a connection before data can flow
  1. Initiate call
  2. Incoming call
  3. Accept call
  4. Call connected
  5. Data flow begins
  6. Receive data
• every router on the source-dest path maintains “state” for each passing connection
• link, router resources (bandwidth, buffers) allocated to each VC
• each packet carries a VC identifier (not the destination host address)
• VC number must be changed on each link
• New VC number comes from the forwarding table
[Figure: two end systems, each with an application / transport / network / data link / physical stack]
Forwarding table
Forwarding table in northwest router:

Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              2                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

Routers maintain connection state information!
[Figure: router with interface numbers 1, 2, 3; VC numbers 12, 22, 32 on the links along the path]
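A VC forwarding table is a lookup keyed on the incoming interface and VC number. The sketch below loads the four rows listed for the northwest router into a dictionary; `VC_TABLE` and `forward_vc` are hypothetical names, and returning `None` for an unknown circuit is an assumption about how a missing entry is handled.

```python
# (incoming interface, incoming VC #) -> (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (2, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}


def forward_vc(in_interface, in_vc):
    """Return (outgoing interface, new VC number), or None if no circuit is set up."""
    return VC_TABLE.get((in_interface, in_vc))


if __name__ == "__main__":
    # A packet arriving on interface 1 with VC 12 leaves on interface 2 relabelled 22.
    print(forward_vc(1, 12))
```

The rewrite of the VC number at each hop is why the packet only needs a short label rather than the destination host address.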
Internet: A Datagram Network
• hosts are connected to subnets
• subnets are interconnected by IP routers
• All hosts and routers speak IP
  • routers also “speak” many different link layer protocols
• IP provides two basic functions
  • globally unique address for all connected points
  • best effort datagram delivery from source to destination hosts
    • fragmentation/reassembly of packets whenever needed
[Figure: host H1 reaches host H8 through routers R1, R2, R3; IP runs on every node on top of ETH, FDDI and WLAN links]
The Internet Network layer
Host, router network layer functions:
• ICMP protocol
  • error reporting
  • router “signaling”
• IP protocol
  • addressing conventions
  • datagram format
  • packet handling conventions
• Routing protocols (router function)
  • RIP, OSPF, BGP …
  • feed the forwarding table
[Figure: layer stack, top to bottom: Transport layer (TCP, UDP), Network layer, Link layer, physical layer]
IP datagram format
Basic header, in 32-bit rows:
• ver | head. len | type of service | Total length
• 16-bit identifier | flgs | fragment offset
• time to live | protocol | IP header checksum
• source IP address
• destination IP address
• Options (if any)
• data (variable length, typically a TCP or UDP segment)
Field notes:
• ver: IP version number
• head. len: header length
• 16-bit identifier, flgs, fragment offset: 3 fields used for packet fragmentation/reassembly
• time to live: max number of remaining hops
• protocol: upper layer protocol to deliver payload to
• Options: e.g. timestamp, route recording, specify list of routers to visit
How much overhead for a TCP segment?
• 20 bytes of TCP
• 20 bytes of IP
• = 40 bytes
IP Address structure
• 4 bytes (32 bits), uniquely identifies a host or router interface
• interface: connection between host/router and physical link
• 173.1.1.1 = 10101101 00000001 00000001 00000001
• IP address space: 2-level hierarchy, Network-ID followed by host-ID
What's a network? (from IP address perspective)
• device interfaces with the same network part of the IP address
• can physically reach each other without going through a router
[Figure: LAN topology with interface addresses 173.1.1.1, 173.1.1.2, 173.1.1.3, 173.1.1.4; 173.1.2.1, 173.1.2.2, 173.1.2.9; 173.1.3.1, 173.1.3.2, 173.1.3.27]
IP Address: how many bits for net-ID
• Original IP design: class-based address

Class   Leading bits   Layout              Address range
A       0              network | host      1.0.0.0 to 127.255.255.255
B       10             network | host      128.0.0.0 to 191.255.255.255
C       110            network | host      192.0.0.0 to 223.255.255.255
D       1110           multicast address   224.0.0.0 to 239.255.255.255

• Two changes added over the last 25 years
  • Subnetting: add a hidden level to the address hierarchy
    • An organization gets one address block, then splits the host part into two parts: subnet and host parts
  • CIDR: Classless InterDomain Routing (today)
    • network portion of address of arbitrary length
Classless InterDomain Routing
• address format: a.b.c.d/x, where x is the # bits in the network portion
• example: 200.23.16.0/23 = 11001000 00010111 00010000 00000000; the first 23 bits are the network part, the remaining 9 bits the host part
• Internet Service Providers get blocks of IP addresses from the Internet address authority
• Internet customers get a portion of their ISP's addr. block

Block            Binary                                  Prefix
ISP's block      11001000 00010111 00010000 00000000     200.23.16.0/20
Organization 0   11001000 00010111 00010000 00000000     200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000     200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000     200.23.20.0/23
...              …                                       …
Organization 7   11001000 00010111 00011110 00000000     200.23.30.0/23
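The split of the ISP's /20 block into eight /23 blocks can be reproduced with Python's standard `ipaddress` module. This is only an illustration of the arithmetic; the variable names are hypothetical.

```python
import ipaddress

# The ISP's block from the slide.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Extending the prefix from 20 to 23 bits yields 2**3 = 8 customer blocks.
org_blocks = list(isp_block.subnets(new_prefix=23))

if __name__ == "__main__":
    for number, block in enumerate(org_blocks):
        print(f"Organization {number}: {block} ({block.num_addresses} addresses)")
```

Each /23 leaves 9 host bits, so every organization receives 512 addresses.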
Hierarchical addressing: route aggregation
Hierarchical addressing allows efficient advertisement of routing information:
• Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) connect to Fly-By-Night-ISP
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16”
Hierarchical addressing: route aggregation
Multi-homing
• Route aggregation helps reduce routing table size
• multi-homing defeats address aggregation
• ISPs-R-Us has a more specific route to Org. 7
[Figure: Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23), with Fly-By-Night-ISP and ISPs-R-Us connecting to the Internet]
• Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16, or 200.23.30.0/23”
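When both advertisements cover an address, a router prefers the longest matching prefix, which is why the more specific /23 route wins for Organization 7. The sketch below uses the three prefixes from this slide; `ROUTES` and `next_hop` are hypothetical names, and a real router would use a trie rather than a linear scan.

```python
import ipaddress

# Advertisements as seen from the Internet side.
ROUTES = [
    (ipaddress.ip_network("200.23.16.0/20"), "Fly-By-Night-ISP"),
    (ipaddress.ip_network("199.31.0.0/16"), "ISPs-R-Us"),
    (ipaddress.ip_network("200.23.30.0/23"), "ISPs-R-Us"),
]


def next_hop(destination):
    """Pick the route with the longest prefix that contains `destination`."""
    address = ipaddress.ip_address(destination)
    matches = [(network.prefixlen, hop) for network, hop in ROUTES if address in network]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0])[1]


if __name__ == "__main__":
    print(next_hop("200.23.18.1"))   # only the /20 matches
    print(next_hop("200.23.30.5"))   # both /20 and /23 match; the /23 wins
```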
IP Subnet
• subnet mask: indicates the portion of the address that is considered as “network ID” by the local site
• subnet mask does not need to align with a byte boundary
  • example: 11111111111111111111110000000000 (Network ID followed by a 10-bit host ID)
• Each host must be configured with both an IP address and a subnet mask
• subnets are invisible outside of the local site
  • backbone routers only know how to forward packets to the network ID
• Within the organization, routers store: [subnet, mask, next hop]
• Subnet advantages: aggregate local info., keep backbone routers' table size small
[Figure: the same address split into Network ID and Host ID, viewed from outside and viewed from inside]
An example
Look up IP addr. 131.179.96.15
• Router A (Global Internet) table:
  Network#     next-hop
  131.179      B            (a class-B address)
• Router B (UCLA) table:
  Network#     mask             next-hop
  131.179.96   255.255.255.0    C
  ……           ………..
• Router C (CS) reaches 131.179.96.15 on subnet 131.179.96.0
Subnetted address 131.179.96.15 with subnet mask 255.255.255.0 (111111111111111111111111 00000000):
• as a class-B address: Network ID 131.179, host ID 96.15
• with the subnet mask: subnet 131.179.96, host 15
Getting an IP packet from source to dest.
Source host A to destination B:
• Host A: [A's addr & subnet mask] = [B's addr & subnet mask]?
• Yes: B is on the same net, use link layer to send pkt directly to B
Source host A to destination E:
• [A's addr & subnet mask] = [E's addr & subnet mask]?
• No: send pkt to default router 173.1.1.4
Router: is E on any of my directly connected subnets?
• Yes: send pkt directly to E
• No: forward to another router according to routing table
[Figure: hosts A, B, E on LANs with interface addresses 173.1.1.1, 173.1.1.2, 173.1.1.3, 173.1.1.4; 173.1.2.1, 173.1.2.2, 173.1.2.9; 173.1.3.1, 173.1.3.2, 173.1.3.27]
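The host's decision comes down to one masked comparison. The sketch below assumes a 255.255.255.0 mask and picks addresses from the figure for illustration, since the slide text does not say which address belongs to which host; `same_subnet` and `first_hop` are hypothetical names.

```python
import ipaddress


def same_subnet(addr_a, addr_b, mask):
    """True if both addresses have the same network part under `mask`."""
    a = int(ipaddress.ip_address(addr_a))
    b = int(ipaddress.ip_address(addr_b))
    m = int(ipaddress.ip_address(mask))
    return (a & m) == (b & m)


def first_hop(source, destination, mask, default_router):
    """Deliver directly on the local net, otherwise hand off to the default router."""
    if same_subnet(source, destination, mask):
        return destination
    return default_router


if __name__ == "__main__":
    mask = "255.255.255.0"   # assumed; the slide gives no mask for this network
    print(first_hop("173.1.1.1", "173.1.1.2", mask, "173.1.1.4"))
    print(first_hop("173.1.1.1", "173.1.2.2", mask, "173.1.1.4"))
```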
IP Fragmentation & Reassembly
• Different subnets have different MTUs (Maximum Transmission Unit)
• Sender host always uses its max MTU size
• Routers “fragment” IP packets if the next link has a smaller MTU
  • chop packets to the MTU size of the next link
  • further fragmentation down the path possible
  • packet reassembled at dest. host
Example: H1 sends an IP packet with 1300 bytes of data to H2
[Figure: path from H1 through R1, R2, R3 to H2, with one link of MTU=1500B and a later link of MTU=532B; 1300B of data leaves H1, fragments carrying 512B and 276B reach H2, where reassembly takes place]
IP Fragmentation: An example
Each header starts with ver = 4, head. len = 5, TOS; the rest of the IP header is unchanged.

Packet       Total length   Identifier   Flags   Fragment offset   Data
original     1320           7394         0 0 0   0                 1300 bytes
fragment 1   532            7394         0 0 1   0                 512 bytes
fragment 2   532            7394         0 0 1   64                512 bytes
fragment 3   296            7394         0 0 0   128               276 bytes

• At destination:
  • identifier: tells which pieces belong to the same packet
  • the last fragment: MF = 0
  • the offsets tell whether there are holes missing in the middle
[Figure: H1 sends 1300B; the MTU=532B link splits it into 512B, 512B and 276B pieces; reassembly at H2]
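The numbers in this example follow from two rules: each fragment carries its own 20-byte header, and offsets are counted in 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes. The sketch below recomputes them; `fragment` is a hypothetical helper and it assumes a header without options.

```python
def fragment(data_len, mtu, header_len=20):
    """Split `data_len` bytes of IP payload for a link with the given MTU.

    Returns one dict per fragment with total length, MF flag and offset (8-byte units).
    """
    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_payload = (mtu - header_len) // 8 * 8
    fragments = []
    sent = 0
    while sent < data_len:
        payload = min(max_payload, data_len - sent)
        more_fragments = sent + payload < data_len
        fragments.append({
            "total_length": header_len + payload,
            "mf": 1 if more_fragments else 0,
            "offset": sent // 8,
        })
        sent += payload
    return fragments


if __name__ == "__main__":
    for piece in fragment(1300, mtu=532):
        print(piece)
```

Offsets 64 and 128 correspond to byte positions 512 and 1024 in the original data.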
ICMP: Internet Control Message Protocol
• used by hosts & routers to communicate network-level information
  • error reporting: unreachable host, network, port, protocol
  • echo request/reply
• ICMP msgs carried in IP packets
ICMP message format (follows the IP header):
• type | code | checksum
• unused (or used by certain ICMP types)
• IP header and first 64 bits of data, or data (according to ICMP types)

Type   Code   Description
0      0      echo reply (ping)
3      0      dest. network unreachable
3      1      dest. host unreachable
3      2      dest. protocol unreachable
3      3      dest. port unreachable
3      6      dest. network unknown
3      7      dest. host unknown
4      0      source quench (congestion control - not used)
8      0      echo request (ping)
9      0      route advertisement
10     0      router discovery
11     0      TTL expired
12     0      bad IP header
NAT: Network Address Translation
• Datagrams with source or destination in this network have a 10.0.0.0/24 address for source, destination (as usual)
• All datagrams leaving the local network have the same single source NAT IP address: 138.76.29.7, different source port numbers
[Figure: local network (e.g., home network) 10.0.0.0/24 with addresses 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4; address 138.76.29.7 faces the rest of the Internet]
NAT: Network Address Translation
1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
   S: 10.0.0.1, 3345   D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
   S: 138.76.29.7, 5001   D: 128.119.40.186, 80
3: Reply arrives with dest. address 138.76.29.7, 5001
   S: 128.119.40.186, 80   D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
   S: 128.119.40.186, 80   D: 10.0.0.1, 3345

NAT translation table
WAN side addr         LAN side addr
138.76.29.7, 5001     10.0.0.1, 3345
……                    ……
NAT implementation • NAT router must do the following: • outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #) • . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr. • remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair • incoming datagrams: replace (NAT IP address, new port #) in destination fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table • Problems due to NAT • Increased network complexity, reduced robustness • Cannot run services from inside a NAT box • address shortage should instead be solved by IPv6 CS118/Spring05
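The three duties of the NAT router map onto two dictionaries and two functions. This is a toy model only: the class name `NatRouter`, the sequential port allocation starting at 5001, and dropping unmatched inbound datagrams by returning `None` are assumptions made for illustration.

```python
class NatRouter:
    """Toy NAT: rewrites (IP address, port) pairs and remembers the mapping."""

    def __init__(self, nat_ip, first_port=5001):
        self.nat_ip = nat_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (source IP, port) -> (NAT IP, new port)
        self.wan_to_lan = {}   # (NAT IP, new port) -> (source IP, port)

    def outgoing(self, source, destination):
        """Replace the source pair of an outgoing datagram, recording the translation."""
        if source not in self.lan_to_wan:
            wan_side = (self.nat_ip, self.next_port)
            self.next_port += 1
            self.lan_to_wan[source] = wan_side
            self.wan_to_lan[wan_side] = source
        return self.lan_to_wan[source], destination

    def incoming(self, source, destination):
        """Replace the destination pair of an incoming datagram using the table."""
        lan_side = self.wan_to_lan.get(destination)
        if lan_side is None:
            return None   # no table entry: nothing inside asked for this traffic
        return source, lan_side


if __name__ == "__main__":
    nat = NatRouter("138.76.29.7")
    print(nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80)))
    print(nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001)))
```

The `None` branch also illustrates the "cannot run services from inside a NAT box" problem: an unsolicited inbound datagram has no table entry to translate against.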
IPv6
• Motivation: 32-bit address space exhaustion
• Take the opportunity for some clean-up
• IPv6 datagram format:
  • Address length changed from 32 bits to 128 bits
  • fragmentation fields moved out of base header
  • IP options moved out of base header
  • Header Length field eliminated
  • Header Checksum eliminated
  • Type of Service field eliminated
  • Time to Live → Hop Limit, Protocol → Next Header
  • Precedence → Priority, added Flow Label field
  • Length field excludes IPv6 header
IPv6 header format
• Version | Priority | Flow Label
• Payload Length | Next Header | Hop Limit
• Source Address (16 bytes, 128 bits)
• Destination Address (16 bytes)
[Figure: IPv4 header shown for comparison]
Changes from IPv4
• Priority: identify priority among datagrams in flow
• Flow Label: identify datagrams in same “flow” (concept of “flow” not well defined)
• Next header: identify upper layer protocol for data
• Options: allowed, but outside of the basic header, indicated by “Next Header” field
• Checksum: removed entirely to reduce processing time at each hop
• ICMPv6: new version of ICMP
  • additional message types, e.g. “Packet Too Big”
  • multicast group management functions
Transition From IPv4 To IPv6
• Not all routers can be upgraded simultaneously
• To allow the Internet to operate with mixed IPv4 and IPv6 routers: tunneling
Logical view: A (IPv6), B (IPv6), tunnel, E (IPv6), F (IPv6)
Physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)
• A-to-B: IPv6 (Flow: X, Src: A, Dest: F, data)
• B-to-C: IPv6 inside IPv4 (outer header Src: B, Dest: E; inner datagram Flow: X, Src: A, Dest: F, data)
• D-to-E: IPv6 inside IPv4 (outer header Src: B, Dest: E; inner datagram Flow: X, Src: A, Dest: F, data)
• E-to-F: IPv6 (Flow: X, Src: A, Dest: F, data)
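Interplay between routing and forwarding
• the routing algorithm fills in the local forwarding table
• local forwarding table:
  header value   output link
  0100           3
  0101           2
  0111           2
  1001           1
• value in arriving packet's header: 0111, so the packet leaves on output link 2
[Figure: router with output links 1, 2, 3]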
Router Architecture Overview Two key router functions: • run routing algorithms/protocol (RIP, OSPF, BGP) • forwarding datagrams from incoming to outgoing link CS118/Spring05
Input Port Functions Decentralized switching: • given datagram dest., lookup output port using forwarding table in input port memory • goal: complete input port processing at ‘line speed’ • queuing: if datagrams arrive faster than forwarding rate into switch fabric Physical layer: bit-level reception Data link layer: e.g., Ethernet see chapter 5 CS118/Spring05