Networking Acronym Smorgasbord: 802.11, DVMRP, CBT, WFQ EE122 Fall 2011 Scott Shenker http://inst.eecs.berkeley.edu/~ee122/ Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson, and other colleagues at Princeton and UC Berkeley
Announcements • Congratulations: You all got 100% on HW4 • Worksheet will provide practice • This is the last week of sections • See posting about additional office hours next week • Next week will have office hours during class times • Will work through problems on the worksheet • Be there or be square… • Wednesday’s Review: will figure something out…
Today’s Lecture: Dim Sum of Design • Wireless review • Multicast • Packet Scheduling • Peer-to-peer
History • MACA proposal: basis for RTS/CTS in lecture • Contention is at receiver, but CS detects sender! • Replace carrier sense with RTS/CTS • MACAW paper: extended and altered approach • Implications of data ACKing • Introducing DS in exchange: RTS-CTS-DS-Data-ACK • Shut up when hear DS or CTS • Other clever but unused extensions for fairness, etc. • 802.11: uses carrier sense and RTS/CTS • RTS/CTS often turned off, just use carrier sense • When RTS/CTS turned on, shut up when hear either • RTS/CTS augments carrier sense
What Will Be on the Final? • General awareness of wireless (lecture) • Reasoning about a given protocol • If we used the following algorithm, what would happen? • You are not expected to know which algorithm to use; we will tell you explicitly.
Motivating Example: Internet Radio • Internet concert • More than 100,000 simultaneous online listeners • Could we do this with parallel unicast streams? • Bandwidth usage • If each stream were 1 Mbps, the concert would require > 100 Gbps • Coordination • Hard to keep track of each listener as they come and go • Multicast addresses both problems…
Unicast approach does not scale… [Figure: Broadcast Center sending a separate unicast stream to each listener across the Backbone ISP]
Instead build data replication trees • Copy data at routers • At most one copy of a data packet per link • LANs implement link layer multicast by broadcasting • Routers keep track of groups in real-time • Routers compute trees and forward packets along them [Figure: Broadcast Center sending one stream that is replicated at routers across the Backbone ISP]
Multicast Service Model • Receivers join multicast group identified by a multicast address G • Sender(s) send data to address G • Network routes data to each of the receivers • Note: multicast is both a delivery and a rendezvous mechanism • Senders don’t know list of receivers • For many purposes, the latter is more important than the former [Figure: sender S sends [G, data] into the network; receivers R0, R1, …, Rn each join G and each receive [G, data]]
Multicast and Layering • Multicast can be implemented at different layers • link layer • e.g. Ethernet multicast • network layer • e.g. IP multicast • application layer • e.g. End system multicast • Each layer has advantages and disadvantages • Link: easy to implement, limited scope • IP: global scope, efficient, but hard to deploy • Application: less efficient, easier to deploy [not covered]
Multicast Implementation Issues • How is join implemented? • How is send implemented? • How much state is kept and who keeps it?
Link Layer Multicast • Join group at multicast address G • NIC normally only listens for packets sent to unicast address A and broadcast address B • After being instructed to join group G, NIC also listens for packets sent to multicast address G • Send to group G • Packet is flooded on all LAN segments, like broadcast • Scalability: • State: Only host NICs keep state about who has joined • Bandwidth: Requires broadcast on all LAN segments • Limitation: just over single LAN
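The join behavior on this slide can be sketched as a NIC address filter. This is a minimal illustration, not how any real driver is written: the `Nic` class and its method names are hypothetical, and the example group address is simply a MAC address in the IPv4-multicast range.

```python
class Nic:
    """Hypothetical model of a NIC's destination-address filter."""

    def __init__(self, unicast_addr, broadcast_addr="ff:ff:ff:ff:ff:ff"):
        # By default the NIC accepts only its own address A and broadcast B.
        self.accepted = {unicast_addr, broadcast_addr}

    def join(self, group_addr):
        # After joining group G, the NIC also listens for frames sent to G.
        self.accepted.add(group_addr)

    def accepts(self, dst_addr):
        # Every frame is flooded on the LAN; the NIC decides whether to keep it.
        return dst_addr in self.accepted


nic = Nic("02:00:00:00:00:0a")
group = "01:00:5e:00:00:01"  # example multicast MAC address
before = nic.accepts(group)
nic.join(group)
after = nic.accepts(group)
```

Note that all join state lives in the hosts, which matches the slide's scalability point: the LAN itself keeps no record of who has joined.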
Network Layer (IP) Multicast • Performs inter-network multicast routing • Relies on link layer multicast for intra-network routing • Portion of IP address space reserved for multicast • 2^28 addresses for entire Internet • Open group membership • Anyone can join (sends IGMP message) • Internet Group Management Protocol • Privacy preserved at application layer (encryption) • Anyone can send to group • Even nonmembers
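As a quick check on the size of the reserved address space, the sketch below uses Python's standard `ipaddress` module. The block 224.0.0.0/4 is the IPv4 multicast range; the helper name `is_multicast_group` is made up for this example.

```python
import ipaddress

# 224.0.0.0/4 is the IPv4 multicast block (the old "class D" range).
MULTICAST_BLOCK = ipaddress.ip_network("224.0.0.0/4")


def is_multicast_group(addr: str) -> bool:
    """Return True if addr falls inside the IPv4 multicast block."""
    return ipaddress.ip_address(addr) in MULTICAST_BLOCK


# A /4 prefix leaves 28 free bits, so the block holds 2**28 group addresses.
group_count = MULTICAST_BLOCK.num_addresses
```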
How Would YOU Design this? • 5 Minutes….
IP Multicast Routing • Intra-domain (know the basics here) • Source Specific Tree: Distance Vector Multicast Routing Protocol (DVMRP) • Shared Tree: Core Based Tree (CBT) • Inter-domain [not covered] • Protocol Independent Multicast • Single Source Multicast
Distance Vector Multicast Routing Protocol • Elegant extension to DV routing • Using reverse paths! • Use shortest path DV routes to determine if link is on the source-rooted spanning tree • See whiteboard… • Three steps in developing DVMRP • Reverse Path Flooding • Reverse Path Broadcasting • Truncated Reverse Path Broadcasting (pruning)
Reverse Path Flooding (RPF) • If incoming link is shortest path to source • Send on all links except incoming • Otherwise, drop • Issues: (fixed with RPB) • Some links (LANs) may receive multiple copies • Every link receives each multicast packet [Figure: source s and node r; routers labeled with their distance to s (s:1, s:2, s:3)]
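The RPF rule is a single per-packet check against the unicast distance-vector table. The sketch below is only an illustration: the function name and its arguments are hypothetical, and links are represented as plain strings.

```python
def rpf_forward(incoming_link, links, next_hop_to_source):
    """Reverse Path Flooding decision at one router.

    incoming_link: link the multicast packet arrived on
    links: all links attached to this router
    next_hop_to_source: link the unicast DV table uses to reach the source
    Returns the links to copy the packet onto (an empty list means drop).
    """
    if incoming_link != next_hop_to_source:
        return []  # did not arrive along the reverse shortest path: drop
    # Arrived on the shortest path to the source: flood on every other link.
    return [link for link in links if link != incoming_link]


forwarded = rpf_forward("a", ["a", "b", "c"], next_hop_to_source="a")
dropped = rpf_forward("b", ["a", "b", "c"], next_hop_to_source="a")
```

Because the check reuses the existing unicast routes, the router needs no extra multicast routing state for this step.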
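Other Problems • Flooding can cause a given packet to be sent multiple times over the same link • Solution: Reverse Path Broadcasting [Figure: source S reaches routers x and y; both forward onto the same link z, so a duplicate packet appears there; other labeled links: a, b]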
Reverse Path Broadcasting (RPB) • Choose single parent for each link along reverse shortest path to source • Only parent forwards to child link • Identifying parent links • Distance • Lower address as tie-breaker [Figure: routers x (distance 5 to S) and y (distance 6 to S) both attach to link z; x is the parent of z on the reverse path, z is a child link of x for S, and x forwards only to its child link; other labeled links: a, b]
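Parent selection for a link can be sketched as a minimum over (distance, address) pairs. This is an assumption-laden illustration: the function names are hypothetical, router addresses are plain strings compared lexicographically, and each router is assumed to know its neighbors' distances to the source from the DV exchange.

```python
def choose_parent(candidates):
    """Pick the parent router for one link and one source.

    candidates: dict of router address -> that router's distance to the source
    The smallest distance wins; the lower address breaks ties.
    """
    return min(candidates, key=lambda addr: (candidates[addr], addr))


def rpb_forward(my_addr, incoming_link, next_hop_to_source, routers_on_link):
    """Return the links this router forwards onto under RPB.

    routers_on_link: dict of link -> {router address: distance to source}
    """
    if incoming_link != next_hop_to_source:
        return []  # same reverse-path check as RPF
    return [link for link, routers in routers_on_link.items()
            if link != incoming_link and choose_parent(routers) == my_addr]


# Two routers on link z, with distances 5 and 6 to the source.
link_z = {"x": 5, "y": 6}
parent = choose_parent(link_z)
x_out = rpb_forward("x", "a", "a", {"a": {"x": 5}, "z": link_z})
y_out = rpb_forward("y", "b", "b", {"b": {"y": 6}, "z": link_z})
```

Only x forwards onto z, so the duplicate that flooding would have produced on that link disappears.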
Even after fixing this, not done • This is still a broadcast algorithm – the traffic goes everywhere • Need to “Prune” the tree when there are subtrees with no group members • Networks know they have members based on IGMP messages • Add the notion of “leaf” nodes in tree • They start the pruning process
Pruning Details • Prune (Source,Group) at leaf if no members • Send Non-Membership Report (NMR) up tree • If all children of router R send NMR, prune (S,G) • Propagate prune for (S,G) to parent of R • On timeout: • Prune dropped • Flow is reinstated • Downstream routers re-prune • Note: a soft-state approach
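The soft-state idea can be sketched as a table of prunes that expire on their own. Everything named here is hypothetical (the `PruneState` class, its methods, and the lifetime value); real DVMRP message formats and timers are not modeled.

```python
class PruneState:
    """Soft-state prune table for one router."""

    def __init__(self, children, lifetime=120.0):
        self.children = set(children)  # downstream child links
        self.lifetime = lifetime       # hypothetical prune lifetime (seconds)
        self.pruned = {}               # (source, group, child) -> expiry time

    def receive_nmr(self, source, group, child, now):
        """Record a non-membership report from one child.

        Returns True when every child is currently pruned, meaning this
        router should propagate a prune for (source, group) to its parent.
        """
        self.pruned[(source, group, child)] = now + self.lifetime
        return all(self.is_pruned(source, group, c, now)
                   for c in self.children)

    def is_pruned(self, source, group, child, now):
        """A prune counts only until it times out; then the flow resumes."""
        expiry = self.pruned.get((source, group, child))
        return expiry is not None and now < expiry


state = PruneState(children=["c1", "c2"], lifetime=10.0)
first = state.receive_nmr("S", "G", "c1", now=0.0)
second = state.receive_nmr("S", "G", "c2", now=1.0)
```

No explicit "un-prune" message is needed in this sketch: once `now` passes the expiry time, traffic flows again and downstream routers must re-prune if they still have no members.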
Distance Vector Multicast Scaling • State requirements: • O(Sources × Groups) active state • How to get better scaling? • Hierarchical Multicast • Core-based Trees
Core-Based Trees (CBT) • Pick “rendezvous point” for the group (called core) • Build tree from all members to that core • Shared tree • More scalable: • Reduces routing table state from O(S x G) to O(G)
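Use Shared Tree for Delivery • Group members: M1, M2, M3 • M1 sends data [Figure: shared tree with a root (the core) and members M1, M2, M3; arrows show control (join) messages toward the root and data flowing along the tree]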
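Building the shared tree from join messages can be sketched as each member walking its unicast path toward the core. This is a simplified illustration under stated assumptions: the function name, the intermediate router `R1`, and the topology are hypothetical, and a join is assumed to stop as soon as it reaches a node already on the tree.

```python
def cbt_join(member, core, next_hop_to_core, on_tree):
    """Send a join from member toward the core; return the tree edges added.

    next_hop_to_core: dict of node -> next node on its unicast path to the core
    on_tree: set of nodes already on the shared tree (updated in place)
    """
    new_edges = []
    node = member
    # Walk toward the core until reaching it or a node already on the tree.
    while node != core and node not in on_tree:
        on_tree.add(node)
        parent = next_hop_to_core[node]
        new_edges.append((node, parent))
        node = parent
    on_tree.add(core)
    return new_edges


next_hop = {"M1": "R1", "M2": "R1", "R1": "root", "M3": "root"}
tree_nodes = set()
edges_m1 = cbt_join("M1", "root", next_hop, tree_nodes)
edges_m2 = cbt_join("M2", "root", next_hop, tree_nodes)
edges_m3 = cbt_join("M3", "root", next_hop, tree_nodes)
```

Each router ends up holding one entry for the group no matter how many senders there are, which is where the O(G) state bound comes from.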
Barriers to Multicast • Hard to change IP • Multicast means changes to IP • Details of multicast were very hard to get right • Not always consistent with ISP economic model • Charging done at edge, but single packet from edge can explode into millions of packets within network
Scheduling • Decide when and what packet to send on output link • Classifier partitions incoming traffic into flows • In some designs, each flow has its own FIFO queue [Figure: classifier splits arriving packets into per-flow queues (flow 1, flow 2, …, flow n) under buffer management; scheduler picks the next packet for the output link]
Packet Scheduling: FIFO • What if scheduler uses one first-in first-out queue? • Simple to implement • But everyone gets the same service • Example: two kinds of traffic • Video conferencing needs low bandwidth and low delay • E.g., 1 Mbps and 100 msec delay • E-mail is not sensitive to delay, but needs bandwidth • Cannot admit much e-mail traffic • Since it will interfere with the video conference traffic
Packet Scheduling: Strict Priority • Strict priority • Multiple levels of priority • Always transmit high-priority traffic, when present • .. and force the lower priority traffic to wait • Isolation for the high-priority traffic • Almost like it has a dedicated link • Except for the (small) delay for packet transmission • High-priority packet arrives during transmission of low-priority • Router completes sending the low-priority traffic first
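A strict-priority scheduler can be sketched as a list of FIFO queues scanned from highest to lowest priority. The class and method names are hypothetical. `dequeue` is assumed to be called only when the link becomes free, so a low-priority packet already in transmission is never interrupted.

```python
from collections import deque


class StrictPriorityScheduler:
    """Always serve the highest-priority non-empty queue (0 = highest)."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority, packet):
        self.queues[priority].append(packet)

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is queued."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


sched = StrictPriorityScheduler(levels=2)
sched.enqueue(1, "low1")
sched.enqueue(0, "high1")
sched.enqueue(1, "low2")
order = [sched.dequeue(), sched.dequeue()]
sched.enqueue(0, "high2")  # arrives while low-priority traffic is waiting
order += [sched.dequeue(), sched.dequeue(), sched.dequeue()]
```

The trace shows how "low2" keeps getting pushed back whenever high-priority traffic is present.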
Scheduling: Weighted Fairness • Limitations of strict priority • Lower priority queues may starve for long periods • … even if the high-priority traffic can afford to wait • Traffic still competes inside each priority queue • Weighted fair scheduling • Assign each queue a fraction of the link bandwidth • Rotate across the queues on a small time scale • Send extra traffic from one queue if others are idle [Figure: link shared 50% red, 25% blue, 25% green]
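The "send extra traffic if others are idle" rule can be illustrated by recomputing shares over only the queues that have packets waiting. This is an idealized sketch, not a packet-level scheduler: the function name is hypothetical, weights are assumed to be integers, and the red/blue/green weights are taken from the slide's 50/25/25 split.

```python
from fractions import Fraction


def active_shares(weights, backlogged):
    """Split the link among backlogged queues in proportion to their weights.

    weights: dict of queue name -> configured integer weight
    backlogged: set of queue names that currently have packets waiting
    Returns dict of queue name -> fraction of link bandwidth (idle queues: 0).
    """
    total = sum(weights[q] for q in backlogged)
    return {q: (Fraction(weights[q], total) if q in backlogged else Fraction(0))
            for q in weights}


weights = {"red": 50, "blue": 25, "green": 25}
all_busy = active_shares(weights, {"red", "blue", "green"})
green_idle = active_shares(weights, {"red", "blue"})
```

When green goes idle, its 25% is split between red and blue in the same 2:1 ratio as their weights, rather than being wasted.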
Max-Min Fairness • Given a set of bandwidth demands r_i and a total bandwidth C, the max-min bandwidth allocations are: a_i = min(f, r_i) • where f is the unique value such that Sum(a_i) = C • Property: • If you don’t get full demand, no one gets more than you
Computing Max-Min Fairness • Denote • C – link capacity • N – number of flows • r_i – arrival rate • Max-min fair rate computation: • 1. compute C/N (= the remaining fair share) • 2. if there are flows i such that r_i ≤ C/N, then update C and N (subtract those r_i from C and remove those flows from N) and go to 1 • 3. if not, f = C/N; terminate
Example • C = 10; r1 = 8, r2 = 6, r3 = 2; N = 3 • C/3 = 3.33 • Can service all of r3 • Remove r3 from the accounting: C = C – r3 = 8; N = 2 • C/2 = 4 • Can’t service all of r1 or r2 • So hold them to the remaining fair share: f = 4 • Result with f = 4: min(8, 4) = 4; min(6, 4) = 4; min(2, 4) = 2 [Figure: link of capacity 10 with demands 8, 6, 2 receiving allocations 4, 4, 2]
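The iterative computation and the example can be checked with a short sketch. The function name is hypothetical, and one choice here is not from the slides: when every demand fits within the capacity, no flow is limited, so the sketch reports an infinite fair share.

```python
import math


def max_min_allocations(capacity, demands):
    """Compute max-min fair allocations by repeated fair-share division.

    capacity: link capacity C
    demands: list of arrival rates r_i
    Returns (f, allocations) with allocations[i] = min(f, demands[i]).
    """
    remaining = capacity
    unsatisfied = list(range(len(demands)))
    fair_share = math.inf  # stays infinite if every demand fits (assumption)
    while unsatisfied:
        share = remaining / len(unsatisfied)  # the remaining fair share C/N
        fits = [i for i in unsatisfied if demands[i] <= share]
        if not fits:
            fair_share = share  # nobody left fits: hold the rest to f
            break
        for i in fits:
            remaining -= demands[i]  # fully serve the small flows
        unsatisfied = [i for i in unsatisfied if i not in fits]
    return fair_share, [min(fair_share, r) for r in demands]


f, allocations = max_min_allocations(10, [8, 6, 2])
```

With C = 10 and demands 8, 6, 2 this reproduces the slide's result: f = 4 and allocations 4, 4, 2.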
Fair Queuing (FQ) • Conceptually, computes when each bit in the queue should be transmitted to attain max-min fairness (a “fluid flow system” approach) • Then serve packets in the order of the transmission time of their last bits • Allocates bandwidth in a max-min fair manner
Example [Figure: timelines showing arrival traffic for Flow 1 (packets 1 to 6) and Flow 2 (packets 1 to 5), service in the fluid flow system, and the resulting service order in the packet system]
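The Fair Queuing rule of serving packets in order of when their last bit would finish in the fluid system can be sketched for one simple case. The assumptions are significant and not from the slides: all packets are already queued at time 0 and all flows have equal weight, so a packet's finish tag is just the running total of lengths within its flow. The function name and the tie-break by flow name are hypothetical.

```python
def fq_order(flows):
    """Order packets by the finish time of their last bit in the fluid system.

    flows: dict of flow name -> list of packet lengths, all queued at time 0
    Returns a list of (flow, packet index) pairs in service order.
    """
    tagged = []
    for flow, lengths in flows.items():
        finish = 0
        for index, length in enumerate(lengths):
            finish += length  # finish tag: cumulative bits within this flow
            tagged.append((finish, flow, index))
    # Smallest finish tag first; ties fall back to flow name, then index.
    return [(flow, index) for _, flow, index in sorted(tagged)]


order = fq_order({"a": [100, 100, 100], "b": [300]})
```

Flow "a" sends three small packets before flow "b" sends its one large packet, yet both flows have moved 300 units by the end, which is the equal split that max-min fairness calls for here.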
Fair Queuing (FQ) • Provides isolation: • Misbehaving flow can’t impair others • Could change congestion control paradigm • But not used… • Doesn’t “solve” congestion by itself: • Still need to deal with individual queues filling up • Generalized to Weighted Fair Queuing (WFQ) • Can give preferences to classes of flows • Used for quality of service (QoS) • Allocations to aggregates