Quality of Service
Outline • Realtime Applications • Integrated Services • Differentiated Services
Realtime Applications • Require “deliver on time” assurances • must come from inside the network • Example application (audio) • sample voice once every 125 µs • each sample has a playback time (usually the time it was created, for live rather than stored data) • packets experience variable delay in network • add constant factor to playback time: playback point
[Diagram: microphone → sampler/A-to-D converter → network → buffer → D-to-A converter → speaker]
Playback Buffer
[Figure: sequence number vs. time, showing the packet transmission curve at the source, variable network delay, the packet arrival curve, buffering, and the playback curve at the playback point]
Timely Delivery • How to achieve timely delivery (assuming the delay from src to dst is about the same as from dst to src): • When the one-way delay is small relative to the acceptable delay (less than 1/3 of it) • Sent packets will arrive on time • If lost, a packet has enough time to be retransmitted • When the one-way delay is larger than the acceptable delay • Impossible for transmitted packets to arrive on time • Otherwise • Packets may arrive on time • No possibility of retransmission
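As a quick illustration of these three regimes, here is a minimal Python sketch; the function name and the returned strings are ours, not part of any standard, and it simply encodes the thresholds above.

```python
def delivery_regime(one_way_delay, acceptable_delay):
    """Classify timely-delivery prospects, assuming the src-to-dst and
    dst-to-src delays are roughly equal (so a retransmission costs about
    two extra one-way delays on top of the original attempt)."""
    if one_way_delay < acceptable_delay / 3:
        return "arrives on time; a lost packet can still be retransmitted in time"
    elif one_way_delay > acceptable_delay:
        return "cannot arrive on time"
    else:
        return "may arrive on time, but there is no time for a retransmission"

print(delivery_regime(one_way_delay=0.010, acceptable_delay=0.050))
print(delivery_regime(one_way_delay=0.040, acceptable_delay=0.050))
print(delivery_regime(one_way_delay=0.080, acceptable_delay=0.050))
```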
Timely delivery (continued …) • We can reduce the one-way delay by providing support for delay preferences in the network; this is called quality of service, or QoS • However, the one-way delay cannot be less than the propagation delay (i.e., with no queuing or transmission delay at all) • Within the 48 contiguous U.S. states, the one-way propagation delay peaks at around 25 msec (not counting the transmission delay at each hop) • Humans notice about 50 msec of delay for voice • Not much chance for retransmission! • We want to provide QoS such that • End-to-end delay is small • Minimum, if any, packet loss
Taxonomy • Applications divide into Elastic and Real time • Real time applications are either Intolerant or Tolerant • Tolerant applications are either Nonadaptive or Adaptive • Adaptive applications are Delay-adaptive or Rate-adaptive
Quality of Service Approaches Fine-grained • Provide QoS to individual applications or flows • Proposed for the Internet: • IETF Integrated Services (IntServ) with Resource Reservation Protocol (RSVP, like a VC “setup”) • You will need a “flow ID”, i.e., like virtual circuits Coarse-grained • Provide QoS to large “classes” of data or aggregated flows • Proposed for the Internet: • IETF Differentiated Services (DiffServ) • You only need a few bits in the header to mark the “class” of the packet
Integrated Services (Fine Grained Approach) • Two IntServ Service Classes • Guaranteed service • Specified maximum delay • I.e., for intolerant applications • We will cover this one first • Controlled load services • No bound on delay • I.e., for delay tolerant, adaptive applications • Network “shields” this traffic from congestion • Network “appears” lightly loaded
Integrated Services - Mechanisms • Flow specification • Tell the network the properties of the data flow • Admission control • Network decides if it can handle flow • Reservation • Reserve resources if flow admitted by admission control • Packet classification • Map packets to flows • Scheduling • Forwarding method • Policing • Enforcing traffic properties at entrance of network
A “generic” view of the mechanisms
[Diagram: source host → routers → destination host] • Reserve message: gives the flowspec (burstiness, rate, desired delay) and reserves resources along the way • Each router accepts or rejects the reservation (admission control) according to its available resources • Data messages then follow the reserved path • Policing: ensuring the source’s traffic satisfies the flowspec
Mechanisms for data flow • Packet classification: packets arriving on the input channel are separated into queues according to flow ID • Scheduler: chooses which packet to forward next on the output channel
Flow Specification • We next go more deeply into flow specification
Flowspecs have two parts • Rspec: describes the service requested from the network • In IntServ’s guaranteed service: a delay bound • In IntServ’s controlled-load service: none • Tspec: describes the flow’s traffic characteristics • Usually defined in terms of a rate r and a max burst size B • r is the long-term rate of the flow • B is the “burstiness”, i.e., how much the flow can deviate from the rate r • (more on this next)
Constant Rate Service • Many QoS protocols guarantee a constant rate r to a network flow (or at least this rate) • Thus, they guarantee a packet will exit the network no later than the time it would exit from a constant-rate server of rate r (plus a small constant) • I.e., the network “mimics” a constant-rate server as best it can. • Question: for a given input flow f, what will be the delay through a constant-rate server? • The delay of a packet is the size of the queue when the packet arrives at the server, divided by the rate r.
[Diagram: flow f feeding a queue drained by a constant-rate server of rate r]
(r,B) constrained flows • A flow f is (r,B) constrained at some point P in the network iff making a copy of f at P and giving it as input to a constant rate server of rate r causes the server’s queue to grow to no more than B bytes. • I.e., its delay through a server of rate r is at most B/r • Note: a flow may be (r,B) constrained going into a router but may not be when it leaves the router (due to delays at the router) • Given an input flow f (i.e. if you know the arrival time of each packet of f and its size), can you determine if f is (r,B) constrained? • Sure, just compute the behavior of a constant rate server of rate r and see how big its queue gets.
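To make the “just simulate a constant-rate server” test concrete, here is a minimal Python sketch (the names max_backlog and is_rb_constrained are ours). It assumes the flow is given as (arrival_time_sec, size_bytes) pairs sorted by arrival time.

```python
def max_backlog(packets, r):
    """Feed the flow to a constant-rate server of rate r (bytes/sec) and
    return the largest queue, in bytes, seen just after any arrival."""
    backlog, largest, prev_t = 0.0, 0.0, packets[0][0] if packets else 0.0
    for t, size in packets:
        backlog = max(0.0, backlog - r * (t - prev_t))  # drain since last arrival
        backlog += size
        largest = max(largest, backlog)
        prev_t = t
    return largest

def is_rb_constrained(packets, r, B):
    """f is (r, B) constrained iff the simulated queue never exceeds B bytes."""
    return max_backlog(packets, r) <= B

# 2 MB arriving back-to-back at t = 0 is (1 MBps, 2 MB) constrained
# but not (1 MBps, 1 MB) constrained:
flow = [(0.0, 1_000_000), (0.0, 1_000_000)]
print(is_rb_constrained(flow, r=1_000_000, B=2_000_000))  # True
print(is_rb_constrained(flow, r=1_000_000, B=1_000_000))  # False
```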
Example: assume flow f is given to a constant rate server of rate 1 MBps • Is f (r,B) constrained, where • r = 1 MBps, B = 1 MB? Consider any interval • r = 1 MBps, B = 0.5 MB? Consider interval (2,3) • r = 0.01 MBps, B = 2 MB? Consider interval (0,3) • (ignore the green line in this case since that was the buffer for a 1 MBps server)
[Plot: cumulative arrival function of flow f (MB) and buffer size at the 1 MBps constant-rate server, vs. time in seconds]
(r,B)-constrained delay • Assume f is (r,B)-constrained • What is the delay of a flow f if it is given as input to a constant-rate server of rate r?
Alternative definition • Claim: A flow is (r,B) constrained iff, for any interval of time of length t, the number of bytes arriving from the flow is at most r*t + B. • Why this definition? • Perhaps because we don’t have the flow a priori, but we know something about the process that generates it.
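Since bytes arrive only at discrete packet arrival instants, the claim can be checked directly from the arrival trace: it suffices to test intervals that begin and end at packet arrivals, as those are the only points where the cumulative arrival function jumps. A minimal sketch (the function name is ours):

```python
def satisfies_rate_burst_bound(packets, r, B):
    """Alternative (r, B) test: every interval [t_i, t_j] bounded by packet
    arrivals carries at most r*(t_j - t_i) + B bytes.
    packets: list of (arrival_time_sec, size_bytes) sorted by arrival time."""
    n = len(packets)
    for i in range(n):
        total = 0.0
        for j in range(i, n):
            total += packets[j][1]
            if total > r * (packets[j][0] - packets[i][0]) + B:
                return False
    return True
```

By the proof on the next two slides, this test returns the same verdict as simulating the constant-rate server.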
Proof of claim (both definitions are the same) • First part: • If the queue of the server grows to no more than B, then the number of bytes arriving from the flow during any interval of length t is at most r*t + B • We prove the contrapositive: if the number of bytes arriving from the flow during some interval of length t is more than r*t + B, then the queue grows to more than B • Trivial: since in an interval of size t we can transmit only r*t bytes, if more than r*t + B arrive, then the queue becomes greater than B.
Proof of claim • Second part: • If the number of bytes arriving from the flow in any time interval of size t is at most r*t + B, then the queue of the server grows to no more than B • Again prove the contrapositive: if the queue of the server grows to more than B, then there exists some interval of size t during which more than r*t + B bytes arrived • Let x be the beginning of the “busy period” during which the queue grew to more than B, and y be the time when the queue grew to more than B. • I.e., queue = 0 right before x, queue > 0 from x to y, queue > B at y. • Since the server is busy from x to y, it sent (y - x)*r bytes during [x,y] • Since the queue was 0 before x, and more than B at y, the total bytes that came in during [x,y] are more than (y-x)*r + B
Enforcing the (r,B) constraint • Assume I don’t know f in advance, nor anything about the process that generates it. • I want to ensure it is (r,B) constrained before giving it to the network (my tspec says that my traffic is (r,B) constrained) • How do I filter (smooth out) f so that it is (r,B) constrained? • Note that the long-term average rate of f has to be at most r; otherwise it cannot be done, i.e., packets will be delayed in the filter forever. • Use token buckets (also known as leaky buckets).
Token Bucket Filters
[Diagram: tokens arrive continuously at r tokens/sec into a bucket of capacity B; data passes through the filter by consuming tokens] • Token bucket capacity B (limited token accumulation) • Each byte needs a token in order to pass (a packet of size L requires L tokens) • Dropping filter: drops packets if tokens are not available • Buffered filter: buffers data until tokens become available (we will assume buffered) • The “output channel” rate of the bucket is infinite, i.e., tokens, not bandwidth, determine when a packet exits the bucket • We assume the bucket is full initially
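The following is a minimal sketch of the buffered variant described above (names and structure are ours). Tokens are counted in bytes, the bucket starts full, and the output link is not a bottleneck; packets leave in FIFO order as soon as enough tokens have accumulated.

```python
def buffered_token_bucket(packets, r, B):
    """Departure times from a buffered token-bucket filter with rate r
    (bytes/sec) and bucket size B bytes; packets is a list of
    (arrival_time_sec, size_bytes) in arrival order.  Assumes every
    packet is no larger than B (otherwise it could never pass)."""
    tokens, last_t, ready = B, 0.0, 0.0   # bucket full at time 0
    departures = []
    for arrival, size in packets:
        start = max(arrival, ready)                      # FIFO: wait for earlier packets
        tokens = min(B, tokens + r * (start - last_t))   # refill up to time `start`
        last_t = start
        if tokens < size:
            # Wait for the missing (size - tokens) tokens; the bucket cannot
            # overflow during this wait because it holds fewer than B tokens.
            start += (size - tokens) / r
            tokens = size
            last_t = start
        tokens -= size
        departures.append(start)
        ready = start
    return departures
```

Its output is (r,B) constrained, as argued two slides ahead, and if the input is already (r,B) constrained every departure time equals the arrival time (no delay in the bucket).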
Output Example • Assume for the token bucket: r = 1 MB/s, B = 1 MB
[Plot: cumulative arrival function, cumulative departure function, and tokens left in the bucket (MB) vs. time in seconds]
Output of Token Bucket Filters • The output of a token bucket of parameters r and B is (r,B) constrained • Why? • Consider any interval of size t • How many tokens can there be during this interval? • At most B initially • r*t that arrive during the interval • Hence, the bucket cannot forward more than B + r*t during an interval of size t. • Hence, if the output of a token bucket with parameters r,B is given to a constant rate server of rate r, the queue in the server grows to at most B.
No delay in token bucket • If the input to a token bucket is (r,B) constrained, then there is no delay in the bucket. Why? • Let y be the first time when data arrive and there are not enough tokens in the bucket for the data to leave. • I.e., no data is queued at any time before y • Let x be the latest time such that x < y, the bucket is full at x, and during (x, y] the bucket is not full. • How many tokens exist during [x,y]? • B initially at x • (y-x)*r are generated • No tokens are “lost” by overflowing the bucket during (x,y] since the bucket is not full then • For data to be delayed at y, more than B + r*(y-x) bytes must have arrived during [x,y], contradicting the assumption that the input is (r,B) constrained.
Per-Router Mechanisms • Admission Control (talk about this later) • Admission control decides if a new flow can be supported • answer depends on service class and traffic specification • admission control is not the same as policing • policing shapes traffic at the entrance of the network to ensure it satisfies the flowspec once the flow has been admitted. • policing is usually done with a token bucket • Packet Processing (cover this first) • classification: associate each packet with the appropriate reservation (easy if we use flow id’s, so we skip this) • scheduling: manage queues so each packet receives the requested service
Scheduling • Please see the slides on virtual clock • Afterwards we return here and talk about WFQ
WFQ Scheduling • Each flow f is assigned a weight Wf • The bandwidth given to flow f is proportional to its weight: C*Wf/ΣWx, where C is the rate of the output channel, and the sum ΣWx is over the set of “backlogged flows” in the fake server (more on this later).
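A one-line illustration of this share (a hypothetical helper; the weights match the worked example a few slides ahead):

```python
def wfq_shares(weights, C):
    """Bandwidth each backlogged flow gets under WFQ: C * Wf / sum of Wx."""
    total = sum(weights.values())
    return {flow: C * w / total for flow, w in weights.items()}

# While f (Wf = 4) and g (Wg = 2) are backlogged on a C = 2 bits/msec channel,
# f gets about 1.33 bits/msec and g about 0.67 bits/msec.
print(wfq_shares({"f": 4, "g": 2}, C=2))
```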
Real and Fake servers
[Diagram: flows f, g, h arriving at both the real (WFQ) server and the fake bit-by-bit server] • The real (WFQ) server forwards one packet at a time • It assigns timestamps to packets • Packets are sent out in order of timestamp • The timestamp is the “virtual” finishing time of the packet at the fake server • The fake bit-by-bit server forwards a few bits of each flow at a time (i.e., fractions of a packet)
WFQ Details • The virtual time V(t) at real time t is the “bit number” or “round number” in the fake bit-by-bit server at real time t. It is computed as follows • Simple FQ: increase V(t) by 1 every time you forward one bit from EVERY queued flow in the bit-by-bit server • WFQ: increase V(t) by 1 every time you forward Wf bits from every queued flow f in the bit-by-bit server • Thus, V(t) increases faster over time if there are fewer flows queued in the bit-by-bit server • The “bit-by-bit round-robin service” is therefore faster for flows with greater weight.
Virtual time rate changes with backlogged flows
[Plot: V(t) vs. real time t; the slope of V decreases where the set of backlogged flows at the fake server increases, and increases where it decreases]
Timestamps • Ff,i (timestamp of ith packet of flow f) is the virtual time when the ith packet of f exits the bit-by-bit server. • Packets in the “real server” are forwarded in order of their timestamps. • Timestamps are computed as • Ff,i = virtual time when packet begins service + number of rounds required to service the packet • Ff,i = max(V(Af,i),Ff,i-1) + Lf,i/Wf • Af,i is the arrival time (real time) of ith packet of f (L is packet size)
Bit-by-bit (fake) server crucial equations • For any interval of real-time [T’, T’’] such that the set of backlogged flows does not change in the fake server V(T’’) = V(T’) + (T’’ - T’)*C/WB where WB is the sum of the weights of the backlogged flows in the fake server. • Likewise, if the set of backlogged flows did not change during real-time T’ up to real-time T’’, then T’’ = T’ + (V(T’’) – V(T’))*WB/C
Example • Wf = 4, Wg = 2, Wh = 3, C = 2 bits/msec • At time 0, one packet (1000 bits) from f arrives and 3 packets (200 bits each) from g arrive. • Timestamps • Ff,1 = max(V(0),Ff,0) + Lf/Wf = max(0,0) + 1000/Wf = 250 • Fg,1 = max(V(0),Fg,0) + Lg/Wg = max(0,0) + 200/Wg = 100 • Fg,2 = max(V(0),Fg,1) + Lg/Wg = max(0,100) + 200/Wg = 200 • Fg,3 = max(V(0),Fg,2) + Lg/Wg = max(0,200) + 200/Wg = 300
[Plot: V(t) vs. real time t, with Fg1 = 100, Fg2 = 200, Ff1 = 250, Fg3 = 300 marked on the virtual-time axis; a packet of h arrives at 775 msec. What is V(775)?]
Example Cont. • At time 775 msec, a packet from h comes in (300 bits in size) • We must compute the value of V(775) to compute the timestamp of the new packet. • The rate at which V grows changes depending on how many flows are queued in the fake server. • Note: the rate of V can only increase, and it increases when f or g is no longer queued (until h arrives) • We must check whether f or g is no longer queued by time 775 msec.
Example (cont) • Will f or g finish (no longer queued) in the fake server by then (time 775 msec)? • Max(Ff) = 250, Max(Fg) = 300 (f will finish before g) • Because Max(Ff) = 250, we have V = 250 when Ff,1 finishes (by definition of F) • Thus, at what real time will f no longer be queued? • I.e., what is T’’ such that V(T’’) = 250? We know V(0) = 0, i.e., let T’ be 0. • T’’ = T’ + (V(T’’) – V(T’))*WB/C • WB = (Wf + Wg) = 4 + 2 = 6 • T’’ = 0 + (250 – 0)*6 / (2 bits/msec) = 750 msec. (real time)
[Plot: same V(t) figure; V(750) = 250 and f is no longer queued after 750 msec (only g). What is V(775)?]
Example (almost finally) • At real time 750, only g is queued in the bit-by-bit server • Thus, from real time 750 until h arrives at 775, ONLY bits from g are forwarded. • Will g finish before h arrives? • We compute the real-time when g finishes. The virtual time when g finishes is 300 (recall Max(Fg) = 300) • Let T’ = 750, V(T’) = 250, V(T’’) = 300, find T’’ • T’’ = T’ + (V(T’’) – V(T’))*WB/C • T’’ = 750 + (300 – 250)*2/2 = 800 (recall WB = Wg = 2, C = 2) • Thus, g would finish at real time 800 if h does not show up • But h DOES show up at time 775 (before g finishes), so g is still queued at this time. • The above computation is thus not useful except to tell us that g is still queued
[Plot: same V(t) figure; the backlogged set does not change between 750 and 775 msec, so g is still queued when the packet of h arrives at 775 msec. What is V(775)?]
Example (finally) • What is the value of V at real time 775 (when h arrives)? • Only g is transmitted from real time 750 to 775 • T’ = 750, V(T’) = 250, T’’ = 775 • V(T’’) = V(T’) + (T’’ - T’)*C/WB • V(775) = V(750) + (25 msec * 2 bits/msec) / 2 (WB = Wg = 2) • V(775) = 250 + 25 = 275 • The timestamp of h is thus • Fh,1 = max(V(Ah,1),Fh,0) + Lh/Wh = max(275,0) + 300/3 = 375 • How many bits of g remain when the packet of h arrives?
WFQ (algorithm) • When a packet arrives at time Tarrive we must compute Varrive, i.e., V(Tarrive) = Varrive • Let Vlast be the round number at the time the last packet was received • Let Tlast be the real-time when last packet was received, i.e., V(Tlast) = Vlast • Let Blast be the set of backlogged flows at real time Tlast • Let Ff be the F value of the last packet of flow f.
“Big Picture”
[Plot: V(t) between Tlast and Tarrive; the slope of V can only increase as flows finish. Given Vlast at Tlast, what is Varrive at Tarrive?]
WFQ algorithm continued… • Step 1: Let f be the flow with minimum Ff in Blast • Let WB be the sum of the weights in Blast • (compute the real time when V = Ff, assuming no new packets) • Tf = Tlast + (Ff – Vlast)*WB/C • If Tf < Tarrive (f did finish before time Tarrive) • (update all values up to when f finishes) • Vlast = Ff, Tlast = Tf, Blast = Blast – {f}; GOTO step 1 • else (f did not finish before time Tarrive) • (compute the new V; note that Blast does not change from Tlast to Tarrive) • Varrive = Vlast + (Tarrive – Tlast)*C/WB • Compute the packet timestamp using Varrive • Vlast = Varrive; Tlast = Tarrive; Blast = Blast ∪ {flow of packet}
WFQ algorithm (continued) • Whenever a packet arrives and there are no packets in the real server (no packets from any flow, i.e., the packet queue is empty), we basically reset • Let pf,i be the packet that just arrived • Tlast = Af,i • Vlast = 0 (actually, any value you want; 0 will do) • WB = Wf
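The bookkeeping of the last few slides can be condensed into a short event-driven routine. The Python sketch below is ours (class and method names included); it only computes V(t) and the packet timestamps, not the actual transmission order, and it reproduces the numbers from the worked example.

```python
class WFQ:
    """Virtual-time bookkeeping for WFQ, following the algorithm above."""

    def __init__(self, C, weights):
        self.C = C                  # output channel rate
        self.W = dict(weights)      # flow -> weight
        self.backlog = {}           # backlogged flow -> largest F of its packets
        self._reset(0.0)

    def _reset(self, t):
        # New busy period: virtual time restarts; stale finish numbers from
        # the previous busy period are discarded (left implicit in the slides).
        self.V_last, self.T_last = 0.0, t
        self.F = {f: 0.0 for f in self.W}

    def arrival(self, t, flow, length):
        """Advance V to real time t, then timestamp a packet of `flow`."""
        if not self.backlog:
            self._reset(t)
        else:
            while True:
                WB = sum(self.W[f] for f in self.backlog)
                f_min = min(self.backlog, key=self.backlog.get)
                T_f = self.T_last + (self.backlog[f_min] - self.V_last) * WB / self.C
                if T_f < t:                      # f_min left the fake server before t
                    self.V_last, self.T_last = self.backlog.pop(f_min), T_f
                    if not self.backlog:         # fake server went idle before t
                        self._reset(t)
                        break
                else:                            # backlogged set unchanged up to t
                    self.V_last += (t - self.T_last) * self.C / WB
                    self.T_last = t
                    break
        F_new = max(self.V_last, self.F[flow]) + length / self.W[flow]
        self.F[flow] = self.backlog[flow] = F_new
        return F_new

# The worked example: C = 2 bits/msec, Wf = 4, Wg = 2, Wh = 3.
wfq = WFQ(C=2, weights={"f": 4, "g": 2, "h": 3})
print(wfq.arrival(0, "f", 1000))   # 250.0
print(wfq.arrival(0, "g", 200))    # 100.0
print(wfq.arrival(0, "g", 200))    # 200.0
print(wfq.arrival(0, "g", 200))    # 300.0
print(wfq.arrival(775, "h", 300))  # 375.0  (and internally V(775) = 275)
```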
WFQ Exit Time • Let Df,i = the real time when pf,i exits the fake server. • Is WFQ a bounded-appetite server? • In an interval [t, t’], consider those packets that arrive during the interval and have Df,i ≤ t’, i.e., they arrive AND exit the fake server during the interval. • We must show these add up to at most (t’-t)*C. • The fake server cannot xmit more than (t’-t)*C bits during the interval [t, t’]. • Hence, no more than (t’-t)*C bits can arrive during [t, t’] AND exit during [t, t’]. • Done! • Hence, WFQ is a bounded-appetite server • Packets exit the real server by the time they exit the fake server plus Lmax/C
Exit time from fake server in WFQ • The rate given to f is C*Wf/ΣWx, i.e., in each round, of length (ΣWx)/C, it forwards Wf bits from f • E.g., let Wf = Rf and ΣWx ≤ C; then the rate given to f is at least Rf • Then, the fake server serves f at least as fast as a constant-rate server of rate Rf • From the previous slide, each packet exits the real server no later than the time it would exit the fake server plus Lmax/C • Thus, each packet of f exits the real server no later than it would exit a constant-rate server of rate Rf, plus Lmax/C • The above exit bound is the same as Virtual Clock.
Why WFQ and not VC? • WFQ is fair • The unfairness example of VC will not happen with WFQ
Admission Control (VC and WFQ) • Assume the input flow f is (r,B) constrained • Flow f wants a packet delay of at most D seconds • What should be the value of Rf, i.e., the rate reserved by f from the server? • Note • r ≤ Rf, so f is also (Rf, B) constrained • thus, the max delay through a server of rate Rf is B/Rf • the max delay through the real server is B/Rf + Lmax/Cout • We want this to be at most D: B/Rf + Lmax/Cout ≤ D • Thus, Rf ≥ B/(D - Lmax/Cout)
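A small sketch of this computation (the helper name and the numbers are hypothetical; everything is in bits and seconds):

```python
def reserved_rate(r, B, D, L_max, C_out):
    """Smallest reserved rate Rf meeting the delay bound D for an (r, B)
    constrained flow, from B/Rf + L_max/C_out <= D.  Returns None if the
    bound is unachievable even at infinite rate (D <= L_max/C_out)."""
    slack = D - L_max / C_out
    if slack <= 0:
        return None
    # Reserve at least r itself so that f is also (Rf, B) constrained.
    return max(r, B / slack)

# Hypothetical numbers: r = 100 kbit/s, B = 10 kbit burst, D = 50 ms,
# L_max = 12 kbit, C_out = 1 Mbit/s  ->  Rf is about 263 kbit/s.
print(reserved_rate(r=100e3, B=10e3, D=0.050, L_max=12e3, C_out=1e6))
```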