QoS over IP Networks
Dave Olshefski, Candidacy Exam, 12/8/99
Overview
• Motivation for QoS in the network
• Flow specification
• Flow separation
• Resource reservation
• Solving the scaling problem
• Management
• Interesting research issues
Top Three Reasons for QoS in the Network
• Ensure mission-critical applications are allocated the resources they need.
• Multimedia/realtime applications can only operate within a certain range of service.
• Service providers wish to increase revenues through premium pricing and competitive differentiation of services.
What is a service?
• End users are concerned with response time & cost
• Better than Best Effort (BE) for a web server
  • 1 Mbps to any point in the network; traffic over 1 Mbps gets BE
• Leased Line Emulation from point A to B (realtime)
  • 100 Kbps, A to B; traffic over 100 Kbps gets dropped
• Media Playback from point A to B (adaptive)
  • 100 Kbps sustained, 100 Kbps max burst at a peak rate of 200 Kbps, A to B; excess burst traffic over the sustained rate is given lower priority within the class; non-conforming traffic, etc.
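Characterize a Flow: <r,b,p,m,M>
• r: constant token rate (bps)
• b: token bucket depth
• p: max peak rate
• m, M: minimum policed unit and maximum packet size (compare MinimumPolicedSize and MaxSduSize in the Windows 2000 Flow Spec)
• Diagram labels: variable rate input, input queue, burst, leaky bucket depth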
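The <r, b> part of the flow characterization can be checked with a token bucket meter. The following is a minimal sketch, assuming time is passed in as seconds and sizes as bits; the type and function names (`token_bucket`, `tb_init`, `tb_conforms`) are illustrative and do not come from the slides, and the peak rate p is not enforced here.

```c
#include <stdbool.h>

/* Token bucket meter for the <r, b> part of a flow spec.
 * All names are hypothetical; they are not taken from the slides. */
typedef struct {
    double rate_bps;    /* r: token fill rate, in bits per second */
    double depth_bits;  /* b: bucket depth, in bits */
    double tokens;      /* tokens currently available, in bits */
    double last_time;   /* time of the last update, in seconds */
} token_bucket;

void tb_init(token_bucket *tb, double rate_bps, double depth_bits, double now)
{
    tb->rate_bps = rate_bps;
    tb->depth_bits = depth_bits;
    tb->tokens = depth_bits;   /* assumption: the bucket starts full */
    tb->last_time = now;
}

/* Returns true if a packet of packet_bits arriving at time now is in
 * profile. Conforming packets consume tokens; others leave them intact. */
bool tb_conforms(token_bucket *tb, double packet_bits, double now)
{
    /* Credit the tokens earned since the last update, capped at depth b. */
    tb->tokens += (now - tb->last_time) * tb->rate_bps;
    if (tb->tokens > tb->depth_bits)
        tb->tokens = tb->depth_bits;
    tb->last_time = now;

    if (packet_bits <= tb->tokens) {
        tb->tokens -= packet_bits;
        return true;
    }
    return false;
}
```

A meter built this way only classifies packets as in or out of profile; what happens to out-of-profile packets (drop, remark, or delay) is left to the dropper, remarker, or shaper that follows it.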
Weighted Round Robin (WRR)
• Queues A, B and C feed the packet scheduler with weights WA, WB and WC
• Wi = number of packets to service during turn
• Bandwidth assurances iff mean packet size per flow is known
• No delay assurances
Weighted Fair Queuing (WFQ)
• Queues A, B and C feed the packet scheduler with weights WA, WB and WC (diagram labels: t0, t1, t2)
• Wi = number of “bits” to service during turn
• Simulates bit-by-bit WRR
• Bandwidth and delay assurances
• Sorted list expensive to maintain
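One common way to approximate bit-by-bit service is to stamp each arriving packet with a virtual finish tag and always send the packet with the smallest tag. The sketch below assumes the caller supplies the current virtual time; how that virtual time advances is the hard part of WFQ and is not shown. All names (`wfq_queue`, `wfq_finish_tag`, `wfq_pick`) are hypothetical.

```c
#include <stddef.h>

/* Per-queue WFQ state. Names are illustrative, not from the slides. */
typedef struct {
    double weight;       /* Wi: relative share of the link */
    double last_finish;  /* finish tag of the previous packet in this queue */
} wfq_queue;

/* Finish tag for a packet of length_bits arriving at virtual time vtime.
 * The packet starts after both its arrival and the previous packet of the
 * same queue, and takes length / weight units of virtual time. */
double wfq_finish_tag(wfq_queue *q, double vtime, double length_bits)
{
    double start = (q->last_finish > vtime) ? q->last_finish : vtime;
    q->last_finish = start + length_bits / q->weight;
    return q->last_finish;
}

/* Index of the smallest head-of-line tag among n queues (n >= 1).
 * A linear scan stands in for the sorted list mentioned on the slide. */
size_t wfq_pick(const double *head_tags, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (head_tags[i] < head_tags[best])
            best = i;
    return best;
}
```

With weights 2 and 1 and equal-sized packets, the first queue's tags grow half as fast, so it is picked roughly twice as often.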
Deficit Round Robin (DRR)
• Queues A, B and C feed the packet scheduler; queue i has a quantum Qi and a deficit counter Di (QA+DA, QB+DB, QC+DC)
• Bandwidth assurances but no delay bounds
• On each turn: deficit += quantum; WHILE head packet length <= deficit: send packet; deficit -= length
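One turn of DRR for a single queue can be sketched as follows. This assumes the usual formulation in which the quantum is credited at the start of the turn and an emptied queue forfeits its leftover credit; the fixed-size packet array and the names `drr_queue` and `drr_serve` are illustrative only.

```c
#include <stddef.h>

#define DRR_MAX_PKTS 16

/* One DRR queue. Packet lengths are kept in a small ring buffer.
 * All names are hypothetical, chosen for this sketch. */
typedef struct {
    unsigned quantum;             /* Qi: bytes credited per turn */
    unsigned deficit;             /* Di: unused credit carried over */
    unsigned pkts[DRR_MAX_PKTS];  /* queued packet lengths in bytes, FIFO */
    size_t head;                  /* index of the head-of-line packet */
    size_t count;                 /* number of queued packets */
} drr_queue;

/* Serve one turn of queue q; returns the number of bytes sent. */
unsigned drr_serve(drr_queue *q)
{
    unsigned sent = 0;

    if (q->count == 0) {
        q->deficit = 0;           /* idle queues do not accumulate credit */
        return 0;
    }

    q->deficit += q->quantum;
    while (q->count > 0 && q->pkts[q->head] <= q->deficit) {
        unsigned len = q->pkts[q->head];
        q->deficit -= len;        /* pay for the packet out of the credit */
        sent += len;
        q->head = (q->head + 1) % DRR_MAX_PKTS;
        q->count--;
    }

    if (q->count == 0)
        q->deficit = 0;           /* emptied queue forfeits leftover credit */
    return sent;
}
```

A packet larger than the quantum simply waits until enough turns have passed for the deficit to cover it, which is why the scheme needs no knowledge of mean packet size.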
Router Traffic Conditioning
• Data path: Multifield Classifier → Queue A, Queue B, Queue C, BE
• Per queue: a Meter marks packets in or out of profile, followed by a Dropper or Shaper
• 5-tuple classification <src IP, src port, dest IP, dest port, prot id>
• Queue per microflow – scaling problem
• How is this configured??
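A multifield classifier of this kind can be sketched as an exact-match lookup on the 5-tuple. The linear rule table below is an assumption made for brevity (a real router would use a hash or trie), and the names `five_tuple`, `mf_rule`, and `mf_classify` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* The 5-tuple that identifies a microflow. */
typedef struct {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;     /* IP protocol id, e.g. 17 for UDP */
} five_tuple;

/* One classifier rule: an exact 5-tuple and the queue it maps to. */
typedef struct {
    five_tuple key;
    int queue;
} mf_rule;

static int tuple_equal(const five_tuple *a, const five_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Returns the queue of the first matching rule, or be_queue (best effort)
 * when no rule matches. */
int mf_classify(const mf_rule *rules, size_t n, const five_tuple *pkt,
                int be_queue)
{
    for (size_t i = 0; i < n; i++)
        if (tuple_equal(&rules[i].key, pkt))
            return rules[i].queue;
    return be_queue;
}
```

The rule table grows with the number of microflows, which is the scaling problem the slide points at.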
ReSource reserVation Protocol
• Diagram: hosts H1 and H2 connected through routers R1 to R7
• PATH
  • Follows routing path (router alert)
  • Carries Tspec for generated traffic
  • Soft state left in each router (prev hop)
  • Nodes on path advertise capabilities
  • All nodes are equivalent
RSVP
• Diagram: hosts H1 and H2 connected through routers R1 to R7
• RESV
  • Hop-by-hop
  • Carries Tspec for reservation
  • Admission decision made at each node
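Diff-serv Backbone
• Diagram: two RSVP stubs (hosts H1 and H2, routers R1 and R6) attach through border routers (BR) to a Diff-serv backbone of ingress (IR), core (CR) and egress (ER) routers
• Aggregation region: single-field classification and class-based queuing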
Ingress router to DS domain
• Data path: MF Classifier & Marker → Queue A, Queue B, Queue C
• Per queue: a Meter marks packets in or out of profile, followed by a Dropper, Remarker or Shaper
• 5-tuple to DSCP mapping
• Multiple field -> Single field, i.e. microflow -> class of service
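Once the ingress router has chosen a class, marking comes down to writing the DSCP into the packet's DS field. The sketch below assumes the standard layout in which the DSCP occupies the upper six bits of the former IPv4 TOS byte and the lower two bits are left untouched; the helper names are hypothetical.

```c
#include <stdint.h>

/* Write a 6-bit DSCP into a DS field (the former IPv4 TOS byte),
 * preserving the two low-order bits. */
uint8_t ds_set_dscp(uint8_t ds_field, uint8_t dscp)
{
    return (uint8_t)(((dscp & 0x3Fu) << 2) | (ds_field & 0x03u));
}

/* Read the 6-bit DSCP back out of a DS field. */
uint8_t ds_get_dscp(uint8_t ds_field)
{
    return (uint8_t)(ds_field >> 2);
}
```

A core router then needs only `ds_get_dscp` to select a queue, which is the single-field classification the backbone relies on.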
Core router in DS domain
• Data path: BA Classifier → Queue A, Queue B, Queue C → Packet Scheduler
• Single field classification
• No policing, shaping, dropping
Aggregate RSVP
• Path: H1 – BR – IR – CR – ER – BR – H2 (RSVP region, aggregate region, RSVP region)
• PATH to H2 in the RSVP regions; RSVP_E2E_IGNORE to H2 across the aggregate region
• RSVP PATH w/DSCP for aggregate
• RESV from H2, hop-by-hop
• RESV to IR for aggregate
Variations
• Partial path reservations w/incremental updates
• Bottleneck reservations
• “reserve if needed else use BE”
• Sender initiated reservations
RSVP Stub Multicast Problem
• Diagram labels: hosts H1, H2 and H3; routers R1 and R6; two RSVP stubs; border routers (BR); Diff-serv backbone with IR, CR and two ERs
Windows 2000/98
• Diagram labels: QoS-Enabled Application, Traffic Management Application, GQoS, TC API, Winsock2.dll, RAPI, RSVP.exe, Protocol Stack, Kernel Traffic Control, Packet Scheduler, GPC API, Network Card Driver
Winsock 2 API
    WSAStartup(…);
    WSAEnumProtocols(NULL, Protocols, &bufferSize);
    WSASocket(…, &Protocols[i], …);
    WSAConnect(…, &FlowSpecs, …);
    WSAGetQOSByName(socket, "RSVP", &FlowSpec);
    WSAIoctl(socket, SIO_SET_QOS, …, &Qos2, …);
    WSAStartup(…);
    numProtocols = WSAEnumProtocols(NULL, Protocols, &bufferSize);
    for (i = 0; i < numProtocols; i++) {
        /* look for a QoS-capable, connectionless IPv4 UDP provider */
        if ((Protocols[i].dwServiceFlags1 & XP1_QOS_SUPPORTED) &&
            (Protocols[i].dwServiceFlags1 & XP1_CONNECTIONLESS) &&
            (Protocols[i].iAddressFamily == AF_INET) &&
            (Protocols[i].iProtocol == IPPROTO_UDP)) {
            s = WSASocket(FROM_PROTOCOL_INFO,   /* address family */
                          FROM_PROTOCOL_INFO,   /* socket type */
                          FROM_PROTOCOL_INFO,   /* protocol */
                          &Protocols[i], 0, WSA_FLAG_OVERLAPPED);
            /* the QoS request rides on the connect; the last argument is reserved */
            WSAConnect(s, (struct sockaddr *)&RemoteHost, sizeof(RemoteHost),
                       NULL, NULL, &Qos, NULL);
        }
    }
    …
Service Level Agreement
• constraints on source and destination points
• out of profile actions
• marking and shaping services provided
• encryption services
• authentication mechanisms
• pricing and billing
• availability/reliability, refunds, re-negotiation/cancellation
• static or dynamic
Negotiation During RSVP Signaling
• Path: H1 – BR – IR – CR – ER – BR – H2 (RSVP region, aggregate region, RSVP region)
• Each node on the path adds its fee to the charge/price as the reservation is propagated from end to end.
• Iteration not well supported
Bandwidth Broker Negotiation
• Diagram labels: a BB/PDP per domain, linked by SLA/COPS; IR, CR, ER and BRs; RSVP stubs with R1, R6, H1 and H2
• Hierarchical
• Inter-domain
• Intra-domain
• With OSPF, BB can determine price from demand
MPLS
• Diagram: H1 and H2 connected through a series of BRs
• MPLS encapsulation at ingress -> de-encapsulation at egress
• Routing + QoS
Market
• Diagram labels: BBs and PDPs; IR, CR, ER and BRs; RSVP stubs with R1, R6, H1 and H2
• Negotiation more efficient
• Matching algorithm (social welfare)
• Fairness controls
Diagram labels: Management Console, PAPI, Policy Server, BB, Policy DB, CPC, LDAP, COPS, PDP, Proxy, SNMP, LPDP, Border Routers with PEP
Research Issues
• Aggregation
• Expanding/collapsing regions
• Dynamic or static SLAs
• Integrating policy, pricing and measurement for admissions control/marketing decisions
• Market mechanisms and product definition
• Matching algorithms
• Advanced reservations (pre-emption)
• Pricing (holding, usage, congestion)
Research Issues
• Adaptive applications:
  • Temporal scaling (sampling rate)
  • Spatial resolution (image size)
  • Layering (video)
• Advance Reservations: adapting to price fluctuations and availability over time.
• Bandwidth + [CPU, memory & disk space]
• Server farms, web hosting
• VPNs, VANs, active networks
Pricing/Charging
• Reservation charge
• Usage based charging
• Congestion charging
• Pre-emption vs. selling back unused bandwidth
Reservation Styles
• Fixed filter
  • Distinct reservations from each source to dest
• Shared Explicit
  • Specified sources share the same reservation
• Wildcard filter
  • All sources share the same reservation
IP-SEC
• DS codepoint not encrypted
• Incoming IP-SEC packets can’t be multi-field classified
• May have to rely on DS codepoint marking in hosts
• IP-SEC hides dst address? Difficult to provide quantitative services without knowing the egress point.
DSCP remarking between domains
• Diagram: two Diff-serv backbones, each with IR, CR and ER, joined by BRs; RSVP stubs with R1, R6, H1 and H2 at the edges
Controlled Load
• Approximates BE under unloaded conditions
• Most packets receive little/no queuing delay, but there are no delay/jitter assurances
• No micro-flow packet reordering
• <r,b,p,M,m>
• Out-of-profile traffic shares BE
Guaranteed Service
• Firm guarantees on delay bound and minimum bandwidth
• No jitter, min or avg delay assurances
• Apps with hard real-time requirements
• Out-of-profile gets dropped
Assured Forwarding Group
• Four classes
• Each class receives a certain amount of resources
• Within each class there are three drop precedences
• No micro-flow packet reordering
• No delay/jitter assurances
• <r,b,p,M,m>
• Approximates Controlled Load
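The four classes and three drop precedences give twelve codepoints, usually written AFxy. The sketch below assumes the recommended codepoint values of the AF PHB specification (RFC 2597), in which class x and drop precedence y map to DSCP 8x + 2y; the function name `af_dscp` is hypothetical.

```c
/* Recommended DSCP for AF class af_class (1..4) and drop precedence
 * drop_prec (1..3), assuming the RFC 2597 layout: the class sits in the
 * top three bits and the drop precedence in the next two.
 * Returns -1 for arguments outside those ranges. */
int af_dscp(int af_class, int drop_prec)
{
    if (af_class < 1 || af_class > 4 || drop_prec < 1 || drop_prec > 3)
        return -1;
    return 8 * af_class + 2 * drop_prec;
}
```

Under this assumption AF11 is codepoint 10 and AF43 is codepoint 38, so a core router can recover both the class and the drop precedence from the single DSCP field.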
(Weighted) Random Early Detection
• Figure: queue with a Min Threshold (p=0) and a Max Threshold (p=1)
• P is a function of mean queue length and time since the last packet drop.
• P increases slowly as the queue fills up
• Flows lose packets in proportion to their allocated bandwidth
• RIO: RED applied to in-profile and out-of-profile queues separately
• Weights (drop probabilities) based on IP precedence
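The drop decision can be sketched with the classic RED formulas. This is an illustration under assumptions, not the slide's exact method: "time since the last packet drop" is modeled as the number of packets accepted since the last drop (`count`), `max_p` is the probability reached just below the max threshold, and all names are hypothetical.

```c
/* RED parameters. Names are illustrative, not from the slides. */
typedef struct {
    double min_th;  /* below this mean queue length nothing is dropped */
    double max_th;  /* at or above this mean queue length everything is */
    double max_p;   /* drop probability just below max_th */
} red_params;

/* Exponentially weighted moving average of the queue length. */
double red_update_avg(double avg, double qlen, double w)
{
    return (1.0 - w) * avg + w * qlen;
}

/* Drop probability for mean queue length avg, given count packets
 * accepted since the last drop. */
double red_drop_prob(const red_params *p, double avg, unsigned count)
{
    double pb, denom, pa;

    if (avg < p->min_th)
        return 0.0;
    if (avg >= p->max_th)
        return 1.0;

    /* Base probability rises linearly between the two thresholds. */
    pb = p->max_p * (avg - p->min_th) / (p->max_th - p->min_th);

    /* Spread drops out: the longer since the last drop, the likelier. */
    denom = 1.0 - (double)count * pb;
    if (denom <= 0.0)
        return 1.0;
    pa = pb / denom;
    return pa > 1.0 ? 1.0 : pa;
}
```

A weighted variant would keep one `red_params` per precedence level, and RIO would keep one for in-profile and one for out-of-profile packets.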
Slots
• Figure: bandwidth vs. time showing allocated and unallocated capacity and new intervals, alongside demand vs. time
Intervals
• Figure: bandwidth vs. time showing allocated and unallocated capacity and new intervals, alongside demand vs. time
    hEvent = WSACreateEvent();
    /* associate the socket's QoS and read notifications with the event */
    WSAEventSelect(s, hEvent, FD_QOS | FD_READ);
    while (TRUE) {
        rc = WSAWaitForMultipleEvents(1, &hEvent, FALSE, timer, FALSE);
        switch (rc) {
        case WSA_WAIT_EVENT_0:
            WSAEnumNetworkEvents(s, hEvent, &events);
            if (events.lNetworkEvents & FD_QOS) {
                /* QoS changed: fetch the current settings */
                WSAIoctl(s, SIO_GET_QOS, NULL, 0, &Qos2, sizeof(Qos2), …);
            }
            if (events.lNetworkEvents & FD_READ) {
                WSARecvFrom(…
            }
            break;
        }
    }
Windows 2000 Flow Spec
    typedef struct _flowspec {
        uint32      TokenRate;           /* In Bytes/sec */
        uint32      TokenBucketSize;     /* In Bytes */
        uint32      PeakBandwidth;       /* In Bytes/sec */
        uint32      Latency;             /* In microseconds */
        uint32      DelayVariation;      /* In microseconds */
        SERVICETYPE ServiceType;         /* Guaranteed, Predictive, Best Effort, etc. */
        uint32      MaxSduSize;          /* In Bytes */
        uint32      MinimumPolicedSize;  /* In Bytes */
    } FLOWSPEC;
Services
• Qualitative
  • relative, measured by comparison
  • “Traffic in service B has a lower drop probability than traffic in service A”
  • arbitrary end points “A to ?”
  • harder to manage & predict
• Quantitative
  • measured independent of other services
  • “90% of traffic delivered at service level C will experience no more than 50 msec delay”
  • between specific end points “A to B”
  • easier to manage and predict
Path: H1 – BR – IR – CR – ER – BR – H2 (RSVP region, aggregate region, RSVP region)
• RSVP PATH from H1 to H2. Using RSVP signaling to establish aggregate reservation.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec>
• The IP protocol number of microflow RSVP packets is remarked from RSVP to RSVP_E2E_IGNORE by the ingress router. Router Alert is set, but CR doesn’t recognize RSVP_E2E_IGNORE.
• Signal = <src IP, src port, dest IP, dest port, RSVP_E2E_IGNORE, Tspec>
• Egress router recognizes RSVP_E2E_IGNORE and resets IP Protocol to RSVP. Packets can now reserve the rest of the path. IR is seen as the previous hop, so RESV will be addressed to IR.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec>
• Tspecs/Flowspecs for microflows are summed. An RSVP packet for the aggregate is passed from IR to ER, which CR notices. The DSCP is included.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec, DSCP>
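Summing microflow Tspecs into one aggregate Tspec can be sketched as below. The rule used here is the one from the Integrated Services specifications as commonly stated, and should be read as an assumption: rates, bucket depths and peak rates add, while the aggregate takes the smallest minimum policed unit m and the largest maximum packet size M. The names `tspec` and `tspec_sum` are hypothetical.

```c
#include <stddef.h>

/* A Tspec <r, b, p, m, M>. Field names mirror the slide's symbols. */
typedef struct {
    double r;    /* token rate */
    double b;    /* token bucket depth */
    double p;    /* peak rate */
    unsigned m;  /* minimum policed unit */
    unsigned M;  /* maximum packet size */
} tspec;

/* Sum n microflow Tspecs into one aggregate Tspec.
 * For n == 0 the result is all zeros. */
tspec tspec_sum(const tspec *flows, size_t n)
{
    tspec agg = {0.0, 0.0, 0.0, 0, 0};

    for (size_t i = 0; i < n; i++) {
        agg.r += flows[i].r;
        agg.b += flows[i].b;
        agg.p += flows[i].p;
        if (i == 0 || flows[i].m < agg.m)
            agg.m = flows[i].m;      /* smallest policed unit wins */
        if (flows[i].M > agg.M)
            agg.M = flows[i].M;      /* largest packet size wins */
    }
    return agg;
}
```

The ingress router would recompute this sum whenever a microflow reservation is added or removed, and signal the aggregate only when the result changes enough to matter.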
• Remote host sends a RESV to allocate resources, hop-by-hop.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec>
• ER sends an aggregate RESV to IR. CR routers perform normal RESV admission control.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec, DSCP>
• ER sends the microflow RESV to IR (previous hop).
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec, DSCP>
• IR establishes the reservation and sends the microflow RESV on to the source.
• Signal = <src IP, src port, dest IP, dest port, RSVP, Tspec>