QoS in Clustered Environments Ahmad Faraj Faraj@cs.fsu.edu
Overview • Introduction • Routing Mechanisms • Approaches to QoS in Clusters • Conclusions
Introduction • Networked applications inject different mixes of traffic into the network. • Some classes of traffic require QoS treatment. • The traditional best-effort model cannot meet such QoS demands.
Cluster Systems • Cost-effective for high-performance environments • Used in scientific computing, web servers, multimedia servers, commercial applications • Two switch/router designs are used to build clusters: • Virtual cut-through (includes wormhole) • Packet switching
Virtual cut-through • Designed for multicomputers • Offers low latency and high bandwidth for best-effort traffic • To support QoS, the switch must be modified • Packet switching (e.g., ATM): • QoS support is available for real-time traffic • Cannot handle best-effort traffic efficiently due to high message latency (compared to virtual cut-through)
Bottom Line • Must reevaluate and optimize the network architecture to handle both types of traffic, best-effort and QoS, in clustered environments.
Routing mechanisms • Virtual cut-through & wormhole: • A packet is composed of small flits • A header flit leads and the middle flits follow in a pipelined manner • Once the header is received at a switch, it is forwarded to the outgoing channel • If the channel is busy: • Virtual cut-through: store the whole packet at the switch • Wormhole: store a few flits at each of several switches • Each worm carries routing info: • Can support multiple connections on a virtual channel
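The difference in how the two schemes buffer a blocked packet can be sketched as follows. This is a minimal illustration, not a router model: the function name `buffered_flits`, the `flit_buffer` size, and the assumption that the path behind the blocked switch is long enough to hold the whole worm are all hypothetical.

```python
def buffered_flits(packet_flits, mode, flit_buffer=2):
    """Return how many flits each switch holds when the header is blocked.

    Index 0 is the switch where the header is blocked; later entries are
    the switches behind it on the path. `flit_buffer` (an assumed value)
    is the per-switch flit buffer used in wormhole mode.
    """
    if mode == "virtual_cut_through":
        # The whole packet is collected at the blocked switch.
        return [packet_flits]
    if mode == "wormhole":
        # The worm stays spread out: each switch keeps only a few flits.
        held = []
        remaining = packet_flits
        while remaining > 0:
            take = min(flit_buffer, remaining)
            held.append(take)
            remaining -= take
        return held
    raise ValueError(f"unknown mode: {mode}")
```

For example, an 8-flit packet with a 3-flit buffer occupies one switch under virtual cut-through but three switches (and their channels) under wormhole routing, which is why a blocked worm can block other traffic.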
Virtual channels and physical links are shared resources • Real-time applications require predictable scheduling of such resources • Must enforce a global priority ordering among competing messages • Example of limitation: • Assume a message with the highest priority p at time t occupies a virtual channel • If another message arrives with priority p' > p, it must wait until the priority-p message releases the channel • Limitation: with v virtual channels, only v levels of priority ordering can be enforced, although there may be more message priority levels
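The limitation in the example above can be reproduced with a small sketch. It assumes a hypothetical `Link` class in which a larger number means a higher priority and a virtual channel, once granted, is held until it is explicitly released.

```python
class Link:
    """A physical link with a fixed number of virtual channels (no preemption)."""

    def __init__(self, num_vcs):
        # Each entry holds the priority of the message occupying that
        # virtual channel, or None if the channel is free.
        self.vcs = [None] * num_vcs

    def request(self, priority):
        """Grant a free virtual channel, or return None if the message must wait."""
        for i, holder in enumerate(self.vcs):
            if holder is None:
                self.vcs[i] = priority
                return i
        # All channels are busy: the message waits regardless of its priority.
        return None

    def release(self, vc):
        self.vcs[vc] = None
```

With `Link(1)`, a message of priority 1 that arrives first blocks a later message of priority 5 until it releases the channel, so the priority ordering is violated as soon as the number of competing priority levels exceeds the number of virtual channels.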
Pipelined circuit switching (PCS): • Similar to wormhole in terms of flits • Connection oriented: the header flit tries to reserve the path first • If the path is blocked, the header must backtrack and find another • Middle flits follow if a path is available • If not, the connection is dropped
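The header's path search can be sketched as a depth-first probe with backtracking. This is only an illustration under stated assumptions: the topology is given as a hypothetical adjacency dictionary `graph`, blocked channels are listed in a hypothetical set `busy`, and real PCS routers typically limit how far the header may misroute, which is not modeled here.

```python
def reserve_path(graph, busy, src, dst):
    """Probe for a free path from src to dst, backtracking on blocked links.

    graph: dict mapping a node to the list of its neighbor nodes
    busy:  set of (node, neighbor) links that are currently blocked
    Returns the reserved path as a list of nodes, or None if no path can
    be reserved (the connection is dropped).
    """
    path = [src]
    visited = {src}

    def probe(node):
        if node == dst:
            return True
        for nxt in graph.get(node, []):
            if nxt in visited or (node, nxt) in busy:
                continue
            visited.add(nxt)
            path.append(nxt)      # header advances and reserves the link
            if probe(nxt):
                return True
            path.pop()            # header backtracks and frees the link
        return False

    return path if probe(src) else None
```

Data flits would be injected only after `reserve_path` returns a path; a `None` result corresponds to the dropped connection in the last bullet.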
Classification of QoS Approaches • Virtual circuits: • Paths are virtualized & controlled locally at switches • Based on QoS parameters, a separate VC is created for which buffer space and link bandwidth are reserved • To guarantee end-to-end QoS, switches are responsible for scheduling packets • Flexible in terms of providing QoS • Requires large buffers and complex scheduling algorithms • Increases hardware complexity of switches
Physical circuits: • No virtualization, so switch design is simpler • A link arbitration policy is used to implement some control over delay and bandwidth • The policy merges multiple streams at a physical link • This causes coupling between streams sharing the link • A QoS stream depends on the other traffic flows sharing the link • Inflexible in managing network resources to support QoS
Global Scheduling: • Complexity is moved out of the switches and into the network interfaces • Switches are much simpler and faster • Network interfaces are augmented with special hardware responsible for: • Routing • Timing packets injected into the network • Negotiation of shared resources with other NICs • Relatively new approach • Practicality, scalability, cost of synchronization, and scheduling remain open issues
QoS in Packet Switching Networks • Rotating Combined Queuing (RCQ): • Low-cost queuing & scheduling algorithm • Provides QoS support in multicomputers and point-to-point LANs • Switch model supports: • Connection-based switching: decide routing and reserve bandwidth at connection setup time • Output queuing: packets arriving simultaneously at an output link are queued and scheduled for transmission (reduces head-of-line blocking)
RCQ: • Reduces traffic cost by combining multiple decoupled queues • Combines the queues allocated to a few connections with small traffic and large delay bounds • Supports best-effort traffic using multiple FIFO queues per port • Uses frame-based scheduling • Each connection is allocated a number of packet slots in a period of time • Extra queues enable a sender to send at a rate higher than reserved • The queuing structure allows real-time traffic to bypass best-effort traffic • Permits best-effort traffic to use bandwidth left unused by other connections
How Does It Work? • Enqueue each arriving packet into the queue pointed to by the current input queue pointer of its connection • If the maximum number of packets allowed per connection is reached in the current queue, move the pointer to the next queue • On each idle cycle of the output channel, send any pending packet • If there are no packets to transmit, move the output queue pointer to the next queue and do the same • Idle connections set their input queue pointer to always point to the current output queue • If a QoS packet arrives, it is enqueued in the queue pointed to by the current output queue pointer, incurring only the delay of the packets in front of it in that queue • Guarantees a worst-case delay of one frame time • End-to-end worst-case delay is bounded by the distance multiplied by the frame time
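The pointer mechanics described above can be sketched for a single output port. This is a simplified model and not the RCQ hardware design: the class name `RCQPort`, the per-connection quota table `slots_per_frame`, and the use of unbounded round numbers in place of a fixed ring of queues are assumptions made for illustration, and `delay_bound` assumes "distance" means the number of hops.

```python
from collections import defaultdict, deque


class RCQPort:
    """Simplified single-port model of rotating queues (illustrative only)."""

    def __init__(self, slots_per_frame):
        self.slots = dict(slots_per_frame)  # connection -> packets allowed per queue
        self.queues = defaultdict(deque)    # round number -> FIFO of (conn, packet)
        self.out_ptr = 0                    # queue currently being drained
        self.in_ptr = defaultdict(int)      # connection -> queue it enqueues into
        self.used = defaultdict(int)        # connection -> slots used in that queue

    def enqueue(self, conn, packet):
        # An idle (lagging) connection catches up to the current output queue.
        if self.in_ptr[conn] < self.out_ptr:
            self.in_ptr[conn] = self.out_ptr
            self.used[conn] = 0
        # Quota reached in the current queue: move on to the next queue.
        if self.used[conn] >= self.slots[conn]:
            self.in_ptr[conn] += 1
            self.used[conn] = 0
        self.queues[self.in_ptr[conn]].append((conn, packet))
        self.used[conn] += 1

    def transmit(self):
        """One output cycle: send one packet, rotating past empty queues."""
        if not any(self.queues.values()):
            return None
        while not self.queues[self.out_ptr]:
            del self.queues[self.out_ptr]
            self.out_ptr += 1
        return self.queues[self.out_ptr].popleft()


def delay_bound(hops, frame_time):
    """End-to-end worst-case delay bound: distance (in hops) times frame time."""
    return hops * frame_time
```

Because `transmit` rotates to the next queue as soon as the current one is empty, packets that exceed a connection's reservation are still sent early whenever the link would otherwise be idle, which matches the bandwidth-reuse behavior described for RCQ.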
QoS in PCS Networks • Wormhole switching may suffer from message blockage while PCS does not • PCS is connection oriented • Can reserve bandwidth at connection setup • Requires a VC per connection • Thus, it demands a large number of virtual channels per PC for high link bandwidth • Switch hardware must support VCs ≥ max simultaneous streams in the network • Otherwise, new connections are not guaranteed • Streams may be dropped
To support QoS in PCS, use a preemption protocol for real-time traffic • Higher-priority messages can preempt lower-priority messages on a virtual channel • Blockage only occurs for a low-priority message competing with a high-priority one
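A preemptive grant rule of this kind can be sketched as below. The function name `request_vc` and the convention that a larger number means a higher priority are assumptions; what happens to the preempted message afterwards (retry, drop, or re-route) is not specified in the slides and is left to the caller.

```python
def request_vc(vcs, priority):
    """Try to obtain a virtual channel, preempting a lower-priority holder.

    vcs: list in which each entry is the priority of the message holding
         that virtual channel, or None if the channel is free (mutated in place).
    Returns (vc_index, preempted_priority). preempted_priority is None when
    a free channel was used; (None, None) means the request is blocked.
    """
    # Prefer a free channel if one exists.
    for i, holder in enumerate(vcs):
        if holder is None:
            vcs[i] = priority
            return i, None
    # Otherwise look at the lowest-priority holder.
    victim = min(range(len(vcs)), key=lambda i: vcs[i])
    if vcs[victim] < priority:
        preempted = vcs[victim]
        vcs[victim] = priority
        return victim, preempted
    # Every holder has equal or higher priority: the request is blocked.
    return None, None
```

With channels held at priorities `[3, 1]`, a priority-5 request takes over the channel held at priority 1, while a later priority-2 request is blocked, so only lower-priority traffic ever waits.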
QoS in Wormhole-Switched Networks • SuperNet project: • QoS using a separate subnet: • Costly in terms of the number of host interfaces • Imposing a synchronous structure over an asynchronous network: • Large overhead for small messages • Virtual channels: • Better than the two above • Require complex scheduling and buffer space at switches
Continued • MediaWorm: • Wormhole-based router to support QoS • Supports two traffic classes: best-effort and QoS • Instead of FIFO, uses a rate-based algorithm called Virtual Clock to schedule network resources • Virtual Clock regulates the bandwidth of each connection by assigning it a virtual clock value, vtick, that ticks at each packet arrival • Higher bandwidth is represented by a smaller vtick
Example: • A message requires 50K flits/s • The header flit carries a vtick set to 1/50K • The header flit requests this value at every router it passes until it reaches the destination • Thus, no explicit connection setup is needed • For best-effort traffic, vtick is set to ∞ since it has the maximum slack • The Virtual Clock algorithm can improve the QoS delivered to real-time traffic compared to FIFO
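The scheduling idea in this example can be sketched with the general Virtual Clock rule: on each arrival a connection's clock advances by its vtick, and the output serves the smallest stamp first. This is a hedged illustration of that rule, not MediaWorm's actual router logic; the class `VirtualClockScheduler` and its method names are hypothetical.

```python
import heapq
import itertools
import math


class VirtualClockScheduler:
    """Serve flits in order of their virtual clock stamps (illustrative only)."""

    def __init__(self):
        self.clock = {}                # connection -> current virtual clock
        self.heap = []                 # (stamp, arrival order, conn, flit)
        self.order = itertools.count() # tie-breaker for equal stamps

    def arrive(self, conn, vtick, now, flit):
        # The connection's clock ticks by vtick on every arrival, but never
        # lags behind real time, so idle periods do not accumulate credit.
        stamp = max(now, self.clock.get(conn, 0.0)) + vtick
        self.clock[conn] = stamp
        heapq.heappush(self.heap, (stamp, next(self.order), conn, flit))

    def next_flit(self):
        """Return the (conn, flit) with the smallest stamp, or None if empty."""
        if not self.heap:
            return None
        _, _, conn, flit = heapq.heappop(self.heap)
        return conn, flit


# Usage: a 50K flits/s stream competes with best-effort traffic.
sched = VirtualClockScheduler()
sched.arrive("best_effort", math.inf, 0.0, "b0")  # vtick = infinity
sched.arrive("video", 1 / 50_000, 0.0, "v0")
sched.arrive("video", 1 / 50_000, 0.0, "v1")
served = [sched.next_flit() for _ in range(3)]
```

Here the two video flits are served before the best-effort flit even though the best-effort flit arrived first, because an infinite vtick always yields the largest stamp.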
MediaWorm can achieve as good performance as a PCS router without dropping any connections • PCS is expected to perform better since it is connection oriented. Yet, dropping of connections occurs
Conclusions • For cluster systems, wormhole-like routing seems to be popular • Supporting QoS is a challenge • Several approaches were reviewed • Using virtual channels with a preemptive protocol to enforce priority among network traffic is a promising technique