Queuing Analysis
There are four approaches to projecting performance:
• Do an after-the-fact analysis based on actual values.
• Make a simple projection by scaling up from existing experience to the expected future environment.
• Develop an analytic model based on queuing theory.
• Program and run a simulation model.
Option 1 is no option at all: we simply wait and see what happens. Option 2 sounds more promising; the analyst may take the position that it is impossible to project future demand with any degree of certainty, so a rough scaling from existing experience is the best that can be done. Option 3 makes use of an analytic model: one that can be expressed as a set of equations that can be solved to yield the desired parameters. The final approach is a simulation model; given a sufficiently powerful and flexible simulation programming language, the analyst can model reality in great detail and avoid many of the assumptions required by queuing theory.
QUEUING MODELS: The Single-Server Queue
The central element of the system is a server, which provides some service to items. Items from some population of items arrive at the system to be served. If the server is idle, an arriving item is served immediately; otherwise, it joins a waiting line. When the server has completed serving an item, the item departs, and if there are items waiting in the queue, one is immediately dispatched to the server. Examples:
• A processor provides service to processes.
• A transmission line provides a transmission service to packets or frames of data.
• An I/O device provides a read or write service for I/O requests.
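To make option 4 above concrete, here is a minimal sketch of a discrete-event simulation of the single-server FIFO queue just described. The exponential interarrival and service times (i.e., an M/M/1 system), the parameter values, and the function name are illustrative assumptions, not part of the original slides:

```python
import random

def simulate_mm1(arrival_rate, service_rate, num_items=100_000, seed=1):
    """Discrete-event simulation of a single-server FIFO queue (M/M/1).

    Items arrive with exponential interarrival times; a free server takes
    the next waiting item immediately, as described above. Returns the
    mean time an item spends in the system (residence time).
    """
    rng = random.Random(seed)
    clock = 0.0            # arrival clock
    server_free_at = 0.0   # time the server finishes its current item
    total_residence = 0.0
    for _ in range(num_items):
        clock += rng.expovariate(arrival_rate)     # next arrival time
        start = max(clock, server_free_at)         # wait if server is busy
        server_free_at = start + rng.expovariate(service_rate)
        total_residence += server_free_at - clock  # waiting + service
    return total_residence / num_items

# Illustrative run at utilization 0.8; theory predicts Tr = Ts/(1 - rho) = 5.0
print(simulate_mm1(arrival_rate=0.8, service_rate=1.0))
```

Even this small model illustrates the trade-off from the previous slide: the simulation makes no use of queuing formulas, at the cost of running many sampled items instead of solving an equation.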
Components of a Basic Queuing Process
[Figure: jobs from the calling population (the input source) enter the queuing system through an arrival process, wait in a queue governed by a queue discipline and queue configuration, are handled by the service mechanism (service process), and served jobs leave the system.]
The theoretical maximum input rate that can be handled by the system is λmax = 1/Ts, where Ts is the mean service time.
To proceed, we must make some assumptions about this model:
• Item population: Typically, we assume an infinite population. This means that the arrival rate is not altered by the loss of population. If the population is finite, then the population available for arrival is reduced by the number of items currently in the system; this would typically reduce the arrival rate proportionally.
• Queue size: Typically, we assume an infinite queue size, so the waiting line can grow without bound. With a finite queue, it is possible for items to be lost from the system. In practice, any queue is finite, but in many cases this makes no substantive difference to the analysis. We address this issue briefly below.
• Dispatching discipline: When the server becomes free, and if there is more than one item waiting, a decision must be made as to which item to dispatch next. The simplest approach is first-in, first-out; this discipline is what is normally implied when the term queue is used. Another possibility is last-in, first-out. One that you might encounter in practice is a dispatching discipline based on service time. For example, a packet-switching node may choose to dispatch packets on the basis of shortest first (to generate the most outgoing packets) or longest first (to minimize processing time relative to transmission time). Unfortunately, a discipline based on service time is very difficult to model analytically, although it is easy to implement (see the sketch after this list).
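As a minimal sketch of the service-time-based discipline from the last bullet, assume each waiting item carries a known service-time estimate (a hypothetical attribute, not something given on the slides):

```python
import heapq

class ShortestFirstQueue:
    """Dispatch the waiting item with the shortest service time first,
    e.g., a packet-switching node sending the shortest packets first."""

    def __init__(self):
        self._heap = []   # (service_time, seq, item); seq breaks ties
        self._seq = 0

    def enqueue(self, item, service_time):
        heapq.heappush(self._heap, (service_time, self._seq, item))
        self._seq += 1

    def dispatch(self):
        """Called when the server becomes free; returns the next item."""
        _, _, item = heapq.heappop(self._heap)
        return item

q = ShortestFirstQueue()
q.enqueue("long packet", service_time=9.0)
q.enqueue("short packet", service_time=1.5)
print(q.dispatch())  # -> "short packet"
```

Swapping the heap for a plain list popped from the front (FIFO) or the back (LIFO) gives the other two disciplines mentioned above.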
The Multiserver Queue
• If an item arrives and at least one server is available, the item is immediately dispatched to that server.
• If all servers are busy, a queue begins to form.
• As soon as one server becomes free, an item is dispatched from the queue using the dispatching discipline in force.
• If we have N identical servers, then ρ is the utilization of each server, and we can consider Nρ to be the utilization of the entire system.
• The theoretical maximum utilization is N × 100%, and the theoretical maximum input rate is λmax = N/Ts.
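For the multiserver case with Poisson arrivals and exponential service (M/M/N), the probability that an arriving item finds all servers busy and must wait is given by the Erlang C formula. A minimal sketch; the formula is standard queuing theory rather than something stated on the slide:

```python
from math import factorial

def erlang_c(N, arrival_rate, service_time):
    """Probability an arriving item must wait in an M/M/N queue.

    a = offered load in Erlangs; rho = per-server utilization (< 1),
    consistent with the maximum input rate N/Ts above.
    """
    a = arrival_rate * service_time
    rho = a / N
    assert rho < 1, "input rate exceeds the theoretical maximum N/Ts"
    top = (a ** N) / (factorial(N) * (1 - rho))
    bottom = sum(a ** k / factorial(k) for k in range(N)) + top
    return top / bottom

# Illustrative: 4 servers, lambda = 3 items/s, Ts = 1 s -> rho = 0.75
print(erlang_c(4, arrival_rate=3.0, service_time=1.0))
```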
Basic Queuing Relationships
The fundamental task of a queuing analysis is as follows. Given the following information as input:
• Arrival rate
• Service time
provide as output information concerning:
• Items waiting
• Waiting time
• Items in residence
• Residence time
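The outputs follow from the inputs via Little's formula (items in residence r = λTr, items waiting w = λTw). A minimal sketch for the single-server M/M/1 case, where Tr = Ts/(1 - ρ); the choice of M/M/1 is an assumption made here for illustration:

```python
def mm1_metrics(arrival_rate, service_time):
    """Given arrival rate (lambda) and service time (Ts) as input,
    return the four output quantities for an M/M/1 queue."""
    rho = arrival_rate * service_time      # server utilization
    assert rho < 1, "system is saturated"
    Tr = service_time / (1 - rho)          # residence time
    Tw = Tr - service_time                 # waiting time
    r = arrival_rate * Tr                  # items in residence (Little)
    w = arrival_rate * Tw                  # items waiting (Little)
    return {"waiting time": Tw, "items waiting": w,
            "residence time": Tr, "items in residence": r}

print(mm1_metrics(arrival_rate=0.8, service_time=1.0))
```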
Kendall’s notation
• Notation is X/Y/N, where X is the distribution of interarrival times, Y is the distribution of service times, and N is the number of servers.
• Common distributions:
• G = general distribution of interarrival times or service times
• GI = general distribution of interarrival times with the restriction that they are independent
• M = exponential distribution of interarrival times (Poisson arrivals) and service times
• D = deterministic arrivals or fixed-length service
So M/M/1 is a single-server queue with Poisson arrivals and exponential service times, and M/D/1 is a single-server queue with Poisson arrivals and fixed service times.
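The notation matters because the service-time distribution changes the answer: at the same utilization, M/D/1 has half the mean waiting time of M/M/1 (Tw = ρTs/(2(1 - ρ)) versus ρTs/(1 - ρ)). A quick check, using the standard formulas rather than anything given on the slide:

```python
def waiting_time(rho, Ts, kind="M/M/1"):
    """Mean waiting time for the two Kendall-notation examples above."""
    if kind == "M/M/1":   # exponential service times
        return rho * Ts / (1 - rho)
    if kind == "M/D/1":   # deterministic (fixed) service times
        return rho * Ts / (2 * (1 - rho))
    raise ValueError(kind)

for kind in ("M/M/1", "M/D/1"):
    print(kind, waiting_time(rho=0.8, Ts=1.0, kind=kind))  # 4.0 vs 2.0
```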
What Is Congestion?
• Congestion occurs when the number of packets being transmitted through the network approaches the packet-handling capacity of the network.
• Congestion control aims to keep the number of packets below the level at which performance falls off dramatically.
• A data network is a network of queues.
• Generally, 80% utilization is critical.
• Finite queues mean data may be lost.
Effects of Congestion
• Arriving packets are stored at input buffers.
• A routing decision is made.
• The packet moves to an output buffer.
• Packets queued for output are transmitted as fast as possible (statistical time division multiplexing).
• If packets arrive too fast to be routed, or to be output, buffers will fill.
• The node may have to discard packets.
• It can use flow control instead, but this can propagate congestion through the network.
Ideal Network Utilization
[Figure: throughput, delay, and power = throughput/delay as a function of offered load for an ideal network.]
Practical Performance
• The ideal case assumes infinite buffers and no overhead.
• In practice, buffers are finite.
• Overheads occur in exchanging congestion control messages.
Backpressure
• If a node becomes congested, it can slow down or halt the flow of packets from other nodes.
• This may mean that other nodes have to apply control to their incoming packet rates.
• The effect propagates back to the source.
• Can be restricted to the logical connections generating the most traffic.
• Used in connection-oriented networks that allow hop-by-hop congestion control (e.g., X.25).
Choke Packet
• A control packet generated at the congested node and sent back to the source node, e.g., the ICMP source quench message.
• Sent by a router or by the destination.
• The source cuts back its rate until it stops receiving source quench messages.
• Can be sent for every discarded packet, or in anticipation of congestion.
• A rather crude mechanism.
Implicit Congestion Signaling
• Transmission delay may increase with congestion, or a packet may be discarded.
• The source can detect these as implicit indications of congestion.
• Useful on connectionless (datagram) networks, e.g., IP-based networks.
• (TCP includes congestion and flow control; see Chapter 20.)
• Also used in frame relay LAPF.
Explicit Congestion Signaling
• The network alerts end systems of increasing congestion, and the end systems take steps to reduce the offered load.
• Backward: congestion avoidance in the opposite direction of the signaling packet (toward the source).
• Forward: congestion avoidance in the same direction as the signaling packet (toward the destination); the destination either echoes the signal back to the source, or an upper-layer protocol performs flow control.
Categories of Explicit Signaling
• Binary: a bit set in a packet indicates congestion.
• Credit-based: indicates how many packets the source may send; common for end-to-end flow control.
• Rate-based: supplies an explicit data rate limit, e.g., in ATM.
Traffic Management
• Fairness
• Quality of service: may want different treatment for different connections
• Reservations, e.g., in ATM: a traffic contract between user and network
Congestion Control in Packet-Switched Networks
• Send a control packet (e.g., a choke packet) to some or all source nodes; requires additional traffic during congestion.
• Rely on routing information; may react too quickly.
• Use end-to-end probe packets; adds to overhead.
• Add congestion information to packets as they cross nodes, either backwards or forwards.
Frame Relay Congestion Control Objectives
• Minimize discards
• Maintain agreed QoS
• Minimize the probability that one end user can monopolize network resources
• Be simple to implement, with little overhead on network or user
• Create minimal additional traffic
• Distribute resources fairly
• Limit the spread of congestion
• Operate effectively regardless of traffic flow
• Have minimum impact on other systems
• Minimize variance in QoS
Techniques
• Discard strategy
• Congestion avoidance with explicit signaling
• Congestion recovery with implicit signaling mechanisms
Traffic Rate Management
• Without rate management, the network must discard frames arbitrarily to cope with congestion, with no regard for the source; there is no reward for restraint, so end systems transmit as fast as possible.
• Committed information rate (CIR): data in excess of this rate is liable to discard; it is not guaranteed; the aggregate CIR should not exceed the physical data rate.
• Committed burst size (Bc) and excess burst size (Be) bound how much data may be sent in a measurement interval (a sketch of the enforcement follows).
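A minimal sketch of how a frame handler might apply CIR, Bc, and Be over a measurement interval T = Bc/CIR: frames within Bc are forwarded, frames within Bc + Be are marked discard-eligible (DE), and the rest are discarded. The function name and the per-interval bit-counting details are illustrative assumptions:

```python
def classify_frame(bits_so_far, frame_bits, Bc, Be):
    """Classify one arriving frame within the current measurement interval.

    bits_so_far: bits already accepted this interval (T = Bc / CIR).
    Returns the action taken and the updated bit count.
    """
    total = bits_so_far + frame_bits
    if total <= Bc:
        return "forward", total        # within committed burst
    if total <= Bc + Be:
        return "mark DE", total        # discard-eligible under congestion
    return "discard", bits_so_far      # beyond excess burst

count = 0
for frame_bits in (4000, 4000, 4000, 4000):
    action, count = classify_frame(count, frame_bits, Bc=8000, Be=4000)
    print(action)
# -> forward, forward, mark DE, discard
```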
Explicit Signaling (Frame Relay)
• The network alerts end systems of growing congestion, using backward explicit congestion notification and forward explicit congestion notification.
• The frame handler monitors its queues and may notify some or all logical connections.
• User response: reduce the transmission rate.
Introduction
• TCP flow control
• TCP congestion control
• Performance of TCP over ATM
TCP Flow Control
• Uses a form of sliding window.
• Differs from the mechanism used in LLC, HDLC, X.25, and others: it decouples acknowledgement of received data units from granting permission to send more.
• TCP's flow control is known as a credit allocation scheme: each transmitted octet is considered to have a sequence number.
TCP Header Fields for Flow Control
• Sequence number (SN) of the first octet in the data segment
• Acknowledgement number (AN)
• Window (W)
• An acknowledgement containing AN = i, W = j means: all octets through SN = i - 1 are acknowledged, and permission is granted to send j more octets, i.e., octets i through i + j - 1.
Credit Allocation Is Flexible
Suppose the last message B issued was AN = i, W = j.
• To increase credit to k (k > j) when no new data has arrived, B issues AN = i, W = k.
• To acknowledge a segment containing m octets (m < j) without granting additional credit, B issues AN = i + m, W = j - m.
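A worked sketch of the receiver-side bookkeeping for the two rules above; the class and its method names are hypothetical, not a real TCP implementation:

```python
class CreditReceiver:
    """Tracks the (AN, W) pairs a receiver B would issue."""

    def __init__(self, an, w):
        self.an, self.w = an, w   # last issued AN = i, W = j

    def grant_more_credit(self, k):
        """Increase credit to k (k > w) with no new data: AN = i, W = k."""
        self.w = k
        return self.an, self.w

    def acknowledge(self, m):
        """Acknowledge m octets (m < w) without raising credit:
        AN = i + m, W = j - m; the window's upper edge stays at i + j."""
        self.an += m
        self.w -= m
        return self.an, self.w

b = CreditReceiver(an=1000, w=1400)
print(b.acknowledge(200))         # -> (1200, 1200)
print(b.grant_more_credit(1400))  # -> (1200, 1400)
```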
Credit Policy
• The receiver needs a policy for how much credit to give the sender.
• Conservative approach: grant credit only up to the limit of available buffer space; this may limit throughput in long-delay situations.
• Optimistic approach: grant credit based on an expectation of freeing buffer space before the data arrives.
Effect of Window Size
Let W = TCP window size (octets), R = data rate (bps) at the TCP source, and D = propagation delay (seconds).
• After the TCP source begins transmitting, it takes D seconds for the first octet to arrive at the destination, and another D seconds for the acknowledgement to return.
• During that 2D-second round trip, the source could transmit at most 2RD bits, or RD/4 octets.
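The window therefore limits throughput whenever W is smaller than the RD/4 octets that could be in flight. A minimal sketch of the resulting normalized throughput, using the standard result S = 1 if W >= RD/4, else 4W/RD (implied by, though not stated on, the slide):

```python
def normalized_throughput(W, R, D):
    """Normalized TCP throughput as limited by window size.

    W: window size (octets), R: data rate (bps), D: one-way delay (s).
    At most 2RD bits = RD/4 octets fit in one 2D-second round trip.
    """
    max_in_flight = R * D / 4   # octets per round trip
    return 1.0 if W >= max_in_flight else 4 * W / (R * D)

# Illustrative: 64 KB window, 10 Mbps path, 50 ms one-way delay -> ~0.52
print(normalized_throughput(W=65535, R=10_000_000, D=0.05))
```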