Web Performance Analysis Using Queuing Theory: Predictions and Optimization

Queuing Theory

Web Performance • We talked about the importance of performance in web applications • It is not easy to separate the different components of response time • It is important to be able to make predictions from information that is easy to gather.

Approaches to performance analysis • Try new configuration and observe results • Scale up from prior results • Arrival rate .5, mean 43, max 627 • Arrival rate .95, mean 1859, max 4285 • Arrival rate .99, mean 2583, max 5068 • Develop queuing theoretic model • Run a simulation

Predicting response from past experience

Queue Model

Queue Statistics • Arrival rate λ = load (Poisson arrivals) • Packet length distribution: exponential • E(m) = expected value (mean) of random variable m • Ts = mean service time, Tq = mean queue time • σ = standard deviation of service time • ρ = link utilization

Practical Example Tw =Tr =Ts

Kendall Notation • A/S/NS/B/K/SD • A,S=Interarrival time, service time distribution • M = Exponential • Ek = Erlang • Hk = Hyperexponential • D = Deterministic • NS=Number of servers • B=Number of Buffers • K=Population size

Kendall Notation • SD = Service Discipline • FCFS, LCFS… • Defaults: B = ∞, K = ∞, SD = FCFS • M/M/1 = M/M/1/∞/∞/FCFS

Multiserver queue Model • Single web server forwards requests to cluster

Multiple Single-server queues DNS rotation for cluster

Fixed Packets • Average transmission time • μ = average packets serviced/sec = capacity • [Figure: buffer holding n packets, between arriving packets and departing packets]

Exponential Distribution • The cumulative distribution function F(x) and probability density function f(x) are: $F(x) = 1 - e^{-\mu x}$ and $f(x) = \mu e^{-\mu x}$ for $x \ge 0$, with mean $1/\mu$ • In queuing theory, we often assume the service time follows an exponential distribution. • The service time corresponds to the packet transmission time and is proportional to the packet length. • When the standard deviation is equal to the mean, we estimate the distribution to be exponential

Poisson Distribution It can be shown that for Poisson arrival process, the sequence of interarrival times Tn are independent identically distributed exponential random variables having mean 1/λ
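
One way to check this property numerically is to draw interarrival times from an exponential distribution and compare their sample mean and standard deviation with 1/λ. The sketch below uses only Python's standard library; the rate of 8 requests/sec, the sample count, and the function name are illustrative assumptions, not values from the slides.

```python
import random
import statistics

def sample_interarrivals(rate, count, seed=0):
    """Draw `count` exponential interarrival times for a Poisson process of the given rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.expovariate(rate) for _ in range(count)]

rate = 8.0  # assumed arrival rate in requests/sec (illustrative only)
gaps = sample_interarrivals(rate, 100_000)

mean_gap = statistics.fmean(gaps)
std_gap = statistics.pstdev(gaps)

# For an exponential distribution both values should be close to 1/rate.
print(f"mean interarrival = {mean_gap:.4f} s (theory {1 / rate:.4f} s)")
print(f"std deviation     = {std_gap:.4f} s")
```

The standard deviation landing close to the mean is the same signature that is used to judge whether measured service times look exponential.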

M/M/1 queue

Markov Models - analyze state n (n buffers are occupied) • [Figure: buffer occupancy over time, moving from n to n+1 on an arrival and from n to n-1 on a departure; states 0, 1, ..., n-1, n, n+1]

Probability of being in state n • Stayed in state n: none arrived or departed, or one arrived and one departed • Came from state n+1: none arrived, one departed • Came from state n-1: one arrived, none departed

Markov Chains • [Figure: state-transition diagram over states 0, 1, ..., n-1, n, n+1]

Substituting Utilization

Substituting P1 • Higher states have decreasing probability • Higher utilization causes higher probability of higher states

What about P0 • The state probabilities must sum to 1, which gives P0 = 1 - ρ • Queue determined by ρ = λ/μ
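
The substitution steps on the preceding slides can be written out as a short derivation. This is the standard M/M/1 steady-state argument, sketched under the assumption that λ denotes the arrival rate, μ the service rate, and ρ = λ/μ < 1 the utilization; the slide equations themselves did not survive in this transcript.

```latex
\begin{align*}
\lambda P_{n-1} &= \mu P_n
  && \text{(flow balance between states } n-1 \text{ and } n\text{)} \\
P_n &= \rho\, P_{n-1} = \rho^{n} P_0,
  && \rho = \lambda/\mu \\
\sum_{n=0}^{\infty} P_n &= \frac{P_0}{1-\rho} = 1
  && \Rightarrow\; P_0 = 1-\rho \\
P_n &= (1-\rho)\,\rho^{n}
\end{align*}
```

Because ρ < 1, each higher state is less likely than the one before it, and raising ρ shifts probability toward the higher states.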

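Little’s Law • Applies to systems where no jobs are lost or created • Arrival rate = arrivals/total time = N/Tt • Mean time in system = J/N, where J is the total time accumulated in the system by all jobs • Mean number in system = J/Tt = (arrival rate) × (mean time in system)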

J is the shaded area • Examine time up to D6 = 19 • T1 = 3, T2 = 7, T3 = 10, T4 = 6, T5 = 6, T6 = 6 • J = 38 • Arrival rate = N/Tt = 6/19 = 0.32 • Mean time in system = J/N = 38/6 = 6.3 • Mean number in system = J/Tt = 38/19 = 2 = (J/N)(N/Tt) = 6.3 × 0.32 = 2 = λTq
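
The arithmetic in this example can be replayed in a few lines. The sketch below uses the six times in system and the observation window of 19 from the example; the variable names are illustrative.

```python
# Time each of the six jobs spent in the system (T1..T6 from the example).
times_in_system = [3, 7, 10, 6, 6, 6]
total_time = 19  # observation window ends at D6 = 19

jobs = len(times_in_system)        # N
area = sum(times_in_system)        # J, total job-time accumulated
arrival_rate = jobs / total_time   # N / Tt
mean_time = area / jobs            # J / N
mean_number = area / total_time    # J / Tt

# Little's Law: mean number in system = arrival rate * mean time in system.
littles_product = arrival_rate * mean_time

print(f"arrival rate = {arrival_rate:.3f} jobs per time unit")
print(f"mean time in system = {mean_time:.2f}")
print(f"mean number in system = {mean_number:.2f} (product = {littles_product:.2f})")
```

The two ways of computing the mean number in system agree because the factor N cancels in (J/N)(N/Tt).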

Throughput • Throughput = utilization/service time • Throughput = ρ/Ts • For ρ = 0.5 and Ts = 1 ms • Throughput is 500 packets/sec

Single Server • [Figure: curves for σ(Ts) = 0 (constant service time), σ(Ts)/Ts = 1 (exponential service time), and σ(Ts)/Ts >> 1]

Example • Given things that you can easily measure • Arrival rate of 1000 packets/sec • Service capacity of 1100 packets/sec • Find • Utilization • Probability of having 4 buffers filled

Example
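
The worked answer for this example did not survive in the transcript. A minimal sketch of the calculation, assuming the M/M/1 model with state probabilities (1 - ρ)ρ^n, might look like the following; the function name is hypothetical.

```python
def mm1_state_probability(arrival_rate, service_rate, n):
    """Probability of n packets in an M/M/1 system: (1 - rho) * rho**n."""
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("queue is unstable when utilization >= 1")
    return (1 - rho) * rho ** n

arrival_rate = 1000.0  # packets/sec, from the example
service_rate = 1100.0  # packets/sec, from the example

utilization = arrival_rate / service_rate
p4 = mm1_state_probability(arrival_rate, service_rate, 4)

print(f"utilization = {utilization:.3f}")
print(f"P(4 buffers filled) = {p4:.4f}")
```

Both inputs are quantities that can be read off a running system, which is the point of the example: the state probabilities follow from the measured rates alone.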

Examples • Web site with 100 clients and 1 server. • Service time Ts = 0.6 sec, with standard deviation equal to the mean (exponential) • What is the average response time at 20 queries/min = 1 query per 3 sec? • Use M/M/1 model • ρ = λTs = (1/3)(0.6) = 0.2 • Tq = Ts/(1 - ρ) = 0.6/0.8 = 0.75 sec • If 1.5 seconds is too long, what utilization is allowable (90% of responses are less than 1.5 sec)? • mTr(r) = Tr × ln(100/(100 - r)) • mTr(90) = Tr × ln(10) = [Ts/(1 - ρ)] × 2.3 = 1.5 sec • 2.3 Ts = (1 - ρ) × 1.5 sec, so 1.5ρ = 1.5 - 2.3 × 0.6, giving ρ = 0.08 • At ρ = 0.2 the 90th percentile is 0.75 × 2.3 = 1.73 sec, so the utilization must actually be reduced
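
The percentile calculation above can be sketched in Python. The function names are hypothetical, and the code assumes the M/M/1 formulas quoted in the example: mean response time Ts/(1 - ρ) and r-th percentile equal to the mean times ln(100/(100 - r)).

```python
import math

def mm1_response_time(arrival_rate, service_time):
    """Mean response time Ts / (1 - rho) for an M/M/1 queue."""
    rho = arrival_rate * service_time
    if rho >= 1:
        raise ValueError("queue is unstable when utilization >= 1")
    return service_time / (1 - rho)

def percentile_response(mean_response, r):
    """r-th percentile of M/M/1 response time: mean * ln(100 / (100 - r))."""
    return mean_response * math.log(100 / (100 - r))

def max_utilization(service_time, target, r):
    """Largest rho that keeps the r-th percentile of response time at or below target."""
    return 1 - service_time * math.log(100 / (100 - r)) / target

service_time = 0.6       # seconds, from the example
arrival_rate = 1 / 3     # 20 queries/min expressed per second

mean_response = mm1_response_time(arrival_rate, service_time)
p90 = percentile_response(mean_response, 90)
allowed_rho = max_utilization(service_time, 1.5, 90)

print(f"mean response = {mean_response:.2f} s")
print(f"90th percentile = {p90:.2f} s")
print(f"allowable utilization = {allowed_rho:.3f}")
```

The allowable utilization comes out slightly below the rounded 0.08 in the example because the code uses ln(10) at full precision rather than 2.3.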

Multi-server System with separate queues (DNS Rotation) • Five processors (5 M/M/1), average service time = 0.1 sec. • Assume that the standard deviation of service time is observed to be 0.094 sec (approximately exponential service time). • Web page hits = 40 per second. • Separate Queue Approach • If requests are evenly distributed among the processors, then the load for each processor is 40/5 = 8 requests per second. Thus, • ρ = λTs = 8 × 0.1 = 0.8 • The residence time is then easily calculated: • Tr = Ts/(1 - ρ) = 0.1/0.2 = 0.5 seconds

Single Queue - Multiple server • (M/M/5) aggregate arrival rate of 40 requests per second. Utilization is still ρ = λTs/5 = 0.8 • To calculate the residence time, first find the Erlang C function by table lookup: for ρ = 0.8 and 5 servers, C = 0.554 • Tr = (C/N)(Ts/(1 - ρ)) + Ts = ((0.554)(0.1))/(5(1 - 0.8)) + 0.1 = 0.1554 • So the multiserver single queue reduces average residence time from 0.5 sec to 0.1554 sec, an improvement of more than a factor of 3. Looking at just the waiting time, the single-queue case is 0.0554 seconds compared to 0.4 seconds for multiple queues, a factor of about 7.
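
Instead of a table lookup, the Erlang C value can be computed directly. The sketch below derives it from the Erlang B recursion, a standard technique that the slides do not themselves describe, and then compares the shared-queue and separate-queue residence times for the five-server example; the function and variable names are illustrative.

```python
def erlang_c(servers, offered_load):
    """Probability that an arrival must wait in an M/M/N queue (Erlang C).

    offered_load is arrival rate * mean service time, in Erlangs.
    """
    if offered_load >= servers:
        raise ValueError("queue is unstable when offered load >= servers")
    blocking = 1.0  # Erlang B with zero servers
    for k in range(1, servers + 1):
        # Erlang B recursion: B(k) = a*B(k-1) / (k + a*B(k-1))
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1 - rho * (1 - blocking))

servers = 5
service_time = 0.1    # seconds
arrival_rate = 40.0   # requests/sec, aggregate

offered_load = arrival_rate * service_time  # 4 Erlangs
rho = offered_load / servers                # per-server utilization

c = erlang_c(servers, offered_load)
shared_wait = c * service_time / (servers * (1 - rho))
shared_residence = shared_wait + service_time

# Separate queues: each server is its own M/M/1 at the same utilization.
separate_residence = service_time / (1 - rho)
separate_wait = separate_residence - service_time

print(f"Erlang C = {c:.3f}")
print(f"shared queue:    wait {shared_wait:.4f} s, residence {shared_residence:.4f} s")
print(f"separate queues: wait {separate_wait:.4f} s, residence {separate_residence:.4f} s")
```

Computing C this way also makes it easy to try other server counts or loads that a printed table may not cover.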

Another Example • Arrival rate λ = 125 requests/sec • Service rate μ = 1/0.002 = 500 requests/sec • Gateway utilization ρ = λ/μ = 0.25 • Probability of n packets in gateway = (1 - ρ)ρ^n = 0.75 × (0.25)^n • Mean time spent in web server = Ts/(1 - ρ) = 0.002/(1 - 0.25) = 2.67 ms

Aggregation Results • If incoming distributions are exponential, output distribution will also be exponential. • The sum will tend to be smoothed (smaller variance) • These results do not apply to self-similar distributions.
