Queuing Theory
Web Performance
• We talked about the importance of performance in web applications.
• It is not easy to separate the different components of response time.
• It is important to be able to make predictions from information that is easy to gather.
Approaches to performance analysis
• Try a new configuration and observe the results
• Scale up from prior results
  • Arrival rate 0.5: mean 43, max 627
  • Arrival rate 0.95: mean 1859, max 4285
  • Arrival rate 0.99: mean 2583, max 5068
• Develop a queuing-theoretic model
• Run a simulation
Queue Statistics
• Arrival rate λ = load (Poisson)
• Packet length distribution: exponential
• E(m) = expected value (mean) of random variable m
• Ts = mean service time, Tq = mean queue time
• σTs = standard deviation of service time
• ρ = link utilization
Practical Example
• [Figure and formulas for Tw, Tr, and Ts not recovered]
Kendall Notation
• A/S/NS/B/K/SD
• A, S = interarrival time distribution, service time distribution
  • M = Exponential
  • Ek = Erlang
  • Hk = Hyperexponential
  • D = Deterministic
• NS = number of servers
• B = number of buffers
• K = population size
Kendall Notation
• SD = service discipline
  • FCFS, LCFS…
• Defaults: B = ∞, K = ∞, SD = FCFS
• M/M/1 = M/M/1/∞/∞/FCFS
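
The default rule above can be sketched in a few lines of Python. The function name `expand_kendall` and the `DEFAULTS` list are hypothetical names chosen for this illustration; the sketch assumes the defaults B = ∞, K = ∞, SD = FCFS and does not validate the distribution letters.

```python
# Defaults assumed for omitted trailing fields: B = infinite buffers,
# K = infinite population, SD = FCFS (hypothetical helper, not from the slides).
DEFAULTS = ["∞", "∞", "FCFS"]


def expand_kendall(short: str) -> str:
    """Expand a short Kendall descriptor such as 'M/M/1' to A/S/NS/B/K/SD."""
    fields = short.split("/")
    if not 3 <= len(fields) <= 6:
        raise ValueError("expected between 3 and 6 '/'-separated fields")
    # Append only the defaults for the fields that were left out.
    missing = DEFAULTS[len(fields) - 3:]
    return "/".join(fields + missing)


print(expand_kendall("M/M/1"))     # M/M/1/∞/∞/FCFS
print(expand_kendall("M/M/5/10"))  # M/M/5/10/∞/FCFS
```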
Multiserver queue Model
• Single web server forwards requests to cluster
Multiple Single-server queues
• DNS rotation for cluster
Fixed Packets
• Average transmission time = 1/μ
• μ = average packets serviced/sec = capacity
• [Figure: arriving packets → buffer holding n packets → departing packets]
Exponential Distribution
• The cumulative distribution function F(x) and probability density function f(x) are:
  F(x) = 1 − e^(−μx), f(x) = μe^(−μx), for x ≥ 0 (mean 1/μ)
• In queuing theory, we often assume the service time follows an exponential distribution.
• The service time corresponds to the packet transmission time and is proportional to the packet length.
• When the observed standard deviation is approximately equal to the mean, we treat the distribution as exponential.
Poisson Distribution
• It can be shown that for a Poisson arrival process with rate λ, the interarrival times Tn are independent, identically distributed exponential random variables with mean 1/λ.
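
A quick numerical check of this property, as a minimal sketch: it draws exponential interarrival gaps and compares the sample mean with 1/λ and the sample standard deviation with the mean. The rate of 1000 arrivals/sec, the sample size, and the seed are assumptions chosen only for illustration.

```python
import random
import statistics


def sample_interarrival_times(rate: float, count: int, seed: int = 1) -> list[float]:
    """Draw `count` exponential interarrival gaps for a Poisson process of the given rate."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(count)]


rate = 1000.0  # assumed arrival rate in arrivals/sec (illustrative value)
gaps = sample_interarrival_times(rate, 100_000)

mean_gap = statistics.fmean(gaps)
std_gap = statistics.pstdev(gaps)

# For an exponential distribution the mean is 1/rate and std/mean is 1.
print(f"mean gap   = {mean_gap:.6f} s (theory {1 / rate:.6f} s)")
print(f"std / mean = {std_gap / mean_gap:.3f} (theory 1.0)")
```

The second printed ratio is the same test suggested for measured service times: a ratio close to 1 is consistent with an exponential distribution.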
Markov Models
• Analyze state n (n buffers are occupied)
• [Figure: buffer occupancy over time; an arrival moves the system from state n to n+1, a departure from n to n-1; states 0, 1, ..., n-1, n, n+1]
Probability of being in state n
• Was in state n: one arrived and one departed
• Was in state n: none arrived or departed
• Was in state n+1: none arrived, one departed
• Was in state n-1: one arrived, none departed
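
The four cases can be combined into one balance equation. The slide's own equation did not survive extraction, so this is a sketch under stated assumptions: λ is the arrival rate, μ is the service rate, n ≥ 1, and the interval dt is short enough that at most one arrival and one departure occur.

```latex
\begin{aligned}
P_n(t+dt) ={}& P_n(t)\,\bigl[(\lambda\,dt)(\mu\,dt) + (1-\lambda\,dt)(1-\mu\,dt)\bigr] \\
 &+ P_{n+1}(t)\,(1-\lambda\,dt)(\mu\,dt) \\
 &+ P_{n-1}(t)\,(\lambda\,dt)(1-\mu\,dt)
\end{aligned}
```

In steady state, after dropping the dt² terms, this reduces to

```latex
(\lambda + \mu)\,P_n = \mu\,P_{n+1} + \lambda\,P_{n-1}
```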
Markov Chains
• [Figure: state-transition chain over states 0, 1, ..., n-1, n, n+1]
Substituting P1
• Higher states have decreasing probability
• Higher utilization causes higher probability of higher states
What about P0
• The state probabilities must sum to 1, which gives P0 = 1 − ρ
• Queue determined by ρ = λ/μ
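
The substitution and normalization steps can be written out as follows. This is a sketch assuming an M/M/1 queue with ρ = λ/μ < 1; the state-0 balance λP0 = μP1 is the standard starting point and is not shown in the surviving slide text.

```latex
\begin{aligned}
\lambda P_0 &= \mu P_1 \;\Rightarrow\; P_1 = \rho P_0 \\
P_n &= \rho^{\,n} P_0 \\
\sum_{n=0}^{\infty} P_n &= P_0 \sum_{n=0}^{\infty} \rho^{\,n} = \frac{P_0}{1-\rho} = 1
  \;\Rightarrow\; P_0 = 1-\rho \\
P_n &= (1-\rho)\,\rho^{\,n}
\end{aligned}
```

Because ρ < 1, each higher state is less likely than the one before it, and raising ρ shifts probability toward the higher states.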
Little’s Law
• Applies to systems where no jobs are lost or created
• Arrival rate = arrivals/total time = N/Tt
• Mean time in system = J/N
• Mean number in system = J/Tt = (J/N) × (N/Tt) = mean time in system × arrival rate
J is the shaded area
Examine time up to D6 = 19
T1 = 3, T2 = 7, T3 = 10, T4 = 6, T5 = 6, T6 = 6
J = 38
Arrival rate = N/Tt = 6/19 = 0.32
Mean time in system = J/N = 38/6 = 6.3
Mean number in system = J/Tt = 38/19 = 2 = (J/N) × (N/Tt) = 6.3 × 0.32 = 2 = λTq
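
The same arithmetic as a short Python sketch, using the six times in system and the observation window of 19 from the example. The variable names are chosen for this illustration.

```python
# Time each of the six jobs spent in the system (T1..T6 from the example).
times_in_system = [3, 7, 10, 6, 6, 6]
total_time = 19  # observation window, up to the departure D6

n_jobs = len(times_in_system)
J = sum(times_in_system)            # total job-time accumulated (the shaded area)

arrival_rate = n_jobs / total_time  # N / Tt
mean_time = J / n_jobs              # J / N
mean_number = J / total_time        # J / Tt

print(f"J = {J}")
print(f"arrival rate        = {arrival_rate:.3f}")
print(f"mean time in system = {mean_time:.3f}")
print(f"mean number         = {mean_number:.3f}")
# Little's Law: mean number in system = arrival rate x mean time in system
print(f"rate x mean time    = {arrival_rate * mean_time:.3f}")
```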
Throughput
• Throughput = utilization/service time
• Throughput = ρ/Ts
• For ρ = 0.5 and Ts = 1 ms
• Throughput is 500 packets/sec
Single Server
• [Figure: curves for σTs = 0 (constant service time), σTs/Ts = 1 (exponential service time), and σTs/Ts >> 1]
Example
• Given things that you can easily measure
  • Arrival rate of 1000 packets/sec
  • Service capacity of 1100 packets/sec
• Find
  • Utilization
  • Probability of having 4 buffers filled
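
One way to work this example, as a sketch: it assumes an M/M/1 queue and reads "4 buffers filled" as the probability of exactly n = 4 in the system, Pn = (1 − ρ)ρ^n. The function name is hypothetical.

```python
def mm1_state_probability(arrival_rate: float, service_rate: float, n: int) -> float:
    """Probability of exactly n items in an M/M/1 system: (1 - rho) * rho**n."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return (1 - rho) * rho ** n


arrival_rate = 1000.0  # packets/sec, from the example
service_rate = 1100.0  # packets/sec, from the example

utilization = arrival_rate / service_rate
p4 = mm1_state_probability(arrival_rate, service_rate, 4)

print(f"utilization = {utilization:.3f}")  # about 0.909
print(f"P(n = 4)    = {p4:.4f}")           # about 0.0621
```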
Examples
• Web site with 100 clients and 1 server.
• Service time Ts = 0.6 sec and σTs = Ts (exponential)
• What is the average response time at 20 queries/min = 1 query per 3 sec?
• Use the M/M/1 model
• ρ = λTs = (1/3)(0.6) = 0.2
• Tq = Ts/(1−ρ) = 0.6/0.8 = 0.75 sec
• If a response time over 1.5 seconds is too long, what utilization is allowable (90% of responses are less than 1.5 sec)?
• mTr(r) = Tr × ln(100/(100−r))
• mTr(90) = Tr × ln(10) = [Ts/(1−ρ)] × 2.3 = 1.5 sec
• 2.3 Ts = (1−ρ) × 1.5 sec, so 1.5ρ = 1.5 − 2.3 × 0.6 and ρ = 0.08
• The utilization must actually be reduced (from 0.2 to 0.08)
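
The percentile calculation can be checked with a short sketch. The function names are hypothetical, and the formulas are the M/M/1 ones used in the example: Tr = Ts/(1 − ρ) and mTr(r) = Tr × ln(100/(100 − r)).

```python
import math


def mm1_response_time(ts: float, rho: float) -> float:
    """Mean response (residence) time of an M/M/1 queue: Ts / (1 - rho)."""
    return ts / (1 - rho)


def mm1_response_percentile(ts: float, rho: float, r: float) -> float:
    """Response time that r percent of requests stay below."""
    return mm1_response_time(ts, rho) * math.log(100 / (100 - r))


def max_utilization(ts: float, r: float, limit: float) -> float:
    """Largest rho for which the r-th percentile stays at or below `limit`.

    Solves Ts / (1 - rho) * ln(100 / (100 - r)) = limit for rho.
    """
    return 1 - ts * math.log(100 / (100 - r)) / limit


ts = 0.6         # service time in seconds, from the example
rho = (1 / 3) * ts  # 20 queries/min = 1 query per 3 sec

print(f"mean response time at rho={rho:.1f}: {mm1_response_time(ts, rho):.2f} s")
print(f"90th percentile at rho={rho:.1f}:    {mm1_response_percentile(ts, rho, 90):.2f} s")
print(f"max rho for 90% under 1.5 s:  {max_utilization(ts, 90, 1.5):.2f}")
```

At ρ = 0.2 the 90th percentile comes out above 1.5 sec, which is why the allowable utilization is lower than the current one.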
Multi-server System with separate queues (DNS Rotation)
• Five processors (5 M/M/1), average service time Ts = 0.1 sec.
• Assume that the standard deviation of service time is observed to be 0.094 sec (close to the mean, so the service time is treated as exponential).
• Web page hits = 40 per second.
• Separate Queue Approach
• If processes are evenly distributed among the processors, then the load for each processor is 40/5 = 8 requests per second. Thus,
• ρ = λTs = 8 × 0.1 = 0.8
• The residence time is then easily calculated:
• Tr = Ts/(1−ρ) = 0.1/0.2 = 0.5 seconds
Single Queue - Multiple server
• (M/M/5) with an aggregate arrival rate of 40 processes per second. Utilization is still ρ = λTs/5 = 0.8
• To calculate the residence time, first calculate the Erlang C function (table lookup): for ρ = 0.8 and 5 servers, C = 0.554
• Tr = (C/N)(Ts/(1−ρ)) + Ts = ((0.554)(0.1))/(5(1−0.8)) + 0.1 = 0.1554
• So the multiserver single queue reduces the average residence time from 0.5 sec to 0.1554 sec, an improvement of more than a factor of 3. Looking at just the waiting time, the single-queue case is 0.0554 seconds compared to 0.4 seconds for multiple queues, a factor of about 7.
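
Instead of a table lookup, the Erlang C value can be computed directly. The sketch below uses the standard Erlang B recursion and converts the result to Erlang C; this is a substituted technique for illustration, not necessarily how the table was produced, and the function names are hypothetical.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/N queue.

    offered_load is arrival_rate * service_time (in Erlangs).
    """
    rho = offered_load / servers
    if not 0 <= rho < 1:
        raise ValueError("per-server utilization must be in [0, 1)")
    # Erlang B via the standard recursion B(k) = a*B(k-1) / (k + a*B(k-1)).
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    # Convert Erlang B to Erlang C.
    return b / (1 - rho * (1 - b))


def mmn_residence_time(arrival_rate: float, ts: float, servers: int) -> float:
    """Mean residence time Tr = (C/N) * Ts / (1 - rho) + Ts for a single shared queue."""
    offered_load = arrival_rate * ts
    rho = offered_load / servers
    c = erlang_c(servers, offered_load)
    return (c / servers) * ts / (1 - rho) + ts


ts = 0.1   # seconds, from the example
c = erlang_c(5, 40 * ts)
tr_shared = mmn_residence_time(40, ts, 5)
tr_separate = ts / (1 - 8 * ts)  # five independent M/M/1 queues at 8 requests/sec each

print(f"Erlang C              = {c:.3f}")
print(f"Tr, single queue      = {tr_shared:.4f} s")
print(f"Tr, separate queues   = {tr_separate:.4f} s")
print(f"waiting-time ratio    = {(tr_separate - ts) / (tr_shared - ts):.1f}")
```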
Another Example
• Arrival rate λ = 125 requests/sec
• Service rate μ = 1/0.002 = 500 requests/sec
• Gateway utilization ρ = λ/μ = 0.25
• Probability of n packets in gateway = (1−ρ)ρ^n = 0.75 × (0.25)^n
• Mean time spent in web server = Ts/(1−ρ) = 0.002/(1−0.25) = 2.67 ms
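
The analytic result can be cross-checked with a small simulation, one of the approaches listed at the start. This is a minimal sketch, assuming a single FCFS server with Poisson arrivals and exponential service; it uses the Lindley recursion for successive waiting times, and the function name, customer count, and seed are choices made for this illustration.

```python
import random


def simulate_mm1_mean_time(arrival_rate: float, service_rate: float,
                           customers: int, seed: int = 1) -> float:
    """Estimate the mean time in system of an M/M/1 FCFS queue by simulation."""
    rng = random.Random(seed)
    wait = 0.0   # time the current customer waits before service starts
    total = 0.0  # accumulated time in system over all customers
    for _ in range(customers):
        service = rng.expovariate(service_rate)
        total += wait + service
        gap = rng.expovariate(arrival_rate)  # time until the next arrival
        # Lindley recursion: the next customer waits for whatever work is left.
        wait = max(0.0, wait + service - gap)
    return total / customers


arrival_rate = 125.0  # requests/sec, from the example
service_rate = 500.0  # requests/sec, from the example

simulated = simulate_mm1_mean_time(arrival_rate, service_rate, 200_000)
analytic = (1 / service_rate) / (1 - arrival_rate / service_rate)

print(f"simulated mean time = {simulated * 1000:.2f} ms")
print(f"analytic  mean time = {analytic * 1000:.2f} ms")
```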
Aggregation Results
• If incoming distributions are exponential, the output distribution will also be exponential.
• The sum will tend to be smoothed (smaller variance).
• These results do not apply to self-similar distributions.