Performance Engineering of the WWW: Application to Dimensioning and Caching. Jean-Chrysostome Bolot, Philipp Hoschka. INRIA
Main Ideas • Focuses on applying analytic techniques to study performance issues of the WWW • Dimensioning • Caching • Main contributions: • Show how time series analysis can be used to model Web traffic and forecast Web server loads • Show that cache replacement algorithms other than LRU and size-based policies should be used
Dimensioning and caching what? • WWW = distributed data service • WWW Quality = Data access quality • Access time = Users’ utility function • Decompose access time into: • Time the request takes to reach the server • Time the server takes to process the request • Time the reply takes to get to the client • Time the client takes to process the reply
Use of Time Series for Web Dimensioning • A time series is a sequence of observations of a random process taken sequentially in time • Data can be measured in an active or passive manner • An intrinsic feature of a time series is that typically there are dependencies among adjacent observations • Time series analysis is a large body of principles designed to study such dependencies • To study a time series we need to: • identify a model • fit the model • validate the model
Time Series Modeling • Let’s say we measure round-trip times for packets • Constructing a time series model involves expressing rtt_n in terms of previous observations rtt_{n-i} and noise terms e_n • In the simplest models, the noise processes are assumed to be uncorrelated with mean 0 and finite variance • For n > 0, rtt_n = f({rtt_{n-i}, i >= 1}, {e_{n-i}, i >= 0}) • The most common models in the literature are linear models, the best known being the Autoregressive (AR), Moving Average (MA), and Autoregressive Moving Average (ARMA) models
AR Models • In this model, the current value of the process is expressed as a finite linear aggregate of previous values of the process plus an error term e_t • Ex: if rtt_t is the value of the process at time t, an AR(p) model is rtt_t = φ_1*rtt_{t-1} + φ_2*rtt_{t-2} + ... + φ_p*rtt_{t-p} + e_t • Why “autoregressive”? Because the process is regressed on its own previous values
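As a concrete illustration of an AR model, here is a minimal sketch (my own, not from the paper) that generates a synthetic AR(2) round-trip-time series with numpy and recovers the coefficients φ_1, φ_2 by ordinary least squares; statsmodels’ AutoReg would do the same job off the shelf.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic rtt series from a known AR(2) process:
    # rtt_t = 0.6*rtt_{t-1} + 0.3*rtt_{t-2} + e_t
    n = 500
    rtt = np.zeros(n)
    for t in range(2, n):
        rtt[t] = 0.6 * rtt[t - 1] + 0.3 * rtt[t - 2] + rng.normal(scale=0.1)

    # Regress rtt_t on its own past values (hence "autoregressive")
    p = 2
    X = np.column_stack([rtt[p - i - 1 : n - i - 1] for i in range(p)])  # rtt_{t-1}, rtt_{t-2}
    y = rtt[p:]                                                          # rtt_t
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    print("estimated AR coefficients:", coeffs)   # should be close to [0.6, 0.3]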
MA Models • In these models rtt_t is expressed as linearly dependent on a finite number q of previous error terms (the e’s): • rtt_t = θ_0*e_t + θ_1*e_{t-1} + θ_2*e_{t-2} + ... + θ_q*e_{t-q} • This is called an MA process of order q • The weights need not sum to 1 or be positive
Mixed AR-MA Models • In order to get more flexibility in fitting actual time series, it is sometimes convenient to include both AR and MA terms in the same model: rtt_t = φ_1*rtt_{t-1} + φ_2*rtt_{t-2} + ... + φ_p*rtt_{t-p} - θ_0*e_t - θ_1*e_{t-1} - ... - θ_q*e_{t-q} • The model has p + q + 2 unknown parameters • ARMA models have been widely used to model video traffic, call requests in telephone networks, and memory references in software systems • They have not been widely used in computer networking
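A hedged sketch of fitting an ARMA model in practice, assuming statsmodels is installed and using a synthetic series in place of the authors’ measured round-trip times; statsmodels expresses ARMA(p, q) as ARIMA(p, 0, q).

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # Synthetic stand-in for a measured rtt trace (mean 100 ms, AR(1)-like fluctuations)
    rng = np.random.default_rng(0)
    rtt = np.empty(500)
    rtt[0] = 100.0
    for t in range(1, 500):
        rtt[t] = 100.0 + 0.7 * (rtt[t - 1] - 100.0) + rng.normal(scale=5.0)

    p, q = 2, 1                                  # illustrative orders, not from the paper
    fitted = ARIMA(rtt, order=(p, 0, q)).fit()   # ARMA(p, q) == ARIMA(p, 0, q)
    print(fitted.params)                         # estimated AR and MA weights plus noise variance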
ARIMA Models • The ARMA model fitting assumes the underlying stochastic process to be stationary • Recall: a stationary process is one that remains in equilibrium about a constant mean level • Many “real-life” time series are nonstationary • Nonstationary processes do not oscillate around a specific mean • However, they can exhibit homogeneous behavior
ARIMA Models • The most common approach to deal with them is to use two models: • One for the nonstationary part • One for the stationary residual part • Nonstationary time series can be modeled with an integrated model such as the ARIMA model • An ARIMA model of order (p,d,q) is an ARMA model of order (p,q) applied to a series that has been differenced d times (see the sketch below)
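A small numpy-only sketch (my own) of why differencing helps: a series with a linear trend is nonstationary, but its first difference fluctuates around a constant mean, so an ARMA model can be fitted to the differenced series; this is exactly the d in ARIMA(p, d, q).

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(200)
    load = 5.0 * t + rng.normal(scale=3.0, size=t.size)   # linear trend + noise: nonstationary

    diff1 = np.diff(load, n=1)   # the series "differenced once", i.e. d = 1
    print("raw series drifts upward:     ", load[:3].round(1), "...", load[-3:].round(1))
    print("differenced series hovers near", diff1.mean().round(1), "(the trend slope)")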
Seasonal ARIMA Models • For other nonstationary series, “plain” ARIMA models cannot be used • The most common of these series exhibit seasonal trends • Ex: the data for a particular hour in a month-long trace is typically correlated with the hours preceding it as well as with the same hours in preceding days • We can deal with them using seasonal ARIMA models, referred to as ARIMA (p,d,q) x (P,D,Q)_s • Idea: fit two models, one for the entire time series and another only for data points that are s units apart
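A minimal illustration (my own, numpy only) of the “data points s units apart” idea with s = 24 for hourly data: subtracting the value observed 24 hours earlier removes the daily cycle, which is the seasonal differencing step hidden in ARIMA(p,d,q) x (P,D,Q)_24.

    import numpy as np

    rng = np.random.default_rng(2)
    hours = np.arange(24 * 30)                                 # one month of hourly samples
    daily = 100 + 50 * np.sin(2 * np.pi * hours / 24)          # daily request cycle
    requests = daily + rng.normal(scale=5.0, size=hours.size)

    s = 24
    seasonal_diff = requests[s:] - requests[:-s]               # compare points s units apart
    print("std before seasonal differencing:", requests.std().round(1))
    print("std after seasonal differencing: ", seasonal_diff.std().round(1))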
After Selecting the Model... • Building the model involves: • Identification (select the values of p and q) • Estimation (estimate the values of φ_1,...,φ_p, θ_0,...,θ_q) • Diagnostic checking (validate the fitted model) • Once we have a model, we use it to forecast (predict) future values of the process
Application to Web Analysis • Use these models to analyze several data sets from their Web servers: • Number of requests handled • Size of the replies (Mbytes) • Want to study variations at hourly granularity (consider averages over a month-long interval) • Main point: there are strong seasonal variations (daily cycles) • Also observe a trend reflecting the growing number of requests handled by the server over the past year • Used a seasonal ARIMA model of order (2,2,1) x (3,2,0)_24 (a fitting sketch follows below)
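A hedged fitting sketch: the orders (2,2,1) x (3,2,0)_24 are the ones reported on the slide, but the hourly request series below is synthetic; with the authors’ traces one would simply load the measured hourly request counts instead. Assumes statsmodels.

    import numpy as np
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # Synthetic stand-in for a month of hourly request counts with a daily cycle and a trend
    rng = np.random.default_rng(3)
    hours = np.arange(24 * 30)
    requests = (200 + 0.1 * hours + 50 * np.sin(2 * np.pi * hours / 24)
                + rng.normal(scale=10.0, size=hours.size))

    model = SARIMAX(requests,
                    order=(2, 2, 1),                # non-seasonal (p, d, q)
                    seasonal_order=(3, 2, 0, 24))   # seasonal (P, D, Q) with s = 24 hours
    fitted = model.fit(disp=False)
    print(fitted.summary())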
Using the Model to Predict • Want to forecast the number of requests received by the server (important for dimensioning the server) • Forecasting problem: given a series of values observed up to time n, predict the value of the process at some specified time in the future while minimizing some prediction error • Found that ARIMA-based forecasting provides reasonably accurate short- and medium-term predictions • More accurate medium- and long-term predictions are not possible because of the limited trace data available
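A forecasting sketch under the same assumptions (synthetic hourly data, statsmodels): hold out the last day, fit on the rest, forecast 24 hours ahead, and measure the prediction error against the held-out values.

    import numpy as np
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    rng = np.random.default_rng(3)
    hours = np.arange(24 * 30)
    requests = (200 + 0.1 * hours + 50 * np.sin(2 * np.pi * hours / 24)
                + rng.normal(scale=10.0, size=hours.size))

    train, test = requests[:-24], requests[-24:]             # hold out the last day
    fitted = SARIMAX(train, order=(2, 2, 1),
                     seasonal_order=(3, 2, 0, 24)).fit(disp=False)
    forecast = fitted.forecast(steps=24)                     # predict the next 24 hours

    mae = np.mean(np.abs(forecast - test))
    print("mean absolute forecast error over the held-out day:", round(float(mae), 1))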
Efficient Cache Replacement Algorithms for Web Proxies • Avoid overloads? => control the amount of data pumped into the network => minimize distant requests => caching • Proxy caching is good iff clients exhibit enough temporal locality in accessing documents • Also, small files are requested more often • Good replacement algorithms are needed • Typically used: LRU and Size-based
Caching Algorithms • Cache algorithms are compared in terms of: • Miss ratio • Normalized resolution time • The lower the miss ratio, the lower the amount of data going through the network • Users don’t care about miss ratios: they want low response times • Quantify the quality of an algorithm in terms of the normalized resolution time T: the ratio of the average resolution time with the cache to the average resolution time without the cache
Cache Algorithms • Let p: miss probability, Tc: average time to access a cache entry, Tnc: average time to access a document not in the cache • Then: T = (Tc + p*Tnc) / Tnc • Assuming Tnc >> Tc, T ≈ p (worked example below) • T is minimized when p is minimized
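A worked example of the formula with illustrative numbers (not taken from the paper):

    Tc = 0.010    # 10 ms to look up a cache entry
    Tnc = 1.000   # 1 s to fetch a document that is not cached
    p = 0.4       # miss probability

    T = (Tc + p * Tnc) / Tnc
    print(T)      # 0.41 -- essentially p, since Tnc >> Tc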
Cache Algorithms • The above statement seems to argue for large caches • However: • cache size is limited • the miss ratio is related to the size of the documents stored in the cache • For a given cache size, the number of cached documents, and hence the hit ratio, decreases as document sizes increase • Small files are requested more often • These observations have led to algorithms that take into account not only temporal locality but also document sizes
Cache Algorithms • Surprisingly, no cache algorithm takes as an input parameter the time it took to retrieve a given document • A Web cache replacement algorithm should take into account the retrieval time associated with each document in the cache • One way to achieve this: assign weights to documents and use a weight-based replacement algorithm • The weights might be a function of: • the time since the item was last referenced • the time it took to retrieve the item • the expected time-to-live of the item • the size of the document, etc.
Selecting a Replacement Algorithm • The problem can be cast as a function that, given: • the state s of the cache, and • a newly retrieved document • decides the following: • Should the retrieved document be cached? • If yes, and no space is available, which existing entry should be discarded? • State s of the cache: • the set of documents stored • for each document, a set of state variables, which typically include statistical information associated with the document
Selecting a Replacement Algorithm • Examples of state variables for a document: • t_i: time since the document was last referenced • S_i: size of the document • rtt_i: time it took to retrieve the document • ttl_i: time-to-live of the document • Idea: assign a weight to each cached document: W_i = W(t_i, S_i, rtt_i, ttl_i) • This weight function can be specialized to obtain commonly used algorithms (see the sketch below)
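A sketch of how the weight function specializes to familiar policies (helper names are mine): evicting the cached document with the smallest weight reproduces LRU when W depends only on the time since last reference, and a size-based policy when it depends only on the document size.

    def w_lru(t_i, S_i, rtt_i, ttl_i):
        return 1.0 / t_i      # least recently referenced document has the smallest weight

    def w_size(t_i, S_i, rtt_i, ttl_i):
        return 1.0 / S_i      # largest document has the smallest weight

    def victim(cache, weight):
        """Return the eviction candidate: the document with the smallest weight."""
        return min(cache, key=lambda d: weight(d["t"], d["S"], d["rtt"], d["ttl"]))

    cache = [
        {"t": 120.0, "S": 4_000,  "rtt": 0.3, "ttl": 3600},   # referenced 2 minutes ago, 4 KB
        {"t": 5.0,   "S": 90_000, "rtt": 1.8, "ttl": 3600},   # referenced 5 seconds ago, 90 KB
    ]
    print(victim(cache, w_lru))    # picks the least recently used document
    print(victim(cache, w_size))   # picks the largest document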
Selecting a Replacement Algorithm • They suggest the following function for Web proxies: W(t_i, S_i, rtt_i, ttl_i) = (w1*rtt_i + w2*S_i) / ttl_i + (w3 + w4*S_i) / t_i • The first term captures the cost associated with retrieving documents and the second term captures temporal locality • The multiplying factor 1/ttl_i reflects that the cost associated with retrieving a document increases as the useful lifetime of the document decreases
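A direct transcription of the suggested function into code (variable names are mine; the w1..w4 values below are arbitrary placeholders, since the choice of weights is discussed on the next slide):

    def weight(t_i, S_i, rtt_i, ttl_i, w1, w2, w3, w4):
        retrieval_cost = (w1 * rtt_i + w2 * S_i) / ttl_i   # cost of re-fetching, scaled by 1/ttl_i
        locality = (w3 + w4 * S_i) / t_i                   # temporal-locality term
        return retrieval_cost + locality

    # e.g. a 60 KB page fetched in 1.2 s, last referenced 30 s ago, with a 600 s TTL
    print(weight(t_i=30, S_i=60_000, rtt_i=1.2, ttl_i=600, w1=1, w2=0.001, w3=1, w4=0.01))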
Cache Coherence in Web Proxies • Remote sites exporting time-critical pages usually associate time-to-live values with them • TTLs are not commonly used in practice => drop the ttl_i term • Instead: W(t_i, S_i, rtt_i, ttl_i) = w1*rtt_i + w2*S_i + (w3 + w4*S_i) / t_i • To select the w_i values, one must consider the overall goals: • Maximize the hit ratio? • Minimize the perceived retrieval time for a random user? • Minimize the cache size for a given hit ratio? • Etc.
Testing Two Algorithms • Use trace-driven simulations to compare the performance of two schemes: LRU and a scheme that takes into account all state variables • Algorithm 1: W(t_i, S_i, rtt_i, ttl_i) = 1/t_i • Algorithm 2: W(t_i, S_i, rtt_i, ttl_i) = w1*rtt_i + w2*S_i + (w3 + w4*S_i) / t_i • Parameters: • w1: 5000 b/s • w2: 1000 • w3: 1000 b/s • w4: 10 s • (a simulation sketch follows below)
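The sketch below mirrors this comparison. The two weight functions and the w1..w4 values come from the slide; everything else (the synthetic trace, the cache size, the dictionary-based cache, and charging each miss its retrieval time to approximate the weighted miss ratio) is my own scaffolding.

    import random

    W1, W2, W3, W4 = 5000, 1000, 1000, 10   # parameters from the slide

    def w_lru(doc, now):                                      # Algorithm 1: 1/t_i
        return 1.0 / max(now - doc["last_ref"], 1e-9)

    def w_weighted(doc, now):                                 # Algorithm 2
        t_i = max(now - doc["last_ref"], 1e-9)
        return W1 * doc["rtt"] + W2 * doc["size"] + (W3 + W4 * doc["size"]) / t_i

    def simulate(trace, cache_bytes, weight):
        """trace: list of (timestamp, url, size_bytes, retrieval_time_s) tuples."""
        cache, used, misses, miss_cost = {}, 0, 0, 0.0
        for now, url, size, rtt in trace:
            if url in cache:
                cache[url]["last_ref"] = now                  # hit: refresh reference time
                continue
            misses += 1
            miss_cost += rtt                                  # charge each miss its retrieval time
            while cache and used + size > cache_bytes:        # evict lowest-weight documents
                victim = min(cache, key=lambda u: weight(cache[u], now))
                used -= cache.pop(victim)["size"]
            cache[url] = {"size": size, "rtt": rtt, "last_ref": now}
            used += size
        return misses / len(trace), miss_cost / len(trace)

    random.seed(0)
    trace = [(t, f"url{random.randint(0, 200)}",
              random.choice([2_000, 20_000, 200_000]),
              random.uniform(0.1, 2.0)) for t in range(5_000)]

    for name, w in [("Algorithm 1 (LRU)", w_lru), ("Algorithm 2 (weighted)", w_weighted)]:
        print(name, simulate(trace, cache_bytes=1_000_000, weight=w))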
Testing Two Algorithms • To compare the schemes, the following performance measures are used: • Miss ratio • Weighted miss ratio: the probability that a document is not in the cache multiplied by the document’s weight • The miss ratio for algorithm 1 is slightly lower than the miss ratio for algorithm 2 • The weighted miss ratio (and hence the perceived retrieval time) is much lower for algorithm 2 • Conclusion: it pays to use a cache replacement algorithm that takes retrieval time into consideration