
Web Cache Replacement Policies: Properties, Limitations and Implications

Fabrício Benevenuto, Fernando Duarte, Virgílio Almeida, Jussara Almeida. Computer Science Department, Federal University of Minas Gerais, Brazil.



  1. Web Cache Replacement Policies: Properties, Limitations and Implications Fabrício Benevenuto, Fernando Duarte, Virgílio Almeida, Jussara Almeida Computer Science Department Federal University of Minas Gerais Brazil

  2. Summary • Introduction to Web caching • Motivations and goals • Evaluation methodology • Performance metrics • Workload description • Caching system simulator • Experimental results • Conclusions and future work

  3. Web Caching • Dramatic growth of the WWW in terms of content, users, servers and complexity • Web caching is a common strategy used to: • reduce traffic over the Internet • increase server scalability • reduce network latency • Caching is deployed through Web proxies

  4. Web Caching • Web proxies act as intermediaries for the traffic between HTTP clients and servers • Nowadays the Web has a hierarchical topology: [Figure: Servers, Proxies, Clients]

  5. Web Caching • Cache replacement is one of the issues that a proxy should be able to manage: • As the cache has finite size, when it is full, how does a proxy choose a page to remove from its cache? • A lot of research has been done to address this question and several cache replacement policies can be found in the literature • Key questions: • Is the design of new cache replacement policies needed? • What are the properties that new policies should take advantage of to improve a caching system?

  6. Goals • Investigate how much a new caching policy could improve caching system performance • Explore the main causes of periods of poor and high performance in caching systems

  7. Evaluation Methodology • Evaluation of different metrics over time: • Hit Ratio • Percentage of first-timers • Maximum improvement • Entropy • Time intervals of 1, 10 and 100 minutes • Use of real workloads

  8. Performance Metric: Hit Ratio • Hit ratio is the percentage of requests satisfied by the cache • It is the most general metric used to evaluate the effectiveness of a caching policy • Hit ratio is measured over time to detect periods of performance variation
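
Measuring hit ratio over time can be sketched as follows. This is a minimal illustration, not the authors' simulator: the function name `hit_ratio_per_interval`, the `(timestamp, was_hit)` event format, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict

def hit_ratio_per_interval(events, interval_seconds):
    """Group requests into fixed-length time intervals and return the
    hit ratio of each interval that contains at least one request.

    events: iterable of (timestamp_seconds, was_hit) pairs (assumed format).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for timestamp, was_hit in events:
        bucket = int(timestamp // interval_seconds)
        totals[bucket] += 1
        if was_hit:
            hits[bucket] += 1
    return {bucket: hits[bucket] / totals[bucket] for bucket in sorted(totals)}

# Made-up events, bucketed into 1-minute intervals.
events = [(0, False), (30, True), (59, True), (61, False), (119, False), (125, True)]
ratios = hit_ratio_per_interval(events, 60)
```

Passing 60, 600 or 6000 as `interval_seconds` would correspond to the 1, 10 and 100 minute intervals mentioned in the methodology.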

  9. Performance Metric: Percentage of First-Timers • A first-timer is the first request for an object in the trace • The object has never been requested in the past • Caching policies cannot satisfy first-timers
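
Counting first-timers only requires remembering which objects have already been seen. The sketch below assumes a trace is a plain sequence of object identifiers; the function names and the sample trace are hypothetical.

```python
def first_timer_flags(requests):
    """Return one boolean per request: True if the object is requested
    for the first time in the trace, False otherwise."""
    seen = set()
    flags = []
    for obj in requests:
        flags.append(obj not in seen)
        seen.add(obj)
    return flags

def first_timer_percentage(requests):
    """Percentage of requests in the trace that are first-timers."""
    flags = first_timer_flags(requests)
    return 100.0 * sum(flags) / len(flags) if flags else 0.0

# Made-up trace: "a", "b" and "c" each appear for the first time once.
trace = ["a", "b", "a", "c", "b", "a"]
percentage = first_timer_percentage(trace)
```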

  10. Performance Metric: Maximum Improvement • We evaluate the maximum hit-ratio improvement that a new caching policy could achieve over the simple LRU policy • The maximum improvement MI is defined as: • Maximum improvement over LRU:
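
The defining equations are not preserved in this transcript. One plausible reading, stated here purely as an assumption, is that no policy can serve first-timers, so the best achievable hit ratio is one minus the fraction of first-timers, and the maximum improvement is the gap between that bound and the LRU hit ratio. The function below is a hypothetical sketch of that reading, not the authors' definition.

```python
def max_improvement(lru_hit_ratio, first_timer_fraction):
    """ASSUMED definition: upper bound on hit ratio (1 - first-timer
    fraction) minus the hit ratio that LRU actually achieved.
    Both arguments are fractions in [0, 1]."""
    upper_bound = 1.0 - first_timer_fraction
    return max(0.0, upper_bound - lru_hit_ratio)

# Made-up numbers: 55% first-timers, LRU hit ratio of 40%.
gap = max_improvement(0.40, 0.55)
```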

  11. Performance Metric: Entropy • Given n distinct objects, where object i occurs with probability p_i, the entropy H(X) of a request stream is calculated as: $H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i$ • Entropy measures the concentration of popularity of a request stream • The higher the value of the entropy, the lower the concentration of popularity • Caching policies should keep objects with high probability of being referenced in the near future

  12. Performance Metric: Entropy • Entropy depends on the number of distinct objects • Use of the normalized entropy H_N: $H_N(X) = H(X) / \log_2 n$ • Investigate the influence of popularity on caching performance
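
Both entropy measures can be estimated from a trace by using each object's relative frequency as its probability. The sketch below assumes Shannon entropy with a base-2 logarithm and normalization by log2 n, the maximum entropy for n distinct objects; the function names and sample traces are hypothetical.

```python
import math
from collections import Counter

def entropy(requests):
    """Shannon entropy (in bits) of the empirical popularity distribution."""
    counts = Counter(requests)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def normalized_entropy(requests):
    """Entropy divided by log2(n), where n is the number of distinct
    objects, so the result lies in [0, 1] regardless of n."""
    n = len(set(requests))
    if n <= 1:
        return 0.0  # zero or one distinct object: no uncertainty
    return entropy(requests) / math.log2(n)

# Uniform popularity gives the maximum; skewed popularity gives less.
uniform = normalized_entropy(["a", "b", "c", "d"])
skewed = normalized_entropy(["a"] * 7 + ["b"])
```

A lower normalized entropy means popularity is concentrated on fewer objects, which is the situation a cache can exploit.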

  13. Experiment Setup • Real traces from proxy caches located at two points of the Web topology: • Closer to clients: Federal University of Minas Gerais (UFMG) • Closer to servers: National Laboratory for Applied Network Research (NLANR) • Cache size: 10% of the number of distinct objects • Cache replacement policy: simple LRU
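
A setup of this kind can be approximated with a small LRU simulator. The sketch below is an illustration under stated assumptions, not the authors' simulator: capacity is counted in objects and object sizes are ignored, and `simulate_lru` and the sample trace are hypothetical.

```python
from collections import OrderedDict

def simulate_lru(requests, capacity, cache=None):
    """Replay a trace against an LRU cache holding at most `capacity`
    objects. Returns (hits, cache), where hits has one boolean per
    request. Passing an existing cache allows a warm-up trace to be
    replayed first (assumption: sizes of objects are ignored)."""
    cache = OrderedDict() if cache is None else cache
    hits = []
    for obj in requests:
        if obj in cache:
            cache.move_to_end(obj)         # mark as most recently used
            hits.append(True)
        else:
            hits.append(False)
            cache[obj] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits, cache

# Made-up trace with a 2-object cache.
trace = ["a", "b", "a", "c", "a", "b", "d", "a"]
hits, cache = simulate_lru(trace, capacity=2)
hit_ratio = sum(hits) / len(hits)
```

To mirror the slide's sizing rule, `capacity` could be set to `max(1, len(set(trace)) // 10)`. A warm-up trace can be replayed first and the returned cache passed into a second call for the evaluation trace.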

  14. Workload Description • Traces used • Cache warming: University 1, NLANR 1 • Performance evaluation: University 2, NLANR 2 • Higher concentration of popularity in the university traces (lower entropy) • Larger fraction of distinct objects in the NLANR traces, which significantly diminishes caching performance

  15. Experimental Results: Hit Ratio [Figure panels: proxy closer to clients; proxy closer to servers] • Higher hit ratio for the University trace • Strong variation over time • What factors cause the variations in hit ratio?

  16. Experimental Results: Percentage of First-Timers [Figure panels: proxy closer to clients; proxy closer to servers] • Smaller percentage of first-timers at the proxy closer to clients • Correlation coefficient between hit ratio and the percentage of first-timers: • -0.857 for the NLANR and -0.962 for the university • Caching policies cannot satisfy first-timers, which are the most important factor behind periods of poor and good performance in the analyzed traces
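
A correlation coefficient between two per-interval series can be computed as below. The sketch assumes the Pearson coefficient, which the slides do not name explicitly, and the sample series are made up for illustration rather than taken from the traces.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)

# Made-up per-interval values: hit ratio falls as first-timers rise.
hit_ratio_series = [0.5, 0.4, 0.3, 0.2]
first_timer_series = [0.3, 0.4, 0.5, 0.6]
r = pearson(hit_ratio_series, first_timer_series)
```

A value close to -1, as reported for both traces, indicates that intervals with more first-timers tend to have lower hit ratios.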

  17. Experimental Results: Entropy [Figure panels: proxy closer to clients; proxy closer to servers] • Proxy closer to clients: lower entropy → higher concentration of popularity • The LRU policy does not exploit all of the locality of reference • Correlation coefficient between hit ratio and entropy: • -0.787 for the NLANR and -0.453 for the university • If we had a caching policy able to filter all the locality (entropy = 1), how much could the hit ratio be improved?

  18. Experimental Results: Maximum Improvement [Figure panels: proxy closer to clients; proxy closer to servers] • The hit ratio cannot be significantly improved for the trace closer to clients • A high number of first-timers diminishes the hit ratio • Improving caching performance • Reorganization of the hierarchy of caches (cache placement) • Caching system able to deal with the first-timers

  19. Conclusions and Future Work • Summary of main findings • Strong variation of hit ratio over time • High number of first-timers (higher close to servers) • Main cause of low hit ratio • LRU policy is not able to filter the entire locality of a stream • Small correlation with hit ratio • The maximum improvement we could obtain over LRU: • Less than 5 percent closer to clients • On average 25 percent closer to servers • Results suggest reorganization of cache topology and a caching system able to deal with the higher number of first-timers • Future work • Cache placement: find the optimal cache organization in order to improve the overall system performance • Auto-adaptive cache system able to minimize periods of poor performance

  20. Questions? Fabricio Benevenuto, Fernando Duarte, Virgilio Almeida, Jussara Almeida {fabricio, fernando, virgilio, jussara}@dcc.ufmg.br
