Evaluating Content Management Techniques for Web Proxy Caches. Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories), in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99.
Evaluating Content Management Techniques for Web Proxy Caches
Martin Arlitt, Ludmila Cherkasova, John Dilley, Rich Friedrich and Tai Jin (Hewlett-Packard Laboratories)
(in 2nd Workshop on Internet Server Performance, in conjunction with ACM SIGMETRICS 99)
Cho Joon-ho (CA Lab, CS Department, KAIST)
2001.11.6
Agenda
• Problems
• Quick Tour (Summary)
• Critique
• Design & Design Rationale
• Data Collection and Reduction
• Key Workload Characteristics
• Experimental Design
• Simulation Results
• Virtual Cache
Problems
• Current Web proxy caches utilize simple replacement policies
  • Relatively low hit rates
  • Additional delays
• So what?
  • Developing a quantitative understanding of Web traffic
    • How effective are current proxy cache replacement policies for real workloads?
    • Focus on two performance metrics: hit rate and byte hit rate
  • Designing new replacement policies that
    • Utilize frequency for higher performance
    • Are neither susceptible to cache pollution nor require parameterization
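The two metrics can be made concrete with a small trace-driven sketch. The code below is an illustration only, not the simulator used in the paper: the class `LRUCache`, the function `hit_rates`, and the toy trace are all hypothetical names and data. Hit rate is the fraction of requests served from the cache; byte hit rate is the fraction of requested bytes served from the cache.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-capacity LRU cache (hypothetical helper for this sketch)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.objects = OrderedDict()  # url -> size, least recently used first

    def access(self, url, size):
        """Return True on a hit; on a miss, insert the object."""
        if url in self.objects:
            self.objects.move_to_end(url)  # mark as most recently used
            return True
        if size > self.capacity:
            return False  # object can never fit, so it is not cached
        while self.used + size > self.capacity:
            _, evicted_size = self.objects.popitem(last=False)
            self.used -= evicted_size
        self.objects[url] = size
        self.used += size
        return False


def hit_rates(trace, cache):
    """Return (hit rate, byte hit rate) for a trace of (url, size) requests."""
    hits = hit_bytes = total = total_bytes = 0
    for url, size in trace:
        total += 1
        total_bytes += size
        if cache.access(url, size):
            hits += 1
            hit_bytes += size
    return hits / total, hit_bytes / total_bytes


# Toy trace: the small object "a" is popular, the large object "b" is not.
trace = [("a", 100), ("b", 900), ("a", 100), ("c", 500), ("a", 100), ("b", 900)]
hr, bhr = hit_rates(trace, LRUCache(capacity_bytes=1000))
print(f"hit rate = {hr:.3f}, byte hit rate = {bhr:.3f}")
```

In this toy trace the hits all fall on the small object, so the hit rate is much higher than the byte hit rate, which is why the two metrics are reported separately.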
Quick Tour (Summary) – 1/3
• The problems of existing studies
  • They use short-term traces of busy proxies or long-term traces of relatively inactive proxies
  • Long-term traces in busy environments are needed
• Trace-driven simulation
  • A total of 117,652,652 requests were collected over five months
  • Uses a smaller, more compact log
• The points to be considered
  • Object size
  • Recency of reference
  • Frequency of reference
  • Turnover
Quick Tour (Summary) – 2/3
• Existing replacement policies
  • LRU (Least-Recently-Used)
  • Size: replaces the largest object
  • GD-Size (GreedyDual-Size): replaces the object with the lowest utility, K_i = C_i / S_i + L
  • LFU: replaces the least frequently used object
• New replacement policies
  • GDSF (GreedyDual-Size with Frequency): GD-Size + a frequency factor, K_i = F_i * C_i / S_i + L
  • LFU-DA (Least Frequently Used with Dynamic Aging): LFU-Aging + a dynamic mechanism (running age L), K_i = C_i * F_i + L
• Virtual Caches
  • Logically partition the cache into N virtual caches
Quick Tour (Summary) – 3/3
[Figure: Comparison of Proposed Policies to Existing Replacement Policies]
[Figure: Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA]
Critique
• Pros
  • Quantitative understanding of Web traffic
  • Long-term trace-driven simulation of a busy proxy server
  • Provides two new replacement algorithms that run efficiently
  • Provides a new cache management method, ‘Virtual Cache’
• Cons
  • Not novel
  • No consideration of dynamic data
  • No consideration of the processing overhead of these more complex algorithms
  • Performance improvements are insignificant
Data Collection and Reduction
• Data collection
  • Long-term trace-driven simulation
  • A total of 117,652,652 requests were handled during a five-month period
  • Data include client IP address, request time, response status, the time required for the proxy to complete its response…
• Data reduction
  • Smaller, more compact log
    • Due to storage constraints
    • To ensure that analyses and simulations could be completed in a reasonable amount of time
  • Reduction by
    • Storing data in a more efficient manner
    • Removing information of little value
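One way such a reduction could look is sketched below. The slides do not describe the actual log format, so everything here is an assumption for illustration: the record layout `RECORD`, the `Interner` helper, the field names of the parsed entries, and the sample data. The idea is to replace repeated strings (client addresses, URLs) with small integer IDs and to pack the remaining fields into a fixed-width binary record.

```python
import struct

# Assumed layout: timestamp, client id, object id (32-bit each),
# response status (16-bit), transfer size (32-bit). 18 bytes per request.
RECORD = struct.Struct("<IIIHI")


class Interner:
    """Map repeated strings (client IPs, URLs) to small integer IDs."""

    def __init__(self):
        self.ids = {}

    def get(self, text):
        # A new string receives the next unused ID; a known string keeps its ID.
        return self.ids.setdefault(text, len(self.ids))


def reduce_entry(entry, clients, urls):
    """Pack one parsed log entry (a dict) into a compact binary record."""
    return RECORD.pack(
        entry["time"],
        clients.get(entry["client"]),
        urls.get(entry["url"]),
        entry["status"],
        entry["size"],
    )


clients, urls = Interner(), Interner()
entries = [
    {"time": 941414400, "client": "10.0.0.1",
     "url": "http://example.com/a.gif", "status": 200, "size": 4096},
    {"time": 941414460, "client": "10.0.0.2",
     "url": "http://example.com/a.gif", "status": 304, "size": 0},
]
records = [reduce_entry(e, clients, urls) for e in entries]
print(len(records[0]), RECORD.unpack(records[1]))
```

Fields that the simulation never reads would simply be left out of the record, which corresponds to removing information of little value.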
Key Workload Characteristics
• Cacheable objects
  • Most client requests are for cacheable objects (96%)
• Object set size
  • 389 GB in total
• Object sizes
  • Variable: median 4 KB, maximum 148 MB (a video clip)
• Recency of reference
  • 1/3 of all re-references occurred within one hour
• Frequency of reference
  • Web referencing patterns are non-uniform
• Turnover
  • Objects that were once popular are no longer requested
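The recency figure can be read as a simple statistic over the trace: among all requests to an object that has been seen before, what fraction arrives within one hour of the previous request to that object? A minimal sketch follows; the function name `rereference_fraction` and the toy trace are hypothetical, and the paper's exact definition may differ in detail.

```python
def rereference_fraction(trace, window_seconds=3600):
    """Fraction of re-references that occur within `window_seconds`
    of the previous reference to the same object.

    `trace` is an iterable of (timestamp_in_seconds, url) pairs in time order.
    """
    last_seen = {}
    rerefs = within = 0
    for timestamp, url in trace:
        if url in last_seen:
            rerefs += 1
            if timestamp - last_seen[url] <= window_seconds:
                within += 1
        last_seen[url] = timestamp
    return within / rerefs if rerefs else 0.0


# Toy trace: three re-references, only the first one within an hour.
trace = [(0, "a"), (10, "b"), (1800, "a"), (9000, "a"), (9500, "b")]
fraction = rereference_fraction(trace)
print(f"{fraction:.3f}")
```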
Experimental Design – 1/2
• Least-Recently-Used (LRU)
  • Replaces the object requested least recently
  • Considers only a single workload characteristic
• Size
  • Replaces the largest object
  • Tries to minimize the miss ratio (i.e., targets hit rate rather than byte hit rate)
  • Susceptible to cache pollution
• GreedyDual-Size (GD-Size)
  • GD-Size(1) for hit rate
  • GD-Size(Packets) for byte hit rate
  • K_i = C_i / S_i + L
    • C_i: the cost associated with bringing object i into the cache
    • S_i: the object size
    • L: a running age factor
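A minimal sketch of GD-Size under the key K_i = C_i / S_i + L is given below. It is an illustration and not the authors' implementation: the class name `GDSizeCache`, the heap with lazy deletion, and the default constant cost (which corresponds to GD-Size(1)) are choices made here. The object with the smallest key is evicted, L is set to the evicted key, and a hit recomputes the object's key with the current L. Object sizes are assumed to be positive.

```python
import heapq


class GDSizeCache:
    """GreedyDual-Size sketch: evict the object with the smallest key K."""

    def __init__(self, capacity_bytes, cost=lambda size: 1.0):
        self.capacity = capacity_bytes
        self.cost = cost          # C_i as a function of size; 1.0 gives GD-Size(1)
        self.used = 0
        self.age = 0.0            # running age factor L
        self.entries = {}         # url -> (key, size)
        self.heap = []            # (key, url); may hold stale keys

    def _set_key(self, url, size):
        key = self.cost(size) / size + self.age   # K_i = C_i / S_i + L
        self.entries[url] = (key, size)
        heapq.heappush(self.heap, (key, url))

    def _evict_one(self):
        while True:
            key, url = heapq.heappop(self.heap)
            current = self.entries.get(url)
            # Skip heap items whose key no longer matches the live entry.
            if current is not None and current[0] == key:
                del self.entries[url]
                self.used -= current[1]
                self.age = key    # L becomes the key of the evicted object
                return url

    def access(self, url, size):
        """Return True on a hit; on a miss, insert the object."""
        if url in self.entries:
            self._set_key(url, self.entries[url][1])  # refresh key with current L
            return True
        if size > self.capacity:
            return False
        while self.used + size > self.capacity:
            self._evict_one()
        self._set_key(url, size)
        self.used += size
        return False


cache = GDSizeCache(capacity_bytes=1000)
cache.access("big", 800)
cache.access("small", 100)
cache.access("new", 200)   # forces an eviction; "big" has the lowest key
print(sorted(cache.entries), cache.age)
```

With a constant cost, large objects get small keys and are evicted first, which favors hit rate; a size-dependent cost such as the packet count shifts the balance toward byte hit rate.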
Experimental Design – 2/2
• LFU
  • Replaces the least frequently used object
  • LFU-Aging = LFU + aging → avoids cache pollution
  • Parameterization problem still remains
• GreedyDual-Size with Frequency (GDSF)
  • GD-Size doesn’t take frequency into account
  • K_i = F_i * C_i / S_i + L
• Least Frequently Used with Dynamic Aging (LFU-DA)
  • LFU-Aging requires parameterization to perform well
  • LFU-DA uses an inflation factor as well as the frequency count
  • K_i = C_i * F_i + L
• F_i: a frequency count; L: a running age factor
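Both proposed policies differ only in how the key is computed, so they can be sketched with one cache class and two key functions. This is a hedged illustration, not the paper's code: `KeyedCache`, `gdsf_key`, `lfu_da_key`, the constant default cost, and the heap with lazy deletion are all assumptions made here. The frequency count F_i starts at 1 on insertion and is incremented on each hit; L is set to the key of the most recently evicted object.

```python
import heapq


def gdsf_key(freq, cost, size, age):
    return freq * cost / size + age      # K_i = F_i * C_i / S_i + L


def lfu_da_key(freq, cost, size, age):
    return cost * freq + age             # K_i = C_i * F_i + L


class KeyedCache:
    """Evicts the object with the smallest key; the key function picks the policy."""

    def __init__(self, capacity_bytes, key_fn, cost=lambda size: 1.0):
        self.capacity = capacity_bytes
        self.key_fn = key_fn
        self.cost = cost
        self.used = 0
        self.age = 0.0                   # running age factor L
        self.entries = {}                # url -> (key, size, freq)
        self.heap = []                   # (key, url); may hold stale keys

    def _update(self, url, size, freq):
        key = self.key_fn(freq, self.cost(size), size, self.age)
        self.entries[url] = (key, size, freq)
        heapq.heappush(self.heap, (key, url))

    def _evict_one(self):
        while True:
            key, url = heapq.heappop(self.heap)
            current = self.entries.get(url)
            if current is not None and current[0] == key:  # ignore stale items
                del self.entries[url]
                self.used -= current[1]
                self.age = key           # dynamic aging: L rises with each eviction
                return url

    def access(self, url, size):
        """Return True on a hit; on a miss, insert the object."""
        entry = self.entries.get(url)
        if entry is not None:
            self._update(url, entry[1], entry[2] + 1)
            return True
        if size > self.capacity:
            return False
        while self.used + size > self.capacity:
            self._evict_one()
        self._update(url, size, 1)
        self.used += size
        return False


# LFU-DA: "a" is requested three times and survives; "b" is evicted.
lfu_da = KeyedCache(300, lfu_da_key)
for url in ["a", "a", "a", "b", "c", "d"]:
    lfu_da.access(url, 100)

# GDSF: a large but frequently requested object outlives a small, cold one.
gdsf = KeyedCache(1000, gdsf_key)
for _ in range(10):
    gdsf.access("big", 800)
gdsf.access("small", 100)
gdsf.access("new", 200)
print(sorted(lfu_da.entries), sorted(gdsf.entries))
```

Because L only grows, newly inserted objects start with keys close to those of recently evicted ones, so formerly popular objects age out without a tuned aging parameter.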
Simulation Results – 1/2
Figure 1. Comparison of Existing Replacement Policies
Simulation Results – 2/2
Figure 2. Comparison of Proposed Policies to Existing Replacement Policies
Virtual Cache – 1/2
• An approach that can focus on both hit rate and byte hit rate simultaneously
• Mechanism
  • Logically partitions the cache into N virtual caches
  • Each virtual cache (VC) is managed with its own replacement policy
• Steps
  • Initially all objects are in VC0
  • Replacements from VCi are moved to VCi+1
  • Replacements from the last VC (VCi+1 here) are evicted from the cache
  • When reaccessed, objects are reinserted in VC0
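The steps above can be sketched as a chain of partitions. To keep the example short, each partition below uses plain LRU as a stand-in; the slides pair GDSF-Hits and LFU-DA instead, so this shows only the demotion and reinsertion mechanism, not the evaluated configuration. `LRUPartition` and `VirtualCache` are hypothetical names.

```python
from collections import OrderedDict


class LRUPartition:
    """One virtual cache; LRU is used here only as a stand-in policy."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.objects = OrderedDict()     # url -> size, least recently used first

    def remove(self, url):
        size = self.objects.pop(url)
        self.used -= size
        return size

    def insert(self, url, size):
        """Insert an object and return the (url, size) pairs it displaced."""
        if size > self.capacity:
            return [(url, size)]         # cannot fit here; pass it down the chain
        displaced = []
        while self.used + size > self.capacity:
            old_url, old_size = self.objects.popitem(last=False)
            self.used -= old_size
            displaced.append((old_url, old_size))
        self.objects[url] = size
        self.used += size
        return displaced


class VirtualCache:
    """Chain of partitions VC0..VCn-1 that share one physical cache."""

    def __init__(self, partitions):
        self.partitions = partitions

    def _insert_at(self, level, url, size):
        if level == len(self.partitions):
            return                       # replaced from the last VC: evicted
        for old_url, old_size in self.partitions[level].insert(url, size):
            self._insert_at(level + 1, old_url, old_size)  # demote to next VC

    def access(self, url, size):
        """Return True on a hit in any VC; the object always ends up in VC0."""
        for part in self.partitions:
            if url in part.objects:
                size = part.remove(url)
                self._insert_at(0, url, size)   # reaccessed: back to VC0
                return True
        self._insert_at(0, url, size)
        return False


vc = VirtualCache([LRUPartition(200), LRUPartition(200)])
results = [vc.access(u, 100) for u in ["a", "b", "c", "d", "a", "e"]]
print(results, list(vc.partitions[0].objects), list(vc.partitions[1].objects))
```

In the toy run, "a" is demoted to VC1, hit there, and moved back to VC0, while "b" eventually falls off the end of VC1 and leaves the cache.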
Virtual Cache – 2/2
Figure 3. Analysis of Virtual Cache Performance; VC0 using GDSF-Hits, VC1 using LFU-DA
Figure 4. Analysis of Virtual Cache Performance; VC0 using LFU-DA, VC1 using GDSF-Hits