Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. Author: Li Fan, et al. Print: JSAC 98 Presentation: Wonyoung Park [Communication Networks Research Lab]. Introduction. Web cache sharing Reduce Web traffic Alleviate network bottlenecks. Web Cache Sharing. Harvest project
Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol Author: Li Fan, et al. Print: JSAC 98 Presentation: Wonyoung Park [Communication Networks Research Lab]
Introduction • Web cache sharing • Reduce Web traffic • Alleviate network bottlenecks
Web Cache Sharing • Harvest project • Designed the Internet Cache Protocol (ICP) • Today, hierarchies of proxy caches are established via ICP
ICP • The proxy multicasts a query message to all other proxies whenever a cache miss occurs. [Figure: a client sends GET /index.html to a proxy server; ICP Query and ICP Response messages are exchanged between that proxy server and the other proxy servers]
Problem of ICP • The overhead of the ICP protocol • A proxy multicasts a query message to all other proxies whenever a cache miss occurs • NOT scalable • O(n^2) in the number of proxies n
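The quadratic growth can be seen with a rough count. This is only a sketch: it assumes every proxy sees the same number of misses, queries every other proxy on each miss, and gets exactly one reply per query. The function name `icp_messages` is hypothetical.

```python
def icp_messages(n_proxies: int, misses_per_proxy: int) -> int:
    """Total ICP messages when every miss is sent to all other proxies."""
    # Each of the n proxies sends one query to each of the other n - 1 proxies per miss.
    queries = n_proxies * misses_per_proxy * (n_proxies - 1)
    # Assumption: every query is answered by exactly one reply.
    replies = queries
    return queries + replies


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, icp_messages(n, misses_per_proxy=1000))
```

Doubling the number of proxies roughly quadruples the message count, which is the scalability problem the slide points at.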
Summary Cache • When a cache miss occurs, the proxy probes all the summaries and then fetches the document from another proxy. [Figure: a client sends GET /index.html to a proxy server, which has a cache miss; the summaries say which proxy server has the document; the proxy queries that server and either gets the document or fails (false positive)]
Summary Cache (1) • Each proxy keeps a compact summary of the cache directory of every other proxy • Summary update is aperiodic • Implemented as an enhancement of the ICP protocol • ICP version 1.1
Summary Cache (2) • Advantages • Reduces network bandwidth and CPU consumption (vs. the ICP protocol) • Scalable • Hit ratio almost the same as with the ICP protocol
Summary Cache [Figure: the URL http://www.lakuyo.com/muchima.htm is hashed with MD5 into a 128-bit value (101101110101010111100 … 010111), which is divided into four 32-bit hash values, hash function 1 through hash function 4]
How to make a summary • MD5 result: 128 bits (101101110101010111100 … 010111), split into four 32-bit values (10…111 each) • Set to 1 the summary bit selected by each of the four hash functions
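Operation • Find the bits that correspond to the URL being looked up • Check whether each of those bits is set to 1 • If even one of them is 0, the lookup fails • When all of the bits are 1 • The document is really there (high probability) • false positive • The summary indicates the document, but it does not actually exist • The bits were set by other URLs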
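The build and lookup steps above can be sketched in a few lines. The slides only show the 128-bit MD5 value being split into four 32-bit values; reducing each value modulo the summary size, the big-endian byte order, and the names `BloomSummary`, `add` and `might_contain` are assumptions made for illustration.

```python
import hashlib


class BloomSummary:
    """Bit-array summary of cached URLs, with up to four MD5-derived hashes."""

    def __init__(self, m_bits: int, k: int = 4):
        if not 1 <= k <= 4:
            raise ValueError("a 128-bit MD5 digest yields at most four 32-bit words")
        self.m = m_bits
        self.k = k
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, url: str):
        digest = hashlib.md5(url.encode("utf-8")).digest()  # 16 bytes = 128 bits
        for i in range(self.k):
            word = int.from_bytes(digest[4 * i:4 * i + 4], "big")  # one 32-bit word
            yield word % self.m  # assumption: map the word onto the bit array by modulo

    def add(self, url: str) -> None:
        # Set the bit selected by each hash function to 1.
        for p in self._positions(url):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, url: str) -> bool:
        # A single 0 bit means the URL is definitely absent;
        # all 1 bits means "probably present" (could be a false positive).
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(url))


if __name__ == "__main__":
    summary = BloomSummary(m_bits=800)
    summary.add("http://www.lakuyo.com/muchima.htm")
    print(summary.might_contain("http://www.lakuyo.com/muchima.htm"))
```

A `False` answer is always correct, while a `True` answer still has to be confirmed by querying the remote proxy.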
Summary • Bit size • 8, 16, or 32 times the average number of documents • Each summary bit (101101110101010111100 … 010111) is 1 if at least one cached URL maps to it under any hash function (count > 0), and 0 otherwise
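The "1 if count > 0, else 0" rule suggests keeping a counter per bit position so that bits can be cleared again when documents leave the cache. The following is a minimal sketch under that reading; the class name `CountingSummary`, the modulo mapping, and the returned "flipped" lists are illustrative assumptions.

```python
import hashlib


class CountingSummary:
    """Per-position counters; the visible summary bit is 1 while the count is > 0."""

    def __init__(self, m_bits: int, k: int = 4):
        self.m = m_bits
        self.k = k
        self.counts = [0] * m_bits

    def _positions(self, url: str):
        digest = hashlib.md5(url.encode("utf-8")).digest()
        for i in range(self.k):
            # Assumption: each 32-bit word of the digest is reduced modulo m.
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m

    def add(self, url: str) -> list:
        """Count a newly cached URL; return positions whose bit went 0 -> 1."""
        flipped = []
        for p in self._positions(url):
            self.counts[p] += 1
            if self.counts[p] == 1:
                flipped.append(p)
        return flipped

    def remove(self, url: str) -> list:
        """Uncount an evicted URL (must have been added); return positions that went 1 -> 0."""
        flipped = []
        for p in self._positions(url):
            if self.counts[p] > 0:
                self.counts[p] -= 1
                if self.counts[p] == 0:
                    flipped.append(p)
        return flipped

    def bit(self, p: int) -> int:
        return 1 if self.counts[p] > 0 else 0
```

Only the positions returned by `add` and `remove` change the bit array that other proxies see, so they are the natural content of an update message.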
Bloom Filter • A hashing technique • m bits • k independent hash functions • many-to-one mapping • “false positive”
Optimal parameters for Bloom Filter • Parameters that minimize false positives • The more bits allocated, the better the performance • But more memory is required
Probability of false positives • upper graph: for 4 hash functions • lower graph: optimal integral number of hash functions
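The two curves follow from the standard Bloom filter approximation: with m bits, n documents and k hash functions, the false-positive probability is about (1 - e^(-kn/m))^k, minimized near k = ln 2 · m/n. The sketch below evaluates it for the bit sizes mentioned earlier (8, 16, 32 bits per document); the function names are hypothetical.

```python
import math


def false_positive_rate(bits_per_doc: float, k: int) -> float:
    """Approximate false-positive probability for m/n = bits_per_doc and k hashes."""
    return (1.0 - math.exp(-k / bits_per_doc)) ** k


def best_integer_k(bits_per_doc: float) -> int:
    """Integer k closest to the real-valued optimum ln(2) * m/n with the lowest rate."""
    k_real = math.log(2) * bits_per_doc
    candidates = {max(1, math.floor(k_real)), max(1, math.ceil(k_real))}
    return min(candidates, key=lambda k: false_positive_rate(bits_per_doc, k))


if __name__ == "__main__":
    for ratio in (8, 16, 32):
        k_opt = best_integer_k(ratio)
        print(ratio,
              false_positive_rate(ratio, 4),      # fixed 4 hash functions
              k_opt,
              false_positive_rate(ratio, k_opt))  # optimal integer k
```

The approximation treats the hash functions as independent and uniform, which real MD5-derived hashes only approximate.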
Factors influencing Performance • Update delay • Updates are delayed until the percentage of cached documents that are “new” reaches a threshold • Simulation: 1% ~ 10% • Summary representation • size of memory • exact directory: uses the URL • server-name: too many false hits • Bloom filter parameters
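The update-delay rule can be sketched as a simple counter. This is an illustration only: the class name `SummaryUpdater` and its methods are hypothetical, and evictions are ignored for brevity.

```python
class SummaryUpdater:
    """Decide when to send a summary update, based on the share of 'new' documents."""

    def __init__(self, threshold: float = 0.01):
        self.threshold = threshold      # e.g. 0.01 to 0.10, the range used in the simulation
        self.cached_docs = 0            # documents currently in the cache
        self.new_since_update = 0       # documents not yet reflected in the sent summary

    def on_document_cached(self) -> bool:
        """Record a newly cached document; return True if an update is due."""
        self.cached_docs += 1
        self.new_since_update += 1
        return self.should_broadcast()

    def should_broadcast(self) -> bool:
        if self.cached_docs == 0:
            return False
        return self.new_since_update / self.cached_docs >= self.threshold

    def mark_broadcast(self) -> None:
        """Call after the update has been sent to the other proxies."""
        self.new_since_update = 0
```

A lower threshold keeps the remote summaries fresher at the cost of more update messages.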
Result • Has virtually the same cache hit ratio as the exact-directory approach • Reduces the number of messages by a factor of 25 to 60 • Reduces the bandwidth consumption by over 50% • Eliminates between 30% and 95% of the CPU overhead • Maintains almost the same hit ratio as ICP
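Term • stale hit • A request is sent to another cache because it appears to hold the document, but the cached copy is stale and therefore useless • false hit • The Bloom filter wrongly reports a document that is not there • exact_dir • The exact directory is used instead of a Bloom filter • When the contents of one cache change, the other caches learn of it at the same time • bloom_filter has false hits • bloom_filter_N • N bits are allocated per URL • e.g. for a capacity of 100 URLs and N = 8, the summary is 800 bits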
Ratio of false hits under different summary representations • Cases where the summary reports a document that is not there
Number of network messages per user request under different summary forms
Bytes of network messages per user request under different summary forms
Implementation • Summary-Cache Enhanced ICP • Included in ICP version 2 • ICP_OP_DIRUPDATE • directory update message • Additional header • 16 bits: Function_Num • 16 bits: Function_Bits • 32 bits: BitArray_Size_InBits • 32 bits: Number_of_Updates • set/unset bit • a list of 32-bit integers • 1 bit: set/unset • 31 bits: index of the bit that needs to be changed
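The additional header and update list can be sketched with `struct`. The field widths come from the slide; network byte order, placing the set/unset flag in the most significant bit, and the function names are assumptions, and the regular ICP header that would precede this payload is omitted.

```python
import struct

# Function_Num (16), Function_Bits (16), BitArray_Size_InBits (32), Number_of_Updates (32)
HEADER_FMT = "!HHII"  # assumption: network (big-endian) byte order
HEADER_LEN = struct.calcsize(HEADER_FMT)


def pack_dirupdate(function_num, function_bits, bitarray_size, updates):
    """Pack the extra header plus a list of (bit_index, set_flag) updates."""
    updates = list(updates)
    payload = struct.pack(HEADER_FMT, function_num, function_bits,
                          bitarray_size, len(updates))
    for index, set_flag in updates:
        if not 0 <= index < 2 ** 31:
            raise ValueError("bit index must fit in 31 bits")
        # Assumption: the top bit carries set (1) / unset (0), the low 31 bits the index.
        payload += struct.pack("!I", (int(bool(set_flag)) << 31) | index)
    return payload


def unpack_dirupdate(payload):
    """Inverse of pack_dirupdate; returns (header_fields, updates)."""
    function_num, function_bits, bitarray_size, count = struct.unpack_from(
        HEADER_FMT, payload, 0)
    updates = []
    for i in range(count):
        (word,) = struct.unpack_from("!I", payload, HEADER_LEN + 4 * i)
        updates.append((word & 0x7FFFFFFF, bool(word >> 31)))
    return (function_num, function_bits, bitarray_size), updates
```

For example, `pack_dirupdate(4, 32, 800, [(17, True), (42, False)])` would describe setting bit 17 and clearing bit 42 of an 800-bit summary; the values 4 and 32 for the two function fields are illustrative only.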
Performance Experiment • Performance of ICP and Summary-Cache for UPisa trace • The enhanced ICP protocol reduces the network bandwidth and CPU overhead significantly while only slightly decreasing the total hit ratio • Lowers the client latency compared to no ICP
Conclusion • Proposed the Summary-Cache enhanced ICP • A scalable wide-area Web cache sharing protocol • Simulation and Measurement were done • Two key aspects of this approach • the effects of delayed updates • the succinct representation of summaries
Critique • Good approach! • Static pages only • May need another protocol for dynamic pages • Uses only the URL to make a summary • Any other parameter? / HTTP/1.1 • Does not account for the cache replacement policy, e.g. LRU