Squirrel: A decentralized peer-to-peer web cache Paul Burstein 10/27/2003
Outline • Overview • Design • Evaluation • Discussion
Traditional Web Caching • Goals • Reduce browser latency • Reduce aggregate bandwidth • Reduce load on web servers • Deployment • Dedicated centralized machines • Placed at local network boundaries
Squirrel Web Caching • Decentralized caching • Desktops cooperate in a peer-to-peer fashion • Mutual sharing between hosts • Hosts browse and cache
Centralized vs. Decentralized • Centralized • Dedicated hardware • Cost • Administration • Handling load bursts • Single point of failure • Decentralized (pros) • No additional hardware • More users, more resources • Automatic scaling • Self-organizing • Easy deployment
Assumptions • Cooperative hosts • No security issues • Link and node failures • Nodes are in single geographic location • Low internal network latencies
Outline • Overview • Design • Evaluation • Discussion
Design Goals • Target environment: 100 - 100,000 machines • Goal: Achieve performance comparable to centralized cache
Design Overview • Built on top of Pastry • Objects have 128-bit objectIds • SHA-1 hash of URL • Mapped to home node with closest nodeId • Requests: • GET – new request • cGET – conditional • Two schemes • Home-store • Directory
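The objectId mapping on this slide can be sketched as follows. This is a minimal illustration, not Squirrel's or Pastry's actual code: SHA-1 produces 160 bits, so truncating the digest to its first 128 bits is an assumption, as is treating the id space as circular. The names `object_id`, `ring_distance`, and `home_node` are hypothetical, and the closest node is found with a global scan over a known node list rather than by Pastry routing.

```python
import hashlib

ID_BITS = 128  # identifier width stated on the slide


def object_id(url):
    """Hash the URL with SHA-1 and keep the first 128 bits (truncation is an assumption)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()  # 20 bytes
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def ring_distance(a, b):
    """Distance between two ids, assuming a circular 128-bit id space."""
    d = abs(a - b)
    return min(d, (1 << ID_BITS) - d)


def home_node(url, node_ids):
    """Return the nodeId numerically closest to the URL's objectId.

    A real deployment would route toward this node hop by hop;
    here we simply scan a list of known nodeIds.
    """
    oid = object_id(url)
    return min(node_ids, key=lambda n: ring_distance(n, oid))
```

Because the mapping depends only on the URL and the current set of nodeIds, every client that asks for the same URL arrives at the same home node.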
Home-store • Objects stored at client cache and home node • External requests come through home node • Cache replacement • All objects are considered • home node fresh • home node stale
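A rough sketch of how a home node might answer a request in the home-store scheme, under stated assumptions: freshness is modelled as a simple expiry time, a stale copy is refetched in full rather than revalidated with a cGET, and the class and function names (`CachedObject`, `HomeStoreNode`, `fetch_origin`) are hypothetical. The client's own local cache lookup, which happens before the home node is contacted, is left out.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    body: bytes
    expires_at: float  # hypothetical freshness deadline, in seconds


class HomeStoreNode:
    def __init__(self, fetch_origin):
        # fetch_origin is an assumed callable: (url, now) -> CachedObject
        self.fetch_origin = fetch_origin
        self.store = {}  # url -> CachedObject held at this home node

    def handle_request(self, url, now):
        obj = self.store.get(url)
        if obj is not None and now < obj.expires_at:
            # "home node fresh": answer from the home node's own store
            return obj, "home-fresh"
        # "home node stale" or missing: the home node goes to the origin server
        status = "home-stale" if obj is not None else "home-miss"
        obj = self.fetch_origin(url, now)
        self.store[url] = obj
        return obj, status
```

In this sketch only the home node contacts the origin server, which mirrors the slide's point that external requests come through the home node.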
Directory • Home node keeps a directory of pointers • Randomly redirect to delegates • no directory, add new delegate • cGET not modified • delegate fresh, get from delegate • cGET and stale, update • GET and stale, update
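The directory bookkeeping can be sketched in the same spirit. This covers only two of the cases listed above (no directory, and redirect to a delegate) and ignores staleness entirely. The cap of four delegates per object, the policy of keeping the most recent ones, and the names `DirectoryHome` and `handle_request` are assumptions made for illustration.

```python
import random


class DirectoryHome:
    def __init__(self, max_delegates=4, rng=None):
        self.max_delegates = max_delegates  # assumed cap on pointers per object
        self.rng = rng or random.Random()
        self.directory = {}  # url -> list of delegate node names

    def handle_request(self, url, client):
        delegates = self.directory.setdefault(url, [])
        # Never redirect a client to itself.
        candidates = [d for d in delegates if d != client]
        if client not in delegates:
            # The requester will hold a copy afterwards, so record it as a delegate.
            delegates.append(client)
            del delegates[: -self.max_delegates]  # keep only the most recent entries
        if not candidates:
            # No usable delegate: the client fetches from the origin server.
            return ("origin", None)
        # Otherwise redirect to a randomly chosen delegate.
        return ("delegate", self.rng.choice(candidates))
```

Choosing the delegate at random spreads requests for a popular object across the nodes that hold it, while the home node itself stores only pointers.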
Outline • Overview • Design • Evaluation • Discussion
Evaluation Characteristics • Compare two schemes and dedicated cache • Performance • Latency • External bandwidth • Hit ratio • Overhead • Load • Storage • Fault Tolerance
Bandwidth and Hit ratio • Bytes transferred to origin servers and back • correlated with hit rate • Centralized cache with infinite storage • 100MB cache per node achieves optimal rates • 10MB in-memory cache is reasonable • Directory scheme • Active nodes suffer from eviction • Distributed LRU is worse than centralized • Home-store • More total storage required
Latency • User-perceived time for a response • With comparable hit ratios, only consider internal hops • Many requests can be satisfied locally, with 0 hops • Directory scheme latency is up to one hop greater • Some requests can be satisfied by home node • Squirrel Latency • Based on Pastry hops on cache hit • Overshadowed on cache miss
Load on Nodes (1/2) • Bursty behavior observations • Max objects served per second • Up to 48 and 55 objects per second served for the two traces • Directory scheme • One delegate can get bombarded with requests from many home nodes • Home-store scheme • Replicate objects at request threshold
Load on Nodes (2/2) • Sustained load measurements • Max objects/minute • Average load in any second or minute: • 0.31 objects/minute • Redmond trace, both models
Fault Tolerance • Internet connection loss • Internal partitioning • Individual failure • Desktop shutdown or reboot • Graceful shutdown • Pastry aided content transfer • Directory scheme • More vulnerable to failures
Results • The home-store model seems to outperform the directory model • Hit ratio • Load balancing • Internal network latency • Compared to centralized cache?
Outline • Overview • Design • Evaluation • Discussion
Discussion • Would this be deployed in a corporate network?