Squirrel: A peer-to-peer web cache. Sitaram Iyer Joint work with Ant Rowstron (MSRC) and Peter Druschel. Peer-to-peer Computing. Decentralize a distributed protocol: Scalable Self-organizing Fault tolerant Load balanced Not automatic!!. Web Caching.
Squirrel: A peer-to-peer web cache Sitaram Iyer Joint work with Ant Rowstron (MSRC) and Peter Druschel
Peer-to-peer Computing Decentralize a distributed protocol: • Scalable • Self-organizing • Fault tolerant • Load balanced Not automatic!!
Web Caching Reduces: 1. Latency, 2. External bandwidth, 3. Server load. Deployed at ISPs, corporate network boundaries, etc. Cooperative Web Caching: a group of web caches tied together and acting as one web cache.
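Web Cache [Figure: centralized web cache. Browsers on the LAN, each with its own browser cache, share one centralized web cache that reaches the web server over the Internet. Label: Sharing!]
Decentralized Web Cache [Figure: browsers on the LAN, each with its own browser cache, reach the web server over the Internet without a centralized cache.] • Why? • How?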
Why peer-to-peer? • Cost of dedicated web cache → No additional hardware • Administrative costs → Self-organizing • Scaling needs upgrading → Resources grow with clients • Single point of failure → Fault-tolerant by design
Setting • Corporate LAN • 100 - 100,000 desktop machines • Single physical location • Each node runs an instance of Squirrel • Sets it as the browser’s proxy
Pastry • Peer-to-peer object location and routing substrate • Distributed Hash Table: reliably maps an object key to a live node • Routes in log_{2^b}(N) steps (e.g. 3-4 steps for 100,000 nodes, with 2^b = 16)
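The step count quoted on this slide can be checked with a short calculation. The sketch below is only an illustration: it reads the bound as a logarithm to base 2^b and assumes b = 4 (so 2^b = 16); the function name `expected_pastry_hops` is hypothetical and not part of Pastry.

```python
import math


def expected_pastry_hops(num_nodes: int, b: int = 4) -> float:
    """Approximate routing steps as log base 2^b of the node count.

    b = 4 (base 16) is an assumption made for this illustration.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be positive")
    return math.log(num_nodes) / math.log(2 ** b)


if __name__ == "__main__":
    # Print the estimate for a few LAN sizes in the 100 - 100,000 range.
    for n in (100, 10_000, 100_000):
        print(f"{n:>7} nodes: about {expected_pastry_hops(n):.2f} steps")
```

For 100,000 nodes this gives a value slightly above 4, in line with the handful of hops quoted on the slide.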
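Home-store model [Figure: within the LAN, the client hashes the requested URL to locate the home node; the Internet lies outside the LAN.]
Home-store model [Figure: client and home node.] …that’s how it works!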
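The mapping from URL to home node in the home-store model can be sketched as follows. This is a minimal illustration, not the Squirrel implementation: the use of SHA-1, the 128-bit key width, the "numerically closest id on a ring" rule, and the names `url_key` and `home_node` are all assumptions made for the example, and no actual multi-hop routing is performed.

```python
import hashlib

KEY_BITS = 128  # assumed width of the key space


def url_key(url: str) -> int:
    """Hash a URL into the key space (SHA-1 truncated to KEY_BITS, an assumption)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[: KEY_BITS // 8], "big")


def home_node(url: str, node_ids: list) -> int:
    """Pick the node whose id is closest to the URL's key on a circular key space."""
    key = url_key(url)
    space = 1 << KEY_BITS

    def ring_distance(node_id: int) -> int:
        d = abs(node_id - key)
        return min(d, space - d)

    return min(node_ids, key=ring_distance)


if __name__ == "__main__":
    # Hypothetical desktop machines; ids are derived by hashing made-up names.
    nodes = [url_key(f"desktop-{i}.example") for i in range(8)]
    print(hex(home_node("http://www.example.com/index.html", nodes)))
```

Because every client computes the same hash, all requests for a given URL converge on the same home node without any central coordination.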
Directory model Client nodes always store objects in local caches. Main difference between the two schemes: whether the home node also stores the object. In the directory model, it only stores pointers to recent clients, and forwards requests to them.
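Directory model [Figure: within the LAN, the client's request reaches the home node; the Internet lies outside the LAN.]
Directory model [Figure: the home node picks a random entry from its directory and forwards the client's request to that delegate.]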
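The home node's role in the directory model can be sketched as a small per-URL table of recent clients. This is an illustrative simplification under stated assumptions: the class name `HomeDirectory`, the directory size of 4, and the rule of never picking the requester as its own delegate are choices made for the example, and freshness checks (not-modified handling) are left out.

```python
import random
from collections import deque


class HomeDirectory:
    """Per-URL directory on a home node: pointers to recent clients, no objects."""

    def __init__(self, max_entries=4, rng=None):
        self.max_entries = max_entries      # assumed directory size
        self.entries = {}                   # url -> deque of recent client ids
        self.rng = rng or random.Random()

    def handle_request(self, url, client):
        """Return a delegate to forward the request to, or None if there is
        no directory entry and the client must fetch from the origin server."""
        pointers = self.entries.setdefault(url, deque(maxlen=self.max_entries))
        candidates = [c for c in pointers if c != client]
        delegate = self.rng.choice(candidates) if candidates else None
        # The requester will hold a copy afterwards, so record it as most recent.
        if client in pointers:
            pointers.remove(client)
        pointers.append(client)
        return delegate


if __name__ == "__main__":
    home = HomeDirectory()
    for c in ("A", "B", "C"):
        print(c, "->", home.handle_request("http://www.example.com/", c))
```

The first requester gets `None` (no directory yet); later requesters are pointed at a randomly chosen earlier client.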
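Full directory protocol [Figure: message flow among client, home, delegate, and origin server; the client sends its request to the home node. Cases: a: no directory, home goes to the origin server (also case d); b: home answers not-modified from its directory; c, e: home forwards the request to a delegate, which returns the object; e: the delegate first sends a cGET request to the origin server, which returns the object or not-modified.]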
Recap • Two endpoints of the design space, based on the choice of storage location. • At first sight, both seem to do about as well (e.g. in hit ratio and latency).
Quirk Consider a • Web page with many images, or • Heavily browsing node. In the Directory scheme, many home nodes point to one delegate. Home-store: natural load balancing. … evaluation on trace-based workloads …
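The load imbalance described on this slide can be illustrated with a toy count. This sketch is not the trace-based evaluation: it assumes one client has already fetched a page with 50 images, then counts which nodes would serve those images to a second client under each scheme. The helper names, the modulo-based home selection, and the node and URL names are all hypothetical.

```python
import hashlib
from collections import Counter


def home_of(url, nodes):
    """Simplified home selection: hash the URL and index into the node list.
    (A stand-in for key-based routing, assumed for this example.)"""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest, "big") % len(nodes)]


def serving_load(image_urls, nodes, first_client, scheme):
    """Count how many objects each node serves to a second requester."""
    load = Counter()
    for url in image_urls:
        if scheme == "directory":
            # Every image's home node points at the same delegate.
            load[first_client] += 1
        else:
            # Home-store: each image is served by its own home node.
            load[home_of(url, nodes)] += 1
    return load


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(100)]
    images = [f"http://www.example.com/img/{i}.png" for i in range(50)]
    for scheme in ("directory", "home-store"):
        load = serving_load(images, nodes, "node0", scheme)
        print(scheme, "max objects served by one node:", max(load.values()))
```

In the directory case a single node serves every image, while in the home-store case the same requests are spread over many home nodes.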
Total external bandwidth, Redmond [Figure: total external bandwidth (in GB, roughly 85 to 105; lower is better) vs. per-node cache size (in MB, log scale from 0.001 to 100). Curves: No web cache, Directory, Home-store, Centralized cache.]
Total external bandwidth, Cambridge [Figure: total external bandwidth (in GB, roughly 5.5 to 6.1; lower is better) vs. per-node cache size (in MB, log scale from 0.001 to 100). Curves: No web cache, Directory, Home-store, Centralized cache.]
LAN Hops, Redmond [Figure: fraction of cacheable requests (0% to 100%) vs. total hops within the LAN (0 to 6). Series: Centralized, Home-store, Directory.]
LAN Hops, Cambridge [Figure: fraction of cacheable requests (0% to 100%) vs. total hops within the LAN (0 to 5). Series: Centralized, Home-store, Directory.]
Load in requests per sec, Redmond [Figure: number of such seconds (log scale, 1 to 100000) vs. max objects served per node per second (0 to 50). Series: Home-store, Directory.]
Load in requests per sec, Cambridge [Figure: number of such seconds (log scale, 1 to 1e+07) vs. max objects served per node per second (0 to 50). Series: Home-store, Directory.]
Load in requests per min, Redmond [Figure: number of such minutes (log scale, 1 to 100) vs. max objects served per node per minute (0 to 350). Series: Home-store, Directory.]
Load in requests per min, Cambridge [Figure: number of such minutes (log scale, 1 to 10000) vs. max objects served per node per minute (0 to 120). Series: Home-store, Directory.]
Conclusion • It is possible to decentralize web caching • Performance is comparable to a centralized cache • Peer-to-peer is better in terms of cost, administration, scalability and fault tolerance
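(backup) Full home-store protocol [Figure: message flow among client, home, and origin server. The client sends its request to the home node over the LAN. a: the home returns the object or not-modified. b: the home sends a request over the WAN to the origin server, which returns the object or not-modified.]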
(backup) Full directory protocol [Figure: repeat of the directory protocol message-flow diagram among client, home, delegate, and origin server.]