
Squirrel: A peer-to-peer web cache

  1. Squirrel: A peer-to-peer web cache Sitaram Iyer Joint work with Ant Rowstron (MSRC) and Peter Druschel

  2. Peer-to-peer Computing Decentralize a distributed protocol: • Scalable • Self-organizing • Fault tolerant • Load balanced Not automatic!!

  3. Web Caching Reduces: 1. latency, 2. external bandwidth, 3. server load. Deployed by ISPs, at corporate network boundaries, etc. Cooperative web caching: a group of web caches tied together, acting as one web cache.

  4. Web Cache [Diagram: browsers with local caches on a LAN share a centralized web cache, which fetches from the web server across the Internet. Sharing!]

  5. Decentralized Web Cache [Diagram: browsers and their local caches on the LAN cooperate directly, with no dedicated cache between the LAN and the web server on the Internet.] • Why? • How?

  6. Why peer-to-peer? • Cost of dedicated web cache → no additional hardware • Administrative costs → self-organizing • Scaling needs upgrading → resources grow with clients • Single point of failure → fault-tolerant by design

  7. Setting • Corporate LAN • 100 to 100,000 desktop machines • Single physical location • Each node runs an instance of Squirrel and sets it as the browser's proxy

  8. Pastry Peer-to-peer object location and routing substrate. Distributed Hash Table: reliably maps an object key to a live node. Routes in log_{2^b}(N) steps (e.g. 3-4 steps for 100,000 nodes, with 2^b = 16).
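A minimal sketch, assuming a simplified DHT rather than Pastry's actual API, of how an object key maps to a live node and how many routing steps to expect:

    import hashlib
    import math

    def object_key(url):
        # Squirrel keys each object by a hash of its URL (SHA-1 here).
        return int.from_bytes(hashlib.sha1(url.encode()).digest(), "big")

    def home_node(key, live_node_ids):
        # The DHT delivers a message to the live node whose nodeId is
        # numerically closest to the key; a plain list stands in for routing.
        return min(live_node_ids, key=lambda nid: abs(nid - key))

    def expected_hops(num_nodes, base=16):
        # Pastry routes in about log_{2^b}(N) steps; with 2^b = 16 and
        # N = 100,000 this is roughly 4 hops.
        return math.log(num_nodes, base)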

  9. Home-store model [Diagram: the client hashes the URL and routes the request through the LAN to the object's home node, which fetches from the Internet on a miss.]

  10. Home-store model [Diagram: later requests from any client route to the same home node and are served from its cache.] …that's how it works!
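A hedged sketch of the home-store request path; the class and function names here are illustrative, not Squirrel's code:

    class HomeStoreNode:
        def __init__(self):
            self.cache = {}                      # URL -> cached response

        def handle(self, url, origin_get, is_fresh):
            entry = self.cache.get(url)
            if entry is not None and is_fresh(entry):
                return entry                     # hit: served within the LAN
            # Miss or stale: fetch (or revalidate) from the origin server
            # over the WAN, cache the result, and return it to the client.
            entry = origin_get(url)
            self.cache[url] = entry
            return entry

A client hashes the URL to find the home node and calls handle(); a popular object thus crosses the external link once and is then shared LAN-wide.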

  11. Directory model Client nodes always store objects in their local caches. The main difference between the two schemes is whether the home node also stores the object. In the directory model, it stores only pointers to recent clients, and forwards requests to them.
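A sketch of the home node's side of the directory model, with the same caveat that the names are illustrative: the directory is a small, bounded list of recent delegates per URL.

    import random

    class DirectoryHomeNode:
        MAX_DELEGATES = 4                        # small bounded directory

        def __init__(self):
            self.directory = {}                  # URL -> recent delegate nodes

        def handle(self, url, client):
            delegates = list(self.directory.get(url, []))
            # Remember the requesting client as a future delegate.
            entry = self.directory.setdefault(url, [])
            if client not in entry:
                entry.append(client)
            del entry[:-self.MAX_DELEGATES]      # keep only the most recent
            if not delegates:
                # No directory entry: the client fetches from the origin
                # server itself and thereby becomes the first delegate.
                return ("go to origin", None)
            # Otherwise forward to a random delegate, which serves the
            # object from its local browser cache.
            return ("forward", random.choice(delegates))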

  12. Directory model [Diagram: the client routes the request through the LAN to the URL's home node, which holds a directory entry rather than the object.]

  13. Directory model [Diagram: the home node picks a random entry from its directory and forwards the request to that delegate, which serves the object to the client.]

  14. Full directory protocol [Diagram: 1: the client sends the request to the home node. (a) No directory entry: home replies "no dir, go to origin" and the client fetches from the origin server (likewise in case d). (b) Home replies not-modified. (c, e) Home forwards the request to a delegate, which returns the object. (e) Home issues a cGET request to the origin server, which returns the object or not-modified.]

  15. Recap • Two endpoints of the design space, based on the choice of storage location. • At first sight, both seem to do about equally well (e.g. hit ratio, latency).

  16. Quirk Consider a • web page with many images, or • heavily browsing node. In the directory scheme, many home nodes end up pointing to one delegate. Home-store gives natural load balancing (see the toy calculation below). .. evaluation on trace-based workloads ..
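To make the quirk concrete, a toy calculation (uniform hashing assumed; the node count and the single "delegate" are made up) comparing worst-case per-node load when one page's 200 embedded images are requested:

    import hashlib
    from collections import Counter

    NODES, IMAGES = 1000, 200
    urls = [f"http://example.com/img{i}.gif" for i in range(IMAGES)]

    def node_of(url):
        return int(hashlib.sha1(url.encode()).hexdigest(), 16) % NODES

    # Home-store: each image is served by its own hashed home node.
    home_store_load = Counter(node_of(u) for u in urls)

    # Directory: every image's home node points at the one delegate that
    # first browsed the page, so all 200 requests hit that single node.
    directory_load = Counter({"delegate": IMAGES})

    print(max(home_store_load.values()))     # typically 2-3 of the 200
    print(max(directory_load.values()))      # 200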

  17. Trace characteristics [Table: characteristics of the Redmond and Cambridge proxy traces.]

  18. Total external bandwidth: Redmond [Graph: total external bandwidth in GB (85-105, lower is better) vs. per-node cache size in MB (0.001 to 100); curves for no web cache, Directory, Home-store, and centralized cache.]

  19. Total external bandwidth: Cambridge [Graph: total external bandwidth in GB (5.5-6.1, lower is better) vs. per-node cache size in MB (0.001 to 100); curves for no web cache, Directory, Home-store, and centralized cache.]

  20. LAN hops: Redmond [Graph: fraction of cacheable requests (0-100%) vs. total hops within the LAN (0-6); curves for centralized, Home-store, and Directory.]

  21. LAN hops: Cambridge [Graph: fraction of cacheable requests (0-100%) vs. total hops within the LAN (0-5); curves for centralized, Home-store, and Directory.]

  22. Load in requests per second: Redmond [Graph: number of such seconds (1 to 100,000, log scale) vs. max objects served per node per second (0-50); Home-store vs. Directory.]

  23. Load in requests per second: Cambridge [Graph: number of such seconds (1 to 1e+07, log scale) vs. max objects served per node per second (0-50); Home-store vs. Directory.]

  24. Load in requests per minute: Redmond [Graph: number of such minutes (1-100, log scale) vs. max objects served per node per minute (0-350); Home-store vs. Directory.]

  25. Load in requests per minute: Cambridge [Graph: number of such minutes (1-10,000, log scale) vs. max objects served per node per minute (0-120); Home-store vs. Directory.]

  26. Conclusion It is possible to decentralize web caching, with performance comparable to a centralized cache and better cost, administration, scalability, and fault tolerance.

  27. (backup) Storage utilization

  28. (backup) Fault tolerance

  29. (backup) Full home-store protocol [Diagram: 1: the client sends the request to the home node over the LAN. (a) Home has a fresh copy: it returns the object or not-modified. (b) Home forwards the request (2) to the origin server over the WAN; the origin returns the object or not-modified (3), which the home relays to the client.]

  30. (backup) Full directory protocol [Diagram: same as slide 14.]
