
Porcupine: A Highly Scalable, Cluster-based Mail Service



Presentation Transcript


  1. Porcupine: A Highly Scalable, Cluster-based Mail Service. Yasushi Saito, Brian Bershad, Hank Levy. Discussion led by Jeremy Shaffer. Presentation adapted from the Porcupine paper and slides for SOSP '99.

  2. Do we agree these are good areas to look at? Goals: Use commodity hardware to build a large, scalable mail service (one system administrator, 100 million users, 1 billion messages/day). Three facets of scalability: • Performance: linear increase with cluster size • Manageability: react to changes automatically (self-heal, self-configure) • Availability: survive failures gracefully
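A quick back-of-the-envelope check on that goal (my arithmetic, not in the slides), relating it to the 30-node throughput reported later on slide 10:

```python
# Rough sanity check of the goal, using figures quoted elsewhere in this deck.
goal_msgs_per_day = 1_000_000_000
avg_msgs_per_sec = goal_msgs_per_day / (24 * 60 * 60)     # ~11,600 msgs/s cluster-wide

# Slide 10 reports roughly 68 million messages/day on the 30-node test cluster.
measured_per_day, measured_nodes = 68_000_000, 30
nodes_needed = goal_msgs_per_day / (measured_per_day / measured_nodes)
print(f"~{avg_msgs_per_sec:,.0f} msgs/s on average; ~{nodes_needed:,.0f} nodes "
      f"if throughput really scales linearly, as the performance goal assumes")
```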

  3. Is email really a good service to test, or just the best fit for this technique? Why email? Mail is important: there is real demand. Cluster research has focused on web services. Mail is an example of a write-intensive application: • disk-bound workload • reliability requirements • failure recovery. Mail servers have relied on a "brute force" approach to scaling.

  4. The Porcupine Difference • Cluster-based solution to email • Functional homogeneity: any node can perform any service; this is the key to manageability, and there is no centralized control point • For good performance, two concepts must be kept in harmony: load balancing and affinity. Traditional mail server: performance problems (no dynamic load balancing), manageability problems (manual data-partitioning decisions are personnel-intensive), availability problems (limited fault tolerance).

  5. Key Techniques and Relationships (diagram): the framework is functional homogeneity ("any node can perform any task"); the techniques are automatic reconfiguration, load balancing, and replication; the goals they serve are availability, manageability, and performance.

  6. Basic Data Structures • Mailbox fragment: the collection of messages stored for a user at one node; a user may have multiple fragments • Mail map (mailbox fragment list): the nodes that contain fragments for a given user • User profile database: usernames, passwords, etc. • User profile soft state: one node keeps this information for a given user • User map: maps the hash value of each user to the node holding that user's soft state • Cluster membership list: each node's view of the set of other nodes.
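A minimal sketch of how these structures could be expressed, using Python-style records purely for illustration (the real system is 42,000 lines of C++); the bucket count and hash function are my own placeholders:

```python
import hashlib
from dataclasses import dataclass, field

USER_MAP_BUCKETS = 256   # placeholder; the slides do not give the real size

def bucket_of(user: str) -> int:
    """Hash a username into a user-map bucket; the bucket names the node that
    holds the user's profile soft state and mail map."""
    return int(hashlib.md5(user.encode()).hexdigest(), 16) % USER_MAP_BUCKETS

@dataclass
class MailboxFragment:
    """Messages stored for one user at one node; a user may have several."""
    user: str
    messages: list = field(default_factory=list)

@dataclass
class SoftState:
    """State that is rebuilt after membership changes rather than stored durably."""
    user_map: list                                      # bucket -> node name
    mail_map: dict = field(default_factory=dict)        # user -> nodes with fragments
    profile_cache: dict = field(default_factory=dict)   # user -> profile (soft copy)

# Hard state (mailbox fragments and the user profile database) lives on disk;
# the cluster membership list is each node's view of which nodes are alive.
```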

  7. Porcupine Architecture (diagram): every node (Node A, Node B, …, Node Z) runs the same components: SMTP, POP, and IMAP servers; load balancer; user map; membership manager; replication manager; mail map; mailbox storage; and user profile storage, all communicating over RPC.
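A sketch of what functional homogeneity means for this picture: every node instantiates the identical stack, so any node can accept any SMTP/POP/IMAP request. Class and method names here are assumptions, not the actual Porcupine interfaces:

```python
class PorcupineNode:
    """Every node runs the same components; there is no specialized or master node."""

    def __init__(self, name, peers):
        self.name = name
        self.membership = set(peers) | {name}   # membership manager's view of the cluster
        self.user_map = []                      # soft state: hash bucket -> manager node
        self.mail_map = {}                      # soft state: user -> nodes holding fragments
        self.mailbox_storage = {}               # hard state: user -> list of messages
        self.user_profiles = {}                 # hard state: user -> profile record

    def handle_smtp(self, sender, recipient, body):
        """Any node may accept a delivery and route it internally via RPC."""
        raise NotImplementedError

    def handle_retrieval(self, user):
        """Any node may serve POP/IMAP by consulting the user's mail map."""
        raise NotImplementedError
```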

  8. Porcupine Operations (diagram): an Internet client is directed to one node by DNS round-robin selection. 1. "Send mail to bob" arrives at that node (protocol handling). 2. "Who manages bob?" The user map answers: node A (user lookup). 3. "Verify bob" is sent to A. 4. A replies "OK, bob has msgs on C and D" (bob's mail map). 5. Load balancing picks the best node to store the new message: C. 6. "Store msg" is sent to C (message store).
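The same flow written out in order, as a sketch; `rpc`, `bucket_of`, and `pick_best_node` are passed in as stand-ins for Porcupine internals that the slides do not spell out:

```python
def deliver(rpc, user_map, bucket_of, pick_best_node, recipient, message):
    """Slide 8's delivery path, starting at whichever node DNS round-robin chose."""
    # 2. Hash the recipient through the user map to find the node that
    #    manages this user's soft state.
    manager = user_map[bucket_of(recipient)]

    # 3-4. Ask the manager to verify the user and return the mail map:
    #      the set of nodes already holding mailbox fragments for them.
    fragment_nodes = rpc(manager, "verify_user", recipient)   # e.g. {"C", "D"}

    # 5. Load balancing: pick the best node for the new message, preferring
    #    nodes that already hold fragments (affinity) but allowing others.
    target = pick_best_node(fragment_nodes)

    # 6. Store the message as a new or extended mailbox fragment on that node.
    rpc(target, "store_message", recipient, message)
```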

  9. Measurement Environment: 30-node cluster of not-quite-all-identical PCs; 100 Mb/s Ethernet with 1 Gb/s hubs; Linux 2.2.7; 42,000 lines of C++ code; synthetic load; compared against sendmail+popd.

  10. How does Performance Scale? (graph of throughput vs. cluster size, annotated at 30 nodes: roughly 68 million messages/day for Porcupine vs. roughly 25 million messages/day for the sendmail+popd baseline)

  11. Replication is Expensive Porcupine replication is very resource intensive. Is it worth it? Would hardware fail-over be better? Mirrored disk drives? …

  12. Load Balancing: deciding where to store messages. Goals: handle skewed workloads well; support hardware heterogeneity; no magic parameter tuning. Strategy: spread-based load balancing. Spread: a soft limit on the number of nodes per mailbox. A large spread gives better load balance; a small spread gives better affinity. Load is balanced within the spread, using the number of pending I/O requests as the load measure.
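One plausible reading of spread-based balancing, sketched under the assumption that a node can cheaply query its peers' pending I/O counts; the spread value and the sampling of extra candidates are illustrative:

```python
import random

SPREAD = 4   # soft limit on nodes per mailbox; the real default is not given here

def pick_best_node(fragment_nodes, all_nodes, pending_io):
    """Spread-based choice of where to store a new message for one user.

    fragment_nodes : nodes already holding fragments for the user (affinity)
    all_nodes      : current cluster membership
    pending_io     : node -> number of outstanding disk requests (the load measure)
    """
    pool = set(fragment_nodes)
    # If the user is spread over fewer than SPREAD nodes, consider adding fresh
    # nodes, trading a little affinity for better balance on skewed workloads.
    if len(pool) < SPREAD:
        extras = [n for n in all_nodes if n not in pool]
        pool.update(random.sample(extras, min(SPREAD - len(pool), len(extras))))
    # Within the spread, pick the least loaded node.
    return min(pool, key=lambda n: pending_io.get(n, 0))
```

On a homogeneous, evenly loaded cluster this behaves much like random placement; the pending-I/O term only changes decisions when nodes differ in speed or load, which is exactly the heterogeneity case the following slides measure.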

  13. Load Balancing Non-Replicated Throughput on a 30-node System

  14. How Well does Porcupine Support Heterogeneous Clusters? Performance improvement on a 30-node Porcupine cluster without replication when disks are added to a small number of nodes.

  15. Load Balancing Issues with Porcupine: load balancing? Security? Client dependence? … If you have a homogeneous cluster, why not just use random placement? If the cluster is heterogeneous, isn't spread a "magic parameter" that needs tuning?

  16. Availability. Goals: maintain function after failures; react quickly to changes regardless of cluster size; graceful performance degradation and improvement. Strategy: two complementary mechanisms. Hard state (email messages, user profiles): optimistic, fine-grained replication. Soft state (user map, mail map): reconstruction after a membership change.

  17. How does Porcupine React to Configuration Changes?

  18. Availability Issues with Porcupine: Availability… A node failure can reduce the performance of the entire cluster. Do you really want users to see inconsistent information? Would a more hardware-based solution be easier?

  19. Conclusions: Fast, available, and manageable clusters can be built for write-intensive services. The key ideas may extend beyond mail: functional homogeneity, automatic reconfiguration, replication, load balancing.

  20. Discussion: What do you do? • Note to self… get in on VC funding for Porcupine • Offer them all jobs with nice stock options • Give them a small grant to see how it turns out • Next paper, please… • Send the paper to your competitors to mislead them

  21. Reference Slides: The following slides are provided for reference and to help answer any questions that arise.

  22. Load Balancing Replicated throughput on a 30-node system

  23. How Well does Porcupine Support Heterogeneous Clusters? Performance improvement on a 30-node Porcupine cluster with replication when disks are added to a small number of nodes.

  24. A look at Replication Effects Throughput of the system configured with infinitely fast disks

  25. A look at Replication Effects Summary of single-node throughput in a variety of configurations

  26. Soft-state Reconstruction (diagram, timeline): 1. The membership protocol establishes the new set of live nodes. 2. The user map is recomputed so every hash bucket is assigned to a live node. 3. A distributed disk scan rebuilds the mail maps: each node scans its own mailbox fragments and reports, to each user's new manager, which users it holds fragments for (in the example, bob: {A,C}, suzy: {A,B}, joe: {C}, ann: {B}).
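A deliberately naive sketch of that timeline: it recomputes every bucket and rescans every fragment, whereas the real system can limit work to the buckets whose assignment changed. The bucket count, hash, and `send` callback are placeholders:

```python
import hashlib

USER_MAP_BUCKETS = 256   # placeholder size

def bucket_of(user):
    return int(hashlib.md5(user.encode()).hexdigest(), 16) % USER_MAP_BUCKETS

def rebuild_soft_state(my_mailbox_users, live_nodes, send):
    """Run on every node once the membership protocol announces a new view.

    my_mailbox_users : users for which this node holds mailbox fragments
    send(node, user) : stand-in RPC telling a manager "I hold fragments for user"
    """
    # 1-2. Recompute the user map deterministically, so all nodes converge on
    #      the same bucket -> node assignment without further coordination.
    ordered = sorted(live_nodes)
    user_map = [ordered[b % len(ordered)] for b in range(USER_MAP_BUCKETS)]

    # 3. Distributed disk scan: report each locally stored user to the node
    #    that now manages it, so managers can rebuild their mail maps.
    for user in my_mailbox_users:
        send(user_map[bucket_of(user)], user)
    return user_map
```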

  27. Hard-state Replication Goals: Keep serving hard state after failures Handle unusual failure modes Strategy: Exploit Internet semantics Optimistic, eventually consistent replication Per-message, per-user-profile replication Efficient during normal operation Small window of inconsistency
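A sketch of what "optimistic, eventually consistent" could look like for a single message, assuming a durable per-node update log and a `send` callback standing in for the replication manager's RPC; how updates are ordered and retired is simplified away:

```python
import uuid

def replicate_message(local_store, update_log, replicas, user, message, send):
    """Optimistic per-message replication: acknowledge early, converge later.

    send(node, update_id, user, message) returns True if the peer applied the
    update; the entry stays in the durable log until every replica confirms.
    """
    update_id = str(uuid.uuid4())
    local_store.setdefault(user, []).append(message)          # first durable copy
    update_log[update_id] = {"user": user, "msg": message,
                             "pending": set(replicas)}
    # The SMTP sender can be acknowledged as soon as one durable copy exists;
    # Internet mail semantics tolerate the brief window of inconsistency.
    for peer in list(replicas):
        if send(peer, update_id, user, message):              # peer applied it
            update_log[update_id]["pending"].discard(peer)
        # else: peer is down or slow; a background task retries from the log.
    if not update_log[update_id]["pending"]:
        del update_log[update_id]                             # fully replicated
```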
