Porcupine: A Highly Available Cluster-based Mail Service
Y. Saito, B. Bershad, H. Levy – U. Washington, SOSP 1999
Presented by: Fabián E. Bustamante
Porcupine – goals & requirements
Use commodity hardware to build a large, scalable mail service
Main goal – scalability in terms of
• Manageability – large but easy to manage
  • Self-configures w/ respect to load and data distribution
  • Self-heals with respect to failure & recovery
• Availability – survives failures gracefully
  • A failure may prevent some users from accessing email
• Performance – scales linearly with cluster size
  • Target – 100s of machines ~ billions of mail msgs/day
Key Techniques and Relationships
• Goals: Manageability, Availability, Performance
• Techniques: Automatic Reconfiguration, Dynamic Scheduling, Replication
• Framework: Functional Homogeneity – “any node can perform any task”
Why Email? • Mail is important • Real demand – Saito now works for Google • Mail is hard • Write intensive • Low locality • Mail is easy • Well-defined API • Large parallelism • Weak consistency
Conventional Mail Solution
Static partitioning: SMTP/IMAP/POP front ends, with each user’s mbox (Luca’s, Jeanine’s, Joe’s, Suzy’s, …) stored on a fixed NFS server
• Performance problems: no dynamic load balancing
• Manageability problems: manual data partition
• Availability problems: limited fault tolerance
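The limitation can be pictured with a minimal C++ sketch (server names and the hashing scheme are invented for illustration, not taken from the slide): each user is mapped once to a fixed home server, which is exactly why a hot user or a failed server cannot be absorbed by the rest of the cluster.

```cpp
// Hypothetical sketch of conventional static partitioning.
#include <functional>
#include <iostream>
#include <string>
#include <vector>

int main() {
    const std::vector<std::string> nfs_servers = {"nfs1", "nfs2", "nfs3", "nfs4"};

    // Static assignment: hash(user) fixes the home server forever, so load
    // cannot be rebalanced dynamically when demand or membership changes.
    auto home_server = [&](const std::string& user) {
        return nfs_servers[std::hash<std::string>{}(user) % nfs_servers.size()];
    };

    for (const std::string user : {"luca", "jeanine", "joe", "suzy"})
        std::cout << user << " -> " << home_server(user) << "\n";
}
```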
Porcupine Architecture
Every node (Node A, Node B, … Node Z) runs the same set of components:
• Front ends: SMTP server, POP server, IMAP server
• Load balancer, RPC, membership manager, replication manager
• State: user map, mail map, user profile, mailbox storage
Porcupine Operations (delivering mail to user luca; any node is reachable via DNS-RR selection)
1. “send mail to luca” arrives at some node (protocol handling)
2. That node asks its user map: who manages luca?
3. “Verify luca” at the managing node (user lookup)
4. Reply: “OK, luca has msgs on C and D” (mail map)
5. Pick the best nodes to store the new msg (load balancing)
6. “Store msg” on the chosen node (message store)
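A self-contained C++ sketch of this delivery path; the node names, in-memory maps, and pending-I/O numbers are made-up stand-ins for Porcupine’s real components, intended only to show the order of lookups.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

int main() {
    // Soft state held by every node (one copy shown here for illustration).
    std::map<std::string, std::string> user_manager = {{"luca", "B"}};                  // user map: user -> managing node
    std::map<std::string, std::set<std::string>> mail_map = {{"luca", {"C", "D"}}};     // user -> nodes holding messages
    std::map<std::string, int> pending_io = {{"A", 3}, {"B", 1}, {"C", 5}, {"D", 2}};   // load measure

    const std::string user = "luca", msg = "hello";

    // 1. DNS-RR has already routed "send mail to luca" to some front-end node (say A).
    // 2. A consults its user map: which node manages luca? (here, B)
    const std::string manager = user_manager.at(user);
    // 3. B verifies that luca is a real user (assumed true in this sketch).
    // 4. B answers with luca's mail-map entry: "luca has msgs on C and D".
    const std::set<std::string>& holders = mail_map.at(user);
    // 5. Pick the best node to store the new message: prefer nodes that already
    //    hold luca's mail, breaking ties by fewest pending disk I/O requests.
    std::string target;
    for (const std::string& n : holders)
        if (target.empty() || pending_io[n] < pending_io[target]) target = n;
    // 6. "Store msg" on the chosen node.
    std::cout << "node " << manager << " directs: store \"" << msg
              << "\" for " << user << " on node " << target << "\n";
}
```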
Basic Data Structures
• User map: apply a hash function to “luca”; the resulting bucket (e.g. B C A C A B A C …) names the node that currently manages the user – a small table identical on every node
• Mail map / user info, kept by the managing node: luca: {A,C}, suzy: {A,C}, joe: {B}, ann: {B}
• Mailbox storage: a user’s msgs may be spread across several nodes (e.g. Luca’s and Suzy’s msgs on nodes A and C, Joe’s and Ann’s msgs on node B)
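A minimal sketch of these two soft-state structures, assuming an illustrative 8-bucket user map and made-up mail-map contents. Because the user map is tiny and identical on every node, any node can find a user’s manager with one hash and one table lookup.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

int main() {
    // User map: hash bucket -> node currently responsible for that bucket.
    const std::vector<char> user_map = {'B', 'C', 'A', 'C', 'A', 'B', 'A', 'C'};

    // Mail map (kept by the managing node): user -> nodes holding mailbox fragments.
    const std::map<std::string, std::set<char>> mail_map = {
        {"luca", {'A', 'C'}}, {"suzy", {'A', 'C'}}, {"joe", {'B'}}, {"ann", {'B'}}};

    const std::string user = "luca";
    char manager = user_map[std::hash<std::string>{}(user) % user_map.size()];
    std::cout << user << " is managed by node " << manager << "; messages on nodes:";
    for (char n : mail_map.at(user)) std::cout << ' ' << n;
    std::cout << "\n";
}
```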
Porcupine Advantages • Advantages: • Optimal resource utilization • Automatic reconfiguration and task re-distribution upon node failure/recovery • Fine-grain load balancing • Results: • Better Availability • Better Manageability • Better Performance
Performance • Goals • Scale performance linearly with cluster size • Strategy: Avoid creating hot spots • Partition data uniformly among nodes • Fine-grain data partition
Measurement Environment • 30 node cluster of not-quite-all-identical PCs • 100Mb/s Ethernet + 1Gb/s hubs • Linux 2.2.7 • 42,000 lines of C++ code • Synthetic load • Compare to sendmail+popd
How does Performance Scale? [Graph: messages/day vs. cluster size – throughput grows roughly linearly with the number of nodes, reaching ~68m msgs/day (Porcupine) vs. ~25m msgs/day (sendmail+popd) at 30 nodes]
Availability • Goals: • Maintain function after failures • React quickly to changes regardless of cluster size • Graceful performance degradation / improvement • Strategy: Two complementary mechanisms • Hard state: email messages, user profile • Optimistic fine-grain replication • Soft state: user map, mail map • Reconstruction after membership change
Soft-state Reconstruction (timeline across nodes A, B, C)
1. Membership protocol: detect the configuration change and recompute the user map over the live nodes
2. Distributed disk scan: each node reports the users whose messages it stores, rebuilding mail-map entries such as luca: {A,C}, suzy: {A,B}, joe: {C}, ann: {B}
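The two reconstruction steps can be sketched as follows; node names, bucket counts, and mailbox contents are invented, and the real protocol is distributed and incremental, so this only shows the data flow.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> live = {"A", "C"};            // node B just failed

    // Step 1: membership protocol agrees on the live set; reassign the
    // user-map buckets (round-robin here) over the surviving nodes.
    std::vector<std::string> user_map(8);
    for (size_t b = 0; b < user_map.size(); ++b) user_map[b] = live[b % live.size()];

    // Step 2: distributed disk scan. Each node reports the users whose
    // mailbox fragments it stores locally (contents are illustrative).
    std::map<std::string, std::set<std::string>> local_fragments = {
        {"A", {"luca", "suzy"}}, {"C", {"luca", "joe", "ann"}}};

    std::map<std::string, std::set<std::string>> mail_map;  // rebuilt soft state
    for (const auto& [node, users] : local_fragments)
        for (const auto& u : users) mail_map[u].insert(node);

    for (const auto& [u, nodes] : mail_map) {
        std::cout << u << ": {";
        for (const auto& n : nodes) std::cout << ' ' << n;
        std::cout << " }\n";
    }
}
```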
Hard-state Replication • Goals: • Keep serving hard state after failures • Handle unusual failure modes • Strategy: Exploit Internet semantics • Optimistic, eventually consistent replication • Per-message, per-user-profile replication • Efficient during normal operation • Small window of inconsistency
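A toy sketch of the optimistic, eventually consistent style described above, assuming a hypothetical last-writer-wins rule keyed by timestamps (the slide does not spell out Porcupine’s exact update-ordering rule): an update is applied at one replica immediately and pushed to peers later, and every replica keeps the newest version it has seen, so copies converge once all updates are delivered.

```cpp
#include <iostream>
#include <map>
#include <string>

struct Versioned { long timestamp; std::string value; };
using Replica = std::map<std::string, Versioned>;   // object id -> latest known version

// Apply an update only if it is newer than what the replica already holds.
void apply(Replica& r, const std::string& id, const Versioned& v) {
    auto it = r.find(id);
    if (it == r.end() || v.timestamp > it->second.timestamp) r[id] = v;
}

int main() {
    Replica node_a, node_c;                          // two replicas of luca's hard state

    // Message delivered at A, then propagated to C (possibly later).
    Versioned u1{100, "msg-1 delivered"};
    apply(node_a, "luca/msg-1", u1);
    apply(node_c, "luca/msg-1", u1);                 // eventual push to the peer

    // Profile change accepted at C while A is unreachable.
    Versioned u2{105, "password=new"};
    apply(node_c, "luca/profile", u2);
    apply(node_a, "luca/profile", u2);               // delivered after A recovers

    std::cout << "replicas agree: "
              << (node_a.at("luca/profile").value == node_c.at("luca/profile").value)
              << "\n";
}
```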
Replication Efficiency [Graph: ~68m msgs/day without replication, ~24m msgs/day with replication, and ~33m msgs/day with replication when “pretending” – removing disk flushing from the disk-logging routines]
Load balancing: Storing messages
• Goals: handle skewed workloads well; support hardware heterogeneity; no voodoo parameter tuning
• Strategy: spread-based load balancing (see the sketch below)
  • Spread: soft limit on # of nodes per mailbox
  • Large spread → better load balance; small spread → better affinity
  • Load is balanced within the spread
  • Use # of pending I/O requests as the load measure
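A minimal sketch of spread-based selection under the assumptions above (function names and the hash-based candidate widening are invented): nodes that already hold the user’s mail are preferred for affinity, the candidate set is capped near the spread, and the least-loaded candidate, measured by pending I/O requests, wins.

```cpp
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

std::string pick_store_node(const std::string& user,
                            const std::vector<std::string>& existing_sites,  // nodes already holding the user's mail
                            const std::vector<std::string>& all_nodes,
                            const std::map<std::string, int>& pending_io,    // load measure
                            size_t spread) {                                 // soft limit on nodes per mailbox
    std::vector<std::string> candidates = existing_sites;                    // affinity first
    for (size_t i = 0; candidates.size() < spread && i < all_nodes.size(); ++i) {
        size_t j = (std::hash<std::string>{}(user) + i) % all_nodes.size();
        const std::string& n = all_nodes[j];
        if (std::find(candidates.begin(), candidates.end(), n) == candidates.end())
            candidates.push_back(n);                                         // widen up to the spread limit
    }
    // Least-loaded candidate by pending I/O requests.
    return *std::min_element(candidates.begin(), candidates.end(),
        [&](const std::string& a, const std::string& b) {
            return pending_io.at(a) < pending_io.at(b);
        });
}

int main() {
    std::map<std::string, int> load = {{"A", 4}, {"B", 1}, {"C", 6}, {"D", 2}};
    std::cout << pick_store_node("luca", {"C"}, {"A", "B", "C", "D"}, load, 2) << "\n";
}
```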
Support of Heterogeneous Clusters [Graph: relative performance improvement vs. node heterogeneity – 0% means all nodes run at ~the same speed; 3%, 7% & 10% are the percentage of nodes with very fast disks; improvements range from +0.5m/day (+0.8%) to +16.8m/day (+25%)]
Conclusions
• Fast, available, and manageable clusters can be built for write-intensive services
• Key ideas can be extended beyond mail: functional homogeneity, automatic reconfiguration, replication, load balancing
• Ongoing work: more efficient membership protocol; extending Porcupine beyond mail (Usenet, calendar, etc.); more generic replication mechanism