Herald: Achieving a Global Event Notification Service. Luis Felipe Cabrera, Michael B. Jones, Marvin Theimer. Microsoft Research.
Global Event Notification Services • Communication via event notification (also called publish/subscribe) is well-suited for loosely-coupled eCommerce applications, as well as Internet-scale distributed applications (e.g. instant messaging and multi-player games). • General event notification systems currently: • scale to tens of thousands of clients, • do not have global reach.
Internet-scale Issues • Scaling requirements are in the millions and billions, perhaps more. • There will (probably) not be a single organization that owns the entire event notification infrastructure. Hence a federated design is required. • Global reach implies that failures and network partitions will be commonplace.
Focus on the Basic Distributed Systems Primitives • Focus on the scalability of basic message delivery and distributed state management capabilities. • Employ a very simple message-oriented design and assume – until proven otherwise – that richer event notification semantics can be layered on top.
Herald Event Notification Model • [Figure: a Creator, a Publisher, and a Subscriber interacting with a Rendezvous Point inside the Herald Service] • 1: Create Rendezvous Point • 2: Subscribe • 3: Publish • 4: Notify
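The four steps in the model above can be sketched as a toy, single-process program. This is only an illustration of the message flow, not Herald's interface: the class `HeraldService` and its method names are hypothetical, and nothing here is distributed.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]  # (rendezvous point name, event payload)


class HeraldService:
    """Toy in-memory stand-in for a Herald node (hypothetical API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def create_rendezvous_point(self, name: str) -> None:
        # Step 1: a creator registers a new rendezvous point.
        self._subscribers.setdefault(name, [])

    def subscribe(self, name: str, callback: Callback) -> None:
        # Step 2: a subscriber registers interest in a rendezvous point.
        self._subscribers[name].append(callback)

    def publish(self, name: str, event: str) -> int:
        # Step 3: a publisher sends an event to the rendezvous point.
        # Step 4: the service notifies every current subscriber.
        for callback in self._subscribers[name]:
            callback(name, event)
        return len(self._subscribers[name])


herald = HeraldService()
received: List[str] = []
herald.create_rendezvous_point("rp1")
herald.subscribe("rp1", lambda rp, event: received.append(f"{rp}:{event}"))
notified = herald.publish("rp1", "hello")
```

Publishers and subscribers never reference each other here; they share only the rendezvous point name, which is the loose coupling the model is after.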
Design Criteria • The “usual” criteria: • Scalability • Resilience • Self-administration • Timeliness • Additional criteria: • Heterogeneous federation • Security • Support for disconnection • Partitioned operation
Scalability • 10^11 Rendezvous Points (RPs) • 10^11 publishers & subscribers in aggregate • 10^10 publishers & subscribers per RP • 10^10 federation members • 10^2 events/sec/RP
Resilience • “Fail last, fail least” semantics. • Correct operation in the presence of malicious/corrupt participants.
Self-administration • System decides where to place state and how to propagate information about state changes. • System dynamically adapts to changing loads and the presence of faults and network partitions. • No manual tuning.
Timeliness • Event notification should normally take seconds, not hours.
Heterogeneous Federation • Federation of machines within cooperating but mutually suspicious domains of trust. • Federated parties may include both small and large domains.
Security • Support restricted access to Herald facilities. • Support concepts such as groups and roles.
Support for Disconnection • Eventual delivery to disconnected subscribers. • Event histories to allow a posteriori examination of the past.
Partitioned Operation • Continued operation on both sides of a network partition. • Eventual (out-of-order) delivery after partition healing.
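A minimal sketch of what "continued operation, then eventual out-of-order delivery" could look like, assuming each event carries a unique identifier so that duplicates can be dropped after healing. The class `PartitionSide` and its methods are hypothetical and purely in-memory; the slides do not specify a healing protocol.

```python
from typing import Dict, List


class PartitionSide:
    """One side of a network partition (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self.known: Dict[str, str] = {}   # event id -> payload
        self.delivered: List[str] = []    # event ids in local delivery order

    def accept(self, event_id: str, payload: str) -> None:
        # Deliver each event at most once, in whatever order it arrives.
        if event_id not in self.known:
            self.known[event_id] = payload
            self.delivered.append(event_id)

    def heal_with(self, other: "PartitionSide") -> None:
        # Exchange everything each side missed while partitioned.
        mine, theirs = dict(self.known), dict(other.known)
        for event_id, payload in theirs.items():
            self.accept(event_id, payload)
        for event_id, payload in mine.items():
            other.accept(event_id, payload)


left, right = PartitionSide(), PartitionSide()
left.accept("l1", "price update")      # published while partitioned
left.accept("l2", "price update 2")
right.accept("r1", "new bid")          # published on the other side
left.heal_with(right)                  # partition heals
```

After healing, both sides hold the same set of events but in different local orders, which is why in-order delivery is treated as something to layer on top.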
Non-Goals • What’s the “best” way to do: • Naming • Filtering • Complex subscription queries • In-order delivery (except as layered on top)
Applying Lessons of the Internet and Web • Assume things are broken: • Mutual suspicion and no dependence on correct behavior by others. • Don’t try to fix everything: • All distributed state is maintained in a weakly-consistent soft-state manner and is aged. • All distributed state is incomplete and may be inaccurate.
Design Overview • We think we only need these mechanisms: • Replication. • Overlay distribution networks. • Time contracts. • Event histories. • Administrative rendezvous points.
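Replication • [Figure: Herald servers at locations L1, L2, and L3 (Herald@L1, Herald@L2, Herald@L3) holding rendezvous point replicas RP1@L1, RP2@L1, RP1@L2, and RP1@L3, with publishers Pub1 to Pub3 and subscribers Sub1 to Sub5]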
Overlay Distribution Networks • [Figure: replicas RP1@L1, RP1@L2, and RP1@L3 on Herald servers Herald@L1, Herald@L2, and Herald@L3, linked as a distribution network serving Pub1, Pub2, Sub1, Sub2, and Sub4]
Time Contracts • [Figure: Creator, Pub1, and Sub1 attached to RP1 in the Herald Service] • Contract table for RP1: Creator 60, Pub1 10, Sub1 30
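One way to read the RP1 table (Creator 60, Pub1 10, Sub1 30) is as per-party contract durations after which a registration lapses unless it is renewed. The sketch below works under that assumption. The slide gives no units, and the class `TimeContracts`, its methods, and the explicit `now` argument are hypothetical choices made to keep the example deterministic.

```python
from typing import Dict, List


class TimeContracts:
    """Soft-state registrations that lapse unless renewed (illustrative only)."""

    def __init__(self) -> None:
        self._expires_at: Dict[str, int] = {}

    def register(self, party: str, duration: int, now: int) -> None:
        # Registering again before expiry simply renews the contract.
        self._expires_at[party] = now + duration

    def active(self, now: int) -> List[str]:
        # Drop every party whose contract has run out, then list the rest.
        self._expires_at = {p: t for p, t in self._expires_at.items() if t > now}
        return sorted(self._expires_at)


rp1 = TimeContracts()
rp1.register("Creator", 60, now=0)
rp1.register("Pub1", 10, now=0)
rp1.register("Sub1", 30, now=0)

at_20 = rp1.active(now=20)         # Pub1's contract has lapsed
rp1.register("Sub1", 30, now=25)   # Sub1 renews before its contract ends
at_40 = rp1.active(now=40)         # Sub1 is still present thanks to the renewal
at_60 = rp1.active(now=60)         # every contract has run out
```

Because state disappears on its own, a crashed or unreachable client needs no explicit cleanup, which fits the soft-state, aged-state approach described earlier in the talk.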
Event Histories • [Figure: Creator, Pub1, and Sub1 attached to RP1 in the Herald Service] • Contract table for RP1: Creator 60, Pub1 10, Sub1 30, History 50
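A bounded event history might look like the following sketch, assuming the "History 50" entry is a retention window and that events are recorded with non-decreasing timestamps. The class `EventHistory` and its methods are hypothetical, and the units are again unspecified.

```python
from collections import deque
from typing import Deque, List, Tuple


class EventHistory:
    """Keeps recent events so late subscribers can catch up (illustrative only)."""

    def __init__(self, retention: int) -> None:
        self.retention = retention
        self._events: Deque[Tuple[int, str]] = deque()  # (timestamp, payload)

    def _trim(self, now: int) -> None:
        # Discard events that have fallen out of the retention window.
        while self._events and self._events[0][0] <= now - self.retention:
            self._events.popleft()

    def record(self, timestamp: int, payload: str) -> None:
        # Assumes timestamps arrive in non-decreasing order.
        self._events.append((timestamp, payload))
        self._trim(timestamp)

    def since(self, timestamp: int, now: int) -> List[str]:
        # Events strictly after `timestamp` that are still retained at `now`.
        self._trim(now)
        return [payload for ts, payload in self._events if ts > timestamp]


history = EventHistory(retention=50)
history.record(0, "a")
history.record(20, "b")
history.record(45, "c")

missed = history.since(10, now=45)      # a subscriber disconnected after t=10
everything = history.since(-1, now=45)  # full history while "a" is retained
later = history.since(-1, now=60)       # "a" (t=0) has aged out by t=60
```

A reconnecting subscriber asks for everything since the last timestamp it saw. Anything older than the retention window is simply gone, so delivery to disconnected subscribers is eventual only within that window.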
Administrative Rendezvous Points • [Figure: a Name Service and the Herald Service hosting RP1] • 1: Subscribe RP1@ • 2: Notify(change)
Engineering & Research Issues • Baseline scalability numbers • Dynamic system reconfiguration • Federation and security
Baseline Scalability Numbers • How scalable are single-node servers and server clusters? • What are multicast-style delivery systems actually capable of, especially in aggregate?
Dynamic System Reconfiguration • Reconfiguring distributed RP state in response to aggregate workloads and global state changes. • Dealing with “flash crowd” loads. • Placement of RP state to minimize the effects of network partitions and disconnection. • Placement of RP state to enable efficient implementations of higher-level pub/sub semantics.
Federation and Security • Can we define simple, open protocols? • Will we need heavy-weight mechanisms to deal with malicious/corrupt servers? • How should anonymity and privacy be dealt with/supported?
Related Work • Non-global event notification systems (Gryphon, Ready, Siena, …) • Netnews • P2P systems such as Gnutella and Farsite • Overlay & multicast networks • CDNs • OceanStore
Conclusion • Global event notification is emerging as a key Internet technology. • Herald is exploring scalability of the basic message and distributed state management aspects of an event notification system: • Gain engineering experience with scalable pub/sub systems. • Explore dynamic system reconfiguration. • Understand the implications of federation and security.