Explore the comprehensive overview of scalable and distributed stream processing techniques with a focus on collaborative efforts across administrative domains. Learn about Aurora and Medusa distribution methods, architectural issues, load management, and high availability strategies. Discover key challenges like key partitioning and high availability implementation through failure detection and recovery. Dive deep into the intricacies of communications, naming, and routing in decentralized stream processing environments.
Scalable Distributed Stream Processing Presented by Ming Jiang
Situation when distributed • A distributed federation of participating nodes in different administrative domains • Collaboration between different domains required
Two complementary efforts for the situation • Aurora* intra-participant distribution • Medusa inter-participant distribution
Three pieces to be shared • Aurora • An overlay network for communication • Algorithms for high availability
Three architectural issues • Communications • Load sharing • High availability in the presence of failure
Communications • Naming (participants, entity-name) • Routing 1. A data source or an administrator registers a schema and a stream 2. When a data source (DS) produces an event, labels
Communications • Message Transport multiplexing all the message streams on a single TCP connection • Remote definition: process migration is too complicated
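One way to picture multiplexing several message streams over a single TCP connection is to prefix every message with its stream identifier and length, so the receiver can split the byte stream back into per-stream queues. The sketch below is only an illustration under assumptions: the header layout, `frame`, and `demultiplex` are hypothetical names, not the wire format used by Aurora* or Medusa, and a byte string stands in for the socket.

```python
import struct
from collections import defaultdict

# Assumed header: 4-byte payload length, 2-byte stream id, network byte order.
HEADER = struct.Struct("!IH")


def frame(stream_id: int, payload: bytes) -> bytes:
    """Wrap one message so it can share a connection with other streams."""
    return HEADER.pack(len(payload), stream_id) + payload


def demultiplex(buffer: bytes) -> dict:
    """Split the bytes read from the shared connection back into streams."""
    streams = defaultdict(list)
    offset = 0
    while offset + HEADER.size <= len(buffer):
        length, stream_id = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # incomplete frame: wait for more bytes
        streams[stream_id].append(buffer[start:end])
        offset = end
    return dict(streams)


# Messages from streams 1 and 2 interleaved on one "connection".
wire = frame(1, b"a1") + frame(2, b"b1") + frame(1, b"a2")
demuxed = demultiplex(wire)
print(demuxed)
```

Per-stream ordering is preserved because frames are appended in arrival order.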
Load Management Repartitioning Aurora Networks, based on loads and resources: • Box Sliding • Box Splitting
Box Sliding • Takes a box on the edge of a sub-network on one machine and shifts it to its neighbor. upstream box sliding
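Box sliding can be sketched as a change to a placement map: find a box on the source machine that is adjacent to a box on the neighboring machine, and reassign it. This is a minimal sketch assuming a linear pipeline; `slide_edge_box`, the box names, and the machine names are hypothetical, and the load-based decision of when to slide is left out.

```python
def slide_edge_box(pipeline, placement, src, dst):
    """Move one box from `src` to `dst` if it sits on the boundary between them.

    pipeline:  box names in upstream-to-downstream order (assumed linear).
    placement: dict mapping box name -> machine name.
    Returns a new placement dict; the input is left unchanged.
    """
    new_placement = dict(placement)
    for i, box in enumerate(pipeline):
        if placement[box] != src:
            continue
        # Immediate upstream and downstream neighbors of this box.
        neighbors = pipeline[max(i - 1, 0):i] + pipeline[i + 1:i + 2]
        if any(placement[n] == dst for n in neighbors):
            new_placement[box] = dst
            return new_placement
    raise ValueError(f"no box on {src} is adjacent to {dst}")


pipeline = ["filter", "map", "aggregate", "union"]
placement = {"filter": "m1", "map": "m1", "aggregate": "m2", "union": "m2"}

# Upstream sliding: the edge box on m2 moves to the upstream machine m1.
moved = slide_edge_box(pipeline, placement, src="m2", dst="m1")
print(moved)
```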
Box Splitting • Creates a copy of a box, intended to run on a second machine, to offload the original • Needs a filter that acts as a router
Box splitting Tumble Merge: Box splitting has to be transparent
Box splitting • If the predicate in the filter is B < 3 • A machine: 1, 2, 3, 4, 7 • B machine: 5, 6 • (Figure: outputs of machine A and machine B, and the final result after merge)
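The transparency requirement can be illustrated with a grouped count: a filter routes each tuple to one of two copies of the box, each copy aggregates its share, and a merge step combines the partial results so that downstream sees the same answer as without the split. This is a sketch under assumptions: the tuples, the routing direction, and the helper names are invented for illustration and do not reproduce the slide's figure.

```python
from collections import Counter


def route(tuples, predicate):
    """Filter acting as a router: matching tuples go to the first machine."""
    first = [t for t in tuples if predicate(t)]
    second = [t for t in tuples if not predicate(t)]
    return first, second


def count_by_a(tuples):
    """The split box: count tuples per value of field A."""
    return Counter(t["A"] for t in tuples)


def merge(partial_1, partial_2):
    """Combine partial counts from both copies of the box."""
    return partial_1 + partial_2


# Hypothetical input stream with fields A and B.
stream = [
    {"A": 1, "B": 2}, {"A": 1, "B": 3}, {"A": 2, "B": 2},
    {"A": 2, "B": 1}, {"A": 2, "B": 6}, {"A": 4, "B": 5},
]

on_first, on_second = route(stream, lambda t: t["B"] < 3)
merged = merge(count_by_a(on_first), count_by_a(on_second))
unsplit = count_by_a(stream)
print(merged == unsplit)
```

Summing works here because counts are additive; an aggregate such as an average would need its merge step to carry sums and counts separately.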
Key partitioning Challenges • Choosing what to offload • Choosing what to split • Choosing filters • Others…
High Availability Utilize the push-based nature
Failure detection and Recovery • 1. Each server periodically sends heartbeat messages to its upstream neighbors • 2. If a server does not reply within a predefined time, it is assumed to have failed • 3. The recovery phase is initiated, emulating the processing of the failed server (load shedding can be used)
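The detection step above can be sketched as a table of last-reply timestamps checked against a timeout. This is a minimal illustration, not the actual protocol: `HeartbeatMonitor`, the neighbor names, and the timeout value are assumptions, and time is passed in explicitly so the example stays deterministic.

```python
class HeartbeatMonitor:
    """Tracks heartbeat replies from upstream neighbors (hypothetical helper)."""

    def __init__(self, neighbors, timeout):
        self.timeout = timeout
        # Treat every neighbor as having replied at time 0.
        self.last_reply = {name: 0.0 for name in neighbors}

    def record_reply(self, neighbor, now):
        """Call when a heartbeat reply arrives from `neighbor`."""
        self.last_reply[neighbor] = now

    def failed(self, now):
        """Neighbors silent for longer than the timeout are assumed failed."""
        return sorted(
            name
            for name, last in self.last_reply.items()
            if now - last > self.timeout
        )


monitor = HeartbeatMonitor(["up1", "up2"], timeout=3.0)
for t in (1.0, 2.0, 4.0):
    monitor.record_reply("up1", now=t)
monitor.record_reply("up2", now=1.0)  # up2 goes silent after t=1.0

suspects = monitor.failed(now=5.0)
print(suspects)  # servers for which recovery would be initiated
```

A real deployment would trigger the recovery phase for each suspect and would have to tolerate false positives caused by slow links.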