Fault Tolerant Stream Processing using Distributed Replicated File System

Presentation Transcript


  1. Fault Tolerant Stream Processing using Distributed Replicated File System

  2. Introduction • A Stream Processing Engine (SPE) is a monitoring tool that tracks activities by continuously processing data streams • In an SPE, a query takes the form of a loop-free, directed graph of operators • Each operator processes data arriving on its input streams and produces data on its output stream
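
As an illustration of a query expressed as a loop-free, directed graph of operators, here is a minimal Python sketch. The class `QueryDiagram`, its methods, and the `filter` and `double` operators are hypothetical names chosen for this example; they are not part of any SPE described in the slides.

```python
from collections import defaultdict, deque

class QueryDiagram:
    """A loop-free directed graph of operators (hypothetical API)."""

    def __init__(self):
        self.operators = {}                  # name -> function(item) -> list of output items
        self.downstream = defaultdict(list)  # name -> names of consuming operators

    def add_operator(self, name, fn):
        self.operators[name] = fn

    def connect(self, src, dst):
        # The caller is responsible for keeping the graph free of cycles.
        self.downstream[src].append(dst)

    def push(self, source, item):
        """Feed one item into `source` and return what reaches the sink operators."""
        results = []
        queue = deque([(source, item)])
        while queue:
            name, value = queue.popleft()
            outputs = self.operators[name](value)
            if not self.downstream[name]:
                results.extend(outputs)      # no consumers: this operator is a sink
            for out in outputs:
                for nxt in self.downstream[name]:
                    queue.append((nxt, out))
        return results

diagram = QueryDiagram()
diagram.add_operator("filter", lambda x: [x] if x > 10 else [])
diagram.add_operator("double", lambda x: [2 * x])
diagram.connect("filter", "double")

small = diagram.push("filter", 5)    # dropped by the filter
large = diagram.push("filter", 20)   # passes the filter, then is doubled
```

Each operator only sees items on its input and emits items on its output, which is what lets an engine place operators on different processing nodes.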

  3. These queries are called query diagrams • Every server runs an instance of the SPE and is called a processing node • When a node fails, the failure either causes the SPE to produce wrong results or blocks it from processing • The challenge is fault tolerance

  4. Fault-tolerance techniques • Replication: servers hold replicated state in memory until a server fails • The replicated state is updated frequently through a process called checkpointing, which ensures that backups hold the current state

  5. In these techniques, the SPE not only processes streams in memory but also keeps its checkpoints in memory • Because all of this happens in memory without going to disk, at least half of the cluster resources are dedicated to fault tolerance • Once state is replicated and kept up to date, the number of servers available for normal stream processing is cut at least in half, and so is the memory capacity • This is costly

  6. Recovery: with this approach, only one primary replica of each node processes streams • The primary node periodically takes snapshots of its state and sends them to the other nodes • On failure, one of these backups takes over as primary

  7. SGuard • Based on the rollback recovery technique • Each SPE node takes periodic checkpoints of its state and writes them to stable storage • To take a checkpoint, a node suspends its processing and makes a copy of its state • A distributed file system (DFS) is used as stable storage
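
The rollback recovery idea on this slide can be sketched in a few lines of Python. This is a toy under stated assumptions: a local temporary directory stands in for the DFS, and `CountingOperator`, `take_checkpoint`, and `recover` are hypothetical names, not SGuard's actual interfaces.

```python
import os
import pickle
import tempfile

class CountingOperator:
    """Toy stateful operator that counts items per key."""

    def __init__(self):
        self.counts = {}

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1

def take_checkpoint(op, path):
    # Processing is suspended while the state is copied (a synchronous checkpoint).
    snapshot = pickle.dumps(op.counts)
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(snapshot)
        f.flush()
        os.fsync(f.fileno())
    # Rename atomically so a crash never leaves a half-written checkpoint in place.
    os.replace(tmp_path, path)

def recover(path):
    # Roll back to the state captured by the last completed checkpoint.
    op = CountingOperator()
    with open(path, "rb") as f:
        op.counts = pickle.load(f)
    return op

storage = tempfile.mkdtemp()                 # assumption: stands in for the DFS
ckpt_path = os.path.join(storage, "operator.ckpt")

op = CountingOperator()
for key in ["a", "b", "a"]:
    op.process(key)
take_checkpoint(op, ckpt_path)
op.process("a")                              # processed after the checkpoint, lost on failure

restored = recover(ckpt_path)
```

Anything processed after the last checkpoint has to be replayed from upstream after recovery; the sketch leaves that part out.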

  8. Challenge in using disk • Saving checkpoints to disk frees memory, but disk has high write latency • To mitigate this, SGuard uses a DFS • Some important DFS properties • Support for large data files • Fault tolerance through replication

  9. SGuard System Architecture

  10. Three main points • Uses disks to store checkpoints, through the DFS • Uses the Peace scheduler • Uses Memory Management Middleware (MMM)

  11. SGuard Software Architecture

  12. Checkpoints are made asynchronous so that operators can continue processing during checkpoints • SGuard extends the DFS coordinator with a scheduler called Peace, which reduces the total time to write the state of a single HA unit while maintaining good overall resource utilization • HA units are groups of interconnected operators on a node

  13. SGuard introduces a new middleware layer that includes the MMM • It also includes ChkptMngr and IOService • ChkptMngr manages checkpoint and recovery operations; it checkpoints an HA unit in five steps

  14. It informs HAInput that a checkpoint is starting • It prepares the state of the HA unit • It writes the prepared state to the DFS • It informs the coordinator about the new checkpoint • It notifies the HAInput operator that the checkpoint is complete
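
The five steps above can be sketched as follows. Only the names ChkptMngr and HAInput come from the slides; every method name, the `FakeDFS` and `Coordinator` stand-ins, and the checkpoint naming scheme are assumptions made for illustration.

```python
import copy

class HAInput:
    """Stand-in for the HAInput operator; it only records notifications."""
    def __init__(self, log):
        self.log = log
    def checkpoint_starting(self):
        self.log.append("1: HAInput told checkpoint is starting")
    def checkpoint_completed(self):
        self.log.append("5: HAInput told checkpoint is completed")

class FakeDFS:
    """In-memory stand-in for the distributed file system (assumption)."""
    def __init__(self, log):
        self.log = log
        self.files = {}
    def write(self, name, data):
        self.files[name] = data
        self.log.append("3: prepared state written to DFS")

class Coordinator:
    """Stand-in for the DFS coordinator (assumption)."""
    def __init__(self, log):
        self.log = log
        self.checkpoints = []
    def register(self, name):
        self.checkpoints.append(name)
        self.log.append("4: coordinator informed of new checkpoint")

class ChkptMngr:
    def __init__(self, ha_input, dfs, coordinator, log):
        self.ha_input = ha_input
        self.dfs = dfs
        self.coordinator = coordinator
        self.log = log

    def checkpoint(self, unit_id, unit_state, epoch):
        self.ha_input.checkpoint_starting()           # step 1
        prepared = copy.deepcopy(unit_state)          # step 2: prepare the HA unit state
        self.log.append("2: HA unit state prepared")
        name = f"{unit_id}/ckpt-{epoch}"              # hypothetical naming scheme
        self.dfs.write(name, prepared)                # step 3
        self.coordinator.register(name)               # step 4
        self.ha_input.checkpoint_completed()          # step 5
        return name

log = []
dfs = FakeDFS(log)
coordinator = Coordinator(log)
manager = ChkptMngr(HAInput(log), dfs, coordinator, log)

state = {"window": [1, 2, 3]}
name = manager.checkpoint("unit-0", state, epoch=1)
state["window"].append(4)   # later processing does not alter the stored copy
```

The deep copy in step 2 is the simplest way to isolate the checkpoint from ongoing processing; it is a placeholder for whatever state preparation the real system performs.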

  15. MMM Memory Manager • To enable concurrent checkpoints, where the state of an operator is copied to disk while the operator continues executing, SGuard must control the memory of stream processing operators • The MMM partitions SPE memory into a collection of pages in which operator state is stored

  16. To checkpoint the state of an operator, its pages are copied to disk • When a checkpoint begins, the operator is briefly suspended and all of its pages are marked read-only • The operator then resumes execution, and the pages are written to disk in the background
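
A minimal, single-threaded sketch of this read-only marking with copy-on-write is shown below. `PagedState` and its methods are hypothetical, a plain dictionary stands in for the disk, and the background writer is simulated by calling `flush_page` explicitly rather than by a real thread.

```python
class PagedState:
    """Operator state split into fixed-size pages (hypothetical layout)."""

    def __init__(self, num_pages, page_size=4):
        self.pages = [bytearray(page_size) for _ in range(num_pages)]
        self.read_only = set()   # pages still waiting to be written to disk
        self.saved = {}          # page id -> copy taken just before the first write

    def begin_checkpoint(self):
        # The operator is briefly suspended here; every page becomes read-only.
        self.read_only = set(range(len(self.pages)))
        self.saved = {}

    def write(self, page_id, offset, value):
        # Copy-on-write: preserve the checkpoint-time image before modifying the page.
        if page_id in self.read_only and page_id not in self.saved:
            self.saved[page_id] = bytes(self.pages[page_id])
        self.pages[page_id][offset] = value

    def flush_page(self, page_id, disk):
        # Background step: write the checkpoint-time image, then lift the protection.
        image = self.saved.pop(page_id, None)
        if image is None:
            image = bytes(self.pages[page_id])
        disk[page_id] = image
        self.read_only.discard(page_id)

state = PagedState(num_pages=2)
state.write(0, 0, 1)
state.write(1, 0, 2)

state.begin_checkpoint()
state.write(0, 0, 9)          # the operator keeps running during the checkpoint

disk = {}                     # assumption: a dict stands in for the disk
for page_id in range(2):
    state.flush_page(page_id, disk)
```

The checkpoint on disk reflects the state at the moment the checkpoint began, even though page 0 was modified before it was flushed.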

  17. The MMM has two layers: the Page Manager (PM) and the Data Structure (DS) layer • The PM allocates, frees, and checkpoints pages • The DS layer implements data structure abstractions on top of the PM's page abstraction • The MMM also has a library of data structure wrappers on top of each page, called the Page Layout (PL) library

  18. Page Manager • Maintains the lists of free and allocated pages • Handles all requests to allocate and free pages • Maintains and exposes a page table that maps each PageId to a memory address
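
To make the page table concrete, here is a small sketch of a page manager. Only the terms Page Manager, page table, and PageId come from the slide. The class layout and method names are assumptions, and a memory address is modelled as an offset into a single `bytearray`.

```python
class PageManager:
    """Toy page manager: a free list plus a PageId -> offset page table."""

    def __init__(self, num_pages, page_size):
        self.page_size = page_size
        self.arena = bytearray(num_pages * page_size)   # backing memory
        self.free_offsets = [i * page_size for i in range(num_pages)]
        self.page_table = {}                            # PageId -> offset in the arena
        self.next_id = 0

    def allocate(self):
        if not self.free_offsets:
            raise MemoryError("no free pages")
        offset = self.free_offsets.pop(0)
        page_id = self.next_id
        self.next_id += 1
        self.page_table[page_id] = offset
        return page_id

    def free(self, page_id):
        # Return the page's memory to the free list and drop its table entry.
        offset = self.page_table.pop(page_id)
        self.free_offsets.append(offset)

    def view(self, page_id):
        # Resolve a PageId through the page table to a writable view of its memory.
        offset = self.page_table[page_id]
        return memoryview(self.arena)[offset:offset + self.page_size]

pm = PageManager(num_pages=2, page_size=8)
a = pm.allocate()
b = pm.allocate()
pm.view(a)[0] = 7
pm.free(a)
c = pm.allocate()   # reuses the memory that page `a` occupied
```

Because callers hold PageIds rather than raw addresses, the manager is free to move or copy a page and update only the table entry.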

  19. Page Layout Library • A wrapper for a page, with two main features • Provides a data structure abstraction on top of each page • Provides a level of indirection between the data structure and the underlying pages, enabling copy-on-write of pages during checkpoints

  20. DS Layer • Creates meaningful relationships between pages

  21. Peace Scheduler • Addresses the resource contention problem • Schedules writes in a manner that reduces the time to write each set of chunks while keeping the total time to complete all writes small • It does so by scheduling only as many concurrent writes as there are available resources, scheduling all writes from the same set close together, and selecting the destination of each write in a way that avoids resource contention

  22. Nodes submit write requests to the coordinator in the form of triples (w, r, k) • The algorithm finds the best destinations by solving a min-cost max-flow problem
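
The sketch below shows how destination selection can be cast as a min-cost max-flow problem. It is not the Peace algorithm itself. The slide does not define w, r, and k, so the sketch does not use them. The graph layout (source → chunk → server → sink), the server names, the free write slots, and the per-server costs are all assumptions made for illustration. The solver is a standard successive-shortest-paths implementation.

```python
from collections import deque

class MinCostFlow:
    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]   # edge: [to, capacity, cost, reverse index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t):
        flow = cost = 0
        while True:
            # Bellman-Ford style search for the cheapest augmenting path.
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                in_queue[u] = False
                for i, (v, cap, c, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, i)
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev[t] is None:
                return flow, cost             # no augmenting path left
            push, v = float("inf"), t
            while v != s:                     # find the bottleneck capacity
                u, i = prev[v]
                push = min(push, self.graph[u][i][1])
                v = u
            v = t
            while v != s:                     # apply the flow along the path
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= push
                self.graph[v][edge[3]][1] += push
                v = u
            flow += push
            cost += push * dist[t]

# Hypothetical instance: 3 chunks to place on 2 servers.
chunks = ["c0", "c1", "c2"]
servers = {"s0": {"slots": 1, "cost": 1}, "s1": {"slots": 2, "cost": 2}}

source, sink = 0, 1 + len(chunks) + len(servers)
chunk_node = {c: 1 + i for i, c in enumerate(chunks)}
server_node = {s: 1 + len(chunks) + i for i, s in enumerate(servers)}

mcf = MinCostFlow(sink + 1)
for c in chunks:
    mcf.add_edge(source, chunk_node[c], 1, 0)
    for s, info in servers.items():
        mcf.add_edge(chunk_node[c], server_node[s], 1, info["cost"])
for s, info in servers.items():
    mcf.add_edge(server_node[s], sink, info["slots"], 0)

flow, cost = mcf.solve(source, sink)

# A chunk is assigned to the server whose chunk -> server edge is saturated.
node_to_server = {n: s for s, n in server_node.items()}
assignment = {
    c: node_to_server[v]
    for c in chunks
    for v, cap, _, _ in mcf.graph[chunk_node[c]]
    if v in node_to_server and cap == 0
}
```

Capacities on the server-to-sink edges cap the number of concurrent writes per destination, which is one way to express "only as many concurrent writes as there are available resources" in flow terms.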

  23. Conclusion • SGuard improves the transparency of SPE checkpoints through the MMM, which enables efficient asynchronous checkpointing

  24. Thank you
