
Real Time Processing With Storm

Learn about real-time processing with Apache Storm, its components, fault tolerance, parallelism, and popular use cases like social media feeds and payment transactions.

wilbertm

Presentation Transcript


  1. Real Time Processing With Storm • Mahender Immadi, Software Engineer @ Cerner, www.linkedin.com/in/mahenderimmadi/ • Thirupathi Guduru, Software Engineer @ Cerner, www.linkedin.com/in/thirupathireddyguduru/

  2. Batch vs. Real-Time Processing • Batch processing - Data is gathered and processed as a group at one time - Jobs run to completion - Data might be out of date • Real-time processing - Data is processed as it is being entered - Jobs run forever
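The contrast on this slide can be sketched in a few lines of plain Python. This is only an illustration and does not use Storm: the function names `batch_word_count` and `streaming_word_count` and the sample data are assumptions made for the example. The batch version returns once, after all input has been gathered; the streaming version produces an updated result for every line as it arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def batch_word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Batch: process the whole data set as one group and return once."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)


def streaming_word_count(lines: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Real-time: update the counts per line and yield a fresh snapshot each time."""
    counts: Counter = Counter()
    for line in lines:
        counts.update(line.split())
        yield dict(counts)


# Hypothetical sample input; a real stream would be unbounded.
data = ["storm is fast", "storm is scalable"]

batch_result = batch_word_count(data)
snapshots = list(streaming_word_count(data))
```

With a finite input the last streaming snapshot equals the batch result; the difference is that intermediate results were available before the input ended.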

  3. Real Time Use Cases • Social Media Feeds • Network Sensors • App/Web Logs • Stock Tick Data • Weather Data • Auctions • Payment Transactions

  4. Storm Introduction • Created by Nathan Marz @ BackType • Open sourced on 19th September, 2011

  5. Storm Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.

  6. Storm Is • Stream Processing • Fast • Scalable • Fault Tolerant • Reliable

  7. Storm Components • Tuple • Stream • Spout • Bolt • Topology

  8. Tuple

  9. Streams

  10. Spouts

  11. Bolts

  12. Topologies
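How the five components fit together can be sketched with plain Python generators. This is a toy model, not the Storm API: the names `sentence_spout`, `split_bolt`, `count_bolt` and `run_topology` are hypothetical, a tuple is modelled as an ordinary Python tuple, and a stream as an iterator of tuples. A real spout would emit forever and the bolts would run as parallel tasks on a cluster.

```python
from typing import Dict, Iterator, List, Tuple


def sentence_spout() -> Iterator[Tuple[str]]:
    """Spout: the source of a stream. Emits one-field tuples (sentence,)."""
    for sentence in ["storm is fast", "storm is reliable"]:
        yield (sentence,)


def split_bolt(stream: Iterator[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes tuples and emits new ones, here one (word,) per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream: Iterator[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keeps a running count and emits (word, count) for every input."""
    counts: Dict[str, int] = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])


def run_topology() -> List[Tuple[str, int]]:
    """Topology: the wiring of spout -> split bolt -> count bolt."""
    return list(count_bolt(split_bolt(sentence_spout())))


result = run_topology()
```

Each stage only sees the stream produced by the stage before it, which is the property that lets a real topology spread its spouts and bolts over many workers.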

  13. Reliable Processing

  14. Reliable Processing

  15. Stream Grouping • A grouping decides which task of the subscribing bolt each tuple is sent to. • Possible groupings: - Shuffle - Fields - All - Global - None - Direct - Local or Shuffle
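Four of the groupings listed above can be sketched as functions that map a tuple to the task indexes that should receive it. This is a simplified model, not Storm's implementation: the function names are hypothetical, the shuffle grouping is modelled as a plain random pick, and `zlib.crc32` stands in for whatever hash Storm uses internally.

```python
import random
import zlib
from typing import List, Sequence


def shuffle_grouping(tup: Sequence, num_tasks: int, rng: random.Random) -> List[int]:
    """Shuffle: send the tuple to one randomly chosen task."""
    return [rng.randrange(num_tasks)]


def fields_grouping(tup: Sequence, num_tasks: int, field_indexes: Sequence[int]) -> List[int]:
    """Fields: tuples with equal values in the chosen fields go to the same task."""
    key = "|".join(str(tup[i]) for i in field_indexes)
    # crc32 is an assumption chosen for a deterministic, portable hash.
    return [zlib.crc32(key.encode("utf-8")) % num_tasks]


def all_grouping(tup: Sequence, num_tasks: int) -> List[int]:
    """All: replicate the tuple to every task of the bolt."""
    return list(range(num_tasks))


def global_grouping(tup: Sequence, num_tasks: int) -> List[int]:
    """Global: the whole stream goes to a single task (the lowest index here)."""
    return [0]


rng = random.Random(0)
shuffle_target = shuffle_grouping(("storm", 1), 4, rng)
first = fields_grouping(("storm", 1), 4, [0])
second = fields_grouping(("storm", 99), 4, [0])
```

Here `first` and `second` name the same task because both tuples share the value of field 0, which is what makes a fields grouping suitable for per-key state such as word counts.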

  16. Storm Cluster View

  17. Fault Tolerance

  18. Fault Tolerance

  19. Fault Tolerance

  20. Fault Tolerance

  21. Fault Tolerance

  22. Parallelism

  23. Parallelism

  24. Companies & Projects Using Storm

  25. References • https://storm.incubator.apache.org/ • http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_user-guide/content/ch_storm-using.html • Books: - Getting Started with Storm, by Jonathan Leibiusky, Gabriel Eisbruch, Dario Simonassi - Storm Blueprints: Patterns for Distributed Real-time Computation, by P. Taylor Goetz, Brian O'Neill

  26. Demo

  27. Q & A
