This presentation gives an overview of the Apache Flink project. It explains Flink in terms of its architecture, use cases and the manner in which it works.

Links for further information and connecting:
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
What Is Apache Flink?
● A stream processing framework
● Open source, Apache 2.0 license
● Written in Java and Scala
● For batch and stream processing
● For high volume, low latency
● Develop in Java, Scala, Python, SQL
● Automatic compilation/optimization into data flows
How Does Flink Work?
● Processes unbounded and bounded data
● Uses file systems to consume and persistently store data:
– local, Hadoop-compatible, Amazon S3, MapR FS, OpenStack Swift FS, Aliyun OSS and Azure Blob Storage
● Leverages in-memory performance
● Provides a rich function set for handling streams, state and time when building applications
● Provides layered APIs that balance
– conciseness and expressiveness
– see next slide
How Does Flink Work?
[Figure: Flink layered APIs]
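The "streams, state and time" building blocks named above can be illustrated without Flink itself. The following is a minimal sketch in plain Python, not Flink API code: it groups events into tumbling windows by event time and keeps a count per key and window as keyed state. The window size, the event tuple layout and the function names are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_MS = 10_000  # hypothetical tumbling window size: 10 seconds


def window_start(timestamp_ms, size_ms=WINDOW_MS):
    """Return the start of the tumbling window that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % size_ms)


def windowed_counts(events, size_ms=WINDOW_MS):
    """Yield (key, window_start, running_count) after each event.

    `events` is any iterable of (key, timestamp_ms) pairs. A finite list
    stands in for bounded data; a generator that never ends stands in for
    an unbounded stream. Windows are assigned from the event's own
    timestamp (event time), not from its arrival time.
    """
    state = defaultdict(int)  # keyed state: (key, window start) -> count
    for key, timestamp_ms in events:
        slot = (key, window_start(timestamp_ms, size_ms))
        state[slot] += 1
        yield key, slot[1], state[slot]


# Hypothetical sample data: (sensor id, event timestamp in milliseconds)
events = [
    ("sensor-a", 1_000),
    ("sensor-b", 2_500),
    ("sensor-a", 9_999),
    ("sensor-a", 10_000),
    ("sensor-b", 12_000),
]
updates = list(windowed_counts(events))
```

Because the function is a generator, the same code handles a bounded list and an unbounded source. A real Flink job would additionally persist the state and use watermarks to decide when a window is complete, which this sketch leaves out.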
Flink APIs
● SQL & Table API
● DataStream API
● ProcessFunctions (event processing)
● Flink also has libraries for common data processing
– Complex Event Processing (CEP)
– DataSet API
– Gelly - library for scalable graph processing/analysis
Flink Deployment
● Deploy Flink using the following cluster managers
– YARN
– Mesos
– Kubernetes
– Standalone
● All application control communication is via REST calls
● Deploy at any scale
– multiple trillions of events per day
– multiple terabytes of state
– thousands of cores
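Since application control goes through REST calls, a cluster can be inspected with any HTTP client. The sketch below uses only the Python standard library. It assumes the JobManager REST endpoint is reachable at a base URL such as http://localhost:8081 and that the `/jobs/overview` resource returns a JSON object with a `jobs` list whose entries carry a `state` field. Both should be checked against the REST API documentation for the Flink version in use. The sample payload is hand-written for illustration, not captured from a real cluster.

```python
import json
from collections import Counter
from urllib.parse import urljoin
from urllib.request import urlopen


def jobs_overview_url(base_url):
    """Build the URL of the jobs overview resource from a base URL."""
    return urljoin(base_url.rstrip("/") + "/", "jobs/overview")


def fetch_jobs_overview(base_url, timeout=5):
    """Fetch and decode the jobs overview (needs a reachable cluster)."""
    with urlopen(jobs_overview_url(base_url), timeout=timeout) as response:
        return json.load(response)


def count_jobs_by_state(overview):
    """Count jobs per state, e.g. how many are RUNNING or FINISHED."""
    return Counter(job["state"] for job in overview.get("jobs", []))


# Hand-written sample payload (assumed shape, illustrative values only)
SAMPLE_OVERVIEW = {
    "jobs": [
        {"jid": "a1", "name": "continuous-etl", "state": "RUNNING"},
        {"jid": "b2", "name": "search-index", "state": "RUNNING"},
        {"jid": "c3", "name": "nightly-batch", "state": "FINISHED"},
    ]
}
state_counts = count_jobs_by_state(SAMPLE_OVERVIEW)
```

Against a live cluster, `count_jobs_by_state(fetch_jobs_overview("http://localhost:8081"))` would give the same kind of summary, with the host and port adjusted to the deployment.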
Flink Stateful Functions
● Simplifies building distributed stateful applications
● Provides a runtime built for serverless architectures
● Key benefits
– Dynamic messaging
– Consistent state
– Multi-language support
– No database required
– Cloud native
– "Stateless" operation
Flink Use Cases
● Event-driven applications, e.g.
– Fraud detection
– Anomaly detection
● Data analytics applications, e.g.
– Quality monitoring of telco networks
– Analysis of product updates and experiment evaluation in mobile applications
● Data pipeline applications, e.g.
– Real-time search index building in e-commerce
– Continuous ETL in e-commerce
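To make the event-driven fraud detection use case concrete, here is a minimal plain-Python sketch of one possible rule: raise an alert when an account makes a very small transaction that is immediately followed by a large one within a short time. This is not Flink API code, and the thresholds, the time limit and the tuple layout are hypothetical values chosen for illustration. It also assumes transactions arrive in timestamp order per account.

```python
SMALL_AMOUNT = 1.00    # hypothetical threshold for a "test" transaction
LARGE_AMOUNT = 500.00  # hypothetical threshold for a suspicious amount
MAX_GAP_MS = 60_000    # hypothetical limit: large must follow within 1 minute


def detect_fraud(transactions):
    """Yield (account_id, timestamp_ms) for each suspicious transaction.

    `transactions` is an iterable of (account_id, timestamp_ms, amount).
    Per-account state holds the timestamp of a small transaction only
    until that account's next transaction is seen.
    """
    last_small = {}  # keyed state: account id -> timestamp of small transaction
    for account_id, timestamp_ms, amount in transactions:
        # Read and clear the flag: only the very next transaction counts.
        small_ts = last_small.pop(account_id, None)
        if (
            small_ts is not None
            and amount > LARGE_AMOUNT
            and timestamp_ms - small_ts <= MAX_GAP_MS
        ):
            yield account_id, timestamp_ms
        if amount < SMALL_AMOUNT:
            last_small[account_id] = timestamp_ms


# Hypothetical sample data
transactions = [
    ("acct-1", 0, 0.50),
    ("acct-2", 1_000, 0.20),
    ("acct-1", 30_000, 900.00),   # large, 30 s after a small one: alert
    ("acct-2", 120_000, 700.00),  # large, but the gap is too long: no alert
    ("acct-3", 5_000, 800.00),    # large with no preceding small one: no alert
]
alerts = list(detect_fraud(transactions))
```

The dictionary plays the role that keyed state plays in a stream processor: each account's history is tracked independently, so the rule scales by partitioning the stream on the account id.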
Available Books
● See "Big Data Made Easy"
– Apress Jan 2015
● See "Mastering Apache Spark"
– Packt Oct 2015
● See "Complete Guide to Open Source Big Data Stack"
– Apress Jan 2018
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology-based issues
– Big data integration