( ** Apache Spark Training - https://www.edureka.co/apache-spark-s... ** )
This Edureka tutorial on MapReduce vs Spark will help you understand the differences between MapReduce and Spark by comparing them on the following parameters:

1. Current Market Situation
2. Hadoop MapReduce vs Apache Spark
   a. Performance
   b. Ease of Use
   c. Cost
   d. Data Processing
   e. Security
   f. Fault Tolerance
3. Real-time example of MapReduce
4. Real-time example of Spark

Check our complete Apache Spark and Scala playlist here: https://goo.gl/ViRJ2K
www.edureka.co/big-data-and-hadoop www.edureka.co/apache-spark-scala-training Hadoop Certification Training Spark Certification Training
Parameters to Compare: Performance, Cost, Fault Tolerance, Ease of Use, Security, Data Processing
Current Market Situation
[Chart comparing 2016 and 2017; labels shown: 47%+ (2017) and 14%+ (2017)]
Performance
Hadoop MapReduce: Moves data through disk and network.
Apache Spark: Performs better because data is cached in memory.
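To make the disk-versus-memory point concrete, here is a minimal pure-Python sketch. It does not use Hadoop or Spark; the function names (`iterate_from_disk`, `iterate_cached`, `compare`) and the temporary file of numbers are assumptions made for illustration. It only counts how often a multi-pass job touches the disk when each pass re-reads its input compared with reading once and reusing an in-memory copy.

```python
import os
import tempfile


def load_numbers(path, counter):
    """Read one integer per line from disk and record that a disk read happened."""
    counter["disk_reads"] += 1
    with open(path) as f:
        return [int(line) for line in f]


def iterate_from_disk(path, passes, counter):
    """Disk-bound style: every pass reads its input from disk again."""
    total = 0
    for _ in range(passes):
        total += sum(load_numbers(path, counter))
    return total


def iterate_cached(path, passes, counter):
    """Cached style: read once, keep the data in memory, reuse it on every pass."""
    data = load_numbers(path, counter)
    total = 0
    for _ in range(passes):
        total += sum(data)
    return total


def compare(passes=5):
    """Run both styles on the same file; return both totals and both read counts."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("\n".join(str(n) for n in range(1, 101)))
        path = f.name
    try:
        disk_counter = {"disk_reads": 0}
        cache_counter = {"disk_reads": 0}
        disk_total = iterate_from_disk(path, passes, disk_counter)
        cache_total = iterate_cached(path, passes, cache_counter)
    finally:
        os.remove(path)
    return (disk_total, cache_total,
            disk_counter["disk_reads"], cache_counter["disk_reads"])
```

Both styles compute the same result; the difference is that the cached version touches the disk once regardless of how many passes are made, which is why iterative workloads tend to benefit most from in-memory caching.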
Ease of Use
Hadoop MapReduce: Uses Java APIs and doesn't support real-time processing.
Apache Spark: Uses rich APIs and supports an interactive mode in real time.
Cost
Hadoop MapReduce: Comparatively less costly because it relies on hard disk storage.
Apache Spark: More costly because it needs large amounts of RAM.
Data Processing
Hadoop MapReduce: Batch processing.
Apache Spark: Real-time as well as batch processing.
Security
Hadoop MapReduce: More secure; supports security features such as Knox Gateway.
Apache Spark: Less secure; authentication is via a shared secret.
Fault Tolerance
Hadoop MapReduce: Uses replication for fault tolerance.
Apache Spark: Uses RDDs and other storage models.
Real Time Use Case of MapReduce
Copyright © 2018, edureka and/or its affiliates. All rights reserved.
ETL & Data Analytics
[Diagram: Data center 1, Data center 2; Extract, Transform, Load]
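The extract, transform, load flow on this slide can be sketched in the MapReduce style as a map step, a shuffle (group by key), and a reduce step. The following is a single-process Python illustration, not Hadoop code; the `region,amount` record format, the function names, and the `warehouse` dictionary standing in for the target store are all assumptions made for the example.

```python
from collections import defaultdict


def extract(raw_lines):
    """Extract: parse 'region,amount' lines, skipping malformed ones."""
    for line in raw_lines:
        parts = line.strip().split(",")
        if len(parts) != 2:
            continue
        try:
            amount = float(parts[1])
        except ValueError:
            continue
        yield parts[0].strip().lower(), amount


def map_phase(records):
    """Map: emit one (key, value) pair per record."""
    for region, amount in records:
        yield region, amount


def shuffle(pairs):
    """Shuffle: group all values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into one total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_etl(raw_lines, warehouse):
    """Run extract, then transform (map, shuffle, reduce), then load into `warehouse`."""
    totals = reduce_phase(shuffle(map_phase(extract(raw_lines))))
    warehouse.update(totals)  # Load step: write the aggregated totals to the target
    return warehouse
```

In a real cluster the map and reduce functions run on many nodes and the shuffle moves data across the network; here everything runs in one process so the three phases are easy to follow.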
Real Time Use Case of Apache Spark
Credit Card Fraud Detection
[Diagram: Input data, Spark Streaming, Batches of input data, Spark Engine, Batches of processed data]
[Diagram components: Credit card data, Data Ingestion, HDFS & HBase Storage, Spark Streaming, Analytic Interface]
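The slide shows Spark Streaming turning input data into batches that the Spark engine processes one at a time. A rough pure-Python sketch of that micro-batch idea follows. It is not the Spark Streaming API: the batches here are cut by record count (Spark Streaming cuts them by time interval), and the transaction fields `card` and `amount` and the fixed-limit fraud rule are hypothetical stand-ins for a real detection model.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Split an iterable of events into consecutive batches of at most `batch_size`."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def flag_suspicious(batch, limit):
    """Hypothetical rule: flag any transaction whose amount exceeds `limit`."""
    return [tx for tx in batch if tx["amount"] > limit]


def process_stream(stream, batch_size=3, limit=1000.0):
    """Process each batch independently and collect the flagged card IDs per batch."""
    results = []
    for batch in micro_batches(stream, batch_size):
        flagged = flag_suspicious(batch, limit)
        results.append([tx["card"] for tx in flagged])
    return results
```

Each inner list in the result corresponds to one processed batch, mirroring the "batches of processed data" stage in the diagram.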