MapReduce Simplified Data Processing on Large Clusters
Outline • Motivation • MapReduce execution overview • Lifecycle of MapReduce operation • Optimization techniques • Pros and Cons • Conclusion
Motivation • Provide a programming model • For processing large data sets • Exploits large sets of commodity computers and high-speed Ethernet interconnects • Executes processes in a distributed manner • Handles errors and provides reliability • Above all: • Simple, and maybe suitable for otherwise impossible tasks!
MapReduce Operation • Map: • Accepts an input key/value pair • Emits intermediate key/value pairs • Writes them to local disk • [Diagram: Very big data → Map → Partitioning Function → Reduce → Result]
MapReduce Operation • Partitioning Function: • Partitions the intermediate data • Emits each intermediate key/value pair into one of R partitions • Default partition function: hash(key) mod R
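The default rule hash(key) mod R can be sketched in a few lines. This is only an illustration: `default_partition` and `partition_pairs` are hypothetical names, and `zlib.crc32` is an assumed stand-in for the hash function, chosen because Python's built-in `hash()` for strings is salted per process and would not agree across workers.

```python
import zlib


def default_partition(key: str, num_reduce_tasks: int) -> int:
    """Return the reduce partition for a key: hash(key) mod R."""
    # crc32 is an assumption for this sketch; any hash that is stable
    # across machines would serve the same purpose.
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks


def partition_pairs(pairs, num_reduce_tasks):
    """Split intermediate (key, value) pairs into R buckets."""
    buckets = [[] for _ in range(num_reduce_tasks)]
    for key, value in pairs:
        buckets[default_partition(key, num_reduce_tasks)].append((key, value))
    return buckets
```

Because the partition depends only on the key, every pair with the same key lands in the same bucket, which is what lets a single reduce task see all values for that key.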
MapReduce Operation • Reduce: • Reads intermediate key/value pairs from the map workers through RPC • Sorts and groups values with the same key • Emits output key/value pairs • No reduce can begin until map is complete
Example: map phase • Inputs are split across map tasks (M=4); each map task writes its output to partitions (intermediate files) (R=2) • Inputs: • “When in the course of human events it …” • “Over the past five years, the authors and many…” • “It was the best of times and the worst of times…” • “This paper evaluates the suitability of the …” • Map outputs (large-word partition / small-word partition): • (when,1) (course,1) (human,1) (events,1) (best,1) … / (in,1) (the,1) (of,1) (it,1) (it,1) (was,1) (the,1) (of,1) … • (over,1) (past,1) (five,1) (years,1) (authors,1) (many,1) … / (the,1) (the,1) (and,1) … • (this,1) (paper,1) (evaluates,1) (suitability,1) … / (the,1) (of,1) (the,1) … • Note: the partition function places small words in one partition and large words in another.
Example: reduce phase • Partition (intermediate files) (R=2), input to one reduce task: • (in,1) (the,1) (of,1) (it,1) (it,1) (was,1) (the,1) (of,1) … • (the,1) (the,1) (and,1) … • (the,1) (of,1) (the,1) … • Sort (run-time function): • (and,(1)) (in,(1)) (it,(1,1)) (the,(1,1,1,1,1,1)) (of,(1,1,1)) (was,(1)) • Reduce (user’s function): • (and,1) (in,1) (it,2) (of,3) (the,6) (was,1) • Note: only one of the two reduce tasks is shown
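The word-count example can be replayed as a small single-process sketch. The function names (`map_words`, `sort_and_group`, `reduce_counts`) are made up for illustration, and the input list is the small-word partition from the example with its elided "…" entries left out.

```python
from itertools import groupby
from operator import itemgetter


def map_words(text):
    """User map function: emit (word, 1) for every word in the input."""
    return [(word.lower(), 1) for word in text.split()]


def sort_and_group(pairs):
    """Run-time step: sort by key, then collect the values of each key."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [value for _, value in group])
            for key, group in groupby(ordered, key=itemgetter(0))]


def reduce_counts(key, values):
    """User reduce function: sum the counts collected for one word."""
    return (key, sum(values))


# Pairs visible in the small-word partition of the example.
small_word_partition = [
    ("in", 1), ("the", 1), ("of", 1), ("it", 1), ("it", 1), ("was", 1),
    ("the", 1), ("of", 1),
    ("the", 1), ("the", 1), ("and", 1),
    ("the", 1), ("of", 1), ("the", 1),
]

grouped = sort_and_group(small_word_partition)
result = [reduce_counts(key, values) for key, values in grouped]
print(result)
# [('and', 1), ('in', 1), ('it', 2), ('of', 3), ('the', 6), ('was', 1)]
```

The printed counts match the reduce output shown in the example; in a real run the sort and group step belongs to the framework, and only the map and reduce functions are written by the user.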
MapReduce Operation • Setup Phase • What should the user do?
One-time Setup • User to-do list: • Indicate: • Input/output files • M: number of map tasks • R: number of reduce tasks • W: number of machines • Write user-defined map and reduce functions • Submit the job • This requires no knowledge of parallel and distributed systems
Master • Propagates intermediate file information (location + size) from map tasks to reduce tasks • For each map task and reduce task, the master stores its state (one of 3: idle, in-progress, or completed) • O(MR) states in memory • Termination condition • All tasks are in the “completed” state.
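The master's bookkeeping described above might look roughly like the following. This is a sketch under assumptions, not the actual data structure: the class and method names are invented, and file locations are plain strings.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class TaskState(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in-progress"
    COMPLETED = "completed"


@dataclass
class Task:
    state: TaskState = TaskState.IDLE
    worker: Optional[str] = None
    # Map tasks only: reduce partition -> (file location, size in bytes).
    outputs: Dict[int, Tuple[str, int]] = field(default_factory=dict)


class Master:
    """Hypothetical master holding one state record per task."""

    def __init__(self, num_map: int, num_reduce: int):
        self.map_tasks = [Task() for _ in range(num_map)]
        self.reduce_tasks = [Task() for _ in range(num_reduce)]

    def start(self, task: Task, worker: str) -> None:
        task.state, task.worker = TaskState.IN_PROGRESS, worker

    def complete_map(self, index: int, outputs: Dict[int, Tuple[str, int]]) -> None:
        task = self.map_tasks[index]
        task.state, task.outputs = TaskState.COMPLETED, dict(outputs)

    def complete_reduce(self, index: int) -> None:
        self.reduce_tasks[index].state = TaskState.COMPLETED

    def locations_for(self, partition: int):
        """Intermediate files a reduce task must fetch for its partition."""
        return [t.outputs[partition] for t in self.map_tasks
                if t.state is TaskState.COMPLETED and partition in t.outputs]

    def job_done(self) -> bool:
        """Termination condition: every task is in the completed state."""
        return all(t.state is TaskState.COMPLETED
                   for t in self.map_tasks + self.reduce_tasks)
```

Each of the M map tasks can record up to R output locations, which is where the O(MR) memory footprint comes from.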
Failures • Failure detection mechanism • Master pings workers periodically • Map worker failure • Map tasks completed or in-progress at the worker are reset to idle (completed map output lives on the failed worker's local disk) • Reduce workers are notified when a task is rescheduled on another worker • Reduce worker failure • Only in-progress tasks are reset to idle • Master failure • The MapReduce task is aborted and the client is notified • The master checkpoints its data structures
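The worker-failure rules above can be written as one small function. This is a minimal sketch: tasks are plain dictionaries with assumed keys (`id`, `kind`, `state`, `worker`), and ping handling is left out.

```python
def handle_worker_failure(tasks, failed_worker):
    """Reset the tasks that must run again after a worker stops responding.

    Returns the ids of the tasks that were put back into the idle state.
    """
    rescheduled = []
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        # Map tasks are redone even when completed; reduce tasks only
        # when they were still in progress.
        lost_map = task["kind"] == "map" and task["state"] in ("in-progress", "completed")
        lost_reduce = task["kind"] == "reduce" and task["state"] == "in-progress"
        if lost_map or lost_reduce:
            task["state"] = "idle"
            task["worker"] = None
            rescheduled.append(task["id"])
    return rescheduled


tasks = [
    {"id": "m0", "kind": "map", "state": "completed", "worker": "w1"},
    {"id": "m1", "kind": "map", "state": "in-progress", "worker": "w2"},
    {"id": "r0", "kind": "reduce", "state": "completed", "worker": "w1"},
    {"id": "r1", "kind": "reduce", "state": "in-progress", "worker": "w1"},
]
print(handle_worker_failure(tasks, "w1"))  # ['m0', 'r1']
```

The completed reduce task `r0` is left alone, while the completed map task `m0` is rescheduled, mirroring the asymmetry in the bullets above.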
Fault Tolerance • Input file blocks are stored on the DFS • On errors, workers send a “last gasp” UDP packet to the master • The master notices when particular input key/values cause crashes in map() and skips those values on re-execution • Useful for working around bugs in third-party libraries
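Record skipping can be simulated in a single process as follows. This is a sketch under stated assumptions: the "last gasp" message is modeled as a returned record index rather than a UDP packet, and the `max_failures` threshold is a made-up parameter, since the slide does not say how many crashes trigger a skip.

```python
def run_map_task(records, map_fn, skip):
    """Run map_fn over the records; on a crash, report the offending index."""
    output = []
    for index, record in enumerate(records):
        if index in skip:
            continue
        try:
            output.extend(map_fn(record))
        except Exception:
            return None, index  # stands in for the "last gasp" report
    return output, None


def run_with_skipping(records, map_fn, max_failures=2):
    """Re-execute the task, skipping records that keep crashing map_fn."""
    failures = {}  # record index -> number of crashes seen by the master
    skip = set()
    while True:
        output, bad_index = run_map_task(records, map_fn, skip)
        if bad_index is None:
            return output, skip
        failures[bad_index] = failures.get(bad_index, 0) + 1
        if failures[bad_index] >= max_failures:
            skip.add(bad_index)


def count_words(record):
    return [(word, 1) for word in record.split()]


output, skipped = run_with_skipping(["a b", None, "c"], count_words)
print(output, skipped)  # [('a', 1), ('b', 1), ('c', 1)] {1}
```

Here the `None` record makes `count_words` raise twice, after which it is skipped and the task completes on the third attempt.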
Load Balancing • The numbers of map and reduce tasks are much larger than the number of worker machines • When a worker fails, the many tasks assigned to it can be spread out across all the other workers • Backup Tasks • A slow-running task (straggler) prolongs overall execution
Backup Task • Stragglers are often caused by circumstances local to the worker on which the straggler task is running • Overload on the worker machine due to the scheduler • Frequent recoverable disk errors • Schedule backup (replacement) tasks on idle workers • First to complete “wins” • Can significantly reduce the time to complete large MapReduce operations
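The effect of backup tasks on completion time can be illustrated with a toy calculation. The durations, the `backup_after` threshold, and the `backup_duration` value below are invented numbers for illustration, not measurements.

```python
def job_time(durations, backup_after=None, backup_duration=None):
    """Return when the job finishes: the finish time of its slowest task.

    If backup_after is set, any task still running at that time gets a
    backup copy that takes backup_duration; whichever copy finishes
    first "wins".
    """
    finish_times = []
    for duration in durations:
        if backup_after is not None and duration > backup_after:
            finish_times.append(min(duration, backup_after + backup_duration))
        else:
            finish_times.append(duration)
    return max(finish_times)


durations = [10, 11, 9, 60]  # the last task is a straggler
print(job_time(durations))                                       # 60
print(job_time(durations, backup_after=12, backup_duration=10))  # 22
```

Without backups the whole job waits 60 time units for the straggler; with a backup launched at time 12, the job finishes at time 22 in this made-up scenario.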
Implementations • Google • Not available outside Google • Hadoop • An open-source implementation in Java • Uses HDFS for stable storage • Aster Data • Cluster-optimized SQL Database that also implements MapReduce
MapReduce Benefits • Ease of use, “out of the box” experience • Not available outside Google • Portable regardless of computer architecture • Better in a homogeneous network • Scalability • 4,000 nodes in practice • Fault tolerance • Only failed tasks are re-executed • Load balancing • Master I/O scheduling
Negative Points • Does not perform well enough on structured data sets • A DBMS is better in this case • Handles only linearly separable input data • The “central” master is a single point of failure • Locality could be better addressed • Why not from computation to data?