MapReduce : Simplified Data Processing on Large Clusters

Presentation Transcript


  1. MapReduce: Simplified Data Processing on Large Clusters 2009-21146 Lim JunSeok

  2. Contents 1. Introduction 2. Programming Model 3. Structure 4. Performance & Experience 5. Conclusion

  3. Introduction

  4. Introduction • What is MapReduce? • A simple and powerful interface that enables automatic parallelization and distribution of large-scale computations. • A programming model that • executes processing in a distributed manner • exploits a large set of commodity computers • for large data sets (> 1 TB) • with an underlying runtime system that • parallelizes the computation across large-scale clusters of machines • handles machine failures • schedules inter-machine communication to make efficient use of the network and disks

  5. Motivation • Want to process lots of data (> 1 TB) • E.g. • Raw data: crawled documents, Web request logs, … • Derived data: inverted indices, summaries of the number of pages, the set of most frequent queries in a given day • Want to parallelize across hundreds/thousands of CPUs • And we want to make this easy (Figures: The Digital Universe 2009-2020; Google data centers; distributed file system)

  6. Motivation • Application: Sifting through large amounts of data • Used for • Generating the Google search index • Clustering problems for Google News and Froogle products • Extraction of data used to produce reports of popular queries • Large scale graph computation • Large scale machine learning • … (Figures: Google Search; PageRank; machine learning)

  7. Motivation • Platform: clusters of inexpensive machines • Commodity computers (15,000 machines in 2003) • Scales to large clusters: thousands of machines • Data is distributed and replicated across the machines of the cluster • Recovers from machine failure • Hadoop, Google File System (Figures: Google File System; Hadoop)

  8. Programming Model

  9. MapReduce Programming Model • MapReduce framework • Partitioning function: hash(key) mod R (default) • Results in well-balanced partitions • The partitioning function and the number of partitions R can be specified by the user (Figure: Map → partitioning function → Reduce)
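
  A minimal Python sketch of the partitioning idea, assuming string keys (the function names default_partition and host_partition are mine, not the library's). The default rule is hash(key) mod R; the URL-by-host variant is the kind of user-supplied partitioner the paper mentions for keeping all pages of one host in the same output file.

      import zlib
      from urllib.parse import urlparse

      def default_partition(key: str, R: int) -> int:
          """Map an intermediate key to one of R reduce partitions."""
          # A stable hash (zlib.crc32) keeps results reproducible across
          # processes; Python's built-in hash() is randomized per run.
          return zlib.crc32(key.encode("utf-8")) % R

      def host_partition(url_key: str, R: int) -> int:
          """User-defined alternative: partition URLs by hostname so that
          all pages from the same host land in the same reduce partition."""
          return zlib.crc32(urlparse(url_key).netloc.encode("utf-8")) % R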

  10. MapReduce Programming Model • Map phase • Local computation • Process each record independently and locally • Reduce phase • Aggregate the filtered output (Figure: commodity computers run Map over local storage; Reduce aggregates into the result)

  11. Example: Word Counting • Input • File 1: "Hello World Bye SQL" • File 2: "Hello Map Bye Reduce" • Map procedure • File 1 → <Hello, 1> <World, 1> <Bye, 1> <SQL, 1> • File 2 → <Hello, 1> <Map, 1> <Bye, 1> <Reduce, 1> • Partitioning function • <Hello, {1,1}> <World, 1> <Map, 1> <Bye, {1,1}> <SQL, 1> <Reduce, 1> • Reduce procedure • <Hello, 2> <World, 1> <Map, 1> <Bye, 2> <SQL, 1> <Reduce, 1>
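
  A minimal Python sketch of the word-count example above (not the paper's C++ interface): map emits <word, 1> for every word in its input split, and reduce sums the counts collected for each word.

      def wc_map(filename, contents):
          """Map: emit <word, 1> for each word in the input split."""
          for word in contents.split():
              yield (word, 1)

      def wc_reduce(word, counts):
          """Reduce: sum all the partial counts emitted for this word."""
          yield (word, sum(counts))

  Running this by hand on the two example files gives <Hello, 2> <World, 1> <Bye, 2> <SQL, 1> <Map, 1> <Reduce, 1>, matching the slide.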

  12. Example: PageRank • PageRank review: • Link analysis algorithm • PR(p) = (1 - d)/|W| + d · Σ_{q ∈ M(p)} PR(q)/L(q) • W: the set of all Web pages • M(p): the set of pages that link to the page p • L(q): the total number of links going out of q • PR(p): the PageRank of page p • d: the damping factor (typically 0.85)

  13. Example: PageRank • Key ideas for MapReduce • The PageRank calculation only depends on the PageRank values of the previous iteration • The PageRank calculation for each Web page can therefore be processed in parallel • Algorithm: • Map: Provide each page’s PageRank ‘fragments’ to the pages it links to • Reduce: Sum up the PageRank fragments for each page
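
  A hedged Python sketch of one PageRank iteration in this style (the graph representation, the damping handling, and the function names are assumptions, not the slide's code): map distributes each page's rank over its out-links, and reduce sums the incoming fragments and applies the damping term from slide 12.

      D = 0.85          # damping factor (assumed)
      N = 4             # |W|, total number of pages (assumed, for the 4-page example)

      def pr_map(page, value):
          """value = (current_rank, out_links); emit a rank fragment per
          out-link and re-emit the link list so it survives the iteration."""
          rank, out_links = value
          yield (page, ("links", out_links))
          for target in out_links:
              yield (target, ("fragment", rank / len(out_links)))

      def pr_reduce(page, values):
          """Sum incoming fragments, apply damping, keep the link structure."""
          total, out_links = 0.0, []
          for kind, payload in values:
              if kind == "fragment":
                  total += payload
              else:
                  out_links = payload
          new_rank = (1 - D) / N + D * total
          yield (page, (new_rank, out_links))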

  14. Example: PageRank • Key ideas for MapReduce

  15. Example: PageRank • PageRank calculation with 4 pages

  16. Example: PageRank • Map phase: Provide each page’s PageRank ‘fragments’ to the links (Figures: PageRank fragment computation of pages 1 and 2)

  17. Example: PageRank • Map phase: Provide each page’s PageRank ‘fragments’ to the links (Figures: PageRank fragment computation of pages 3 and 4)

  18. Example: PageRank • Reduce phase: Sum up the PageRank fragments for each page
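
  To make the fragment-and-sum picture concrete, here is a worked single iteration for a hypothetical 4-page link structure (the actual graph in the slide figures is not recoverable here): suppose page 1 links to pages 2 and 3, page 2 links to page 3, page 3 links to page 1, and page 4 links to page 3, with all ranks initialized to 0.25. The map phase emits fragments 0.25/2 = 0.125 to pages 2 and 3 (from page 1), 0.25 to page 3 (from page 2), 0.25 to page 1 (from page 3), and 0.25 to page 3 (from page 4). The reduce phase then sums per page: 0.25 for page 1, 0.125 for page 2, 0.125 + 0.25 + 0.25 = 0.625 for page 3, and 0 for page 4; the damping term from the formula on slide 12 would then be applied to each sum.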

  19. Structure

  20. Execution Overview (1) Split the input files into M pieces of 16-64 MB per piece, then start many copies of the program (2) The master is special: the rest are workers that are assigned work by the master • M map tasks and R reduce tasks (3) Map phase • An assigned worker reads the corresponding input split • Parses the input data into key/value pairs • Produces intermediate key/value pairs

  21. Execution Overview (4) Buffered pairs are written to local disk, partitioned into R regions by the partitioning function • The locations are passed back to the master • The master forwards these locations to the reduce workers (5) Reduce phase 1: read and sort • Reduce workers read the intermediate data for their partition • Sort the intermediate key/value pairs to group the data by key

  22. Execution Overview (6) Reduce phase 2: the reduce function • Iterate over the sorted intermediate data and pass each key and its grouped values to the user’s reduce function • The output is appended to a final output file for this reduce partition (7) Return to user code • When all tasks complete, the master wakes up the user program • The MapReduce call returns to the user code
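
  A single-process Python sketch of the whole flow on slides 20-22 (split → map → partition → sort/group → reduce); it ignores distribution, local disks, and the master/worker protocol, and the helper name run_mapreduce is mine, not the library's.

      from collections import defaultdict

      def run_mapreduce(inputs, map_fn, reduce_fn, R, partition_fn):
          # (1)-(3) Map phase: turn each input pair into intermediate pairs,
          # (4) partitioned into R regions as they are produced.
          regions = [defaultdict(list) for _ in range(R)]
          for key, value in inputs:
              for ikey, ivalue in map_fn(key, value):
                  regions[partition_fn(ikey, R)][ikey].append(ivalue)

          # (5)-(6) Reduce phase: sort each region by key to group equal keys,
          # then feed every <key, list-of-values> group to the reduce function.
          outputs = []
          for region in regions:
              for ikey in sorted(region):
                  outputs.extend(reduce_fn(ikey, region[ikey]))
          # (7) Return to user code.
          return outputs

  With wc_map, wc_reduce, and default_partition from the earlier sketches, run_mapreduce([("File 1", "Hello World Bye SQL"), ("File 2", "Hello Map Bye Reduce")], wc_map, wc_reduce, R=2, partition_fn=default_partition) reproduces the word counts from slide 11.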

  23. Failure Tolerance • Worker failure: handled via re-execution • Failure detection: heartbeat • The master pings every worker periodically • Handling failure: re-execution • Map tasks: • Re-execute both completed and in-progress map tasks, since their output is stored on the local disks of the failed machine • Reset the state of these map tasks and re-schedule them • Reduce tasks: • Re-execute only in-progress reduce tasks • Completed reduce tasks do NOT need to be re-executed • Their results are stored in the global file system
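
  A hedged sketch of the bookkeeping described above (the timeout value, data structures, and method names are assumptions): the master records the last heartbeat from each worker and, when a worker is declared dead, resets its in-progress tasks, and its completed map tasks whose output lived on that worker's local disk, back to idle so they can be re-scheduled.

      import time

      HEARTBEAT_TIMEOUT = 60.0   # seconds without a ping before a worker is "dead" (assumed)

      class Master:
          def __init__(self):
              self.last_ping = {}   # worker id -> time of last heartbeat
              self.tasks = {}       # task id -> {"kind": "map" or "reduce",
                                    #   "state": "idle"/"in-progress"/"completed",
                                    #   "worker": worker id or None}

          def heartbeat(self, worker):
              self.last_ping[worker] = time.time()

          def check_failures(self):
              now = time.time()
              for worker, last in list(self.last_ping.items()):
                  if now - last > HEARTBEAT_TIMEOUT:
                      self.handle_worker_failure(worker)
                      del self.last_ping[worker]

          def handle_worker_failure(self, worker):
              for task in self.tasks.values():
                  if task["worker"] != worker:
                      continue
                  # In-progress map and reduce tasks are re-executed.
                  # Completed MAP tasks are also re-executed (their output is
                  # on the dead worker's local disk); completed REDUCE tasks
                  # are not (their output is in the global file system).
                  if task["state"] == "in-progress" or (
                      task["kind"] == "map" and task["state"] == "completed"
                  ):
                      task["state"], task["worker"] = "idle", None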

  24. Failure Tolerance • Master failure: • Job state is checkpointed to global file system • New master recovers and continues the tasks from checkpoint • Robust to large-scale worker failure: • Simply re-execute the tasks! • Simply make new masters! • E.g. • Lost 1600 of 1800 machines once, but finished fine.

  25. Locality • Network bandwidth is a relatively scarce resource • Input data is stored on the local disks of the machines • GFS divides each file into 64 MB blocks • Stores several copies of each block on different machines • Local computation: • The master takes into account the location of the input data’s replicas • A map task is scheduled on a machine whose local disk contains a replica of the corresponding input data • Failing that, the master schedules the map task near a replica • E.g., on a worker on the same network switch • Most input data is read locally and consumes no network bandwidth

  26. Task Granularity • Fine granularity tasks • Many more map tasks than machines • When a worker fails, the many map tasks it has completed can be spread out across all the other worker machines • Practical bounds on the size of M and R • O(M + R) scheduling decisions • O(M × R) state in memory • The constant factors for memory usage are small • The state is approximately one byte of data per map/reduce task pair
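
  For a sense of scale (illustrative numbers, on the order of those reported in the paper): with M = 200,000 map tasks and R = 5,000 reduce tasks, the master makes on the order of M + R ≈ 205,000 scheduling decisions, and at roughly one byte per map/reduce task pair the O(M × R) bookkeeping is about 200,000 × 5,000 = 10^9 bytes, i.e. around 1 GB, which fits comfortably in the master's memory.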

  27. Backup Tasks • Slow workers significantly lengthen completion time • Other jobs consuming resources on the machine • Bad disks with soft errors • Data transfers very slowly • Weird things • Processor caches disabled • Solution: near the end of a phase, spawn backup copies of the remaining in-progress tasks • Whichever copy finishes first wins • As a result, job completion time is dramatically shortened • E.g., 44% longer to complete if the backup task mechanism is disabled
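
  A toy sketch of the "whichever finishes first wins" idea using Python threads (in the real system the backup attempt runs on a different machine, and only tasks still in progress near the end of a phase get backups; run_with_backup is an illustrative name, not part of any library):

      from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

      def run_with_backup(task_fn, *args):
          """Launch a primary and a backup attempt of the same task and
          return the result of whichever attempt completes first."""
          pool = ThreadPoolExecutor(max_workers=2)
          attempts = [pool.submit(task_fn, *args) for _ in range(2)]
          done, _pending = wait(attempts, return_when=FIRST_COMPLETED)
          result = next(iter(done)).result()
          # Abandon the straggler; cancel_futures needs Python 3.9+.
          pool.shutdown(wait=False, cancel_futures=True)
          return result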

  28. Performance & Experience

  29. Performance • Experiment setting • 1,800 machines • 4 GB of memory • Dual-processor 2 GHz Xeons with Hyper-Threading • Dual 160 GB IDE disks • Gigabit Ethernet per machine • Approximately 100-200 Gbps of aggregate bandwidth

  30. Performance • MR_Grep: grep task with MapReduce • Grep: search for a relatively rare three-character pattern through 1 terabyte of data • The data transfer rate hits zero about 80 seconds into the computation • Computation peaks at over 30 GB/s when 1,764 workers are assigned • The locality optimization helps • Without it, the rack switches would limit the rate to 10 GB/s (Figure: data transfer rate over time)
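
  A hedged sketch of the grep job in the same Python style (the concrete pattern and key choice are placeholders): map emits a record if it contains the pattern, and reduce is the identity, simply copying the matches to the output.

      import re

      PATTERN = re.compile(r"xyz")   # placeholder three-character pattern

      def grep_map(offset, record):
          """Map: emit the record if it contains the pattern."""
          if PATTERN.search(record):
              yield (offset, record)

      def grep_reduce(offset, records):
          """Reduce: identity; just copy the matching records through."""
          for r in records:
              yield r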

  31. Performance • MR_Sort: sorting task with MapReduce • Sort: sort 1 terabyte of 100-byte records • Takes about 14 min • The input rate is higher than the shuffle rate and the output rate, thanks to locality • The shuffle rate is higher than the output rate • The output phase writes two copies for reliability
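
  And a hedged sketch of the sort job: map extracts a sort key from each record (assumed here to be its first 10 bytes) and reduce is the identity; the ordering itself comes from the framework's sorted, range-partitioned intermediate data rather than from user code.

      def sort_map(offset, record):
          """Map: emit <key, record>; the key is assumed to be the first
          10 bytes of each 100-byte record."""
          yield (record[:10], record)

      def sort_reduce(key, records):
          """Reduce: identity; records arrive grouped and sorted by key."""
          for r in records:
              yield r

      # The global order across output files comes from using a range-based
      # partitioning function (built from a sample of the keys) instead of
      # hash(key) mod R, so file i holds only keys smaller than those in file i+1.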

  32. Performance • MR_Sort: backup tasks and failure tolerance • Backup tasks reduce job completion time significantly • The system deals well with failures

  33. Experience • Large-scale indexing • MapReduce is used for the Google Web search service • As a result, • The indexing code is simpler, smaller, and easier to understand • Performance is good enough • Locality makes it easy to change the indexing process • A few months → a few days • MapReduce takes care of failures and slow machines • Easy to make indexing faster by adding more machines

  34. Experience • The number of MapReduce instances grows significantly over time • 2003/02: first version • 2004/09: almost 900 • 2006/03: about 4,000 • 2007/01: over 6,000 (Figure: MapReduce instances over time)

  35. Experience • New MapReduce Programs Per Month • The number of new MapReduce programs increases continuously

  36. Experience • MapReduce statistics for different months

  37. Conclusion

  38. Is every task suitable for MapReduce? • NOT every task is suitable for MapReduce: • NOT suitable if… • Suitable if…

  39. Is it a trend? Really? • Job market trend: • ‘World says 'No' to NoSQL’ – written by IBM (2011.9, BNT RackSwitchG8264) • Compared to SQL, MapReduce is • Much harder to learn • Unable to solve every problem • E.g., Fibonacci: each term depends on the previous ones, so the computation is inherently sequential • Mainstream enterprises don’t need it • They already have engineers skilled in other languages (Figure: percentage of matching job postings – SQL: 4%, MapReduce: 0……%)

  40. Conclusion • Focus on the problem: • let the library deal with the messy details • Automatic parallelization and distribution • MapReduce has proven to be a useful abstraction • MapReduce simplifies large-scale computations at Google • The functional programming paradigm can be applied to large-scale applications

  41. EOD
