MAP-REDUCE

Presentation Transcript


  1. MAP-REDUCE SIDDHARTH MEHTA PURSUING MASTERS IN COMPUTER SCIENCE (FALL 2008) INTERESTS: SYSTEMS, WEB

  2. What is MapReduce? • A programming model and an associated implementation (library) for processing and generating large data sets on large clusters. • A new abstraction that lets programmers express these simple computations while the library hides the messy details of parallelization, fault tolerance, data distribution, and load balancing.
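
The model is small enough to sketch directly. Below is a minimal, purely illustrative Python sketch (not part of the original slides and not Google's library) of the model run sequentially: apply the user's map function to every input pair, group the intermediate values by key, then apply the user's reduce function to each group.

    from collections import defaultdict

    def run_mapreduce(inputs, map_fn, reduce_fn):
        # Map phase: apply the user map function to every (key, value) input pair.
        intermediate = defaultdict(list)
        for key, value in inputs:
            for out_key, out_value in map_fn(key, value):
                intermediate[out_key].append(out_value)
        # Group-by-key + reduce phase: feed each key's value list to the reducer.
        results = []
        for out_key in sorted(intermediate):
            results.extend(reduce_fn(out_key, intermediate[out_key]))
        return results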

  3. Motivation • Large-Scale Data Processing • Want to use 1000s of CPUs • But don’t want hassle of managing things • MapReduce provides • Automatic parallelization & distribution • Fault tolerance • I/O scheduling • Monitoring & status updates

  4. MAP-REDUCE AT GOOGLE • The MapReduce programming model has been used successfully at Google for many different purposes. • First, the model is easy to use, even for programmers without experience with parallel and distributed systems, since it hides the details of parallelization, fault tolerance, locality optimization, and load balancing. • Second, a large variety of problems are easily expressible as MapReduce computations. For example, MapReduce is used to generate data for Google's production web search service, for sorting, for data mining, for machine learning, and for many other systems. • Third, Google developed an implementation of MapReduce that scales to clusters comprising thousands of machines. The implementation makes efficient use of these machine resources and is therefore suitable for many of the large computational problems encountered at Google.

  5. Count, Illustrated • map(key=url, val=contents): for each word w in contents, emit (w, "1") • reduce(key=word, values=uniq_counts): sum all the "1"s in the values list and emit (word, sum) • Input: "see bob throw", "see spot run" • Map output: (see,1) (bob,1) (throw,1) (see,1) (spot,1) (run,1) • Reduce output: (bob,1) (run,1) (see,2) (spot,1) (throw,1)
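
A hedged Python sketch of this word-count example; the grouping loop below stands in for the library's shuffle phase, and the document names are made up for illustration.

    from collections import defaultdict

    def wc_map(url, contents):
        # For each word w in contents, emit (w, 1).
        for w in contents.split():
            yield (w, 1)

    def wc_reduce(word, counts):
        # Sum all the 1s emitted for this word.
        yield (word, sum(counts))

    # Stand-in for the library's shuffle: group map output by word, then reduce.
    groups = defaultdict(list)
    for url, text in [("doc1", "see bob throw"), ("doc2", "see spot run")]:
        for word, one in wc_map(url, text):
            groups[word].append(one)
    print(sorted(pair for w, vs in groups.items() for pair in wc_reduce(w, vs)))
    # [('bob', 1), ('run', 1), ('see', 2), ('spot', 1), ('throw', 1)]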

  6. More Examples • Distributed grep: • Map: (key, whole doc / a line) → (the matched line, key) • Reduce: identity function • Count of URL Access Frequency: • Map: logs of web page requests → (URL, 1) • Reduce: (URL, total count) • Reverse Web-Link Graph: • Map: (source, target) → (target, source) • Reduce: (target, list(source)) → (target, list(source)) • Inverted Index: • Map: (docID, document) → (word, docID) • Reduce: (word, list(docID)) → (word, sorted list(docID))
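
As one concrete instance, here is an illustrative Python sketch of the inverted-index pair from the list above (function and variable names are my own); the reverse web-link graph map/reduce has the same shape.

    from collections import defaultdict

    def index_map(doc_id, document):
        # Emit (word, docID) for every distinct word in the document.
        for word in set(document.split()):
            yield (word, doc_id)

    def index_reduce(word, doc_ids):
        # Emit (word, sorted list of docIDs).
        yield (word, sorted(doc_ids))

    docs = [(1, "map reduce at google"), (2, "google file system")]
    groups = defaultdict(list)
    for doc_id, text in docs:
        for word, d in index_map(doc_id, text):
            groups[word].append(d)
    index = dict(p for w, ids in groups.items() for p in index_reduce(w, ids))
    # index["google"] == [1, 2]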

  7. Execution

  8. Parallel Execution

  9. Implementation • Runs on Google clusters built from top-of-the-line PCs: • 2 × 2 GHz Intel Xeon, Hyper-Threading • 2-4 GB memory • 100 Mbps-1 Gbps network links • Local IDE disks + the Google File System (GFS) • Jobs are submitted to a scheduling system

  10. (Execution diagram: input split across M map tasks, whose partitioned output feeds R reduce tasks)

  11. Fault Tolerance • Fault tolerance in a word: redo • The master pings workers and re-schedules the tasks of failed workers. • Note: completed map tasks are also re-executed on failure, because their output is stored on the failed machine's local disk and becomes inaccessible (completed reduce output is already in the global file system). • Master failure: redo • Semantics in the presence of failures: • Deterministic map/reduce functions: produce the same output as would have been produced by a non-faulting sequential execution of the entire program • Atomic commits of map and reduce task outputs are relied on to achieve this property.
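
A simplified sketch of the master-side re-scheduling logic, assuming a toy task table; this illustrates the idea only and is not Google's implementation.

    import time

    PING_TIMEOUT = 60  # seconds of silence before a worker is presumed dead (illustrative)

    def handle_worker_failures(workers, tasks, now=None):
        # Master side: mark silent workers failed and make their map tasks idle
        # again so they can be re-executed on another machine.  Completed map
        # tasks must also be redone because their output lives on the failed
        # worker's local disk; completed reduce output is already in GFS.
        now = time.time() if now is None else now
        for w in workers:
            if w["alive"] and now - w["last_ping"] > PING_TIMEOUT:
                w["alive"] = False
                for t in tasks:
                    if t["worker"] == w["id"] and t["kind"] == "map":
                        t["state"], t["worker"] = "idle", None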

  12. Refinements • Partitioning • Ordering guarantees • Combiner function • Side effects • Skipping bad records • Local execution • Status information • Counters

  13. Backup tasks • Straggler: a machine that takes an unusually long time to complete one of the last few map or reduce tasks in the computation. • Cause: a bad disk, … • Resolution: schedule backup executions of the remaining in-progress tasks near the end of the MapReduce operation; a task is marked complete when either the primary or the backup execution finishes.
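
A minimal sketch of the backup-task heuristic under the same toy-task-table assumption as the earlier sketch; the 95% threshold is an arbitrary illustrative choice.

    def schedule_backup_tasks(tasks, idle_workers, nearly_done=0.95):
        # When almost all tasks are done, launch duplicate (backup) executions
        # of the remaining in-progress tasks; whichever copy finishes first wins.
        done = sum(t["state"] == "done" for t in tasks)
        if done < nearly_done * len(tasks):
            return
        for t in tasks:
            if t["state"] == "in_progress" and t.get("backup") is None and idle_workers:
                t["backup"] = idle_workers.pop()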

  14. Partitioning • The output of each map task is partitioned into R pieces • Default partitioning function: hash(key) mod R • User-provided, e.g. hash(Hostname(url)) mod R • (Diagram: each of the M map outputs contributes one piece to each of the R partitions)
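
An illustrative Python sketch of the two partitioning functions named above, using CRC32 as a stand-in for whatever stable hash the real library uses.

    from urllib.parse import urlparse
    from zlib import crc32   # stable across processes, unlike Python's randomized built-in hash()

    def default_partition(key, R):
        # Default partitioning function: hash(key) mod R.
        return crc32(key.encode()) % R

    def host_partition(url, R):
        # User-provided variant: hash(Hostname(url)) mod R, so every URL from the
        # same host lands in the same partition (and hence the same output file).
        return crc32(urlparse(url).hostname.encode()) % R

    assert host_partition("http://example.com/a", 8) == host_partition("http://example.com/b", 8)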

  15. Ordering guarantees • Guarantee: within a given partition, the intermediate key/value pairs are processed in increasing key order. • MapReduce implementation of distributed sort: • Map: (key, value) → (key for sort, value) • Reduce: emit unchanged.

  16. Combiner function • Example: in word count, the map output contains many <the, 1> pairs • Combine them once before the reduce task to save network bandwidth • Executed on the machine performing the map task • Typically the same code as the reduce function • Output is written to an intermediate file
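
A small sketch of the combiner idea applied to word count; in the real library the combiner is a separate user-supplied function run over the map task's buffered output, which this sketch collapses into one step for brevity.

    from collections import Counter

    def map_word_count_with_combiner(contents):
        # Run the map function, then combine (partially reduce) on the same
        # machine: many ("the", 1) pairs collapse into a single ("the", n)
        # before anything is written to the intermediate file or shipped.
        pairs = ((w, 1) for w in contents.split())
        combined = Counter()
        for word, one in pairs:
            combined[word] += one
        return list(combined.items())

    # map_word_count_with_combiner("the cat and the dog and the bird")
    # -> [('the', 3), ('cat', 1), ('and', 2), ('dog', 1), ('bird', 1)]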

  17. Skipping Bad Records • Bugs in user code (or bad records) can make tasks crash deterministically on certain records; sometimes it is acceptable to ignore those records • An optional mode of execution • The worker installs a signal handler to catch segmentation violations and bus errors, reports the offending record to the master, and the master tells subsequent re-executions to skip it.
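
A simplified sketch of the skipping protocol, using Python exceptions in place of the signal handlers mentioned on the slide and plain sets/dicts in place of the master's bookkeeping.

    def run_map_attempt(records, user_map, skip, failures):
        # `skip` holds sequence numbers the master has told us to ignore;
        # `failures` counts crashes per record across attempts.
        output = []
        for seq, record in enumerate(records):
            if seq in skip:
                continue                      # skip a record seen to fail before
            try:
                output.extend(user_map(record))
            except Exception:                 # stand-in for the SIGSEGV/SIGBUS handler
                failures[seq] = failures.get(seq, 0) + 1
                if failures[seq] > 1:         # failed on more than one attempt
                    skip.add(seq)
                raise                         # this attempt fails; master re-schedules it
        return output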

  18. Status Information • The master runs an internal HTTP server and exports a set of status pages • They monitor the progress of the computation: how many tasks have been completed, how many are in progress, bytes of input, bytes of intermediate data, bytes of output, processing rates, etc. The pages also contain links to the standard error and standard output files generated by each task. • In addition, the top-level status page shows which workers have failed, and which map and reduce tasks they were processing when they failed.
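
A toy sketch of such a status endpoint using Python's standard http.server; the real master's pages are far richer, and the fields shown here are placeholders to be filled in by a scheduling loop.

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    STATUS = {"maps_done": 0, "maps_in_progress": 0,
              "reduces_done": 0, "input_bytes": 0}   # updated by the master loop

    class StatusPage(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(STATUS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    # HTTPServer(("", 8080), StatusPage).serve_forever()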

  19. Counters

  20. Performance • Tests on grep and sort • Cluster characteristics: • 1800 machines (!) • 2 × 2 GHz Intel Xeon, Hyper-Threading • 2-4 GB memory • 100 Mbps-1 Gbps network links • Local IDE disks + the Google File System (GFS)

  21. Grep • 1 terabyte: 10^10 100-byte records • Rare three-character pattern (10^5 freq.) • Split input into 64 MB pieces, M=15000 • R=1 (output is small)

  22. Grep

  23. Grep: Observations • Peak at 30 GB/s (1764 workers) • 1 minute startup time • Propagation of program to workers • GFS: open 1000 input files • Locality optimization • Completed in <1.5 minutes

  24. Sort • 1 terabyte: 10^10 100-byte records • Extract a 10-byte sorting key from each record • Map: emit <10-byte key, 100-byte record> • Reduce: identity • 2-way replication of the output • For redundancy, typical in GFS • M=15000, R=4000 • May need a pre-pass MapReduce to compute the distribution of keys
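
An illustrative sketch of this sort job's map and reduce functions; the partitioning function (built from the sampled key distribution) and the per-partition ordering guarantee do the actual sorting.

    def sort_map(_, record):
        # Extract the 10-byte sorting key and emit <key, original 100-byte record>.
        yield (record[:10], record)

    def sort_reduce(key, records):
        # Identity reduce: within each partition the library already delivers
        # keys in increasing order, so the output files are sorted.
        for r in records:
            yield (key, r)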

  25. Sort: Observations • Input rate is lower than for grep • Two humps in the shuffle rate: R=4000 reduce tasks, but only ~1700 run at a time, so they proceed in two waves (2 × 1700 ≈ 4000) • Final output is delayed because of the in-task sorting • Input rate > shuffle and output rates (locality optimization) • Shuffle rate > output rate (the output writes 2 copies) • Effect of backup tasks • Effect of machine failures

  26. Sort

  27. CONCLUSIONS • Restricting the programming model makes it easy to parallelize and distribute computations and to make such computations fault-tolerant. • Network bandwidth is a scarce resource. A number of optimizations in the system are therefore targeted at reducing the amount of data sent across the network: the locality optimization allows data to be read from local disks, and writing a single copy of the intermediate data to local disk saves network bandwidth. • Redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss.
