MapReduce Implementation for Distributed Computing

Lecture 3 – MapReduce: Implementation CSE 490h – Introduction to Distributed Computing, Spring 2009 Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License.

Last Class • Input Handling • Map Function • Partition Function • Compare Function • Reduce Function • Output Writer

map (Functional Programming) Creates a new list by applying f to each element of the input list; returns output in order. map f lst: ('a->'b) -> ('a list) -> ('b list)
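A minimal sketch of the same idea in Python (the signature above is in ML notation). The function `square` and the sample list are illustrative assumptions, not part of the lecture.

```python
# Illustrative only: `square` and `lst` are made-up examples.
def square(x: int) -> int:
    return x * x


lst = [1, 2, 3, 4]

# map applies the function to each element independently and keeps input order.
# Because no element depends on another, the calls could run in parallel.
squares = list(map(square, lst))
print(squares)  # [1, 4, 9, 16]
```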

Fold Moves across a list, applying f to each element plus an accumulator. f returns the next accumulator value, which is combined with the next element of the list. fold f x0 lst: ('a*'b->'b)->'b->('a list)->'b
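A short Python sketch of fold, using `functools.reduce` as a stand-in. Note one difference from the ML signature above: Python passes the accumulator as the first argument and the element as the second. The name `add_to_acc` and the sample values are illustrative assumptions.

```python
from functools import reduce


def add_to_acc(acc: int, x: int) -> int:
    # Returns the next accumulator value from the current accumulator and element.
    return acc + x


lst = [1, 2, 3, 4]
x0 = 0  # initial accumulator

# Evaluates ((((0 + 1) + 2) + 3) + 4), moving left to right across the list.
total = reduce(add_to_acc, lst, x0)
print(total)  # 10
```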

Advantages of MapReduce • Flexible for a wide range of problems • Fault tolerant • Scalable

Overview • Hardware • Task assignment • Failure • Non-Determinism • Optimizations

Commodity Hardware • Cheap Hardware • 2 – 4 GB memory • 100 megabit/sec network links • x86 processors running Linux • Cheap Hardware + Lots of It = Failure!

Master vs Worker • Users submit jobs into scheduling system • Implement map and reduce • Specify M map tasks and R reducers • Many copies of program started • One task is the master • Master assigns map/reduce tasks to idle workers

Map Tasks • Input broken up into 16MB - 64MB chunks • M map tasks processed in parallel

Reduce Tasks • R reduce tasks • Assigned by partitioning function • Typically: hash(key) mod R • Sometimes useful to customize
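A hedged sketch of the default and a customized partitioning function. It uses `zlib.crc32` as the hash because Python's built-in `hash` on strings varies between processes; that choice, and the function names `partition` and `partition_by_host`, are assumptions made for illustration. The custom variant sends all URLs from one host to the same reduce task.

```python
import zlib
from urllib.parse import urlparse


def partition(key: str, R: int) -> int:
    """Default scheme: hash(key) mod R, giving a reduce task index in [0, R)."""
    return zlib.crc32(key.encode("utf-8")) % R


def partition_by_host(url: str, R: int) -> int:
    """Custom scheme: hash only the hostname so one host maps to one reducer."""
    host = urlparse(url).netloc
    return partition(host, R)


R = 8
print(partition("apple", R))
print(partition_by_host("http://example.com/a", R))
print(partition_by_host("http://example.com/b", R))  # same index as the line above
```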

Master Data Structures • For each map / reduce task, store state and identity of machine • State: Idle, In-Progress, Complete • For each complete map task, store locations of output (R locations)
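One possible way to model the master's bookkeeping, as a sketch only. The class and field names (`TaskState`, `TaskInfo`, `output_locations`), the worker name, and the file paths are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TaskState(Enum):
    IDLE = "idle"
    IN_PROGRESS = "in-progress"
    COMPLETE = "complete"


@dataclass
class TaskInfo:
    kind: str                          # "map" or "reduce"
    state: TaskState = TaskState.IDLE
    worker: Optional[str] = None       # identity of the assigned machine, if any
    # For a completed map task: one location per reduce partition (R entries).
    output_locations: List[str] = field(default_factory=list)


def new_master_table(M: int, R: int) -> List[TaskInfo]:
    """Create M map entries followed by R reduce entries, all initially idle."""
    return [TaskInfo("map") for _ in range(M)] + [TaskInfo("reduce") for _ in range(R)]


table = new_master_table(M=3, R=2)

# Record completion of map task 0 (hypothetical worker name and paths).
table[0].state = TaskState.COMPLETE
table[0].worker = "worker-17"
table[0].output_locations = [f"worker-17:/tmp/map0-part{r}" for r in range(2)]
```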

Worker with Map Tasks • Parses input data into key/value pairs • Applies map • Buffered pairs written to disk, partitioned into R regions • Locations of output eventually passed to master

Worker with Reduce Tasks • Read data from map machines via RPC • Sorts data • Applies reduce • Output appended to final output file

After Reduce • When all complete, master wakes up user program • Output available in R output files, with names specified by user
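The map, partition, sort, and reduce steps on the preceding slides can be sketched in a single process. This is a toy simulation, not a distributed implementation: word count is an assumed example job, `crc32` is an assumed hash, and in-memory lists stand in for the on-disk regions and the R output files.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(doc_name: str, text: str) -> Iterator[Tuple[str, int]]:
    # Emit (word, 1) for every word in the document.
    for word in text.split():
        yield word.lower(), 1


def reduce_fn(key: str, values: Iterable[int]) -> int:
    return sum(values)


def partition(key: str, R: int) -> int:
    return zlib.crc32(key.encode("utf-8")) % R


def run_job(inputs: Dict[str, str], R: int) -> List[Dict[str, int]]:
    # Map phase: every emitted pair is placed into one of R regions.
    regions: List[List[Tuple[str, int]]] = [[] for _ in range(R)]
    for name, text in inputs.items():
        for key, value in map_fn(name, text):
            regions[partition(key, R)].append((key, value))

    # Reduce phase: each reduce task sorts its region, groups by key, applies reduce.
    outputs: List[Dict[str, int]] = []
    for region in regions:
        region.sort(key=lambda kv: kv[0])
        grouped: Dict[str, List[int]] = defaultdict(list)
        for key, value in region:
            grouped[key].append(value)
        outputs.append({k: reduce_fn(k, vs) for k, vs in grouped.items()})
    return outputs  # one dict per reduce task, standing in for R output files


outputs = run_job({"a.txt": "the cat the dog", "b.txt": "The end"}, R=3)
merged = {k: v for part in outputs for k, v in part.items()}
print(merged["the"])  # 3
```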

How do you pick M and R • How many scheduling decisions? • O(M+R) • How much state in memory by master? • O(M*R) • M: much larger than number of machines • R: small multiple of number of machines
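A quick arithmetic sketch of the two costs above. The values of M and R are sample numbers chosen for illustration, and `bytes_per_pair` is an assumed constant, not a measured one.

```python
def scheduling_decisions(M: int, R: int) -> int:
    # The master assigns each map task and each reduce task once: O(M + R).
    return M + R


def master_state_entries(M: int, R: int) -> int:
    # Each map task records an output location per reduce partition: O(M * R).
    return M * R


M, R = 200_000, 5_000   # illustrative sample values
bytes_per_pair = 1      # assumption: a small constant per (map, reduce) pair

decisions = scheduling_decisions(M, R)
entries = master_state_entries(M, R)
approx_memory_gb = entries * bytes_per_pair / 1e9

print(decisions)         # 205000
print(entries)           # 1000000000
print(approx_memory_gb)  # 1.0
```

The product term is why R is kept to a small multiple of the number of machines while M can be much larger.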

Failures & Issues • Worker Failure • Master Failure • Stragglers • Crashes, Etc

Worker Failure • Master pings worker • No response -> assumes failed • Failed map tasks • Completed & In-Progress tasks set to idle • Failed reduce tasks • In-Progress tasks set to idle
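A sketch of the reset rule above. Completed map tasks are redone because their output sits on the failed machine's local disk, while completed reduce output has already been written to the shared output files. The dictionary layout and the function name `handle_worker_failure` are hypothetical.

```python
from typing import Dict, List, Optional

Task = Dict[str, Optional[str]]


def handle_worker_failure(tasks: List[Task], failed_worker: str) -> None:
    """Reset tasks owned by the failed worker so they can be rescheduled."""
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        redo_map = task["kind"] == "map" and task["state"] in ("in-progress", "complete")
        redo_reduce = task["kind"] == "reduce" and task["state"] == "in-progress"
        if redo_map or redo_reduce:
            task["state"] = "idle"
            task["worker"] = None


tasks: List[Task] = [
    {"kind": "map", "state": "complete", "worker": "w1"},
    {"kind": "map", "state": "in-progress", "worker": "w1"},
    {"kind": "reduce", "state": "complete", "worker": "w1"},
    {"kind": "reduce", "state": "in-progress", "worker": "w1"},
    {"kind": "map", "state": "complete", "worker": "w2"},
]
handle_worker_failure(tasks, "w1")
states = [t["state"] for t in tasks]
print(states)  # ['idle', 'idle', 'complete', 'idle', 'complete']
```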

Master Failure • You could write checkpoints • In practice: just let the user deal with it

Stragglers (Causes) • Why? • Bad disk with frequent correctable errors • Too many other tasks scheduled on the machine • Processor caches disabled

Stragglers (Solutions) • Re-schedule remaining tasks when operation is close to completion • A task is complete when either primary or secondary task is complete
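A small sketch of the backup-task idea from the master's point of view. The 95% threshold, the class name `CompletionTracker`, and the worker labels are assumptions for illustration; the slides do not specify when "close to completion" begins.

```python
from typing import Dict


def should_schedule_backups(completed: int, total: int, threshold: float = 0.95) -> bool:
    # Assumed policy: start backups once most tasks have finished.
    return total > 0 and completed / total >= threshold


class CompletionTracker:
    """Accepts the first completion report for a task and ignores later ones."""

    def __init__(self) -> None:
        self.finished_by: Dict[int, str] = {}

    def report_done(self, task_id: int, worker: str) -> bool:
        if task_id in self.finished_by:
            return False  # the other copy (primary or backup) already finished
        self.finished_by[task_id] = worker
        return True


tracker = CompletionTracker()
first = tracker.report_done(7, "backup-worker")    # backup copy finishes first
second = tracker.report_done(7, "primary-worker")  # straggling primary is ignored
print(first, second)  # True False
```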

Crashes, Etc • Causes: • Bad Records • Bug in Third Party Code • Solution: Skip over errors?
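A simplified sketch of skipping bad records. Here the skipping happens locally inside one worker; a real system would more likely coordinate it through the master, so treat this as an assumption-laden illustration. The names `map_skipping_bad_records` and `parse_number` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, int]


def map_skipping_bad_records(
    records: Iterable[str], map_fn: Callable[[str], List[Pair]]
) -> Tuple[List[Pair], List[int]]:
    """Apply map_fn to each record, recording the index of any record that fails."""
    outputs: List[Pair] = []
    skipped: List[int] = []
    for index, record in enumerate(records):
        try:
            outputs.extend(map_fn(record))
        except Exception:
            skipped.append(index)  # remember which record was bad, then move on
    return outputs, skipped


def parse_number(record: str) -> List[Pair]:
    # Raises ValueError on malformed input, standing in for a crash in user code.
    return [("n", int(record))]


outputs, skipped = map_skipping_bad_records(["1", "2", "oops", "4"], parse_number)
print(outputs)  # [('n', 1), ('n', 2), ('n', 4)]
print(skipped)  # [2]
```

Skipping trades completeness for progress, which is why the slide poses it as a question: it is acceptable for large statistical jobs but not where every record matters.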

Non-Determinism • Deterministic = distributed implementation produces same result as sequential execution • Non-Deterministic = map or reduce are non-deterministic

Non-Determinism • Guarantee: output for a specific reduce task is equivalent to some sequential operation • But: output from different reduce tasks may correspond to different sequential operations

Non-Determinism • There may be no sequential operation that matches the full output • Why? • Because R1 and R2 may have read outputs from different executions of M
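A toy illustration of this scenario. The map task M is executed twice (for example, after a failure), and its output depends on a `choice` argument that stands in for a random draw. All names and values are illustrative assumptions.

```python
from typing import Dict


def nondeterministic_map(choice: int) -> Dict[str, int]:
    # Stand-in for a map function whose output varies between executions.
    # Key "a" goes to reduce task R1, key "b" goes to reduce task R2.
    return {"a": choice, "b": choice}


exec1 = nondeterministic_map(1)  # first execution of M
exec2 = nondeterministic_map(2)  # re-execution of M

# Every sequential run uses exactly one execution of M.
sequential_outcomes = [exec1, exec2]

# In the distributed run, R1 read the first execution and R2 read the second.
distributed = {"a": exec1["a"], "b": exec2["b"]}

print(distributed)                         # {'a': 1, 'b': 2}
print(distributed in sequential_outcomes)  # False
```

Each reduce task's output is individually consistent with some sequential run, but the combined output matches none of them.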

Advanced Stuff • Input Types • Combiner Function • Counters

Input Types • May need to change how input is read • Implement reader interface

Combiner • “Combiner” functions can run on same machine as a mapper • Causes a mini-reduce phase to occur before the real reduce phase, to save bandwidth Under what conditions is it sound to use a combiner?

Combiner Function • Can only be used if the reduce function is commutative and associative • Commutative: a + b + c = b + c + a • Associative: (a × b) × c = a × (b × c)
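A short check of why the condition matters, using made-up mapper outputs. Summation is commutative and associative, so combining per mapper gives the same answer; a naive mean of means is not, so the same shortcut gives a wrong result.

```python
from typing import List

# Illustrative values emitted for one key by three different mappers.
mapper_outputs: List[List[int]] = [[1, 2, 3], [4], [5, 6]]
all_values = [v for out in mapper_outputs for v in out]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Sum: applying the reduce function early on each mapper is safe.
no_combiner = sum(all_values)
with_combiner = sum(sum(out) for out in mapper_outputs)
print(no_combiner, with_combiner)  # 21 21

# Mean: averaging per-mapper averages weights the mappers unequally.
true_mean = mean(all_values)
combined_mean = mean([mean(out) for out in mapper_outputs])
print(true_mean)                   # 3.5
print(combined_mean == true_mean)  # False
```

A common workaround for the mean is to have the combiner emit (sum, count) pairs, since adding pairs component-wise is commutative and associative.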

Counters • Global Counter • Master handles the issue of duplicate executions • Useful for sanity checking or debugging
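One way the master could avoid double counting, sketched under the assumption that each task reports its counters together with a task identifier. The class `CounterAggregator` and the counter name `words` are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


class CounterAggregator:
    """Adds each task's counters to the global totals at most once."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()
        self.counted_tasks: Set[str] = set()

    def report(self, task_id: str, counts: Dict[str, int]) -> None:
        if task_id in self.counted_tasks:
            return  # duplicate execution (backup task or re-run): ignore
        self.counted_tasks.add(task_id)
        self.totals.update(counts)  # Counter.update adds to existing counts


agg = CounterAggregator()
agg.report("map-0", {"words": 10})
agg.report("map-0", {"words": 10})  # same task executed twice
agg.report("map-1", {"words": 5})
print(agg.totals["words"])  # 15
```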

Discussion Questions • 1. Give an example of a MapReduce problem not listed in the reading. In your example, what are the map and reduce functions (including inputs and outputs)? • 2. What part of the MapReduce implementation do you find most interesting? Why? • 3. Give an example of a distributable problem that should not be solved with MapReduce. What are the limitations of MapReduce that make it ill-suited for your task?

Discussion Questions • 1. Assuming you had a corpus of webpages as input such that the key for each mapper is the URL and the value is the text of the page, how would you design a mapper and a reducer to construct an inverse graph of the web - that is, for each URL output the list of web pages that point to it? • 2. TF–IDF is a statistical value assigned to words in a document corpus that indicates the relative importance of the word. As part of computing it, the Inverse Document Frequency of a word is found from the number of documents in the corpus divided by the number of documents containing the word. Given a corpus of documents, and given that you know how many documents are in the corpus, how would you use MapReduce to find this quantity for every word in the corpus simultaneously?
