A Model of Computation for MapReduce. Karloff, Suri and Vassilvitskii (SODA ’10). Presented by Ning Xie. Why MapReduce. Tera- and petabyte-scale data sets (search engines, internet traffic, bioinformatics, etc.). Need parallel computing. Requirement: easy to program, reliable, distributed.
A Model of Computation for MapReduce Karloff, Suri and Vassilvitskii (SODA’10) Presented by Ning Xie
Why MapReduce • Tera- and petabyte-scale data sets (search engines, internet traffic, bioinformatics, etc.) • Need parallel computing • Requirement: easy to program, reliable, distributed
What is MapReduce • A framework for parallel computing originally developed at Google (before ’04) • Widely adopted; it has become a standard for large-scale data analysis • Hadoop (an open-source implementation) is used by Yahoo, Facebook, Adobe, IBM, Amazon, and many academic institutions
What is MapReduce (cont.) • Three-stage operation: • Map stage: the mapper operates on a single pair <key, value> and outputs any number of new pairs <key’, value’> • Shuffle stage: all values associated with an individual key are sent to a single machine (done by the system) • Reduce stage: the reducer operates on all the values for one key and outputs a multiset of <key, value> pairs
What is MapReduce (cont.) • Map operation is stateless (parallel) • Shuffle stage is done automatically by the underlying system • Reduce stage can only start when all Map operations are done (interleaving between sequential and parallel)
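An example: kth frequency moment of a large data set • Input: \(x \in \Sigma^n\), where \(\Sigma\) is a finite set of symbols • Let \(f(\sigma)\) be the frequency of symbol \(\sigma\) in \(x\); note that \(\sum_{\sigma \in \Sigma} f(\sigma) = n\) • Want to compute \(\sum_{\sigma \in \Sigma} f^k(\sigma)\)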
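An example (cont.) • Input to each mapper: \(\langle i, x_i \rangle\) • \(M_1(\langle i, x_i \rangle) = \langle x_i, i \rangle\) (i is the index) • Input to each reducer: \(\langle x_i, \{i_1, i_2, \dots, i_m\} \rangle\) • \(R_1(\langle x_i, \{i_1, i_2, \dots, i_m\} \rangle) = \langle x_i, m^k \rangle\) • Each mapper: \(M_2(\langle x_i, v \rangle) = \langle \$, v \rangle\) • A single reducer: \(R_2(\langle \$, \{v_1, \dots, v_l\} \rangle) = \langle \$, \sum_i v_i \rangle\)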
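The two rounds above can be traced on a single machine. The following is a minimal sketch, not an actual MapReduce deployment: `run_round` and `frequency_moment` are hypothetical helper names introduced here, and the shuffle is simulated with an in-memory dictionary.

```python
from collections import defaultdict

def run_round(pairs, mapper, reducer):
    """Simulate one MapReduce round in memory (illustrative only)."""
    # Map stage: every input pair is processed independently.
    mapped = [out for key, value in pairs for out in mapper(key, value)]
    # Shuffle stage: group all values that share a key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    # Reduce stage: one reducer call per distinct key.
    return [out for key, values in groups.items() for out in reducer(key, values)]

def frequency_moment(x, k):
    """Compute sum over symbols of f(symbol)**k using the two rounds from the slide."""
    # Round 1: M1 emits <x_i, i>; R1 emits <x_i, m^k>, where m is the frequency of x_i.
    m1 = lambda i, symbol: [(symbol, i)]
    r1 = lambda symbol, indices: [(symbol, len(indices) ** k)]
    # Round 2: M2 routes every value to the single key "$"; R2 sums the values.
    m2 = lambda symbol, v: [("$", v)]
    r2 = lambda key, values: [(key, sum(values))]
    round1 = run_round(list(enumerate(x)), m1, r1)
    round2 = run_round(round1, m2, r2)
    return round2[0][1] if round2 else 0
```

For instance, in "abracadabra" the frequencies are 5, 2, 2, 1, 1, so `frequency_moment("abracadabra", 2)` returns 35.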
Formal Definitions • A MapReduce program consists of a sequence \(\langle M_1, R_1, M_2, R_2, \dots, M_l, R_l \rangle\) of mappers and reducers • The input is \(U_0\), a multiset of <key, value> pairs
Formal Definitions • Execution of the program: for \(r = 1, 2, \dots, l\): 1. Feed each \(\langle k, v \rangle\) in \(U_{r-1}\) to mapper \(M_r\); let the output be \(U'_r\) 2. For each key k, construct the multiset \(V_{k,r}\) of values \(v_i\) such that \(\langle k, v_i \rangle \in U'_r\) 3. For each k, feed k and some permutation of \(V_{k,r}\) to a separate instance of \(R_r\); let \(U_r\) be the multiset of <key, value> pairs generated by \(R_r\)
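The execution rule can be written as a short driver loop. This is a sketch under stated assumptions: `run_program` is a name invented here, multisets are modelled as Python lists, and each mapper or reducer is any callable that returns a list of (key, value) pairs. The values are shuffled before each reducer call to reflect that a reducer sees "some permutation" and must not rely on order.

```python
from collections import defaultdict
import random

def run_program(stages, u0, seed=0):
    """Run a list of (mapper, reducer) stages on the multiset u0 of (key, value) pairs."""
    rng = random.Random(seed)
    u = list(u0)
    for mapper, reducer in stages:
        # Step 1: feed every pair of U_{r-1} to M_r; the outputs form U'_r.
        u_prime = [pair for k, v in u for pair in mapper(k, v)]
        # Step 2: for each key k, collect V_{k,r}, the values paired with k in U'_r.
        v_by_key = defaultdict(list)
        for k, v in u_prime:
            v_by_key[k].append(v)
        # Step 3: hand k and an arbitrary permutation of V_{k,r} to its own R_r call.
        u = []
        for k, values in v_by_key.items():
            rng.shuffle(values)
            u.extend(reducer(k, values))
    return u
```

A one-stage word count, for example, passes `stages = [(split_words, sum_counts)]`, where both callables are supplied by the caller.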
The MapReduce Class (MRC) • On input {<key, value>} of size n, for a fixed \(\epsilon > 0\) • Memory: each mapper/reducer uses \(O(n^{1-\epsilon})\) space • Machines: there are \(\Theta(n^{1-\epsilon})\) machines available • Time: each machine runs in time polynomial in n, not polynomial in the length of the input it receives • Map and reduce functions may be randomized • Rounds: the shuffle is expensive, so the number of rounds is bounded • \(MRC^i\): number of rounds is \(O(\log^i n)\) • DMRC: the deterministic variant
Comparing MRC with PRAM • The most relevant classical computation model is the PRAM (Parallel Random Access Machine) model • The corresponding class is NC • Easy relation: \(MRC \subseteq P\) • Lemma: if \(P \neq NC\), then \(MRC \nsubseteq NC\) • Open question: show that \(DMRC \subsetneq P\)
Comparing with PRAM (cont.) • Simulation lemma: any CREW (concurrent read, exclusive write) PRAM algorithm that uses \(O(n^{2-2\epsilon})\) total memory and \(O(n^{2-2\epsilon})\) processors and runs in time \(t(n)\) can be simulated by a DMRC algorithm that runs in \(O(t(n))\) rounds
Example: Finding an MST • Problem: find the minimum spanning tree of a dense graph • The algorithm • Randomly partition the vertices into k parts • For each pair of vertex sets, find the MST of the subgraph induced by the union of these two sets • Take the union of all the edges in the MST of each pair; call the resulting graph H • Compute an MST of H
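The four steps can be sketched sequentially as follows. Several choices here are assumptions rather than part of the slides: Kruskal's algorithm with union-find is used for every MST computation, the subgraph for a pair of parts is taken to contain every edge with both endpoints in the two parts, the loop over pairs stands in for reducers running in parallel, and the names `kruskal` and `partitioned_mst` are hypothetical.

```python
import random
from itertools import combinations

def kruskal(num_vertices, edges):
    """Return a minimum spanning forest of the (weight, u, v) edges."""
    parent = list(range(num_vertices))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    forest = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest

def partitioned_mst(num_vertices, edges, k, seed=0):
    """Sparsify via MSTs of pairwise subgraphs, then compute an MST of the union H."""
    rng = random.Random(seed)
    # Step 1: assign each vertex to one of k parts uniformly at random.
    part = [rng.randrange(k) for _ in range(num_vertices)]
    # With k = 1 there are no pairs, so fall back to the single part itself.
    pairs = list(combinations(range(k), 2)) or [(0, 0)]
    h = set()
    for i, j in pairs:
        # Step 2: edges with both endpoints inside parts i and j.
        sub = [(w, u, v) for w, u, v in edges
               if part[u] in (i, j) and part[v] in (i, j)]
        # Step 3: keep only the spanning-forest edges of this subgraph.
        h.update(kruskal(num_vertices, sub))
    # Step 4: an MST of H.
    return kruskal(num_vertices, h)
```

With distinct edge weights the MST is unique, so the result can be compared edge for edge against `kruskal` run on the whole graph.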
Finding an MST (cont.) • The algorithm is easy to parallelize • The MST of each subgraph can be computed in parallel • Why does it work? • Theorem: the MST of H is an MST of G • Proof: no relevant edge is discarded when sparsifying the input graph G
Finding an MST (cont.) • Why is the algorithm in MRC? • Let \(N = |V|\) and \(m = |E| = N^{1+c}\) • So the input size n satisfies \(N = n^{1/(1+c)}\) • Pick \(k = N^{c/2}\) • Lemma: with high probability, each pairwise subgraph has size \(\tilde{O}(N^{1+c/2})\) • So the input to any reducer is \(O(n^{1-\epsilon})\) • The size of H is also \(O(n^{1-\epsilon})\)
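The value of \(\epsilon\) can be made explicit from these parameter choices. The derivation below is a sketch that assumes the input size is measured by the number of edges, \(n = N^{1+c}\), ignores logarithmic factors, and assumes each part has about \(N/k\) vertices.

```latex
\begin{align*}
n = N^{1+c} \;&\Longrightarrow\; N = n^{1/(1+c)}, \\
N^{1+c/2} &= n^{\frac{1+c/2}{1+c}} = n^{\,1 - \frac{c}{2(1+c)}}
  \quad\Longrightarrow\quad \epsilon = \frac{c}{2(1+c)}, \\
|E(H)| &\le \binom{k}{2} \cdot \frac{2N}{k} = (k-1)N \le kN = N^{1+c/2} = n^{1-\epsilon}.
\end{align*}
```

The last line uses the fact that each pairwise spanning forest has fewer edges than the roughly \(2N/k\) vertices it spans.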
Functions Lemma • A very useful building block for designing MapReduce algorithms • Definition [MRC-parallelizable function]: Let S be a finite set. We say a function f on S is MRC-parallelizable if there are functions g and h such that the following hold: • For any partition of S, \(S = T_1 \cup T_2 \cup \dots \cup T_k\), f can be written as \(f(S) = h(g(T_1), g(T_2), \dots, g(T_k))\) • g and h can each be represented in \(O(\log n)\) bits • g and h can be computed in time polynomial in |S|, and every possible output of g can be expressed in \(O(\log n)\) bits
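As an illustration of the definition, the mean of a non-empty set of integers fits the \(f(S) = h(g(T_1), \dots, g(T_k))\) pattern: g summarizes each part as a (count, sum) pair and h combines the summaries. This example is not from the slides; the function names are hypothetical, and the \(O(\log n)\)-bit condition holds only under the assumption that the integers are polynomially bounded in n.

```python
import random

def g(part):
    """Per-part summary: (count, sum), small enough to send to one combiner."""
    return (len(part), sum(part))

def h(summaries):
    """Combine the per-part summaries into f(S), here the mean of S."""
    count = sum(c for c, _ in summaries)
    total = sum(t for _, t in summaries)
    return total / count

def f_via_partition(s, k, seed=0):
    """Evaluate f(S) = h(g(T_1), ..., g(T_k)) on a random partition of S into k parts."""
    rng = random.Random(seed)
    parts = [[] for _ in range(k)]
    for item in s:
        parts[rng.randrange(k)].append(item)
    return h([g(t) for t in parts])
```

The result does not depend on how S is split, which is the property the definition asks for.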
Functions Lemma (cont.) • Lemma (Functions Lemma): Let U be a universe of size n and let \(\mathcal{S} = \{S_1, \dots, S_k\}\) be a collection of subsets of U, where \(k \le n^{2-3\epsilon}\) and \(\sum_{i=1}^{k} |S_i| \le n^{2-2\epsilon}\). Let \(\mathcal{F} = \{f_1, \dots, f_k\}\) be a collection of MRC-parallelizable functions. Then the outputs \(f_1(S_1), \dots, f_k(S_k)\) can be computed using \(O(n^{1-\epsilon})\) reducers, each with \(O(n^{1-\epsilon})\) space.
Functions Lemma (cont.) • The power of the lemma • The algorithm designer may focus only on the structure of the subproblem and the input • Distributing the input across reducers is handled by the lemma (an existence theorem) • The proof of the lemma is not easy • It uses universal hashing • It uses Chernoff bounds, etc.
Application of the Functions Lemma: s-t connectivity • Problem: given a graph G and two nodes s and t, are they connected in G? • Dense graphs: easy, by powering the adjacency matrix • Sparse graphs?
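For the dense case, powering the adjacency matrix can be sketched as repeated Boolean squaring: after r squarings of A + I, entry (s, t) is true exactly when some path of length at most 2^r joins s and t. The function name and the plain list-of-lists representation are assumptions made for this single-machine illustration.

```python
def reachable_by_squaring(adj, s, t):
    """Decide s-t connectivity by repeatedly squaring the Boolean matrix A + I."""
    n = len(adj)
    # Adding the identity lets shorter paths survive each squaring.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    steps = 0
    # About log2(n) squarings cover every simple path (length at most n - 1).
    while (1 << steps) < n:
        reach = [[any(reach[i][m] and reach[m][j] for m in range(n))
                  for j in range(n)]
                 for i in range(n)]
        steps += 1
    return reach[s][t]
```

Only a logarithmic number of matrix products is needed, which is what makes the dense case straightforward.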
A log n-round MapReduce algorithm for s-t connectivity • Initially every node is active and forms its own connected component • For i = 1, 2, …, O(log n) do • Each active node becomes a leader with probability ½ • For each non-leader active node u, find a node v adjacent to u’s current connected component whose component is labeled by a leader • If such a node v exists, then u becomes passive and each node in u’s connected component is relabeled with v’s label • Output TRUE if s and t have the same label, FALSE otherwise
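The leader-based contraction can be simulated sequentially as below. This is a sketch under stated assumptions, not the MapReduce implementation: the name `st_connected` is hypothetical, a plain scan of the edge list stands in for the neighbor search that the Functions Lemma would distribute, and the loop runs until no edge joins two different labels instead of stopping after a fixed O(log n) rounds.

```python
import random

def st_connected(num_nodes, edges, s, t, seed=0):
    """Return (connected, rounds) using random leader election and relabeling."""
    rng = random.Random(seed)
    label = list(range(num_nodes))  # label[v] names v's current component
    rounds = 0
    while True:
        # Edges whose endpoints still carry different labels.
        cross = [(u, v) for u, v in edges if label[u] != label[v]]
        if not cross:
            break
        rounds += 1
        # Labels still in use play the role of the active nodes.
        active = {label[v] for v in range(num_nodes)}
        leader = {a: rng.random() < 0.5 for a in sorted(active)}
        # Each non-leader component picks one adjacent leader component, if any.
        target = {}
        for u, v in cross:
            for a, b in ((label[u], label[v]), (label[v], label[u])):
                if not leader[a] and leader[b]:
                    target.setdefault(a, b)
        # Relabel merged components; their old labels become passive.
        label = [target.get(l, l) for l in label]
    return label[s] == label[t], rounds
```

Components only ever merge along existing edges, so two nodes end with the same label exactly when they are connected.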
Conclusions • A rigorous model for MapReduce • Very loose requirements on the hardware • A call for more research in this direction