Abhishek Verma, Xavier Llora, David E. Goldberg, Roy H. Campbell
Scaling Genetic Algorithms using MapReduce
Motivation
• Genetic Algorithms (GAs)
  • applied to very large scale data-intensive problems
• Current approach: MPI
  • Requires detailed knowledge of h/w architecture
  • Complicated to program, debug, checkpoint
  • Does not scale on commodity clusters
• MapReduce: simple and scalable abstraction
• Use MapReduce to scale GAs
Intelligent Systems Design and Applications 2009
Outline
• Motivation
• MapReduce
• Genetic Algorithm
• Approach
• Experimental Results
• Conclusion
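MapReduce Overview
[Figure: input records are divided into splits, and each split is processed by a Map task that emits key-value pairs such as (k1, v1), (k2, v2), (k1, v3), (k2, v4), (k1, v5). The shuffle step routes each pair by the hash of its key, h(k1) or h(k2), so that all values for one key reach the same Reduce task, which writes the output records.]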
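The data flow in the figure can be imitated in a single process. The sketch below is only an illustration of the map, shuffle, and reduce steps, not how a real MapReduce runtime is implemented; `run_mapreduce`, `word_map`, and `word_reduce` are hypothetical names, and word counting is used purely as a familiar example.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn, num_reducers=2):
    """Toy single-process MapReduce: map, shuffle by hash of key, reduce."""
    # One dictionary per reducer, mapping key -> list of values.
    partitions = [defaultdict(list) for _ in range(num_reducers)]

    # Map phase: each input record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in map_fn(record):
            # Shuffle phase: h(key) decides which reducer receives the pair,
            # so all values for one key end up at the same reducer.
            partitions[hash(key) % num_reducers][key].append(value)

    # Reduce phase: each reducer processes its keys with their grouped values.
    output = []
    for partition in partitions:
        for key, values in partition.items():
            output.extend(reduce_fn(key, values))
    return output


def word_map(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield word, 1


def word_reduce(word, counts):
    # Sum the partial counts for one word.
    yield word, sum(counts)


counts = dict(run_mapreduce(["a b a", "b a c"], word_map, word_reduce))
```

Here `counts` maps each word to its total, regardless of which reducer handled it.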
Genetic Algorithm
1. Initialize population with random individuals.
2. Evaluate fitness value of individuals.
3. Select good solutions by using tournament selection without replacement.
4. Create new individuals by recombining the selected population using uniform crossover.
5. Evaluate the fitness value of all offspring.
6. Repeat steps 3-5 until some convergence criteria are met.
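The steps of this simple GA can be sketched sequentially as follows. This is a minimal illustration under stated assumptions, not the authors' code: the fitness function is assumed to be OneMax (the number of 1 bits), the convergence criterion is assumed to be "an all-ones individual exists or a generation limit is reached", and all function names and default parameters are hypothetical.

```python
import random


def onemax(bits):
    # Assumed fitness function: number of 1 bits in the individual.
    return sum(bits)


def tournament_without_replacement(pop, fits, s, rng):
    """Return len(pop) winners; every individual enters exactly s tournaments.

    Assumes len(pop) is divisible by the tournament size s.
    """
    winners = []
    n = len(pop)
    for _ in range(s):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random grouping for each round
        for i in range(0, n - s + 1, s):
            group = order[i:i + s]
            best = max(group, key=lambda j: fits[j])
            winners.append(pop[best])
    return winners


def uniform_crossover(a, b, rng):
    # Each position is swapped between the two parents with probability 0.5.
    c, d = a[:], b[:]
    for i in range(len(a)):
        if rng.random() < 0.5:
            c[i], d[i] = d[i], c[i]
    return c, d


def run_ga(n_bits=32, pop_size=64, s=2, max_gens=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    # Step 2: evaluate fitness.
    fits = [onemax(ind) for ind in pop]
    for _ in range(max_gens):
        if max(fits) == n_bits:  # assumed convergence criterion
            break
        # Step 3: tournament selection without replacement.
        parents = tournament_without_replacement(pop, fits, s, rng)
        # Step 4: uniform crossover on consecutive pairs of selected parents.
        pop = []
        for i in range(0, len(parents), 2):
            pop.extend(uniform_crossover(parents[i], parents[i + 1], rng))
        # Step 5: evaluate the offspring.
        fits = [onemax(ind) for ind in pop]
    return max(fits)


best_fitness = run_ga()
```

The sketch has no mutation operator, matching the steps listed above, so the best fitness it reaches depends on the population size and the random seed.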
Genetic Algorithm
1. Initialize population with random individuals.
2. Evaluate fitness value of individuals. (Map)
3. Repeat steps 4-5 and then step 2 until some convergence criteria are met.
4. Select good solutions by using tournament selection without replacement. (Reduce)
5. Create new individuals by recombining the selected population using uniform crossover. (Reduce)
MapReducing Genetic Algorithm
[Figure: individuals (bit strings such as 00010, 10000, 01001) are read from the distributed file system. Each Map task emits <individual, fitness> pairs, e.g. <00010, 1>, <10000, 1>, <01001, 2>, <10101, 3>, <00000, 0>. A random partitioner assigns the pairs to Reduce tasks, which write new individuals (e.g. 10001, 01000) back to the distributed file system.]
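One iteration of this scheme might look like the single-process sketch below. It is an assumption-laden illustration rather than the authors' implementation: fitness is taken to be the number of 1 bits (which matches the pairs shown in the figure), the reducer's tournament is run over a sliding window of the most recently received individuals, and every function name is hypothetical.

```python
import random
from collections import deque


def ga_map(individual):
    # Map: evaluate fitness (assumed OneMax) and emit <individual, fitness>.
    return individual, individual.count("1")


def random_partition(num_reducers, rng):
    # Random partitioner: ignore the key and pick a reducer uniformly.
    return rng.randrange(num_reducers)


def uniform_crossover(a, b, rng):
    # Swap each bit position between the parents with probability 0.5.
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return "".join(child1), "".join(child2)


def ga_reduce(pairs, window_size, rng):
    # Reduce: keep a window of the last window_size individuals; once it is
    # full, each arriving individual triggers one tournament over the window.
    window = deque(maxlen=window_size)
    winners = []
    for pair in pairs:
        window.append(pair)
        if len(window) == window_size:
            winners.append(max(window, key=lambda p: p[1])[0])
    # Recombine consecutive winners; an unpaired winner is copied unchanged.
    offspring = []
    for i in range(0, len(winners) - 1, 2):
        offspring.extend(uniform_crossover(winners[i], winners[i + 1], rng))
    if len(winners) % 2 == 1:
        offspring.append(winners[-1])
    return offspring


def ga_iteration(population, num_reducers=2, window_size=2, seed=0):
    rng = random.Random(seed)
    buckets = [[] for _ in range(num_reducers)]
    for individual in population:
        buckets[random_partition(num_reducers, rng)].append(ga_map(individual))
    next_population = []
    for bucket in buckets:
        next_population.extend(ga_reduce(bucket, window_size, rng))
    return next_population


population = ["00010", "10000", "01001", "10101", "10000", "00000"]
next_population = ga_iteration(population)
```

In this simplified version each reducer emits slightly fewer individuals than it receives, because the first `window_size - 1` arrivals only fill the window; a full implementation would need extra bookkeeping to keep the population size constant.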
MapReducing Genetic Algorithm (2)
• Modifications
  • Mappers write to DFS so that clients can evaluate convergence criteria and control next iteration
  • Random partitioner function
  • Maintain a window of individuals in each reducer
• Optimizations
  • Create the initial population in 0th MapReduce
  • Compactly represent bits in array of long ints
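The compact bit representation can be illustrated with a short sketch that packs an individual's bits into 64-bit words. This is only an assumption about how such packing might be done, not the authors' code; the helper names are hypothetical, and Python integers stand in for the long ints mentioned on the slide.

```python
WORD_BITS = 64  # bits stored per word, matching a 64-bit long int


def pack_bits(bits):
    """Pack a sequence of 0/1 values into a list of 64-bit words."""
    words = [0] * ((len(bits) + WORD_BITS - 1) // WORD_BITS)
    for i, bit in enumerate(bits):
        if bit:
            # Bit i lives in word i // 64 at offset i % 64.
            words[i // WORD_BITS] |= 1 << (i % WORD_BITS)
    return words


def get_bit(words, i):
    # Read bit i back out of the packed representation.
    return (words[i // WORD_BITS] >> (i % WORD_BITS)) & 1


def count_ones(words):
    # Number of 1 bits across all words (the OneMax fitness, if assumed).
    return sum(bin(w).count("1") for w in words)


bits = [1, 0, 1] + [0] * 61 + [1]  # 65 bits, so two words are needed
words = pack_bits(bits)
```

Packing stores 64 variables per word instead of one per array element, which matters when an individual has many millions of variables.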
Experimental Results
• Experimental setup
  • 52 nodes: 16 GB RAM, 2 TB hard drives
  • Each node runs 5 mappers + 3 reducers
  • Population set to n log(n)
Scaling GAs to 100 million variables
Conclusion
• Modeled GAs in MapReduce
• Scales on commodity clusters to 100 million variables
• Can also use Pthreads (Phoenix), GPUs (Mars), …
• Future Work
  • Demonstrate scalability for practical applications
  • MapReduce Compact GAs and Extended Compact GAs
  • Comparison with MPI implementation