
Scaling Simple and Compact Genetic Algorithms using MapReduce

Distributed systems Spring 2011. Scaling Simple and Compact Genetic Algorithms using MapReduce. Puya Ghazizadeh. Evolutionary algorithm.


Presentation Transcript


  1. Distributed Systems, Spring 2011 • Scaling Simple and Compact Genetic Algorithms using MapReduce • Puya Ghazizadeh

  2. Evolutionary algorithm • An EA uses mechanisms inspired by biological evolution: reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the environment within which the solutions live • Evolutionary algorithms often perform well at approximating solutions to all types of problems because they ideally make no assumptions about the underlying fitness landscape • Related techniques • Genetic algorithm • Ant colony optimization • Bees algorithm • Particle swarm optimization

  3. Genetic Algorithms - History • Pioneered by John Holland at the University of Michigan • Became popular in the late 1980s • Based on ideas from Darwinian evolution • Can be used to solve a variety of problems that are not easy to solve using other techniques

  4. Some Applications of GAs • Control systems design • Software guided circuit design • Optimization • GA search • Path finding • Mobile robots • Internet search • Trend spotting • Data mining • Stock price prediction

  5. Simple Genetic Algorithm
       produce an initial population of individuals
       evaluate the fitness of all individuals
       while termination condition not met do
           select fitter individuals for reproduction
           recombine between individuals
           mutate individuals
           evaluate the fitness of the modified individuals
           generate a new population
       end while
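
The loop on this slide can be sketched in a few lines of Python. This is only an illustration under assumed choices that the slides do not specify: binary tournament selection, single-point crossover, a fixed generation budget as the termination condition, and the hypothetical names `simple_ga`, `pop_size`, `generations`, `pc` and `pm`.

```python
import random

def simple_ga(fitness, n_bits, pop_size=20, generations=50, pc=0.7, pm=None, seed=0):
    """Minimal generational GA over lists of bits (illustrative parameters)."""
    rng = random.Random(seed)
    pm = 1.0 / n_bits if pm is None else pm
    # produce an initial population of individuals
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):  # termination condition: fixed budget (assumption)
        new_pop = []
        while len(new_pop) < pop_size:
            # select fitter individuals: binary tournament (assumption)
            a = max(rng.sample(pop, 2), key=fitness)
            b = max(rng.sample(pop, 2), key=fitness)
            child = a[:]
            # recombine: single-point crossover with probability pc
            if rng.random() < pc:
                cut = rng.randrange(1, n_bits)
                child = a[:cut] + b[cut:]
            # mutate: flip each bit independently with probability pm
            child = [bit ^ 1 if rng.random() < pm else bit for bit in child]
            new_pop.append(child)
        pop = new_pop  # generate a new population
        # evaluate the modified individuals and remember the best one seen
        best = max(pop + [best], key=fitness)
    return best
```

For example, `simple_ga(sum, 16)` searches for a 16-bit string with as many ones as possible, using the built-in `sum` as the fitness function.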

  6. Algorithmic Phases • Initialize the population • Select individuals for the mating pool • Perform crossover • Perform mutation • Insert offspring into the population • Stop? If no, return to selection; if yes, the end

  7. Example: the MAXONE problem • Suppose we want to maximize the number of ones in a string of l binary digits • Is it a trivial problem? It may seem so, because we know the answer in advance • However, we can think of it as maximizing the number of correct answers, each encoded by 1, to l difficult yes/no questions
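
As a small illustration, the MAXONE fitness is just a count of ones. The function name `maxone_fitness` and the string encoding are assumptions made for this sketch.

```python
def maxone_fitness(bits):
    """Return the number of ones in a binary string such as "10111001"."""
    return bits.count("1")

def is_optimal(bits):
    """The known optimum is the all-ones string of the same length."""
    return maxone_fitness(bits) == len(bits)
```

The optimum is known in advance (all ones), which is what makes the problem convenient for checking convergence.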

  8. Representing Genomes... • Representation: string. Example: 1 0 1 1 1 0 0 1 • Representation: array of strings. Example: http avalayubc net ~apopovic

  9. Selection • Roulette Wheel Selection • Rank Selection • Elitism: prevents losing the best solution found so far
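
A minimal sketch of roulette wheel selection combined with elitism follows. It assumes non-negative fitness values; the function names are hypothetical and rank selection is not shown.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)  # all fitnesses zero: fall back to uniform
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off

def next_generation_with_elitism(population, fitness, rng):
    """Copy the best individual unchanged, fill the rest by roulette wheel."""
    fits = [fitness(ind) for ind in population]
    elite = max(population, key=fitness)
    rest = [roulette_select(population, fits, rng) for _ in range(len(population) - 1)]
    return [elite] + rest
```

Because the elite is copied before any sampling happens, the best fitness in the population can never decrease from one generation to the next.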

  10. Crossover • Combines two individuals to create new individuals for possible inclusion in the next generation • Main operator for local search (looking close to existing solutions) • Each crossover is performed with probability pc, typically between 0.5 and 0.8 • Crossover points are selected at random

  11. Crossover examples
      Initial strings:
          11000101 01011000 01101010
          00100100 10111001 01111000
      Single-point offspring:
          11000101 01011001 01111000
          00100100 10111000 01101010
      Two-point offspring:
          11000101 01111001 01101010
          00100100 10011000 01111000
      Uniform offspring:
          01000101 01111000 01111010
          10100100 10011001 01101000
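
The three operators can be sketched as follows, using the two 24-bit initial strings from the slide. The cut points are not given in the slides; the values used in the usage note are one choice that reproduces the listed offspring, and the function names are hypothetical.

```python
def single_point(a, b, cut):
    """Exchange everything from index `cut` onward."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point(a, b, lo, hi):
    """Exchange only the segment [lo, hi)."""
    return a[:lo] + b[lo:hi] + a[hi:], b[:lo] + a[lo:hi] + b[hi:]

def uniform(a, b, swap):
    """Exchange exactly the positions whose index is in the set `swap`."""
    c = "".join(y if i in swap else x for i, (x, y) in enumerate(zip(a, b)))
    d = "".join(x if i in swap else y for i, (x, y) in enumerate(zip(a, b)))
    return c, d

# Initial strings from the slide, written byte by byte for readability.
A = "11000101" "01011000" "01101010"
B = "00100100" "10111001" "01111000"
```

With these definitions, `single_point(A, B, 15)`, `two_point(A, B, 10, 16)` and `uniform(A, B, {0, 10, 19})` yield the offspring pairs shown on the slide. In a real GA the cut points and swap positions would be drawn at random.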

  12. Mutation • Each component of every individual is modified with probability pm • Main operator for global search (looking at new areas of the search space) • pm is usually small, typically between 0.001 and 0.01 • Rule of thumb: pm = 1 / (number of bits in the chromosome)
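
Bit-flip mutation with the rule-of-thumb default can be sketched as below; `mutate` and its parameters are hypothetical names chosen for this example.

```python
import random

def mutate(bits, pm=None, rng=random):
    """Flip each bit independently with probability pm (default: 1 / len(bits))."""
    if pm is None:
        pm = 1.0 / len(bits)  # rule of thumb from the slide
    return [bit ^ 1 if rng.random() < pm else bit for bit in bits]
```

With the default rate, one bit per chromosome is flipped on average, regardless of chromosome length.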

  13. MapReduce and GA • MPI-based parallel GAs require detailed knowledge about the machine architecture • This work shows how genetic algorithms can be modeled in the MapReduce model • It demonstrates a transformation of genetic algorithms into the map and reduce primitives • It implements the MapReduce program and demonstrates its scalability to large problem sizes

  14. Map • Map evaluates the fitness of the given individual • It also keeps track of the best individual and finally writes it to a global file in the Hadoop Distributed File System (HDFS) • The client, which initiated the job, reads these values from all the mappers at the end of the MapReduce and checks whether the convergence criterion has been satisfied.
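
The Map step can be mimicked in plain Python, without any Hadoop API. Everything here is a stand-in: `FitnessMapper`, `close` (which returns the best record instead of writing it to HDFS) and `has_converged` (which assumes the OneMax optimum of all ones) are hypothetical names for this sketch.

```python
class FitnessMapper:
    """Evaluates individuals and remembers the best one this mapper has seen."""

    def __init__(self, fitness):
        self.fitness = fitness
        self.best = None
        self.best_fitness = float("-inf")

    def map(self, individual):
        """Emit (individual, fitness) and update the running best."""
        f = self.fitness(individual)
        if f > self.best_fitness:
            self.best, self.best_fitness = individual, f
        return individual, f

    def close(self):
        """Stand-in for writing the best individual to a global file."""
        return self.best, self.best_fitness

def has_converged(best_records, n_bits):
    """Client-side check (OneMax assumption): some mapper saw the all-ones string."""
    return max(f for _, f in best_records) == n_bits
```

The client would collect one record per mapper after each iteration and stop submitting jobs once `has_converged` returns true.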

  15. Partitioner • The partitioner splits the intermediate key/value pairs among the reducers. • The function getPartition() returns the reducer to which the given (key, value) pair should be sent.

  16. Partitioner • The default implementation uses Hash(key) % numReducers • All the values corresponding to a given key therefore end up at the same reducer, which can then apply the Reduce function • This does not suit genetic algorithms, for two reasons • First, the hash function partitions the namespace of the individuals N into r distinct classes, one per reducer. This imposes an artificial spatial constraint, so the GA may take more iterations to converge or may never converge • Second, as the genetic algorithm progresses, the same (close to optimal) individual begins to dominate the population. All copies of this individual are sent to a single reducer, which becomes overloaded. Eventually, when the GA converges, all the individuals are processed by that single reducer. Parallelism thus decreases as the GA converges, and hence it takes more iterations.

  17. Override the default partitioner • The custom partitioner shuffles individuals randomly across the different reducers
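
The difference between the two partitioning strategies can be illustrated in plain Python. This is not Hadoop's `getPartition()` itself: `hash_partition` uses CRC32 as a stand-in hash, and both function names are assumptions for this sketch.

```python
import random
import zlib

def hash_partition(key, num_reducers):
    """Default-style partitioning: identical keys always go to the same reducer."""
    return zlib.crc32(key.encode()) % num_reducers

def random_partition(key, num_reducers, rng):
    """GA-friendly partitioning: ignore the key and pick a reducer at random."""
    return rng.randrange(num_reducers)

# A converged population: 1000 copies of the same individual.
population = ["1111"] * 1000
rng = random.Random(42)
hashed = {hash_partition(ind, 4) for ind in population}
shuffled = {random_partition(ind, 4, rng) for ind in population}
```

Here `hashed` contains a single reducer index, while `shuffled` spreads the same individuals over all four reducers, which is the load-balancing effect the slide describes.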

  18. Reduce • Selection uses tournament selection without replacement: a tournament is held among S individuals and the winner is selected • This process is repeated as many times as there are individuals in the population • When the tournament window is full, SelectionAndCrossover is carried out • When the crossover window is full, the Uniform Crossover operator is applied • In our implementation, the tournament size S is set to 5 and crossover is performed on two consecutively selected parents
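
One plausible reading of this Reduce step is a sliding tournament window over a randomly ordered stream of individuals. The sketch below is an interpretation, not the presenter's code: `reduce_select_and_cross` is a hypothetical name, and for brevity it does not revisit the first S - 1 individuals, so it emits slightly fewer offspring than inputs.

```python
import random
from collections import deque

def reduce_select_and_cross(individuals, fitness, s=5, rng=None):
    """Stream individuals, run tournaments of size s, cross consecutive winners."""
    rng = rng or random.Random(0)
    window = deque(maxlen=s)  # tournament window: the last s individuals seen
    parents = []              # crossover window: holds up to two winners
    offspring = []
    for individual in individuals:  # assumed to arrive in random order
        window.append(individual)
        if len(window) < s:
            continue  # tournament window not yet full
        parents.append(max(window, key=fitness))
        if len(parents) == 2:  # crossover window full: uniform crossover
            child1, child2 = [], []
            for x, y in zip(*parents):
                if rng.random() < 0.5:
                    x, y = y, x  # swap this gene between the two children
                child1.append(x)
                child2.append(y)
            offspring += [child1, child2]
            parents = []
    return offspring
```

Since the partitioner already shuffles individuals across reducers, reading them sequentially plays the role of drawing tournament participants at random.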

  19. Optimizations • For larger problem sizes, the serial initialization of the population takes a long time • According to Amdahl’s law, the speedup is bounded because of this serial component • Amdahl’s law: the speedup from parallel computing is limited by the time needed for the sequential fraction of the program • Solution: create the initial population in a separate MapReduce phase, in which the Map generates random individuals and the Reduce is the Identity Reducer
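
Amdahl's law can be written as speedup = 1 / (s + (1 - s) / N), where s is the serial fraction and N the number of workers. A small sketch with illustrative numbers (the 10% serial fraction is an assumption, not a measurement from the talk):

```python
def amdahl_speedup(serial_fraction, workers):
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

# With a 10% serial part, the speedup can never exceed 1 / 0.1 = 10,
# no matter how many workers are added.
bound_check = amdahl_speedup(0.1, 10**9)
```

This is why moving population initialization into its own MapReduce phase matters: it shrinks the serial fraction rather than adding workers.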

  20. MapReduce: Initialization • Create the initial population in a separate MapReduce phase, in which the Map generates random individuals and the Reduce is the Identity Reducer • Because loops cannot be expressed in the MapReduce model, each iteration, consisting of a Map and a Reduce, has to be executed until the convergence criterion is satisfied.

  21. MapReducing Compact Genetic Algorithms • The compact genetic algorithm replaces the traditional variation operators of genetic algorithms by building a probabilistic model of promising solutions • When these two individuals compete, individual a will win. At the level of the gene, however, a decision error is made on the second position. That is, selection incorrectly prefers the schema

  22. Compact Genetic Algorithms • The update step of the compact GA has a constant size of 1/n, where n here denotes the population size. While the simple GA needs to store n bits for each gene position, the compact GA only needs to keep the proportion of ones (and zeros), a finite set of n + 1 numbers that can be stored with log2(n + 1) bits

  23. Compact Genetic Algorithms • The probability vector is updated by shifting its value by the contribution of a single individual to the total frequency, assuming a particular population size • The cGA significantly reduces the memory requirements compared with simple genetic algorithms
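
The update rule can be sketched as follows, assuming the usual compact GA step of 1/n for a simulated population of size n. The function name `cga_update` and the clamping to [0, 1] are additions made for this illustration.

```python
def cga_update(p, winner, loser, n):
    """Shift each probability toward the winner's bit where winner and loser differ."""
    step = 1.0 / n  # contribution of a single individual in a population of size n
    updated = []
    for p_i, w, l in zip(p, winner, loser):
        if w != l:
            p_i = p_i + step if w == 1 else p_i - step
        updated.append(min(1.0, max(0.0, p_i)))  # clamp (assumption, guards round-off)
    return updated
```

Only one number per gene position is stored, which is where the memory saving over a simple GA comes from.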

  24. Compact Genetic Algorithm and Hadoop • Each iteration of the cGA is encapsulated as a separate, single MapReduce job • The client accepts the command-line parameters, creates the initial probability vector splits, and submits the MapReduce job • Let the probability vector be P = {p_i : p_i = probability that variable i equals 1} • Such an approach would allow us to scale to over a billion variables if P is partitioned into m different partitions P1, P2, . . . , Pm, where m is the number of mappers.
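
Partitioning P into m contiguous splits might look like the sketch below. The helper name `split_vector` and the choice of near-equal contiguous chunks are assumptions; the slides do not say how the splits are formed.

```python
def split_vector(p, m):
    """Split list p into m contiguous parts whose sizes differ by at most one."""
    size, extra = divmod(len(p), m)
    splits, start = [], 0
    for i in range(m):
        end = start + size + (1 if i < extra else 0)
        splits.append(p[start:end])
        start = end
    return splits

# Initial probability vector: every variable is equally likely to be 0 or 1.
initial_splits = split_vector([0.5] * 10, 3)
```

Each mapper then only needs its own split in memory, so the full vector never has to fit on a single machine.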

  25. Map • Generating the two individuals corresponds to the Map function • Map takes a probability split Pi as input and outputs tournamentSize individual splits, as well as the probability split • It also keeps track of the number of ones in both individuals and writes it to a global file in the Distributed File System (HDFS). All the reducers later read these values.

  26. Reduce • Tournament selection without replacement is performed among the tournamentSize generated individuals, and the winner and the loser are selected
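
A compact way to picture the cGA Map and Reduce steps for OneMax is sketched below. The names `cga_map` and `pick_winner_loser` are hypothetical, and the shared file in HDFS is replaced by ordinary Python lists.

```python
import random

def cga_map(p_split, tournament_size, rng):
    """Sample one piece of each competing individual from a probability split."""
    pieces = [[1 if rng.random() < p_i else 0 for p_i in p_split]
              for _ in range(tournament_size)]
    ones = [sum(piece) for piece in pieces]  # per-split counts of ones
    return pieces, ones

def pick_winner_loser(total_ones):
    """Return the indices of the fittest and least fit individual (OneMax)."""
    indices = range(len(total_ones))
    winner = max(indices, key=total_ones.__getitem__)
    loser = min(indices, key=total_ones.__getitem__)
    return winner, loser

# Two mappers, each holding half of the probability vector.
rng = random.Random(7)
results = [cga_map(split, 2, rng) for split in ([0.5] * 4, [0.5] * 4)]
totals = [sum(counts) for counts in zip(*(ones for _, ones in results))]
winner, loser = pick_winner_loser(totals)
```

Every reducer can compute the same winner and loser from the shared counts, then update only its own part of the probability vector.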

  27. Optimizations • The cGA uses optimizations similar to those of the simple GA • For larger problem sizes, the serial initialization of the population takes a long time • Create the initial population in a separate MapReduce phase • Map generates the initial probability vector • Reduce is the Identity Reducer.

  28. Results • Benchmark: the OneMax problem • Run on a 416-core (52-node) Hadoop cluster • Each node has two quad-core Intel processors, 16 GB RAM and 2 TB hard disks • Each node can run 5 mappers and 3 reducers in parallel • Whichever node finishes first writes the output, and the other speculated jobs are killed • For each experiment, the population for the GA is set to n log n, where n is the number of variables

  29. Simple GA Experiments 1. Convergence analysis • The GA converges in 220 iterations, taking an average of 149 seconds per iteration

  30. Simple GA Experiments 2. Scalability with constant load per node • load set to 1,000 variables per mapper

  31. Simple GA Experiments 3. Scalability with constant overall load • The problem size is kept fixed at 50,000 variables while the number of mappers is increased • Saturation of the map capacity causes a slight increase in the time per iteration after 250 mappers • However, the overall speedup is bounded, as Amdahl’s law predicts, by the overhead that Hadoop introduces

  32. Simple GA Experiments 4. Scalability with increasing the problem size • implementation scales to n = 10^5 variables, keeping the population set to n log n

  33. Compact GA Experiments • keep the load set to 200,000 variables per mapper

  34. Compact GA Experiments • The implementation scales to n = 10^8 variables, keeping the population set to n log n

  35. Discussion of Related Work • The Message Passing Interface (MPI) has been used for implementing parallel GAs • MPI programs do not scale well on clusters where failure is the norm • If a node in an MPI cluster fails, the whole program is restarted • In a large cluster, a machine is likely to fail during the execution of a long-running program, so efficient fault tolerance is necessary • This forces the user to handle failures by using complex checkpointing techniques • MapReduce is a programming model that enables users to easily develop large-scale distributed applications
