
PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY




  1. PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY A Review of the Doctoral Dissertation Research of Dr. Travis Metcalfe

  2. Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion

  3. Introduction Astronomers observe the universe and gather information about it. They then fit this information into mathematical models. The process of “fitting” involves adjusting the many parameters of the model. When they have a good fit, they use the parameter settings to tell them something about the object or phenomenon they are studying. The author uses a parallel genetic algorithm to solve this problem of optimization.

  4. The Goal of the Research • To further the understanding of the composition and characteristics of white dwarfs • More generally, since white dwarfs are the endpoint for all but the most massive stars, this research can lead to a better understanding of stellar evolution

  5. * Source

  6. Traditional Technique • Make an initial “guess” for parameter values • Use some iterative technique to improve upon the initial guesses.

  7. Adjustable Input Parameters • Mass • Temperature • H and He layer masses • Convective Efficiency • Core composition

  8. Problem with this technique • Results often depend on the initial guess • The initial guess is inherently subjective, often the result of intuition or past experience
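The dependence on the initial guess can be illustrated with a toy example. The sketch below is not the white dwarf model: `misfit` is a hypothetical one-parameter function with two minima, and `local_descent` is a simple step-halving search standing in for "some iterative technique". Both names and the function itself are assumptions made only for illustration.

```python
def misfit(x):
    """Hypothetical misfit: local minimum near x = 1.35, global minimum near x = -1.47."""
    return x**4 - 4 * x**2 + x


def local_descent(x, step=0.1, tol=1e-6):
    """Step to whichever neighbour lowers the misfit; halve the step when neither does."""
    while step > tol:
        best = min((x - step, x, x + step), key=misfit)
        if best == x:
            step /= 2  # no improvement at this resolution, so refine the step
        else:
            x = best
    return x


if __name__ == "__main__":
    for guess in (2.0, -2.0):
        x = local_descent(guess)
        print(f"guess {guess:+.1f} -> x = {x:+.3f}, misfit = {misfit(x):.3f}")
```

Starting from 2.0 the search settles in the shallower minimum, while starting from -2.0 it finds the deeper one, so the answer reflects the guess rather than the data alone.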

  9. The Genetic Algorithm • A genetic algorithm provides a more systematic approach to optimizing the results • The genetic algorithm used was PIKAIA • PIKAIA is a general purpose “function optimization” genetic algorithm • Public domain software • Fortran-77

  10. Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion

  11. White dwarfs that show a regular variation in light intensity are known as pulsating white dwarfs • Using photometric techniques, this variation in intensity can be measured very accurately with instruments such as the Whole Earth Telescope (WET)

  12. The pulsation is the result of seismic activity within the white dwarf • Just as seismological information can be used to study the interior of the Earth, seismological data, expressed as varying stellar luminosity, can be used to determine the characteristics of these pulsating white dwarfs.

  13. Observed Light Curve for the White Dwarf GD 358.

  14. Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion

  15. Initial Conditions • Population size: 1000 (in later work this was reduced to 128). • No rationale was given for how the initial population size was chosen, or why it was changed. • For each member of the initial population, parameter values are set randomly

  16. Duration • The run continued until the difference between the average fitness and the best fitness in the population was less than 1%. • In later work, he used a constant 200 generations.
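The first stopping rule can be sketched as a small check. This is a minimal illustration that assumes fitness values are positive, that larger means fitter, and that the 1% is measured relative to the best fitness; the slides do not say how the difference was normalized, and `has_converged` is a hypothetical name.

```python
def has_converged(fitnesses, tolerance=0.01):
    """True when the mean fitness is within `tolerance` (relative) of the best fitness."""
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    return (best - mean) / best < tolerance


if __name__ == "__main__":
    print(has_converged([1.0, 0.995, 0.999]))  # tightly clustered population
    print(has_converged([1.0, 0.5]))           # widely spread population
```

The later fixed-length rule needs no test at all: the loop simply runs for 200 generations.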

  17. Fitness Measurement • The model is then run using these initial values • Fitness is based on the root-mean-square differences between the observed and calculated pulsation periods
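A minimal sketch of the root-mean-square measure described on this slide follows. The period values are made up for illustration, and the mapping from residual to fitness, `1 / (1 + rms)`, is an assumption chosen so that a smaller residual gives a larger fitness; the slides do not state the mapping actually used.

```python
import math


def rms_residual(observed, calculated):
    """Root-mean-square difference between matched observed and calculated periods."""
    if len(observed) != len(calculated):
        raise ValueError("period lists must have the same length")
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(observed, calculated):
    """Assumed mapping: a perfect match scores 1.0, worse matches score lower."""
    return 1.0 / (1.0 + rms_residual(observed, calculated))


if __name__ == "__main__":
    observed = [423.0, 464.0, 501.0]    # hypothetical periods in seconds
    calculated = [424.0, 462.0, 501.0]  # hypothetical model output
    print(rms_residual(observed, calculated), fitness(observed, calculated))
```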

  18. Fitness Measurement • The fitness value is converted to a survival probability by normalizing with respect to the most fit member • The next generation is chosen randomly. This random selection is weighted, based on each member’s survivability ratio
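The weighted random selection described here can be sketched with the standard library. This is an illustrative fitness-proportionate ("roulette wheel") draw, not PIKAIA's own routine; the function names are hypothetical, and sampling with replacement is an assumption.

```python
import random


def survival_probabilities(fitnesses):
    """Normalize each fitness by the fittest member, so the best scores 1.0."""
    best = max(fitnesses)
    return [f / best for f in fitnesses]


def select_next_generation(population, fitnesses, rng=None):
    """Draw a same-sized generation at random, weighted by survival probability."""
    rng = rng or random.Random()
    weights = survival_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=len(population))


if __name__ == "__main__":
    members = ["A", "B", "C"]
    print(survival_probabilities([2.0, 1.0, 4.0]))
    print(select_next_generation(members, [2.0, 1.0, 4.0], random.Random(1)))
```

Fitter members are more likely to be drawn, and may appear more than once, but weaker members still have a chance of surviving.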

  19. Crossover • Numerical encoding • The parameter values of each member are concatenated into one long string of digits • A single-point crossover technique is used. The position along the string is picked randomly

  20. Mutation • Mutation is achieved by randomly selecting a number in the string and changing it to a new, randomly chosen value

  21. Illustration • Consider two members, each with two parameters. • M1 has parameter values X=2.573 and Y=4.457. • M2 has parameter values X=3.547 and Y=2.332. • After encoding, M1=25734457 and M2=35472332

  22. Illustration • The crossover point is randomly chosen, and the string segments are swapped: M1 25734|457 → 25734332, M2 35472|332 → 35472457

  23. Illustration • Mutating M1 involves picking a random spot along the string and changing that value: M1 257|3|4332 → 25784332

  24. Illustration* • The strings would then be parsed back into parameter values. For M1, this would be: M1 X=2.578 Y=4.332 * Modified from [1]
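The illustration on slides 21 to 24 can be reproduced in a few lines. This sketch assumes each parameter lies in the range 0 to 10 and is kept to four significant digits, as in the example; the function names are hypothetical. The crossover point, mutation position and new digit are fixed here to match the slides, whereas the algorithm picks them at random.

```python
def encode(params, digits=4):
    """Concatenate the leading digits of each parameter, e.g. 2.573 -> '2573'."""
    scale = 10 ** (digits - 1)
    return "".join(f"{round(p * scale):0{digits}d}" for p in params)


def decode(chromosome, digits=4):
    """Split the string back into parameters, e.g. '2578' -> 2.578."""
    scale = 10 ** (digits - 1)
    return [int(chromosome[i:i + digits]) / scale
            for i in range(0, len(chromosome), digits)]


def crossover(a, b, point):
    """Single-point crossover: swap everything after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, position, new_digit):
    """Replace the digit at `position` (0-based) with `new_digit`."""
    return chromosome[:position] + new_digit + chromosome[position + 1:]


if __name__ == "__main__":
    m1 = encode([2.573, 4.457])       # '25734457'
    m2 = encode([3.547, 2.332])       # '35472332'
    m1, m2 = crossover(m1, m2, 5)     # '25734332', '35472457'
    m1 = mutate(m1, 3, "8")           # '25784332'
    print(m1, decode(m1))             # 25784332 [2.578, 4.332]
```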

  25. Crossover and Mutation Rate • The crossover rate: 65% • The mutation rate: 0.3% • In later work, the author increased the crossover rate to 85% and varied the mutation rate from 0.1% to 16.6%, depending on the difference between the mean fitness value and the best fitness value
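One way a variable mutation rate of this kind could work is sketched below. Only the 0.1% and 16.6% bounds come from the slide; the thresholds `low` and `high`, the adjustment `factor`, and the direction of the adjustment (raise the rate when the population has clustered around the best member, lower it when the population is spread out) are assumptions for illustration, and `adjust_mutation_rate` is a hypothetical name.

```python
def adjust_mutation_rate(rate, mean_fitness, best_fitness,
                         low=0.05, high=0.25, factor=1.5,
                         min_rate=0.001, max_rate=0.166):
    """Return a new mutation rate based on how far the mean lags behind the best."""
    spread = (best_fitness - mean_fitness) / best_fitness
    if spread <= low:
        rate *= factor   # population has converged: mutate more to keep exploring
    elif spread >= high:
        rate /= factor   # population is diverse: mutate less to preserve good solutions
    # Clamp to the 0.1% to 16.6% range quoted on the slide.
    return min(max_rate, max(min_rate, rate))


if __name__ == "__main__":
    print(adjust_mutation_rate(0.003, mean_fitness=0.99, best_fitness=1.0))
    print(adjust_mutation_rate(0.003, mean_fitness=0.50, best_fitness=1.0))
```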

  26. Elitism • The most fit solution was passed unaltered to the next generation

  27. Rationale • The idea behind the relatively low crossover and mutation rates is to avoid removing promising solutions from each generation too rapidly

  28. Repetition • The paper states: “Repeating this procedure many times with different random number seeds helps to ensure that the minimum found is truly global” • It does not say how many repetitions “many times” means, though

  29. Repetition • In a later paper, he uses 5 repetitions • This result was obtained in the following way…

  30. Known parameter values were put into the model, and pulsation periods were generated. • The genetic algorithm then attempted to recover the original parameters from the output of the model • This was done 20 times, and the results were as follows…

  31. Results (second paper) • First Order Solution…

  32. The genetic algorithm found the exact result in 9 of the 20 runs, and was close enough on four other occasions for the correct result to be determined by the addition of some other iterative technique, for a total of 13 out of 20, or 65% accuracy.

  33. If the GA was rerun and the best result selected, the accuracy increased to 88% • After 5 runs, the accuracy was over 99% • Because no run found the correct answer for the first time after 200 generations, the number of generations was reduced to 200
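The 88% and 99% figures follow from the single-run success rate if the runs are treated as independent, which is an assumption of this sketch: the chance that at least one of n runs succeeds is 1 - (1 - p)^n. The function name is hypothetical.

```python
def success_after_runs(p_single, runs):
    """Probability that at least one of `runs` independent runs succeeds."""
    return 1.0 - (1.0 - p_single) ** runs


if __name__ == "__main__":
    p = 13 / 20  # 65% single-run accuracy from the 20 test runs
    for n in (1, 2, 5):
        print(n, round(success_after_runs(p, n), 4))
```

With p = 0.65, two runs give 1 - 0.35² = 0.8775, about 88%, and five runs give about 0.9947, which is over 99%.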

  34. Output Curve

  35. Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion

  36. Problem Division • Part one: running the numerical model using a large number of different initial parameters. • Part two: determining fitness, selecting the next generation, and performing crossover/mutation

  37. Master-Slave Paradigm • Part one – running the model with a given set of parameters was performed by the slave nodes • Part two – fitness evaluation, selection/crossover/mutation was performed by the master node

  38. PVM • PVM was used as the message passing library

  39. Execution • The master machine generates a job pool of parameter values that it passes to the slave machines. • The slave machines in turn run the model and return the results to the master. • If there are more parameter sets available, the node is given another job.

  40. Execution • The master calculates variance. • Determines fitness. • After the models have been run for a given generation, the master determines the members of the next generation and runs the crossover/mutation methods on the appropriate portion of the new population. • As the new parameters are created, they are sent to the workstations.
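The division of labour on the two Execution slides can be sketched with a standard-library worker pool. This is not PVM: a thread pool stands in for the slave nodes, `run_model` is a hypothetical placeholder for the white dwarf model, and the fitness mapping `1 / (1 + rms)` is an assumption. Only the overall shape (workers run models from a job pool, the master scores the results) follows the slides.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def run_model(params):
    """Worker side: hypothetical stand-in model returning three 'calculated periods'."""
    mass, temperature = params
    return [100.0 * mass + 0.01 * temperature * k for k in (1, 2, 3)]


def rms(observed, calculated):
    """Root-mean-square difference between two equal-length period lists."""
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(observed, calculated))
                     / len(observed))


def evaluate_generation(population, observed, workers=4):
    """Master side: hand out parameter sets, collect model output, score each member."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Jobs wait in a queue; each idle worker picks up the next parameter set.
        calculated = list(pool.map(run_model, population))
    # Fitness evaluation stays on the master, as in the slides.
    return [1.0 / (1.0 + rms(observed, c)) for c in calculated]


if __name__ == "__main__":
    population = [(0.60, 25000.0), (0.65, 24000.0), (0.55, 26000.0)]
    observed = run_model(population[0])  # pretend member 0 is the true star
    print(evaluate_generation(population, observed))
```

Because each model run is independent of the others, the job pool keeps every node busy even when the nodes differ in speed; only selection, crossover and mutation have to wait for a full generation of results.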

  41. The Network • The cluster is composed of one master computer and 64 slave nodes • The cluster is divided into three subnets • Each subnet is connected to the master serially, using coaxial cable and a 10BASE2 (thin Ethernet) system

  42. Darwin • Pentium-II 333 MHz system with 128 MB RAM • Two 8.4 GB hard disks. • Three NE-2000 compatible network cards, one for each of the segments

  43. Darwin

  44. Nodes • Motherboard • Processor • Single 32 MB RAM chip • NE-2000 compatible network card • No Hard drive!

  45. Nodes • Half of the nodes contain Pentium-II 300 MHz processors, while the other half are AMD K6-II 450 MHz chips

  46. The Cluster

  47. Conclusion • Based on initial results, the use of genetic algorithms appears to be a promising method for minimizing the residual difference between observational data and the Wilson-Devinney model

  48. Conclusion • It is also a wonderful example of how parallel computing, open source software and clusters of workstations can have a profound impact on the course of research.

  49. PIKAIA Namesake “Pikaia Gracilens, a little worm-like beast that crawled in the mud of a long gone seafloor of the Cambrian era, 530 million years ago. While not particularly impressive in the tooth and claw department, Pikaia is believed to be the founder of the phylum Chordata, whose subsequent evolution had consequences still very much felt today by the rest of the ecosystem”
