PARALLEL GENETIC ALGORITHMS AND THE SCIENCE OF ASTEROSEISMOLOGY A Review of the Doctoral Dissertation Research of Dr. Travis Metcalfe
Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion
Introduction Astronomers observe the universe and gather information about it. They then fit this information into mathematical models. The process of “fitting” involves adjusting the many parameters of the model. When they have a good fit, they use the parameter settings to tell them something about the object or phenomenon they are studying. The author uses a parallel genetic algorithm to solve this problem of optimization.
The Goal of the Research • To further the understanding of the composition and characteristics of white dwarfs • More generally, since white dwarfs are the endpoint for all but the most massive stars, this research can lead to a better understanding of stellar evolution
Traditional Technique • Make an initial “guess” for parameter values • Use some iterative technique to improve upon the initial guesses.
Adjustable Input Parameters • Mass • Temperature • H and He layer masses • Convective Efficiency • Core composition
Problem with this technique • Results often depend on the initial guess • The initial guess is inherently subjective, often the result of intuition or past experience
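The dependence on the initial guess can be shown with a small sketch. The function `f` below is a hypothetical one-parameter misfit with two minima, not the white dwarf model, and plain gradient descent stands in for "some iterative technique"; both are assumptions made only for illustration.

```python
def f(x):
    """Toy misfit with two local minima (hypothetical, not a stellar model)."""
    return x**4 - 3 * x**2 + x


def df(x):
    """Derivative of the toy misfit."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, step=0.01, iterations=2000):
    """Improve an initial guess x0 by repeatedly stepping downhill."""
    x = x0
    for _ in range(iterations):
        x -= step * df(x)
    return x


# Two different initial guesses settle into two different minima.
left = gradient_descent(-2.0)
right = gradient_descent(2.0)
print(left, f(left))
print(right, f(right))
```

Only the guess that starts on the left finds the deeper minimum, which is the kind of subjectivity a global method such as a genetic algorithm is meant to remove.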
The Genetic Algorithm • A genetic algorithm provides a more systematic approach to optimizing the results • The genetic algorithm used was PIKAIA • PIKAIA is a general purpose “function optimization” genetic algorithm • Public domain software • Fortran-77
Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion
White dwarfs that show a regular variation in light intensity are known as pulsating white dwarfs • Using photometric techniques, this variation in intensity can be measured very accurately with instruments such as the Whole Earth Telescope (WET)
The pulsation is the result of seismic activity within the white dwarf • Just as seismological information can be used to study the internal nature of the Earth, seismological data, expressed as varying stellar luminosity, can be used to determine the characteristics of these pulsating white dwarfs.
Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion
Initial Conditions • Population size: 1000 (in later work this was reduced to 128). • No rationale was given for how the initial population size was chosen, or why it was changed. • For each member of the initial population, parameter values are set randomly
Duration • The run continued until the difference between the average fitness and the best fitness in the population was less than 1%. • In later work, he used a constant 200 generations.
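A minimal sketch of the first stopping rule follows. It assumes that fitness values are positive, that larger is better, and that the 1% is measured relative to the best fitness; the slides do not say how the difference was normalized, so that choice is an assumption.

```python
def has_converged(fitnesses, tolerance=0.01):
    """Return True when mean and best fitness differ by less than `tolerance`.

    Assumption: the difference is taken relative to the best fitness,
    and all fitness values are positive.
    """
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    return (best - mean) / best < tolerance


print(has_converged([1.0, 0.995, 0.999]))  # population has nearly collapsed
print(has_converged([1.0, 0.5]))           # population is still diverse
```

The later fixed budget of 200 generations needs no such test; the loop simply counts generations.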
Fitness Measurement • The model is then run using these initial values • Fitness is based on the root-mean-square differences between the observed and calculated pulsation periods
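The root-mean-square residual can be sketched as below. The mapping from residual to fitness, `1 / (1 + rms)`, is an assumption chosen so that a smaller residual gives a larger fitness; the slides only say that fitness is "based on" the RMS differences. The period values in the example are made up.

```python
import math


def rms_residual(observed, calculated):
    """Root-mean-square difference between observed and calculated periods."""
    if len(observed) != len(calculated):
        raise ValueError("period lists must have the same length")
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


def fitness(observed, calculated):
    """Assumed mapping: fitness falls from 1 toward 0 as the residual grows."""
    return 1.0 / (1.0 + rms_residual(observed, calculated))


# Hypothetical pulsation periods in seconds, for illustration only.
observed = [215.2, 271.0, 304.5]
calculated = [214.9, 272.1, 303.8]
print(rms_residual(observed, calculated), fitness(observed, calculated))
```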
Fitness Measurement • The fitness value is converted to a survival probability by normalizing with respect to the most fit member • The next generation is chosen randomly. This random selection is weighted, based on each member’s survivability ratio
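The normalization and weighted random selection described above can be sketched as follows. Function names are hypothetical, and sampling with replacement via `random.Random.choices` is an assumption; the slides do not specify the exact selection scheme.

```python
import random


def survival_probabilities(fitnesses):
    """Normalize each fitness by the fittest member, so the best scores 1.0."""
    best = max(fitnesses)
    return [f / best for f in fitnesses]


def select_next_generation(population, fitnesses, rng):
    """Draw a new population of the same size, weighted by survival probability."""
    weights = survival_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=len(population))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = ["a", "b", "c"]
print(survival_probabilities([2.0, 1.0, 4.0]))
print(select_next_generation(population, [2.0, 1.0, 4.0], rng))
```

Fitter members are likely to appear more than once in the new population, while weak members tend to drop out.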
Crossover • Numerical encoding • The parameter values of each member are concatenated into one long string of digits • A single-point crossover technique is used. The position along the string is picked randomly
Mutation • Mutation is achieved by randomly selecting a number in the string and changing it to a new, randomly chosen value
Illustration • Consider two members, each with two parameters. • M1 has X=2.573 and Y= 4.457. • M2 has parameter values X=3.547 and Y=2.332. • After encoding, M1=25734457 and M2=35472332
Illustration • The crossover point is randomly chosen, and the string segments after it are swapped • M1: 25734|457 becomes 25734332 • M2: 35472|332 becomes 35472457
Illustration • Mutating M1 involves picking a random spot along the string and changing that value • M1: 257|3|4332 becomes 25784332
Illustration* • The strings would then be parsed back into parameter values. For M1, this would be: M1 X= 2.578 Y=4.332 * Modified from [1]
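The illustration can be reproduced with a short sketch. The encoding here (four digits per parameter, values between 0 and 10) is a simplification that matches the slide example, not necessarily the encoding PIKAIA uses internally. The crossover point and mutation position are passed in explicitly so the numbers match the slides; in the genetic algorithm they would be chosen at random.

```python
def encode(params, digits=4):
    """Concatenate parameters into one digit string, e.g. 2.573 -> '2573'."""
    scale = 10 ** (digits - 1)
    return "".join(f"{round(p * scale):0{digits}d}" for p in params)


def decode(string, digits=4):
    """Split the digit string back into parameter values."""
    scale = 10 ** (digits - 1)
    return [int(string[i:i + digits]) / scale
            for i in range(0, len(string), digits)]


def crossover(a, b, point):
    """Single-point crossover: swap everything after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(string, position, new_digit):
    """Replace the digit at `position` (0-based) with `new_digit`."""
    return string[:position] + new_digit + string[position + 1:]


m1 = encode([2.573, 4.457])      # '25734457'
m2 = encode([3.547, 2.332])      # '35472332'
m1, m2 = crossover(m1, m2, 5)    # '25734332', '35472457'
m1 = mutate(m1, 3, "8")          # '25784332'
print(decode(m1))                # [2.578, 4.332]
```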
Crossover and Mutation Rate • The crossover rate: 65% • The mutation rate: 0.3% • In later work, the author increased the crossover rate to 85% and varied the mutation rate from 0.1% to 16.6%, depending on the difference between the mean fitness value and the best fitness value
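One way a variable mutation rate of this kind could work is sketched below. Only the bounds (0.1% and 16.6%) come from the slides. The spread measure, the thresholds `low` and `high`, and the adjustment `factor` are hypothetical values chosen for illustration, and the direction of adjustment (raise the rate when the population has converged, lower it when it is still diverse) is an assumption about the intent.

```python
def adjust_mutation_rate(rate, fitnesses, low=0.05, high=0.25, factor=1.5,
                         min_rate=0.001, max_rate=0.166):
    """Nudge the mutation rate according to how spread out the population is.

    Assumed spread measure: (best - mean) / (best + mean), fitness positive.
    """
    best = max(fitnesses)
    mean = sum(fitnesses) / len(fitnesses)
    spread = (best - mean) / (best + mean)
    if spread <= low:
        # Population has nearly converged: mutate more to restore diversity.
        rate = min(max_rate, rate * factor)
    elif spread >= high:
        # Population is still diverse: mutate less to keep good solutions.
        rate = max(min_rate, rate / factor)
    return rate


print(adjust_mutation_rate(0.003, [1.0, 0.99, 0.98]))  # converged population
print(adjust_mutation_rate(0.003, [1.0, 0.1]))         # diverse population
```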
Elitism • The fittest solution was passed unaltered to the next generation
Rationale • The idea behind the relatively low crossover and mutation rate is to prevent removing promising solutions from each generation too rapidly
Repetition • The paper states: “Repeating this procedure many times with different random number seeds helps to ensure that the minimum found is truly global” • It does not say how many repetitions “many times” means, though
Repetition • In a later paper, he uses 5 repetitions • This number was arrived at in the following way…
Values were put in for the model, and pulsation periods generated. • The genetic algorithm attempted to find the original parameters based on the output of the model • This was done 20 times, and the results were as follows…
Results (second paper) • First Order Solution…
The genetic algorithm found the exact result in 9 of the 20 runs, and was close enough on four other occasions for the correct result to be determined by the addition of some other iterative technique, for a total of 13 out of 20, or 65% accuracy.
If the GA was rerun and the better of the two results selected, the accuracy increased to 88% • After 5 runs, the accuracy was over 99% • Because no run first found the correct answer after generation 200, the number of generations was limited to 200
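The 88% and 99% figures follow from the single-run success rate if the runs are assumed to be independent (an assumption; the symbols p, n and P_n are introduced here for the worked example). With p the probability that one run succeeds, the probability that at least one of n runs succeeds is:

```latex
\begin{align*}
P_n &= 1 - (1 - p)^n, \qquad p = \tfrac{13}{20} = 0.65 \\
P_2 &= 1 - 0.35^2 = 0.8775 \approx 88\% \\
P_5 &= 1 - 0.35^5 \approx 0.9947 > 99\%
\end{align*}
```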
Outline • Introduction • The Science of Asteroseismology • The Genetic Algorithm • Parallel Computing • Conclusion
Problem Division • Part one: running the numerical model using a large number of different initial parameters. • Part two: determining fitness, selecting the next generation, and performing crossover/mutation
Master-Slave Paradigm • Part one – running the model with a given set of parameters was performed by the slave nodes • Part two – fitness evaluation, selection/crossover/mutation was performed by the master node
PVM • PVM was used as the message passing library
Execution • The master machine generates a job pool of parameter values that it passes to the slave machines. • The slave machines in turn run the model and return the results to the master. • If there are more parameter sets available, the node is given another job.
Execution • The master calculates the variance and determines fitness. • After the models have been run for a given generation, the master determines the members of the next generation and runs the crossover/mutation methods on the appropriate portion of the new population. • As the new parameter sets are created, they are sent to the workstations.
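The division of labor can be sketched as below. This is not PVM: Python's `concurrent.futures.ThreadPoolExecutor` stands in for the message passing layer, threads stand in for the slave nodes, and `run_model` is a hypothetical toy function in place of the numerical model.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(params):
    """Stand-in for the expensive model run on a worker (toy function)."""
    x, y = params
    return (x - 1.0) ** 2 + (y - 2.0) ** 2


def evaluate_generation(population, workers=4):
    """Master side: hand parameter sets to idle workers, collect the results.

    Each worker takes another job from the pool as soon as it finishes one;
    results come back in the same order as the population.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_model, population))


population = [(1.0, 2.0), (0.0, 0.0), (2.0, 2.0)]
print(evaluate_generation(population))
```

Because the model runs are independent of one another, this part of the work parallelizes cleanly; selection, crossover and mutation remain on the master.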
The Network • The cluster is composed of one master computer and 64 slave nodes • The cluster is divided into three subnets • Each subnet is connected to the master serially, using coaxial cable and a 10BASE2 (thin Ethernet) system
Darwin • Pentium-II 333 MHz system with 128 MB RAM • Two 8.4 GB hard disks. • Three NE-2000 compatible network cards, one for each of the segments
Nodes • Motherboard • Processor • Single 32 MB RAM chip • NE-2000 compatible network card • No Hard drive!
Nodes • Half of the nodes contain Pentium-II 300 MHz processors, while the other half are AMD K6-II 450 MHz chips
Conclusion • Based on initial results, the use of genetic algorithms appears to be a promising method for minimizing the residual difference between observational data and the Wilson-Devinney model
Conclusion • It is also a wonderful example of how parallel computing, open source software and clusters of workstations can have a profound impact on the course of research.
PIKAIA Namesake “Pikaia gracilens, a little worm-like beast that crawled in the mud of a long gone seafloor of the Cambrian era, 530 million years ago. While not particularly impressive in the tooth and claw department, Pikaia is believed to be the founder of the phylum Chordata, whose subsequent evolution had consequences still very much felt today by the rest of the ecosystem”