200 likes | 212 Views
The Standard Genetic Algorithm. Dr. Chrisantha Fernando Systems Biology Centre University of Birmingham. DIY Evolution. Genotype. Represents length of left leg. Binary String Real numbered String. Size of head, etc…. 1101011111110101. 0.2, 0.5, 0.2, 0.83, 0.01, 0.3. 1101000101110101.
E N D
The Standard Genetic Algorithm Dr. Chrisantha Fernando Systems Biology Centre University of Birmingham
Genotype Represents length of left leg. • Binary String • Real numbered String Size of head, etc… 1101011111110101 0.2, 0.5, 0.2, 0.83, 0.01, 0.3 1101000101110101 0.6, 0.1, 0.5, 0.8, 0.9, 1.0
Evaluation • Interpret the genotype to produce the phenotype. • In the simplest case they are the same thing. • E.g. Imagine we desire the string • 000000000 • We can define fitnessof any string as the number of places where it is the same as the above string, e.g. • 000000000 • 010101010 • ------------- 101010101 = 5 = Fitness
A Trivial Example • int evaluate(int *g) { • int i, • r=0; • for (i=0;i<10;i++) • r += (g(i) == 0); • return(r); • } I. Harvey
So first initialize a population • int popn[30][10]; • void initialise_popn() { • int i,j; • for (i=0;i<30;i++) • for (j=0;j<10;j++) • popn[i][j]= flip_a_bit(); • } I. Harvey
Main Loop • For n times round generation loop • evaluate all the population (of 30) • select preferentially the fitter ones as parents • for 30 times round repro loop • pick 2 from parental pool • recombine to make 1 offspring • mutate the offspring • end repro loop • throw away parental generation and replace with offspring • End generation loop I. Harvey
More complicated Evaluations • Genotype encodes a neural network. Dario Floreano’s Lab
Problems in Practice • An evaluation may take a long time, in which case the genetic algorithm will be slow. • An evaluation may be noisy, so the same agent may have different fitness each time you make an evaluation. If it is too noisy, then good agents may be lost from the population.
Selection Methods • Truncation Selection • All parents come from top 50% or top 20% etc.. • Fitness Proportionate Selection • E.g. if all fitnesses are 2, 4, 6, 8, 9, then select parent using roulette wheel selection with probability 2/29, 4/29, 6/29, 8/29, 9/29 • Problems with this are • If early on one agent dominates there is too much selective pressure. • If later agents have very similar fitnesses there is too little selective pressure. • Scaling methods can be used to get around these problems.
Rank Selection: Ignore absolute fitness, can use other slopes, but here the best is selected 2x more than the average. 2 0
Elitism • Force a direct un-mutated copy of the best of the last generation. • Never loose the best. • Useful if there is a lot of noise in fitness assessments, or if mutation is very likely to produce a low fitness offspring.
Mutation (Asexual Reproduction) • Mutate at randomly chosen loci with a small probability. • Mutate all loci by a very small amount (vector mutation). • With binary you do bit flips, with real valued mutation you might multiply the value by a Gaussian distributed random number.
Recombination (Sexual Reproduction) Parent A 10100010101111 10100010000111 Parent B 00000011000111 00000011101111 1-point random equal crossover Uniform crossover 00100010100111
The Competing Conventions Problem Head, Eye, Face, Leg Face, Eye, Leg, Head Head, Eye, Face, Head Face, Eye, Leg, Leg OK, so two heads, and two legs. Sometimes competing conventions can help, some- times it can hinder.
Schema Theorem • John Holland. • A theory of how GAs work. Not everyone agrees with this, but it is worth reading his book if you are interested.
A Black Art • No universal algorithm suitable for all cases. • Need to get a feeling for it by doing it.
Relationship to Real Genomes • Usually GAs use haploid genomes, not diploid ones, i.e. there is only one copy of each ‘gene’.
Homework. I’m happy to help if you need. C.T.Fernando@cs.bham.ac.uk • The Card Problem • You have 10 cards numbered from 1 to 10. You have to choose a way of dividing them into 2 piles, so that the cards in Pile_0 SUM to a number as close as possible to 36, and the remaining cards in Pile_1 MULTIPLY to a number as close as possible to 360. • Genotype encoding • Each card can be in Pile_0 or Pile_1, there are 1024 possible ways of sorting them into 2 piles, and you have to find the best. Think of a sensible way of encoding any possible solution-attempt as a genotype. • Fitness • Some of these solution-attempts will be closer to the target than others. Think of a sensible way of evaluating any solution-attempt and scoring it with a fitness-measure. • The GA • Write a program, in any sensible programming language, to run a GA with your genotype encoding and Fitness function. Run it 100 times and see what results you get. Who can write the GA that solves the problem in the least number of generations?