Fitness Functions
Fitness Evaluation • A key component in a GA • Time/quality trade-off • Multi-criterion fitness
Fitness function • Nature: only survival and reproduction count ("how well do I do in my environment?") • Fitness space structure: defined by the kinship of genotypes and the fitness function • Advantage: a visual representation can be useful when thinking about model design • Limitation: the picture can be too simplistic outside toy problems; real spaces and movements are complex (think of crossover!)
Fitness space or landscape • Schema of genetic kinship • How we "move" in that landscape over generations is defined by our variability operators, usually mutation and recombination • Now add fitness as a further axis • [Diagram: genotypes 0000, 0001, 0010, 0011, 0100, 0110, 1000, 1001, 1100 arranged so that neighbours differ by a Hamming distance of 1, shown first without and then with a fitness axis]
Fitness landscapes contd. • x/y axes: kinship, i.e. the more genetic resemblance, the closer together • z axis: fitness • Every "snowflake" is one individual; the search focuses on "promising" regions (due to differential reproduction) • Animation adapted from Andy Keane, University of Southampton
Fitness space – Good design • Easy to find the optimum by local search • Neighbouring genotypes have similar fitness (smooth curve, hence high evolvability) • [Plot: fitness over genotypes]
Fitness space – Bad design • Here we will have a hard time finding the optimum • Low evolvability (fitness is essentially right/wrong) • Either the problem is not well suited for a GA, or the design is bad • [Plot: fitness over genotypes]
Fitness space – Mediocre design • Many local optima, so we are likely to find one • However, there is not much of a gradient towards the global optimum; random search could do as well • [Plot: fitness over genotypes]
Dynamic fitness landscape • Fitness does not need to be static over generations • Can allow the search to reach regions otherwise not covered • Natural fitness is certainly very dynamic • Animation by Michael Herdy, TU Berlin
Fitness Function Purpose • Parent selection • Measure for convergence • For steady state: selection of individuals to die • Should reflect the value of the chromosome in some "real" way • Next to the coding, this is the most critical part of a GA
Fitness scaling • Fitness values are scaled by subtraction and division • so that the worst value is close to 0 • and the best value is close to a certain value, typically 2 • The chance for the most fit individual is then about 2 times the average • The chance for the least fit individual is close to 0 • Problems arise when the original maximum is very extreme (super-fit) or the original minimum is very extreme (super-unfit) • Can be solved by defining a minimum and/or a maximum value for the scaled fitness • [Figure: example of fitness scaling]
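A minimal sketch of the scaling just described, assuming non-negative raw fitness values; the function name scale_fitness and its parameters are illustrative, not part of the original material. The optional ceiling argument shows how a hard maximum can tame a super-fit outlier.

```python
import numpy as np

def scale_fitness(raw, best_target=2.0, ceiling=None):
    """Scale by subtraction and division so the worst raw value maps to ~0
    and the best to ~best_target (typically 2)."""
    f = np.asarray(raw, dtype=float)
    f_min, f_max = f.min(), f.max()
    if np.isclose(f_max, f_min):                  # degenerate: all equally fit
        return np.full_like(f, best_target / 2.0)
    scaled = best_target * (f - f_min) / (f_max - f_min)
    if ceiling is not None:                       # optional cap for a super-fit outlier
        scaled = np.minimum(scaled, ceiling)
    return scaled

# One super-fit individual dominates the raw values:
print(scale_fitness([1.0, 2.0, 3.0, 50.0]))       # roughly [0, 0.04, 0.08, 2]
```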
Fitness windowing • Fitness windowing is the same as fitness scaling, except that the amount subtracted is the minimum fitness observed in the n previous generations (e.g. n = 10) • Same problems as with scaling
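A small sketch of the windowing idea under the same assumptions; the rolling-window bookkeeping (class name, deque) is an illustrative choice, not from the slides.

```python
from collections import deque

class FitnessWindow:
    """Subtract the minimum fitness observed in the last n generations."""
    def __init__(self, n=10):
        self.minima = deque(maxlen=n)      # per-generation minima, last n kept

    def adjust(self, raw_fitness):
        self.minima.append(min(raw_fitness))
        baseline = min(self.minima)        # minimum over the window
        return [f - baseline for f in raw_fitness]
```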
Fitness ranking • Individuals are numbered in order of increasing fitness • The rank in this order is the adjusted fitness • The starting number and the increment can be chosen in several ways and influence the results • No problems with super-fit or super-unfit individuals • Fitness ranking is often superior to scaling and windowing
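A short sketch of rank-based fitness assignment; start and increment correspond to the "starting number and increment" mentioned above, everything else is illustrative.

```python
def rank_fitness(raw, start=1.0, increment=1.0):
    """Adjusted fitness = rank in order of increasing raw fitness, so
    super-fit or super-unfit outliers no longer distort selection pressure."""
    order = sorted(range(len(raw)), key=lambda i: raw[i])   # indices, worst first
    adjusted = [0.0] * len(raw)
    for rank, idx in enumerate(order):
        adjusted[idx] = start + rank * increment
    return adjusted

print(rank_fitness([36, 9, 100, 441, 1]))   # -> [3.0, 2.0, 4.0, 5.0, 1.0]
```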
Multi-Criterion Fitness • Dominance and indifference • For an optimization problem with more than one objective function (fi, i = 1, 2, …, n), given any two solutions X1 and X2: • Solution X1 dominates X2 (X1 ≻ X2) if fi(X1) ≥ fi(X2) for all i = 1, …, n, with strict inequality for at least one i • Solution X1 is indifferent to X2 (X1 ~ X2) if X1 does not dominate X2 and X2 does not dominate X1
Multi-Criterion Fitness • Pareto optimal set: if there exists no solution in the search space that dominates any member of the set P, then the solutions belonging to P constitute a global Pareto-optimal set • The corresponding objective values form the Pareto-optimal front • Dominance check → global Pareto-optimal set
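The dominance check and the extraction of the non-dominated set can be written directly from the definitions above; this sketch assumes maximization and represents each solution only by its tuple of objective values.

```python
def dominates(f1, f2):
    """f1 dominates f2: no objective is worse and at least one is strictly better."""
    return all(a >= b for a, b in zip(f1, f2)) and any(a > b for a, b in zip(f1, f2))

def pareto_set(objective_vectors):
    """Indices of the non-dominated vectors (the Pareto set of the candidates given)."""
    return [i for i, fi in enumerate(objective_vectors)
            if not any(dominates(fj, fi)
                       for j, fj in enumerate(objective_vectors) if j != i)]

vectors = [(3, 4), (4, 3), (2, 2), (4, 4)]
print(pareto_set(vectors))   # -> [3], only (4, 4) is non-dominated
```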
Multi-Criterion Fitness • Weighted sum: F(X) = w1·f1(X) + w2·f2(X) + … + wn·fn(X) • Problems with using the weighted sum? • It only handles convex Pareto-optimal fronts; solutions on non-convex parts of the front cannot be obtained • Sensitive to the shape of the Pareto-optimal front • Selection of weights? Requires some pre-knowledge • Not reliable for problems involving uncertainties
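For completeness, the weighted sum itself is a one-liner; the weights and objective values below are made up for illustration.

```python
def weighted_sum(objective_values, weights):
    """F(X) = w1*f1(X) + ... + wn*fn(X); only reaches the convex part of the front."""
    return sum(w * f for w, f in zip(weights, objective_values))

print(weighted_sum((0.8, 0.3), (0.5, 0.5)))   # -> 0.55 (up to floating point)
```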
Multi-Criterion Fitness • Optimizing a single objective • Maximize fk(X) subject to fj(X) ≤ Kj for all j ≠ k and X ∈ F, where F is the solution space • The bounds Kj act as minimum/maximum constraints on the remaining objectives
Multi-Criterion Fitness • Preference-based weighted sum (ISMAUT: Imprecisely Specified Multiple Attribute Utility Theory) • F(X) = w1·f1(X) + w2·f2(X) + … + wn·fn(X) • Preference: given two known individuals X and Y, if we prefer X over Y, then F(X) > F(Y), that is w1(f1(X) - f1(Y)) + … + wn(fn(X) - fn(Y)) > 0
Multi-Criterion Fitness • All such preference constraints together restrict the admissible weight vectors W = (w1, w2, …, wn): • w1(f1(X) - f1(Y)) + … + wn(fn(X) - fn(Y)) > 0 • w1(f1(Z) - f1(P)) + … + wn(fn(Z) - fn(P)) > 0, etc. • For any two new individuals Y′ and Y″, how do we determine which one is more preferable?
Other parameters of a GA (1) • Initialization: • Population size • Random • Dedicated greedy algorithm • Reproduction (a sketch of the replacement schemes follows below): • Generational: as described before (cf. insects) • Generational with elitism: a fixed number of the most fit individuals are copied unmodified into the new generation • Steady state: two parents are selected to reproduce and two individuals are selected to die; the two offspring are immediately inserted into the pool (cf. mammals)
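A compact sketch of the two replacement schemes, assuming a population stored as (individual, fitness) pairs and externally supplied select_parent, crossover, mutate, and evaluate functions (all hypothetical names, not from the slides).

```python
def generational_with_elitism(pop, select_parent, crossover, mutate, evaluate, n_elite=1):
    """Copy the n_elite fittest unchanged, fill the rest of the new generation with offspring."""
    pop = sorted(pop, key=lambda pair: pair[1], reverse=True)
    new_pop = pop[:n_elite]                               # elites survive unmodified
    while len(new_pop) < len(pop):
        child = mutate(crossover(select_parent(pop), select_parent(pop)))
        new_pop.append((child, evaluate(child)))
    return new_pop

def steady_state_step(pop, select_parent, crossover, mutate, evaluate):
    """Two parents reproduce, two individuals die (here: the two least fit),
    and the offspring enter the pool immediately."""
    c1 = mutate(crossover(select_parent(pop), select_parent(pop)))
    c2 = mutate(crossover(select_parent(pop), select_parent(pop)))
    survivors = sorted(pop, key=lambda pair: pair[1], reverse=True)[:-2]
    return survivors + [(c1, evaluate(c1)), (c2, evaluate(c2))]
```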
Other parameters of a GA (2) • Stop criterion: • Number of new chromosomes • Number of new and unique chromosomes • Number of generations • Measure: • Best of population • Average of population • Duplicates: • Accept all duplicates • Avoid too many duplicates, because that degenerates the population (inbreeding) • No duplicates at all
Example run • [Plot: maxima and averages of steady-state and generational replacement]
Integrating problem knowledge • Problem knowledge is always integrated to some degree in the representation/mapping • Create a more complex fitness function • Choose the start population instead of drawing it uniformly at random • Useful e.g. if there are constraints on the range of solutions • Possible problems: loss of diversity and bias
Design decisions • GAs offer high flexibility and adaptability because of their many options: • Problem representation • Genetic operators with their parameters • Mechanism of selection • Size of the population • Fitness function • Decisions are highly problem dependent • Parameters are not independent; you cannot optimize them one by one
Hints for the parameter search • Find a balance between: • Exploration (new search regions) • Exploitation (exhaustive search in the current region) • Parameters can be adaptive, e.g. going from high at the beginning (exploration) to low later on (exploitation), or even be subject to evolution themselves (see the sketch below) • The balance is influenced by: • Mutation, recombination: create individuals in new regions (diversity!) vs. fine-tuning in current regions • Selection: focus on interesting regions
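One simple way to realize the exploration-to-exploitation shift sketched here is to let the per-bit mutation rate decay over the generations; the schedule and constants are illustrative only.

```python
def mutation_rate(generation, start=0.20, end=0.01, half_life=25.0):
    """Exponentially decay the mutation rate from `start` towards `end`."""
    return end + (start - end) * 0.5 ** (generation / half_life)

for g in (0, 25, 100):
    print(g, round(mutation_rate(g), 3))   # 0.2 at g=0, 0.105 at g=25, ~0.022 at g=100
```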
Keep in mind • The start population has a lot of diversity • "Invest" search time in areas that have proven good in the past • Diversity is lost over evolutionary time • Premature convergence: a quick loss of diversity poses a high risk of getting stuck in local optima • Evolvability: the fitness landscape should not be too rugged • Heredity of traits: small genetic changes should map to small phenotype changes
GA Evolution • [Plot: accuracy in percent vs. generations] • Source: http://www.sdsc.edu/skidl/projects/bio-SKIDL/
Genetic algorithm learning • [Plot: fitness criteria vs. generations] • Source: http://www.demon.co.uk/apl385/apl96/skom.htm
[Plot: fitness value (scaled) vs. iteration]
[Plot: percent improvement over a hillclimber vs. iteration]
An example after Goldberg '89 (1) • Simple problem: max x² over {0, 1, …, 31} • GA approach: • Representation: binary code, e.g. 01101 ↔ 13 • Population size: 4 • 1-point crossover, bitwise mutation • Roulette wheel selection • Random initialization • We show one generational cycle done by hand
Simple example: f(x) = x² • Finding the maximum of a function: f(x) = x² over the range [0, 31] • Goal: find the maximum (which is 31² = 961) • Binary representation: string length 5 gives 2⁵ = 32 numbers (0-31) • Fitness = f(x); we calculate fitness on the phenotype
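The same setup in a few lines of Python: 5-bit strings decoded to integers, with fitness evaluated on the phenotype (a sketch, not code from the original slides).

```python
def decode(bits):                  # genotype "01101" -> phenotype 13
    return int(bits, 2)

def fitness(bits):                 # fitness is evaluated on the phenotype
    x = decode(bits)
    return x * x                   # f(x) = x^2, maximum at x = 31 -> 961

print(decode("01101"), fitness("01101"))   # -> 13 169
```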
F(x) = x² – Start population

            binary   value   fitness
String 1     00110       6        36
String 2     00011       3         9
String 3     01010      10       100
String 4     10101      21       441
String 5     00001       1         1

• Fitness = the function value that will be maximized • Fittest individual = maximum of the function
F(x) = x² – Selection (same table as above) • The worst individual (String 5, fitness 1) is removed • The best individual (String 4, fitness 441) reproduces twice, keeping the population size constant • All other strings are reproduced once
x² example: selection • Roulette wheel selection (each individual is selected with probability proportional to its fitness)
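A hedged sketch of roulette wheel selection applied to the start population above: each individual is drawn with probability proportional to its fitness.

```python
import random

def roulette_select(population, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    pick = random.uniform(0.0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]              # numerical safety net

pop  = ["00110", "00011", "01010", "10101", "00001"]
fits = [36, 9, 100, 441, 1]            # from the start-population table
print(roulette_select(pop, fits))      # "10101" is drawn most often (441/587 of the draws)
```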
F(x) = x² – Crossover • Parents and crossover position are selected at random (equal recombination):

partner                x-position
String 1 × String 2             4
String 3 × String 4             2

String 1: 0011|0  and  String 2: 0001|1  →  offspring 00111 and 00010
String 3: 01|010  and  String 4: 10|101  →  offspring 01101 and 10010
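The crossover step above can be reproduced with a few lines of Python (a sketch of standard 1-point crossover on bit strings):

```python
import random

def one_point_crossover(parent1, parent2, point=None):
    """Cut both parents at the same position and swap the tails."""
    if point is None:
        point = random.randint(1, len(parent1) - 1)   # random cut position
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

print(one_point_crossover("00110", "00011", point=4))   # -> ('00111', '00010')
print(one_point_crossover("01010", "10101", point=2))   # -> ('01101', '10010')
```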
F(x) = x² – Mutation • Bit-flip mutation: • Offspring String 1: 00111 (7) → 10111 (23) • String 4: 10101 (21) → 10001 (17) • Not necessarily only one bit is flipped; the flipped positions are selected at random, and anywhere from 0 up to all 5 bits may change • The other strings are only recombined; only the best ones are mutated
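A sketch of the bit-flip operator used here: each bit is flipped independently, so zero or up to all five bits may change (the mutation probability below is an illustrative value).

```python
import random

def bit_flip_mutation(bits, p_flip=0.2):
    """Flip each bit independently with probability p_flip."""
    return "".join(b if random.random() > p_flip else ("1" if b == "0" else "0")
                   for b in bits)

print(bit_flip_mutation("00111"))   # random result, e.g. '10111' when only the first bit flips
```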
x² example: mutation • Suppose this mutation did not happen