Speciation/Niching
The original SGA (Simple GA) is designed to rapidly search the landscape (exploration) and zoom in (exploitation) on a single solution. This scheme suffers from genetic drift and premature convergence. In these cases the entire population often converges to a single individual, losing the diversity contained in the original population. For these reasons there has been considerable research in multimodal optimization, that is, searching simultaneously for multiple solutions (peaks) within a single population.
Speciation
There have been numerous techniques and methodologies that promote speciation in GAs. We will look carefully at two well-known methods. The majority of methods attempt to keep multiple subpopulations running in parallel. Individuals in these “species” are only allowed to interbreed among themselves. Each species (niche) is usually small, so some form of incest prevention may be desirable.
[Goldberg & Richardson 1987] This is the best-known method and is usually termed fitness sharing.
[Spears 1994] Simple subpopulation schemes using tag bits.
Fitness Sharing
“Genetic Algorithms with Sharing for Multimodal Function Optimization” by Goldberg and Richardson
• In fitness sharing the fitness function is modified to “allow” only so many individuals to occupy a single peak at a time.
• In this case the entire population is spread out over many different peaks, and the cluster around each peak interbreeds and exploits that peak.
Fitness Sharing Technique
• Fitness sharing restricts the number of individuals climbing a peak by reducing the fitness of these individuals as the niche count increases.
• Individuals are said to be in the same niche if the distance between them is less than some threshold value (the niche radius).
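Fitness Sharing
• The shared fitness of an individual is $F_s(i) = f_i / m_i$, where $f_i$ is the normal fitness of the individual and $m_i$ is called the niche count.
• In order to calculate $m_i$ we need a distance metric $d(i,j)$. This may be either a Euclidean distance or a Hamming distance. For individuals with real-valued coordinates $x_{i1}, \dots, x_{in}$ the Euclidean distance is $d(i,j) = \sqrt{(x_{i1}-x_{j1})^2 + (x_{i2}-x_{j2})^2 + \dots + (x_{in}-x_{jn})^2}$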
Fitness Sharing
• The shared fitness of an individual is calculated using a sharing function sh(), which makes distant individuals share less.
$\mathrm{sh}(d_{ij}) = 1 - (d_{ij}/\mathrm{radius})^k$ for $0 \le d_{ij} < \mathrm{radius}$
$\mathrm{sh}(d_{ij}) = 0$ otherwise
• The value of $m_i$ is calculated by $m_i = \sum_{j=1}^{N} \mathrm{sh}(d_{ij})$, where $N$ is the population size.
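The sharing function and niche count above can be sketched in a few lines of Python. This is a minimal illustration, not Goldberg and Richardson's own code: the function names, the tuple representation of individuals, and the toy population are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two real-valued genotypes of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sharing(d, radius, k=1.0):
    """Sharing function sh(d): 1 at d = 0, falling to 0 once d >= radius."""
    if 0 <= d < radius:
        return 1.0 - (d / radius) ** k
    return 0.0

def shared_fitness(population, raw_fitness, radius, k=1.0, dist=euclidean):
    """Return Fs(i) = f_i / m_i for every individual i."""
    shared = []
    for i, ind_i in enumerate(population):
        # Niche count m_i sums sh(d_ij) over the whole population,
        # including j = i (d_ii = 0 contributes 1, so m_i >= 1).
        m_i = sum(sharing(dist(ind_i, ind_j), radius, k) for ind_j in population)
        shared.append(raw_fitness[i] / m_i)
    return shared

# Hypothetical toy data: three individuals crowd one peak, one sits alone.
pop = [(0.0,), (0.1,), (0.2,), (5.0,)]
fit = [10.0, 10.0, 10.0, 10.0]
fs = shared_fitness(pop, fit, radius=1.0)
```

With equal raw fitness, the lone individual keeps its full fitness while the three crowded ones are discounted. Every individual is compared against every other one, so the cost grows with the square of the population size.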
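Fitness Sharing (let k = 1)
• $\mathrm{sh}(d_{ij}) = 1 - (d_{ij}/\mathrm{radius})$ for $0 \le d_{ij} < \mathrm{radius}$
• $\mathrm{sh}(d_{ij}) = 0$ otherwise
[Figure: individual i, with distance $d_{ij}$ to individual j and distance $d_{ik}$ to individual k]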
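A small worked case may help with the linear (exponent 1) sharing function. The numbers below are assumed purely for illustration and are not taken from the slide's figure: a radius of 4, a neighbour j at distance 1 (inside the niche), and an individual k at distance 5 (outside it), with no other individuals in the population.

```latex
\begin{align*}
\mathrm{radius} &= 4, \quad d_{ij} = 1, \quad d_{ik} = 5 && \text{(assumed values)}\\
\mathrm{sh}(d_{ii}) &= 1 - 0/4 = 1\\
\mathrm{sh}(d_{ij}) &= 1 - 1/4 = 0.75\\
\mathrm{sh}(d_{ik}) &= 0 && (d_{ik} \ge \mathrm{radius})\\
m_i &= 1 + 0.75 + 0 = 1.75\\
F_s(i) &= f_i / 1.75
\end{align*}
```

Under these assumptions only the nearby individual j reduces the fitness of i; the distant individual k has no effect.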
Fitness Sharing Flaws
• Radius is set prior to execution. Determining this value is not easy in most landscapes. Fixing it amounts to assuming that the peaks are evenly distributed throughout the space.
• Radius is also fixed, so all the peaks are required to have the same radius.
• As a result, there are several extensions by other authors that allow variable niche sizes.
• We need to know the number of peaks in the space.
• Complexity is $O(n^2)$ (Why?)
Speciation with Tag Bits
http://www.aic.nrl.navy.mil/~spears/papers/ep94.pdf
• Spears developed this method, which does not require a distance metric.
• A “label” is used instead for each individual.
• Restricted mating and sharing with labels can now be accomplished efficiently.
• Spears uses tag bits to label individuals.
Tag Bits
• In standard fitness-proportional selection the expected number of offspring for an individual is $f_i / \mathrm{ave\_fit}$, where ave_fit is the average fitness of the population.
• Here fitness is shared over a set of subpopulations $\{S_0, S_1, \dots, S_{n-1}\}$.
• All individuals in the same set have the same tag bits.
• Each individual is in only one set.
Tag Bits
• The shared fitness is therefore calculated using the set cardinalities.
• $F_i = f_i / |S_j|$ where $i \in S_j$
• The sum of these fitnesses divided by $N$, the population size, is the new average fitness $F$.
• Hence the expected number of offspring is now $F_i / F$.
• Restricted mating is now done by allowing mating only between individuals that have the same tag bits.
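The tag-based sharing rule can be sketched as follows. This is an illustrative sketch rather than Spears' implementation; the function names and the toy data are assumptions, and tags are represented here as plain integers for brevity.

```python
from collections import Counter

def tag_shared_fitness(tags, raw_fitness):
    """F_i = f_i / |S_j|, where S_j is the set of individuals sharing i's tag."""
    sizes = Counter(tags)  # |S_j| for every tag value currently in use
    return [f / sizes[t] for t, f in zip(tags, raw_fitness)]

def expected_offspring(tags, raw_fitness):
    """Expected offspring F_i / F, with F the mean shared fitness."""
    shared = tag_shared_fitness(tags, raw_fitness)
    mean_shared = sum(shared) / len(shared)
    return [s / mean_shared for s in shared]

# Hypothetical toy data: species 0 has three members, species 1 has one.
tags = [0, 0, 0, 1]
fit = [6.0, 6.0, 6.0, 6.0]
exp = expected_offspring(tags, fit)
```

No distance computation is needed: one pass counts the members of each species and a second pass divides, so the cost is linear in the population size. The expected offspring counts still sum to the population size, as in ordinary fitness-proportional selection.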
Tag Bits
• If we use $n$ tag bits then the population can represent $2^n$ species.
• Population size is of course related to the number of tag bits we wish to use. A population of size 100 with 4 tag bits implies that we can have approximately $100/2^4 = 100/16 \approx 6$ individuals in a species.
• Tag bits can be mutated as well.
Spears' Thought Experiment
• Suppose that we have only one tag bit and two peaks, one twice as high as the other.
• Allocate tag bits to each individual randomly.
• After random sampling, the two subpopulations could each settle on either of the two peaks, the same one or different ones.
• With fitness sharing the higher peak can only support twice as many individuals as the lower peak. This means that in a population of 100, about 67 individuals will be on the higher peak and about 33 on the lower peak.
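The two-to-one split follows from a short balance argument. The sketch below assumes that the two species have settled on different peaks, that every member of a species has the raw fitness of its peak, and that the population stops shifting when the shared fitness per individual is equal in both species. The symbols $n_{\mathrm{high}}$ and $n_{\mathrm{low}}$ (species sizes) are introduced here for illustration.

```latex
\begin{align*}
\frac{f_{\mathrm{high}}}{n_{\mathrm{high}}} &= \frac{f_{\mathrm{low}}}{n_{\mathrm{low}}},
  \qquad f_{\mathrm{high}} = 2 f_{\mathrm{low}}
  \;\Rightarrow\; n_{\mathrm{high}} = 2\, n_{\mathrm{low}}\\
n_{\mathrm{high}} + n_{\mathrm{low}} &= 100
  \;\Rightarrow\; n_{\mathrm{low}} = \tfrac{100}{3} \approx 33.3,
  \quad n_{\mathrm{high}} = \tfrac{200}{3} \approx 66.7
\end{align*}
```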
Mutating Tag Bits
• Tag bits will not be modified by crossover.
• Mutation can be used to allow individuals to move from one subpopulation to another.
• One can control this through the mutation rate applied to the tag bits.
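One way these rules might look in code is sketched below. The representation of an individual as a `(tag_bits, genes)` pair of tuples, the function names, and the use of one-point crossover are all assumptions made for illustration; the source does not prescribe them.

```python
import random

def mutate_tag(tag_bits, tag_mutation_rate, rng=random):
    """Flip each tag bit independently with probability tag_mutation_rate."""
    return tuple(b ^ 1 if rng.random() < tag_mutation_rate else b
                 for b in tag_bits)

def one_point_crossover(parent_a, parent_b, rng=random):
    """Cross the problem bits only; the child inherits the shared tag unchanged."""
    tag_a, genes_a = parent_a
    tag_b, genes_b = parent_b
    if tag_a != tag_b:
        raise ValueError("restricted mating: parents must share tag bits")
    cut = rng.randrange(1, len(genes_a))  # requires at least two gene positions
    child_genes = genes_a[:cut] + genes_b[cut:]
    return (tag_a, child_genes)
```

A tag mutation rate of zero freezes species membership, while a higher rate lets more individuals migrate between subpopulations.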
Tag Bit Structure
What data structure would you use to store the individuals so that it is efficient and easy to select two individuals from the same species?
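One possible answer, offered only as a sketch, is a dictionary keyed by tag with a list of individuals per species. The `(tag_bits, genes)` representation and the function names are assumptions, and mates are drawn uniformly here for simplicity rather than in proportion to shared fitness.

```python
import random
from collections import defaultdict

def group_by_tag(population):
    """Bucket individuals by tag so that each species is one list."""
    species = defaultdict(list)
    for tag, genes in population:
        species[tag].append((tag, genes))
    return species

def pick_mates(species, tag, rng=random):
    """Draw two distinct individuals from the species with the given tag."""
    members = species.get(tag, [])
    if len(members) < 2:
        return None  # an empty or singleton species has no mating pair
    return tuple(rng.sample(members, 2))
```

Looking up a species is a single dictionary access, and the length of each list gives the set cardinality needed for the shared fitness.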
Example Problems that these schemes may apply to
• The magic square problem is a nice example. There are many solutions to a 4 by 4 magic square, and each has exactly the same fitness. If done using real numbers the landscape has quite a few peaks (at least 880 unique ones).
• Diophantine equations are also interesting, since each equation often has several solutions.