GA Applications
• Peaks function
• C code
• GOAT package for MATLAB
• minimization and maximization
• Traveling Salesman Problem
• genotype and phenotype encoding
• customizing operators
• rank scaling
• Hillis Sorting Problem
• Sequence Alignment
• Floating point GAs
• Constrained optimization
• Multi-objective optimization
• The schemata theorem
Components of binary GA in Feature Selection
Problem: max R² (R² = goodness of fit)
• Population and fitness: 110101 (f1 = 0.60), 111111 (f2 = 0.30), 000000 (f3 = 0.10)
• Selection (wheel shares 0.6, 0.3, 0.1): selected population 110101, 110101, 000000
• Crossover: parents 111111 and 000000 exchange genes at the crossover point, giving 111100 and 000011
• Mutation: the selected gene is flipped, 111111 becomes 111110
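The crossover and mutation steps on this slide can be sketched in C as follows. This is a minimal illustration, not the deck's own code: the function names `single_point_crossover` and `flip_bit` are hypothetical, and chromosomes are assumed to be stored as strings of '0' and '1' characters.

```c
#include <assert.h>
#include <string.h>

/* Single-point crossover on two equal-length bit strings ('0'/'1' chars).
   Genes before `point` stay with their parent; genes from `point` on are
   exchanged. c1 and c2 must have room for strlen(p1) + 1 chars.
   (Hypothetical helper, not part of the GA package used in the deck.) */
void single_point_crossover(const char *p1, const char *p2, size_t point,
                            char *c1, char *c2)
{
    size_t n = strlen(p1);
    assert(strlen(p2) == n && point <= n);
    for (size_t i = 0; i < n; i++) {
        c1[i] = (i < point) ? p1[i] : p2[i];
        c2[i] = (i < point) ? p2[i] : p1[i];
    }
    c1[n] = '\0';
    c2[n] = '\0';
}

/* Bit-flip mutation: invert the gene at position `locus` in place. */
void flip_bit(char *chrom, size_t locus)
{
    assert(locus < strlen(chrom));
    chrom[locus] = (chrom[locus] == '0') ? '1' : '0';
}
```

With parents 111111 and 000000 and a crossover point of 4, the children are 111100 and 000011; flipping the last gene of 111111 gives 111110, matching the bit strings on the slide.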
Genetic Operators
• Uniform Crossover
  Parent 1: 5 24 131 534 603
  Parent 2: 19 33 255 334 508
  Child 1: 19 33 131 534 603
  Child 2: 5 24 255 334 508
• Mutation
  Parent: 5 24 131 534 603
  Child: 5 24 344 534 603
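A small C sketch of the two operators above, under the assumption that a chromosome is an array of integer feature indices. The names `uniform_crossover` and `replace_gene` are hypothetical, and the swap mask is passed in explicitly so the example stays deterministic; in a real run the mask would be drawn at random.

```c
#include <assert.h>
#include <stddef.h>

/* Uniform crossover: for every gene i, mask[i] != 0 means the two
   parents exchange that gene; mask[i] == 0 means each child keeps
   the gene of its own parent. (Hypothetical helper.) */
void uniform_crossover(const int *p1, const int *p2, const int *mask,
                       size_t n, int *c1, int *c2)
{
    assert(p1 && p2 && mask && c1 && c2);
    for (size_t i = 0; i < n; i++) {
        c1[i] = mask[i] ? p2[i] : p1[i];
        c2[i] = mask[i] ? p1[i] : p2[i];
    }
}

/* Mutation for an integer-coded chromosome: overwrite one gene. */
void replace_gene(int *chrom, size_t n, size_t locus, int new_value)
{
    assert(chrom && locus < n);
    chrom[locus] = new_value;
}
```

With the mask 1 1 0 0 0 the parents from the slide produce exactly Child 1 and Child 2 as listed, and `replace_gene(parent, 5, 2, 344)` reproduces the mutation example.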
#include <string.h>   /* strcpy */
/* Headers of the GA package (data, genotype, MATRIX, VECTOR, ...) are not shown on the slide. */

int main(int argc, char *argv[])
{
    char mombassa[80], root[80];
    data b;
    double alpha, beta;                 /* user data */
    int num_cities;
    MATRIX distances;
    Container box;                      /* user data passed to the objective function */
    double (*fptr)(data *, VECTOR);     /* function pointer to the objective function */
    genotype pop;

    fptr = Salesman3;
    MatrixAllocate(&distances, 500, 500);
    userData(&b, &box);                 /* stores the pointer to the user data in the data struct b */
    Read_User_Data(&alpha, &beta, &num_cities, distances);

    box.pop = &pop;
    box.alpha = alpha;
    box.beta = beta;
    box.num_cities = num_cities;
    box.distances = distances;

    if (argc == 2) strcpy(mombassa, argv[1]);
    Allocate_GA(&pop, &b, argc, mombassa, root, fptr);
    b.print_flag = 0;
    Loop_GA(&b, &pop, root, fptr);
    Write_User_Data(&b, &pop, root, fptr);
    De_Allocate_GA(&pop, &b, root, fptr);
    MatrixFree(distances, 500);
    return 0;
}
double Salesman2(data *a, VECTOR x)
{
    int i, isum = 0;
    double tour = 0, pen1 = 0, pen2 = 0;
    double alpha, beta;
    int num_cities, one, two, help;
    Container *box = (Container *)(a->ud);

    alpha = box->alpha;
    beta = box->beta;
    num_cities = box->num_cities;

    /* checksum of a valid tour: 0 + 1 + ... + (num_cities-1) */
    help = num_cities * (num_cities - 1) / 2;

    /* length of the path x[0] -> x[1] -> ... -> x[num_cities-1] */
    for (i = 0; i < num_cities - 1; i++) {
        one = (int) x[i];
        two = (int) x[i + 1];
        tour = tour + box->distances[one][two];
    }

    /* close the tour: last city back to the first */
    one = (int) x[num_cities - 1];
    two = (int) x[0];
    tour = tour + box->distances[one][two];

    /* penalty alpha if the city indices do not add up to the checksum */
    for (i = 0; i < num_cities; i++) isum += (int) x[i];
    if (isum != help) pen1 = alpha;
    getche();   /* waits for a keypress on every evaluation (conio.h) */

    box->penn1 = pen1;
    box->penn2 = pen2;
    return tour + pen1;
}
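A checksum on the city indices cannot detect every invalid tour: for five cities, 1 1 1 3 4 has the same index sum (10) as the valid tour 0 1 2 3 4. A stricter test marks each city as it is visited. The sketch below is an assumption-laden illustration, not part of the deck's code: `is_valid_tour` is a hypothetical helper, and it takes a plain `double` array rather than the package's `VECTOR` type, with cities assumed to be numbered from 0.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if x[0..n-1] visits every city index 0..n-1 exactly once,
   0 otherwise (also 0 if the scratch array cannot be allocated).
   Genes are doubles, truncated to int as in the objective function. */
int is_valid_tour(const double *x, int n)
{
    assert(x != NULL && n > 0);
    char *seen = calloc((size_t)n, 1);   /* seen[c] == 1 once city c is visited */
    if (seen == NULL) return 0;

    int ok = 1;
    for (int i = 0; i < n && ok; i++) {
        int city = (int)x[i];
        if (city < 0 || city >= n || seen[city])
            ok = 0;                      /* out of range or visited twice */
        else
            seen[city] = 1;
    }
    free(seen);
    return ok;
}
```

A penalty term could then be applied whenever `is_valid_tour` returns 0, in place of or in addition to the checksum comparison.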
SCHEMATA THEOREM (Holland)
• h(i) = raw fitness of population sample i
• f(i) = normalized fitness: $f(i) = h(i) / \sum_j h(j)$
• A schema denotes a set of strings that have identical values at certain loci: 1#101 = {10101, 11101}
• m(S,t) = number of exemplars of schema S in the population at generation t
• f(S) = average fitness of the exemplars of S, n = population size, $f_{ave} = \sum f / n$
• The number of exemplars of S present in the next generation is proportional to the chance that an individual carrying the schema is picked:
  $m(S,t+1) = m(S,t) \, n \, f(S) / \sum f = m(S,t) \, f(S) / f_{ave}$
• If $f(S) = f_{ave}(1+c)$ with constant c, then $m(S,t+1) = m(S,t)(1+c)$, and therefore $m(S,t) = m(S,0)(1+c)^t$
• Better than average schemata (c > 0) grow exponentially
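To make the schema notation and the growth law concrete, here is a short C sketch. Both function names are hypothetical. `expected_count` evaluates m(S,0)(1+c)^t under the slide's assumptions (selection only, constant c), so it ignores any disruption of schemata by crossover or mutation.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if bit string s is an exemplar of the schema, where '#'
   in the schema matches either bit, e.g. "1#101" matches "10101". */
int matches_schema(const char *schema, const char *s)
{
    size_t n = strlen(schema);
    if (strlen(s) != n) return 0;
    for (size_t i = 0; i < n; i++)
        if (schema[i] != '#' && schema[i] != s[i]) return 0;
    return 1;
}

/* Expected number of exemplars after t generations when the schema's
   fitness stays a factor (1 + c) above the population average:
   m(S,t) = m(S,0) * (1 + c)^t, computed by repeated multiplication. */
double expected_count(double m0, double c, int t)
{
    assert(t >= 0);
    double m = m0;
    for (int g = 0; g < t; g++)
        m *= 1.0 + c;
    return m;
}
```

For example, a schema that starts with 4 exemplars and is twice as fit as average (c = 1) is expected to have 32 exemplars after 3 generations.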
Genetic Algorithm cycle
• Initial Population
• Evaluation
• Selection of parents (fitness proportional, tournament, or rank selection) gives the Selected Population
• Crossover and Mutation turn Parents into Offspring
• Evaluation of the Offspring
• Elitist strategy: make sure that the best individual survives
• Next Generation
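Two pieces of this cycle, fitness-proportional selection and the elitist strategy, can be sketched in C as below. This is an illustrative sketch with hypothetical function names, not the implementation used in the deck. The random number `r` is passed in by the caller so that the behavior is reproducible; fitness values are assumed to be non-negative with a positive sum.

```c
#include <assert.h>
#include <stddef.h>

/* Fitness-proportional (roulette wheel) selection.
   r is a uniform random number in [0, 1); individual i is returned
   with probability fitness[i] / sum(fitness). */
size_t roulette_pick(const double *fitness, size_t n, double r)
{
    assert(fitness && n > 0 && r >= 0.0 && r < 1.0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += fitness[i];
    assert(total > 0.0);

    double cum = 0.0;                    /* cumulative share of the wheel */
    for (size_t i = 0; i < n; i++) {
        cum += fitness[i] / total;
        if (r < cum) return i;
    }
    return n - 1;                        /* guard against round-off */
}

/* Elitism helper: index of the fittest individual, which is copied
   unchanged into the next generation. */
size_t best_index(const double *fitness, size_t n)
{
    assert(fitness && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (fitness[i] > fitness[best]) best = i;
    return best;
}
```

With fitness values 0.60, 0.30 and 0.10, a draw of r = 0.25 selects the first individual, r = 0.75 the second, and r = 0.95 the third. When the problem is posed as a minimization, the comparison in `best_index` would be reversed.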
Note: In the plot, fitness is plotted as (1 - R²), so the problem can be treated as a minimization.
Source: A. Yasri and D. Hartsough, Toward an Optimal Procedure for Variable Selection and QSAR Model Building, J. Chem. Inf. Comput. Sci. 2001, Vol. 41, No. 5, pp. 1218-1227.
Search space in feature selection: a data set with 10 features
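Assuming each feature is either included or excluded (a binary encoding with one bit per feature), the size of the search space doubles with every added feature. The C sketch below counts the subsets and the number of selected features in a chromosome stored as a bit mask; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct feature subsets for n_features features: 2^n
   (this count includes the empty subset). */
uint64_t subset_count(unsigned n_features)
{
    assert(n_features < 64);
    return (uint64_t)1 << n_features;
}

/* Number of features switched on in a chromosome stored as a bit mask. */
unsigned selected_count(uint64_t mask)
{
    unsigned count = 0;
    while (mask) {
        count += (unsigned)(mask & 1u);
        mask >>= 1;
    }
    return count;
}
```

For 10 features this gives 1024 candidate subsets; the chromosome 110101 (0x35 as a bit mask) selects 4 features.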