CC282 Genetic Algorithm

Lecture 06 Outline: Introduction; GA terminology; GA basic description; Encoding of chromosomes; Selection operator in GA; Crossover and mutation operators in GA; Applications; Evolving ANN; Genetic Programming; Toy example; Advantages and disadvantages of GA. Lecture 6 slides for CC282 Machine Learning, R.


Presentation Transcript


    1. CC282 Genetic Algorithm Lecture 6 slides for CC282 Machine Learning, R. Palaniappan, 2008 1

    2. Lecture 06 Outline: Introduction; GA terminology; GA basic description; Encoding of chromosomes; Selection operator in GA; Crossover and mutation operators in GA; Applications; Evolving ANN; Genetic Programming; Toy example; Advantages and disadvantages of GA.

    3. Genetic Algorithm (GA) - Introduction. GA is a part of evolutionary computation (EC). GA is inspired by Darwin's theory of evolution: problems are solved by an evolutionary process resulting in the survival of the fittest. EC was introduced in the 1960s by Rechenberg. J. Holland invented GA in the 1970s. J. Koza used GA to evolve programs (genetic programming, GP) in 1992.

    4. Genetic Algorithm (GA) - Terminology. Living organisms consist of cells. Cells contain DNA, which carries the genetic material of the organism and defines its traits. Chromosomes are strings of DNA and serve as a model for the whole organism (genetic material). Genes are the blocks of DNA of which the chromosomes consist; each gene can be said to encode a trait or feature. Alleles are the possible values for a trait (i.e. for the gene). Genome: the complete set of genetic material (i.e. all chromosomes); this is called a population in GA. Crossover is the operation in which genes from the parents combine during reproduction to form a whole new chromosome, producing offspring. Mutation is when some elements of the genetic material are changed (normally through a random procedure). Fitness of an organism is measured by its degree of success or failure in survival.

    5. Hypothesis/search space - revisited. Each point is a possible solution and has a fitness value. Fitness measures how good the solution is. Fitness here is the opposite of an error measure: the higher the fitness, the lower the error. GA searches for the best (optimal) solution, though there is no guarantee that it will find it. GA finds a solution in an evolutionary manner. Other similar methods are hill climbing, tabu search and simulated annealing.

    6. GA basic description. Steps in brief: (1) GA begins with an initial population, i.e. a set of solutions/chromosomes. (2) The fitness of each chromosome is computed. (3) Selection operators are applied that favour fitter chromosomes. (4) Crossover: chromosomes recombine to produce offspring, in the hope that by recombining the parents the offspring may be fitter than the parents. (5) The mutation operator is applied. (6) The fitness of the new population is assessed; stop if the optimal solution is achieved or if the maximum generation number is reached. (7) Otherwise, repeat selection, crossover and mutation for the next generation.

    7. The GA algorithm: GA(Fitness, Fitness_threshold, max_generation, popsize, Pc, Pm)
       Parameters:
       - Fitness: a function that assigns an evaluation score to a given hypothesis
       - Fitness_threshold: a threshold specifying the termination criterion
       - max_generation: the maximum generation number at which to terminate GA
       - popsize: the size of the population
       - Pc: crossover probability, i.e. the fraction of the population to be replaced by the crossover operator at each generation
       - Pm: mutation probability, i.e. the fraction of the population to be replaced by the mutation operator at each generation
       Algorithm:
       Initialise population: P ← generate popsize random hypotheses
       Evaluate: for each h in P, compute Fitness(h)
       While [max_h Fitness(h)] < Fitness_threshold and generation < max_generation:
       1. Selection: select popsize members of P (with replacement) to add to Pnext
       2. Crossover: pairs of hypotheses are randomly selected using Pc. For each pair <h1, h2>, produce two offspring by applying the crossover operator. Add all offspring to Pnext
       3. Mutate: invert a randomly selected bit in random members of Pnext using probability Pm
       4. Update: P ← Pnext
       5. Evaluate: for each h in P, compute Fitness(h)
       Return the hypothesis from P that has the highest fitness
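
The loop on this slide can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the lecture's own implementation: hypotheses are lists of bits, selection is fitness proportionate, crossover is single point with offspring overwriting their parents in Pnext, and mutation flips each bit independently with probability `pm`. The function name `ga`, the extra `n_bits` and `rng` arguments, and the example fitness (the number of 1 bits) are all illustrative choices.

```python
import random

def ga(fitness, fitness_threshold, max_generation, popsize, pc, pm, n_bits, rng):
    """Sketch of the GA loop on bit-list hypotheses (operator choices are assumptions)."""
    # Initialise: popsize random hypotheses
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(popsize)]
    generation = 0
    # Keep going while neither stopping criterion has been met
    while max(map(fitness, pop)) < fitness_threshold and generation < max_generation:
        # 1. Selection: fitness proportionate, with replacement
        #    (tiny offset avoids an all-zero weight vector)
        weights = [fitness(h) + 1e-9 for h in pop]
        nxt = [list(h) for h in rng.choices(pop, weights=weights, k=popsize)]
        # 2. Crossover: pc * popsize / 2 disjoint pairs, single point
        order = rng.sample(range(popsize), popsize)
        for k in range(int(pc * popsize / 2)):
            i, j = order[2 * k], order[2 * k + 1]
            cut = rng.randint(1, n_bits - 1)
            nxt[i][cut:], nxt[j][cut:] = nxt[j][cut:], nxt[i][cut:]
        # 3. Mutation: flip each bit with probability pm
        for h in nxt:
            for b in range(n_bits):
                if rng.random() < pm:
                    h[b] ^= 1
        # 4. Update (fitness is re-evaluated in the loop condition)
        pop = nxt
        generation += 1
    return max(pop, key=fitness), generation

# Illustrative run: maximise the number of 1 bits in a 12-bit string
best, generations = ga(sum, 12, 200, 20, 0.5, 0.01, 12, random.Random(0))
```

Passing the random generator in explicitly keeps a run reproducible, which helps when comparing operator choices.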

    8. GA - Some preliminary design questions. Encoding: GA operates on a coding of the parameters rather than on the parameters themselves. These coded parameters are called chromosomes and are strings of values that represent potential solutions to the given problem. The encoding could be binary, decimal or continuous: which one to use? Constraints: is there any constraint on the gene values? Fitness: how to obtain the fitness of each chromosome? Selection: how to select candidate chromosomes? The other two operators: how to perform crossover and mutation?

    9. Chromosomes - binary representation. Chromosomes are mostly represented by a string of bits. Each bit or group of bits represents some characteristic/attribute/feature. The values of each feature are checked, and each feature is represented with enough bits to cover all its possible values. Recall the play-tennis example: Wind: {strong, weak} can be represented by two bits. Example: Wind = strong → 10, Wind = weak → 01, Wind = strong or weak → 11. Outlook: {cloudy, rainy, sunny} can be represented by three bits, e.g. Outlook = cloudy or rainy is represented as 110. So, for a rule such as (Outlook = cloudy ∨ rainy) ∧ (Wind = strong), the chromosome representation is 11010.
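
As a small illustration of this one-bit-per-value coding, the sketch below builds the chromosome for the rule on this slide. The helper name `encode_attribute` and the value orderings are assumptions chosen to match the slide's bit patterns.

```python
# Value order fixes which bit stands for which value (assumed from the slide)
OUTLOOK = ["cloudy", "rainy", "sunny"]
WIND = ["strong", "weak"]

def encode_attribute(allowed, values):
    """One bit per possible value: 1 if the rule allows that value, else 0."""
    return "".join("1" if v in allowed else "0" for v in values)

# (Outlook = cloudy or rainy) and (Wind = strong)
chromosome = encode_attribute({"cloudy", "rainy"}, OUTLOOK) + encode_attribute({"strong"}, WIND)
```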

    10. Binary and decimal coding of chromosomes. Let us consider a more general situation. Assume we have three variables, x, y and z. Decimal coding simply uses the integer values as genes, e.g. x=35, y=191, z=5. In binary coding the genes are coded in binary form. Let us assume that these variables can take integer values from 0 to 255, so we need 8 bits for each variable (i.e. gene). If x=35, y=191, z=5, we have x=00100011, y=10111111, z=00000101, and the chromosome is 001000111011111100000101. But why go through the hassle of representing integers using binary coding? Answer: see Exercise 6, question 4.
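
The conversion between decimal genes and a binary chromosome can be checked with a short sketch. The function names `encode_genes` and `decode_genes` are hypothetical; the 8-bit width comes from the slide.

```python
def encode_genes(genes, bits=8):
    """Concatenate fixed-width binary codes of integer genes into one chromosome."""
    return "".join(format(g, f"0{bits}b") for g in genes)

def decode_genes(chromosome, bits=8):
    """Split a chromosome into fixed-width genes and convert each back to an integer."""
    return [int(chromosome[i:i + bits], 2) for i in range(0, len(chromosome), bits)]

chromosome = encode_genes([35, 191, 5])   # x, y, z from the slide
genes = decode_genes(chromosome)
```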

    11. Continuous coding of chromosomes. But what if we want genes to represent continuous values, e.g. x=0.67, y=1.56, z=3.45? Solution: use a binary chromosome with approximation, or use continuous-valued chromosomes. We will not cover continuous-valued chromosomes in this course, as they require special types of GA operators. Binary chromosome with approximation, e.g. x=0.145 (assume 8 bits per gene): scale the value onto the integer range of the gene. With 8 bits, the coded range is xmin=0 to xmax=255, so 0.145*255=36.975; round this to 37, so x=00100101. So, x=00100101 is an approximation of x=0.145. More bits will improve the approximation, but the computation becomes more time consuming.
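
A sketch of this approximation is given below. It assumes the real value lies in a known interval, taken here to be [0, 1] to match the slide's arithmetic; the names `encode_real` and `decode_real` and the `x_min`/`x_max` parameters are illustrative and describe the real-valued range, not the coded range.

```python
def encode_real(x, x_min=0.0, x_max=1.0, bits=8):
    """Scale x from [x_min, x_max] onto 0..2**bits - 1, round, and write in binary."""
    levels = 2 ** bits - 1
    return format(round((x - x_min) / (x_max - x_min) * levels), f"0{bits}b")

def decode_real(code, x_min=0.0, x_max=1.0):
    """Map a binary gene back to the (approximate) real value it stands for."""
    levels = 2 ** len(code) - 1
    return x_min + int(code, 2) / levels * (x_max - x_min)

code = encode_real(0.145)      # 0.145 * 255 = 36.975, rounded to 37
approx = decode_real(code)     # 37 / 255, close to 0.145
```

The round trip makes the quantisation error visible: with 8 bits the decoded value can differ from the original by up to half a step of 1/255.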

    12. Fitness function and gene constraints: an example. Let us consider a linear programming problem of the kind that arises naturally in production planning. Suppose a particular Ford plant can build Escorts at the rate of one per minute, Explorers at the rate of one every 2 minutes, and Lincoln Navigators at the rate of one every 3 minutes. The vehicles get 8, 5, and 4 miles per litre, respectively, and Parliament mandates that the average fuel economy of vehicles produced be at least 6 miles per litre. Ford loses 1000 on each Escort, but makes a profit of 5000 on each Explorer and 15,000 on each Navigator. What is the maximum profit this Ford plant can make in one 8-hour day? The fitness function here is the objective function, i.e. the profit Ford can make by building x Escorts, y Explorers, and z Navigators, and we want to maximise it. The fitness function is f = -1000x + 5000y + 15000z.

    13. Gene constraints. Using the same example as in the previous slide: the constraints arise from the production times and the Parliament mandate on fuel economy. There are 480 minutes in an 8-hour day, so the production times for the vehicles lead to the following limit: x + 2y + 3z ≤ 480. The average fuel economy restriction can be written as 8x + 5y + 4z ≥ 6(x + y + z), which simplifies to 2x - y - 2z ≥ 0. There is an additional implicit constraint that the variables are all non-negative: x, y, z ≥ 0.
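
The fitness function and constraints of the Ford example can be written down directly. How a GA should treat infeasible chromosomes is not stated in the slides; rejecting them by assigning the worst possible fitness, as below, is just one simple assumption (penalty terms or repair are common alternatives). The function names are hypothetical.

```python
def profit(x, y, z):
    """Objective from the slides: f = -1000x + 5000y + 15000z."""
    return -1000 * x + 5000 * y + 15000 * z

def feasible(x, y, z):
    """Production time, average fuel economy and non-negativity constraints."""
    return (x + 2 * y + 3 * z <= 480
            and 8 * x + 5 * y + 4 * z >= 6 * (x + y + z)
            and min(x, y, z) >= 0)

def fitness(x, y, z):
    """Assumed handling: infeasible solutions get the worst possible fitness."""
    return profit(x, y, z) if feasible(x, y, z) else float("-inf")
```

For example, building only Navigators (0, 0, 160) uses exactly 480 minutes but violates the fuel economy constraint, so it is rejected.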

    14. Selection. The selection (aka reproduction) operator is applied many times to produce a mating pool for the new population. There are a number of ways to do selection so that the members of the population are drawn with the correct probability: roulette wheel (fitness proportionate) selection; tournament selection; steady-state selection; rank selection; elitism.

    15. Roulette wheel (fitness proportionate) selection. Chromosomes are selected in proportion to their fitness: the higher their fitness, the more chances they have of being selected. Sampling can be viewed as playing a game of roulette where the pocket sizes are proportional to the probability of selecting a particular individual. Each new member of the population is drawn independently by spinning the roulette wheel. On a computer, the spin is done by generating a random number in [0,1]. However, the best solution found so far may be lost, e.g. Pnext={B,B,C}.
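
One common way to implement the spin is to scale a random number in [0, 1) by the total fitness and walk along the cumulative fitness until the spin falls inside a pocket. The sketch below assumes non-negative fitness values with a positive sum; `roulette_select` is a hypothetical name.

```python
import random
from itertools import accumulate

def roulette_select(population, fitnesses, rng):
    """Draw one individual with probability proportional to its fitness."""
    cumulative = list(accumulate(fitnesses))      # right edge of each pocket
    spin = rng.random() * cumulative[-1]          # where the ball lands
    for individual, edge in zip(population, cumulative):
        if spin < edge:
            return individual
    return population[-1]                         # guard against rounding at the edge

rng = random.Random(0)
population = ["A", "B", "C"]
fitnesses = [5.0, 3.0, 2.0]
# Each member of the mating pool is drawn independently, so "A" can be lost
p_next = [roulette_select(population, fitnesses, rng) for _ in population]
```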

    16. Selection (ctd). Tournament selection: pick a few chromosomes (say, popsize/4 chromosomes) at random from the population. From these few, select the fittest one (i.e. the one with the highest fitness), put the rest back, and repeat the process popsize times. This method can retain some good chromosomes while giving other, weaker chromosomes a chance to take part in mating. Steady-state selection: a few good (high fitness) chromosomes are selected to replace the few bad (low fitness) chromosomes. The rest of the population (those with in-between fitness) are selected by other methods, or all are selected to remain in Pnext.
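
Tournament selection is short enough to sketch in a few lines. The tournament size `k` and the example population are illustrative; the slide suggests popsize/4 as one possible size.

```python
import random

def tournament_select(population, fitness, k, rng):
    """Pick k chromosomes at random (without replacement) and return the fittest."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)

rng = random.Random(0)
population = [3, 9, 1, 7, 4, 8, 2, 6]      # toy chromosomes; fitness is the value itself
popsize = len(population)
# Repeat the tournament popsize times to fill the mating pool
mating_pool = [tournament_select(population, lambda c: c, popsize // 4, rng)
               for _ in range(popsize)]
```

A larger `k` increases selection pressure: with `k` equal to the population size, the fittest chromosome wins every tournament.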

    17. Selection (ctd). Rank selection: the other selection methods will have problems if the fitness values differ a lot. For example, if the fitness of the best chromosome makes up 90% of the total fitness, then with roulette wheel selection the other chromosomes will have very few chances of being selected. Rank selection first ranks the population, and then every chromosome receives its fitness from this ranking (i.e. the probability of selection is proportional to rank). The worst will have fitness 1, the second worst 2, etc., and the best will have fitness N (the number of chromosomes in the population). Then, using these new fitness values, roulette wheel selection is performed. In this way, all the chromosomes have a fair chance of being selected. However, this method can lead to slower convergence, because the best chromosomes no longer differ so much from the other ones.
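
The effect of ranking can be seen with a small sketch that replaces raw fitness values by ranks. The function name `rank_fitness` and the example fitness values are made up for illustration.

```python
def rank_fitness(fitnesses):
    """Replace each fitness by its rank: worst gets 1, best gets N."""
    order = sorted(range(len(fitnesses)), key=lambda i: fitnesses[i])
    ranks = [0] * len(fitnesses)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

raw = [0.90, 0.02, 0.05, 0.03]                 # one chromosome dominates
ranks = rank_fitness(raw)
# Selection probabilities under roulette wheel, before and after ranking
p_raw = [f / sum(raw) for f in raw]
p_rank = [r / sum(ranks) for r in ranks]
```

Here the dominant chromosome's selection probability drops from 0.9 to 0.4, while the weakest one rises from 0.02 to 0.1.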

    18. Crossover. Even though reproduction (selection) increases the percentage of fitter chromosomes, the procedure on its own is sterile: it cannot create new and better chromosomes. This function is left to crossover and, to a lesser but critical extent, to mutation. The crossover process simulates the exchange of genetic material that occurs during biological reproduction. In this process, pairs in the breeding population are mated randomly with a crossover rate, Pc. Typical crossover properties are that an offspring inherits the features common to both parents, and that an offspring can also inherit two completely different features. Popular crossover techniques: one-point, two-point and uniform crossover.

    19. Crossover (ctd). First, randomly select a pair of parents (i.e. two chromosomes). Perform crossover (swapping of bits) to obtain offspring, and repeat this process Pc*popsize/2 times, excluding parent chromosomes that have already been used. Example: if Pc=0.5 and popsize=20, then do crossover 5 times. Single-point and two-point crossover: [Figure: single-point and two-point crossover]
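
Single-point and two-point crossover on bit strings can be sketched as below. The cut positions are passed in explicitly so the result is easy to follow; in a GA they would be drawn at random. The function names are illustrative.

```python
def one_point(p1, p2, cut):
    """Swap everything after position cut between the two parents."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, a, b):
    """Swap only the segment between positions a and b."""
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

parent1, parent2 = "11111111", "00000000"
child_a, child_b = one_point(parent1, parent2, 3)
child_c, child_d = two_point(parent1, parent2, 2, 5)

# Number of crossovers per generation, as in the slide's example
n_crossovers = int(0.5 * 20 / 2)
```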

    20. Crossover (ctd). The uniform crossover scheme works as follows. A randomly generated bit string, called the crossover mask, generalises the process. A bit value of 1 in the mask indicates that the corresponding bits in the parents are to be exchanged, while a 0 bit indicates no exchange.
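
A sketch of uniform crossover with an explicit mask follows. The mask is fixed here for readability; normally it would be generated at random for each pair. The function name is hypothetical.

```python
def uniform_crossover(p1, p2, mask):
    """Exchange the parents' bits wherever the mask holds a 1."""
    child1 = "".join(b if m == "1" else a for a, b, m in zip(p1, p2, mask))
    child2 = "".join(a if m == "1" else b for a, b, m in zip(p1, p2, mask))
    return child1, child2

child1, child2 = uniform_crossover("11111111", "00000000", "10100101")
```

An all-zero mask leaves both parents unchanged, and a mask of the form 000...111 reproduces single-point crossover.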

    21. Mutation. Mutation consists of making small alterations to the values of one or more genes in a chromosome. Mutation randomly perturbs the population's characteristics and prevents evolutionary dead ends. Most mutations are damaging rather than beneficial, and hence the mutation rate must be low to avoid the destruction of the species. It works by randomly selecting a bit in the string, at a certain mutation rate, and reversing its value. Mutation is applied to a randomly chosen bit in a randomly chosen chromosome. If Pm is 0.01, with a popsize of 20 chromosomes of 18 bits each, then the mutation is repeated 0.01 x 18 x 20 = 3.6 ≈ 4 times.
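
The mutation count on this slide can be reproduced with a short sketch that flips a randomly chosen bit in a randomly chosen chromosome the computed number of times. The function name `mutate` is illustrative, and the same bit may be picked more than once, which is an artefact of this simple scheme.

```python
import random

def mutate(population, pm, rng):
    """Flip round(pm * bits * popsize) randomly chosen bits across the population."""
    n_bits = len(population[0])
    n_mutations = round(pm * n_bits * len(population))
    mutated = [list(c) for c in population]
    for _ in range(n_mutations):
        c = rng.randrange(len(mutated))      # random chromosome
        b = rng.randrange(n_bits)            # random bit within it
        mutated[c][b] = "1" if mutated[c][b] == "0" else "0"
    return ["".join(c) for c in mutated], n_mutations

population = ["0" * 18 for _ in range(20)]
new_population, n_mutations = mutate(population, 0.01, random.Random(0))
```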

    22. Applications. The possible applications of genetic algorithms are immense. Any problem that has a large search domain could be suitably tackled by GA. We shall explore (very briefly) the use of GA to evolve neural network weights and to evolve functions/programs in genetic programming. We'll also look at a simple toy example.

    23. Evolving NN weights using GA: a simple example. GA has been used successfully to evolve NN weights. GA is suitable for evolving the weights of a neural network, since standard learning techniques such as backpropagation can take thousands of iterations to converge, whereas GA could (given the appropriate direction) evolve suitable weights within a hundred or so iterations. Example: obtain the weights of a perceptron unit for learning the OR function (we saw this in the previous lecture). But rather than using backpropagation to update the weights, we can use GA.

    24. Evolving NN weights using GA: a simple example. Initial parameters: fitness function: 1/MSE between desired and actual output (GA will maximise this fitness function); coding: binary approximation of the weights w1, w2 and w0, say with 6 bits each, so the chromosome length is 18; popsize=20, i.e. 20 chromosomes, initially generated randomly; Pc=0.5, Pm=0.01; MSE_limit=0.1, so fitness_threshold=10; max_generation=100; gene constraints: w1, w2 and w0 in the range [-1,1]. Apply selection (say, tournament selection), crossover (say, one-point) and mutation to produce a new population. Repeat the previous step until convergence to an acceptable solution (fitness>fitness_threshold or generation>max_generation).
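
The decoding and fitness evaluation for this setup might look like the sketch below. Several details are assumptions not given in the slides: the gene order (w1, w2, w0), a step activation with w0 acting as the bias, and returning infinity when the MSE is exactly zero to avoid dividing by zero.

```python
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def decode_weights(chromosome, bits=6, lo=-1.0, hi=1.0):
    """Map each 6-bit gene onto the range [-1, 1]."""
    levels = 2 ** bits - 1
    return [lo + int(chromosome[i:i + bits], 2) / levels * (hi - lo)
            for i in range(0, len(chromosome), bits)]

def fitness(chromosome):
    """1/MSE of a step-activation perceptron on the OR truth table."""
    w1, w2, w0 = decode_weights(chromosome)
    mse = sum((target - (1 if w1 * x1 + w2 * x2 + w0 > 0 else 0)) ** 2
              for (x1, x2), target in OR_DATA) / len(OR_DATA)
    return float("inf") if mse == 0 else 1 / mse

good = "111111" + "111111" + "010000"   # w1 = 1, w2 = 1, w0 about -0.49
bad = "0" * 18                          # all weights -1: the output is always 0
```

With a step output the MSE can only take the values 0, 0.25, 0.5, 0.75 or 1, so under these assumptions the threshold of 10 is reached only by a chromosome that classifies all four patterns correctly.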

    25. Genetic programming (GP): an example. In programming languages such as LISP, mathematical expressions are not written in standard (infix) notation, but in prefix notation. Examples: + 1 2 : 1+2; * + 1 2 2 : (1+2)*2; * + - 2 1 4 9 : ((2-1)+4)*9. Notice the difference between the left-hand side and the right? Apart from the order being different, no parentheses are needed. The prefix method makes life a lot easier for programmers and compilers alike, because operator precedence is not an issue. You can build expression trees out of these strings, which can then be easily evaluated. For example, the trees for the previous three expressions are: [Figure: expression trees for the three expressions]
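
To see why prefix strings are easy to evaluate, the sketch below walks a token list recursively, which mirrors a traversal of the expression tree. It handles only the binary operators used in the slide's examples; `eval_prefix` is a hypothetical name.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_prefix(tokens):
    """Evaluate a prefix expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token in OPS:
        left = eval_prefix(tokens)      # first operand (a subtree)
        right = eval_prefix(tokens)     # second operand (a subtree)
        return OPS[token](left, right)
    return int(token)                   # a leaf of the expression tree

value = eval_prefix("* + - 2 1 4 9".split())   # ((2-1)+4)*9
```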

    26. Genetic programming (GP): an example (ctd). Given numerical data and primitive functions, but no expression that connects the data with the primitive functions, a genetic algorithm can be used to evolve an expression tree that fits the data very closely. The trees are spliced and grafted, and each resulting expression is evaluated on the data; the fitness function returns how close the expression is to the data. The limitation of genetic programming lies in the huge search space the GA has to search: an infinite number of equations. Therefore, before running a GA to search for an equation, the user normally tells the program which primitive functions to search with.

    27. Genetic programming (GP): an example (ctd). Assume we have data like the following, and we wish to obtain the function that maps x and y to z: [Table: x, y, z data] Assume the only available primitive functions are sin,?, sqr, sqrt. GP will splice and graft the trees built from these primitive functions, with the fitness function set to minimise the prediction error of z from the x and y data above.

    28. Genetic programming (GP): example (ctd). Crossover example in GP: [Figure: crossover between two expression trees] Mutation randomly changes a primitive function. The actual function is

    29. Toy example. Consider a + 2b + 3c + 4d = 30, where a, b, c, d are positive integers. Use GA to find a, b, c and d. Assume decimal coding is used. Choose, say, 5 random initial solution sets (i.e. popsize=5) to form the initial population, with the constraint 1 ≤ a, b, c, d ≤ 30.

    30. Example (ctd). Calculate the fitness value of each chromosome: calculate the absolute difference between the value of its expression and 30, and take the inverse; this will be our fitness value. E.g. for chromosome 1, expression = 1 + 2*28 + 3*15 + 4*3 = 114. Since expression values that are closer to the desired answer (30) are more desirable, the inverse of the absolute difference is taken as the fitness value, and GA will try to maximise it. In order to create a system where chromosomes with more desirable fitness values are more likely to be chosen as parents, we have to do selection. Assume we use the roulette wheel (fitness proportionate) method.
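
The fitness calculation for the toy example is sketched below for chromosome 1 from the slide. Returning infinity for an exact solution is an assumption added here to avoid dividing by zero; the function names are illustrative.

```python
def expression(chromosome):
    """Left-hand side a + 2b + 3c + 4d for a chromosome (a, b, c, d)."""
    a, b, c, d = chromosome
    return a + 2 * b + 3 * c + 4 * d

def fitness(chromosome, target=30):
    """Inverse of the absolute difference from the target value."""
    diff = abs(expression(chromosome) - target)
    return float("inf") if diff == 0 else 1 / diff

chromosome_1 = (1, 28, 15, 3)
value = expression(chromosome_1)      # 114
score = fitness(chromosome_1)         # 1 / |114 - 30| = 1 / 84
```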

    31. Example (ctd). Calculate the fitness proportion (likelihood) of each chromosome being picked as a parent: take the sum of all the fitness values (0.135266) and calculate the percentages from there, i.e. use likelihood of chromosome i = (fitness of chromosome i) / (sum of all fitness values).

    32. Example (ctd). Spin the roulette wheel 5 times. Assume the result was: [Table: selected chromosomes] Since chromosome 4 had a poor fitness, its chances of survival were slim, and it died out in the selection process.

    33. Example (ctd). Do crossover, say single point. The offspring of each pair of parents contains the genetic information of both father and mother. For example, if a father has the solution set a1, b1, c1, d1 and a mother has the solution set a2, b2, c2, d2, then there are three possible pairs of crossed-over offspring (| = crossover point):
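
With four genes there are three possible single-point cuts, so the three offspring pairs can be enumerated directly. The sketch below uses the symbolic gene names from the slide; `offspring_pairs` is a hypothetical helper.

```python
def offspring_pairs(father, mother):
    """All single-point crossover results, one pair per possible cut position."""
    return [(father[:cut] + mother[cut:], mother[:cut] + father[cut:])
            for cut in range(1, len(father))]

father = ("a1", "b1", "c1", "d1")
mother = ("a2", "b2", "c2", "d2")
for child1, child2 in offspring_pairs(father, mother):
    print(child1, child2)
```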

    34. Example (ctd). Assume that through random parent selection we have the following parent chromosomes: [Table: parent pairs] Apply crossover to our example to produce one offspring for each pair of parents (assuming the crossover points are chosen randomly): [Table: offspring] Note: normally there would be two offspring from each pair of parents, but for simplicity of discussion assume only one offspring is produced here.

    35. Example (ctd). Apply mutation to a randomly chosen gene in a randomly chosen chromosome, say gene a in chromosome 1. Mutation here replaces the selected gene with a new random value in the range 0 to 30: (13, 28, 15, 3) → (8, 28, 15, 3). Recalculate the fitness values for the offspring representing the new generation:

    36. Example - Commentary. The average fitness value of the offspring chromosomes was 0.026, while the average fitness value of the parent chromosomes was 0.017. Progressing at this rate, one chromosome should eventually reach a very high fitness value (i.e. when the absolute difference is close to 0), which is when an optimal solution is found. If you simulate this yourself, you may actually get an average fitness that is lower in some generations, but in the long run the fitness levels will increase. For systems where the population is larger (say 50 instead of 5), the fitness levels should approach the desired level more steadily and stably, i.e. nearly every generation will have better solutions than the previous ones.

    37. GA strengths and weaknesses. Advantages: often achieves good results; in most cases, the fitness function can be designed easily to fit the hypothesis (solution); can be easily hybridised with many other ML algorithms to yield improved results; there are no hard and fast rules, and many users use variations freely in their applications. Disadvantages: there is no guarantee that GA converges to the optimal solution, because the search is incomplete and because of hypothesis crowding, i.e. most chromosomes become similar, the fitness is high but not the best, and GA cannot progress further due to lack of variety.

    38. Lecture 6: Study guide. At the end of this section, you should be able to: define chromosome, gene, allele, crossover, mutation and fitness function; describe how GA works using a flowchart or an algorithm; explain how chromosomes and hypotheses are represented in GA, i.e. coding in GA; estimate the fitness function of a given population; describe chromosome selection mechanisms; perform crossover between two chromosomes using single-point, two-point and uniform masks; perform mutation; explain how GA can be used to evolve NN weights; state the main advantages and disadvantages of GA.
