CSNB234 ARTIFICIAL INTELLIGENCE
Chapter 9 Genetic Algorithms
(Chapter 12, pp. 509-519, Textbook) (pp. 116-119, Ref. #3)
Instructor: Alicia Tang Y. C.
UNIVERSITI TENAGA NASIONAL

Genetic Algorithms - I
• Genetic Algorithms (GAs)
  • are an efficient and robust search technique for complex search spaces
  • normally find near-optimal solutions to problems
  • are based on natural selection and genetics, i.e. survival of the fittest
• In a GA:
  • a number of solutions are evaluated simultaneously

Genetic Algorithms - II
• developed by John Holland in 1975
• inspired by biological mutation and evolution
• stochastic (i.e. non-deterministic) search techniques based on the mechanisms of natural selection and genetics
• implicitly parallel search of the solution space

Genetic Algorithms - III
• Each solution is evaluated for its fitness.
• The better the solution, the greater its chance of survival.
• GAs use an iterative process, with better solutions normally evolving over time.

Applications
• GAs are used for optimisation problems
  • such as maximising profits or minimising costs/time
• Some problem areas are:
  • scheduling
  • design
  • financial management

The Algorithm (Evolutionary Algorithm)
start --> Generate initial population --> Evaluate function --> Criteria met?
  • no: selection --> crossover --> mutation --> back to Evaluate function
  • yes: Best result --> stop

Genetic Representation
• Chromosomes (strings)
  • a chromosome represents a candidate solution to the problem to be solved
  • it is also called a genetic structure
  • a chromosome consists of one or more bits
    • e.g. 01001001
  • a chromosome with n bits can represent 2^n different solutions (some of which may be invalid)
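
To make the 2^n count concrete, here is a minimal Python sketch that enumerates every bit-string chromosome of a given length. The helper name `all_chromosomes` is an illustrative assumption, not something from the slides.

```python
from itertools import product

def all_chromosomes(n_bits):
    """Return every bit string of length n_bits (there are 2**n_bits of them)."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

# An 8-bit chromosome such as 01001001 is one of 2**8 = 256 possibilities.
eight_bit = all_chromosomes(8)
assert len(eight_bit) == 2 ** 8
assert "01001001" in eight_bit
```

Enumerating is only feasible for small n; the point of a GA is to search this space without listing it.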

Chromosomes
• Representation
  • bit strings (0011 .. 1101), as seen in the earlier slide
  • permutation of elements (e3 e4 e1 e7 e9 .. e15)
  • list of rules (R1 R2 R3 .. R10)
  • tree-structured expression (+ a b)
  • etc.
• GAs have been successful with chromosome sizes of thousands of bits, hundreds of rules and thousands of permutation elements

Genes
• A chromosome may be divided into parts (positions) called genes
• Each gene may represent a particular aspect or parameter of a problem
• The value of a gene is called an allele
• As a gene is in binary, a 1-bit gene can hold 1 or 2 values, a 2-bit gene can hold 3 or 4 values, a 3-bit gene 5 to 8 values, etc.
• In general, n bits can hold from 2^(n-1) + 1 to 2^n values
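
Read the other way round, the rule above tells us how many bits a gene needs for a given number of values. A small Python sketch follows; the function name `bits_needed` is hypothetical and chosen only for illustration.

```python
def bits_needed(n_values):
    """Smallest gene width (in bits) that can hold n_values distinct values."""
    if n_values < 1:
        raise ValueError("a gene must hold at least one value")
    # (n_values - 1).bit_length() is the ceiling of log2(n_values);
    # a gene is never narrower than 1 bit.
    return max(1, (n_values - 1).bit_length())

# Matches the slide: 1-2 values -> 1 bit, 3-4 -> 2 bits, 5-8 -> 3 bits.
assert [bits_needed(k) for k in range(1, 9)] == [1, 1, 2, 2, 3, 3, 3, 3]
```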

Population
• A GA works on a number of chromosomes at a time (e.g. 50), called the population, each representing a solution to the problem
• The chromosomes in the population compete to survive based on their fitness and are manipulated by genetic operators
• The population evolves over a number of generations towards a better solution

Genetic Operators
• The main genetic operators are
  • reproduction
  • crossover
  • mutation
  • inversion

Reproduction
• Reproduction works by selecting the individual chromosomes that are to reproduce
  • i.e. selecting individuals to be parents
  • chromosomes with a higher fitness value have a higher probability of contributing one or more offspring to the next generation

Roulette wheel selection technique
• It is one of the chromosome selection techniques. Each chromosome is given a slice of a circular roulette wheel. The area of the slice is proportional to the chromosome's fitness ratio (its fitness as a percentage of the total fitness of the population).
• To select a chromosome for mating, a random number is generated in the interval [0, 100], and the chromosome whose segment spans the random number is selected.
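
The roulette wheel can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's reference implementation: fitness values are non-negative with a positive total, higher fitness is better, and the function name `roulette_select` is hypothetical.

```python
import random

def roulette_select(fitnesses, rng):
    """Return the index of the chromosome whose slice spans a random spin.

    Assumes non-negative fitness values with a positive total (maximisation).
    """
    total = sum(fitnesses)
    spin = rng.uniform(0, 100)           # random number in [0, 100]
    cumulative = 0.0
    for index, value in enumerate(fitnesses):
        cumulative += 100 * value / total  # slice size = fitness ratio in %
        if spin < cumulative:
            return index
    return len(fitnesses) - 1            # guard against floating-point rounding

rng = random.Random(1)                   # seeded only to make runs repeatable
picks = [roulette_select([1, 3], rng) for _ in range(2000)]
# The second chromosome owns 75% of the wheel, so it is picked more often.
assert picks.count(1) > picks.count(0)
```

For a minimisation problem the fitness values would first have to be transformed (for example inverted) so that better solutions receive larger slices.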

Crossover
• Mixing of genetic material (mating)
• Two structures in the current generation are allowed to mate randomly
  • with each pair producing two chromosomes (offspring)
• A crossover point is selected at random and parts of the two parent chromosomes are swapped to create two child chromosomes.

Crossover in action
• Select a position at random and swap the two bits at that position (the 5th bit in this example)

                Before crossing over    After crossing over
Chromosome 1    0 1 0 0 1 0 0 1         0 1 0 0 0 0 0 1
Chromosome 2    0 0 1 1 0 0 1 0         0 0 1 1 1 0 1 0

• Crossover can lead to an effective combination of partial solutions on different chromosomes
• By doing this, it helps accelerate the search at an early stage of evolution
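
Both crossover variants seen so far can be sketched in Python: the single-point crossover described on the previous slide, and the single-position bit swap shown in this example. The function names are illustrative assumptions, and chromosomes are represented as strings of '0' and '1'.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length chromosomes after the crossover point."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def swap_at_position(parent_a, parent_b, index):
    """Swap only the bit at one position (0-based), as in the slide's example."""
    child_a = parent_a[:index] + parent_b[index] + parent_a[index + 1:]
    child_b = parent_b[:index] + parent_a[index] + parent_b[index + 1:]
    return child_a, child_b

# Reproduces the slide: swapping the 5th bit (index 4) of the two parents.
assert swap_at_position("01001001", "00110010", 4) == ("01000001", "00111010")
# Single-point crossover after the 4th bit gives different children.
assert single_point_crossover("01001001", "00110010", 4) == ("01000010", "00111001")
```

In a GA the point or index would be drawn at random, for example with `random.randint(1, len(parent_a) - 1)`.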

Types of crossover
• single-point
• multi-point
• arithmetic
• reduced surrogate
• uniform
• shuffle
• tree, etc.

Mutation
• Mutation is a small copy error from one generation to the next
• The mutation rate is the probability that a bit changes from 0 to 1 or from 1 to 0
• The mutation rate must be very small (e.g. 0.001); otherwise the GA may degenerate into a random search rather than a guided search

Mutation in action
• Mutation may be random or heuristic
• It can be applied globally or locally to some sub-units
• Example: 0 1 0 0 1 0 0 1 becomes 0 1 0 1 1 0 0 1
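
A minimal Python sketch of random bit-flip mutation follows. The names `mutate` and `flip_bit` are hypothetical, and chromosomes are again strings of '0' and '1'.

```python
import random

def flip_bit(chromosome, index):
    """Return a copy of the chromosome with the bit at index (0-based) inverted."""
    flipped = "1" if chromosome[index] == "0" else "0"
    return chromosome[:index] + flipped + chromosome[index + 1:]

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    result = chromosome
    for index in range(len(chromosome)):
        if rng.random() < rate:
            result = flip_bit(result, index)
    return result

# The slide's example: the 4th bit (index 3) of 01001001 is flipped.
assert flip_bit("01001001", 3) == "01011001"
# With a tiny rate such as 0.001, most chromosomes pass through unchanged.
mutated = mutate("01001001", 0.001, random.Random(0))
assert len(mutated) == 8
```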

An Example
• Data population: RGB colours
• Aim: to obtain the darkest colour, represented by (0, 0, 0)
• This is a minimisation problem, i.e. a good colour is one whose fitness value approaches 0: (colour) --> 0
• We now tabulate our data as shown (see next slide):

GA: Fitness (I)
• Start with a random population like this:

Colour   Red   Green   Blue
C1        80    170     689
C2       130    690      15
C3        24      8     317

where
Fitness (C1) = 80 + 170 + 689 = 939
Fitness (C2) = 130 + 690 + 15 = 835
Fitness (C3) = 24 + 8 + 317 = 349

GA: Fitness (II)
• The same population with its fitness values:

Colour   Red   Green   Blue   Fitness
C1        80    170     689       939
C2       130    690      15       835
C3        24      8     317       349

• Fittest placed top: C3, C2, C1
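
The fitness calculation and ranking for this example can be checked with a short Python sketch. The dictionary layout and the function name `fitness` are illustrative choices; the numbers are those from the table.

```python
# Population from the slide: colour name -> (red, green, blue)
population = {
    "C1": (80, 170, 689),
    "C2": (130, 690, 15),
    "C3": (24, 8, 317),
}

def fitness(colour):
    """Sum of the three components; lower is better (darker) in this example."""
    return sum(colour)

# Minimisation problem: sort ascending so the fittest colour comes first.
ranking = sorted(population, key=lambda name: fitness(population[name]))

assert ranking == ["C3", "C2", "C1"]
assert [fitness(population[name]) for name in ranking] == [349, 835, 939]
```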

GA: Selection
• After selection is done on the sample (remember, this is a minimisation problem):

Colour   Fitness
C3           349
C2           835
C1           939

GA: Reproduction & Crossover
• So far, we have C1 = (80, 170, 689), C2 = (130, 690, 15) and C3 = (24, 8, 317), ranked by fitness as C3 (349), C2 (835), C1 (939)
• The next step is to reproduce the pattern by crossing over (here each child takes Red and Green from the first parent and Blue from the second):
  C4 = crossover(C3, C2) = (24, 8, 15)
  C5 = crossover(C3, C1) = (24, 8, 689)
  C6 = crossover(C2, C1) = (130, 690, 689)

Colour   Red   Green   Blue
C4        24      8      15
C5        24      8     689
C6       130    690     689
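
The three children can be reproduced with a small Python sketch. It assumes, as the numbers on the slide suggest, that the crossover point falls after the Green component; the function name `crossover` mirrors the slide's notation but its signature is an illustrative choice.

```python
c1 = (80, 170, 689)
c2 = (130, 690, 15)
c3 = (24, 8, 317)

def crossover(first, second, point=2):
    """Child takes components before `point` from `first`, the rest from `second`."""
    return first[:point] + second[point:]

c4 = crossover(c3, c2)
c5 = crossover(c3, c1)
c6 = crossover(c2, c1)

assert (c4, c5, c6) == ((24, 8, 15), (24, 8, 689), (130, 690, 689))
```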

Mutation in GA
• Perform mutation, and we have:
  C7 is obtained by mutating C4 = (24, 8, 13)
  C8 is obtained by mutating C5 = (25, 9, 689)
  C9 is obtained by mutating C6 = (128, 688, 689)
• New population of 3 chromosomes:

Colour   Red   Green   Blue
C7        24      8      13
C8        25      9     689
C9       128    688     689

Conclusion (up to “mutation” to get a new data set)
• Some solutions have improved after the first iteration:
  Fitness (C7) = 24 + 8 + 13 = 45 (getting better rather fast)
  Fitness (C8) = 25 + 9 + 689 = 743 (slightly improved)
  Fitness (C9) = 128 + 688 + 689 = 1505 (worse here)
• If the process is iterated, the population is expected to converge towards a fitness near zero, i.e. (colour) --> 0.

We shall look at the so-called ‘fitness function’ with an example

Fitness Function
• The GA performs a search amongst possible solutions
• The search is guided by a fitness function, which returns a single numeric value indicating the fitness of a chromosome
• The fitness is maximised or minimised depending on the problem

Problem Description
• To make the best use of disk space and avoid fragmentation, files should be allocated so as to minimise the number of locations used (this frees storage for other files to use)
• Assumption:
  • each file must be placed in a single location

Problem Description
Disk space:

Location   Size (KB)
0          1.0
1          1.5
2          4.0
3          0.3
Total      6.8

Files to be stored are:

Identifier   Size (KB)
A            0.2
B            0.1
C            1.2
D            3.0
E            0.9

SOLUTION
• Step 1: design the structure of the chromosome
  • use one gene per file
    • 5 genes (for this example)
    • the first gene represents the location for file A, etc.
  • as there are 4 locations, 2-bit genes can be used to indicate a file's location
    • (00 = 0, 01 = 1, 10 = 2, 11 = 3), i.e. base two
  • chromosome size = 5 * 2 = 10 bits
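
The chromosome layout from Step 1 can be sketched in Python as a pair of helpers that convert between the 10-bit string and a list of location numbers for files A to E. The names `decode` and `encode` are hypothetical.

```python
GENE_BITS = 2   # 4 locations -> 2 bits per gene
N_FILES = 5     # files A..E -> 5 genes, 10 bits in total

def decode(chromosome):
    """Split a 10-bit string into 5 genes and read each as a base-two location."""
    return [
        int(chromosome[i:i + GENE_BITS], 2)
        for i in range(0, N_FILES * GENE_BITS, GENE_BITS)
    ]

def encode(locations):
    """Inverse of decode: one 2-bit gene per file, in the order A..E."""
    return "".join(format(loc, "02b") for loc in locations)

# A -> 3, B -> 3, C -> 1, D -> 2, E -> 0
assert decode("1111011000") == [3, 3, 1, 2, 0]
assert encode([3, 3, 1, 2, 0]) == "1111011000"
```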

• Step 2: determine the fitness function
  • The fitness is to be maximised
  • Factors affecting the fitness function
    • Free memory locations, i.e. locations not being used (min = 0, max = 4)
    • Memory overflow (disk space, KB)
    • Valid/invalid solutions (a solution is valid when all files are allocated)
  • Fitness calculation
    • Valid solution: fitness = bonus for getting a valid solution + number of free memory locations, i.e. between 6.8 + 0 and 6.8 + 4
      • The bonus in this case is the maximum that can be obtained for an invalid solution (the total disk space, 6.8)
    • Invalid solution: fitness = total disk space - memory overflow (a solution with a lot of overflow is not ‘fit’)
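
The fitness calculation from Step 2 can be sketched in Python as follows. Several details are assumptions made for illustration: a solution counts as invalid when any location is over capacity, overflow is taken as the sum of the excess over all locations, and sizes are held as integers in tenths of a KB so that floating-point rounding cannot create a spurious overflow. The input is a list of location numbers for files A to E (a decoded chromosome).

```python
CAPACITY = [10, 15, 40, 3]        # locations 0..3, in tenths of a KB
FILE_SIZE = [2, 1, 12, 30, 9]     # files A..E, in tenths of a KB
TOTAL_SPACE = sum(CAPACITY)       # 68, i.e. 6.8 KB

def fitness(locations):
    """Fitness (to be maximised) of an allocation of files A..E to locations."""
    used = [0] * len(CAPACITY)
    for size, location in zip(FILE_SIZE, locations):
        used[location] += size
    # Assumed definition: overflow is the total excess over all locations.
    overflow = sum(max(0, u - c) for u, c in zip(used, CAPACITY))
    if overflow == 0:
        free_locations = used.count(0)
        # Valid: bonus (total disk space) + number of free locations.
        return (TOTAL_SPACE + 10 * free_locations) / 10
    # Invalid: total disk space minus the overflow.
    return (TOTAL_SPACE - overflow) / 10

assert fitness([1, 1, 1, 2, 2]) == 8.8   # valid, locations 0 and 3 left free
assert fitness([3, 3, 3, 3, 3]) == 1.7   # invalid, 5.1 KB of overflow
```

Because the bonus equals the best score an invalid solution could reach, any valid allocation scores at least as high as any invalid one.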

Genetic Algorithm - revisited
• Algorithm overview
  Initialise the population
  Evaluate the fitness of the population
  WHILE the termination condition has not been satisfied
    select chromosomes for reproduction into the new population
    perform crossover on the new population
    perform mutation on the new population
    evaluate the fitness of the new population
    make the new population the old population
  ENDWHILE
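
The overview above can be turned into a compact Python sketch. It is one possible reading under stated assumptions rather than a definitive implementation: chromosomes are lists of bits, the fitness is non-negative and maximised, selection is fitness-proportionate, crossover is single-point, and the termination condition is simply a fixed number of generations. The function name `run_ga` and its parameters are hypothetical; the default values mirror the Genesis defaults quoted in the supplementary slides.

```python
import random

def run_ga(fitness, n_bits=10, pop_size=50, generations=20,
           crossover_rate=0.6, mutation_rate=0.001, seed=0):
    """Return the best chromosome (list of 0/1) seen during the run."""
    rng = random.Random(seed)
    # Initialise the population with random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Evaluate; the +1 keeps every selection weight positive
        # (assumes the fitness is never negative).
        weights = [fitness(c) + 1 for c in population]
        # Select chromosomes for reproduction, proportionally to fitness.
        parents = rng.choices(population, weights=weights, k=pop_size)
        new_population = []
        for i in range(0, pop_size - 1, 2):
            a, b = parents[i][:], parents[i + 1][:]
            # Cross a pair over with probability crossover_rate,
            # otherwise copy both parents unchanged.
            if rng.random() < crossover_rate:
                point = rng.randint(1, n_bits - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            new_population += [a, b]
        if len(new_population) < pop_size:   # odd population size
            new_population.append(parents[-1][:])
        # Mutation: flip each bit with probability mutation_rate.
        for chromosome in new_population:
            for j in range(n_bits):
                if rng.random() < mutation_rate:
                    chromosome[j] ^= 1
        # The new population becomes the old population.
        population = new_population
        best = max(population + [best], key=fitness)
    return best

# Toy usage (an assumption, not from the slides): maximise the number of 1 bits.
best = run_ga(sum)
assert len(best) == 10
```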

Procedures
• Step 1: Pick a population of random codes (population size = 4)
• Step 2: Evaluate the fitness of each chromosome
• Step 3: Select for reproduction with a probability based on the fitness value
• Step 4: Cross over chromosomes to form the new generation
  • randomly select pairs
  • crossover
• Step 5: Mutation
• Step 6: Loop back to Step 2

Control Parameters
• The main GA control parameters are:
  • Number of generations and trials
  • Population size
  • Crossover rate
  • Mutation rate

Exercise

Supplementary slides

Number of generations
• The number of cycles (generations) of processing
• Not all chromosomes in a population need to be evaluated in each generation, as some do not change
Number of trials
• The fitness evaluation of a chromosome takes 1 trial
• The number of trials is approximately equal to population size * number of generations
• Genesis default: 1000

Population size
• The number of chromosomes in the population
• Constant throughout a GA run
• Genesis default: 50

Crossover rate
• The proportion of chromosomes that are crossed over; the remainder are copied to the new population unchanged
• For example, a crossover rate of 0.6 results in 60% of the chromosomes selected for reproduction being crossed over, and the other 40% being carried through unchanged to the new population
• Genesis default: 0.6

Mutation rate
• The probability of a bit being mutated (changed)
• Usually in the range 0.001 to 0.01
• For example, a mutation rate of 0.001 means there is a 1 in 1000 chance of a bit being changed
• Genesis default: 0.001
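
To get a feel for how small this rate is, the expected number of bit flips per generation can be worked out in Python. The helper name `expected_mutations` is hypothetical, and the 10-bit chromosome length is borrowed from the file allocation example as an assumption.

```python
def expected_mutations(mutation_rate, population_size, chromosome_bits):
    """Expected number of flipped bits in one generation."""
    return mutation_rate * population_size * chromosome_bits

# Genesis defaults (rate 0.001, population 50) with 10-bit chromosomes:
# 500 bits per generation, so about half a bit is flipped on average.
flips = expected_mutations(0.001, 50, 10)
assert abs(flips - 0.5) < 1e-9
```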

Genetic Algorithms Software Packages
• ANT: PC implementation of 'John Muir Trail' experiment
• CFS-C: Domain Independent Subroutines for Implementing Classifier Systems in Arbitrary, User-Defined Environments
• DGENESIS: Distributed GA
• EM: Evolution Machine
• GAucsd: Genetic Algorithm Software Package
• GAC: Simple GA in C
• GACC: Genetic Aided Cascade-Correlation
• GAGA: A Genetic Algorithm for General Application
• GAGS: Genetic algorithm application generator and C++ class library
• GAL: Simple GA in Lisp
• GAME: Genetic Algorithms Manipulation Environment

Genetic Algorithms Software Packages (continued)
• GAMusic: Genetic Algorithm to Evolve Musical Melodies
• GANNET: Genetic Algorithm / Neural NETwork
• GAW: Genetic Algorithm Workbench
• GECO: Genetic Evolution through Combination of Objects
• GENALG: Genetic Algorithm package written in Pascal
• GENESIS: GENEtic Search Implementation System
• GENEsYs: Experimental GA based on GENESIS
• GenET: Domain-independent generic GA software package
• Genie: GA-based modeling/forecasting system
• GENITOR: Modular GA package with floating-point support
• GENlib: Genetic Algorithms and Neural Networks
• mGA: C and Common Lisp implementations of a messy GA