A tutorial on evolutionary computation, covering the basic concept, general framework, components, and paradigms. Includes examples and applications.
Evolutionary Computation: A Tutorial. September 2009. Biointelligence Laboratory, School of Computer Sci. & Eng., Seoul National University. http://bi.snu.ac.kr/
Contents • Part I: Introduction to Evolutionary Computation • Part II: Genetic Programming • Part III: Advanced Topics • Part IV: Applications • Part V: Summary and Further Info (C) 2000-2009 SNU CSE Biointelligence Lab
Overview • The Basic Concept • General Framework of Evolutionary Computation • Components of Evolutionary Computation • Paradigms in Evolutionary Computation
Basic Concept • Use of Darwinian-like evolutionary processes to solve difficult computational problems. • Hence the name, “Evolutionary Computation” • Biological basis • Biological systems adapt themselves to a new environment by evolution. • Generations of descendants are produced that perform better than do their ancestors. • Biological evolution • Production of descendants changed from their parents • Selective survival of some of these descendants to produce more descendants
Basic Concept • What is Evolutionary Computation? • Stochastic search (or problem-solving) techniques that mimic natural biological evolution. • Metaphor (EVOLUTION ↔ PROBLEM SOLVING): • Individual ↔ Candidate solution • Fitness ↔ Quality • Environment ↔ Problem
General Framework of EC • 1. Generate initial population • 2. Evaluate fitness (using the fitness function) • 3. Termination condition? Yes: output the best individual. No: continue • 4. Select parents • 5. Crossover, mutation • 6. Generate new offspring and return to step 2
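The loop in this framework can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `evolve`, its parameters, and the toy bit-counting problem are hypothetical and do not come from the tutorial.

```python
import random

def evolve(init, fitness, select, vary, pop_size=20, max_gens=200, target=None, seed=0):
    """Generic EA loop: evaluate, test termination, select parents, vary, repeat."""
    rng = random.Random(seed)
    population = [init(rng) for _ in range(pop_size)]        # generate initial population
    for _ in range(max_gens):
        best = max(population, key=fitness)                   # evaluate fitness
        if target is not None and fitness(best) >= target:    # termination condition?
            return best                                       # yes: best individual
        # no: select parents, apply crossover/mutation, generate new offspring
        population = [vary(select(population, fitness, rng),
                           select(population, fitness, rng), rng)
                      for _ in range(pop_size)]
    return max(population, key=fitness)

# Toy problem (an assumption for illustration): maximise the number of 1s in a bit list.
N = 16

def init(rng):
    return [rng.randint(0, 1) for _ in range(N)]

def fitness(ind):
    return sum(ind)

def select(pop, fit, rng):
    return max(rng.sample(pop, 3), key=fit)    # small tournament

def vary(a, b, rng):
    cut = rng.randrange(1, N)
    child = a[:cut] + b[cut:]                  # one-point crossover
    i = rng.randrange(N)
    child[i] = 1 - child[i]                    # flip one randomly chosen bit
    return child

best = evolve(init, fitness, select, vary, target=N)
print(fitness(best), best)
```

Each component (representation, fitness, selection, variation, termination) is passed in as a separate function, which mirrors the component list discussed in the following slides.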
Components of EC (1/9) • Representations • Population • Fitness function • Selection mechanism • Variation operators • Crossover/Mutation • Initialization / Termination
Components of EC (2/9) • An example: the 8 queens problem • Place 8 queens on an 8x8 chessboard in such a way that they cannot check each other.
Components of EC (3/9) • Representations • How to represent the space to be searched? • Genotype • Internal representation of solutions in EC. • Requires minimal domain knowledge. • Phenotype • External representation of solutions for the problem. • Requires domain knowledge. • Take care that every feasible solution can be represented in genotype space.
Representation Example: 8 Queens Problem • Phenotype: a board configuration • Genotype: a permutation of the numbers 1 - 8, e.g. 1 3 5 2 6 4 7 8 • The mapping between the two is obvious.
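A minimal sketch of the genotype-to-phenotype mapping follows. It assumes that position i of the permutation gives the row of the queen in column i; the slide only calls the mapping "obvious", so this convention and the function name `to_board` are illustrative choices.

```python
def to_board(genotype):
    """Decode a permutation into a board: genotype[col] is the row (1-based) of that column's queen."""
    n = len(genotype)
    rows = [["."] * n for _ in range(n)]
    for col, row in enumerate(genotype):
        rows[row - 1][col] = "Q"
    return ["".join(r) for r in rows]

board = to_board([1, 3, 5, 2, 6, 4, 7, 8])
print("\n".join(board))
```

Because the genotype is a permutation, every decoded board has exactly one queen per row and per column, so only diagonal conflicts remain to be searched over.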
Components of EC (4/9) • Population • Usually has a fixed size and is a multiset of genotypes • Some sophisticated EAs also impose a spatial structure on the population, e.g., a grid. • Diversity of a population refers to the number of different fitnesses / phenotypes / genotypes present (note that these are not the same thing). • Population sizing • Parent population size (M), offspring population size (K). • Typically M = K. • In typical ES (explained later), M << K. • In steady-state algorithms, M > K.
Components of EC (5/9) • Fitness function • Represents the requirements that the population should adapt to • a.k.a. quality function or objective function • Assigns a single real-valued fitness to each phenotype which forms the basis for selection • So the more discrimination (different values) the better • Typically we talk about fitness being maximised • Some problems may be best posed as minimisation problems, but conversion is trivial.
Fitness Function Example: 8 Queens Problem • Penalty for a queen: • The number of queens she can check • Penalty for a configuration: • The sum of penalties of all queens • Note: penalty is to be minimized • Fitness of a configuration: • inverse penalty to be maximized [Figure: board configuration 1 3 5 2 6 4 7 8 annotated with each queen's penalty]
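The penalty and fitness definitions can be sketched as follows. Two assumptions are made here that the slide does not state: queens standing between two others are not treated as blocking a check, and a configuration with zero penalty is given infinite fitness to avoid dividing by zero.

```python
def penalties(genotype):
    """Per-queen penalty: how many other queens each queen can check."""
    n = len(genotype)

    def attacks(i, j):
        same_row = genotype[i] == genotype[j]
        same_diagonal = abs(genotype[i] - genotype[j]) == abs(i - j)
        return same_row or same_diagonal

    return [sum(attacks(i, j) for j in range(n) if j != i) for i in range(n)]

def penalty(genotype):
    """Penalty of a configuration: the sum of penalties of all queens."""
    return sum(penalties(genotype))

def fitness(genotype):
    """Inverse penalty, to be maximised (infinite for a conflict-free board: an assumption)."""
    p = penalty(genotype)
    return float("inf") if p == 0 else 1.0 / p

example = [1, 3, 5, 2, 6, 4, 7, 8]
print(penalties(example), penalty(example), fitness(example))
```

For the configuration 1 3 5 2 6 4 7 8 this gives a total penalty of 10, so its fitness is 1/10.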
Components of EC (6/9) • Parent selection mechanism • Assigns each individual a probability of being selected as a parent. • Selection probabilities are relative to the current population. • Different mechanisms can assign different probabilities to the same individual. • Selection is usually probabilistic and depends on the individual’s fitness. • High-quality solutions are more likely to become parents than low-quality ones. • Selection pressure: the degree of correlation between an individual’s fitness and its selection probability. • High selection pressure narrows the scope of the search. • Even the worst individual in the current population usually has a non-zero probability of becoming a parent. • This stochastic nature can aid escape from local optima.
Selection Mechanism Example: 8 Queens Problem • Fitness (inverse penalty) of four individuals: (1) 1/9 = 0.11 (2) 1/10 = 0.1 (3) 1/11 = 0.091 (4) 1/6 = 0.17 • Different selection mechanisms assign different selection probabilities for the same individual. [Figure: four board configurations]
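To make the last point concrete, the sketch below turns the four fitness values from this slide into selection probabilities under two mechanisms. Fitness-proportionate selection is named in the paradigms slide; the linear ranking scheme used for comparison (probability proportional to rank, worst = 1) is an assumption chosen for illustration.

```python
import random

fitnesses = [1 / 9, 1 / 10, 1 / 11, 1 / 6]   # individuals (1)-(4) from the slide

def proportionate(fits):
    """Selection probability proportional to fitness."""
    total = sum(fits)
    return [f / total for f in fits]

def linear_ranking(fits):
    """Selection probability proportional to rank (worst individual has rank 1)."""
    order = sorted(range(len(fits)), key=lambda i: fits[i])
    ranks = [0] * len(fits)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = sum(ranks)
    return [r / total for r in ranks]

def pick_parent(population, probs, rng):
    """Draw one parent according to the given selection probabilities."""
    return rng.choices(population, weights=probs, k=1)[0]

p_prop = proportionate(fitnesses)
p_rank = linear_ranking(fitnesses)
parent = pick_parent(["(1)", "(2)", "(3)", "(4)"], p_prop, random.Random(0))
print([round(p, 3) for p in p_prop], p_rank, parent)
```

Both schemes favour individual (4), and both leave the worst individual (3) with a non-zero chance of being chosen, but the actual probabilities differ between the two.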
Components of EC (7/9) • Crossover (Recombination) • Mixes information from the parents into offspring in a stochastic way. • Most offspring may be worse than, or the same as, the parents. • The hope is that some are better, by combining elements of genotypes that lead to good traits. • The principle has been used for millennia by breeders of plants and livestock. • Example [Figure: two parent permutations recombined into two offspring]
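The slide's own crossover example did not survive extraction, so the following is a stand-in rather than the authors' operator: a simple "cut and fill" crossover that keeps the offspring a valid permutation. The function name and the parent values are illustrative assumptions.

```python
def cut_and_fill(parent1, parent2, cut):
    """Copy parent1 up to the cut point, then append the missing values in parent2's order."""
    head = parent1[:cut]
    tail = [gene for gene in parent2 if gene not in head]
    return head + tail

p1 = [1, 3, 5, 2, 6, 4, 7, 8]
p2 = [8, 7, 6, 5, 4, 3, 2, 1]
child = cut_and_fill(p1, p2, cut=3)
print(child)  # [1, 3, 5, 8, 7, 6, 4, 2]
```

A plain one-point crossover on permutations would usually duplicate some values and drop others; filling the tail from the second parent avoids that while still mixing information from both.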
Components of EC (8/9) • Mutation • It is applied to one genotype and delivers a (slightly) modified mutant, its child or offspring. • An element of randomness is essential. • The role of mutation differs among EC subtypes. • Example • Swapping the values of two randomly chosen positions: 1 3 5 2 6 4 7 8 → 1 3 7 2 6 4 5 8
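Swap mutation as described in the example can be sketched as below; the function name `swap_mutation` and the use of a seeded `random.Random` are illustrative choices, not part of the tutorial.

```python
import random

def swap_mutation(genotype, rng):
    """Return a copy of the genotype with the values at two random positions swapped."""
    child = list(genotype)
    i, j = rng.sample(range(len(child)), 2)   # two distinct positions
    child[i], child[j] = child[j], child[i]
    return child

parent = [1, 3, 5, 2, 6, 4, 7, 8]
mutant = swap_mutation(parent, random.Random(0))
print(parent, "->", mutant)
```

Swapping keeps the genotype a permutation, so the mutant is always a representable board.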
Components of EC (9/9) • Initialization usually done at random • Need to ensure even spread and mixture of possible allele values • Can include existing solutions, or use problem-specific heuristics, to “seed” the population • Termination condition checked every generation • Reaching some (known/hoped for) fitness • Reaching some maximum allowed number of generations • Reaching some minimum level of diversity • Reaching some specified number of generations without fitness improvement
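The four termination conditions can be checked together once per generation. The sketch below is one possible arrangement: the function name, the parameter names, and the choice to measure diversity as the number of distinct genotypes are all assumptions.

```python
def should_terminate(best_history, population, *, target=None, max_gens=None,
                     min_distinct=None, patience=None):
    """Return a reason string if any termination condition holds, else None.

    best_history holds the best fitness of each generation so far (maximisation).
    """
    gens = len(best_history)
    if target is not None and best_history and best_history[-1] >= target:
        return "target fitness reached"
    if max_gens is not None and gens >= max_gens:
        return "generation limit reached"
    if min_distinct is not None and len({tuple(g) for g in population}) <= min_distinct:
        return "diversity too low"
    if patience is not None and gens > patience:
        # no improvement if the last `patience` generations never beat the earlier best
        if max(best_history[-patience:]) <= max(best_history[:-patience]):
            return "no recent improvement"
    return None

print(should_terminate([1, 2, 2, 2], [[0], [1]], patience=2))
```

Returning the reason rather than a bare boolean makes it easier to see afterwards why a run stopped.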
Paradigms in EC • Evolutionary Programming (EP) • [L. Fogel et al., 1966] • FSMs, mutation only, tournament selection • Evolution Strategy (ES) • [I. Rechenberg, 1973] • Real values, mainly mutation, ranking selection • Genetic Algorithm (GA) • [J. Holland, 1975] • Bitstrings, mainly crossover, proportionate selection • Genetic Programming (GP) • [J. Koza, 1992] • Trees, mainly crossover, proportionate selection
Evolution Strategy (ES) • Problem of real-valued optimization: find the extremum (minimum) of a function $F(X): \mathbb{R}^n \to \mathbb{R}$ • Operate directly on the real-valued vector $X$ • Generate new solutions through Gaussian mutation of all components • Selection mechanism for determining new parents
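As a rough illustration of these four points, here is a minimal (1+1)-style ES with a fixed mutation step size. The (1+1) scheme, the sphere test function, and all parameter values are assumptions for the sketch; the tutorial does not prescribe them.

```python
import random

def sphere(x):
    """Test function to minimise (an illustrative choice): sum of squares."""
    return sum(v * v for v in x)

def one_plus_one_es(f, x, sigma=0.5, steps=500, seed=0):
    """One parent, one Gaussian-mutated child per step; the better one survives."""
    rng = random.Random(seed)
    fx = f(x)
    for _ in range(steps):
        child = [v + rng.gauss(0.0, sigma) for v in x]   # Gaussian mutation of all components
        fc = f(child)
        if fc <= fx:                                     # selection of the new parent
            x, fx = child, fc
    return x, fx

start = [3.0, -2.0, 1.0]
best_x, best_f = one_plus_one_es(sphere, start)
print(best_f)
```

Since the child replaces the parent only when it is at least as good, the objective value never gets worse over the run.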
ES: Representation • One individual: $a = (x, \sigma, \alpha)$ • The three parts of an individual: • $x$: object variables (determine the fitness $f(x)$) • $\sigma$: standard deviations (variances) • $\alpha$: rotation angles (covariances)
ES: Operator - Recombination • $r_{r_x r_\sigma r_\alpha}$, where $r_x, r_\sigma, r_\alpha \in \{-, d, D, i, I, g, G\}$, e.g. $r_{dII}$
ES: Operator - Mutation • $m_{\{\tau, \tau', \beta\}}: I \to I$ is an asexual operator. • $n_\sigma = n$, $n_\alpha = n(n-1)/2$ • $1 < n_\sigma < n$, $n_\alpha = 0$ • $n_\sigma = 1$, $n_\alpha = 0$
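For the case with one step size per object variable and no rotation angles, self-adaptive mutation is commonly written as a log-normal update of the step sizes followed by a Gaussian perturbation of the object variables. The sketch below uses that common form; the update rule and the learning-rate settings are standard textbook choices assumed here, not taken from the slide.

```python
import math
import random

def self_adaptive_mutation(x, sigmas, rng):
    """Mutate the step sizes first, then use the new step sizes to mutate x."""
    n = len(x)
    tau_global = 1.0 / math.sqrt(2.0 * n)              # shared learning rate (assumed setting)
    tau_local = 1.0 / math.sqrt(2.0 * math.sqrt(n))    # per-coordinate learning rate (assumed setting)
    shared = rng.gauss(0.0, 1.0)                       # one draw shared by all coordinates
    new_sigmas = [s * math.exp(tau_global * shared + tau_local * rng.gauss(0.0, 1.0))
                  for s in sigmas]
    new_x = [xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas

rng = random.Random(0)
new_x, new_sigmas = self_adaptive_mutation([0.0] * 4, [1.0] * 4, rng)
print(new_x, new_sigmas)
```

The exponential keeps every step size positive, and because the step sizes are part of the individual they are inherited and selected along with the object variables.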
ES: Illustration of Mutation Hyperellipsoids [Figure: lines of equal probability density to place an offspring]
ES: Evolution Strategy vs. Genetic Algorithm [Figure: two flowcharts side by side] • Steps in both: create random initial population, evaluate population, vary, insert into population • GA: select individuals for variation before varying • ES: selection after variation
Overview • Genetic Programming • Tree-based Representations • Setting Up a Genetic Programming Run • Example: Wall-Following Robot • Result
Genetic Programming (GP) • GP is a domain-independent method that evolves a population of programs to solve a problem. • GP uses variable-size tree representations rather than the fixed-length binary strings of a typical GA. • GP has been successful in many domains.
Tree-based Representations (1/6) • Function set: internal nodes • Functions, predicates, or actions which take one or more arguments • Terminal set: leaf nodes • Program constants, actions, or functions which take no arguments • S-expression: (+ 3 (/ (× 5 4) 7)) • Terminals = {3, 4, 5, 7} • Functions = {+, ×, /}
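A tree like the S-expression on this slide can be held as nested tuples and evaluated recursively. The sketch assumes that the operator missing from the extracted slide text is multiplication (written `*` below); the tuple encoding and the function names are illustrative.

```python
import operator

# Function set (internal nodes); "*" stands in for the operator lost from the slide text.
FUNCTIONS = {"+": operator.add, "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Evaluate a tree given as (function, child, child) tuples with numeric leaves."""
    if isinstance(node, tuple):
        op, *args = node
        return FUNCTIONS[op](*(evaluate(arg) for arg in args))
    return node                      # terminal: a constant

def terminals(node):
    """Collect the leaf values of the tree."""
    if isinstance(node, tuple):
        return {t for arg in node[1:] for t in terminals(arg)}
    return {node}

tree = ("+", 3, ("/", ("*", 5, 4), 7))
print(evaluate(tree), sorted(terminals(tree)))
```

The same recursive walk works for any function and terminal set, which is what lets GP treat programs of different shapes uniformly.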
Tree-based Representations (2/6) • Trees can represent: • Arithmetic formula • Logical formula: $(x \wedge \text{true}) \to ((x \vee y) \vee (z \leftrightarrow (x \wedge y)))$ • Program: i = 1; while (i < 20) { i = i + 1 } • And many others depending on the definition of terminals and nonterminals.
Tree-based Representations (3/6)
Tree-based Representations (4/6) • $(x \wedge \text{true}) \to ((x \vee y) \vee (z \leftrightarrow (x \wedge y)))$
Tree-based Representations (5/6) • i = 1; while (i < 20) { i = i + 1 }
Tree-based Representations (6/6) • In GA, ES, EP chromosomes are linear structures (bit strings, integer strings, real-valued vectors, permutations) • Tree-shaped chromosomes are non-linear structures. • In GA, ES, EP the size of the chromosomes is fixed. • Trees in GP may vary in depth and width.
Setting up a GP Run • Users need to specify: • The terminal / function set • The fitness measure • Algorithm parameters to control the run • Population size, maximum number of generations • Selection / variation operators and their probabilities • The termination criterion • Maximum depth of a GP tree, etc.
Example: Wall-Following Robot (Step 1) • Program Representation in GP • Functions • AND (x, y) = 0 if x = 0; else y • OR (x, y) = 1 if x = 1; else y • NOT (x) = 0 if x = 1; else 1 • IF (x, y, z) = y if x = 1; else z • Terminals • Actions: move the robot one cell in one of the directions {north, east, south, west} • Sensory inputs {n, ne, e, se, s, sw, w, nw}: the value is 0 whenever the corresponding cell is free for the robot to occupy; otherwise, 1.
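The four functions translate directly into Python. The small `program` at the end is a hypothetical example made up for this sketch, not one of the evolved programs, and actions are represented simply as direction names. Note that Python evaluates all arguments eagerly, so a real implementation in which actions have side effects would need lazy evaluation of the branches.

```python
def AND(x, y):
    return 0 if x == 0 else y

def OR(x, y):
    return 1 if x == 1 else y

def NOT(x):
    return 0 if x == 1 else 1

def IF(x, y, z):
    return y if x == 1 else z

def program(sensors):
    """Hypothetical program: move east along a wall seen to the north or north-east, else go north."""
    return IF(OR(sensors["n"], sensors["ne"]), "east", "north")

free = {name: 0 for name in ["n", "ne", "e", "se", "s", "sw", "w", "nw"]}
print(program(free), program({**free, "n": 1}))
```

Because every function accepts any terminal or function result as input, any tree built from this set can be executed without a run-time type error.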
Program Example
Example: Wall-Following Robot (Step 1) • Be careful when specifying function / terminal set. • To avoid syntactic error • All programs in the initial population should be valid, executable programs. • The genetic operators during the run should produce valid, executable programs as offspring. • To avoid run-time error • All functions can take any terminal or the results produced by any other functions as input.
Example: Wall-Following Robot (Step 2) • Fitness measures can be • Error between program’s output and the desired output. • Accuracy in recognizing patterns or classifying objects into classes. • Payoff a game-playing program produces. • Often each program is executed over a representative sample of fitness cases. • In our wall-following robot problem • Fitness is the number of cells next to the wall that are visited during 60 steps, summed over 10 independent runs. • Perfect score (320) = one run (32) × 10 randomly chosen starting points
Example: Wall-Following Robot (Step 3) • Genetic operators: crossover (subtree exchange) [Figure: two parent trees built from + nodes with leaves a and b exchange subtrees to form two offspring]
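Subtree exchange can be sketched on trees stored as nested lists, where the first element of each list is the function and the rest are its children. The encoding, the helper names, and the two parent trees are assumptions for illustration, and crossover points are picked uniformly over all nodes, which the slide does not specify.

```python
import copy
import random

def subtree_paths(tree, path=()):
    """Yield the index path of every node (the root has the empty path)."""
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    tree = copy.deepcopy(tree)
    get(tree, path[:-1])[path[-1]] = new
    return tree

def subtree_crossover(a, b, rng):
    """Pick one node in each parent and exchange the subtrees rooted there."""
    pa = rng.choice(list(subtree_paths(a)))
    pb = rng.choice(list(subtree_paths(b)))
    sa, sb = copy.deepcopy(get(a, pa)), copy.deepcopy(get(b, pb))
    return replace(a, pa, sb), replace(b, pb, sa)

def count_nodes(tree):
    return 1 + sum(count_nodes(c) for c in tree[1:]) if isinstance(tree, list) else 1

a = ["+", "a", ["+", "b", "a"]]
b = ["+", ["+", "a", "b"], "b"]
c1, c2 = subtree_crossover(a, b, random.Random(0))
print(c1, c2)
```

The parents are left untouched, and the offspring are always syntactically valid trees because whole subtrees are exchanged.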
Example: Wall-Following Robot (Step 3) • Genetic operators: mutation [Figure: a subtree of the parent tree is replaced to form the offspring; node labels +, /, -, a, b]
Example: Wall-Following Robot (Step 3) • Population size: 5,000 • Termination condition: a perfect solution is found • Creating the next generation • 500 programs (10%) are copied directly into the next generation (elitism). • Tournament selection • 7 programs are randomly selected from the population of 5,000. • The fittest of these 7 programs is chosen. • 4,500 programs (90%) are generated by crossover. • Parents are each chosen by tournament selection. • In this example, mutation was not used.
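The generation scheme above can be sketched as follows. Two points are assumptions: the copied programs are chosen by tournament as well, and each crossover call yields a single child. The tiny integer "programs" in the usage lines are placeholders so that the sketch runs on its own.

```python
import random

def tournament(population, fitness, rng, size=7):
    """Pick `size` programs at random and return the fittest of them."""
    return max(rng.sample(population, size), key=fitness)

def next_generation(population, fitness, crossover, rng, copy_fraction=0.10):
    """Copy a fraction of programs unchanged; fill the rest with crossover offspring."""
    n = len(population)
    n_copy = round(n * copy_fraction)          # 500 of 5,000 in the slide's setting
    new_pop = [tournament(population, fitness, rng) for _ in range(n_copy)]
    while len(new_pop) < n:
        mother = tournament(population, fitness, rng)
        father = tournament(population, fitness, rng)
        new_pop.append(crossover(mother, father, rng))
    return new_pop

# Placeholder usage: integers stand in for programs, and their value is their fitness.
rng = random.Random(1)
population = list(range(50))
offspring = next_generation(population, fitness=lambda p: p,
                            crossover=lambda a, b, r: (a + b) // 2, rng=rng)
print(len(offspring))
```

With a tournament size of 7 the selection pressure is fairly high, since a program only needs to be the best of a small random sample to reproduce.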
Example: Wall-Following Robot (Step 3)
Result (1/5) • Generation 0 • The most fit program (fitness = 92) • Starting in any cell, this program moves east until it reaches a cell next to the wall; then it moves north until it can move east again or it moves west and gets trapped in the upper-left cell.
Result (2/5) • Generation 2 • The most fit program (fitness = 117) • Smaller than the best one of generation 0, but it does get stuck in the lower-right corner.
Result (3/5) • Generation 6 • The most fit program (fitness = 163) • Follows the wall perfectly but still gets stuck in the bottom-right corner.
Result (4/5) • Generation 10 • The most fit program (fitness = 320) • Follows the wall around clockwise and moves south to the wall if it doesn’t start next to it.
Result (5/5) • Fitness Curve • Fitness as a function of generation number • The progressive (but often small) improvement from generation to generation