Explore the different varieties of Genetic Programming (GP) structures and mutation operators applied in tree-based GP. Learn about Linear GP, Evolutionary Program Induction, Developmental Genetic Programming, and Machine Language in GP.
Different Varieties of Genetic Programming Je-Gun Joung
9.1 GP with Tree Genomes • Mutation Operators Applied in Tree-based GP
Point Mutation • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
Permutation • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
Hoist • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
Expansion Mutation • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
Collapse Subtree Mutation • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
Subtree Mutation • [Figure: an example tree over +, *, -, x, and 1, before and after the operator is applied]
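The mutation operators named on these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trees are nested tuples `(operator, left, right)` over the functions +, -, * and the terminals x and 1 that appear in the figures, and every helper name (`subtrees`, `replace_at`, `random_tree`, and the operator functions) is hypothetical rather than taken from any GP system.

```python
import random

# Assumed encoding: a tree is a terminal ("x" or 1) or a tuple (operator, left, right).
FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["x", 1]

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node, root first."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(parts)

def random_tree(rng, depth):
    """Grow a small random tree (hypothetical helper)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def inner_nodes(tree):
    return [(p, n) for p, n in subtrees(tree) if isinstance(n, tuple)]

def point_mutation(tree, rng):
    """Exchange one node for a different node of the same class."""
    path, node = rng.choice(list(subtrees(tree)))
    if isinstance(node, tuple):
        op = rng.choice([f for f in FUNCTIONS if f != node[0]])
        return replace_at(tree, path, (op,) + node[1:])
    return replace_at(tree, path, rng.choice([t for t in TERMINALS if t != node]))

def permutation(tree, rng):
    """Swap the two arguments of one function node."""
    path, (op, left, right) = rng.choice(inner_nodes(tree))
    return replace_at(tree, path, (op, right, left))

def hoist(tree, rng):
    """Make a proper subtree the new individual."""
    return rng.choice([n for p, n in subtrees(tree) if p])

def expansion(tree, rng):
    """Exchange one terminal for a random subtree rooted at a function."""
    leaves = [p for p, n in subtrees(tree) if not isinstance(n, tuple)]
    grown = (rng.choice(FUNCTIONS), random_tree(rng, 1), random_tree(rng, 1))
    return replace_at(tree, rng.choice(leaves), grown)

def collapse(tree, rng):
    """Exchange one subtree for a random terminal."""
    path, _ = rng.choice(inner_nodes(tree))
    return replace_at(tree, path, rng.choice(TERMINALS))

def subtree_mutation(tree, rng):
    """Exchange one subtree for a random subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace_at(tree, path, random_tree(rng, 2))
```

Passing an explicit `random.Random` instance keeps runs reproducible. The operators assume the individual has at least one function node, which holds for the trees shown in the figures.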
Crossover Operators Applied within Tree-based GP • Module Crossover
9.2 GP with Linear Genomes • Linear GP acts on linear genomes, such as program code represented by bit strings or code for register machines. • The influence of a change in a linear structure can be expected to follow the linear order in which the instructions are executed. • In tree-based GP, all operators uniformly select nodes from a tree. • In linear GP, all operators uniformly select nodes from a sequence.
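Selecting uniformly from a sequence, as linear GP operators do, can be sketched as follows. The sketch assumes a program is a Python list of instruction strings; the function names `linear_mutation` and `linear_crossover` are illustrative, and crossover assumes both parents hold at least two instructions.

```python
import random

def linear_mutation(program, instruction_set, rng):
    """Replace one uniformly chosen instruction with a random one."""
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(instruction_set)
    return child

def linear_crossover(a, b, rng):
    """Exchange the tails of two parents at uniformly chosen cut points.

    Cut points lie strictly inside each parent, so neither child is empty.
    """
    cut_a = rng.randrange(1, len(a))
    cut_b = rng.randrange(1, len(b))
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]
```

Because the cut points are chosen independently, the children may differ in length from their parents, while the total number of instructions across both children stays the same.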
9.2.1 Evolutionary Program Induction with Introns • Wineberg and Oppacher [1994] have formulated an evolutionary programming method that they call EPI (evolutionary program induction). • They use fixed-length strings to code their individuals and a GA-like crossover. • The code is constructed to maintain a fixed structure within the chromosome that allows similar alleles to compete against each other at a locus during
9.2.2 Developmental Genetic Programming • Developmental genetic programming (DGP) is an extension of GP by a developmental step. • In tree-based GP, the space of genotypes (search space) is usually identical to the space of phenotypes (solution space). • DGP maps binary sequences (genotypes) through a developmental process into separate phenotypes.
The Genotype-Phenotype Mapping • [Figure: the genotype-phenotype mapping (GPM) takes a genotype from the unconstrained search space to a phenotype in the constrained solution space; the constraints are implemented in the mapping]
9.2.3 An Example: Evolution in C • Symbolic function regression
An Example Result • Runs lasted for 50 generations at most, with a population size of 500 individuals. • One experimental run produced the genotype 1100 0010 1000 0111 1001 0010 1101 1001 0111 1100 0000 1011 1001 1110 1001 1010 1101 0011 1100 1111 0101 1010 0110 1110 0001 • The raw symbol sequence is T*(a)*R)aE+C)E)SRDT)vSqE* • Repairing transforms this illegal sequence into {T((a)*R(a+m)+(S(D((v+q+D} • This sequence is unfinished, so repairing terminates by completing it into {T((a)*R(a+m))+(S(D((v+q+D(m)))))}
• Finally, editing produces the C function double ind(double m, double v, double a) {return T((a)*R(a+m))+(S(D((v+q+D(m))))); } • A C compiler takes over to generate an executable that is valid on the underlying hardware platform. • This executable is the final phenotype encoded by the genotype.
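9.2.4 Machine Language • The polynomial (x-1)^2+(x-1)^3 as a linear program with input x and output y: 1: x=x-1 2: y=x*x 3: x=x*y 4: y=x+y • Figure 9.13 [Figure: graph of the program, from the input x through the -1, *, *, and + operations to the output y]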
[Figure: the representation of (x-1)^2+(x-1)^3 in a tree-based genome]
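The equivalence of the linear and the tree-based representation of (x-1)^2+(x-1)^3 can be checked with a small sketch. The instruction format `(destination, operator, source1, source2)`, the register dictionary, and the names `run_linear` and `eval_tree` are assumptions made for illustration; only the four instructions themselves come from the slide.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# The four-instruction register program from the slide.
PROGRAM = [
    ("x", "-", "x", 1),    # 1: x = x - 1
    ("y", "*", "x", "x"),  # 2: y = x * x
    ("x", "*", "x", "y"),  # 3: x = x * y
    ("y", "+", "x", "y"),  # 4: y = x + y
]

def run_linear(program, x):
    """Execute the instructions in order on registers x and y; return y."""
    regs = {"x": x, "y": 0}
    for dest, op, a, b in program:
        va = regs[a] if isinstance(a, str) else a  # register name or constant
        vb = regs[b] if isinstance(b, str) else b
        regs[dest] = OPS[op](va, vb)
    return regs["y"]

# The same polynomial as a tree: (x-1)*(x-1) + ((x-1)*(x-1))*(x-1).
X1 = ("-", "x", 1)
TREE = ("+", ("*", X1, X1), ("*", ("*", X1, X1), X1))

def eval_tree(tree, x):
    """Recursively evaluate a nested-tuple expression tree."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](eval_tree(left, x), eval_tree(right, x))
    return x if tree == "x" else tree

print(run_linear(PROGRAM, 3), eval_tree(TREE, 3))  # prints "12 12"
```

The linear program reuses the intermediate value x-1 through its registers, whereas the tree has to repeat the subtree for x-1 five times.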
Reasons for Using Machine Code in GP, as Opposed to Higher-level Languages • The most efficient optimization can be done at the machine code level. • High-level tools might simply not be available for a target processor. • It could be more convenient to let the computer evolve small pieces of machine code itself than to learn to master machine code programming.
Reasons for Using Binary Machine Code • The GP algorithm can be made very fast by keeping the individual programs in the population in binary machine code. • The system is also much more memory-efficient than a tree-based GP system. • An additional advantage is that memory consumption is stable during evolution, with no need for garbage collection.
The JB Language • Instruction codes: 0 = BLOCK (group statements), 1 = LOOP, 2 = SET, 3 = ZERO (clear), 4 = INCREMENT • Individual genome: 0 0 13 1 91 2 14 1 7 • [Figure annotations on the genome: block, stat. 1, stat. 2; register1 = 0; repeat stat. 1, register2; register1 = register1+1]
The GEMS System • One of the most extensive systems for evolution of machine code is the GEMS system [Crepeau, 1995]. • The system includes an almost complete interpreter for the Z-80 8-bit microprocessor. • The Z-80 has 691 different instructions, and GEMS implements 660 instructions. • It has so far been used to evolve a “hello world” program consisting of 58 instructions.
9.3 GP with Graph Genomes • 9.3.1 PADO • PADO (Parallel Algorithm Discovery and Orchestration) [Teller and Veloso, 1995] is a graph-based GP system. • Each program has a stack and an indexed memory for its own use of intermediate values and for communication. • A program also contains the following special nodes: • Start node • Stop node • Subprogram calling nodes • Library subprogram calling nodes
The Representation of a Program and Subprogram in PADO • Fig 9.19 [Figure: a main program graph with START and STOP nodes, a stack, and an indexed memory; a subprogram (private or public) with its own START and STOP nodes]
9.4 Other Genomes • 9.4.1 STROGANOFF • Iba, Sato, and deGaris [1995] have introduced a more complicated structure into the nodes of a tree that could represent a program. • They base their approach on the well-known Group Method of Data Handling (GMDH). • Understanding STructured Representation On Genetic Algorithms for Nonlinear Function Fitting (STROGANOFF) therefore requires understanding GMDH first. • The STROGANOFF method applies GP crossover and mutation to a population of trees of polynomial nodes.
Group Method of Data Handling (GMDH) • [Figure: a network in which the polynomial nodes P1 to P4 combine the input variables X1 to X5]
Crossover of trees of GMDH • [Figure: two parent trees built from the polynomial nodes P1, P2, P4 and Pa, Pb, Pc over the inputs X1 to X5, shown before and after crossover]
Different Mutation of trees of GMDH • [Figure: an original tree with the nodes P1 to P4 over the inputs X1 to X5, and four mutated variants labeled (a) to (d)]
9.4.2 GP Using Context-Free Grammars • By the use of a context-free grammar, typing and syntax are automatically assured throughout the evolutionary process. • A context-free grammar can be considered a four-tuple: a set of non-terminal symbols, a set of terminal symbols, a set of production rules, and a start symbol. • Definition 9.2 A terminal of a context-free grammar is a symbol for which no production rule exists in the grammar. • Definition 9.3 A production rule is a substitution of the kind X → Y, where X is a non-terminal symbol and Y is a sequence of terminal and non-terminal symbols.
A Grammatical Structure • S: the start symbol • B: a binary expression • T: a terminal • x and 1: variables and a constant • [Figure: a derivation tree that starts at S, expands B nodes through the operators +, *, and -, and ends in T nodes that yield x or 1]
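How a grammar keeps every individual syntactically valid can be sketched as below. The production rules are an assumption read off the figure labels (S expands to B; B expands to an operator applied to two B's, or to T; T expands to x or 1), and the names `GRAMMAR`, `expand`, and `is_well_formed` are hypothetical.

```python
import random

# Assumed productions, in prefix notation.
GRAMMAR = {
    "S": [["B"]],
    "B": [["+", "B", "B"], ["-", "B", "B"], ["*", "B", "B"], ["T"]],
    "T": [["x"], ["1"]],
}

def expand(symbol, rng, depth=0, max_depth=4):
    """Derive a list of terminal symbols from symbol by random rule choice."""
    if symbol not in GRAMMAR:
        return [symbol]  # no production rule exists: a terminal (Definition 9.2)
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        # Keep only the shortest rules so that the derivation terminates.
        shortest = min(len(rule) for rule in rules)
        rules = [rule for rule in rules if len(rule) == shortest]
    out = []
    for s in rng.choice(rules):
        out.extend(expand(s, rng, depth + 1, max_depth))
    return out

def is_well_formed(tokens):
    """Check that tokens form exactly one complete prefix expression."""
    needed = 1  # number of expressions still expected
    for tok in tokens:
        if needed == 0:
            return False
        needed += 1 if tok in ("+", "-", "*") else -1
    return needed == 0

tokens = expand("S", random.Random(0))
print(" ".join(tokens), is_well_formed(tokens))
```

Because each expression is built only by applying production rules, `is_well_formed` holds for every derived individual; genetic operators that exchange subtrees rooted at the same non-terminal preserve this property.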
9.4.3 Genetic Programming of L-Systems • Lindenmayer systems (also known as L-systems) [Lindenmayer, 1968][Prusinkiewicz and Lindenmayer, 1990] have been introduced independently into the area of genetic programming by different researchers [Koza, 1993][Jacob, 1994][Hemmi et al., 1994]. • L-systems were invented for the purpose of modeling biological structure formation. • Rewriting all non-terminals in parallel is important in this respect. • L-systems in their simplest form (0L-systems) are context-free grammars whose production rules are applied not sequentially but simultaneously to the growing tree of non-terminals.
Context-free L-system • [Figure: an individual encoding a production rule system of Lindenmayer type; the 0L-System root holds an axiom (AxiomA) and a list of LRule nodes, each with a predecessor (pred) and a successor (succ)]
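The parallel rewriting that distinguishes a 0L-system from sequential grammar derivation can be sketched in a few lines. The example rules are Lindenmayer's classic algae system (A becomes AB, B becomes A), which does not appear on the slides and is used here only as an illustration; the names `rewrite` and `derive_lsystem` are hypothetical. The rule dictionary maps each predecessor to its successor, mirroring the pred/succ pairs of the individual encoding.

```python
def rewrite(word, rules):
    """One derivation step: replace every symbol simultaneously.

    Symbols without a rule are copied unchanged.
    """
    return "".join(rules.get(symbol, symbol) for symbol in word)

def derive_lsystem(axiom, rules, steps):
    """Apply the parallel rewriting step a fixed number of times."""
    word = axiom
    for _ in range(steps):
        word = rewrite(word, rules)
    return word

ALGAE = {"A": "AB", "B": "A"}  # predecessor -> successor

print(derive_lsystem("A", ALGAE, 4))  # prints "ABAABABA"
```

Each step reads only the previous word, so the result does not depend on the order in which symbols are visited.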