6 Crossover - The Center of the Storm
Crossover and Building Blocks • GP crossover mimics the process of sexual reproduction. • GP search is more effective than systems based on random transformations (mutations) of the candidate solutions. • According to the building block hypothesis, GP works faster than systems based only on mutation because good building blocks get combined into ever larger and better building blocks to form better individuals. • Crossover - The Controversy • Does the GP crossover operator outperform mutation-based systems by locating and combining good building blocks, or is GP crossover itself a form of macromutation? • What sorts of improvements may be made to the crossover operator to improve its performance?
A Caveat • This chapter will focus at length on the undoubted shortcomings of the GP crossover operator. • It is important, nevertheless, to remember that something is going on with GP crossover. • GP crossover already has a substantial record of accomplishment. • Chapter Overview • The theoretical bases for both the building block hypothesis and the notion that GP crossover is really a macromutation operator • The empirical evidence about the effect of crossover • Several promising directions for improving GP crossover
The Theoretical Basis for the Building Block Hypothesis in GP • The schema theorem of Holland is one of the most influential and debated theorems in evolutionary algorithms in general and genetic algorithms in particular. • The schema theorem for fixed-length genetic algorithms states that good schemata will tend to multiply exponentially in the population as the genetic search progresses and will thereby be combined into good overall solutions with other such schemata. • However, the GP case is much more complex because GP uses representations of varying length and allows genetic material to move from one place to another in the genome. • The crucial issue in the schema theorem is the extent to which crossover tends to disrupt or to preserve good schemata.
Koza’s Schema Theorem Analysis • A schema is a set of subtrees that contains (somewhere) one or many subtrees from a special schema defining set. • Koza’s argument is informal, and he does not suggest an ordering or length definition for his schemata. • Koza’s statement that GP crossover tends to preserve, rather than disrupt, good schemata depends crucially on the GP reproduction operator. • Good schemata will be tested and combined by the crossover operator more often than poorer schemata. • This process results in the combination of smaller but good schemata into bigger schemata and, ultimately, good overall solutions.
O’Reilly’s Schema Theorem Analysis • O’Reilly defines her schemata similarly to Koza but with the presence of a don’t care symbol (#) in one or more subtrees. • The order of a schema is the number of nodes which are not # symbols. • The length is the number of links in the tree fragments plus the number of links connecting them. • Notation • f(H,t): mean fitness of all instances of a certain schema H • f̄(t): average fitness in generation t • E[m(H,t)]: the expected value of the number of instances of H • Pd(H,t): the maximum probability of disruption • pc: crossover probability
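The symbols on this slide fit a Holland-style bound of the following general shape. This is a reconstruction from the listed definitions, not a verbatim copy of the equation on the original slide; m(H,t), the number of instances of H in generation t, and the bar notation for the average fitness are assumed here.

```latex
E[m(H,t+1)] \;\ge\; m(H,t)\,\frac{f(H,t)}{\bar{f}(t)}\,\bigl(1 - p_c\,P_d(H,t)\bigr)
```

Read this way, a schema with above-average fitness grows in expectation only to the extent that the disruption term p_c P_d(H,t) stays small, which is why the rest of the chapter concentrates on disruption.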
Whigham’s Schema Theorem Analysis • Whigham has formulated a definition of schemata in his grammar-based GP system. • This approach leads to a simpler equation for the probability of disruption than O’Reilly’s approach. • Newer Schema Theorems • Poli and Langdon have formulated a new schema theorem that asymptotically converges to the GA schema theorem. • The result of their study suggests that there might be two different phases in a GP run: a first phase completely depending on fitness, and a second phase depending on fitness and structure of the individual (e.g., schema defining length). • Rosca’s schema theorem for rooted-tree schemata
Inconclusive Schema Theorem Results for GP • None of the existing formulations of a GP schema theorem predicts with any certainty that good schemata will propagate during a GP run. • The principal problem is the variable length of the GP representation. • In the absence of a strong theoretical basis for the claim that GP crossover is more than a macromutation operator, it is necessary to turn to other approaches.
Preservation and Disruption of Building Blocks: A Gedanken Experiment • Crossover as a Disruptive Force • As GP becomes more and more successful in assembling small building blocks into larger and larger blocks, the whole structure becomes more and more fragile because it is more prone to being broken up by subsequent crossover.
Assume that our building block is almost a perfect program. • But in this case, just before success, the probability that the perfect solution will be disrupted by crossover is 10/11 or 90.9%.
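The 10/11 figure can be checked with a few lines of arithmetic. The sketch below assumes a program with 11 equally likely crossover points, 10 of which lie inside the building block; the slide does not state this setup, so the counts are chosen only to match the quoted probability, and `disruption_probability` is a hypothetical helper name.

```python
from fractions import Fraction


def disruption_probability(block_points: int, total_points: int) -> Fraction:
    """Chance that a uniformly chosen crossover point lands inside the block.

    Assumption: every crossover point is equally likely, and any cut inside
    the block counts as a disruption.
    """
    if total_points <= 0 or not 0 <= block_points <= total_points:
        raise ValueError("need 0 <= block_points <= total_points and total_points > 0")
    return Fraction(block_points, total_points)


# The larger the block grows relative to the program, the more fragile it is.
for block in (1, 5, 10):
    p = disruption_probability(block, 11)
    print(f"block covers {block}/11 points -> disruption {float(p):.1%}")
```

The loop illustrates the point of the gedanken experiment: under this assumption the disruption risk rises in step with the share of the program that the block occupies.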
The conclusion is inevitable: the crossover operator is a disruptive force as well as a constructive force - putting building blocks together and then tearing them apart. • The balance is impossible to measure with today’s techniques. • It is undoubtedly a dynamic equilibrium that changes during the course of evolution. • We note, however, that for most runs, measured destructive crossover rates stay high until the very end.
Reproduction and Crossover • The good building blocks in individuals duplicated by the reproduction operator will have many chances to try to find crossovers that are not disruptive. • This argument depends on the assumption that the high quality of the building block will somehow be reflected in the quality of the individual in which it appears. • It also depends on the balance between the reproduction operator and the destructive effects of crossover at any given time in a run. • Schema Theorem Analysis Is Still Inconclusive. • It is impossible to predict with any certainty yet whether GP crossover is only a macromutation operator or something more.
Empirical Evidence of Crossover Effects • The Effect of Crossover on the Fitness of Offspring • The effect of crossover on the relative fitness of parents and their offspring • How can we measure the effect of crossover? It is not entirely clear what should be measured.
Two basic approaches to measuring the effect of crossover • The average fitness of all parents is compared with the average fitness of all offspring. • The fitness of children and parents is compared on an individual basis. • The Result of Measuring the Effect of Crossover • In all three cases (tree-based GP, linear GP, and graph GP), crossover has an overwhelmingly negative effect on the fitness of the offspring of the crossover. • The conclusion is compelling: crossover routinely reduces the fitness of offspring substantially relative to their parents in almost every GP system.
The Relative Merits of Program Induction via Crossover versus Hill Climbing or Annealing • Headless Chicken Crossover • Only one parent is selected, and an entirely new individual is created randomly. The selected parent is then crossed over with the new, randomly created individual. • The offspring is kept if it is better than or equal to the parent in fitness. Otherwise, it is discarded. Thus, headless chicken crossover is a form of hill climbing. • Mutation techniques may perform as well as and sometimes slightly better than traditional GP crossover.
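The headless chicken procedure described on this slide can be sketched as follows. This is a minimal illustration, not the implementation used in the cited experiments: individuals are plain bit lists rather than program trees, higher fitness is assumed to be better, and `one_point_crossover`, `count_ones` and `random_bits` are hypothetical stand-ins.

```python
import random


def one_point_crossover(a, b, rng):
    """Child = a prefix of `a` followed by a suffix of `b` (linear genomes)."""
    cut_a = rng.randint(0, len(a))
    cut_b = rng.randint(0, len(b))
    return a[:cut_a] + b[cut_b:]


def headless_chicken_step(parent, fitness, random_individual, rng):
    """Cross the parent with a freshly generated random individual.

    The child is kept only if it is at least as fit as the parent,
    which makes the procedure a form of hill climbing.
    """
    partner = random_individual(rng)
    child = one_point_crossover(parent, partner, rng)
    return child if fitness(child) >= fitness(parent) else parent


# Toy problem (assumption): maximise the number of ones in a bit list.
def count_ones(genome):
    return sum(genome)


def random_bits(rng, n=16):
    return [rng.randint(0, 1) for _ in range(n)]


rng = random.Random(0)
current = random_bits(rng)
start = count_ones(current)
for _ in range(200):
    current = headless_chicken_step(current, count_ones, random_bits, rng)
print("fitness before:", start, "after:", count_ones(current))
```

Because the second parent carries no inherited information, any progress this loop makes cannot come from recombining building blocks found earlier in the run, which is what makes it a useful control for ordinary crossover.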
Crossover vs. Non-Population-Based Operators • Mutate-simulated annealing (mutate-SA) and crossover-hill climbing • If the new solution has higher fitness, it replaces the original solution. Otherwise, it is discarded in crossover-hill climbing but kept probabilistically in mutate-SA. • The mutate-SA and crossover-hill climbing algorithms performed as well as or slightly better than standard GP on a test suite of six different problems. • Crossover seems to create children with large syntactic differences between parents and offspring.
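The two acceptance rules differ only in how they treat a candidate that is not better. The sketch below assumes higher fitness is better and uses the standard Metropolis criterion with a `temperature` parameter for the probabilistic case; the slide does not say which acceptance probability the mutate-SA experiments used, so that choice is an assumption.

```python
import math
import random


def accept_hill_climbing(old_fitness, new_fitness):
    """Crossover-hill climbing: keep the candidate only if it is strictly better."""
    return new_fitness > old_fitness


def accept_annealing(old_fitness, new_fitness, temperature, rng):
    """Mutate-SA: always keep a better candidate, sometimes keep a worse one.

    Assumption: Metropolis acceptance, exp(delta / temperature) with delta <= 0.
    """
    if new_fitness > old_fitness:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp((new_fitness - old_fitness) / temperature)


rng = random.Random(0)
print(accept_hill_climbing(3.0, 2.5))          # a worse candidate is discarded
print(accept_annealing(3.0, 2.5, 1.0, rng))    # a worse candidate may survive
```

Neither rule needs a population, which is why these methods serve as a baseline for asking what the population and crossover add.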
Conclusions about Crossover as Macromutation • The empirical evidence lends little credence to the notion that traditional GP crossover is, somehow, a more efficient or better search operator than mutation-based techniques. • There is no serious support for the conclusion that hill climbing outperforms GP. • On the state of the evidence as it exists today, one must conclude that traditional GP crossover acts primarily as a macromutation operator. • The failure of the standard GP crossover operator may be related to the stagnation of GP runs (“bloat” - in other words, the exponential growth of GP “introns”).
Improving Crossover - The Argument from Biology • Biological crossover works in a highly constrained and highly controlled context that has evolved over billions of years. • Crossover may be seen as the result of the evolution of evolvability. • In nature, most crossover events are successful - that is, they result in viable offspring (in standard GP, 25%). • Three principal constraints on biological crossover • Biological crossover takes place only between members of the same species. • Biological crossover occurs with remarkable attention to preservation of “semantics.” • Biological crossover is homologous.
In the basic GP system, any subtree may be crossed over with any other subtree. There is no requirement that the two subtrees fulfill similar functions. • There is no requirement that a subtree, after being swapped, is in a context in the new individual that has any relation to the context in the old individual. • Were GP to develop a good subtree building block, it would be very likely to be disrupted by crossover rather than preserved and spread. • There is no reason to suppose that randomly initialized individuals in a GP population are members of the same species.
Improving Crossover - New Directions • Brood Recombination • Pick two parents from the population. • Perform random crossover on the parents N times, each time creating a pair of children as a result of crossover. • Evaluate each of the children for fitness. Sort them by fitness. Select the best two. • Time-Saving Evaluation • Is Brood Recombination Effective?
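The brood recombination steps listed on this slide can be sketched as below. The `crossover` and `fitness` callables are supplied by the caller, higher fitness is assumed to be better, and the list-based `swap_tails` operator is a toy stand-in for subtree crossover, not part of the original method.

```python
import random


def brood_recombination(parent_a, parent_b, crossover, fitness, brood_size, rng):
    """Cross the same two parents `brood_size` times and keep the best two children."""
    brood = []
    for _ in range(brood_size):
        # Each crossover call yields a pair of children.
        brood.extend(crossover(parent_a, parent_b, rng))
    brood.sort(key=fitness, reverse=True)
    return brood[0], brood[1]


# Toy operator (assumption): swap the tails of two equal-length lists.
def swap_tails(a, b, rng):
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


rng = random.Random(0)
a = [1, 1, 1, 1, 0, 0, 0, 0]
b = [0, 0, 0, 0, 1, 1, 1, 1]
best, second = brood_recombination(a, b, swap_tails, sum, brood_size=20, rng=rng)
print(sum(best), sum(second))
```

Since the brood costs 2N fitness evaluations per mating, passing a cheaper approximate `fitness` here is one way to read the "Time-Saving Evaluation" bullet, though the slide gives no details.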
“Intelligent” Crossover • A Crossover Operator That Learns • Improving the rate of constructive crossover in PADO (a graph-based GP) by letting an intelligent crossover operator learn how to select good crossover points • A Crossover Operator Guided by Heuristics • The performance value for subtrees decides which subtrees are potential building blocks to be inserted into other trees, and which subtrees are to be replaced. • The intelligent operators found regularities in the program structures of very different GP systems: • There are blocks of code that are best left together - perhaps these are building blocks. • These blocks of code have characteristics that can be identified by heuristics or a learning algorithm. • GP produces higher constructive crossover rates and better results when these blocks of code are probabilistically kept together.
Context-Sensitive Crossover • Most crossover does not preserve the context of the code - yet context is crucial to the meaning of computer code. • Strong context preserving crossover (SCPC) permits crossover only between nodes that occupy exactly the same position in the two parents. • Mixing regular crossover and SCPC gave modest improvements in results. • This approach introduced an element of homology into the crossover operator. • Requiring crossover to swap between trees at identical locations is somewhat homologous.
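One way to picture SCPC is to address every node by its path from the root and allow swaps only at paths that exist in both parents. The sketch below assumes trees are nested tuples of the form `(operator, child, child, ...)` with strings as terminals, and it excludes the root so that a swap never just exchanges the whole parents; both choices are illustrative assumptions rather than details taken from the slide.

```python
import random


def paths(tree, prefix=()):
    """Yield the position of every node as a tuple of child indices from the root."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:]):
            yield from paths(child, prefix + (i,))


def get_subtree(tree, path):
    for i in path:
        tree = tree[i + 1]  # index 0 holds the operator symbol
    return tree


def replace_subtree(tree, path, new):
    if not path:
        return new
    i = path[0] + 1
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]


def scpc(parent_a, parent_b, rng):
    """Swap the subtrees found at one position that both parents share."""
    common = sorted((set(paths(parent_a)) & set(paths(parent_b))) - {()})
    if not common:
        return parent_a, parent_b
    path = rng.choice(common)
    sub_a, sub_b = get_subtree(parent_a, path), get_subtree(parent_b, path)
    return replace_subtree(parent_a, path, sub_b), replace_subtree(parent_b, path, sub_a)


a = ("+", ("*", "x", "y"), "z")
b = ("-", "a", ("/", "b", "c"))
child_a, child_b = scpc(a, b, random.Random(0))
print(child_a)
print(child_b)
```

For these two parents only the first and second argument positions of the root are shared, so a swapped subtree always lands in the same argument slot it came from.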
Explicit Multiple Gene Systems • Fitness components are affected by all or some of the genes. • This system is highly theoretical because the fitness of the individual is just the sum of the fitness components. • During evolution, a gene is periodically added. If it improves the fitness of the individual, it is kept; otherwise, it is discarded. • Between gene additions, the population evolves by intergene crossover. • Having multiple fitness functions allows the genes to be more independent or, in biological terms, to be less epistatic.
Explicitly Defined Introns • An integer value (explicitly defined intron value - EDIV) is stored between every two nodes in the GP individual. • The probability that crossover occurs between any two nodes in the GP program is proportional to the integer value between the nodes. • The EDIV vector evolves during the GP run to identify the building blocks in the individual as an emergent phenomenon. • The EDIV values within a good building block should become low and, outside the good block, high. • A variant uses real-valued EDIVs and constrains changes in the EDIVs by a Gaussian distribution of permissible mutations to the EDIVs.
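EDIV-weighted selection of a crossover point can be sketched as follows. The example assumes a linear genome in which `edivs[i]` is the value stored on the link between node `i` and node `i + 1`; `pick_crossover_link` is a hypothetical helper, and the numbers are made up for illustration.

```python
import random


def pick_crossover_link(edivs, rng):
    """Choose a link index with probability proportional to its EDIV."""
    if not edivs or sum(edivs) <= 0:
        raise ValueError("need at least one link with a positive EDIV")
    return rng.choices(range(len(edivs)), weights=edivs, k=1)[0]


# Links 1-3 sit inside a building block and have evolved to zero, so
# crossover can only cut at the block's borders (links 0 and 4).
edivs = [5, 0, 0, 0, 5]
rng = random.Random(0)
draws = [pick_crossover_link(edivs, rng) for _ in range(1000)]
print(sorted(set(draws)))
```

Low values inside a block therefore act as glue: the block is still reachable by crossover as a unit, but it is rarely or never cut in the middle.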
Modeling Other Forms of Genetic Exchange • There are several ways in which individuals exchange genetic material in nature (conjugation, transduction, and transformation) • Conjugation • Simple conjugation in GAs - donor, recipient • To foster the spread of potentially advantageous genetic information, conjugation might be combined with tournament selection. • Multiple conjugation involving n donors could be combined preferentially with n+1-tournament selection.
Improving Crossover - A Proposal • “Homologous” crossover in GP • What result does homologous crossover have? • The mechanisms by which biology causes homology are speciation, almost identical length or structure of DNA between parents, and strict base pairing during crossover. • The reason the mechanism has evolved makes the actual mechanism somewhat irrelevant when changing the medium. • Two parents have a child that combines some of the genome of each parent. • The exchange is strongly biased toward experimenting with exchanging very similar chunks of the genome - specific genes performing specific functions - that have small variations among them.
Mating selection: Two trees are selected randomly. • Measurement of structural similarity: for each edge k in the larger tree, find the subtree imin(k) in the other tree with the smallest structural distance DS(k, imin(k)). • Measurement of structural similarity: • Selection of crossover points:
Improving Crossover - The Tradeoffs • Tradeoffs • Standard GP crossover acts mainly as a macromutation operator. • Much of our discussion has focused on how to improve crossover - how to make it more than a simple macromutation operator. • It is important not to underestimate the power of a simple mutation operator. • Digital Overhead and Homology • There is a cost associated with improving crossover in GP. • This digital overhead may be likened to the large amount of biological energy expended to maintain homologous crossover in nature. • Locating the Threshold
Conclusion • It certainly stands to benefit from improvements through smart mutation or other types of added mechanisms. • The crossover operator will be much more powerful and robust over the next few years. • One of the strongest arguments for the building block hypothesis is the manner in which a GP population adapts to the destructive effects of crossover. • GP individuals tend to accumulate code that does nothing during a run - we refer to such code as introns. • The important point is that the presence of introns underlines how important prevention of destructive crossover is in the GP system. • The challenge in GP for the next few years is to tame the crossover operator and to find the building blocks.