Evolving a Solution: Developmental Plasticity in Evolutionary Computation
URS Presentation – April 17th, 2009
Sara Lahr
THE BIG PICTURE
• Some problems are difficult to solve
• They may have many factors
• They may have a solution that is always in flux
• Predicting Financial Markets
• Discovering the Contagiousness of Diseases
• Two excellent tools:
  • Evolutionary Computation
  • Genetic Programming
OUTLINE
• Background Terms
  • Evolutionary Computation
  • Genetic Programming
  • N-gram GP
• The Problem: Definition and Solution
• Incremental Fitness Development
• Results
• Conclusion
EVOLUTIONARY COMPUTATION
Evolutionary Computation is a field of Artificial Intelligence that uses the growth and development of a population of individuals to find solutions to a given problem.
HOW DOES IT WORK?
• You have a problem that you don’t know how to solve.
• You can recognize a good solution if you see it.
• Generate a population of random solutions:
  1. ADBCE
  2. ACBDE
  3. ADEBC
  4. ABCED
• Find the better individuals and generate a new population based on them. Repeat.
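The generate, select, and repeat loop described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the system used in the talk: the `TARGET` ordering, the position-matching `fitness`, the swap-based `mutate`, and the keep-the-better-half selection are all assumptions chosen to keep the example small.

```python
import random

TARGET = "ABCDE"  # hypothetical "good solution", used only to score candidates


def fitness(candidate):
    """Count positions that agree with TARGET (higher is better)."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(candidate, rng):
    """Return a copy of the candidate with two random positions swapped."""
    items = list(candidate)
    i, j = rng.sample(range(len(items)), 2)
    items[i], items[j] = items[j], items[i]
    return "".join(items)


def evolve(pop_size=4, generations=50, seed=0):
    rng = random.Random(seed)
    # Generate a population of random solutions (random orderings of the letters).
    population = ["".join(rng.sample(list(TARGET), len(TARGET)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Find the better individuals...
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # ...and generate a new population based on them.
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
```

Because the better half is carried over unchanged, the best score in the population never decreases from one generation to the next.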
GENETIC PROGRAMMING
Genetic Programming is a subset of Evolutionary Computation (EC) where each individual is an actual computer program.
N-GRAM GP
N-gram GP is an extension of Genetic Programming (GP) where the programs are linear sequences of instructions generated from probabilities learned over time.
N-GRAM GP
• An N-gram is a group of n consecutive elements in a sequence. def and efg are both 3-grams of defg.
• Each item in the N-gram is an instruction. d may be ADD, e may be SUB, etc.
• A matrix holds the probability of a given instruction appearing.
• Given instruction d followed by instruction e, the matrix determines the next instruction based on evolved probabilities.
• Existing good program: dedededef
• Building a new program: after de, the next instruction is
  • d with probability 0.75
  • f with probability 0.25
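The numbers on this slide can be reproduced by counting which instruction follows each two-instruction context in the existing good program. The sketch below assumes plain frequency counting over a single program; in N-gram GP the probabilities are evolved over many generations, so this only illustrates what the matrix stores. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(sequence, n):
    """All groups of n consecutive elements in the sequence."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def next_instruction_probs(program, context_len=2):
    """Map each context to the relative frequency of the instruction that follows it."""
    counts = defaultdict(Counter)
    for gram in ngrams(program, context_len + 1):
        context, following = gram[:context_len], gram[context_len]
        counts[context][following] += 1
    return {
        context: {ins: n / sum(c.values()) for ins, n in c.items()}
        for context, c in counts.items()
    }


probs = next_instruction_probs("dedededef")
```

In dedededef the context de occurs four times, followed by d three times and by f once, which gives the 0.75 and 0.25 shown on the slide.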
THE SYMBOLIC REGRESSION PROBLEM
GOAL: Find a function that fits a given set of data points.
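One common way to score a candidate in symbolic regression is to sum its error over the data points. The sketch below assumes absolute error and a hypothetical data set sampled from 2x²; the talk does not state which error measure or data were used.

```python
def total_error(candidate, points):
    """Sum of absolute differences between the candidate's output and the data."""
    return sum(abs(candidate(x) - y) for x, y in points)


# Hypothetical data points, sampled from 2x^2 purely for illustration.
points = [(x, 2 * x * x) for x in range(-3, 4)]


def exact(x):
    return 2 * x * x


def rough(x):
    return 2 * x


exact_error = total_error(exact, points)   # fits every point
rough_error = total_error(rough, points)   # misses most points
```

A lower total error means a better fit, so an evolutionary search would prefer `exact` over `rough`.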
N-GRAM GP WEAKNESSES
• Convergence
  • The system makes a specific instruction consistently dominant
• Inflexible
  • d may be important early, but f may be vital to the program later
• Not Modular
  • Creates large loops of converged instructions
INCREMENTAL FITNESS DEVELOPMENT
• A technique we developed for N-gram GP.
• A block of instructions is generated by the probability matrix and appended to the end of the program. If the fitness of the extended program gets worse, the block is thrown away and a new block is generated.
INCREMENTAL FITNESS DEVELOPMENT
[Figure: a new block (A B C) is appended to the original program. The block A B C is discarded and a new block (C D B) is tried. C D B is kept, and a further new block (A E B) is generated.]
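The append, test, and discard cycle of Incremental Fitness Development can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: `make_block` stands in for sampling from the probability matrix, fitness is expressed as an error to be minimized, and the `max_blocks` and `max_tries` limits are hypothetical stopping rules. The toy problem (instructions are integers, the program's output is their sum) is also invented for the example.

```python
import random


def incremental_fitness_development(program, make_block, error,
                                    max_blocks=5, max_tries=20):
    """Grow a program block by block, keeping only blocks that do not hurt fitness."""
    current = list(program)
    current_error = error(current)
    for _ in range(max_blocks):
        for _ in range(max_tries):
            candidate = current + make_block()
            candidate_error = error(candidate)
            if candidate_error <= current_error:
                # The extended program is no worse: keep the block.
                current, current_error = candidate, candidate_error
                break
            # Otherwise the block is thrown away and a new one is generated.
    return current


# Toy problem (assumed): make the instructions sum to TARGET.
rng = random.Random(0)
TARGET = 10


def toy_error(program):
    return abs(TARGET - sum(program))


def toy_block():
    # Uniform random stand-in for a block drawn from the probability matrix.
    return [rng.choice([-1, 1, 2]) for _ in range(3)]


start = [1, 2]
result = incremental_fitness_development(start, toy_block, toy_error)
```

The original program always survives as a prefix of the result, and the error of the result can never exceed the error of the starting program.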
N-GRAM GP WEAKNESSES Versus Incremental Fitness Development
• Convergence
  • IFD is a more meticulous searcher; it maps out local possibilities
• Inflexible
  • IFD is more flexible; it is able to find the less probable instructions
• Not Modular
  • IFD is able to split the function into valuable portions
RESULTS *Number of successful runs out of 100 independent trials
CONCLUSIONS
• Theoretical work is necessary for improving existing tools.
• Incremental Fitness Development is a successful extension of N-gram GP.
• The more successful the system, the more reliable the solutions.
• The more reliable the solutions, the more useful they are in application.
REFERENCES
• J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.
• N. F. McPhee, E. Crane, S. E. Lahr, and R. Poli. Developmental Plasticity in Linear Genetic Programming. In GECCO '09: Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation, Montreal, 2009.
• R. Poli and N. McPhee. A linear estimation-of-distribution GP system. In M. O’Neill, L. Vanneschi, S. Gustafson, A. I. Esparcia Alcazar, I. De Falco, A. Della Cioppa, and E. Tarantino, editors, Proceedings of the 11th European Conference on Genetic Programming, EuroGP 2008, volume 4971 of Lecture Notes in Computer Science, pages 206–217, Naples, 26-28 Mar. 2008. Springer.
• R. Poli, W. B. Langdon, and N. F. McPhee. A Field Guide to Genetic Programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza.)

ACKNOWLEDGEMENTS
Thanks to the Morris Academic Partnership program and to all who helped me in this work, especially Nic McPhee, Ellery Crane (UMM ’06), and Riccardo Poli.

QUESTIONS?
N-GRAM GP TEXT GENERATIONS
• Romeo and Juliet (Shakespeare)
  • “JULIET If I do I drink to thee Had I it written I would not let us forth So that my father that went hence so fast ?”
• Wikipedia: Genetic Programming and N-gram
  • “In addition because of the best parsers of English currently in existence are roughly of this idea often say this approach is overly broad in scope .”
• Declaration of Independence
  • “We hold these truths to be totally dissolved and that all political connection between them and the state remaining in the meantime exposed to all the dangers of invasion from without and convulsions within .”
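Text like the samples above can be produced by sampling each next word from those that followed the same preceding words in a source text. The sketch below is an assumption-laden illustration: it uses word-level 3-grams, a tiny made-up corpus, and hypothetical function names, and it is not the generator used for the slide.

```python
import random
from collections import defaultdict


def build_model(words, n=3):
    """Map each (n-1)-word context to the list of words observed after it."""
    model = defaultdict(list)
    for i in range(len(words) - n + 1):
        context = tuple(words[i:i + n - 1])
        model[context].append(words[i + n - 1])
    return model


def generate(words, length=12, n=3, seed=0):
    """Start from the first n-1 words, then repeatedly sample a follower."""
    rng = random.Random(seed)
    model = build_model(words, n)
    output = list(words[:n - 1])
    while len(output) < length:
        followers = model.get(tuple(output[-(n - 1):]))
        if not followers:
            break  # this context never appeared with a follower in the source
        output.append(rng.choice(followers))
    return output


# Made-up corpus for illustration only.
corpus = "the cat sat on the mat and the cat ran to the mat".split()
sample = generate(corpus)
```

Every run of three consecutive words in the output also occurs in the source, which is why the generated text looks locally plausible while drifting globally.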
MODULARITY
• Program using standard N-gram GP: less modular
• Program using Incremental Fitness Development: more modular
N-GRAM GP
• A matrix holds the probability of a given instruction appearing.
• Instruction
  • Start
  • READ_IN: Reg1
  • READ_IN: Reg2
  • ADD: Reg2
  • MULT: Reg1
  • SWAP
• Answer: 2x²
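The slide does not define what each instruction does, so the interpreter below is one possible reading, with every semantic choice an assumption: `READ_IN` copies the input x into the named register, `ADD` and `MULT` combine Reg1 and Reg2 and store the result in the named register, `SWAP` exchanges the two registers, `Start` is a marker with no effect, and the output is read from Reg2. The answer on the slide is read here as 2x².

```python
PROGRAM = [
    "Start",
    "READ_IN: Reg1",
    "READ_IN: Reg2",
    "ADD: Reg2",
    "MULT: Reg1",
    "SWAP",
]


def run(program, x):
    """Execute the instruction list on input x under the assumed semantics."""
    reg1 = reg2 = 0
    for op in program:
        if op == "Start":
            continue                    # assumed: start marker, no effect
        elif op == "READ_IN: Reg1":
            reg1 = x
        elif op == "READ_IN: Reg2":
            reg2 = x
        elif op == "ADD: Reg2":
            reg2 = reg1 + reg2          # assumed: Reg2 <- Reg1 + Reg2
        elif op == "MULT: Reg1":
            reg1 = reg1 * reg2          # assumed: Reg1 <- Reg1 * Reg2
        elif op == "SWAP":
            reg1, reg2 = reg2, reg1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return reg2                         # assumed: output is read from Reg2


outputs = [run(PROGRAM, x) for x in range(-3, 4)]
```

Under these assumptions the program computes x + x = 2x in Reg2, multiplies it by x in Reg1, and swaps the product into the output register.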