GENETIC PROGRAMMING. John R. Koza Consulting Professor (Biomedical Informatics) Department of Medicine School of Medicine Consulting Professor Department of Electrical Engineering School of Engineering Stanford University Stanford, California 94305 koza@stanford.edu
John R. Koza Consulting Professor (Biomedical Informatics) Department of Medicine School of Medicine Consulting Professor Department of Electrical Engineering School of Engineering Stanford University Stanford, California 94305 koza@stanford.edu http://www.smi.stanford.edu/people/koza/
THE CHALLENGE OF ARTIFICIAL INTELLIGENCE “How can computers learn to solve problems without being explicitly programmed? In other words, how can computers be made to do what is needed to be done, without being told exactly how to do it?” Attributed to Arthur Samuel (1959)
CRITERION FOR SUCCESS “The aim [is] ... to get machines to exhibit behavior, which if done by humans, would be assumed to involve the use of intelligence.” Arthur Samuel (1983)
MAIN POINT No. 1 • Genetic programming now routinely delivers high-return human-competitive machine intelligence
MAIN POINT No. 2 • Genetic programming is an automated invention machine
MAIN POINT No. 3 • Genetic programming has delivered a progression of qualitatively more substantial results in synchrony with five approximately order-of-magnitude increases in the expenditure of computer time
MAIN POINT No. 1 • Genetic programming now routinely delivers high-return human-competitive machine intelligence
“HUMAN-COMPETITIVE” • The result is equal to or better than a human-designed solution to the same problem
NASA EVOLVED ANTENNA X-Band Antenna for NASA's Space Technology 5 Mission in 2004
“HUMAN-COMPETITIVE” • Previously patented, an improvement over a patented invention, or patentable today
DEFINITION OF “HIGH-RETURN” The AI ratio (the “artificial-to-intelligence” ratio) of a problem-solving method is the ratio of that which is delivered by the automated operation of the artificial method to the amount of intelligence that is supplied by the human applying the method to a particular problem
DEFINITION OF “ROUTINE” A problem-solving method is routine if it is general and relatively little human effort is required to get the method to successfully handle new problems within a particular domain and to successfully handle new problems from a different domain.
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP • Toy problems • Human-competitive non-patent results • 20th-century patented inventions • 21st-century patented inventions • Patentable new inventions
REPRESENTATIONS • Decision trees • If-then production rules • Horn clauses • Neural nets • Bayesian networks • Frames • Propositional logic • Binary decision diagrams • Formal grammars • Coefficients for polynomials • Reinforcement learning tables • Conceptual clusters • Classifier systems
PROGRAM TREE (+ 1 2 (IF (> TIME 10) 3 4))
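As an illustration of how such a program tree can be executed, here is a minimal Python sketch. The nested-tuple representation and the `evaluate` function are assumptions made for this example, not part of the original slides; only the operators that appear in the tree above are handled.

```python
# Hypothetical representation: a program tree is a nested tuple
# (operator, child, child, ...); leaves are numbers or variable names.
def evaluate(tree, env):
    """Recursively evaluate a program tree against variable bindings."""
    if isinstance(tree, (int, float)):
        return tree                      # numeric constant
    if isinstance(tree, str):
        return env[tree]                 # variable lookup, e.g. TIME
    op, *args = tree
    if op == "IF":                       # evaluate only the chosen branch
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)
    values = [evaluate(a, env) for a in args]
    if op == "+":
        return sum(values)
    if op == ">":
        return values[0] > values[1]
    raise ValueError(f"unknown operator: {op}")

# (+ 1 2 (IF (> TIME 10) 3 4))
PROGRAM = ("+", 1, 2, ("IF", (">", "TIME", 10), 3, 4))
```

With `TIME` bound to 12 the conditional yields 3 and the program returns 6; with `TIME` bound to 5 it yields 4 and the program returns 7.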
GENETIC PROGRAMMING • Create initial population (random) • Main generational loop • Execute all programs • Evaluate fitness of all programs • Select single individuals or pairs of individuals based on fitness to participate in the genetic operations (mutation, crossover, reproduction, architecture-altering operations) • Termination Criterion
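The loop above can be sketched as a short Python skeleton. This is a sketch under stated assumptions, not Koza's implementation: the function name `evolve`, its parameters, and the operator probabilities are hypothetical; fitness is assumed to be non-negative with higher values better; and architecture-altering operations are left out.

```python
import random

def evolve(population, fitness, mutate, crossover, is_done, rng,
           max_generations=50, p_crossover=0.8, p_mutation=0.1):
    """Run the generational loop; return (best individual, generation reached)."""
    for generation in range(max_generations + 1):
        # Execute all programs and evaluate their fitness.
        scores = [fitness(ind) for ind in population]
        best = population[scores.index(max(scores))]
        # Termination criterion: success, or the generation budget is spent.
        if is_done(best) or generation == max_generations:
            return best, generation
        # Fitness-proportionate selection; the small offset keeps weights positive.
        weights = [s + 1e-9 for s in scores]

        def pick():
            return rng.choices(population, weights=weights, k=1)[0]

        next_population = []
        while len(next_population) < len(population):
            r = rng.random()
            if r < p_crossover:
                # Crossover yields two offspring from two selected parents.
                next_population.extend(crossover(pick(), pick(), rng))
            elif r < p_crossover + p_mutation:
                next_population.append(mutate(pick(), rng))
            else:
                next_population.append(pick())   # reproduction: copy unchanged
        population = next_population[:len(population)]
```

The genetic operations are passed in as callables so that the skeleton stays independent of how programs are represented.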
CREATING RANDOM PROGRAMS • Function Set F = {+, -, *, %, IFLTE} • Terminal Set T = {X, Y, Random-Values}
CREATING RANDOM PROGRAMS • The random programs are: • Of different sizes and shapes • Syntactically valid • Executable
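A minimal sketch of random program creation for the function and terminal sets above. Several details are assumptions for illustration: `%` is treated as a two-argument protected division and `IFLTE` as a four-argument conditional, random constants are drawn from [-1, 1], and the 0.3 probability of stopping early and the depth limit are arbitrary choices.

```python
import random

# Arity of each function in F = {+, -, *, %, IFLTE} (assumed arities).
FUNCTIONS = {"+": 2, "-": 2, "*": 2, "%": 2, "IFLTE": 4}
VARIABLES = ["X", "Y"]

def random_terminal(rng):
    """Pick X, Y, or a fresh random constant with equal probability."""
    choice = rng.choice(VARIABLES + ["RANDOM"])
    if choice == "RANDOM":
        return round(rng.uniform(-1.0, 1.0), 3)
    return choice

def random_tree(rng, max_depth):
    """Grow a tree in which every function node has exactly as many children as its arity."""
    if max_depth == 0 or rng.random() < 0.3:
        return random_terminal(rng)
    name = rng.choice(list(FUNCTIONS))
    children = tuple(random_tree(rng, max_depth - 1) for _ in range(FUNCTIONS[name]))
    return (name,) + children

def is_valid(tree):
    """Check syntactic validity: known functions, correct arity, legal terminals."""
    if isinstance(tree, tuple):
        arity_ok = len(tree) - 1 == FUNCTIONS.get(tree[0], -1)
        return arity_ok and all(is_valid(child) for child in tree[1:])
    return isinstance(tree, float) or tree in VARIABLES
```

Because each function node is always given the right number of arguments, every generated tree is syntactically valid by construction, while the random stopping rule produces trees of different sizes and shapes.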
DARWINIAN SELECTION • Selection based on fitness • Better individual more likely to be selected • Probabilistic selection - Best is not always picked - Worst is not necessarily excluded
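One common way to realize this kind of probabilistic selection is fitness-proportionate ("roulette wheel") selection, sketched below in Python. The slides do not say which selection scheme is used, so this choice is an assumption; the sketch also assumes fitness values are non-negative with higher values better.

```python
import random

def select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitnesses, k=1)[0]
```

Over many draws the fittest individual is chosen most often, but less fit individuals with non-zero fitness are still selected from time to time.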
MUTATION OPERATION • Select 1 parent probabilistically based on fitness • Pick point from 1 to NUMBER-OF-POINTS • Delete subtree at the picked point • Grow new subtree at the mutation point in same way as generated trees for initial random population (generation 0) • The result is a syntactically valid executable program • Put the offspring into the next generation of the population
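The mutation steps above can be sketched as follows, assuming programs are nested tuples and points are numbered from 1 in depth-first order. The helper names and the small `grow` generator are hypothetical stand-ins; in a full system `grow` would be the same routine used to create generation 0.

```python
import random

def count_points(tree):
    """NUMBER-OF-POINTS: every function node and every terminal counts as one point."""
    return 1 + sum(count_points(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def replace_point(tree, index, new_subtree):
    """Return a copy of tree with the point numbered index (1-based, depth-first) replaced."""
    if index == 1:
        return new_subtree
    if not isinstance(tree, tuple):
        raise IndexError("point out of range")
    index -= 1                                   # skip the root of this subtree
    children = list(tree[1:])
    for i, child in enumerate(children):
        size = count_points(child)
        if index <= size:
            children[i] = replace_point(child, index, new_subtree)
            return (tree[0],) + tuple(children)
        index -= size
    raise IndexError("point out of range")

def grow(rng, max_depth=2):
    """Hypothetical stand-in for the generation-0 tree generator."""
    if max_depth == 0 or rng.random() < 0.5:
        return rng.choice(["X", 1, 2])
    return (rng.choice(["+", "*"]), grow(rng, max_depth - 1), grow(rng, max_depth - 1))

def mutate(parent, rng):
    """Delete the subtree at a random point and grow a new one in its place."""
    point = rng.randint(1, count_points(parent))
    return replace_point(parent, point, grow(rng))
```

Since the replacement is itself a complete subtree, the offspring remains syntactically valid wherever the mutation point falls.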
CROSSOVER OPERATION • Select 2 parents probabilistically based on fitness • Randomly pick a number from 1 to NUMBER-OF-POINTS for 1st parent • Independently randomly pick a number for 2nd parent • Identify the subtrees rooted at the two picked points • Swap the two subtrees • The result is a syntactically valid executable program • Put the offspring into the next generation of the population
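A minimal sketch of subtree crossover, again assuming nested-tuple programs with points numbered from 1 in depth-first order. All function names are hypothetical; `crossover_at` takes explicit points so the operation can be followed by hand, and `crossover` picks the two points independently at random.

```python
import random

def count_points(tree):
    """NUMBER-OF-POINTS: every function node and every terminal counts as one point."""
    return 1 + sum(count_points(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def subtree_at(tree, index):
    """Return the subtree rooted at the point numbered index (1-based, depth-first)."""
    if index == 1:
        return tree
    index -= 1
    for child in (tree[1:] if isinstance(tree, tuple) else ()):
        size = count_points(child)
        if index <= size:
            return subtree_at(child, index)
        index -= size
    raise IndexError("point out of range")

def replace_point(tree, index, new_subtree):
    """Return a copy of tree with the subtree at the given point replaced."""
    if index == 1:
        return new_subtree
    index -= 1
    children = list(tree[1:]) if isinstance(tree, tuple) else []
    for i, child in enumerate(children):
        size = count_points(child)
        if index <= size:
            children[i] = replace_point(child, index, new_subtree)
            return (tree[0],) + tuple(children)
        index -= size
    raise IndexError("point out of range")

def crossover_at(parent1, parent2, point1, point2):
    """Swap the subtrees rooted at the two picked points; return both offspring."""
    sub1 = subtree_at(parent1, point1)
    sub2 = subtree_at(parent2, point2)
    return replace_point(parent1, point1, sub2), replace_point(parent2, point2, sub1)

def crossover(parent1, parent2, rng):
    """Pick the two crossover points independently, then swap subtrees."""
    point1 = rng.randint(1, count_points(parent1))
    point2 = rng.randint(1, count_points(parent2))
    return crossover_at(parent1, parent2, point1, point2)
```

For example, crossing `("+", "X", 1)` at point 2 with `("*", "X", ("-", "X", 2))` at point 3 gives the offspring `("+", ("-", "X", 2), 1)` and `("*", "X", "X")`. The total number of points across the two offspring equals the total across the two parents, because subtrees are exchanged rather than created or destroyed.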
REPRODUCTION OPERATION • Select parent probabilistically based on fitness • Copy it (unchanged) into the next generation of the population
PROBABILISTIC STEPS • The initial population is typically random • Probabilistic selection based on fitness - Best is not always picked - Worst is not necessarily excluded • Random picking of mutation and crossover points
5 MAJOR PREPARATORY STEPS OF GP • Determining the set of terminals • Determining the set of functions • Determining the fitness measure • Determining the parameters for the run • Determining the criterion for terminating a run
SYMBOLIC REGRESSION POPULATION OF 4 RANDOMLY CREATED INDIVIDUALS FOR GENERATION 0
SYMBOLIC REGRESSION $x^2 + x + 1$ GENERATION 1 • First offspring of crossover of (a) and (b), picking “+” of parent (a) and left-most “x” of parent (b) as crossover points • Second offspring of crossover of (a) and (b), picking “+” of parent (a) and left-most “x” of parent (b) as crossover points • Mutant of (c), picking “2” as mutation point • Copy of (a)
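For the symbolic regression example, one plausible fitness measure is the sum of absolute errors between a candidate program and the target x² + x + 1 over a set of sample points. The sketch below is illustrative only: the sample points, the binary-only function set, and the protected-division convention (return 1 when the divisor is 0) are assumptions, and lower error means a better individual.

```python
def protected_div(a, b):
    """Division that returns 1.0 instead of failing when the divisor is zero (assumed convention)."""
    return 1.0 if b == 0 else a / b

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": protected_div,
}

def run(tree, x):
    """Execute a nested-tuple program for one value of the independent variable X."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](run(left, x), run(right, x))
    return x if tree == "X" else tree            # variable or numeric constant

def target(x):
    return x * x + x + 1

def error(tree, cases):
    """Sum of absolute errors over the fitness cases; 0 means a perfect fit."""
    return sum(abs(run(tree, x) - target(x)) for x in cases)

# Hypothetical fitness cases: 21 evenly spaced points in [-1, 1].
CASES = [i / 10 for i in range(-10, 11)]
```

A program such as `("+", ("+", ("*", "X", "X"), "X"), 1)` scores an error of zero on these cases, while the single-terminal program `"X"` scores strictly worse.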
GENETIC PROGRAMMING: ON THE PROGRAMMING OF COMPUTERS BY MEANS OF NATURAL SELECTION (Koza 1992)
2 MAIN POINTS FROM 1992 BOOK • Virtually all problems in artificial intelligence, machine learning, adaptive systems, and automated learning can be recast as a search for a computer program. • Genetic programming provides a way to successfully conduct the search for a computer program in the space of computer programs.
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL RESULTS PRODUCED BY GP • Toy problems • Human-competitive non-patent results • 20th-century patented inventions • 21st-century patented inventions • Patentable new inventions
COMPUTER PROGRAMS • Subroutines provide one way to REUSE code, possibly with different instantiations of the dummy variables (formal parameters) • Loops (and iterations) provide a 2nd way to REUSE code • Recursion provides a 3rd way to REUSE code • Memory provides a 4th way to REUSE the results of executing code
DIFFERENCE IN VOLUMES $D = L_0 W_0 H_0 - L_1 W_1 H_1$
EVOLVED SOLUTION (- (* (* W0 L0) H0) (* (* W1 L1) H1))
AUTOMATICALLY DEFINED FUNCTION volume (progn (defun volume (arg0 arg1 arg2) (values (* arg0 (* arg1 arg2)))) (values (- (volume L0 W0 H0) (volume L1 W1 H1))))
AUTOMATICALLY DEFINED FUNCTIONS (SUBROUTINES) • ADFs provide a way to REUSE code • Code is typically reused with different instantiations of the dummy variables (formal parameters)
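The difference-in-volumes example can be restated in Python to show the reuse. This is a hand transcription of the two Lisp expressions from the preceding slides; the names `result_producing_branch` and `evolved_without_adf` are labels chosen for this sketch.

```python
def volume(arg0, arg1, arg2):
    """The automatically defined function: one body, reused with different arguments."""
    return arg0 * (arg1 * arg2)

def result_producing_branch(L0, W0, H0, L1, W1, H1):
    """Main program: calls the same subroutine twice with different instantiations."""
    return volume(L0, W0, H0) - volume(L1, W1, H1)

def evolved_without_adf(L0, W0, H0, L1, W1, H1):
    """The flat solution, which repeats the multiplication pattern for each box."""
    return (W0 * L0) * H0 - (W1 * L1) * H1
```

Both versions compute the same difference; for boxes of 3 x 4 x 5 and 1 x 2 x 3 each returns 54. The version with the subroutine expresses the product pattern once instead of twice.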
DIVIDE AND CONQUER • Decompose a problem into sub-problems • Solve the sub-problems • Assemble the solutions of the sub-problems into a solution for the overall problem