Slightly beyond Turing’s computability for studying Genetic Programming
Olivier Teytaud, Tao, Inria, Lri, UMR CNRS 8623, Univ. Paris-Sud, Pascal, Digiteo
Outline
• What is genetic programming
• Formal analysis of Genetic Programming
• Why is there nothing other than Genetic Programming?
• Computability point of view
• Complexity point of view
What is Genetic Programming (GP)
• GP = mining Turing-equivalent spaces of functions
• Typical example: symbolic regression
• Inputs:
  • x1, x2, x3, …, xN in {0,1}*
  • y1, y2, y3, …, yN in {0,1}, with yi = f(xi)
  • the (xi, yi) are assumed independent and identically distributed (unknown probability distribution)
• Goal:
  • find g such that E|g(x) − y| + C E Time(g, x) is as small as possible
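The goal above can be estimated on a finite sample. The following is a minimal sketch, assuming each candidate g reports a step count alongside its prediction; that step count, the names `empirical_objective`, `g_parity`, `g_const`, and the choice of parity as f are all illustrative assumptions, not part of the slides.

```python
from typing import Callable, Sequence, Tuple

# A candidate g returns (prediction, steps). "steps" is a hypothetical
# stand-in for Time(g, x), which the slides leave abstract.
Candidate = Callable[[str], Tuple[int, int]]


def empirical_objective(g: Candidate, xs: Sequence[str],
                        ys: Sequence[int], c: float) -> float:
    """Sample average of |g(x) - y| + c * Time(g, x)."""
    total_error = 0.0
    total_time = 0.0
    for x, y in zip(xs, ys):
        prediction, steps = g(x)
        total_error += abs(prediction - y)
        total_time += steps
    n = len(xs)
    return total_error / n + c * total_time / n


def g_parity(x: str) -> Tuple[int, int]:
    # Exact but slower: reads the whole input (len(x) steps).
    return x.count("1") % 2, len(x)


def g_const(x: str) -> Tuple[int, int]:
    # Fast but inaccurate: always answers 0 in one step.
    return 0, 1


xs = ["0", "1", "11", "101"]
ys = [x.count("1") % 2 for x in xs]  # assumed target f = parity of the ones
```

With a small C the exact candidate scores better than the constant one; raising C shifts the balance toward faster programs.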
How does GP work?
• GP = evolutionary algorithm.
• Evolutionary algorithm:
  • P = initial population
  • While (my favorite criterion)
    • Selection = best functions in P according to some score
    • Mutations = random perturbations of programs in the Selection
    • Cross-over = merging of programs in the Selection
    • P ≈ Selection + Mutations + Cross-over
Does it work? Definitely, yes, for robust and multimodal optimization in complex domains (trees, bitstrings, …).
Which score? A nice question for mathematicians.
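The loop above can be sketched on bitstrings, one of the domains named on the slide. This is only an illustration under assumptions: the fixed generation budget stands in for "my favorite criterion", and the population split, one-bit mutation, and one-point cross-over are arbitrary choices; the name `evolve` is hypothetical.

```python
import random
from typing import Callable, List

Genome = List[int]


def evolve(score: Callable[[Genome], float], length: int,
           pop_size: int = 20, generations: int = 50, seed: int = 0) -> Genome:
    rng = random.Random(seed)
    # P = initial population of random bitstrings.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # assumed criterion: a fixed budget
        # Selection: keep the best half according to the score.
        population.sort(key=score, reverse=True)
        selection = population[: pop_size // 2]
        # Mutations: flip one random bit in copies of selected genomes.
        mutants = []
        for parent in selection[: pop_size // 4]:
            child = parent[:]
            child[rng.randrange(length)] ^= 1
            mutants.append(child)
        # Cross-over: one-point merge of two selected genomes.
        crossed = []
        while len(selection) + len(mutants) + len(crossed) < pop_size:
            a, b = rng.sample(selection, 2)
            cut = rng.randrange(1, length)
            crossed.append(a[:cut] + b[cut:])
        # P = Selection + Mutations + Cross-over
        population = selection + mutants + crossed
    return max(population, key=score)


# Usage: maximize the number of ones in a 16-bit string.
best = evolve(sum, 16)
```

Because the selected genomes are carried over unchanged, the best score in the population never decreases from one generation to the next.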
Why study GP?
• GP is studied by many people
  • 5440 articles in the GP bibliography [5]
  • More than 880 authors
• GP seemingly works
  • Human-competitive results: http://www.genetic-programming.com/humancompetitive.html
  • Nothing else for mining Turing-equivalent spaces of programs
  • Probably better than random search
• Not so many mathematical foundations in GP
• Not so many open problems in computability, in particular with applications
Formalization of GP
What is typically GP?
• No halting criterion: we stop when time is exhausted.
• No use of prior knowledge; no use of f, even when it is known.
People (often) do not like GP because:
• It is slow and has no halting criterion
• It uses the yi = f(xi) and not f itself (unlike automatic code generation)
Are these two elements necessary?
Formalization of GP
Summary:
• GP uses only the f(xi) and the Time(f, xi).
• GP never halts: it outputs a sequence O1, O2, O3, …
Can we do better?
Known results
Whenever f is available (and not only the f(xi)), computing O such that
• O ≡ f
• O is optimal for size (or speed, or space, …)
is not possible (i.e. there is no Turing machine performing that task for all f).
A first (easy) good reason for GP
Whether f itself is available or only the f(xi) are, computing O1, O2, … such that
• Op ≡ f for p sufficiently large
• lim size(Op) is optimal
is possible, with proved convergence rates, e.g. by bloat penalization:
- consider a population of programs; set n = 1
- while (true)
    - select the program P that gives the best compromise between relevance on the first n examples and a size penalty, e.g. by minimizing Σ_{i<n} |P(xi) − yi| + C(|P|, n)
    - n = n + 1
(see the details of the proof and of the algorithm in the paper)
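The penalized selection loop can be tried on a toy pool of programs. This is a sketch under loud assumptions: the four candidate expressions, the use of source length as |P|, and the penalty form 0.01·|P|/√n are all illustrative; the slides do not specify C, and the paper's actual conditions on it are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical candidate pool: each program is a Python expression in x,
# and its size |P| is taken to be the length of its source text.
CANDIDATES = ["0", "x[-1:]=='1'", "x.count('1')%2",
              "(x.count('1')+2*len(x))%2"]


def run(program: str, x: str) -> int:
    # eval is acceptable here only because the pool is hard-coded.
    return int(eval(program, {}, {"x": x}))


def penalty(size: int, n: int) -> float:
    # Assumed form of C(|P|, n); not taken from the source.
    return 0.01 * size / math.sqrt(n)


def penalized_outputs(examples: Sequence[Tuple[str, int]]) -> List[str]:
    """Return O1, O2, ...: the selected program after 1, 2, ... examples."""
    outputs = []
    for n in range(1, len(examples) + 1):  # finite stand-in for while (true)
        seen = examples[:n]

        def cost(p: str) -> float:
            errors = sum(abs(run(p, x) - y) for x, y in seen)
            return errors + penalty(len(p), n)

        outputs.append(min(CANDIDATES, key=cost))
    return outputs


# Examples labelled by parity of the number of ones (assumed target f).
examples = [("0", 0), ("1", 1), ("10", 1), ("11", 0), ("111", 1)]
outputs = penalized_outputs(examples)
```

On this stream the selection moves from the constant program to the last-bit test and then settles on the short parity expression; the bloated but equivalent expression is never chosen because it pays a larger size penalty for the same error.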
A first (easy) good reason for GP
• Asymptotically (only!), finding an optimal function O ≡ f is possible.
• No halting criterion is possible (this avoids the use of an oracle in 0’).
Outline
• What is genetic programming
• Formal analysis of Genetic Programming
• Why is there nothing other than Genetic Programming?
• Computability point of view
• Complexity point of view:
  • Kolmogorov’s complexity with bounded time
  • Application to genetic programming
Kolmogorov’s complexity
• Kolmogorov’s complexity of x: minimum size of a program generating x
• Kolmogorov’s complexity of x with time at most T: minimum size of a program generating x in time at most T
Kolmogorov’s complexity in bounded time is computable.
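Computability in bounded time comes down to an exhaustive search that always terminates. The sketch below shows this on a deliberately tiny, hypothetical instruction set ("0" and "1" append a bit, "D" doubles the output, "J" jumps back to the start and so never halts). This toy language is not Turing-complete and its step costs are arbitrary assumptions; it only illustrates why the time bound makes the search finite.

```python
from itertools import product
from typing import Optional


def run_bounded(program: str, max_steps: int) -> Optional[str]:
    """Run a toy program; return its output, or None if the budget is exceeded."""
    out = ""
    pc = 0
    steps = 0
    while pc < len(program):
        op = program[pc]
        # Assumed costs: doubling costs twice the current output length.
        steps += max(1, 2 * len(out)) if op == "D" else 1
        if steps > max_steps:
            return None  # cut off, exactly as for a non-halting program
        if op == "J":
            pc = 0       # unconditional jump: loops until the budget runs out
        else:
            out = out + out if op == "D" else out + op
            pc += 1
    return out


def bounded_complexity(x: str, max_steps: int) -> Optional[int]:
    """Minimum size of a toy program generating x within max_steps steps."""
    # The literal program for x has size len(x), so larger sizes are never needed.
    for size in range(len(x) + 1):
        for ops in product("01DJ", repeat=size):
            if run_bounded("".join(ops), max_steps) == x:
                return size
    return None


x = "01010101"
```

In this toy model, tightening the time budget can only increase the minimum program size: short programs that rely on doubling are the first to exceed the budget.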
Kolmogorov’s complexity and genetic programming
• GP uses expensive simulations of programs
• Can we get rid of the simulation time, e.g. by using f not only as a black box?
• Essentially, no:
  • Example of GP problem: finding O as small as possible with
    • E Time(O, x) < Tn
    • |O| < Sn
    • O(x) = y
  • If Tn = Ω(2^n) and some Sn = O(log(n)), this requires time at least Tn/polynomial(n)
  • Just simulating all programs shorter than Sn and « faster » than Tn is possible in time polynomial(n)·Tn
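The brute-force upper bound on the last line follows from a short count. The derivation below is a sketch under two assumptions that the slides do not state: programs are encoded as binary strings, and simulating one program under the time bound costs at most q(n)·Tn for some polynomial q; the constant c comes from Sn = O(log(n)).

```latex
\begin{align*}
\#\{O : |O| < S_n\} &\le 2^{S_n} \le 2^{c \log_2 n} = n^{c}
  && \text{since } S_n = O(\log n),\\
\text{cost of brute force} &\le
  \underbrace{n^{c}}_{\text{programs}} \cdot
  \underbrace{q(n)\, T_n}_{\text{one bounded simulation}}
  = \mathrm{polynomial}(n)\, T_n .
\end{align*}
```

Set against the lower bound Tn/polynomial(n), this means exhaustive simulation is optimal up to polynomial factors in this regime.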
Outline
• What is genetic programming
• Formal analysis of Genetic Programming
• Why is there nothing other than Genetic Programming?
• Computability point of view
• Complexity point of view:
  • Kolmogorov’s complexity with bounded time
  • Application to genetic programming
• Conclusion
Conclusion
• Summary
  • GP typically solves, approximately, problems in 0’
  • A lot of work exists on approximating NP-complete problems, but not much on 0’
  • We provide a mathematical analysis of GP
• Conclusions:
  • GP uses expensive simulations, but the simulation cost cannot be removed anyway.
  • GP has no halting criterion, but no halting criterion can be found.
  • Also, « bloat » penalization ensures consistency; this point suggests a parametrization of the usual algorithms.