Slightly beyond Turing’s computability for studying Genetic Programming Olivier Teytaud, Tao, Inria, Lri, UMR CNRS 8623, Univ. Paris-Sud, Pascal, Digiteo
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view
What is Genetic Programming (GP) • GP = mining Turing-equivalent spaces of functions • Typical example: symbolic regression. • Inputs: • x1,x2,x3,…,xN in {0,1}* • y1,y2,y3,…,yN in {0,1}, with yi=f(xi) • (xi,yi) assumed independently identically distributed (unknown probability distribution) • Goal: • Finding g such that E|g(x)-y| + C·E Time(g,x) is as small as possible
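The score guiding the search can be estimated from the examples alone. Below is a minimal Python sketch; run_with_timeout is a hypothetical interpreter (an assumption, not part of the slides) that executes a candidate program g on an input x and returns its output together with the time spent.

def empirical_score(g, examples, C, run_with_timeout):
    """Empirical estimate of E|g(x)-y| + C * E[Time(g,x)] over the examples (xi, yi)."""
    total_error, total_time = 0.0, 0.0
    for x, y in examples:
        output, elapsed = run_with_timeout(g, x)   # hypothetical interpreter
        total_error += abs(output - y)             # |g(x) - y|, outputs coded as 0/1
        total_time += elapsed
    n = len(examples)
    return total_error / n + C * total_time / n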
How does GP work ? • GP = evolutionary algorithm. • Evolutionary algorithm: • P = initial population • While (my favorite criterion) • Selection = best functions in P according to some score • Mutations = random perturbations of programs in the Selection • Cross-over = merging of programs in the Selection • P ≈ Selection + Mutations + Cross-over (a code sketch of this loop is given after the next slides)
How does GP work ? Does it work ? Definitely, yes: for robust and multimodal optimization in complex domains (trees, bitstrings, …).
How does GP work ? Which score should be used ? A nice question for mathematicians.
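A minimal Python sketch of the evolutionary loop above; random_program, score, mutate and crossover stand for the problem-specific operators and are assumptions passed in by the caller.

import random

def genetic_programming(random_program, score, mutate, crossover,
                        pop_size=100, n_selected=20, n_generations=50):
    # P = initial population
    population = [random_program() for _ in range(pop_size)]
    for _ in range(n_generations):                     # "my favorite criterion"
        # Selection = best programs in P according to some score (lower is better)
        selection = sorted(population, key=score)[:n_selected]
        # Mutations = random perturbations of programs in the Selection
        mutations = [mutate(random.choice(selection)) for _ in range(pop_size // 2)]
        # Cross-over = merging of programs in the Selection
        children = [crossover(random.choice(selection), random.choice(selection))
                    for _ in range(pop_size - n_selected - len(mutations))]
        # P ≈ Selection + Mutations + Cross-over
        population = selection + mutations + children
    return min(population, key=score)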
Why study GP ? • GP is studied by many people • 5440 articles in the GP bibliography [5] • More than 880 authors • GP seemingly works • Human-competitive results http://www.genetic-programming.com/humancompetitive.html • Nothing else exists for mining Turing-equivalent spaces of programs • Probably better than random search • Not so many mathematical foundations in GP • Not so many open problems in computability, in particular with applications
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view
Formalization of GP What is typical of GP ? • No halting criterion: we stop when time is exhausted. • No use of prior knowledge; no use of f, even when you know it. People (often) do not like GP because: • It is slow and has no halting criterion • It uses the yi=f(xi) and not f (different from automatic code generation) Are these two elements necessary ?
Formalization of GP Summary: GP uses only the f(xi) and the Time(f,xi). GP never halts: it outputs a sequence of programs O1, O2, O3, … Can we do better ?
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view
Known results Whenever f is available (and not only the f(xi) ), computing O such that • O≡f • O optimal for size (or speed, or space …) is not possible. (i.e. there’s no Turing machine performing that task for all f)
A first (easy) good reason for GP. Whenever f is not available (only the examples yi=f(xi) are), computing O1, O2, …, such that • Op ≡ f for p sufficiently large • lim size(Op) is optimal is possible, with proved convergence rates, e.g. by bloat penalization: - consider a population of programs; set n=1 - while (true) - select the best program P for a compromise between relevance on the n first examples and a penalization of size, e.g. Sum_{i<n} |P(xi)-yi| + C(|P|, n) - n=n+1 (see details of the proof and of the algorithm in the paper)
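A minimal sketch of this bloat-penalized selection, under stated assumptions: evaluate(P, x) and size(P) are hypothetical helpers, the candidate set `programs` is kept fixed rather than evolved, the loop is truncated to n_rounds (the original procedure loops forever), and the penalty C(|P|, n) = size(P)/sqrt(n) is purely illustrative (the paper specifies its own choice of C).

import math

def bloat_penalized_selection(programs, examples, evaluate, size, n_rounds=1000):
    """Returns a finite prefix O_1, ..., O_n_rounds of the sequence of selected programs."""
    best_per_round = []
    for n in range(1, n_rounds + 1):
        def penalized_score(P):
            # compromise: relevance on the n first examples + penalization of size
            error = sum(abs(evaluate(P, x) - y) for x, y in examples[:n])
            return error + size(P) / math.sqrt(n)     # illustrative C(|P|, n)
        best_per_round.append(min(programs, key=penalized_score))   # O_n
    return best_per_round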
A first (easy) good reason for GP. Asymptotically (only!), finding an optimal function O ≡ f is possible. No halting criterion is possible (having one would require an oracle for 0', the halting problem).
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view: • Kolmogorov’s complexity with bounded time • Application to genetic programming
Kolmogorov’s complexity • Kolmogorov’s complexity of x : Minimum size of a program generating x • Kolmogorov’s complexity of x with time at most T : Minimum size of a program generating x in time at most T. Kolmogorov’s complexity in bounded time = computable.
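Why it is computable: enumerate programs by increasing size and simulate each one for at most T steps. A minimal sketch, where run(p, T) and all_programs_of_size(s) are hypothetical helpers (an interpreter that returns the program's output, or None if it does not halt within T steps, and an enumerator over program encodings).

def bounded_time_kolmogorov(x, T, run, all_programs_of_size, max_size=64):
    """Minimum size of a program generating x in at most T steps (None if none found up to max_size)."""
    for s in range(1, max_size + 1):              # programs by increasing size
        for p in all_programs_of_size(s):
            if run(p, T) == x:                    # p generates x within T steps
                return s
    return None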
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view: • Kolmogorov’s complexity with bounded time • Application to genetic programming
Kolmogorov’s complexity and genetic programming • GP uses expensive simulations of programs • Can we get rid of the simulation time ? e.g. by using f not only as a black box ? • Essentially, no: • Example of a GP problem: finding O as small as possible with • E Time(O,x) < Tn, • |O| < Sn • O(x) = y • If Tn = Ω(2^n) and some Sn = O(log n), this requires time at least Tn/polynomial(n) • Just simulating all programs shorter than Sn and « faster » than Tn is possible in time polynomial(n)·Tn
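The matching upper bound is plain exhaustive search: simulate every program of size below Sn with a time budget of Tn per run, and keep one that fits the examples. A minimal sketch, where run_on(p, x, T) and all_programs_of_size(s) are hypothetical helpers (an interpreter applying program p to input x for at most T steps, and an enumerator over program encodings).

def brute_force_gp(examples, S_n, T_n, run_on, all_programs_of_size):
    """Smallest program of size < S_n mapping every xi to yi within T_n steps (None if none)."""
    for s in range(1, S_n):
        for p in all_programs_of_size(s):
            if all(run_on(p, x, T_n) == y for x, y in examples):
                return p
    return None

With Sn = O(log n), the outer enumeration covers only polynomially many programs, which is where the polynomial(n)·Tn bound comes from.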
Outline • What is genetic programming • Formal analysis of Genetic Programming • Why is there nothing else than Genetic Programming ? • Computability point of view • Complexity point of view: • Kolmogorov’s complexity with bounded time • Application to genetic programming • Conclusion
Conclusion • Summary • GP typically solves, approximately, problems in 0’ • There is a lot of work on approximating NP-complete problems, but not much on 0’ • We provide a theoretical analysis of GP • Conclusions: • GP uses expensive simulations, but the simulation cost cannot be removed anyway. • GP has no halting criterion, but no halting criterion can be found. • Also, « bloat » penalization ensures consistency; this suggests a parametrization of the usual algorithms.