
Tree-Structure-Encoded Evolutionary Optimization Algorithms

Tree-Structure-Encoded Evolutionary Optimization Algorithms. University of Jinan, Computational Intelligence Lab, Yuehui Chen (陈月辉), yhchen@ujn.edu.cn, http://cilab.ujn.edu.cn. Genetic Programming. Developed: USA in the 1990s. Early names: J. Koza. Typically applied to: machine learning tasks (prediction, classification, …). Attributed features: competes with neural nets and the like.


Presentation Transcript


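  1. Tree-Structure-Encoded Evolutionary Optimization Algorithms • University of Jinan, Computational Intelligence Lab • Yuehui Chen (陈月辉) • yhchen@ujn.edu.cn • http://cilab.ujn.edu.cn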

  2. Genetic Programming • Developed: USA in the 1990s • Early names: J. Koza • Typically applied to: • machine learning tasks (prediction, classification, …) • Attributed features: • competes with neural nets and the like • needs huge populations (thousands) • slow • Special: • non-linear chromosomes: trees, graphs • mutation possible but not necessary (disputed!)

  3. GP technical summary tableau

  4. Introductory example: credit scoring • Bank wants to distinguish good from bad loan applicants • Model needed that matches historical data

  5. Introductory example: credit scoring • A possible model: • IF (NOC = 2) AND (S > 80000) THEN good ELSE bad • In general: • IF formula THEN good ELSE bad • Only unknown is the right formula, hence • Our search space (phenotypes) is the set of formulas • Natural fitness of a formula: percentage of well classified cases of the model it stands for • Natural representation of formulas (genotypes) is: parse trees

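  6. Introductory example: credit scoring • IF (NOC = 2) AND (S > 80000) THEN good ELSE bad can be represented by the following tree • [Figure: parse tree with root AND; left subtree = with children NOC and 2; right subtree > with children S and 80000]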
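
The tree representation above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding, the `evaluate` and `fitness` helpers, and the four applicant records are assumptions made for the example and do not come from the slides. Fitness follows the slide's definition, the fraction of correctly classified cases.

```python
import operator

# Function symbols mapped to Python callables (illustrative choice).
FUNCTIONS = {"AND": lambda a, b: a and b, "=": operator.eq, ">": operator.gt}

# IF (NOC = 2) AND (S > 80000) THEN good ELSE bad, written as a parse tree.
RULE = ("AND", ("=", "NOC", 2), (">", "S", 80000))

def evaluate(node, record):
    """Recursively evaluate a parse tree against one applicant record."""
    if isinstance(node, tuple):          # internal node: (symbol, child, ...)
        symbol, *args = node
        return FUNCTIONS[symbol](*(evaluate(a, record) for a in args))
    if isinstance(node, str):            # variable name: look it up
        return record[node]
    return node                          # numeric constant

def fitness(tree, data):
    """Fraction of records whose predicted label matches the known label."""
    hits = sum(("good" if evaluate(tree, rec) else "bad") == label
               for rec, label in data)
    return hits / len(data)

# Made-up historical records, for illustration only.
DATA = [
    ({"NOC": 2, "S": 90000}, "good"),
    ({"NOC": 2, "S": 50000}, "bad"),
    ({"NOC": 0, "S": 95000}, "good"),
    ({"NOC": 3, "S": 20000}, "bad"),
]

print(fitness(RULE, DATA))
```

On these made-up records the rule misclassifies only the third applicant.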

  7. Tree based representation • Trees are a universal form, e.g. consider • Arithmetic formula • Logical formula: (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y))) • Program: i = 1; while (i < 20) { i = i + 1 }

  8. Tree based representation

  9. Tree based representation • (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))

  10. Tree based representation • i = 1; while (i < 20) { i = i + 1 }

  11. Tree based representation • In GA, ES, EP chromosomes are linear structures (bit strings, integer strings, real-valued vectors, permutations) • Tree-shaped chromosomes are non-linear structures • In GA, ES, EP the size of the chromosomes is fixed • Trees in GP may vary in depth and width

  12. Tree based representation • Symbolic expressions can be defined by • Terminal set T • Function set F (with the arities of function symbols) • Adopting the following general recursive definition: • Every t ∈ T is a correct expression • f(e1, …, en) is a correct expression if f ∈ F, arity(f) = n and e1, …, en are correct expressions • There are no other forms of correct expressions • In general, expressions in GP are not typed (closure property: any f ∈ F can take any g ∈ F as argument)
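
The recursive definition above translates almost directly into a checker. The following is a minimal sketch; the particular sets `ARITY` (function set with arities) and `TERMINALS`, and the nested-tuple encoding, are assumptions chosen for illustration.

```python
# Function set F with arities, and terminal set T (both illustrative).
ARITY = {"+": 2, "-": 2, "sin": 1}
TERMINALS = {"x", "y", 1.0}

def is_correct(expr):
    """Apply the recursive definition of a correct expression.

    A terminal t in T is correct; a tuple (f, e1, ..., en) is correct if
    f is in F, arity(f) == n, and every ei is correct. Nothing else is.
    """
    if isinstance(expr, tuple):
        if not expr:
            return False
        symbol, *args = expr
        return (symbol in ARITY
                and ARITY[symbol] == len(args)
                and all(is_correct(a) for a in args))
    return expr in TERMINALS

print(is_correct(("+", "x", ("sin", "y"))))   # well-formed
print(is_correct(("sin", "x", "y")))          # wrong arity for sin
```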

  13. Offspring creation scheme • Compare • GA scheme using crossover AND mutation sequentially (be it probabilistically) • GP scheme using crossover OR mutation (chosen probabilistically)

  14. Flowchart • [Figures: GA flowchart, GP flowchart]

  15. Mutation • Most common mutation: replace randomly chosen subtree by randomly generated tree

  16. Mutation cont’d • Mutation has two parameters: • Probability pm to choose mutation vs. recombination • Probability to choose an internal point as the root of the subtree to be replaced • Remarkably, pm is advised to be 0 (Koza ’92) or very small, like 0.05 (Banzhaf et al. ’98) • The size of the child can exceed the size of the parent
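
Subtree mutation as described on these two slides can be sketched as follows. This is an illustration under assumptions, not the presenter's implementation: the tuple encoding, the helper names, the 0.9 default for choosing an internal point, and the depth limit of the newly grown subtree are all hypothetical choices.

```python
import random

ARITY = {"+": 2, "*": 2}       # illustrative function set
TERMINALS = ["x", "y", 1]      # illustrative terminal set

def random_tree(rng, max_depth):
    """Grow a random tree no deeper than max_depth (root is depth 0)."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    symbol = rng.choice(list(ARITY))
    return (symbol,) + tuple(random_tree(rng, max_depth - 1)
                             for _ in range(ARITY[symbol]))

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))

def subtree_mutation(tree, rng, p_internal=0.9, max_depth=3):
    """Replace a randomly chosen subtree by a randomly generated one."""
    internal = [p for p in paths(tree) if isinstance(get(tree, p), tuple)]
    leaves = [p for p in paths(tree) if not isinstance(get(tree, p), tuple)]
    use_internal = internal and rng.random() < p_internal
    point = rng.choice(internal if use_internal else leaves)
    return replace(tree, point, random_tree(rng, max_depth))

parent = ("+", "x", ("*", "y", 1))
child = subtree_mutation(parent, random.Random(0))
print(size(parent), size(child))
```

Because the new subtree is grown independently of the one it replaces, the child may be larger or smaller than the parent, which is the size effect the slide points out.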

  17. Recombination = Crossover • Most common recombination: exchange two randomly chosen subtrees among the parents • Recombination has two parameters: • Probability pc to choose recombination vs. mutation • Probability to choose an internal point within each parent as crossover point • The size of offspring can exceed that of the parents

  18. Crossover • [Figure: Parent 1 and Parent 2 exchange subtrees to produce Child 1 and Child 2]
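
Subtree crossover, exchanging one randomly chosen subtree from each parent, might look like the sketch below. The tuple encoding, helper names, and the 0.9 default probability of picking an internal crossover point are assumptions for illustration.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))

def pick_point(tree, rng, p_internal):
    """Pick a crossover point, preferring internal nodes with p_internal."""
    internal = [p for p in paths(tree) if isinstance(get(tree, p), tuple)]
    leaves = [p for p in paths(tree) if not isinstance(get(tree, p), tuple)]
    use_internal = internal and rng.random() < p_internal
    return rng.choice(internal if use_internal else leaves)

def subtree_crossover(p1, p2, rng, p_internal=0.9):
    """Swap the subtrees rooted at one random point in each parent."""
    a = pick_point(p1, rng, p_internal)
    b = pick_point(p2, rng, p_internal)
    return replace(p1, a, get(p2, b)), replace(p2, b, get(p1, a))

parent1 = ("+", "x", ("*", "y", 1))
parent2 = ("-", ("+", "x", "x"), "y")
child1, child2 = subtree_crossover(parent1, parent2, random.Random(1))
print(child1)
print(child2)
```

The swap conserves the total number of nodes across the pair, so whenever one child is larger than its parent the other is correspondingly smaller.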

  19. Selection • Parent selection is typically fitness proportionate • Over-selection in very large populations: • rank population by fitness and divide it into two groups: • group 1: best x% of population, group 2: other (100-x)% • 80% of selection operations choose from group 1, 20% from group 2 • for pop. size = 1000, 2000, 4000, 8000: x = 32%, 16%, 8%, 4% • motivation: to increase efficiency; the percentages come from a rule of thumb • Survivor selection: • Typical: generational scheme (thus none) • Recently steady-state is becoming popular for its elitism
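
Over-selection as described above can be sketched like this. The function name and signature are hypothetical, higher fitness is assumed to be better, and selection inside each group is uniform here purely to keep the sketch short (a fitness-proportionate draw within the group could be substituted).

```python
import random

def over_select(population, fitnesses, rng, x_percent=32, p_top=0.8):
    """Pick one parent: with probability p_top from the best x%, else from the rest."""
    # Indices ranked from best to worst (assumes higher fitness is better).
    order = sorted(range(len(population)),
                   key=lambda i: fitnesses[i], reverse=True)
    cut = max(1, len(order) * x_percent // 100)
    top, rest = order[:cut], order[cut:]
    group = top if (not rest or rng.random() < p_top) else rest
    return population[rng.choice(group)]

# Toy population of 1000 individuals whose fitness equals their value.
rng = random.Random(42)
population = list(range(1000))
picks = [over_select(population, population, rng) for _ in range(10000)]
share_from_top = sum(p >= 680 for p in picks) / len(picks)
print(share_from_top)   # close to 0.8 by construction
```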

  20. Initialization • Maximum initial depth of trees Dmax is set • Full method (each branch has depth = Dmax): • nodes at depth d < Dmax randomly chosen from function set F • nodes at depth d = Dmax randomly chosen from terminal set T • Grow method (each branch has depth ≤ Dmax): • nodes at depth d < Dmax randomly chosen from F ∪ T • nodes at depth d = Dmax randomly chosen from T • Common GP initialisation: ramped half-and-half, where grow & full method each deliver half of initial population
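
The full and grow methods, and their combination into ramped half-and-half, can be sketched as below. The function and terminal sets are illustrative, and the way depths are "ramped" (cycling through depths from an assumed minimum of 2 up to Dmax) is one common convention, not something the slide specifies.

```python
import random

ARITY = {"+": 2, "-": 2, "sin": 1}   # illustrative function set F
TERMINALS = ["x", 1.0]               # illustrative terminal set T

def full(rng, d_max, d=0):
    """Full method: functions above depth d_max, terminals exactly at d_max."""
    if d == d_max:
        return rng.choice(TERMINALS)
    f = rng.choice(list(ARITY))
    return (f,) + tuple(full(rng, d_max, d + 1) for _ in range(ARITY[f]))

def grow(rng, d_max, d=0):
    """Grow method: draw from F ∪ T above depth d_max, from T at d_max."""
    if d == d_max:
        return rng.choice(TERMINALS)
    symbol = rng.choice(list(ARITY) + TERMINALS)
    if symbol in ARITY:
        return (symbol,) + tuple(grow(rng, d_max, d + 1)
                                 for _ in range(ARITY[symbol]))
    return symbol

def depth(tree):
    """Depth of a tree; a lone terminal has depth 0."""
    return 1 + max(depth(c) for c in tree[1:]) if isinstance(tree, tuple) else 0

def ramped_half_and_half(rng, pop_size, d_max, d_min=2):
    """Alternate full and grow while cycling through depth limits d_min..d_max."""
    depths = list(range(d_min, d_max + 1))
    population = []
    for i in range(pop_size):
        method = full if i % 2 == 0 else grow
        population.append(method(rng, depths[(i // 2) % len(depths)]))
    return population

pop = ramped_half_and_half(random.Random(0), pop_size=20, d_max=4)
print([depth(t) for t in pop])
```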

  21. Bloat • Bloat = “survival of the fattest”, i.e., the tree sizes in the population increase over time • Ongoing research and debate about the reasons • Needs countermeasures, e.g. • Prohibiting variation operators that would deliver “too big” children • Parsimony pressure: penalty for being oversized
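
Both countermeasures can be expressed in a few lines. This is a sketch under assumptions: fitness is taken to be an error to minimize, the penalty is linear in tree size, and the coefficient `alpha` and the size limit are arbitrary illustrative values.

```python
def size(tree):
    """Number of nodes in a nested-tuple tree (symbol, child, ...)."""
    return 1 + sum(size(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def penalized_error(error, tree, alpha=0.01):
    """Parsimony pressure: add a size-proportional penalty to the raw error."""
    return error + alpha * size(tree)

def accept_child(child, max_size=50):
    """Size limit: reject offspring that a variation operator made too big."""
    return size(child) <= max_size

small = ("+", "x", 1)
big = ("+", ("*", "x", 1), ("-", ("+", "x", 0), "x"))
# Same raw error, but the smaller tree gets the better (lower) score.
print(penalized_error(0.5, small), penalized_error(0.5, big))
```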

  22. Problems involving “physical” environments • Trees for data fitting vs. trees (programs) that are “really” executable • Execution can change the environment → the calculation of fitness • Example: robot controller • Fitness calculations mostly by simulation, ranging from expensive to extremely expensive (in time) • But evolved controllers are often very good

  23. Example application: symbolic regression • Given some points in R², (x1, y1), … , (xn, yn) • Find function f(x) s.t. ∀ i = 1, …, n : f(xi) = yi • Possible GP solution: • Representation by F = {+, -, /, sin, cos}, T = R ∪ {x} • Fitness is the error • All operators standard • pop. size = 1000, ramped half-and-half initialisation • Termination: n “hits” or 50000 fitness evaluations reached (where “hit” is if | f(xi) – yi | < 0.0001)
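
The fitness and the "hit" count for symbolic regression can be computed as sketched below. The slide only says that fitness is the error; the sum of squared errors used here is one common choice and is an assumption, as are the function name and the sample points drawn from sin.

```python
import math

def error_and_hits(f, points, tol=1e-4):
    """Return (sum of squared errors, number of hits) for candidate f.

    A point is a hit if |f(x_i) - y_i| < tol, as on the slide.
    """
    residuals = [abs(f(x) - y) for x, y in points]
    return sum(r * r for r in residuals), sum(r < tol for r in residuals)

# Ten sample points from y = sin(x); illustrative data only.
points = [(i / 10, math.sin(i / 10)) for i in range(10)]

print(error_and_hits(math.sin, points))        # perfect candidate: all hits
print(error_and_hits(lambda x: x, points))     # f(x) = x only hits at x = 0
```

A run would terminate once a candidate scores `len(points)` hits or the evaluation budget is exhausted.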

  24. Discussion • Is GP: • The art of evolving computer programs ? • Means to automated programming of computers? • GA with another representation?

  25. CREATING RANDOM PROGRAMS • Available functions F = {+, -, *, %, IFLTE} • IFLTE – if arg1 <= arg2 return arg3 else return arg4 • Available terminals T = {X, Y, Random-Constants} • The random programs are: • Of different sizes and shapes • Syntactically valid • Executable
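
A small generator and interpreter for such random programs might look as follows. It is a sketch under stated assumptions: `%` is treated as Koza-style protected division (returning 1 when the divisor is 0), random constants are drawn from [-1, 1], and the 0.3 chance of stopping early at a terminal is arbitrary. Note that this simple interpreter evaluates both branches of IFLTE before choosing one.

```python
import random

def pdiv(a, b):
    """Protected division: never raises, returns 1.0 on division by zero."""
    return a / b if b != 0 else 1.0

# name -> (arity, implementation)
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "%": (2, pdiv),
    "IFLTE": (4, lambda a, b, c, d: c if a <= b else d),
}

def random_program(rng, max_depth):
    """Build a random, syntactically valid program as a nested tuple."""
    if max_depth == 0 or rng.random() < 0.3:
        kind = rng.choice(["X", "Y", "CONST"])
        return round(rng.uniform(-1, 1), 3) if kind == "CONST" else kind
    name = rng.choice(list(FUNCTIONS))
    arity = FUNCTIONS[name][0]
    return (name,) + tuple(random_program(rng, max_depth - 1)
                           for _ in range(arity))

def run(prog, x, y):
    """Execute a program for given values of the terminals X and Y."""
    if isinstance(prog, tuple):
        _, fn = FUNCTIONS[prog[0]]
        return fn(*(run(arg, x, y) for arg in prog[1:]))
    return {"X": x, "Y": y}.get(prog, prog)   # variable or constant

rng = random.Random(0)
for _ in range(3):
    program = random_program(rng, max_depth=3)
    print(program, "->", run(program, x=1.0, y=2.0))
```

Because every function accepts any numeric argument and division is protected, every generated tree is executable, which is the closure property the slide relies on.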

  26. CREATING RANDOM PROGRAMS

  27. MUTATION OPERATION • Select 1 parent probabilistically based on fitness • Pick point from 1 to NUMBER-OF-POINTS • Delete subtree at the picked point • Grow new subtree at the mutation point in same way as generated trees for initial random population (generation 0) • The result is a syntactically valid executable program • Put the offspring into the next generation of the population

  28. MUTATION OPERATION

  29. CROSSOVER OPERATION • Select 2 parents probabilistically based on fitness • Randomly pick a number from 1 to NUMBER-OF-POINTS for 1st parent • Independently randomly pick a number for 2nd parent • Identify the subtrees rooted at the two picked points • Swap the two subtrees to create the offspring • The result is a syntactically valid executable program • Put the offspring into the next generation of the population

  30. CROSSOVER OPERATION

  31. Architecture-Altering Operations • 1.subroutine duplication operation

  32. Architecture-Altering Operations • 2. Argument duplication

  33. Architecture-Altering Operations • 3.Subroutine creation operation

  34. Architecture-Altering Operations • 4. Subroutine deletion

  35. Architecture-Altering Operations • 5. Argument deletion

  36. FIVE MAJOR PREPARATORY STEPS • Determining the set of terminals • Determining the set of functions • Determining the fitness measure • Determining the parameters for the run • Determining the method for designating a result and the criterion for terminating a run

  37. Probabilistic Incremental Program Evolution (PIPE) • Salustowicz & Schmidhuber (1997) • Model: • Probabilistic prototype tree (PPT) • Each node: distribution over instruction set • Can grow and shrink (variable size) • Update algorithm • Similar to PBIL • Elitism is incorporated

  38. Probability Prototype Tree • Complete n-ary tree • Each node Nd,w contains • Random constant, Rd,w • Variable probability vector • l+k components (instructions) • d : Node’s depth, w : Horizontal position • pd,w(i) : probability of choosing instruction i

  39. Program Generation • Start with root node: d = w = 0 • Depth-first, left-to-right traversal • At each node, choose instruction i with probability pd,w(i) • If i is a random constant: • if pd,w(i) > Tr, use the stored constant Rd,w • otherwise, draw a new uniformly random number
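
Program generation from a probabilistic prototype tree can be sketched as below. Everything concrete here is an assumption for illustration: the instruction names, the initial split `p_t = 0.8` between terminals and functions, the threshold `t_r = 0.3`, the depth cap that guarantees termination, and the `Node` class itself. Child nodes are created only when the traversal first reaches them.

```python
import random

FUNCTIONS = {"+": 2, "*": 2}    # k = 2 function instructions (illustrative)
TERMINALS = ["x", "R"]          # l = 2 terminals; "R" is the random constant

class Node:
    """One PPT node: a stored constant and a probability per instruction."""
    def __init__(self, rng, p_t=0.8):
        self.R = rng.random()   # stored constant, uniform in [0, 1)
        self.p = {t: p_t / len(TERMINALS) for t in TERMINALS}
        self.p.update({f: (1 - p_t) / len(FUNCTIONS) for f in FUNCTIONS})
        self.children = {}      # child position -> Node, created on demand

def generate(node, rng, t_r=0.3, depth=0, max_depth=6):
    """Sample one program by depth-first, left-to-right traversal."""
    # Depth cap (an assumption of this sketch): only terminals at max_depth.
    names = TERMINALS if depth >= max_depth else list(node.p)
    instr = rng.choices(names, weights=[node.p[n] for n in names])[0]
    if instr == "R":
        # Reuse the stored constant only if its probability exceeds t_r.
        return node.R if node.p["R"] > t_r else rng.random()
    if instr in FUNCTIONS:
        args = []
        for w in range(FUNCTIONS[instr]):
            if w not in node.children:
                node.children[w] = Node(rng)
            args.append(generate(node.children[w], rng, t_r,
                                 depth + 1, max_depth))
        return (instr,) + tuple(args)
    return instr                # the variable "x"

rng = random.Random(0)
root = Node(rng)
for _ in range(3):
    print(generate(root, rng))
```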

  40. Example: PPT & Generation

  41. PIPE Algorithm • Initialize probabilistic prototype tree • Repeat until termination criterion is met • Create population of programs • Grow PPT if required • Evaluate population • Favor smaller programs if all else is equal • Update & mutate PPT • Prune PPT

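  42. Flowchart of the PIPE algorithm • [Figure: initialization → iteration count = 0? → population-based learning or elitist learning → satisfactory solution found? → yes: stop; no: iteration count + 1 and repeat]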

  43. PPT Initialization • Random constant Rd,w = U[0,1) • pt= Probability of using terminal set • For all terminal instructions • pd,w(i) = pt/l • For all function instruction • pd,w(i) = (1-pt)/k

  44. PPT Growth Growth “on demand”

  45. Updating & Mutating PPT • Goal: raise the probability of the best tree to the target PT • pd,w(i) is updated iteratively: pd,w(i) = pd,w(i) + (learning-rate factor) · (1 − pd,w(i)) • Mutation: pd,w(i) = pd,w(i) + m · (1 − pd,w(i)) • Normalize probabilities
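
Both the update and the mutation rule on this slide share one shape: move a probability toward 1 by a fraction of its remaining distance, then renormalize the node's vector. The sketch below illustrates that shape for a single node; the step size, the target value, and the function names are assumptions, and renormalizing after every step is a simplification of this sketch rather than a detail taken from the slide.

```python
def bump(probs, chosen, step):
    """Increase probs[chosen] by step * (1 - p), then renormalize to sum 1."""
    updated = dict(probs)
    updated[chosen] += step * (1.0 - updated[chosen])
    total = sum(updated.values())
    return {name: p / total for name, p in updated.items()}

def adapt_until(probs, chosen, step, target):
    """Repeat the update until the chosen instruction reaches the target."""
    while probs[chosen] < target:
        probs = bump(probs, chosen, step)
    return probs

node = {"x": 0.4, "R": 0.4, "+": 0.1, "*": 0.1}
once = bump(node, "+", step=0.2)
print(once)
print(adapt_until(node, "+", step=0.2, target=0.9))
```

Each call strictly increases the chosen probability while keeping the vector a valid distribution, so the loop terminates for any target below 1.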

  46. PPT Pruning • Prune if any pd,w(i) > Tp • Tp = 0.9

  47. Summary: PIPE • PBIL-like algorithm for evolving programs • Probabilistic prototype tree • Variable length • PBIL update rule • Results better than GP • Many user-defined constants • Effects are not understood

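  48. Example 1: curve fitting • sin(x) can be expanded as the standard Taylor series $\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$, for $x \in \mathbb{R}$ • The operator set can be chosen accordingly (the set shown on the slide is not preserved in this transcript) • Design a fitness function to compute the fitness of each individual (in this experiment, the sum of absolute errors between the desired output and the model output is used)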
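
The fitness described on this slide, the sum of absolute errors between the target sin(x) and a model's output, can be checked against the truncated Taylor series itself. The sketch below is illustrative only: the sample grid on [-1, 1] and the function names are assumptions, not the experiment's actual settings.

```python
import math

def sin_taylor(x, terms=4):
    """Truncated Taylor series of sin: x - x^3/3! + x^5/5! - x^7/7! + ..."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

def sum_abs_error(model, xs):
    """Fitness: sum of |desired output - model output| over the samples."""
    return sum(abs(math.sin(x) - model(x)) for x in xs)

xs = [i / 10 for i in range(-10, 11)]    # 21 sample points in [-1, 1]
print(sum_abs_error(lambda x: sin_taylor(x, terms=2), xs))
print(sum_abs_error(lambda x: sin_taylor(x, terms=4), xs))
```

Adding terms lowers the error, so an evolved expression that rediscovers more of the series scores a better fitness.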

  49. Parameter Setting

  50. Result • Curve fitting of the sine function
