This paper introduces PAGE (Programming with Annotated Grammar Estimation), an estimation of distribution algorithm for genetic programming whose probabilistic model is PCFG-LA (Probabilistic Context-Free Grammar with Latent Annotations), combining a probabilistic grammar with latent annotations for estimating distributions in evolutionary algorithms. The algorithm is applied to the Royal Tree Problem and the DMAX Problem to evaluate its performance.
Estimation of Distribution Algorithm based on Probabilistic Grammar with Latent Annotations Written by Yoshihiko Hasegawa and Hitoshi Iba Summarized by Minhyeok Kim
Contents • Introduction • Two groups in GP-EDA • PCFG-LA • PCFG and PCFG-LA • Probability of the annotated tree • Probability of an observed tree • Log-likelihood and update formula • Assumptions • Forward-backward probability • P(T;Θ) by forward and backward • Parameter update formula • Initial parameters • PAGE (Programming with Annotated Grammar Estimation) • Experiment • Royal Tree Problem • DMAX Problem • Conclusion
Two groups in GP-EDA • Prototype-tree based method • It translates variable-length tree structures into fixed-length structures • PCFG based method • It is considered well suited for expressing functions in GP • Its production rules do not depend on the ancestor nodes or sibling nodes • It cannot take into account the interactions among nodes
PCFG-LA (1/10) - PCFG and PCFG-LA • PCFG • 0.7 VP → V NP • 0.3 VP → V NP NP • PCFG-LA • PCFG + latent annotations: each non-terminal is split into annotated copies (e.g. VP[1], VP[2]), each with its own rule probabilities
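As a concrete illustration of the split, here is a hedged sketch in Python: the rule probabilities below are invented for illustration (not from the paper), and each annotated copy of VP carries its own distribution over right-hand sides.

```python
# A plain PCFG assigns one probability per production rule.
pcfg = {
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V", "NP", "NP")): 0.3,
}

# PCFG-LA splits each non-terminal into annotated copies (here two:
# VP[0], VP[1]); every copy gets its own rule distribution, which lets
# the model capture context that a plain PCFG ignores.
pcfg_la = {
    ("VP[0]", ("V", "NP[0]")): 0.9,
    ("VP[0]", ("V", "NP[0]", "NP[0]")): 0.1,
    ("VP[1]", ("V", "NP[1]")): 0.2,
    ("VP[1]", ("V", "NP[1]", "NP[1]")): 0.8,
}
```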
PCFG-LA (2/10) - Probability of the annotated tree • The probability of the annotated tree: P(T[X];Θ) = π(S[x1]) ∏_{r∈D_{T[X]}} β(r) • T : derivation tree • xi : annotation of the ith non-terminal (all the non-terminals are numbered from the root) • X = {x1, x2, ...} • π(S[x]) : probability of S[x] at the root position • β(r) : probability of annotated production rule r • D_{T[X]} : multiset of the annotated rules used in tree T • Θ : set of parameters, Θ = {π, β}
PCFG-LA (3/10) - Probability of an observed tree • The probability of an observed tree • It is calculated by summing the annotated-tree probability over all annotations: P(T;Θ) = Σ_X P(T[X];Θ) • The parameters (π and β) have to be estimated by the EM algorithm
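A minimal numerical sketch of this sum, assuming a single non-terminal S in GNF with two annotations; the grammar and all probabilities below are invented for illustration, and the children of a node share one annotation, per the paper's assumption:

```python
import itertools

# Toy GNF grammar: S[x] -> + S[y] S[y]  or  S[x] -> x0 (terminal only).
H = 2
pi = [0.5, 0.5]                         # root annotation probabilities
beta = {                                # beta[(x, g, y)]; y is None for leaves
    (0, "+", 0): 0.3, (0, "+", 1): 0.2, (0, "x0", None): 0.5,
    (1, "+", 0): 0.1, (1, "+", 1): 0.6, (1, "x0", None): 0.3,
}

# Observed tree: "+" applied to two "x0" leaves.
# Latent choices: x = root annotation, y = shared child annotation.
total = 0.0
for x, y in itertools.product(range(H), repeat=2):
    p = pi[x]                           # pi(S[x]) at the root
    p *= beta[(x, "+", y)]              # annotated rule at the root
    p *= beta[(y, "x0", None)] ** 2     # annotated rules at the two leaves
    total += p                          # total accumulates P(T; Theta)
```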
PCFG-LA (4/10) - Log-likelihood and update formula • The difference of log-likelihood between parameters Θ' and Θ • The update formula can be obtained by optimizing Q(Θ'|Θ)
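The derivation follows the standard EM lower bound; a reconstruction of its shape in the paper's notation (a sketch, with X ranging over annotation assignments):

```latex
\log P(T;\Theta') - \log P(T;\Theta) \;\ge\; Q(\Theta'\mid\Theta) - Q(\Theta\mid\Theta),
\qquad
Q(\Theta'\mid\Theta) = \sum_{X} P(X \mid T;\Theta)\,\log P(T,X;\Theta')
```

The inequality follows from Jensen's inequality, so any Θ' that increases Q cannot decrease the log-likelihood.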
PCFG-LA (5/10) - Assumptions • Using GNF (Greibach Normal Form) rather than CNF (Chomsky Normal Form) • To reduce the number of parameters, assume that all right-hand-side non-terminal symbols have the same annotation
PCFG-LA (6/10) - Forward-backward probability (1/2) • Backward probability b_i^T(x) • The probability that the tree beneath the ith non-terminal S[x] is generated • Forward probability f_i^T(y) • The probability that the tree above the ith non-terminal S[y] is generated
PCFG-LA (7/10) - Forward-backward probability (2/2) • Backward probability: b_i^T(x) = Σ_y β(S[x] → g_i^T S[y]...S[y]) ∏_{j∈ch(i,T)} b_j^T(y) • Forward probability: f_1^T(x) = π(x) at the root; otherwise f_i^T(y) = Σ_x f_{pa(i,T)}^T(x) β(S[x] → g_{pa(i,T)}^T S[y]...S[y]) ∏_{j∈ch(pa(i,T),T), j≠i} b_j^T(y) • ch(i,T) : function which returns the set of non-terminal children indices of the ith non-terminal in T • pa(i,T) : returns the parent index of the ith non-terminal in T • g_i^T : terminal symbol in the CFG connected to the ith non-terminal symbol in T
PCFG-LA (8/10) - P(T;Θ) by forward and backward • P(T;Θ) = Σ_x f_i^T(x) b_i^T(x) for any non-terminal index i (at the root this reduces to Σ_x π(x) b_1^T(x)) • cover(g,T_i) : function which returns the set of non-terminal indices at which the production rule generating g (without annotations) is rooted in T_i
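These quantities can be checked numerically. The toy grammar below (one non-terminal, two annotations, invented probabilities, binary rules for brevity) verifies that Σ_x f_i(x)·b_i(x) gives the same P(T;Θ) at the root and at an internal node:

```python
# Toy GNF grammar: S[x] -> + S[y] S[y]  or  S[x] -> x0; H = 2 annotations.
H = 2
pi = (0.5, 0.5)
beta = {
    (0, "+", 0): 0.3, (0, "+", 1): 0.2, (0, "x0", None): 0.5,
    (1, "+", 0): 0.1, (1, "+", 1): 0.6, (1, "x0", None): 0.3,
}

def backward(tree, x):
    # b_i(x): probability of the subtree beneath S[x]
    g, children = tree
    if not children:
        return beta[(x, g, None)]
    return sum(
        beta[(x, g, y)] * backward(children[0], y) * backward(children[1], y)
        for y in range(H)
    )

tree = ("+", (("x0", ()), ("x0", ())))
p_root = sum(pi[x] * backward(tree, x) for x in range(H))  # P(T) at the root

# f_i(y) for the first child of the root: sum over the parent annotation x of
# (forward prob of the parent) * (rule prob) * (backward probs of siblings).
child, sibling = tree[1]
def forward_child(y):
    return sum(pi[x] * beta[(x, "+", y)] * backward(sibling, y) for x in range(H))

# The slide's identity: P(T; Theta) = sum_x f_i(x) * b_i(x) at any node i.
p_inner = sum(forward_child(y) * backward(child, y) for y in range(H))
```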
PCFG-LA (9/10) - Parameter update formula • The parameter update formula is obtained by optimizing Q(Θ'|Θ), using the forward-backward probabilities
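The resulting update has the usual inside-outside/EM shape; a hedged reconstruction consistent with the forward-backward quantities defined on the previous slides (T ranges over the selected trees; normalization is over x for π, and over the rules with left-hand side S[x] for β):

```latex
\hat{\pi}(x) \;\propto\; \sum_{T} \frac{\pi(x)\, b_1^{T}(x)}{P(T;\Theta)},
\qquad
\hat{\beta}\bigl(S[x] \to g\, S[y]\cdots S[y]\bigr) \;\propto\;
\sum_{T} \frac{1}{P(T;\Theta)}
\sum_{i \in \mathrm{cover}(g,T)} f_i^{T}(x)\,
\beta\bigl(S[x] \to g\, S[y]\cdots S[y]\bigr)
\prod_{j \in \mathrm{ch}(i,T)} b_j^{T}(y)
```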
PCFG-LA (10/10) - Initial parameters • The EM algorithm increases the log-likelihood monotonically starting from the initial parameters • Initial parameters • κ : random value uniformly distributed over [−log 3, log 3] • γ(S → g S...S) : probability of the observed production rule (without annotations) • β(S[x] → g S[y]...S[y]) ∝ γ(S → g S...S) · exp(κ)
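A hedged sketch of this initialization in Python: the observed-rule probabilities γ below are invented, and the scheme shown (perturbing γ by exp(κ) and renormalizing per annotated non-terminal) is a reconstruction of the slide's description, not the authors' code.

```python
import math
import random

H = 2                          # number of latent annotations
gamma = {"+": 0.4, "x0": 0.6}  # illustrative observed-rule probabilities

beta = {}
for x in range(H):
    raw = {}
    for g, p in gamma.items():
        # function symbols carry a child annotation y; terminals do not
        ys = range(H) if g == "+" else [None]
        for y in ys:
            # kappa is drawn uniformly from [-log 3, log 3], as on the slide
            kappa = random.uniform(-math.log(3), math.log(3))
            raw[(g, y)] = p * math.exp(kappa)   # beta proportional to gamma * e^kappa
    z = sum(raw.values())
    for (g, y), v in raw.items():
        beta[(x, g, y)] = v / z                 # normalize per S[x]
```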
PAGE (Programming with Annotated Grammar Estimation) • Flowchart: Initialization of individuals → Evaluation of individuals → Selection of individuals → Estimation of parameters → Generation of new individuals → (back to evaluation)
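The flow above can be sketched as a standard EDA loop. In this hedged sketch the estimation step is stubbed with simple rule-frequency counting so the example stays self-contained and runnable; real PAGE instead runs EM over the annotated parameters (π, β), and every name and setting here is illustrative:

```python
import random

FUNCS = {"+": 2}          # function symbols with arities (toy grammar)
TERMS = ["x0", "x1"]

def sample_tree(p_func, depth=0, max_depth=3):
    # grow a random tree; force a terminal at the depth limit
    if depth < max_depth and random.random() < p_func:
        return ("+", [sample_tree(p_func, depth + 1, max_depth)
                      for _ in range(FUNCS["+"])])
    return (random.choice(TERMS), [])

def size(tree):
    g, children = tree
    return 1 + sum(size(c) for c in children)

def estimate(trees):
    # stand-in for the EM step: fraction of nodes that used a function rule
    nodes = funcs = 0
    stack = list(trees)
    while stack:
        g, children = stack.pop()
        nodes += 1
        funcs += bool(children)
        stack.extend(children)
    return min(max(funcs / nodes, 0.05), 0.95)

def page(fitness, n_pop=40, n_gen=10):
    p_func = 0.5
    population = [sample_tree(p_func) for _ in range(n_pop)]   # initialization
    for _ in range(n_gen):
        population.sort(key=fitness, reverse=True)             # evaluation
        elite = population[: n_pop // 2]                       # truncation selection
        p_func = estimate(elite)                               # parameter estimation
        population = [sample_tree(p_func) for _ in range(n_pop)]  # new individuals
    return max(population, key=fitness)

best = page(size)   # with fitness = tree size, the model drifts toward big trees
```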
Experiment - Royal Tree Problem • Royal Tree Problem • Each function has increasing arity: a has arity 1, b has arity 2, … • A perfect tree of a given level has a root whose children are all perfect trees of the level one below, e.g. a perfect tree of level c is composed of perfect trees of level b • The level-d royal tree problem is used in these experiments • Purpose: to compare the performance of PAGE and PCFG-GP
Experiment - DMAX Problem • DMAX problem = MAX problem + deceptiveness • Function set {+_m, ×_m}, terminal set {λ, 0.95}, where λ^r = 1 • Settings: depth 4, arity m = 5, power r = 3 • Purpose: to show the superiority of PAGE over simple GP
Conclusion • On the royal tree problem, the number of annotations greatly affects search performance, and a larger annotation size gave better performance • The DMAX results showed that PAGE is highly effective on problems with strong deceptiveness • Because PAGE uses the EM algorithm, it is computationally more expensive • PAGE performs far better than the annotation-free algorithm • Optimizing the trade-off between these two conflicting factors (search performance vs. computational cost) is left for future work