
Evolving L-Systems to Capture Protein Structure Native Conformations

Gabi Escuela (1), Gabriela Ochoa (2) and Natalio Krasnogor (3). (1, 2) Department of Computer Science, Universidad Simon Bolivar, Caracas, Venezuela. (1) gabiescuela@netuno.net.ve, (2) gabro@ldc.usb.ve


Presentation Transcript


  1. Evolving L-Systems to Capture Protein Structure Native Conformations. Gabi Escuela (1), Gabriela Ochoa (2) and Natalio Krasnogor (3). (1, 2) Department of Computer Science, Universidad Simon Bolivar, Caracas, Venezuela; (1) gabiescuela@netuno.net.ve, (2) gabro@ldc.usb.ve. (3) School of Computer Science and I.T., University of Nottingham; Natalio.Krasnogor@nottingham.ac.uk

  2. Content • Proteins • Protein Structure Prediction (PSP) • The HP model • EA approaches to PSP: current encoding • L-Systems • Why a grammatical encoding? • Methods and Results • Discussion and Future Work. [Figure: 3D structure of myoglobin, showing coloured alpha helices.]

  3. Proteins • Linear chains of ~30-400 units from 20 different amino acids • Fold into a unique functional structure: the native state or tertiary structure • Show repeated substructures: alpha helices and beta sheets. [Figure: 3D structure of 1A8M.]

  4. Protein Structure Prediction (PSP) • Goal: Determining the 3D structure of proteins from their amino acid sequences • Strategy: find an amino acid chain's state of minimum energy • Solution will have practical consequences in medicine, drug development and agriculture

  5. The 2D HP Model • Two amino acid types: hydrophobic (H) and polar or hydrophilic (P) • The hydrophobic effect is the main force governing folding • q ∈ {H, P}+: each letter of q has to be placed on a vertex of a given lattice L (at each point: turn 90° left or right, or continue ahead) • Scoring function: adds -1 for each "contact" between two Hs that are adjacent in the lattice but not consecutive in q • Objective: find the organization (embedding) of q in L with minimum score (maximum contacts) • Example: HPHPPHHPHPPHPHHPPHPH on the square lattice, 9 H-H bonds, score = -9
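The scoring function can be illustrated with a minimal Python sketch. The function names (`embed`, `hp_score`) are hypothetical and not from the presentation; the move string is the absolute encoding that the deck gives for this same sequence on slide 7, and the starting point (0, 0) is an arbitrary choice.

```python
# Unit steps for absolute moves on the 2D square lattice.
STEPS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def embed(moves, start=(0, 0)):
    """Return the lattice coordinates visited by an absolute move string."""
    x, y = start
    coords = [(x, y)]
    for m in moves:
        dx, dy = STEPS[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_score(sequence, moves):
    """Score a fold: -1 per H-H lattice contact between non-consecutive residues."""
    coords = embed(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only to the right and up so each neighbouring pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


SEQ = "HPHPPHHPHPPHPHHPPHPH"
FOLD = "RDDLULDLDLUURULURRD"  # absolute encoding from slide 7
print(hp_score(SEQ, FOLD))  # -9, matching the score quoted on the slide
```

The self-avoidance check is an addition of this sketch: the slide only states that each letter is placed on a lattice vertex, but an embedding that reuses a vertex would not be a valid fold.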

  6. EA approaches to PSP: Current (Direct) Encoding • EAs and other stochastic methods: global optimization of a suitable energy function • Encodings: Cartesian coordinates, distance geometries, internal coordinates • Absolute: the structure is encoded as a string of absolute moves; for example, on the 2D square lattice, s ∈ {Up, Down, Left, Right}+ • Relative: each move is interpreted in terms of the previous one, s ∈ {Forward, TurnLeft, TurnRight}+

  7. Example protein: HPHPPHHPHPPHPHHPPHPH (L = 20) • Absolute encoding: RDDLULDLDLUURULURRD (L = 19); the first position is fixed • Relative encoding: RFRRLLRLRRFRLLRRFR (L = 18); the first and second positions are fixed
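The relationship between the two encodings can be sketched in Python as follows. The helper names are hypothetical; the sketch assumes that "right" and "left" are taken relative to the current heading, which is the reading under which the two strings on the slide describe the same fold.

```python
# Headings in clockwise order, so +1 (mod 4) is a right turn and -1 a left turn.
CLOCKWISE = "URDL"
TURN_NAME = {0: "F", 1: "R", 3: "L"}
TURN_STEP = {"F": 0, "R": 1, "L": -1}


def absolute_to_relative(moves):
    """Convert absolute moves (U/D/L/R) into relative moves (F/L/R)."""
    relative = []
    for prev, cur in zip(moves, moves[1:]):
        turn = (CLOCKWISE.index(cur) - CLOCKWISE.index(prev)) % 4
        if turn == 2:
            raise ValueError("a 180-degree reversal is not a valid lattice move")
        relative.append(TURN_NAME[turn])
    return "".join(relative)


def relative_to_absolute(first_move, relative):
    """Rebuild the absolute moves, given the (fixed) first absolute move."""
    heading = CLOCKWISE.index(first_move)
    moves = [first_move]
    for r in relative:
        heading = (heading + TURN_STEP[r]) % 4
        moves.append(CLOCKWISE[heading])
    return "".join(moves)


ABSOLUTE = "RDDLULDLDLUURULURRD"
print(absolute_to_relative(ABSOLUTE))  # RFRRLLRLRRFRLLRRFR
```

The relative string is one symbol shorter than the absolute one because the first move only sets the initial heading, which is why the slide notes that the first and second positions are fixed.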

  8. L-Systems (Lindenmayer, 1968) • A model of morphogenesis, based on formal grammars • Rewriting: define complex objects by successively replacing parts of a simple object using a set of productions • Symbols: F, f, +, -, [, ] • Axiom (S) • Production (replacement) rules • Example: axiom S: F; rules r1: F → F+f, r2: f → F; derivation: F (start), F+f (step 1), F+f+F (step 2), F+f+F+F+f (step 3)
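Parallel rewriting can be sketched in a few lines of Python. The rule set below (F → F+f, f → F) is an assumption: it is the pair of productions that reproduces the strings F, F+f, F+f+F, F+f+F+F+f shown on the slide. The function names are hypothetical.

```python
def rewrite(word, rules):
    """Apply all productions in parallel; symbols without a rule are copied."""
    return "".join(rules.get(symbol, symbol) for symbol in word)


def derive(axiom, rules, steps):
    """Return the axiom followed by the word obtained after each step."""
    words = [axiom]
    for _ in range(steps):
        words.append(rewrite(words[-1], rules))
    return words


# Assumed productions, chosen to match the derivation on the slide.
RULES = {"F": "F+f", "f": "F"}
print(derive("F", RULES, 3))  # ['F', 'F+f', 'F+f+F', 'F+f+F+F+f']
```

Every symbol is replaced in the same step, which is what distinguishes L-system rewriting from the sequential rewriting of ordinary formal grammars.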

  9. Why a Grammatical Encoding? • Specifies how to construct the phenotype • Can achieve greater scalability through self-similar and hierarchical structure • Proteins exhibit a high degree of regularity and repeated motifs • The current encoding may not be suitable for crossover and building-block transfer between individuals. [Figures: protein structure; 3D L-system.]

  10. Method • Proof of principle: can a folded protein be captured (encoded) by an L-system? • How to find that L-system: an EA is used to evolve an L-system that captures a folded protein (inverse problem) • Input: folded structure in relative coordinates, RFRRLLRLRRFRLLRRFR • Output: an L-system that, once derived, will produce the target string RFRRLLRLRRFRLLRRFR • Example output: Axiom = 01F, Rules = {0:RFR1, 1:2L2, 2:R0L}

  11. Proposed Grammatical Encoding • D0L-system (deterministic and context-free) • Alphabet: Σ = Σ_t ∪ Σ_nt, where Σ_t = {F, L, R} are the terminal symbols (relative coordinates) and Σ_nt = {0, 1, 2, ..., m-1} are the non-terminal symbols (rewriting rules), with m = maximum number of rules • Axiom: α ∈ Σ* • Rewriting rules: i: w_i, where i ∈ Σ_nt and w_i ∈ Σ* • Example: axiom R2; rules 0:R03F; 1:R01L; 2:F310; 3:LRL3

  12. Evolutionary Algorithm • Generational, with rank-based selection • Randomly generated initial population • Preset maximum number of rules • Axiom and rules: randomly generated strings with a preset maximum length • Genetic operators: uniform-like (homologous) recombination (rate = 1.0), in which complete production rules are interchanged; per-symbol mutation in both axioms and rules (deletion 30%, insertion 10%, modification 60%)
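The two genetic operators could look roughly like the Python sketch below. Several details are assumptions that the slide does not state: the per-symbol mutation rate, the dictionary layout of a genotype, how the child's axiom is chosen, and truncating words that grow past the maximum length. All names are hypothetical.

```python
import random

TERMINALS = "FLR"


def alphabet(num_rules):
    """Terminals plus one digit per rule (assumes at most 10 rules)."""
    return TERMINALS + "".join(str(i) for i in range(num_rules))


def mutate(word, symbols, rate, rng, max_len):
    """Per-symbol mutation: deletion 30%, insertion 10%, modification 60%."""
    out = []
    for symbol in word:
        if rng.random() >= rate:
            out.append(symbol)  # symbol left unchanged
            continue
        kind = rng.choices(["delete", "insert", "modify"], weights=[30, 10, 60])[0]
        if kind == "insert":
            out.extend([symbol, rng.choice(symbols)])
        elif kind == "modify":
            out.append(rng.choice(symbols))
        # "delete": the symbol is simply dropped
    return "".join(out)[:max_len]  # assumption: truncate overlong words


def recombine(parent_a, parent_b, rng):
    """Uniform-like crossover: each rule is copied whole from one parent."""
    rules = [rng.choice(pair) for pair in zip(parent_a["rules"], parent_b["rules"])]
    # Assumption: the axiom is inherited whole from a randomly chosen parent.
    axiom = rng.choice([parent_a["axiom"], parent_b["axiom"]])
    return {"axiom": axiom, "rules": rules}


rng = random.Random(0)
a = {"axiom": "31", "rules": ["3LL2", "R0RL", "RRF", "RFR1"]}
b = {"axiom": "R2", "rules": ["R03F", "R01L", "F310", "LRL3"]}
child = recombine(a, b, rng)
child["rules"] = [mutate(w, alphabet(4), 0.1, rng, 8) for w in child["rules"]]
print(child)
```

Because rule i of the child always comes from rule i of one parent, recombination is homologous: a production that works well at a given position is transferred intact rather than being cut at an arbitrary point.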

  13. Derivation and Fitness Function • Derivation: from genotype (axiom and rules) to phenotype (folded structure) • Post-processing: pruning of non-terminal symbols • Fitness calculation: number of matches between the target string and the solution (min. = 0, max. = length of the desired folding) • Example: genotype: Axiom = 31, Rules = {0:3LL2; 1:R0RL; 2:RRF; 3:RFR1}; 1st step: RFR1 R0RL; 2nd step: RFR R0RL R 3LL2 RL; 3rd step: RFRR 3LL2 RL R RFR1 LL RRF RL; after post-processing, phenotype: RFRRLLRLRRFRLLRRFR, fitness = 18
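The genotype-to-fitness pipeline of this example can be traced with a short Python sketch. Two points are assumptions rather than statements from the slide: the number of derivation steps is fixed at three, as in the example, and positions beyond the target length are ignored when counting matches (under this reading, pruning the third-step word leaves one extra trailing symbol after the 18 that match). The function names are hypothetical.

```python
TERMINALS = set("FLR")


def derive(axiom, rules, steps):
    """Rewrite every non-terminal in parallel for a fixed number of steps."""
    word = axiom
    for _ in range(steps):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


def prune(word):
    """Post-processing: drop the remaining non-terminal symbols."""
    return "".join(symbol for symbol in word if symbol in TERMINALS)


def fitness(phenotype, target):
    """Count position-wise matches; zip stops at the shorter string."""
    return sum(p == t for p, t in zip(phenotype, target))


RULES = {"0": "3LL2", "1": "R0RL", "2": "RRF", "3": "RFR1"}
TARGET = "RFRRLLRLRRFRLLRRFR"

word = derive("31", RULES, 3)
phenotype = prune(word)
print(word)                        # RFRR3LL2RLRRFR1LLRRFRL
print(phenotype)                   # RFRRLLRLRRFRLLRRFRL
print(fitness(phenotype, TARGET))  # 18
```

A fitness of 18 equals the length of the target, so this genotype captures the fold completely.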

  14. Results (1)

  15. Results (2) Evolutionary progression towards the target structure

  16. Discussion • The proposed EA discovered L-systems that capture a target folding under the HP model in 2D lattices • We are not solving the PSP yet, but... • We are proposing a novel and potentially useful generative encoding for evolutionary approaches to PSP

  17. Future work • Incorporate problem knowledge about secondary structures (beta turn, beta sheet, alpha helix) • Explore longer chains and 3D lattices
