1 / 33

Exploring Folding Landscapes with Motion Planning Techniques

This study explores the folding landscapes of ribonucleic acid (RNA) using motion planning techniques. The goal is to understand the folding process and predict the native conformation of RNA molecules. The study uses a probabilistic roadmap method (PRM) to construct a roadmap in the conformation space and analyze the folding pathways.

aredding
Download Presentation

Exploring Folding Landscapes with Motion Planning Techniques

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Exploring Folding Landscapeswith Motion Planning Techniques Bonnie Kirkpatrick2, Xinyu Tang1, Shawna Thomas1, Dr. Nancy Amato1 1 Texas A&M University 2 Montana State University 6 Aug 2003

  2. Outline • Motivation: Biopolymers • Goal: Folding Landscapes • Method: Motion Planning • Application: RNA Folding

  3. Outline • Motivation: Biopolymers • Goal: Folding Landscapes • Method: Motion Planning • Application: RNA Folding

  4. What is Ribonucleic Acid (RNA) ? • Is composed of a sequence of nucleotides • Folds in 3D into energetically optimal conformations • Is essential to the process of carrying out a gene functions in cells. • Performs specific functions including protein synthesis, acting as catalysts, and splicing introns, and regulating activities • The folding behaviors of the molecule tell us much about their structure and function.

  5. RNA Molecules • Primary Structure • Sequence of bases • Bases: A, C, G, U • e.g. ACGUGCCAUCG • Obtained by experiment • Secondary Structure • A 2D, planar representation • Base Pair: • A-U • G-C • G-U • Tertiary Structure • The sequence loops back on itself and folds in 3D.

  6. RNA Conformations • Chemical bonds (or contacts) form between complementary residues in close proximity. • There are many possible conformationsof the primary sequence. • e.g. CACAGAGUGU • Potential energy calculations based on number and types of bonds are used to classify conformations. • The stable, low-energy conformation is known as the native structure. • Conformations with few bonds and high energy are referred to as unfolded.

  7. Planar Representations • Bonds between base pairs are lines or parentheses (((((.(((....)))))))) All representations are equivalent

  8. Violates criteria (1) Violates criteria (2) Secondary Structure Formalized • A secondary structure conformation is specified by a set of intra-chain contacts (bonds between base pairs) that follow certain rules. • Given any two intra-chain contacts [i, j] with i < j and [k,l] with k < l, then: • If i = k, then j = l • Each base can appear in only one contact pair • If k < j, then i < k < l < j • No pseudo-knots Pseudo-knot

  9. Outline • Motivation: Biopolymers • Goal: Folding Landscapes • Method: Motion Planning • Application: RNA Folding

  10. Folding Landscapes • A “grand challenge” problem in biology • Study the kinetics of folding • Each RNA has a unique folding landscape • Assumption: Native state ↔ Lowest energy conformation • Different from the structure prediction problem • Prediction of the native conformation

  11. The Folding Process (a.k.a. the black box) Unfolded Conformation (high energy) AGGCUACUGGGAGCCUUCUCCCC Physical Laws cause folding Native Conformation (low energy)

  12. Conformation Space Potential Native State Tetrahymena Ribozyme Landscape [Russell, Zhuang, Babcock, Millett, Doniach, Chu, and Herschlag, 2002] Folding Landscapes • Description of the “black box” • A space in which every point corresponds to a conformation (or set of conformations) and its associated potential energy value (C-space). • A complete folding landscape contains a point for every possible conformation of a given sequence.

  13. Unfolded Energy Barrier Native State Phenylalanine tRNA [Hofacker, 1998] An RNA Folding Pathway

  14. Outline • Motivation: Biopolymers • Goal: Folding Landscapes • Method: Motion Planning • Application: RNA Folding

  15. start goal obstacles Motion Planning (Basic) Motion Planning (in a nutshell): Given amovableobject, find asequence of valid configurationsthat moves the object from the start to the goal. Motion Planning for Foldable Objects: Given a foldable object, find a valid folding sequence that transforms the object from one folded state to another.

  16. A conformation Probabilistic Roadmap Method (PRM): Folding Landscapes Conformation space Construct the roadmap: 1. Generate nodes. 2. Connect to form roadmap The Roadmap is like a net being laid down on the RNA’s potential landscape. Potential • Now the roadmap can be used: • To extract multiple paths • To compute population kinetics Native state

  17. Outline • Motivation: Biopolymers • Goal: Folding Landscapes • Method: Motion Planning • Application: RNA Folding

  18. Where n is the number of possible contact pairs PRM: Conformation Space C-space • C is less than the set of every possible combination of contact pairs. • C contains only valid secondary structures. • C-space is very large. • Sequence: (ACGU)10 • Length: 40 nucleotides • C-Space: 1.6x108 structures • Smaller than the conformation space for protein folding. • Purpose: generate a roadmap in C-space that describe the space without covering it

  19. PRM: Node Generation • Enumeration of C-space • Only feasible for small RNA • Enumeration of stack-based conformations • A stack is any two base pairs that occur sequentially in the secondary structure: .((.....)) • Maximal pair random sampling • Every generated node has the maximal number of contacts possible

  20. PRM: Node Connection • Choosing nodes to connect • K-closest • Fixed Radius • Distance Metric: Base Pair Distance • The number of contact pairs that differ between the two conformations • This is the number of contacts that must be opened or closed to move from one conformation to the next ..(..(.)..).. ..(.((.)).)..

  21. C1 = S1 S2 Sn-1 Sn = C2 PRM: Node Connection • Given two nodes in C-Space, C1 and C2, find a path between them consisting of a sequence of nodes: { C1 = S1, S2, …, Sn-1, Sn = C2 } • The path must have the property that for each i, 1 < i<n, the set of contact pair of Si differs from that of Si-1 by the application of one transformation operation: (1) open or (2) close a single contact pair. Discrete move-set.

  22. open close open close Bases involved in the change are marked in red. Connection Algorithm • more contacts  less potential energy • Heuristic: if a contact is opened by the transition from one node to another, try to close a contact in the next transition c1 = s1: ..(.((..))).. s2: ..(.(....)).. s3: ..(.((.).)).. s4: ..(..(.)..).. C2 = s5: ..(.((.)).)..

  23. C1 = S1 S2 Sn-1 Sn = C2 Up-hill Si+1 Si Edge Weight • Weights indicate the energetic feasibility of the edge. • ΔEi = E(si+1) – E(si)

  24. A conformation Roadmap Analysis • What does the roadmap tell us? • Folding Pathways

  25. A conformation Roadmap Analysis • Population Kinetics

  26. A conformation Roadmap Analysis • Population Kinetics

  27. A conformation Roadmap Analysis • Population Kinetics

  28. A conformation Roadmap Analysis • Population Kinetics

  29. A conformation Roadmap Analysis • Population Kinetics

  30. A conformation Roadmap Analysis • Population Kinetics

  31. Roadmap Analysis • Population Kinetics • Solved using a differential equation

  32. Future Work • Validation • Compare other statistical mechanical models • Compare with experimental results • Compare our sampled landscapes with complete landscapes • Explore the limits of our model • Try different sampling methods • Experiment with different distance metrics

  33. References Shi-Jie Chen and Ken A. Dill. Rna folding energy landscapes. PNAS, 97:646-651, 2000. Ivo L. Hofacker. Rna secondary structures: A tractable model of biopolymer folding. J.Theor.Biol., 212:35-46, 1998. Ivo L. Hofacker Jan Cupal and Peter F. Stadler. Dynamic programming algorithm for the density of states of rna secondry structures. Computer Science and Biology 96, 96:184-186, 1996. L. Kavraki, P. Svestka, J. C. Latombe, and M. Overmars. Probabilistic roadmaps for path planning in high-dimensional conguration spaces. IEEE Trans. Robot. Automat., 12(4):566-580, August 1996. J. C. Latombe. Robot Motion Planning. Kluwer Academic Publishers, Boston, MA, 1991. R. Li and C. Woodward. The hydrogen exchange core and protein folding. Protein Sci., 8:1571-1591, 1999. John S. McCaskill. The equilibrium partition function and base pair binding probabilities for rna secondary structure. Biopolymers, 29:1105-1119, 1990. Ruth Nussinov, George Piecznik, Jerrold R. Griggs, and Danel J. Kleitman. Algorithms for loop matching. SIAM J. Appl. Math., 35:68-82, 1972. R. Russell, X. Zhuang, H. Babcock, I. Millet, S. Doniach, S. Chu, and D. Herschlag. Exploring the folding landscape of a structured RNA. Proc. Natl. Acad. Sci., 99:155-60., 2002. Proc. Natl. Acad. Sci. U.S.A. 99, 155-60. D. Sanko and J.B. Kruskal. Time warps, string edits and macromolecules: the theory and practice of sequence comparison. Addison Wesley, London, 1983. A.P. Singh, J.C. Latombe, and D.L. Brutlag. A motion planning approach to exible ligand binding. In 7th Int. Conf. on Intelligent Systems for Molecular Biology (ISMB), pages 252-261, 1999. G. Song and N. M. Amato. Using motion planning to study protein folding pathways. In Proc. Int. Conf. Comput. Molecular Biology (RECOMB), pages 287-296, 2001. Stefan Wuchty. Suboptimal secondary structures of rna. Master Thesis, 1998.

More Related