CONFORMATIONAL OPTIMIZATION AND SAMPLING ALONG NATURAL COORDINATES
Peter Minary
Computational Structural Biology Group & Bio-X Center, Stanford University, Stanford, CA 94305
TALK OUTLINE
• Obstacles for Deciphering the Central Dogma of MB
• Challenges for Optimization & Sampling Algorithms
• Natural Coordinates for Biological Macromolecules
• Chain Closure Algorithms, Obstacles & Solutions
• An Atomic Level Insight into the Central Dogma
  • Nucleosome Positioning/Large Scale Optimization
  • Structure Space of RNA Junctions and Fractals
  • Interpretation & Refinement of Experimental Data
CENTRAL DOGMA OF MOLECULAR BIOLOGY
[Diagram: the central dogma, F. H. Crick(1), annotated with Transcriptional Regulation, Post Transcriptional Regulation, Translation, Folding, Motion, and FUNCTION]
“If you want to understand function, study structure.” F. H. C. Crick
(1) F. H. C. Crick, Nature 227 561-563 (1970).
TRANSCRIPTIONAL REGULATION
[Figure: DNA sequence (...GTCCAGTTACGAATTGCGCGC…) and its 3D structure; nucleosome structure; nucleosome positioning ~ DNA in chromatin; a transcription factor (TF) scanning DNA (…..GTGAATGCCCAG…..); energy E(Xi)]
• Grand Challenges for CSB
  • Structure Based Prediction of Nucleosome Positions
  • Structure Based Prediction of Transcription Factor (TF) Binding Sites
• Requires All Atom Representation & Rapid Optimization
• Simultaneously Explore Sequence and Structure Space
• Need Conceptually Novel Optimization/Sampling Tools
POST TRANSCRIPTIONAL REGULATION
• Grand Challenges for CSB
  • Prediction of RNA Tertiary Structure & Transport Protein Binding Sites
• Need a Novel Optimization/Sampling (O/S) Approach
EXAMPLE: mRNA TRANSPORT IN NEURONS
PROTEIN MOTION
• Current Trend: Experimentally Measured Structures Are Getting
  • Larger in Size
  • Higher in Flexibility
  • Lower in Resolution
• In Current Refinement Methods, Atomic Motions Are Modeled As
  • Independent
  • Isotropic
  • Harmonic
• To Follow the Trend, Atomic Motion in Refinement Methods Should Be
  • Collective
  • Anisotropic
  • Anharmonic
• Demand for Novel Optimization Methods for Structure Refinement
[Figure: EM images of a molecular complex, Fatty Acid Synthase (FAS)]
CHALLENGES FOR OPTIMIZATION & SAMPLING ALGORITHMS
• Roughness of the objective function, E(X)
  • Leads to rare events in Markov Chain MC(1)
  • Solutions
    • Multiple Markov Chains in Temperature(2)/Energy Domain(3, 4)
    • Transformation of Variables(5) and/or using Extra Dimensions(6)
• Large number of degrees of freedom, Nd
  • Number of energy basins is non polynomial in Nd
  • Solutions
    • Local or Global Torsional Degrees of Freedom(4,7)
    • Arbitrary/Most Relevant/Natural Degrees of Freedom(9)
(1) Metropolis, et al. J. Chem. Phys. 21, 1087-1091 (1953).
(2) Geyer, et al. Proceedings of the 23rd Symposium on the Interface, 156-163 (1991).
(3) Kou, et al. Annals of Statistics 34 1581-1619 (2006).
(4) Minary et al. Annals of Statistics 34 1638-1642 (2006).
(5) Minary et al. SIAM Journal of Scientific Computing 30 2055-2083 (2008).
(6) Minary et al. J. Chem. Phys. 118 2510-2525 (2003).
(7) Minary et al. J. Mol. Biol. 25 920-933 (2008).
(8) Dodd et al. Mol. Phys. 78 961-996 (1993).
(9) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
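The "multiple Markov chains in the temperature domain" idea can be illustrated with a minimal sketch of Metropolis sampling combined with replica exchange. The 1D `energy` function, the step size, and the temperature ladder below are hypothetical choices made for illustration only; this is not the implementation used in the cited work.

```python
import math
import random


def energy(x):
    # Hypothetical rough 1D objective: a double well with fine ripples.
    return (x * x - 1.0) ** 2 + 0.3 * math.sin(20.0 * x)


def metropolis_step(x, beta, step, rng):
    """One Metropolis move at inverse temperature beta."""
    x_new = x + rng.uniform(-step, step)
    d_e = energy(x_new) - energy(x)
    if d_e <= 0.0 or rng.random() < math.exp(-beta * d_e):
        return x_new
    return x


def swap_probability(beta_i, beta_j, e_i, e_j):
    """Acceptance probability for exchanging the states of two replicas."""
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def parallel_tempering(betas, n_sweeps, step=0.2, seed=0):
    """Run one chain per inverse temperature; return samples of chain 0."""
    rng = random.Random(seed)
    xs = [-1.0 for _ in betas]
    samples = []
    for _ in range(n_sweeps):
        # Independent Metropolis update of every replica.
        xs = [metropolis_step(x, b, step, rng) for x, b in zip(xs, betas)]
        # Attempt one swap between a random pair of neighbouring replicas.
        i = rng.randrange(len(betas) - 1)
        p = swap_probability(betas[i], betas[i + 1],
                             energy(xs[i]), energy(xs[i + 1]))
        if rng.random() < p:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        samples.append(xs[0])
    return samples


if __name__ == "__main__":
    cold = parallel_tempering([5.0, 2.0, 0.5], n_sweeps=2000)
    print(sum(1 for x in cold if x > 0.0) / len(cold))
```

The hot replicas cross barriers easily and hand their states down the ladder, which is how the cold chain escapes the rare-event problem on a rough E(X).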
NATURAL DEGREES of FREEDOM for NUCLEIC ACIDS
• Base-pair step parameters: Dx Shift, Dy Slide, Dz Rise, τ Tilt, ρ Roll, ω Twist
• Base-pair parameters: Sx Shear, Sy Stretch, Sz Stagger, κ Buckle, π Propeller, σ Opening
• dof: 10 (4 + 12 × ½)
• Moves break the chain!
[Figure: each degree of freedom drawn in its local x, y, z frame; backbone atoms O3′, C5′, C4′, O5′, P, O1′ and N labeled; RC]
NATURAL DEGREES of FREEDOM for PROTEINS: β-SHEET & α-HELIX
• Sx Shear, Sy Stretch, Sz Stagger, κ Buckle, π Propeller, σ Opening
• Moves break the chain!
CHAIN CLOSURE ALGORITHMS
• Analytical multi atom closure algorithms(1)
  • Ncd non-linear equations in Ncd unknowns, where Ncd is the number of closure dof
  • Ncd = 6 is the practical limit, given that the complexity is O(fNP(Ncd))
• Single atom Deterministic Full Closure (DFC)(2)
  • Cost efficient
  • Two solutions or no solution
• Single atom Stochastic Partial Closure (SPC)(3)
  • Cost efficient
  • A solution always exists for any size of the chain break
(1) Dodd et al. Mol. Phys. 78 961-996 (1993).
(2) Sklenar et al. J. Comp. Chem. 27 309-315 (2005).
(3) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
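The "two solutions or no solution" property of single atom deterministic closure can be seen in a simplified 2D analogue: placing one atom at fixed bond lengths from two anchored neighbours is a circle-circle intersection. The sketch below is only that planar analogue, with the hypothetical function name `close_single_atom_2d`; the DFC algorithm of the cited work operates in 3D with additional geometric constraints.

```python
import math


def close_single_atom_2d(a, b, r_a, r_b):
    """Positions at distance r_a from anchor a and r_b from anchor b.

    Returns two points (identical when the circles are tangent),
    or an empty list when the gap cannot be closed.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    d = math.hypot(dx, dy)
    # No solution: anchors coincide, are too far apart, or one circle
    # lies entirely inside the other.
    if d == 0.0 or d > r_a + r_b or d < abs(r_a - r_b):
        return []
    # Distance from a to the foot of the perpendicular on the a-b axis.
    t = (r_a ** 2 - r_b ** 2 + d ** 2) / (2.0 * d)
    h = math.sqrt(max(r_a ** 2 - t ** 2, 0.0))
    mx, my = a[0] + t * dx / d, a[1] + t * dy / d
    ox, oy = -dy / d * h, dx / d * h
    return [(mx + ox, my + oy), (mx - ox, my - oy)]


if __name__ == "__main__":
    print(close_single_atom_2d((0.0, 0.0), (5.0, 0.0), 3.0, 4.0))
    print(close_single_atom_2d((0.0, 0.0), (10.0, 0.0), 3.0, 4.0))
```

When the anchors are pulled further apart than the two bond lengths allow, the function returns nothing, which is the failure mode that a stochastic partial closure is meant to avoid.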
RECURSIVE STOCHASTIC CLOSURE
• 1 cycle of RSC = DFC[SPC[SPC[SPC[…]]]]
• One SPC step
  • Restores 4-5, breaks 3-4
• Multiple SPC steps
  • Propagate the chain break
  • Narrow the closure gap
• AC = O(Ncd) << O(fNP(Ncd))
• Ncd = 2 Nm + 5
[Figure: molten zone during the 1st cycle and at the final DFC step; m cycles]
Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
MONTE CARLO RECURSIVE STOCHASTIC CLOSURE-I
[Figure: molten zone (C4’….O3’)]
Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
MONTE CARLO RECURSIVE STOCHASTIC CLOSURE-II
• Monte Carlo Minimization(1) (MCM) is Monte Carlo on […]
• MCRSC(2) is Monte Carlo on […]

Method   Minimization      Invariant DOF   X           E evaluations
MCM      BFGS, CG          none            cart/tors   ~10-1000
MCRSC    N cycles of RSC   Xi              arbitrary   1

(1) Wales, D. J., Scheraga, H. A. Science 285 1368-1372 (1999).
(2) Minary, P., Levitt, M. J. Comp. Biol. 17(8) 993-1010 (2010).
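For orientation, Monte Carlo Minimization can be sketched generically as: perturb, minimize locally, then apply the Metropolis test to the minimized energies. The toy 1D `energy`, the plain gradient-descent minimizer (standing in for BFGS or CG), and all parameter values are assumptions for illustration; the sketch does not reproduce MCRSC, which replaces the minimization with cycles of RSC.

```python
import math
import random


def energy(x):
    # Hypothetical rugged 1D landscape with several local minima.
    return 0.1 * x * x + math.sin(3.0 * x)


def local_minimize(x, lr=0.02, n_steps=500, h=1e-5):
    """Plain gradient descent with a central-difference gradient."""
    for _ in range(n_steps):
        grad = (energy(x + h) - energy(x - h)) / (2.0 * h)
        x -= lr * grad
    return x


def monte_carlo_minimization(x0, n_cycles, beta=2.0, kick=1.5, seed=0):
    """Metropolis walk over local minima; returns the best minimum seen."""
    rng = random.Random(seed)
    x = local_minimize(x0)
    best = x
    for _ in range(n_cycles):
        # Perturb the current minimum, then relax to the nearest minimum.
        trial = local_minimize(x + rng.uniform(-kick, kick))
        d_e = energy(trial) - energy(x)
        if d_e <= 0.0 or rng.random() < math.exp(-beta * d_e):
            x = trial
        if energy(x) < energy(best):
            best = x
    return best


if __name__ == "__main__":
    x_best = monte_carlo_minimization(4.0, n_cycles=50)
    print(x_best, energy(x_best))
```

Each cycle here spends hundreds of energy evaluations inside `local_minimize`, which is the cost the table above contrasts with a single evaluation per MCRSC move.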
RECURSIVE STOCHASTIC vs DETERMINISTIC FULL CLOSURE in MONTE CARLO: a B-DNA
[Figure panels: Dx, Dy, Dz, Sx, Sy, Sz, each drawn in its x, y, z frame]
• RSC works with move sizes an order of magnitude larger than DFC
• RSC is like a wire: you pull, and the system deforms to follow the change
• dof: 6
• E2 binding DNA: 5’-ACCGAATTCGGT-3’
• Force Field: AMBER99-bs0
Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
RECURSIVE STOCHASTIC CLOSURE vs LOOP TORSIONAL SAMPLING in MONTE CARLO: an α+β PROTEIN
• Ncd = 19
• SCOP id: d1div_2, 55 residue domain
[Figure: results labeled (1) and (2)]
(1) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
(2) Minary & Levitt J. Mol. Biol. 25 920-933 (2008).
IN SILICO NUCLEOSOME POSITIONING: THE METHOD, GENERAL PIPELINE
IN SILICO NUCLEOSOME POSITIONING: APPLICATION TO CHROMOSOME 14
• Yeast Chromosome 14
  • 187k-189k from SGD(1)
  • Experimental Data(2)
• Nucleosome template
  • 1.9 Å resolution
  • pdb code (1kx3)(3)
• Slide nucleosome along DNA
  • Slide a 147 bp window
  • Design template
• Run MCRSC on all structures
  • Force field: AMBER99-bs0(5)
  • Software: MOSAICS(6)
• Get probability profile
  • P(i) ~ exp(-β <E(i)>)
[Figure: P(i) versus position i, ab initio and in vitro, for 187k-189k and 201k-207k. Minary & Levitt]
(1) Cherry, J. M. et al., Nucleic Acids Res. 26, 73-79 (1998).
(2) Kaplan, N. et al., Nature 458, 362-366 (2009).
(3) Davey, C. A. et al., J. Mol. Biol. 319 1097-1113 (2002).
(4) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
(5) Perez et al., Biophys. J. 92 3817-3827 (2007).
(6) Minary (2010).
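The last pipeline step, P(i) ~ exp(-β <E(i)>), can be sketched as a Boltzmann weighting of the mean energy obtained for each window position i. The function name `occupancy_profile`, the normalization to unit sum, and the example energies are assumptions for illustration; the slide gives only the proportionality.

```python
import math


def occupancy_profile(mean_energies, beta):
    """Normalized Boltzmann weights P(i) from mean energies <E(i)>."""
    # Subtract the minimum first so the exponentials cannot overflow.
    e_min = min(mean_energies)
    weights = [math.exp(-beta * (e - e_min)) for e in mean_energies]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    # Hypothetical mean energies for five consecutive window positions.
    energies = [-120.0, -123.5, -121.0, -119.0, -122.0]
    for i, p in enumerate(occupancy_profile(energies, beta=1.0)):
        print(i, round(p, 4))
```

Shifting all energies by a constant leaves the profile unchanged, so only energy differences between window positions matter.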
IN SILICO NUCLEOSOME POSITIONING: NUCLEOSOME OCCUPANCY
Yeast Chromosome 14
[Figure: P(i) versus position i for in vitro, ab initio, and in vivo data; one panel spans 187000-207000, the other 191000-199000. Minary & Levitt]
EXPLORING RNA STRUCTURE SPACE: HIERARCHICAL NATURAL DOFs/MOVES (HNM)
[Figure: hierarchy levels L1, L2, L3, L4]
EXPLORING RNA STRUCTURE SPACE: RNA 4 WAY JUNCTION, SAMPLING METHODS
• Sampling Methods = MCRSC(1) + User Defined Move Sets(1,2,3)
• NM-MC(1,3): move set L1
• HNM-MC(1,2,3): move sets L1-L2, L1-L3, L1-L4
[Figure: the two components labeled (Medicine/Physics) and (Chemistry/Biology); junction with levels L1, L2, L3, L4]
(1) Minary, P., Levitt, M. J. Comp. Biol. 17(8) 993-1010 (2010).
(2) Sim, A., Levitt, M., Minary, P. To be submitted.
(3) Minary, P., MOSAICS: http://csb.stanford.edu/minary/MOSAICS
EXPLORING RNA STRUCTURE SPACE: RNA 4 WAY JUNCTION
[Figure panels: (a) NM-MC(1,5), L1; (b) FA-MC-Sym(2); (c) FA-Rosetta(3); (d) HNM-MC(1,4,5), L1-L4; junction with levels L1, L2, L3, L4]
• Necessary condition for unbiased sampling
  • Symmetric RNA -> distributions coincide
• Easy to improve by field specific move set
  • RNA: relative arrangement of stem loops
• Comparing to Fragment Assembly
  • Biased and non continuous sampling
  • Dependence on fragment libraries
(1) Minary, P., Levitt, M. J. Comp. Biol. 17(8) 993-1010 (2010).
(2) Parisien and Major, Nature, 452, 51 (2008).
(3) R. Das, J. Karanicolas, and D. Baker, Nat. Methods 7 (4), 291 (2010).
(4) Sim, A., Levitt, M., Minary, P., To be submitted.
(5) Minary, P. MOSAICS: http://csb.stanford.edu/minary/MOSAICS
EXPLORING RNA STRUCTURE SPACE: FRACTAL RNA, BEYOND CURRENT METHODS
[Figure: error(i) versus i (x 10^4) for move sets L1-L4 and L1-L7, HNM-MC(1,2,3)]
• Necessary condition for unbiased sampling
  • Symmetric RNA -> arm end distributions coincide
• Further improvement by L5, L6, L7
  • No limitation on improvement
• Benchmark with different move sets
  • Accuracy converges by L7(1,2,3)
(1) Minary, P., Levitt, M. J. Comp. Biol. 17(8) 993-1010 (2010).
(2) Sim, A., Levitt, M., Minary, P., To be submitted.
(3) Minary, P. MOSAICS: http://csb.stanford.edu/minary/MOSAICS
EXPLORING RNA STRUCTURE SPACE: FRACTAL RNA, WHY/HOW DOES IT WORK?
• Use embedded subspaces
• In particular
  • […]: 6 DOFs / main arms(2)
  • […]: 6 DOFs / arms of arms(2)
  • […]: 10 DOFs / nucleotides(1)
• Low cost method to approximate […]
• Multi scale integration(3) along […]
  • around all […]
  • around all […]
(1) Minary, P., Levitt, M. J. Comp. Biol. 17(8) 993-1010 (2010).
(2) Sim, A., Levitt, M., Minary, P., To be submitted.
(3) Minary, P. MOSAICS: http://csb.stanford.edu/minary/MOSAICS
CRYO-EM REFINEMENT: OBJECTIVE
[Figure: EM images of a molecular complex, Fatty Acid Synthase (FAS); objective illustrated with panels labeled initial model, EM image, and refined model]
CRYO-EM REFINEMENT: VALIDATION I
[Figure: initial structure (18 Å rmsd), optimization(1)-(3) along natural dof, and refined structure (2 Å rmsd), shown with the target structure and the target projection]
(1) Zhang, Minary, Levitt In preparation.
(2) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
(3) Minary, P. MOSAICS: http://csb.stanford.edu/minary/MOSAICS
CRYO-EM REFINEMENT: VALIDATION II, CROSS CORRELATION OF MAPS
[Figure: Lysozyme; cross correlation (cc) versus projection angle]
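The slide does not state which cross correlation is plotted. One common choice for comparing two density maps or projections is the normalized (Pearson-style) cross correlation, sketched below on flattened pixel lists; the function name `cross_correlation` and the sample values are illustrative assumptions.

```python
import math


def cross_correlation(map_a, map_b):
    """Normalized cross correlation of two equally sized flattened maps."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal size")
    n = len(map_a)
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("cross correlation is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    target = [0.0, 0.2, 0.9, 0.4, 0.1]   # hypothetical projection pixels
    model = [0.1, 0.3, 0.8, 0.5, 0.0]
    print(round(cross_correlation(target, model), 3))
```

With this definition the score lies between -1 and 1 and is insensitive to the overall scale and offset of the density.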
CRYO-EM REFINEMENT: THE PROTOCOL
E_total = Weight * E_EM + E_molecule
[Figure: Lysozyme]
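A minimal sketch of how the weight trades map agreement against molecular energy in the total energy above. The definition E_EM = 1 - cc, the candidate names, and all numbers are hypothetical; the slide specifies only the weighted sum.

```python
def em_energy(cc):
    # Assumed fit term: lower is better, zero for a perfect correlation.
    return 1.0 - cc


def total_energy(cc, e_molecule, weight):
    """E_total = Weight * E_EM + E_molecule."""
    return weight * em_energy(cc) + e_molecule


def pick_best(candidates, weight):
    """Name of the candidate (name, cc, e_molecule) with lowest E_total."""
    return min(candidates, key=lambda c: total_energy(c[1], c[2], weight))[0]


if __name__ == "__main__":
    candidates = [
        ("relaxed", 0.60, -100.0),  # better force-field energy, poorer fit
        ("fitted", 0.90, -95.0),    # better fit, slightly strained
    ]
    for w in (0.0, 10.0, 100.0):
        print(w, pick_best(candidates, w))
```

A small weight lets the force field dominate, while a large weight favours the model that matches the EM data, so the weight has to be tuned for each data set.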
REFINEMENT CRYO-EM REFINEMENT
CRYO-EM REFINEMENT: DOMAIN FLEXIBILITY
[Figure: panels credited to (1)-(3) and (4)]
(1) Zhang, Minary, Levitt In preparation.
(2) Minary & Levitt J. Comp. Biol. 17(8) 993-1010 (2010).
(3) Minary, P. MOSAICS: http://csb.stanford.edu/minary/MOSAICS
(4) Courtesy of Steve Ludtke, Baylor College, Texas.
CONCLUSION
• CSB Has Limited Impact due to Inefficient Conformational Sampling
• Novel Algorithms Supporting Natural DOF May Offer the Solution
• Our Novel Approach May Open New Avenues
  • In the Refinement and Interpretation of Experimental Data
  • In the Use of Structural Information in Molecular Biology
• Atomic Level Understanding of the Central Dogma of Molecular Biology (CDMB) May Become a Reality with Natural Coordinates (NC)
“If the code does indeed have some logical foundation then it is legitimate to consider all the evidence, both good and bad, in any attempt to deduce it.” F. H. C. Crick
[Diagram: CDMB -> FUNCTION]
ACKNOWLEDGEMENTS
• Michael Levitt, Computer Sci. & Structural Biology, Stanford, US
• Jernej Ule, Molecular Biology/MRC, Cambridge, UK
• Peter Lukavsky, Molecular Biology/MRC, Cambridge, UK
• Sebastian Doniach, Physics, Stanford, US
• Zev Bryant, Bioengineering, Stanford, US
• Wing H Wong, Statistics, Stanford, US
• Wah Chiu, Baylor College, Texas, US
• Adelene Sim, Physics, Stanford, US (graduate student)
• Gaurav Chopra, Mathematics, Stanford, US (graduate student)
• Junjie Zhang, Baylor College and Stanford, US (postdoc)
• Anatole von Lilienfeld & Workshop Organizing Committee