Optimal Bundling of Transmembrane Helices Using Sparse Distance Constraints. Genetha Anne Gray Computational Sciences & Mathematics Research Sandia National Labs, Livermore IPAM April 16, 2004.
Optimal Bundling of Transmembrane Helices Using Sparse Distance Constraints Genetha Anne Gray Computational Sciences & Mathematics Research Sandia National Labs, Livermore IPAM April 16, 2004 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2004-1799P
Outline • Description of Transmembrane Proteins • Integrated Approach to the Structure Determination Problem • Numerical Experiments • Future Work • APPSPACK (if time permits)
What are Transmembrane Proteins? • Chain of amino acids that traverses the cell membrane one or more times. • Amino acids inside the membrane form stable secondary structures. • We focus on all α-helical transmembrane domains.
Importance: • Approximately 1/3 of the proteins encoded by a typical genome. • Malfunction, mutation, or absence can result in disease. • Important target of drug design. Functions: • Form channels through which certain ions can enter or leave the cell. • Act as signal transduction receptors. • Play roles in cell recognition, mediation of the senses, and cell-to-cell communication.
Progress in Protein Structure Determination • Soluble proteins: 25,000+ in the Protein Data Bank (PDB) (www.rcsb.org/pdb) • Transmembrane proteins: ~75 in Membrane Proteins of Known 3D Structure (www.blanco.biomol.uci.edu/Membrane_Proteins_xtal.html)
Why is this hard? • Difficult to crystallize • Difficult to apply NMR techniques • Molecular conformation can change when released from cell membrane • Few suitable templates for homology modeling CONCLUSION: Must develop an integrated computational/experimental model
Integrated Model • Predict transmembrane regions: hydropathicity analysis (Hirokawa et al. 1998; Gromiha 1999; Nikiforovich et al. 2001; Vaidehi et al. 2002). • Construct individual helices: energy minimization or MD simulations of ideal helices to predict sequence-specific distortions in the helices, such as kinks induced by proline (Vaidehi et al. 2002). • Assemble helix bundle. • Add interhelical loops: WHATIF (Vriend 1990) and JACKAL (Honig). • Add side-chains: SCWRL (Dunbrack), JACKAL (Honig), CENTIPEDE (Slepoy, SNL/NM).
Assemble the Helix Bundle = Find the Positions of the Cylinders
Helix Bundle Assembly • Start: library of millions of helix bundle templates. • Satisfy constraints: apply the distance constraints to obtain a sub-library of several hundred to several thousand structures satisfying the set of distance constraints. • Rank and refine the structures using a scoring function based on known structure. • Result: set of several refined structures satisfying distance and theoretical constraints.
Experimental Distance Constraints • Determine distances within a protein • Low to moderate resolution structural data • Laboratory techniques: • Chemical Crosslinking - Proteolysis - Mass Spec • Site Directed Spin Labeling - Dipolar Electron Paramagnetic Resonance Spectroscopy (SDSL - EPR) • Resonance Energy Transfer (RET) • Disulfide Mapping
Generating helix bundle templates [Figure: template stages labeled Unlabeled, Labeled, Oriented, and Atomistic, with the quantities Dc and Da marked] • Unlabeled template generator (J. Bowie, Protein Science 1999) • Results for 7 helices: 150,000 unlabeled templates give 80% library saturation • 2·N! labelings (10,080 for 7 helices) • 7 Å ≤ Dc ≤ 22 Å • Da 13.4 Å • No crossover • 27.2 labelings per template • 4.5 × 10^6 labeled templates • Search the conformation space of a given labeled template • Da = 60° gives RMSD = 2.5 Å • 78,125 orientations per labeled template • 3.0 × 10^11 oriented templates
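The counts on this slide can be sanity-checked with a few lines of arithmetic. The sketch below is not the template generator itself: the function name `labelings_upper_bound` is hypothetical, and reading 78,125 as 5^7 (five orientations for each of the 7 helices) is an assumption the slide does not state.

```python
from math import factorial

def labelings_upper_bound(n_helices: int) -> int:
    """Upper bound 2 * N! on the labelings of an N-helix template (from the slide)."""
    return 2 * factorial(n_helices)

n = 7
print(labelings_upper_bound(n))   # 10080

# Assumption: 78,125 orientations per labeled template corresponds to 5**7.
orientations_per_template = 5 ** n
print(orientations_per_template)  # 78125

# Order-of-magnitude check using the slide's rounded figures.
unlabeled = 150_000
labelings_per_template = 27.2
labeled = unlabeled * labelings_per_template
oriented = labeled * orientations_per_template
print(f"{labeled:.2e} labeled templates, {oriented:.2e} oriented templates")
```

The two products land on the same order of magnitude as the 4.5 × 10^6 and 3.0 × 10^11 quoted on the slide, which presumably come from the exact, unrounded counts.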
Helix Bundle Assembly • Start: library of millions of helix bundle templates. • Satisfy constraints: apply the distance constraints to obtain a sub-library of several hundred to several thousand structures satisfying the set of distance constraints (Faulon et al. 2003). • Rank and refine the structures using a scoring function based on known structure. • Result: set of several refined structures satisfying distance and theoretical constraints.
Structures satisfying distance constraints are close to their native structures
What are the effects on the number of helix arrangements of… • …the number of distances? Decreases exponentially as the number of distances increases • …the error on the distance? Increases exponentially as the distance error increases • …the radius of the distance graph? Increases exponentially as the distance graph radius increases
Filtering oriented templates with distances [Figure: helix arrangement of rhodopsin (1F88), helices H1 to H7, and the corresponding distance graphs with radius = 1 and radius = 3] • 38 possible distances
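The filtering step can be sketched as a simple predicate over a library. This is a minimal illustration under stated assumptions, not the published filtering code: each template is reduced to one hypothetical reference point per helix, and the names `satisfies` and `filter_library`, the toy coordinates, and the measured distances are all made up for the example.

```python
from math import dist

def satisfies(template, constraints, max_error):
    """True if every measured distance is matched within max_error (in Å)."""
    return all(
        abs(dist(template[a], template[b]) - d_exp) <= max_error
        for (a, b, d_exp) in constraints
    )

def filter_library(library, constraints, max_error):
    """Keep only the templates that satisfy all distance constraints."""
    return [t for t in library if satisfies(t, constraints, max_error)]

# Hypothetical toy library: one (x, y, z) reference point per helix.
library = [
    {"H1": (0.0, 0.0, 0.0), "H2": (10.0, 0.0, 0.0), "H3": (0.0, 10.0, 0.0)},
    {"H1": (0.0, 0.0, 0.0), "H2": (25.0, 0.0, 0.0), "H3": (0.0, 10.0, 0.0)},
]
# Hypothetical measurements: (helix, helix, distance in Å).
constraints = [("H1", "H2", 11.0), ("H1", "H3", 9.0)]

kept = filter_library(library, constraints, max_error=2.0)
print(len(kept))  # 1
```

Loosening `max_error` lets more templates through, which matches the trend described for the distance error.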
Experimental design implications • For n helices, 2n distances with errors of less than 8 Å result in a solution set “small enough” for further processing. • Experiments designed to measure distances from the same reference helix are preferable to those designed to generate distances linking helices in a daisy chain manner. • The number of solution templates decreases faster with decreasing error than with increasing number of distances.
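The difference between the two experimental designs can be made concrete by computing the radius of the distance graph. The sketch below assumes the standard graph-theoretic definition (the smallest eccentricity over all nodes) and a connected graph; the helper names are hypothetical.

```python
from collections import deque

def eccentricity(adj, start):
    """Longest shortest-path length (in edges) from start, found by BFS."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen[v] = seen[u] + 1
                queue.append(v)
    return max(seen.values())

def radius(edges, nodes):
    """Graph radius: minimum eccentricity over all nodes (connected graph assumed)."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return min(eccentricity(adj, n) for n in nodes)

helices = [f"H{i}" for i in range(1, 8)]
star = [("H1", h) for h in helices[1:]]    # all distances from one reference helix
chain = list(zip(helices, helices[1:]))    # daisy chain H1-H2-...-H7

print(radius(star, helices))   # 1
print(radius(chain, helices))  # 3
```

Both designs use six distances for seven helices, but the reference-helix design has the smaller radius, which is the property the slide recommends.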
Helix Bundle Assembly • Start: library of millions of helix bundle templates. • Satisfy constraints: apply the distance constraints to obtain a sub-library of several hundred to several thousand structures satisfying the set of distance constraints. • Rank and refine the structures using a scoring function based on known structure: BUNDLER. • Result: set of several refined structures satisfying distance and theoretical constraints.
Components of Bundler • Score (penalty) terms: • Experimental distances • Van der Waals overlap • Packing angle • Helix contacts • Packing distance • Packing density: atomic volume / solvent-accessible volume • Side-chain interaction propensity
Bundler • Exp. Distances • Packing Angles • Packing Distance • Packing Density • Helix contacts • Van der Waals Repulsion (X-PLOR, Brunger, 1992) • Side-Chain Interaction Propensities (Adamian and Liang, 2001)
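To make the idea of a penalty score concrete, here is a minimal sketch of how terms like these could be combined. The slides do not give Bundler's formulas, so everything here is an assumption for illustration: the flat-bottom quadratic form of the distance term, the weighted sum, the weights, and the names `distance_penalty` and `total_score`.

```python
def distance_penalty(model_distances, measured, tolerance):
    """Sum of squared violations beyond the tolerance (assumed flat-bottom form)."""
    total = 0.0
    for pair, d_exp in measured.items():
        excess = abs(model_distances[pair] - d_exp) - tolerance
        if excess > 0:
            total += excess ** 2
    return total

def total_score(terms, weights):
    """Assumed combination rule: weighted sum of penalty terms, lower is better."""
    return sum(weights[name] * value for name, value in terms.items())

# Hypothetical numbers: one measured distance (Å) and one model distance (Å).
measured = {("H1", "H2"): 10.0}
model = {("H1", "H2"): 14.0}

d_term = distance_penalty(model, measured, tolerance=2.0)
print(d_term)  # 4.0

score = total_score(
    {"exp_distances": d_term, "vdw_repulsion": 1.5},
    {"exp_distances": 1.0, "vdw_repulsion": 2.0},
)
print(score)  # 7.0
```

A model distance inside the tolerance window contributes nothing, so the term only penalizes structures that disagree with the measurement by more than its error.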
Variables: How to Reposition the Helices • A transmembrane protein with m helices has 6m positional variables. • All 6m variables have simple bounds resulting from the cell membrane.
Optimization Problem: min f(x) s.t. L ≤ x ≤ U • The objective function f is the Bundler penalty function. • The unknown x is a 6m-dimensional vector of spatial positions of the helices. • The derivative of f is not explicitly available.
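A small sketch of how the 6m-dimensional unknown and its simple bounds might be handled in code. The slides do not say which six quantities describe each helix; splitting them into three translations and three rotation angles is an assumption, as are the helper names and the bound values.

```python
def unpack(x, m):
    """Split a flat 6m vector into m (translation, rotation) pairs (assumed layout)."""
    if len(x) != 6 * m:
        raise ValueError("expected 6 variables per helix")
    return [
        (tuple(x[6 * k : 6 * k + 3]), tuple(x[6 * k + 3 : 6 * k + 6]))
        for k in range(m)
    ]

def is_feasible(x, lower, upper):
    """Check the simple bounds L <= x <= U componentwise."""
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))

def project(x, lower, upper):
    """Clip each variable into its interval [L_i, U_i]."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

m = 7                          # rhodopsin-sized bundle
x = [0.0] * (6 * m)            # 42 variables
x[0] = 99.0                    # push one translation far outside the bounds
lower = [-20.0] * (6 * m)      # hypothetical bounds
upper = [20.0] * (6 * m)

print(len(unpack(x, m)))             # 7
print(is_feasible(x, lower, upper))  # False
print(project(x, lower, upper)[0])   # 20.0
```

Because the constraints are simple bounds, feasibility can be restored by clipping, which is what makes derivative-free methods straightforward to apply here.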
Simulated Annealing (SA) [Flowchart: randomize structure → score new structure → accept or reject → reduce T? If yes, T_n = 0.95 T_{n-1} → repeat] • Advantages: • Global method • Easy to implement • Disadvantages: • Extensive computational work • Sensitive to parameter choices • Difficult to fine tune • Does not employ knowledge gained in previous iterations
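The loop in the flowchart can be sketched as follows. This is a generic bound-constrained simulated annealing routine, not the SA-Bundler code: the Metropolis acceptance rule, the single-coordinate move, the step size, and the toy objective are assumptions; only the cooling factor of 0.95 comes from the slide.

```python
import math
import random

def simulated_annealing(f, x0, lower, upper, t0=500.0, cooling=0.95,
                        cycles=100, steps_per_cycle=50, step=1.0, seed=0):
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best, fbest = list(x), fx
    t = t0
    for _ in range(cycles):
        for _ in range(steps_per_cycle):
            # Randomize one coordinate, staying inside the bounds.
            i = rng.randrange(len(x))
            cand = list(x)
            cand[i] = min(max(cand[i] + rng.uniform(-step, step), lower[i]), upper[i])
            fc = f(cand)
            # Assumed Metropolis rule: accept improvements, sometimes accept worse.
            if fc <= fx or rng.random() < math.exp((fx - fc) / t):
                x, fx = cand, fc
                if fx < fbest:
                    best, fbest = list(x), fx
        t *= cooling  # T_n = 0.95 * T_{n-1}
    return best, fbest

# Toy objective standing in for the penalty function.
def sphere(v):
    return sum(vi * vi for vi in v)

start = [5.0, -5.0]
best, fbest = simulated_annealing(sphere, start, [-10.0, -10.0], [10.0, 10.0])
print(fbest <= sphere(start))  # True
```

The starting temperature, number of cycles, and step size all have to be chosen by hand, which illustrates the parameter sensitivity listed among the disadvantages.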
Asynchronous Parallel Pattern Search (APPS) [Flowchart: generate new set of structures using grid → score new structures → new score < best score? If yes, update best structure; if no, shrink grid size → convergence? If yes, stop; if no, repeat] • Uses predetermined grid to sample given function space • Direct search method that takes advantage of parallel platform to reduce computational time • Does not assume time needed to evaluate objective function is constant • Does not assume homogeneous processors • Global convergence under mild conditions (Kolda & Torczon) • MORE if time permits… (Hough, Kolda, and Torczon)
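The flowchart logic can be illustrated with a serial, synchronous pattern search. This is a simplified sketch and not APPS or the APPSPACK interface: it evaluates the trial points one after another instead of asynchronously in parallel, and the coordinate-direction pattern, the halving rule, and the function name are assumptions.

```python
def pattern_search(f, x0, lower, upper, step=1.0, tol=1e-3, max_evals=10_000):
    x, fx = list(x0), f(x0)
    evals = 1
    while step > tol and evals < max_evals:
        # Generate trial points: +/- step along each axis, clipped to the bounds.
        trials = []
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                t = list(x)
                t[i] = min(max(t[i] + sign * step, lower[i]), upper[i])
                trials.append(t)
        # These evaluations are independent, so they could run in parallel.
        scores = [f(t) for t in trials]
        evals += len(trials)
        k = min(range(len(trials)), key=scores.__getitem__)
        if scores[k] < fx:
            x, fx = trials[k], scores[k]   # update best structure
        else:
            step *= 0.5                    # shrink grid size
    return x, fx, evals

# Toy objective standing in for the penalty function.
def bowl(v):
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2

x_best, f_best, n_evals = pattern_search(bowl, [5.0, 5.0], [-10.0, -10.0], [10.0, 10.0])
print(x_best, f_best)  # [1.0, -2.0] 0.0
```

No derivatives are needed, only comparisons of scores, which fits an objective whose derivative is not explicitly available.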
Numerical Example: Rhodopsin • Located in the retinal rods of the eye • Consists of 7 helices • 3-D structure of dark-adapted rhodopsin has been determined (Palczewski et al, 2000) • Set of experimental distance constraints has been compiled (Yeagle et al, 2001)
Preliminary Test • Initial guess: a randomized version of the true structure of rhodopsin (score = 11,342; RMSD = 15.02) • SA parameters: starting temperature = 500; number of cycles = 290
Preliminary Test Results • SA-Bundler: RMSD = 4.5; function evaluations: ~60,000; clock time: ~2 days • APPS-Bundler: RMSD = 3.2; function evaluations: ~32,000; clock time: ~15 minutes
Numerical Test: Initial Guesses • Start: library of several 100K helix bundle templates. • Satisfy constraints: apply 27 distance constraints to obtain a sub-library of 87 structures that satisfy the distance constraints. • Refine the structures using Bundler.
Test : SA Parameters • Round 1: starting temp = 300 no. of cycles = 125 • Round 2: starting temp = 300 no. of cycles = 60 • Round 3: starting temp = 30 no. of cycles = 75
TEST : SA vs. APPS • Round 1: starting temp = 300 no. of cycles = 125 • Round 2: starting temp = 300 no. of cycles = 60 • Round 3: starting temp = 30 no. of cycles = 75
Numerical Test Conclusions • SA more effectively reduces the Bundler score. • APPS more efficiently reduces the Bundler score. • Overall, APPS is more efficient and effective enough to reach the goals of this project. • Note: Other versions of SA may be more efficient
Loop building and side-chain placement [Figure: model panels labeled SCWRL, JACKAL, and CENTIPEDE] • ~4 Å RMSD from 1F88.PDB
Work Continues … • Revise/Add to the Bundler scoring function. • Study a Bundler-like scoring function for seven helical bundles. • Do an optimization study to determine the Bundler constants. • Add constraint capabilities to APPSPACK and restate the problem. • Use CHARMM potentials for comparison studies.
Acknowledgements • Funding: • Interfacial Grand Science Challenge • Mathematical, Information and Computational Sciences Division (MICS) at the DOE • CSMR: • Genetha Gray • Tammy Kolda • Biosystems Research: • Ken Sale • Malin Young • Nicole Wood • Joe Schoeniger • Computational Biology: • Jean-Loup Faulon • Other Groups: • MicroSystems • Computational Materials • Mass Spec • Molecular Biology