Multi-Scale Hierarchical Structure Prediction of Helical Transmembrane Proteins
Zhong Chen and Ying Xu
Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia
Outline
• Background information
• Statistical analysis of known membrane protein structures
• Structure prediction at residue level
• Helix packing at atomistic level
• Linking predictions at residue and atomistic levels
Membrane Proteins
• Roles in biological processes: receptors; channels, gates and pumps; electric/chemical potential; energy transduction
• More than 50% of new drug targets are membrane proteins (MPs)
• [Figure: helical structure and beta structure]
Membrane Proteins
• 20-30% of the genes in a genome encode MPs.
• < 1% of the structures in the Protein Data Bank (PDB) are MPs, reflecting the difficulty of experimental structure determination.
Membrane Proteins
• Prediction of transmembrane (TM) segments (α-helix or β-sheet) from sequence alone is very accurate (up to 95%).
• Prediction of the tertiary structure of the TM segments: how do these α-helices/β-sheets arrange themselves within the constraints of the bi-lipid layer?
• Helical structures are comparatively easier to solve computationally.
Membrane Protein Structures
• Difficult to solve experimentally
• Computational techniques could play a significant role in solving MP structures, particularly helical structures
High Level Plan
• Statistical analysis of known structures:
  - Unveil the underlying principles for MP structure and stability;
  - Develop knowledge-based propensity scales and energy functions.
• Structure prediction at residue level
• Structure prediction at atomistic level: Monte Carlo (MC), molecular dynamics (MD)
• Multi-scale, hierarchical computational framework
Database for Known MP Structures: Helical Bundles
• Redundant database: 50 PDB files, 135 protein chains
• Non-redundant database (sequence identity < 30%): 39 PDB files, 95 protein chains (avg. length ~220 AA)
Bi-lipid Layer Chemistry
• Polar head group (glycerol, phosphate)
• Hydrophobic tail (fatty acid)
Statistics-based energy functions
• Thickness of bi-lipid layer: ~60 Å
  - Central region (30 Å)
  - Terminal regions (on either side of the central region)
• Three energy terms
  - Lipid-facing potential
  - Residue-depth potential
  - Inter-helical interaction potential
Lipid-facing Propensity Scale
$$\mathrm{LF\_scale}(AA) = \frac{\text{fraction of } AA \text{ that are lipid-facing}}{\text{fraction of } AA \text{ that are in the interior}}$$
• The most hydrophobic residues (ILE, VAL, LEU) prefer the surface of MPs in the central region but prefer interior positions in the terminal regions;
• Small residues (GLY, ALA, CYS, THR) tend to be buried in the helix bundle;
• Bulky residues (LYS, ARG, TRP, HIS) are likely to be found on the surface.
• This propensity scale reflects both hydrophobic interactions and helix packing.
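Sketch (not from the slides): one way the lipid-facing ratio could be tabulated from labelled residues. The input format (pairs of residue name and a lipid-facing flag), the function name `lf_scale`, and the toy counts are assumptions for illustration. The ratio is read here as "fraction of this residue type that is lipid-facing over the fraction that is interior", which may differ from the authors' exact normalization.

```python
from collections import Counter

def lf_scale(observations):
    """Hypothetical helper: observations is an iterable of
    (residue_name, is_lipid_facing) pairs taken from known structures."""
    facing = Counter()
    interior = Counter()
    for name, is_facing in observations:
        if is_facing:
            facing[name] += 1
        else:
            interior[name] += 1

    scale = {}
    for name in set(facing) | set(interior):
        total = facing[name] + interior[name]
        frac_facing = facing[name] / total
        frac_interior = interior[name] / total
        # The ratio is undefined if a residue type is never found in the interior.
        scale[name] = frac_facing / frac_interior if frac_interior > 0 else float("inf")
    return scale

# Made-up counts, for illustration only.
toy = ([("LEU", True)] * 6 + [("LEU", False)] * 3
       + [("GLY", True)] * 2 + [("GLY", False)] * 8)
toy_scale = lf_scale(toy)
```

Values above 1 indicate a preference for lipid-facing positions and values below 1 a preference for the interior. Tabulating the central and terminal regions separately would mean calling the function once per region.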
Helical Wheel and Moment Analysis
• [Figure: helical wheel; axes X (Å) and Y (Å), each from -30 to 30]
• The magnitude of each thin vector is proportional to the LF propensity; the overall lipid-facing vector is the sum of all thin vectors.
• Average prediction error: 41º
• Lipid-facing vector prediction, state of the art: kPROT, avg. error ~41º; Samatey scale, 61º; hydrophobicity scales, 65-68º
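Sketch (not from the slides): the vector sum on a helical wheel, assuming an ideal α-helix with 100º of rotation per residue (3.6 residues per turn). The function names and the propensity values passed in are hypothetical.

```python
import math

def lipid_facing_angle(propensities, degrees_per_residue=100.0):
    """Direction (degrees, in [0, 360)) of the sum of per-residue wheel vectors.

    Residue i points at angle i * degrees_per_residue on the wheel, and its
    vector length is its lipid-facing propensity.
    """
    x = y = 0.0
    for i, p in enumerate(propensities):
        theta = math.radians(i * degrees_per_residue)
        x += p * math.cos(theta)
        y += p * math.sin(theta)
    return math.degrees(math.atan2(y, x)) % 360.0

def angular_error(predicted_deg, observed_deg):
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(predicted_deg - observed_deg) % 360.0
    return min(d, 360.0 - d)
```

The error figures quoted on the slide are angular differences of this kind between a predicted and an observed lipid-facing direction.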
Residue-Depth Potential
- Hydrophobic residues tend to be located in the hydrocarbon core;
- Hydrophilic residues tend to be closer to the terminal regions;
- Aromatic residues prefer the interface region.
TM Helix Tilt Angle Prediction
• Major pVIII coat protein of the filamentous fd bacteriophage (1MZT)
• [Figure: helix tilt angle, 23º]
Statistical energy potentials (summary)
• Three residue-based statistical potentials were derived from the database: (a) lipid-facing propensity, (b) residue-depth potential, (c) inter-helical pair-wise potential.
• The lipid-facing scale predicted the lipid-facing direction for a single helix with an uncertainty of about ±40º.
• The residue-depth potential was able to predict the tilt angle for a single helix with high accuracy.
• More data are needed to make the inter-helical pair-wise potential more reliable.
Key Prediction Steps
• Structure prediction by optimizing our statistical potential (a weighted sum of the statistical potentials);
• Idealized, rigid helical backbone configurations;
• Monte Carlo moves: translations, rotations, rotation about the helix axis;
• Wang-Landau sampling technique for MC simulation;
• Principal component analysis.
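Sketch (not from the slides): two of the rigid-body moves listed above, written for a helix stored as a list of (x, y, z) points. The rotation uses Rodrigues' formula. The function names and the coordinate representation are assumptions, not the authors' code.

```python
import math

def translate(points, shift):
    """Rigid translation of every point by the vector `shift`."""
    return [(p[0] + shift[0], p[1] + shift[1], p[2] + shift[2]) for p in points]

def rotate_about_axis(points, axis_point, axis_dir, angle):
    """Rotate points by `angle` (radians) about the line through `axis_point`
    with direction `axis_dir`, e.g. the helix axis."""
    norm = math.sqrt(sum(c * c for c in axis_dir))
    kx, ky, kz = (c / norm for c in axis_dir)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rotated = []
    for p in points:
        vx, vy, vz = (p[i] - axis_point[i] for i in range(3))
        dot = kx * vx + ky * vy + kz * vz
        # Cross product k x v.
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        rotated.append((
            axis_point[0] + vx * cos_a + cx * sin_a + kx * dot * (1.0 - cos_a),
            axis_point[1] + vy * cos_a + cy * sin_a + ky * dot * (1.0 - cos_a),
            axis_point[2] + vz * cos_a + cz * sin_a + kz * dot * (1.0 - cos_a),
        ))
    return rotated
```

Because the backbone is treated as rigid, each move changes only the position and orientation of a helix, so the number of degrees of freedom per helix stays small.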
Wang-Landau Method for MC
• Observation: if a random walk is performed with probability proportional to the reciprocal of the density of states, $1/g(E)$, a flat energy histogram is obtained.
• The density of states is not known a priori. In Wang-Landau, $g(E)$ is initially set to 1 and modified "on the fly".
• Monte Carlo moves from energy $E_1$ to energy $E_2$ are accepted with probability
$$p(E_1 \to E_2) = \min\left(1, \frac{g(E_1)}{g(E_2)}\right)$$
• Each time an energy level $E$ is visited, its density of states is updated by a modification factor $f > 1$, i.e.,
$$g(E) \to g(E) \cdot f$$
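Sketch (not from the slides): Wang-Landau sampling on a toy system whose density of states is known exactly, so the estimate can be checked. The toy model (independent two-state "spins" with energy equal to the number of up spins, so that g(E) is a binomial coefficient), the flatness criterion, the halving schedule for ln f, and all names are assumptions chosen for illustration. It is not the helix-packing simulation.

```python
import math
import random

def wang_landau_spins(n_spins=10, ln_f_final=1e-3, flatness=0.8, seed=0):
    """Estimate ln g(E) for n_spins independent two-state spins, E = number up."""
    rng = random.Random(seed)
    spins = [0] * n_spins
    energy = 0
    ln_g = [0.0] * (n_spins + 1)   # ln g(E) = 0, i.e. g(E) = 1 initially
    hist = [0] * (n_spins + 1)
    ln_f = 1.0                     # ln of the modification factor f
    step = 0
    while ln_f > ln_f_final:
        i = rng.randrange(n_spins)
        new_energy = energy + (1 - 2 * spins[i])   # flipping spin i changes E by +/-1
        # Accept with probability min(1, g(E_old) / g(E_new)).
        delta = ln_g[energy] - ln_g[new_energy]
        if delta >= 0.0 or rng.random() < math.exp(delta):
            spins[i] ^= 1
            energy = new_energy
        # Update the current level whether or not the move was accepted.
        ln_g[energy] += ln_f
        hist[energy] += 1
        step += 1
        # When the histogram is flat enough, refine f and start a new histogram.
        if step % 1000 == 0 and min(hist) > flatness * sum(hist) / len(hist):
            ln_f /= 2.0
            hist = [0] * len(hist)
    base = ln_g[0]                 # normalize so that g(0) = 1
    return [v - base for v in ln_g]
```

Working with ln g(E) rather than g(E) avoids overflow, since g(E) can span many orders of magnitude. For this toy model the estimate can be compared with `math.log(math.comb(n_spins, E))`.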
Wang-Landau Method for MC
• Advantages:
  - Simple formulation and general applicability;
  - Entropy and free energy information derivable from g(E);
  - Each energy state is visited with equal probability, so energy barriers are overcome with relative ease.
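Principal Component Analysis
• Purpose:
  - Analyze the conformational variations during a simulation, and
  - Identify the most important conformational degrees of freedom.
• Covariance matrix:
$$C_{ij} = \left\langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \right\rangle$$
where $x_i$ are the conformational coordinates and $\langle \cdot \rangle$ denotes the average over the simulation.
• A large part of the system's fluctuations can be described in terms of only a few PCA eigenvectors.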
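Sketch (not from the slides): the covariance matrix of a set of simulation frames and its leading eigenvector, found by power iteration using only the standard library. The frame format (one list of coordinates per frame) and the function names are assumptions. A full PCA would normally use a library eigen-solver such as `numpy.linalg.eigh` to obtain all components.

```python
import math

def covariance_matrix(frames):
    """frames: list of equal-length coordinate lists, one per simulation frame."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / n
             for j in range(d)] for i in range(d)]

def leading_component(cov, iterations=200):
    """Largest eigenvalue and eigenvector of a covariance matrix by power iteration.

    Assumes the start vector (all components equal) is not orthogonal to the
    leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return eigenvalue, v

def explained_fraction(cov, eigenvalue):
    """Share of the total variance (the trace) carried by one component."""
    return eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

An explained fraction close to 1 for the first few components is what the last bullet above describes: most of the fluctuation lies along a small number of directions.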
A Model System: Glycophorin A (GpA) Dimer
• GxxxG motif
• Ridges-into-grooves packing
• 22 residues, 189 atoms
• Sequence: EITLIIFGVMAGVIGTILLISY
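Sketch (not from the slides): locating GxxxG motifs in a TM sequence with a regular expression. The 22-residue GpA segment is typed in by hand here as an assumption and should be checked against the 1AFO entry. The function name is hypothetical.

```python
import re

# Assumed 22-residue GpA transmembrane segment (verify against PDB entry 1AFO).
GPA_TM = "EITLIIFGVMAGVIGTILLISY"

def find_gxxxg(sequence):
    """Return 0-based start positions of every G-x-x-x-G pattern.

    The lookahead makes overlapping motifs (e.g. GxxxGxxxG) all reported.
    """
    return [m.start() for m in re.finditer(r"(?=G...G)", sequence)]

positions = find_gxxxg(GPA_TM)
```

Each reported position is the index of the first glycine of a motif, so the second glycine sits four residues later.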
Glycophorin (GpA) Dimer (1AFO)
• A: GEM (global energy minimum), RMSD = 3.6 Å, E = -114.6 kcal/mol
• B: LEM (local energy minimum), RMSD = 0.8 Å, E = -93.9 kcal/mol
• [Figure: panels A and B; red: experiment, grey: simulation]
Helices A and B of Bacteriorhodopsin (1QHJ)
• A: GEM, RMSD = 2.7 Å, E = -94 kcal/mol
• B: LEM, RMSD = 0.9 Å, E = -86 kcal/mol
• [Figure: panels A and B; red: experiment, grey: simulation]
Bacteriorhodopsin (1QHJ)
• RMSD = 5.0 Å
• [Figure: computational prediction and experimental structure, helices labeled A to G]
Residue-level structure prediction (Summary)
• A computational scheme was established for TM helix structure prediction at residue level;
• For two-helix systems, LEM structures very close to native structures (RMSD < 1.0 Å) were consistently predicted;
• For a seven-helix bundle, a packing topology within 5.0 Å of the crystal structure was identified as one of the LEMs.
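Sketch (not from the slides): the RMSD figure used to compare a predicted structure with the experimental one, assuming the two coordinate sets list the same atoms in the same order and are already superimposed. Whether and how the authors superimposed structures before computing RMSD is not stated. The function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    No superposition is performed; both sets are assumed to share one frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

With coordinates in Å, the result is in Å, matching the units quoted in the summary above.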
Key Prediction Steps
• Structure prediction by optimizing an atom-level energy potential:
  - CHARMM19 force field for helix-helix interactions
  - Knowledge-based energy function for lipid-helix interactions
• Idealized, rigid helix backbone with flexible side chains;
• Helix orientation constraint applied (i.e., N-terminus inside/outside the cell);
• MC moves: translations, rotations, rotation about the helix axis, and side-chain torsional rotation;
• Wang-Landau algorithm for MC simulation
CHARMM19 Polar Hydrogen Force Field
- Nonpolar hydrogen atoms are combined with the heavy atoms they are bound to;
- Polar hydrogen atoms are modeled explicitly.
Effect of Helix-Lipid Interactions: Helices A & B of Bacteriorhodopsin
• [Figure: two panels, "Helix-helix interactions" and "Helix-helix & helix-lipid interactions"]
• Helix-lipid interactions play a critical role in the correct packing of helices
Effect of Helix-Lipid Interactions: Helices A & B of Bacteriorhodopsin (BR)
• [Figure: four LEM structures with RMSD = 0.2 Å, 4.4 Å, 5.7 Å, and 7.1 Å; hydrocarbon core region, 30 Å]
• All four LEM structures share essentially the same contact surfaces.
• In the native structure, the polar N-termini of both helices are located outside the hydrocarbon core region, resulting in low helix-lipid energy.
Docking of a Seven-helix Bundle: Bacteriorhodopsin (1QHJ)
• 7 helices, 174 residues, 1619 atoms
• CHARMM19 + lipid-helix potential;
• One month of CPU time on one PC
• [Figure: crystal structure and initial configuration]
Potential Energy Landscape
• [Figure: energy landscape with structures at RMSD = 4.7 Å, 3.0 Å, 6.6 Å, 8.0 Å, and 8.4 Å]
Global Energy Minimum Structure (RMSD = 3.0 Å)
• [Figure: red: experiment, grey: simulation]
Atom-level Structure Prediction (Summary)
• The Wang-Landau algorithm proved to be effective for the energetics study of TM helix packing;
• Prediction results for two-helix and seven-helix structures are highly promising;
• Practical application of the Wang-Landau method to large systems requires further work.
Part IV: Linking Predictions at Residue- and Atomistic levels
Correspondence between simulations at two levels
• A multi-scale hierarchical modeling approach is feasible and practical:
  - LEMs identified at residue level can be used as candidates for atomistic simulation;
  - PC vectors from residue-level simulation can be used to improve search speed in atomistic simulation.
Future Work
• Further improvement of the residue-based folding potentials;
• Speed-up and parallelization of Wang-Landau sampling;
• Construction of a hierarchical computational framework and development of a corresponding software package.
Acknowledgements
• Funding from NSF/DBI, NSF/ITR, NIH, and Georgia Cancer Coalition
• Dr. David Landau (Wang-Landau algorithm) and Dr. Jim Prestegard (NMR data generation) of UGA
• Thanks to DIMACS for the invitation to speak here