1 / 30

Biomedical Data Science: Mining and Modeling Globular Protein Structure I

Biomedical Data Science: Mining and Modeling Globular Protein Structure I. Prof. Corey O ’ Hern Department of Mechanical Engineering & Materials Science Department of Physics Department of Applied Physics Program in Computation Biology & Bioinformatics

tlofton
Download Presentation

Biomedical Data Science: Mining and Modeling Globular Protein Structure I

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Biomedical Data Science: Mining and ModelingGlobular Protein Structure I • Prof. Corey O’Hern • Department of Mechanical Engineering & Materials Science • Department of Physics • Department of Applied Physics • Program in Computation Biology & Bioinformatics • Integrated Graduate Program in Physical & Engineering Biology • Yale University 1

  2. Schedule Mon/Wed. March 25 and 27: Globular Protein Structure Mon/Wed. April 1 and 3: Intrinsically Disordered Proteins Mon. April 8: Molecular Dynamics Simulations 2

  3. What are proteins? Linear polymer Folded state • Proteins are important; e.g. for catalyzing and regulating biochemical reactions, • transporting molecules, … • Linear polymer chain composed of tens (peptides) to thousands (proteins) of monomers • Monomers are 20 naturally occurring amino acids • Different proteins have different amino acid sequences • Structureless, extended unfolded state • Compact, ‘unique’ native folded state (with secondary and tertiary structure) required for biological function • Sequence determines protein structure (or lack thereof) • Proteins unfold or denature with increasing temperature or chemical denaturants 3

  4. Amino Acids I N-terminal C-terminal C peptide bonds R variable side chain • Side chains differentiate amino acid repeat units • Peptide bonds link residues into polypeptides 4

  5. Amino Acids II (-) (+) 5

  6. The Protein Folding Problem: What is ‘unique’ folded 3D structure of a protein based on its amino acid sequence? Sequence  Structure Lys-Asn-Val-Arg-Ser-Lys-Val-Gly-Ser-Thr-Glu-Asn-Ile-Lys- His-Gln-Pro- Gly-Gly-Gly-… 6

  7. Why do proteins fold (correctly & rapidly)?? Levinthal’s paradox: For a protein with N amino acids, number of backbone conformations/minima  = # allowed dihedral angles How does a protein find the global optimum w/o global search? Proteins fold much faster. Nc~ 3200 ~1095 fold ~ Nc sample ~1083 s fold ~ 10-6-10-3 s vs universe ~ 1017 s 7

  8. Energy Landscape U, F =U-TS S12 S23 S-1 M2 M3 M1 minimum all atomic coordinates; dihedral angles saddle point 8 maximum

  9. Roughness of Energy Landscape smooth, funneled rough (Wolynes et. al. 1997) 9

  10. Folding Pathways dead end similarity to native state 10

  11. 12th Critical Assessment of Structure Prediction (CASP12) Hoval et. al., Protein Science (2018) Moult et. al., Protein Science (2018) 11

  12. 12

  13. CASP12 target T0920 32 best guess predictions 13

  14. Driving Forces • Folding: hydrophobicity, hydrogen bonding, van der Waals interactions, … • Unfolding: increase in conformational entropy, electric charge… Hydrophobicity index inside H (hydrophobic) outside P (polar) 14

  15. 15

  16. Solvent Accessible Surface Areaand rSASA SASAres SASAdip rSASA=SASAres/SASAdip = [0,1] 16

  17. Secondary Structure: Loops, -helices, -strands/sheets -helix -strand -sheet 5Å • Right-handed; three turns • Vertical hydrogen bonds between NH2 (teal/white) backbone group and C=O (grey/red) backbone group four residues earlier in sequence • Side chains (R) on outside; point upwards toward NH2 • Each amino acid corresponds to 100, 1.5Å, 3.6 amino acids per turn • (,)=(-60,-45) • -helix propensities: Met, Ala, Leu, Glu • 5-10 residues; peptide backbones fully extended • NH (blue/white) of one strand hydrogen-bonded to C=O (black/red) of another strand • C ,side chains (yellow) on adjacent strands aligned; side chains along single strand alternate up and down • (,)=(-135,135) • -strand propensities: Val, Thr, Tyr, Trp, Phe, Ile 17

  18. Ns=62,938 monomeric xtal structures

  19. Bond Angles 19

  20. Bond Lengths 20

  21. Backbonde Dihedral Angles 21

  22. 3N-6 DoF -(N-1) Bond lengths -(N-2) Bond angles =N-3 Dihedral angles 22

  23. C’i-1 ϕ: C’i-1NCαC’ ψ: NCαC’Ni+1 ω1:Ci-1αC’i-1NCα ω2:CαC’Ni+1Ci+1α 23

  24. Ramachandran Plot: Determining Steric Clashes Gly Non-Gly Backbone dihedral angles  theory  4 atoms define dihedral angle:  PDB C-1NCC =0,180 C,-1C-1NC   NCCN+1 24 vdW radii backbone flexibility < vdW radii

  25. Backbone dihedral angles from PDB 25

  26. Dunbrack 1.0 Wu coil database 26

  27. Val Ile Phe Tyr Trp Leu Thr Ser 27

  28. Thr Dunbrack 1.0 28

  29. Ile 29

  30. 1. Can the structural properties of protein cores be quantitatively modeled using hard-spheres? 2. What is the packing fraction in protein cores? 3. Can simple hard-sphere model improve computational design of protein-protein interactions? 30

More Related