Visualizing Protein Structures and Structural Bioinformatics. Stephen Sontum, Middlebury College, sontum@middlebury.edu. Chapter 1 & 6: Introduction to Bioinformatics Structural Bioinformatics and Drug Discovery, Arthur M. Lesk. Core. Visualizing Protein Structures: What we hope to learn.
Visualizing Protein Structures and Structural Bioinformatics
Stephen Sontum, Middlebury College, sontum@middlebury.edu
Chapter 1 & 6: Introduction to Bioinformatics Structural Bioinformatics and Drug Discovery, Arthur M. Lesk
Core
Visualizing Protein Structures: What we hope to learn
• Hierarchy of protein structure
  • Primary structure
  • Secondary structure
  • Motifs or supersecondary structure
  • Tertiary structure (protein domains)
  • Quaternary structure
• Forces driving protein folding
• Finding out more about structures
• How to visualize molecules with VMD (HW)
• How to edit Protein Data Bank files (HW)
Bioinformatics Spectrum
• Informatics Models
  • Classification
  • Patterns
  • Relationships
• Physical Models
  • Structure
  • Function
  • Mechanism
Breadth: Evolution
Evolutionary relationships are essential for making sense of biological data. The study of evolutionary patterns must begin with the assembly of a set of homologues.
• Homology: descent from a common ancestor
• Similarity: quantitative measure of difference
Examples for comparison:
• Human hand / dog's forepaw
• Human hemoglobin / dog hemoglobin
• Human eye / eye of an insect
• Human ear bones / jaw of a fish
Protein structure changes more conservatively than amino acid sequence. Groups of related proteins are called families.
Sequence databases: InterPro, Pfam, PROSITE, and COG
Structure databases: SCOP, CATH, and CDD
Protein Folding: Problem
H3N+-A1-A2-A3-...-A98-A99-A100-COO-
[Figure labels: Butane, Protein]
• 3 angles with 3 conformations each: 3^3 = 27 conformations per peptide bond
• 27^100 ≈ 10^143 conformations!
• Sampling one conformation every femtosecond (10^-15 s)
• ≈ 10^120 years to fold a protein
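The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the estimate: the chain length, the 27 states per peptide bond, and the femtosecond sampling time come from the slide, while the length of a year in seconds is an assumption added here.

```python
import math

RESIDUES = 100                # chain length used on the slide
STATES_PER_BOND = 3 ** 3      # 3 angles x 3 conformations each = 27
SECONDS_PER_SAMPLE = 1e-15    # one conformation per femtosecond
SECONDS_PER_YEAR = 3.156e7    # assumption: 365.25-day year

# Work in log10 so the huge numbers stay readable.
log10_conformations = RESIDUES * math.log10(STATES_PER_BOND)
log10_seconds = log10_conformations + math.log10(SECONDS_PER_SAMPLE)
log10_years = log10_seconds - math.log10(SECONDS_PER_YEAR)

# Report orders of magnitude (floor of the exponent).
print(f"conformations ~ 10^{math.floor(log10_conformations)}")
print(f"exhaustive search ~ 10^{math.floor(log10_years)} years")
```

The exponents reproduce the slide's orders of magnitude, which is the point of the estimate: a random search is impossibly slow, so folding cannot proceed by exhaustive sampling.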
Anfinsen's Hypothesis
Christian Anfinsen: RNase A (1957), Nobel Prize (1973)
Unfolded chain: H3N-A-A-A-...-A100-COO-
Reduced chain with 8 free cysteines: +NH3-RSH-RSH-RSH-RSH-RSH-RSH-RSH-RSH-COO-
Disulfide-bonded chain:
+NH3-R-R-R-R
     | | | |
-OOC-R-R-R-R
Possible disulfide pairings: 7 x 5 x 3 x 1 = 105
[Figure labels: Entropy, G, ΔGmin, 100% Active]
Thermodynamic Hypothesis
ΔG = ΔH – TΔS
ΔH = ΔHprotein + ΔHwater
ΔS = ΔSprotein + ΔSwater
ΔG of folding: 5 to 15 kcal/mol
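The count 7 x 5 x 3 x 1 = 105 is the number of ways to pair eight cysteines into four disulfide bonds: the first cysteine has 7 possible partners, the next unpaired one has 5, and so on. A minimal sketch follows; the function name is made up for illustration.

```python
def disulfide_pairings(n_cys: int) -> int:
    """Number of ways to pair n_cys cysteines into n_cys / 2 disulfide bonds."""
    if n_cys < 0 or n_cys % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    count = 1
    # Each still-unpaired cysteine chooses among the remaining partners:
    # 7, 5, 3, 1 for n_cys = 8.
    for partners in range(n_cys - 1, 0, -2):
        count *= partners
    return count


pairings = disulfide_pairings(8)
print(pairings)
print(f"chance of the native pairing at random: {100 / pairings:.2f}%")
```

If the disulfides formed at random, only about 1% of the molecules would carry the native pairing, so recovering full activity argues that the sequence itself directs the fold.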
Core
VMD [3bp1] Guanine Reductase: File/New Molecule; Extensions/Analysis/Sequence Viewer. Protein Structure Initiative: http://kb.psi-structuralgenomics.org/
VMD [3bp1] Guanine Reductase: Graphics Representations; Display Window
VMD [3bp1] Guanine Reductase: VMD Main Window; Display Window; Command Window
VMD [3bp1] Guanine Reductase: Extensions/Ramachandran plot; Extensions/Volume Plot
(from Mount) primary (1º) secondary (2º) tertiary (3º) quaternary (4º) From: Brandon & Tooze, “Introduction to Protein Structure”
Core
amino acids/proteins:
Glycine GLY G
Alanine ALA A
Phenylalanine PHE F
Leucine LEU L
Isoleucine ILE I
Valine VAL V
Proline PRO P
Methionine MET M
Glutamic acid GLU E
Aspartic acid ASP D
Glutamine GLN Q
Asparagine ASN N
Lysine LYS K
Arginine ARG R
Serine SER S
Threonine THR T
Tyrosine TYR Y
Tryptophan TRP W
Histidine HIS H
Cysteine CYS C
From: Brandon & Tooze, “Introduction to Protein Structure”
Hydrophobic amino acids:
Alanine ALA A
Phenylalanine PHE F
Leucine LEU L
Isoleucine ILE I
Valine VAL V
Proline PRO P: helix breaker (except at N-caps)
Methionine MET M
Charged amino acids:
Glutamic acid GLU E
Aspartic acid ASP D
Lysine LYS K
Arginine ARG R
Histidine HIS H (pKa = 6 ± 1)
Charged residues tend to reside on the protein surface?
Polar amino acids:
Glutamine GLN Q
Asparagine ASN N
Serine SER S
Threonine THR T
Tyrosine TYR Y
Tryptophan TRP W
Histidine HIS H
Cysteine CYS C
(from Mount, “Bioinformatics: Sequence and Genome Analysis”) primary (1º) secondary (2º) tertiary (3º) quaternary (4º) From: Brandon & Tooze, “Introduction to Protein Structure”
[Figure: peptide backbone with the torsion angles phi (φ) and psi (ψ) marked on either side of Cα, and the resonance form of the peptide bond with O− and N+]
• The peptide bond is relatively rigid and planar (significant barrier to rotation, ~20 kcal/mol)
• The trans peptide bond is favored over cis by a factor of 10^3 (steric clash)
• Except PRO: trans is favored by only 80:20
• The planar peptide units can rotate about the Cα carbon
• φ and ψ are 180° in the conformation shown and increase in the clockwise direction when viewed from the Cα carbon
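Both φ and ψ are dihedral (torsion) angles defined by four consecutive backbone atoms: φ by C(i-1), N, Cα, C and ψ by N, Cα, C, N(i+1). The sketch below computes a signed dihedral from four Cartesian coordinates in plain Python. The function names are illustrative, and the sign convention (clockwise rotation of the far bond, viewed along the central bond, is positive) is the usual one assumed here rather than something stated on the slide.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    axis = tuple(x / length for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, axis) * x for x in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * x for x in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


# Planar zigzag (trans) and planar U-shape (cis) as simple checks.
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
print(abs(trans), cis)
```

For a real structure, the four points would be the backbone atom coordinates read from the ATOM records of a PDB file.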
Core
Ramachandran plot: shows the sterically allowed conformational angles phi (φ) and psi (ψ). Steric interactions eliminate a large fraction of possible conformations.
[Figure: Ramachandran plot, ψ versus φ. Brandon & Tooze, Introduction to Protein Structure, Figure 1.7a]
[Figure: Ramachandran plots, ψ versus φ. Brandon & Tooze, Introduction to Protein Structure, Figure 1.7b, c. From J. Richardson, Adv. Prot. Chem. 34, 174-174 (1981)]
• All amino acids (except glycine), from high-resolution crystal structures
• GLY: much more freedom
Core
amino acids/proteins: ala, gly, leu, ile, val, glu, asp, gln, asn, pro, lys, arg, ser, thr, tyr, trp, phe, his, cys, met
Secondary structure:
• α-helix
• β-sheet
• loop or turn
• random coil
From: Brandon & Tooze, “Introduction to Protein Structure”
Can we predict 2° structure?
-- helical propensities
-- mining structural databases for known sequence-structure relationships
-- sequence contexts?
-- neural nets, HMM…
Yes! …but accuracy is still poor
PROF 81% http://www.aber.ac.uk/~phiwww/prof/
Core
Can we predict properties of unknown proteins? Yes!
Parametric Sequence Analysis
Sequence: LAKMVVKTAEAILKD
α Helix
• 3.6 amino acids per turn
• 10 Å (3 turns), 10 amino acids
• N-H•••O=C H-bond between every 4th residue
• Transmembrane α helix ≈ 19 hydrophobic amino acids long
Can we predict properties of unknown proteins? Helical Wheel Plots (rotation 100°)
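A helical wheel places each successive residue 100° further around the helix axis (360° / 3.6 residues per turn). As a rough sketch, the angular positions for the example sequence from the preceding slide can be listed as follows; the function name and output layout are illustrative only.

```python
SEQUENCE = "LAKMVVKTAEAILKD"   # example sequence from the preceding slide
DEGREES_PER_RESIDUE = 100      # 360 degrees / 3.6 residues per turn


def wheel_angles(sequence, step=DEGREES_PER_RESIDUE):
    """Return (position, residue, angle in degrees) for each residue."""
    return [(i + 1, aa, (i * step) % 360) for i, aa in enumerate(sequence)]


# List residues in order around the wheel to see which ones share a face.
for position, residue, angle in sorted(wheel_angles(SEQUENCE), key=lambda t: t[2]):
    print(f"{angle:3d}°  {residue}{position}")
```

Residues with similar angles sit on the same face of the helix, so a cluster of hydrophobic residues on one side suggests an amphipathic helix.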
ExPASy http://ca.expasy.org/
ProtScale Parameters http://www.expasy.ch/tools/protscale.html
Kyte-Doolittle hydropathy values:
I = 4.5, V = 4.2, L = 3.8, F = 2.8, C = 2.5, M = 1.9, A = 1.8
G = -0.4, T = -0.7, S = -0.8, W = -0.9, Y = -1.3, P = -1.6
H = -3.2, E = -3.5, D = -3.5, N = -3.5, Q = -3.5, K = -3.9, R = -4.5
Values are averaged over a sliding window.
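The sliding-window averaging can be sketched in a few lines. The table uses the published Kyte-Doolittle hydropathy values (with lysine, K, at -3.9); the window length of 9 and the function name are assumptions chosen for illustration, and a longer window (around 19) is typically used when looking for transmembrane helices.

```python
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "D": -3.5, "N": -3.5, "Q": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(sequence, window=9):
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if window < 1 or window > len(scores):
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


profile = hydropathy_profile("LAKMVVKTAEAILKD", window=9)
for start, value in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {value:+.2f}")
```

Each value is usually plotted at the centre of its window; sustained positive stretches mark hydrophobic segments.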
Core Kyte-Doolittle of Leptin Receptor 1.7
SignalP 3.0 Server http://www.cbs.dtu.dk/services/SignalP/
Visualizing Protein Structures and Structural Bioinformatics
End of Tuesday Lecture
• Forces driving protein folding
• Finding out more about structures
• How to visualize molecules with VMD (HW)
• How to edit Protein Data Bank files (HW)
Why does 2° structure form? Why do proteins “fold”?
Secondary structure:
• α-helix
• β-sheet
• loop or turn
• random coil
From: Brandon & Tooze, “Introduction to Protein Structure”
Why do proteins fold?
Digression: chemical forces, intra- and intermolecular interactions
• intra: covalent bonds
• inter:
  • H-bonds
  • Ionic
  • van der Waals (VW) interactions
• Hydrophobic effect
Covalent interactions Covalent interactions hold the peptide backbone together.
Intermolecular interactions
N-H · · · :O=C, < 3.0 Å
Hydrogen bonds are among the strongest intermolecular interactions. They arise from the large dipole of a bond to hydrogen interacting with a lone pair (on O or N).
Note: VMD requires H atoms to find H-bonds. You can add H to a pdb file with babel.
Interaction                         ~strength (kcal/mol)
hydrogen bond                       -1 to -5
change bond angle by 10°            +2.0
stretch bond by 0.1 Å               +2.5
rotate bond by 60°                  +1.0
pack two atoms close                -0.2
two + & - charges at 3.3 Å          -100.0
(from Mount)
Why don’t salt bridges dominate as the force for protein stability?
Why is the difference in free energy between a folded and unfolded state only a few kcal/mol?
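The -100 kcal/mol figure for two opposite unit charges at 3.3 Å is what Coulomb's law gives in vacuum. A small sketch makes the dependence on the surrounding medium explicit; the conversion constant 332.06 kcal·Å/(mol·e²) is the standard one, and the relative dielectric constant of about 80 for bulk water is an assumption added here, not a value from the slide.

```python
COULOMB_CONSTANT = 332.06  # kcal * angstrom / (mol * e^2)


def coulomb_energy(q1, q2, distance_angstrom, dielectric=1.0):
    """Coulomb interaction energy in kcal/mol for charges in units of e."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance_angstrom)


vacuum = coulomb_energy(+1, -1, 3.3)                  # unscreened ion pair
water = coulomb_energy(+1, -1, 3.3, dielectric=80.0)  # assumed bulk-water screening

print(f"vacuum: {vacuum:.1f} kcal/mol")
print(f"water:  {water:.2f} kcal/mol")
```

In this simple continuum picture the same ion pair is worth only about 1 kcal/mol in water, which hints at why salt bridges on a solvated protein contribute far less than the vacuum number suggests.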
Intermolecular interactions: hydrophobic
Hydrophobic interactions dominate all other interactions. They are due to solvation effects.
What about solvation?
[Figure: carboxylate O− paired with Na+]
• To form the salt pair, you must desolvate the ions
• Solvating ions is often favorable
• Desolvating ions is unfavorable