Principles of protein structure and stability. A peptide bond is formed between two amino acids.
Backbone conformation is described by φ and ψ angles. Picture from T. Przytycka, 2002
Hierarchy of protein structure. • Amino acid sequence • Secondary structure • Tertiary structure • Quaternary structure Picture from Branden & Tooze “Introduction to protein structure”
Right-handed alpha-helix. • The helix is stabilized by hydrogen bonds (HB) between the backbone N–H group of residue i+4 and the backbone carbonyl oxygen of residue i. • Geometrical characteristics: • 3.6 residues per turn • translation of 5.4 Å per turn • translation of 1.5 Å per residue
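The geometrical characteristics above are related by simple arithmetic, which the following minimal sketch makes explicit. The function names are illustrative, and the helix is assumed to be ideal (real helices deviate slightly from these values).

```python
# Ideal alpha-helix geometry, using the two values quoted on the slide.
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # Å, translation along the helix axis per residue

# Derived quantities.
pitch = RESIDUES_PER_TURN * RISE_PER_RESIDUE      # Å per turn
rotation_per_residue = 360.0 / RESIDUES_PER_TURN  # degrees per residue


def helix_length(n_residues: int) -> float:
    """Approximate axial length (Å) of an ideal helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE


def helix_turns(n_residues: int) -> float:
    """Number of turns made by an ideal helix with n_residues residues."""
    return n_residues / RESIDUES_PER_TURN
```

Under these assumptions the pitch comes out at 5.4 Å per turn, matching the slide, and each residue rotates the chain by 100 degrees around the helix axis.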
Loop regions are at the surface of protein molecules. Adjacent antiparallel β-strands are joined by hairpin loops. Loops are more flexible than helices and strands. Loops often carry functionally important sites, such as binding and active sites. Branden & Tooze “Introduction to protein structure”
Protein classification based on the secondary structure content. • Class α - proteins with only α-helices • Class β – proteins with only β-sheets • Class α+β - proteins with α-helices and β-sheets
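Native proteins have low stability. Scale of interactions in proteins: • Interactions weaker than the thermal energy kT ≈ 0.6 kcal/mol (at room temperature) are neglected. • Interactions stronger than ΔG ≈ 10 kcal/mol are too large compared with the net stability of the native state. Potential energy = van der Waals + electrostatic + hydrophobic. [Figure: free energy G along the reaction coordinate, with the unfolded (U) and folded (F) states separated by ΔG.]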
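To put the energy scale in perspective, here is a minimal sketch that computes the thermal energy and converts a stability ΔG into a folded fraction. It assumes a simple two-state (U ⇌ F) equilibrium, which is an assumption of this example and not something the slide states beyond its diagram. Function names are illustrative.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)


def thermal_energy(temp_k: float = 298.0) -> float:
    """Thermal energy RT in kcal/mol (the molar equivalent of kT)."""
    return R * temp_k


def fraction_folded(delta_g: float, temp_k: float = 298.0) -> float:
    """Folded fraction in a two-state model.

    delta_g = G(unfolded) - G(folded) in kcal/mol; positive means the
    folded state is the more stable one.
    """
    unfolded_over_folded = math.exp(-delta_g / (R * temp_k))  # [U]/[F]
    return 1.0 / (1.0 + unfolded_over_folded)
```

With these assumptions, thermal_energy() gives about 0.59 kcal/mol at 298 K. A stability of only a few kcal/mol already keeps the large majority of molecules folded, which is why a native state can be marginally stable and still dominate the population.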
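Electrostatic force. Coulomb’s law for two point charges: $E = \frac{q_1 q_2}{\varepsilon d}$, where q is a point charge, d is the distance between the charges, and ε is the dielectric constant (ε = 1 in a vacuum). ε = 2-3 inside the protein, ε = 80 in water. Example: Na+ and Cl- at d = 2.76 Å, |E| ≈ 120 kcal/mol in a vacuum.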
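The Na+/Cl- figure can be checked with a short sketch. The conversion constant 332.06 kcal·Å/(mol·e²) is the usual factor for charges in units of the electron charge and distances in Å. The function name and the protein-interior value ε = 3 used in the note below are choices made for this example.

```python
COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2)


def coulomb_energy(q1: float, q2: float, distance: float,
                   dielectric: float = 1.0) -> float:
    """Coulomb interaction energy in kcal/mol.

    q1, q2     : charges in units of the electron charge
    distance   : separation in Å
    dielectric : dielectric constant of the medium (1 = vacuum)
    """
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)


e_vacuum = coulomb_energy(+1, -1, 2.76)                    # Na+ ... Cl- in a vacuum
e_protein = coulomb_energy(+1, -1, 2.76, dielectric=3.0)   # assumed protein interior
e_water = coulomb_energy(+1, -1, 2.76, dielectric=80.0)    # bulk water
```

The negative sign means attraction. The vacuum value has a magnitude of about 120 kcal/mol, and the same ion pair in water is weaker by a factor of 80, which illustrates how strongly the medium screens charges.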
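Dipolar interactions. Dipole moment: $\mu = q d$ for charges +q and -q separated by the distance d. Interaction energy of two dipoles separated by the vector r: $E = \frac{\vec{\mu}_1 \cdot \vec{\mu}_2 - 3(\vec{\mu}_1 \cdot \hat{r})(\vec{\mu}_2 \cdot \hat{r})}{\varepsilon r^3}$, where $\hat{r} = \vec{r}/r$. Peptide bond: μ = 3.5 D (partial charges: O -0.42, C +0.42, N -0.20, H +0.20). Water molecule: μ = 1.85 D.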
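Van der Waals interactions. Lennard-Jones potential: $E(r) = \frac{A}{r^{12}} - \frac{B}{r^{6}}$, where the $r^{-12}$ term describes the short-range repulsion and the $r^{-6}$ term the attraction. London dispersion energy: $E \propto -\frac{1}{r^{6}}$, the attraction between induced dipoles (δ+ δ-). [Figure: E (kcal/mol, from -0.2 to 0.2) versus the distance between the centers of atoms (axis from 2 to 12), with the repulsive and attractive regions marked.]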
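A minimal sketch of the 12-6 Lennard-Jones potential, written in the equivalent form that uses the well depth and the distance of the energy minimum. The parameter values (0.2 kcal/mol and 4 Å) are hypothetical, chosen only to give a curve on the scale of the plot; they are not taken from the slide.

```python
def lennard_jones(r: float, well_depth: float = 0.2, r_min: float = 4.0) -> float:
    """12-6 Lennard-Jones energy in kcal/mol.

    r          : distance between atom centers (Å)
    well_depth : depth of the energy minimum (kcal/mol), hypothetical value
    r_min      : distance at which the energy is minimal (Å), hypothetical value
    """
    ratio6 = (r_min / r) ** 6
    # ratio6**2 is the r^-12 repulsion, -2*ratio6 is the r^-6 attraction.
    return well_depth * (ratio6 ** 2 - 2.0 * ratio6)


# Sample the curve: steep repulsion at short range, shallow attraction beyond r_min.
curve = [(r, lennard_jones(r)) for r in (3.0, 3.5, 4.0, 5.0, 6.0, 8.0, 10.0)]
```

In this parametrization the energy equals -well_depth exactly at r_min, rises steeply at shorter distances, and decays towards zero as r^-6 at longer ones.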
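Hydrogen bonds. A hydrogen bond forms between a donor group (D), such as backbone N-H (δ+ on the hydrogen), and an acceptor atom (A), such as the carbonyl oxygen (δ-): N-H···O=C. The donor-acceptor distance is about 3 Å. [Diagram: donor (D) and acceptor (A) groups, and hydrogen-bonded water molecules HOH···OH2.]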
Hydrogen bonding patterns in globular proteins. 1. Most HB are local, close in sequence. 2. Most HB are between backbone atoms. 3. Most HB are within single elements of secondary structure. 4. Proteins are almost equally saturated by HB: 0.75 HB per amino acid.
Disulfide bonds. Exchange with glutathione: Protein(SH)2 + GS-SG ⇌ Protein(SH)(S-SG) + GSH ⇌ Protein(S-S) + 2 GSH. • Breakdown and formation of S-S bonds are catalyzed by disulfide isomerase. • In the cell S-S bonds are reversible: the free energy of their formation is close to zero. • Secreted proteins have many S-S bonds, since outside the cell the equilibrium is shifted towards their formation.
Hydrophobic effect. Hydrophobic interaction: the tendency of nonpolar compounds to transfer from an aqueous solution to an organic phase. • The entropy of water molecules decreases when they contact a nonpolar surface, so the free energy increases. • As a result, upon folding nonpolar amino acids are buried inside the protein, while polar and charged amino acids stay on the outside.
Cooperativity of protein interactions. Protein denaturation is a first-order (“all-or-none”) transition. As T increases: 1. Globule expansion, loose packing. 2. As expansion crosses the barrier, liberation of side chains and increase in entropy. [Figure: energy E and the energy distribution W(E) at temperatures T1, T’ and T2.]
Summary: • Hydrophobic effect is mostly responsible for making a compact globule. Final specific tertiary structure is formed by van der Waals interactions, HB, disulfide bonds. • Secret of stability of native structures is not in the magnitude of the interactions but in their cooperativity.
Classwork I: Cn3D viewer. • Go to http://ncbi.nlm.nih.gov • Select an alpha-helical protein (hemoglobin) • Select a beta-stranded protein (immunoglobulin) • Select the multidomain protein 1I50, chain “A” • View them in Cn3D
PDB databank. • The archive of protein crystal structures was established in 1971 with several structures; by 2002 it held about 17,000 structures, including NMR structures. • Data processing: data deposition, annotation and validation. • PDB code: nXYZ, where n is an integer and X, Y, Z are characters.
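The code format can be checked with a small validator. This is a sketch that assumes n is a single digit and that X, Y, Z may each be a letter or a digit, as in the classic four-character identifiers. The function name is illustrative.

```python
import re

# One digit followed by three alphanumeric characters, e.g. "1I50".
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_code(code: str) -> bool:
    """Return True if code looks like a four-character PDB code."""
    return PDB_CODE.fullmatch(code) is not None
```

For example, is_pdb_code("1I50") is true for the multidomain protein used in the classwork, while strings that start with a letter or have the wrong length are rejected.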
Content of Data in the PDB. • Organism, species name • Full protein sequence • Chemical structure of cofactors and prosthetic groups • Names of all components of the structure • Qualitative description of the structural characteristics • Literature citations • Three-dimensional coordinates
Protein secondary structure prediction. Assumptions: • There should be a correlation between amino acid sequence and secondary structure (SS). A short amino acid sequence is more likely to form one type of SS than another. • Local interactions determine SS. The SS of a residue is determined by its neighbors (usually a sequence window of 13-17 residues is used). Exceptions: short identical amino acid sequences can sometimes be found in different SS. Accuracy: 65% - 75%; the highest accuracy is achieved for the prediction of α-helices.
Methods of SS prediction. • Chou-Fasman method • GOR (Garnier,Osguthorpe and Robson) • Neural network method
Chou-Fasman method. Analysis of the frequencies with which each amino acid occurs in the different types of SS. Ala, Glu, Leu and Met are strong predictors of alpha-helices; Pro and Gly tend to break the helix.
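The frequency analysis can be sketched as a propensity calculation: how often an amino acid occurs in a given SS type, relative to how common that SS type is overall. The sequence and SS labels below are made-up toy data for illustration only, not real statistics, and the function name is hypothetical. The full Chou-Fasman procedure also includes nucleation and extension rules that are not shown here.

```python
from collections import Counter


def ss_propensities(sequence: str, states: str, target: str = "H") -> dict:
    """Propensity of each amino acid for the SS type `target`.

    propensity = (fraction of that amino acid found in `target`)
                 / (fraction of all residues found in `target`)
    Values above 1 indicate a preference for the SS type.
    """
    if len(sequence) != len(states):
        raise ValueError("sequence and states must have the same length")
    total = Counter(sequence)
    in_target = Counter(aa for aa, s in zip(sequence, states) if s == target)
    overall = states.count(target) / len(states)
    return {aa: (in_target[aa] / total[aa]) / overall for aa in total}


# Toy, made-up example: H = helix, C = coil.
toy_sequence = "AELMAGPG"
toy_states = "HHHHHCCC"
helix_propensity = ss_propensities(toy_sequence, toy_states)
```

In this toy set Ala and Glu get a helix propensity above 1, while Pro and Gly get 0, mirroring the qualitative trend on the slide. Meaningful values require counting over a large set of known structures.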
GOR method. Assumption: the SS formed by an amino acid is determined by the neighboring residues (usually a window of 17 residues is used). GOR uses principles of information theory for predictions. The method maximizes the information difference between two competing hypotheses: that residue “a” is in structure “S”, and that “a” is not in structure “S”.
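The information difference can be written out for the simplest case of a single residue. This is a sketch of the idea and not the full method; the notation ($\bar{S}$ for "not S", $P$ for probabilities estimated from known structures) is chosen here for illustration.

```latex
% Information that residue a carries about structure S:
I(S; a) = \log \frac{P(S \mid a)}{P(S)}

% Difference between the two competing hypotheses (a in S vs. a not in S):
I(S; a) - I(\bar{S}; a)
  = \log \frac{P(S \mid a)}{P(\bar{S} \mid a)}
  + \log \frac{P(\bar{S})}{P(S)}
```

The structure S with the largest difference is predicted. To use the sequence window, one common simplification is to add up such single-residue terms over the neighboring positions, which assumes that the neighbors contribute independently.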
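Neural network method. [Figure: a three-layer network. Input layer: units S encoding the input sequence window (example: L A W P G E V G A S T Y P). Hidden layer: units H. Output layer: units O giving the predicted SS (example: α = 1, β = 0, coil = 0). The layers are connected by weights W.]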
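The forward pass of such a network can be sketched in plain Python. Everything below is an illustrative assumption: the one-hot encoding of the window, the hidden layer size, the sigmoid and softmax functions, and the random (untrained) weights. A real predictor learns its weights from proteins with known structure.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = ("alpha", "beta", "coil")


def one_hot_window(window: str) -> list:
    """Encode each residue of the window as 20 binary input units."""
    return [1.0 if aa == ref else 0.0 for aa in window for ref in AMINO_ACIDS]


def weighted_sums(inputs: list, weights: list) -> list:
    """One weighted sum per unit of the next layer."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values: list) -> list:
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(window: str, w_hidden: list, w_out: list) -> list:
    """Input window -> hidden layer -> scores for alpha, beta, coil."""
    hidden = [sigmoid(z) for z in weighted_sums(one_hot_window(window), w_hidden)]
    return softmax(weighted_sums(hidden, w_out))


rng = random.Random(0)               # fixed seed, weights are untrained
window = "LAWPGEVGASTYP"             # 13-residue window from the slide
n_inputs = len(window) * len(AMINO_ACIDS)
n_hidden = 5                         # hypothetical hidden layer size
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
w_out = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(len(STATES))]
scores = forward(window, w_hidden, w_out)
prediction = STATES[scores.index(max(scores))]
```

With random weights the prediction is meaningless; the sketch only shows how a sequence window flows through the input, hidden and output layers to give one score per SS type.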
PHD – neural network program with multiple sequence alignments. • A BLAST search with the input sequence is performed, and similar sequences are collected. • The multiple alignment of similar sequences is used as an input to a neural network. • The sequence pattern in a multiple alignment is enhanced compared with a single input sequence.
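One generic way to turn a multiple alignment into network input is a per-column frequency profile. The sketch below illustrates that idea only; it is not PHD's actual encoding, which the slide does not describe. The toy alignment and the function name are made up.

```python
from collections import Counter


def alignment_profile(aligned: list) -> list:
    """Per-column amino acid frequencies of a multiple alignment.

    aligned : equal-length aligned sequences, with "-" marking gaps.
    Returns one {amino_acid: frequency} dict per alignment column;
    gaps are ignored when computing the frequencies.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]
        counts = Counter(residues)
        profile.append({aa: n / len(residues) for aa, n in counts.items()}
                       if residues else {})
    return profile


# Toy, made-up alignment of three similar sequences.
toy_alignment = ["LAWPG", "LAFPG", "IAW-G"]
profile = alignment_profile(toy_alignment)
```

Fully conserved columns give a single amino acid with frequency 1, while variable columns spread the weight over several residues. This is the kind of extra signal that a single input sequence cannot provide.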
Classwork • Go to http://ncbi.nlm.nih.gov, search for protein “flavodoxin” in Entrez, retrieve its amino acid sequence. • Go to http://cubic.bioc.columbia.edu/predictprotein and run PHD on the sequence.
Definition of protein domains. • Geometry: a group of residues with a high contact density; the number of contacts within domains is higher than the number of contacts between domains. - chain-continuous domains - chain-discontinuous domains • Kinetics: domain as an independently folding unit. • Physics: domain as a rigid body linked to other domains by flexible linkers. • Genetics: the minimal fragment of a gene that is capable of performing a specific function.
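The geometric definition can be illustrated by counting contacts within and between domains. In this sketch a contact is any pair of residues closer than a cutoff. The 8 Å cutoff, the use of one coordinate per residue, and the toy coordinates are assumptions of the example.

```python
import math
from itertools import combinations


def contact_counts(coords: list, domains: list, cutoff: float = 8.0) -> tuple:
    """Count residue-residue contacts within and between domains.

    coords  : one (x, y, z) point per residue, in Å
    domains : domain label of each residue, same order as coords
    cutoff  : distance (Å) below which two residues are in contact (assumed value)
    Returns (contacts within domains, contacts between domains).
    """
    within = between = 0
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            if domains[i] == domains[j]:
                within += 1
            else:
                between += 1
    return within, between


# Toy, made-up coordinates: two compact groups of three residues each.
toy_coords = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (14, 0, 0), (18, 0, 0), (22, 0, 0)]
toy_domains = ["A", "A", "A", "B", "B", "B"]
within, between = contact_counts(toy_coords, toy_domains)
```

For this toy example there are 6 contacts within the two groups and only 1 between them, so the split into domains A and B is consistent with the geometric definition.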
Domains as recurrent units of proteins. • The same or similar domains are found in different proteins. • Each domain performs a specific function. • Proteins evolve through domain duplication and domain shuffling. • The total number of different types of domains is small (~1000 – 3000).
The Conserved Domain Architecture Retrieval Tool (CDART). • Performs similarity searches of the NCBI Entrez Protein Database based on domain architecture, defined as the sequential order of conserved domains in proteins. • The algorithm finds protein similarities across significant evolutionary distances using sensitive protein domain profiles. Proteins similar to a query protein are grouped and scored by architecture.