1 / 21

Computing for Bioinformatics

Lecture 8: protein folding. Computing for Bioinformatics. Problem description. Definition: Given the amino acid sequence of a protein, what is the protein's structure in three dimension? Importance: The structure of a protein provides a key to understanding its biological function.

sophie
Download Presentation

Computing for Bioinformatics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 8: protein folding Computing for Bioinformatics

  2. Problem description • Definition: Given the amino acid sequence of a protein, what is the protein's structure in three dimension? • Importance: The structure of a protein provides a key to understanding its biological function. • Assumption: The amino acid sequence contains all information about the native 3-D structure. • Thermodynamic principle: (Christian Anfinsen's denaturation-renaturation experiments on ribonuclease.) If one changes the solvent condition, the protein will undergo a transition from the native state to an unfolded state and become inactive. When solvent condition is changed back, the protein refolds and becomes active again.

  3. Methods of 3D structure determination • Experimental approaches: expensive, slow • Nuclear magnetic resonance (NMR) • X-ray crystallography • Today we have much more sequenced proteins than protein’s structures. The gap is rapidly increasing. • Protein structure prediction is becoming increasingly important.

  4. Protein Structure • Primary structure – sequence of amino acids constituting polypeptide chain • Secondary structure – local organization of polypeptide chain into secondary structures such as  -helices and  -sheets • Tertiary structure –three dimensional arrangements of amino acids as they react to one another due to polarity and interactions between side chains • Quaternary structure – Interaction of several protein subunits

  5. Amino acids • Hydrophobic: Glycine(G), Alanine(A), Valine(V), phenylalanine (F), Proline (P), Methionine (M), isoleucine (I), Leucine(L), Tryptophan (W) • Charged: Aspartic acid (D), Glutamic Acid (E), Lysine (K), Arginine (R), Histidine (H) • Polar: Serine (S), Theronine (T), Tyrosine (Y); Histidine (H), Cysteine (C), Asparagine (N), Glutamine (Q), Tryptophan (W)

  6. Types of Secondary Structures •  Sheets •  -Helices • Loops • Image source: http://www.ebi.ac.uk/microarray/biology_intro.html

  7.  Helix • Most abundant secondary structure • 3.6 amino acids per turn • Hydrogen bond formed between every fourth reside • Average length: 10 amino acids, or 3 turns • Varies from 5 to 40 amino acids

  8.  Helix • Every third amino acid tends to be hydrophobic • Rich in alanine (A), gutamic acid (E), leucine (L), and methionine (M) • Poor in proline (P), glycine (G), tyrosine (Y), and serine (S)

  9.  Sheet Image source: http://broccoli.mfn.ki.se/pps_course_96/ss_960723_12.html; http://www4.ocn.ne.jp/~bio/biology/protein.htm

  10.  Sheet • Hydrogen bonds between 5-10 consecutive amino acids in one portion of the chain with another 5-10 farther down the chain • Interacting regions may be adjacent with a short loop, or far apart with other structures in between • Directions: • Same: Parallel Sheet • Opposite: Anti-parallel Sheet • Mixed: Mixed Sheet • Pattern of hydrogen bond formation in parallel and anti-parallel sheets is different

  11. Interactions in Helices and Sheets

  12. Loops • Regions between  helices and  sheets • Various lengths and three-dimensional configurations • Located on surface of the structure • More variable sequence structure • Tend to have charged and polar amino acids • Frequently a component of active sites

  13. Classes of Protein Structure The classes are made based on the percentages of secondary structure components. 1)  Class :: bundles of  -helices connected by loops on surface of proteins 2)  Class  : antiparallel  sheets, usually two sheets in close contact forming sandwich 3)  Class  / : mainly parallel  sheets with intervening  helices; may also have mixed  sheets (metabolic enzymes) 4) Class  +  : mainly segregated  -helices and antiparallel  sheets

  14.  Class Protein (hemoglobin)

  15.  Class Protein (T-Cell CD8)

  16. /  Class Protein(tryptohan synthase)

  17. + Class Protein(1RNB)

  18. Protein structure database • Databases of three dimensional structures of proteins, where structure has been solved using X-ray or NMR techniques • Protein Databases: • PDB • SCOP • Swiss-Prot • PIR • Most extensive for 3-D structure is the Protein Data Bank (PDB). • Current release of PDB (April 8, 2003) has 20,622 structures

  19. Partial PDB File ATOM 1 N VAL A 1 6.452 16.459 4.843 7.00 47.38 3HHB 162 ATOM 2 CA VAL A 1 7.060 17.792 4.760 6.00 48.47 3HHB 163 ATOM 3 C VAL A 1 8.561 17.703 5.038 6.00 37.13 3HHB 164 ATOM 4 O VAL A 1 8.992 17.182 6.072 8.00 36.25 3HHB 165 ATOM 5 CB VAL A 1 6.342 18.738 5.727 6.00 55.13 3HHB 166 ATOM 6 CG1 VAL A 1 7.114 20.033 5.993 6.00 54.30 3HHB 167 ATOM 7 CG2 VAL A 1 4.924 19.032 5.232 6.00 64.75 3HHB 168 ATOM 8 N LEU A 2 9.333 18.209 4.095 7.00 30.18 3HHB 169 ATOM 9 CA LEU A 2 10.785 18.159 4.237 6.00 35.60 3HHB 170 ATOM 10 C LEU A 2 11.247 19.305 5.133 6.00 35.47 3HHB 171 ATOM 11 O LEU A 2 11.017 20.477 4.819 8.00 37.64 3HHB 172 ATOM 12 CB LEU A 2 11.451 18.286 2.866 6.00 35.22 3HHB 173 ATOM 13 CG LEU A 2 11.081 17.137 1.927 6.00 31.04 3HHB 174 ATOM 14 CD1 LEU A 2 11.766 17.306 .570 6.00 39.08 3HHB 175 ATOM 15 CD2 LEU A 2 11.427 15.778 2.539 6.00 38.96 3HHB 176

  20. Description of PDB File • second column: amino acid position in the polypeptide chain • fourth column: current amino acid • Columns 7, 8, and 9: x, y, and z coordinates (in angstroms) • The 11th column: temperature factor -- can be used as a measurement of uncertainty

  21. Visualization of Proteins • Most popular program for viewing 3-dimensional structures is Rasmol Rasmol: http://www.umass.edu/microbio/rasmol/ Chime: http://www.umass.edu/microbio/chime/ Cn3D: http://www.ncbi.nlm.nih.gov/Structure/ Mage: http://kinemage.biochem.duke.edu/website/kinhome.html Swiss 3D viewer: http://www.expasy.ch/spdbv/mainpage.html

More Related