
What the Protein Data Bank teaches us about structural biology

Helen M. Berman, NCMI Workshop, December 13, 2008.


Presentation Transcript


  1. What the Protein Data Bank teaches us about structural biology. Helen M. Berman, NCMI Workshop, December 13, 2008

  2. 1960’s • Protein crystallography begins to take off • Emerging interest in protein folding • Use of computer graphics to represent structure • Nobel Prize awarded for the first 3D protein structures: myoglobin and hemoglobin. Structures shown: myoglobin, hemoglobin, lysozyme, ribonuclease. Myoglobin: Kendrew, Bodo, Dintzis, Parrish, Wyckoff, Phillips (1958) Nature 181: 662-666; Hemoglobin: Perutz (1962) Proc. R. Soc. A265: 161-187; Lysozyme: Blake, Koenig, Mair, North, Phillips, Sarma (1965) Nature 206: 757; Ribonuclease: Kartha, Bello, Harker (1967) Nature 213: 862-865; Wyckoff, Hardman, Allewell, Inagami, Johnson, Richards (1967) J. Biol. Chem. 242: 3753-3757.

  3. 1970’s • Grass roots community efforts to archive data • Protein crystallographers discuss how to archive data • June 1971 Cold Spring Harbor meeting brings groups together (Cold Spring Harbor Symposia on Quantitative Biology, vol. XXXVI, 1972) • October 1971 PDB is announced in Nature New Biology (7 structures; vol 233, 1971, page 223) • 1975 PDB receives first funding from NSF (~32 structures)

  4. Hemoglobin: M.F. Perutz (1962) Proc. R. Soc. A265: 161-187; Carboxypeptidase A: F.A. Quiocho, W.N. Lipscomb (1971) Adv Protein Chem 25: 1-78; Myoglobin: J.C. Kendrew, G. Bodo, H.M. Dintzis, R.G. Parrish, H. Wyckoff, D.C. Phillips (1958) Nature 181: 662-666; Subtilisin: R.A. Alden, J.J. Birktoft, J. Kraut, J.D. Robertus, C.S. Wright (1971) Biochem Biophys Res Commun 45: 337-344; Alpha-chymotrypsin: J.J. Birktoft, D.M. Blow (1972) J Mol Biol 68: 187-240; Pancreatic trypsin inhibitor: R. Huber, D. Kukla, A. Ruhlmann, O. Epp, H. Formanek (1970) Naturwissenschaften 57: 389-392; Rubredoxin: K.D. Watenpaugh, L.C. Sieker, J.R. Herriott, L.H. Jensen (1973) Acta Crystallogr B29: 943-956; Lactate dehydrogenase: J.L. White, M.L. Hackert, M. Buehner, M.J. Adams, G.C. Ford, P.J. Lentz Jr., I.E. Smiley, S.J. Steindel, M.G. Rossmann (1976) J Mol Biol 102: 759-779; Cytochrome b5: F.S. Mathews, P. Argos, M. Levine (1972) Cold Spring Harb Symp Quant Biol 36: 387-395; Papain: J. Drenth, J.N. Jansonius, R. Koekoek, H.M. Swen, B.G. Wolthers (1968) Nature 218: 929-932

  5. Enzymes. [Figure: proportion of enzyme classes relative to total enzyme structures, by decade; y-axis: percent; classes: oxidoreductases, transferases, hydrolases, lyases, isomerases, ligases.] Lysozyme: Blake, Koenig, Mair, North, Phillips, Sarma (1965) Nature 206: 757. Ribonuclease: Kartha, Bello, Harker (1967) Nature 213: 862-865; Wyckoff, Hardman, Allewell, Inagami, Johnson, Richards (1967) J. Biol. Chem. 242: 3753-3757.

  6. RNA-containing structures (1317). [Figure: number of structures by decade (1972-1979, 1980-1989, 1990-1999, 2000-2008); categories: RNA only, protein/RNA complexes, DNA/RNA hybrid, protein/DNA/RNA complexes.] tRNA: J.L. Sussman, S.-H. Kim (1976) Biochem Biophys Res Commun 68: 89-96; J.D. Robertus, J.E. Ladner, J.T. Finch, D. Rhodes, R.S. Brown, B.F.C. Clark, A. Klug (1974) Nature 250: 546-551.

  7. 1980’s • Technology takes off • Structural biology is able to focus on medical problems • Community efforts to promote data sharing • IUCr guidelines requiring data deposition in the PDB are published

  8. DNA-containing structures (2474). [Figure: number of structures by decade; categories: DNA only, protein/DNA complexes, DNA/RNA hybrid, protein/DNA/RNA complexes.] B-DNA (1bna): Dickerson & Drew (1981) J. Mol. Biol. 149: 761-786. Z-DNA (2dcg): Wang, Quigley, Kolpak, Crawford, van Boom, van der Marel, Rich (1979) Nature 282: 680-686.

  9. Protein-nucleic acid complexes (1920). [Figure: number of structures by decade; categories: protein/DNA complexes, protein/RNA complexes, protein/DNA/RNA complexes.] Phage 434 repressor-operator (2or1): Aggarwal, Rodgers, Drottar, Ptashne, & Harrison (1988) Science 242: 899-907.

  10. Viruses (280 total). [Figure: number of structures by decade; bar values: 20, 121, 139; decades: 1980-1989, 1990-1999, >=2000.] Hopper, Harrison, Sauer (1984) Structure of tomato bushy stunt virus. V. Coat protein sequence determination and its structural implications. J. Mol. Biol. 177: 701-713. Silva, Rossmann (1985) The refinement of southern bean mosaic virus in reciprocal space. Acta Crystallogr. B41: 147-157.

  11. Cooperative community action • Individual letters to editors of journals • Committees: IUCr commission on Biological Macromolecules, ACA/USNCCr, Richards committee • Funding agencies • Articles in journals. Fred Richards, Marvin Cassman, Richard Dickerson

  12. 1990’s • Number of structures increases exponentially • Complexity of structures increases • mmCIF dictionary created • New databases begin to emerge • User base expands dramatically • PDB archive moves. [Photo: mmCIF Working Group members]

  13. Electron microscopy structures. Bacteriorhodopsin: Henderson, Baldwin, Ceska, Zemlin, Beckmann, Downing (1990) J. Mol. Biol. 213: 899-929.

  14. Ribosome structures (214). [Figure labels: ribosome, 50S, 30S; eukaryotic, prokaryotic.] Ban, Nissen, Hansen, Moore, & Steitz (2000) Science 289: 905-920; Clemons Jr., May, Wimberly, McCutcheon, Capel, & Ramakrishnan (1999) Nature 400: 833-840; Schluenzen, Tocilj, Zarivach, Harms, Gluehmann, Janell, Bashan, Bartels, Agmon, Franceschi, Yonath (2000) Cell 102: 615-623; Yusupova, Yusupov, Cate, & Noller (2001) Cell 106: 233-241.

  15. 2000’s • wwPDB is formed • Continued growth in structures • Structural genomics takes off

  16. www.wwpdb.org

  17. Depositions to the PDB by decade. [Figure: number of released entries by year.]

  18. July 2008

  19. What can we learn from the PDB?

  20. Structure distribution. [Figure: distribution of structures by type (protein only, protein-DNA complexes, protein-RNA complexes, DNA only, RNA only, RNA-DNA hybrid, other) and by GO process; chart labels as extracted: 218, 280, 17988, 23466, 2911, 819, 500, 4445.]

  21. Structure determination methods (April 30, 2008). [Figure: number of structures by decade; labeled values: 176, 6.]

  22. [Figure panels: resolution distribution of protein structures; resolution distribution of other structures; resolution distribution of all structures. Axes: resolution, year.]

  23. Distinct and novel protein sequences. [Figure: percent of distinct/novel structures by decade (1972-1979, 1980-1989, 1990-1999, 2000-2008); series: structures containing distinct protein sequences (<98%), structures containing novel protein sequences (<30%), subset of PSI structures, subset of other SG structures; bar labels as extracted: 63%, 51%, 39%, 37%, 32%, 7%, 27%, 7%, 16%, 14%, 25%, 4%, 2%, 10%.]
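
The legend on slide 23 distinguishes "distinct" (<98%) from "novel" (<30%) protein sequences. The following is a minimal sketch of how such labels could be assigned, assuming the thresholds refer to the highest percent sequence identity between a new sequence and any earlier one. The function names and the "redundant" label are hypothetical and do not come from the presentation.

```python
def classify_sequence(max_identity_to_earlier: float) -> str:
    """Label a sequence by its highest percent identity to any earlier sequence.

    Thresholds follow the slide legend (<98% distinct, <30% novel); reading
    them as percent sequence identity is an assumption.
    """
    if not 0.0 <= max_identity_to_earlier <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    if max_identity_to_earlier < 30.0:
        return "novel"  # a novel sequence is also distinct; the stricter label wins
    if max_identity_to_earlier < 98.0:
        return "distinct"
    return "redundant"  # hypothetical label for everything at or above 98%


def percent_by_label(identities):
    """Return the percentage of entries that carry each label."""
    counts = {"novel": 0, "distinct": 0, "redundant": 0}
    for value in identities:
        counts[classify_sequence(value)] += 1
    total = len(identities)
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Computing the identity values themselves (alignment, clustering) is outside this sketch; only the thresholding step is shown.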

  24. Redundancy: protein clusters

  25. Lysozyme: Lessons learned. T4 bacteriophage (459 structures) • Amino acid replacement studies suggest that the fraction of amino acid residues that define the structure of T4 lysozyme is about 50% (B.W. Matthews (1996) FASEB J. 10: 35-41) • Insight into folding and catalysis. Hen egg white (297 structures) • Low sequence identity • Structural similarity of active site to T4 (B.W. Matthews, S.J. Remington, M.G. Grutter, W.F. Anderson (1981) J. Mol. Biol. 147: 545-58) • Insight into evolution and catalysis. Blake, Koenig, Mair, North, Phillips, Sarma (1965) Nature 206: 757.

  26. Myoglobin and hemoglobin: Lessons learned. Whale myoglobin (185 structures) • Different ligands: oxygen, carbon monoxide [1] • Amino acid substitution studies [2] • Laue studies [3] • Insight into function and dynamics. Other species myoglobin • Low sequence identity, same structure [4] • Insight into evolution. Human hemoglobin (178 structures) • Insight into function and disease (sickle cell anemia, thalassemia) [5]. Other species hemoglobin • Low sequence identity, same structure [4] • Profound insight into evolution. Lodish et al. [6]. References: [1] Kuriyan, Wilz, Karplus, Petsko (1986) J. Mol. Biol. 192: 133-154; [2] Quillin, Arduini, Olson, Phillips Jr. (1993) J. Mol. Biol. 234: 140-155, Carver, Brantley Jr., Singleton, Arduini, Quillin, Phillips Jr., Olson (1992) J. Biol. Chem. 267: 14443-14450; [3] Bourgeois, Vallone, Schotte, Arcovito, Miele, Sciara, Wulff, Anfinrud, Brunori (2003) PNAS 100: 8704-8709; [4] Dickerson, Geis (1983) Hemoglobin: structure, function, and pathology; [5] Kidd, Baker, Mathews, Brittain, Baker (2001) Prot. Sci. 10: 1739-1749, Harrington, Adachi, Royer Jr. (1998) J. Biol. Chem. 273: 32690-32696; [6] Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell (2000) Molecular Cell Biology, WH Freeman & Co.

  27. TIM barrel proteins: Lessons learned. TIM barrel structures (1727), http://www.cathdb.info • Share the same fold but represent significant sequence and functional diversity • Are enzymes or enzyme-related proteins involved in molecular or energy metabolism • Comparative structure analysis indicates evolutionary relatedness of TIM barrel proteins. Banner, Bloomer, Petsko, Phillips, Wilson (1976) Biochem. Biophys. Res. Commun. 72: 146-155; Nagano, Orengo, Thornton (2002) J. Mol. Biol. 321: 741-65.

  28. HIV-related structures (609). [Figure: number of structures by decade; categories: protease, reverse transcriptase, Gag protein, integrase, other; labeled values: 311, 122, 27, 39, 110.]

  29. HIV-1 protease (311); 226 structures with ligands. Navia, Fitzgerald, McKeever, Leu, Heimbach, Herber, Sigal, Darke, Springer (1989) Nature 337: 615-620; Wlodawer, Miller, Jaskolski, Sathyanarayana, Baldwin, Weber, Selk, Clawson, Schneider, Kent (1989) Science 245: 616-621. PDB IDs, grouped as on the slide: 1T7J, 1HPV; 2FXE, 2FXD, 2O4K, 2AQU, 2FND; 2RKG, 2RKF, 2QHC, 2Z54, 2Q5K, 2O4S, 1RV7, 1MUI; 2QAK, 2PYM, 2Q63, 2PYN, 2Q64, 2R5Q, 1OHR; 2R5P, 2B7Z, 2AVV, 2AVO, 2AVS, 1SGU, 1SDT, 1SDV, 1SDU, 1K6C, 1C6Y, 2BPX, 1HSG, 1HSH; 2O4N, 2O4L, 2O4P, 1D4Y, 1D4S; 2B60, 1RL8, 1SH9, 1N49, 1HXW; 3D1X, 3D1Y, 3CYX, 2NMW, 2NMZ, 2NNP, 2NMY, 2NNK, 1C6Z, 1FB7

  30. HIV-1 reverse transcriptase (110); 76 structures with ligands. Wang, Smerdon, Jager, Kohlstaedt, Rice, Friedman, Steitz (1994) Proc. Natl. Acad. Sci. USA 91: 7242-7246. [Figure: number of structures by year.] PDB IDs, grouped as on the slide: 2HND, 2HNY, 1S1U, 1S1X, 1LW0, 1LWE, 1LWC, 1LWF, 1JLB, 1JLF, 1FKP, 1VRT, 3HVT; 1JKH, 1IKW, 1IKV, 1FKO, 1FK9; 1T05; 1S6P

  31. Structural coverage of KEGG pathways

  32. Human biological pathways: complement and coagulation cascades pathway; regulation of actin cytoskeleton; small cell lung cancer; non-small cell lung cancer. Genes that contain a PDB structure are in red. Source: KEGG (http://www.genome.jp/kegg/)

  33. EM maps and Models in the PDB

  34. How EM experiments are archived

  35. EMDataBank • Created by EBI in 2002 for archiving EM maps • US deposition/annotation site added this year • Maps stored in CCP4/MRC format • Associated metadata stored in XML format • 580 entries total. Examples: nuclear pore complex, 85 Å (EMD-1097); rotavirus VP6 protein, 3.8 Å (EMD-1461)
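
Slide 35 notes that maps are stored in CCP4/MRC format. As a rough illustration of what that means in practice, the sketch below reads a few fields from the fixed 1024-byte header that begins such a file. It assumes a little-endian file and the standard word layout (ten 32-bit integers, then six 32-bit floats, with the "MAP " tag at word 53). The function name `read_map_header` and the dictionary keys are hypothetical.

```python
import struct

HEADER_SIZE = 1024  # fixed-size header at the start of an MRC/CCP4 map


def read_map_header(data: bytes) -> dict:
    """Parse grid size, data mode, and unit cell from an MRC/CCP4 header.

    Assumes little-endian byte order; a robust reader would inspect the
    machine stamp instead of assuming it.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 1024 bytes of header")
    # Words 1-10: NX, NY, NZ, MODE, three start indices, MX, MY, MZ.
    nx, ny, nz, mode, _, _, _, mx, my, mz = struct.unpack_from("<10i", data, 0)
    # Words 11-16: cell lengths (in angstroms) and cell angles (in degrees).
    cell = struct.unpack_from("<6f", data, 40)
    # Word 53 (byte offset 208) holds the ASCII tag "MAP ".
    map_id = data[208:212]
    return {
        "grid": (nx, ny, nz),
        "mode": mode,
        "sampling": (mx, my, mz),
        "cell_lengths": cell[:3],
        "cell_angles": cell[3:],
        "is_map": map_id == b"MAP ",
    }
```

To use it, open a downloaded map in binary mode and pass the first 1024 bytes to `read_map_header`. The density values that follow the header (and any extended header) are not read by this sketch.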

  36. EM entries in the PDB • Atomic coordinate models fitted to EM maps • Storage format for models and metadata is CIF • Matrix representations possible • Some large entries “break” PDB format • 230 entries total. Examples: PBCV-1 (1m4x, 1680 matrices); 80S ribosome (1s1h + 1s1i)
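
Slide 36 mentions that matrix representations are possible, with PBCV-1 described by 1680 matrices. The sketch below shows the general idea under the assumption that each matrix is a 3x4 rotation-plus-translation operator applied to one deposited set of coordinates. The function names are hypothetical, and the example does not parse any actual entry.

```python
def apply_matrix(matrix, coords):
    """Apply one 3x4 [rotation | translation] matrix to a list of (x, y, z) points."""
    transformed = []
    for x, y, z in coords:
        transformed.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix
        ))
    return transformed


def expand_assembly(matrices, coords):
    """Generate one transformed copy of the coordinates per matrix."""
    return [apply_matrix(m, coords) for m in matrices]
```

Storing a single copy of the coordinates plus a list of operators keeps a large symmetric assembly compact, since the full particle is generated only when needed.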

  37. PDBj

  38. Goals • Common data model • Data harvesting tools • “One-stop shop” for deposition and retrieval • Tools for visualization, segmentation, and assessment

  39. Acknowledgements: Wellcome Trust, EU, CCP4, BBSRC, MRC, EMBL; BIRD-JST, MEXT; NLM; NSF, NIGMS, DOE, NLM, NCI, NCRR, NIBIB, NINDS, NIDDK

  40. Acknowledgements: NIH GM079429 (Baylor, Rutgers, EBI), 2007-2012; EU Network of Excellence LSHG-CT-2004-50282 (EBI), 2004-2009
