980 likes | 997 Views
Template Based Protein Structure Modeling. Jianlin Cheng, PhD Professor Department of EECS Informatics Institute University of Missouri, Columbia 2019. Sequence, Structure and Function. AGCWY……. Cell. Protein Structure Determination. X-ray crystallography
E N D
Template Based Protein Structure Modeling Jianlin Cheng, PhD Professor Department of EECS Informatics Institute University of Missouri, Columbia 2019
Sequence, Structure and Function AGCWY…… Cell
Protein Structure Determination • X-ray crystallography • Nuclear Magnetic Resonance (NMR) Spectroscopy • Cryo-Electron Microscopy • X-ray: any size, accurate (1-3 Angstrom (10-10 m)), sometime hard to grow crystal • NMR: small to medium size, moderate accuracy, structure in solution
X-Ray Crystallography A protein crystal Mount a crystal Diffractometer Diffraction Protein structure Wikipedia
Pacific Northwest National Laboratory's high magnetic field (800 MHz, 18.8 T) NMR spectrometer being loaded with a sample. Wikipedia, the free encyclopedia
Key idea: measure the distance between atoms in protein • Build 3D structures by satisfying the distance between atoms using computational tools such as Crystallography and NMR system (CNS). • Kurt Wüthrich, Switzerland:Nobel Prize in Chemistry 2002, "for his development of nuclear magnetic resonance spectroscopy for determining the three-dimensional structure of biological macromolecules in solution"
Cryo-EM equipment
Key idea: generate 2D images of proteins from different angles, and them assemble them into one 3D structure. A lot of imaging techniques used.
Storage in Protein Data Bank Search database
Search Demo: Human P53 protein – 1KVP http://www.rcsb.org/pdb/explore/explore.do?structureId=4KVP
Structure Visualization • Rasmol (http://www.umass.edu/microbio/rasmol/getras.htm) • MDL Chime (plug-in) (http://www.mdl.com/products/framework/chime/) • Jmol: http://jmol.sourceforge.net/ • JSMol: java script version • Pymol: http://pymol.sourceforge.net/ • Chimera: https://www.cgl.ucsf.edu/chimera/
JSMol (4KVP, Human P53) • JSMol: http://www.rcsb.org/pdb/explore/jmol.do?structureId=4KVP&bionumber=1 • JMOL: 1VJP • Identify residues • Recognize atoms • Recognize peptide bonds • Identify backbone • Identify side chain • Analyze different visualization style
Protein Folding http://www.youtube.com/watch?v=fvBO3TqJ6FE&feature=fvw
Computational Protein Folding by MULTICOM (Demo) Free Energy Funnel-shaped energy landscape Bhattacharya & Cheng, 2015
AlphaFold Movie • https://deepmind.com/blog/alphafold/#gif-242
Alpha-Helix Jurnak, 2003
Beta-Sheet Parallel Anti-Parallel
Non-Repetitive Secondary Structure Loop Beta-Turn
Announcement – Next Class Data-driven modeling of protein structure, 3D genome and gene regulatory network Jianin Cheng Hosted by: Dr. Zezong Gu Monday, Feburary, 11, 2019 4:00 p.m. Pathology Conference Room MA223 Medical Sciences Building Annex Refreshments provided at 3:50 pm
tertiary structure (all atom) myoglobin
Quaternary Structure: Complex G-Protein Complex
Structure Analysis • Assign secondary structure for amino acids from 3D structure • Generate solvent accessible area for amino acids from 3D structure • Most widely used tool: DSSP (Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Kabsch and Sander, 1983)
DSSP server: http://bioweb.pasteur.fr/seqanal/interfaces/dssp-simple.html DSSP download: http://swift.cmbi.ru.nl/gv/dssp/ DSSP Code: H = alpha helix G = 3-helix (3/10 helix) I = 5 helix (pi helix) B = residue in isolated beta-bridge E = extended strand, participates in beta ladder T = hydrogen bonded turn S = bend Blank = loop
DSSP Web Service http://bioweb.pasteur.fr/seqanal/interfaces/dssp-simple.html
Solvent Accessibility Secondary Structure Amino Acids
Solvent Accessibility Size of the area of an amino acid that is exposed to solvent (water). Maximum solvent accessible area for each amino acid is its whole surface area. Hydrophobic residues like to be Buried inside (interior). Hydrophilic residues like to be exposed on the surface.
Dihedral / Torsional Angle 4 • http://en.wikipedia.org/wiki/Dihedral_angle 3 2 1 φ (phi, involving the backbone atoms C'-N-Cα-C'), ψ (psi, involving the backbone atoms N-Cα-C'-N)
Project Groups • 19 students? • Form 4 groups (4-5 students per group)
Protein Structure 1D, 2D, 3D B. Rost, 2005
3D structure and function Goal of Structure Prediction • Epstein & Anfinsen, 1961: sequence uniquely determines structure • INPUT: sequence • OUTPUT: This is a Nobel Prize Winning Problem!!! B. Rost, 2005
CASP – Olympics of Protein Structure Prediction • Critical Assessment of Techniques of Protein Structure Prediction • 1994,1996,1998,2000,2002,2004,2006, 2008, 2010, 2012, 2014, 2016, 2018 • Blind Test, Independent Evaluation • CASP13 (http://predictioncenter.org/casp13/index.cgi)
CASP13 Demo • http://predictioncenter.org/casp13/index.cgi
1D: Secondary Structure Prediction MWLKKFGINLLIGQSV… Helix Neural Networks / Deep Learning + Alignments Coil CCCCHHHHHCCCSSSSS… Strand Cheng, Randall, Sweredoski, Baldi. Nucleic Acid Research, 2005
2D: Contact Map Prediction 2D Contact Map 3D Structure 1 2 ………..………..…j...…………………..…n 1 2 3 . . . . i . . . . . . . n Distance Threshold = 8Ao Cheng, Randall, Sweredoski, Baldi. Nucleic Acid Research, 2005
Contact Prediction • PISCOV: http://bioinfadmin.cs.ucl.ac.uk/downloads/PSICOV/ • DNCON2: https://github.com/multicom-toolbox/DNCON2 • DeepCov https://github.com/psipred/DeepCov
Protein tertiary structure prediction is a space sampling / simuation / optimization problem.
Protein Energy Landscape & Free Sampling http://pubs.acs.org/subscribe/archive/mdd/v03/i09/html/willis.html
Two Approaches for 3D Structure Prediction MWLKKFGINLLIGQSV… • Ab Initio Structure Prediction Simulation …… Physical force field – protein folding Contact/distance map - reconstruction Select structure with minimum free energy • Template-Based Structure Prediction Query protein Fold MWLKKFGINKH… Recognition Alignment Template Protein Data Bank