
Protein Structure Lab

Michael Zimmermann, Ataur Katebi, and Ragothaman Yennamalli


Presentation Transcript


  1. CSBSI Short Course, June, 2010 • Protein Structure Lab • Michael Zimmermann, Ataur Katebi, Ragothaman Yennamalli

  2. Structures and Bioinformatics • Detailed genetic information informs organism-wide views

  3. Structures and Bioinformatics

  5. Today’s Plan • What are molecular structures? • Primary, Secondary, Tertiary, Quaternary Structure • Why we need them • Where do we get them? • PDB, NDB, and EMDB • Homology modeling • How do they interact? • DIP and Docking • How do we know what they do? • Genome annotation (what you’ve been doing) • Molecular motions • Molecular Dynamics • Normal Mode Analysis (Elastic Networks)

  6. What Are Molecular Structures? (and why are they important?)

  7. Central Dogma • DNA → RNA → Protein • DNA: CGACGGGGACGACGGGGACCATTT • RNA: GCUGCCCCUGCUGCCCCUGGUAAA • Protein: AAPAAPGK
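
The DNA → RNA → Protein example on this slide can be reproduced with a short sketch. It assumes, as the slide appears to, that the RNA is the position-by-position complement of the DNA string shown (strand direction is ignored for simplicity) and that translation uses the standard genetic code. The function names `transcribe` and `translate` are illustrative and do not come from the course material.

```python
# Standard genetic code, with codons ordered by bases U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Base pairing when RNA is copied from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_dna):
    """Return the RNA complementary to the DNA string (simplified: no reversal)."""
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())


def translate(rna):
    """Translate RNA codon by codon, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(rna) - len(rna) % 3, 3):
        amino_acid = CODON_TABLE[rna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    rna = transcribe("CGACGGGGACGACGGGGACCATTT")
    print(rna)             # RNA string from the slide
    print(translate(rna))  # one-letter protein sequence
```

With the slide's DNA string as input, the sketch yields the RNA and the eight-residue peptide shown on the slide.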

  8. Protein secondary structure elements (1arl) • (H) α-helices • (E) β-sheets • (C) Coils • Molecules are too small to see • Artistic depictions are informative

  9. Size and Scale http://learn.genetics.utah.edu/content/begin/cells/scale/

  10. Protein Structure

  11. α Helix

  12. Parallel β sheet

  13. Antiparallel β sheet

  14. Diverse Tertiary Structures

  15. Importance of the problem • # sequences >> # structures • Secondary structure may be used as an input for tertiary structure prediction • 1D problem is easier than 3D

  16. Scale of Sequence Versus Structure

  17. How do we get them? • Databases or Structure Prediction

  18. Assignments of secondary structure • Crystallographers assign (subjective) • Automatic assignments from the PDB coordinates • Dictionary of Secondary Structure of Proteins (DSSP) • Kabsch and Sander 1983 - based on positions of hydrogen bonds • STRIDE assignments

  19. DSSP assignments • 1. (H) α-Helix • 2. (E) Strand • 3. (G) 3₁₀ Helix • 4. (I) π-Helix • 5. (B) Bridge (single residue strand) • 6. (T) Turn • 7. (S) Bend • 8. (C) Coil

  20. Some ambiguity • Various translations of 8 DSSP states into 3 secondary structure states • Two versions of DSSP • EMBL (Heidelberg) version • Includes interchain hydrogen bonds • PDB version • Excludes interchain hydrogen bonds
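
The ambiguity in translating 8 DSSP states into 3 can be made concrete with a small sketch. The two mappings below are common conventions chosen for illustration; neither is stated on the slides, and the names `DSSP_TO_3STATE`, `STRICT_3STATE`, and `reduce_dssp` are hypothetical.

```python
# One widely used reduction: all helix types -> H, strand and bridge -> E, rest -> C.
DSSP_TO_3STATE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "C", "S": "C", "C": "C",
}

# A stricter alternative: only H and E keep their class; everything else is coil.
STRICT_3STATE = {"H": "H", "E": "E"}


def reduce_dssp(states, mapping=DSSP_TO_3STATE):
    """Map a string of 8-state DSSP codes to 3 states; unknown codes become coil."""
    return "".join(mapping.get(state, "C") for state in states)


if __name__ == "__main__":
    dssp = "HHGGIIEEBTSC"
    print(reduce_dssp(dssp))                 # broad reduction
    print(reduce_dssp(dssp, STRICT_3STATE))  # strict reduction
```

Because the two mappings disagree on G, I, and B residues, the same prediction can score differently depending on which reduction is used for the reference assignment.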

  21. Improvement of prediction by using multiple sequence alignments • Zvelebil et al 1987 • Levin, Pascarella, Argos & Garnier 1993 • Rost & Sander 1993 • Accuracy of prediction based on single sequences ~ 65% • Accuracy of prediction using multiple sequence alignments ~ 75% (for the most successful methods)
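
The accuracies quoted above are per-residue percentages. A minimal sketch of such a measure follows, assuming the usual three-state per-residue score (often called Q3), which the slides do not name explicitly. The function name `q3_accuracy` and the example strings are illustrative.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted 3-state class matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Made-up 8-residue example: 6 of 8 positions agree.
    print(q3_accuracy("HHHECCCC", "HHHHCCCE"))
```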

  22. New improved algorithm (GOR V) • Kloczkowski, Ting, Jernigan & Garnier • New database of 513 non-redundant sequences proposed by Cuff and Barton • Additional statistics of triplets • Resizable window (size of the window is adjusted to the length of the sequence) • Optimization of parameters • Decision parameters to increase the accuracy of prediction for β-sheets • Multiple sequence alignments with PSI-BLAST (FASTA + CLUSTAL in an early version)

  23. GOR V • >gi|42572793|ref|NP_974493.1| myb family transcription factor [Arabidopsis thaliana] • MDNHRRTKQPKTNSIVTSSSEVSSLEWEVV • SQEEEDLVSRMHKLVGDRWELIAGRIPGRT • AGEIERFWVMKN • GOR V server: http://gor.bb.iastate.edu/

  24. References • A. Kloczkowski, K-L. Ting, R.L. Jernigan and J. Garnier. Protein secondary structure prediction based on the GOR algorithm incorporating multiple sequence alignment. Polymer, 2002, 43, 441-449 • A. Kloczkowski, K-L. Ting, R.L. Jernigan and J. Garnier. Combining GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence. Proteins: Structure, Function, and Genetics, 2002, 49, 154-166

  25. Other methods • PSIPRED (Neural Network) http://bioinf.cs.ucl.ac.uk/psipred/psiform.html • PHD (Neural Network) http://cubic.bioc.columbia.edu/predictprotein/ • JPRED (Neural Network) http://www.compbio.dundee.ac.uk/~www-jpred/submit.html • SAM-T99 (Hidden Markov Models) http://www.cse.ucsc.edu/research/compbio/HMM-apps/T99-query.html • META servers http://cubic.bioc.columbia.edu/predictprotein/submit_meta.html • compare with actual structure • problem of turning into 3D structure

  26. Retrieving, Viewing, and Analyzing Molecular Structure Files

  27. Where to get Molecular Files • http://www.rcsb.org/ • http://ndbserver.rutgers.edu • http://www.emdatabank.org/

  28. Molecule Files • The Protein Data Bank (PDB) file 1T3R

          Record  Atom#  AtomType  Residue  ChainID  Residue#  X       Y       Z       Occupancy  B-Factor  Element
          ATOM    8      N         GLN      A        2         25.279  22.419  34.914  1.00       21.01     N
          ATOM    9      CA        GLN      A        2         23.872  22.620  34.516  1.00       17.82     C
          ATOM    10     C         GLN      A        2         23.654  24.078  34.247  1.00       18.11     C
          ATOM    11     O         GLN      A        2         23.996  24.956  35.114  1.00       20.40     O
          ATOM    12     CB        GLN      A        2         22.926  22.138  35.611  1.00       19.10     C
          ATOM    13     CG        GLN      A        2         21.447  22.401  35.328  1.00       18.52     C
          ATOM    14     CD        GLN      A        2         20.558  21.549  36.121  1.00       21.32     C
          ATOM    15     OE1       GLN      A        2         20.145  20.502  35.662  1.00       22.49     O
          ATOM    16     NE2       GLN      A        2         20.336  21.926  37.380  1.00       21.05     N
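
As a rough illustration of how these fields are read programmatically, the sketch below parses one ATOM record by column position. It assumes the standard fixed-width PDB layout; since the transcript has collapsed the original spacing, the example line is rebuilt from the CA atom's values with a helper. Both `parse_atom_line` and `format_atom_line` are hypothetical names, not part of any course software.

```python
def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM record into a dictionary."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError("not an ATOM/HETATM record: %r" % record)
    return {
        "record": record,
        "serial": int(line[6:11]),         # Atom#
        "name": line[12:16].strip(),       # AtomType
        "res_name": line[17:20].strip(),   # Residue
        "chain_id": line[21].strip(),      # ChainID
        "res_seq": int(line[22:26]),       # Residue#
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),    # B-Factor
        "element": line[76:78].strip(),    # Element
    }


def format_atom_line(serial, name, res_name, chain_id, res_seq,
                     x, y, z, occupancy, b_factor, element):
    """Build a fixed-width ATOM line (no alternate location or insertion code)."""
    template = ("ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    "
                "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2s}")
    return template.format(serial, name, res_name, chain_id, res_seq,
                           x, y, z, occupancy, b_factor, element)


if __name__ == "__main__":
    # Values of the CA atom of GLN A 2 from the 1T3R excerpt above.
    line = format_atom_line(9, " CA", "GLN", "A", 2,
                            23.872, 22.620, 34.516, 1.00, 17.82, "C")
    print(line)
    print(parse_atom_line(line))
```

Slicing by column rather than splitting on whitespace matters for real files, where adjacent fields can touch when coordinates or serial numbers are large.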

  29. sdf • mol2 (MOL2: SYBYL Tripos format) • SMILES (convert to 3D with CORINA)

  30. Molecular Visualization • UIUC • UCSF • DeLano Scientific and Schrödinger

  31. Homology Modeling

  32. Homology Modeling • Use when sequence identity is > 35% • 1233 known topologies (CATH) • ≈70% of protein sequences (~50,000,000) • template selection • sequence-to-structure alignment • model building • model selection and refinement
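
The 35% identity rule of thumb can be checked from a pairwise alignment. The sketch below assumes identity is counted over aligned (non-gap) columns; other definitions divide by alignment length or by the shorter sequence length, so results may differ. The function names and the toy alignment are illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gap columns
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


def suitable_template(aligned_target, aligned_template, threshold=35.0):
    """Apply the identity cutoff from the slide to an aligned target/template pair."""
    return percent_identity(aligned_target, aligned_template) > threshold


if __name__ == "__main__":
    target = "MKV-LA"
    template = "MRVQLA"
    print(percent_identity(target, template))
    print(suitable_template(target, template))
```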

  34. Protein Machines • Most biochemical processes taking place in vivo are controlled by proteins: • gene expression and regulation (nuclear receptors) • metabolic pathways (enzymes) • immune system (antibodies) • signal transduction (trans-membrane receptors) • structural (collagen) • Fully automated • Highly specific

  35. Classical Structure Determination • Protein structures are solved mostly by: • X-ray crystallography (or SAXS) • NMR spectroscopy • Cryo-EM • All methods require a lot of human input from highly trained specialists. • Time-consuming • $10,000 - $1,000,000 for one structure.

  36. Homology Modeling

  37. Template Detection • Sequence-only methods: • BLAST, FASTA scan against PDB database. • PSI-BLAST scan against sequence database. • Profile comparison: • Profile-to-profile alignment on structural database. • Threading: • Optimal fitting of modeled sequence to structures from PDB. • Metaservers: • Combination of all above (and others).

  38. Modeling • Template is used as a rigid scaffold. • Modeling algorithm rebuilds missing parts (loops) • Template is used as a semi-flexible scaffold. • Usually a great number of models are generated • Modeller (A. Sali), Rosetta (D. Baker), CABS (A. Kolinski), UnRes (H. Scheraga), I-TASSER (Y. Zhang)

  39. Homology Modeling Example • See “Homology Modeling.pdf”

  40. How do they interact? • DIP: http://dip.doe-mbi.ucla.edu/dip/Main.cgi

  41. An Introduction to Docking

  42. Outline • Introduction to docking • Protein-protein docking • Protein-ligand docking • Protein-ligand docking: “Hands-on”

  43. What is docking? • Prediction of the optimal physical configuration and energy between two molecules • Solving the docking problem involves: • 1. Finding the orientation that maximizes the interaction • 2. Searching for the minimum-energy conformation • 3. Predicting structural rearrangement

  44. Why docking? • Predicting Biomolecular interactions • Computer aided analysis is time saving • Automated prediction of molecular interactions is the key to rational drug design • Measuring the relative strength of interactions in a cluster of interacting proteins • Drug design: Virtual Screening • Drug molecule database growth

  45. Different types of docking • Protein-protein docking: • Two proteins of approximately the same size • Protein-ligand docking: • A large molecule (the receptor) and a small molecule (the ligand)

  46. Rigid body and flexible docking • Rigid body docking: • Bond angles, bond lengths, and torsion angles of the components are not modified • Flexible docking: • Permits conformational change
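
To illustrate what "not modified" means in rigid body docking, the sketch below moves a small set of atoms by a rotation plus a translation and shows that the distance between two atoms is unchanged. For brevity it rotates about the z-axis only; an actual docking search would sample full 3D rotations. The coordinates and function names are made up for illustration.

```python
import math


def rigid_transform(coords, angle_deg, translation):
    """Rotate points about the z-axis by angle_deg, then translate them."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    tx, ty, tz = translation
    return [
        (cos_t * x - sin_t * y + tx, sin_t * x + cos_t * y + ty, z + tz)
        for x, y, z in coords
    ]


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical three-atom ligand (coordinates in angstroms).
    ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
    moved = rigid_transform(ligand, 90.0, (10.0, 0.0, 0.0))
    print(distance(ligand[0], ligand[2]), distance(moved[0], moved[2]))
```

Flexible docking would additionally change internal coordinates such as torsion angles, so these interatomic distances would no longer all be preserved.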

  47. Scoring function • Van der Waals: • $E_{\mathrm{vdW}} = A/r^{12} - B/r^{6}$, where A and B are positive constants and r is the distance between the two atoms • H-bond: • Occurs when a hydrogen atom of one molecule, close to the docking surface, interacts with an atom of the second molecule upon docking • Electrostatics: • The most significant force that draws parts of the molecules closer together or pushes them further apart according to their electrical charge.
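
A minimal sketch of two of these terms for a single atom pair follows. It assumes the conventional 12-6 Lennard-Jones form (repulsive r⁻¹² term, attractive r⁻⁶ term) and Coulomb's law with charges in units of e, distances in Å, and energies in kcal/mol. The parameter values are placeholders, not taken from any particular force field or docking program, and the hydrogen-bond term is omitted.

```python
COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2), approximate conversion factor


def lennard_jones(r, a, b):
    """Van der Waals term: repulsive a/r^12 minus attractive b/r^6."""
    return a / r ** 12 - b / r ** 6


def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic term for two point charges separated by distance r."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def pair_score(r, a, b, q1, q2, dielectric=1.0):
    """Sum of the van der Waals and electrostatic terms for one atom pair."""
    return lennard_jones(r, a, b) + coulomb(r, q1, q2, dielectric)


if __name__ == "__main__":
    # Placeholder parameters: with a=1 and b=2 the Lennard-Jones minimum is at r=1.
    for r in (0.8, 1.0, 1.5, 3.0):
        print(r, pair_score(r, a=1.0, b=2.0, q1=0.4, q2=-0.4))
```

A full scoring function would sum such terms over all receptor-ligand atom pairs and usually adds hydrogen-bond, desolvation, or entropy terms.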

  48. Protein-Protein Docking Examples • Based on performance in the most recent CAPRI (Critical Assessment of Predicted Interactions) rounds: • ZDOCK • ClusPro • AutoDock • RosettaDock • PatchDock • HADDOCK

  49. Protein-Ligand Docking Examples • DOCK • AutoDock • MOE-Dock • GOLD • FlexX • Glide • Hammerhead • FLOG
