
Protein Structure Databases



Presentation Transcript


  1. Protein Structure Databases • Databases of three-dimensional structures of proteins, where the structure has been solved using X-ray crystallography or nuclear magnetic resonance (NMR) techniques • Protein Databases: • PDB (Protein Data Bank) • Swiss-Prot • PIR (Protein Information Resource) • SCOP (Structural Classification of Proteins)

  2. Fibrous proteins have a structural role • Collagen is the most abundant protein in vertebrates. Collagen fibers are a major portion of tendons, bone and skin. Three helical collagen chains (left-handed helices, not alpha helices) wind around each other to form a triple helix, giving collagen its tough and flexible properties. • Fibroin fibers make the silk spun by spiders and silkworms stronger, weight for weight, than steel! The soft and flexible properties come from the beta structure. • Keratin is a tough insoluble protein that makes up the quills of the echidna, your hair and nails, and the rattle of a rattlesnake. The structure comes from alpha helices that are cross-linked by disulfide bonds. Source: http://www.prideofindia.net/images/nails.jpg http://opbs.okstate.edu/~petracek/2002%20protein%20structure%20function/CH06/Fig%2006-12.GIF http://my.webmd.com/hw/health_guide_atoz/zm2662.asp?printing=true

  3. The globular proteins The globular proteins have a number of biologically important roles. They include: • Cell motility – proteins link together to form filaments which make movement possible • Organic catalysts in biochemical reactions – enzymes • Regulatory proteins – hormones, transcription factors • Membrane proteins – MHC markers, protein channels, gap junctions • Defense against pathogens – poisons/toxins, antibodies, complement • Transport and storage – hemoglobin and myoglobin

  4. Proteins for cell motility Above: Myosin (red) and actin filaments (green) in coordinated muscle contraction. Right: Actin bound to the myosin binding site (groove in the red part of the myosin protein). Add energy (ATP) and myosin moves, moving actin with it. Source: http://www.ebsa.org/npbsn41/maf_home.html http://sun0.mpimf-heidelberg.mpg.de/shared/docs/staff/user/0001/24.php3?department=01&LANG=en

  5. Proteins in the Cell Cytoskeleton Eukaryotic cells have a cytoskeleton made up of straight hollow cylinders called microtubules (bottom left). They help cells maintain their shape, they act like conveyor belts moving organelles around in the cytoplasm, and they participate in forming spindle fibres in cell division. Microtubules are composed of filaments of the protein tubulin (top left). These filaments are compressed like springs, allowing microtubules to ‘stretch and contract’. Thirteen of these filaments attach side to side, a little like the slats in a barrel, to form a microtubule. This barrel-shaped structure gives strength to the microtubule. Figure caption: Tubulin forms helical filaments. Source: http://www.fz-juelich.de/ibi/ibi-1/Cellular_signaling/ http://cpmcnet.columbia.edu/dept/gsas/anatomy/Faculty/Gundersen/main.html

  6. Proteins speed up reactions – Enzymes Catalase speeds up the breakdown of hydrogen peroxide (H2O2), a toxic by-product of metabolic reactions, to the harmless substances water and oxygen: 2 H2O2 → 2 H2O + O2. The reaction is extremely rapid because the enzyme lowers the energy needed to kick-start the reaction (the activation energy). [Figure: energy against progress of reaction, from substrate to product. No catalyst = input of 71 kJ energy required; with catalase = input of 8 kJ energy required.]
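
The effect of lowering the activation energy can be estimated with the Arrhenius equation, k = A·exp(−Ea/(RT)). The sketch below is illustrative only: it assumes the slide's 71 kJ and 8 kJ figures are per mole, that both reactions share the same pre-exponential factor A, and that the temperature is body temperature (310 K). The function name `rate_enhancement` is made up for this example.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def rate_enhancement(ea_uncatalysed_kj, ea_catalysed_kj, temperature_k=310.0):
    """Ratio k_catalysed / k_uncatalysed from k = A * exp(-Ea / (R * T)).

    Assumes the same pre-exponential factor A for both reactions and
    activation energies given in kJ per mole (an assumption, see lead-in).
    """
    delta_joules = (ea_uncatalysed_kj - ea_catalysed_kj) * 1000.0
    return math.exp(delta_joules / (R * temperature_k))

# Activation energies taken from the slide: 71 kJ without catalyst, 8 kJ with catalase.
print(f"Estimated speed-up: {rate_enhancement(71, 8):.1e}")
```

Under these assumptions the estimated speed-up is on the order of 10^10, which shows why even a modest-looking drop in activation energy makes the catalysed reaction so much faster.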

  7. Proteins can regulate metabolism – hormones When your body detects an increase in the sugar content of blood after a meal, the hormone insulin is released from cells in the pancreas. Insulin binds to receptors on cell membranes, and this triggers the cells to absorb glucose for use or for storage as glycogen in the liver. Proteins span membranes – protein channels The CFTR membrane protein is an ion channel that regulates the flow of chloride ions. In people with cystic fibrosis, not enough of this protein gets inserted into the membranes. Secretions are then poorly hydrated and become thick, and the lungs and secretory ducts become blocked as a consequence. Source: http://www.biology.arizona.edu/biochemistry/tutorials/chemistry/page2.html http://www.cbp.pitt.edu/bradbury/projects.htm

  8. Proteins defend us against pathogens – antibodies Left: Antibodies like IgG, found in humans, recognise and bind to groups of molecules, or epitopes, found on foreign invaders. Right: The binding site of an antibody protein (left) interacting with the epitope of a foreign antigen (green). Source: http://www.biology.arizona.edu/immunology/tutorials/antibody/FR.html http://tutor.lscf.ucsb.edu/instdev/sears/immunology/info/sears-ab.htm http://www.spilya.com/research/ http://www.umass.edu/microbio/chime/

  9. Making Proteins How is such a diverse range of proteins possible? The code for making a protein is found in your genes (on your DNA). This genetic code is copied onto a messenger RNA (mRNA) molecule. The mRNA code is read in groups of three bases (codons) by ribosomes, which join amino acids together to form a polypeptide. This is known as gene expression. Source: http://genetics.nbii.gov/Basic1.html

  10. Gene Expression The order of bases in DNA is a code for making proteins. The code is read in groups of three. Cell machinery copies the code, making an mRNA molecule. This moves into the cytoplasm. Ribosomes read the code and accurately join amino acids together to make a protein. The protein folds to form its working shape. [Figure: a cell with its nucleus, a chromosome, and a gene on the DNA; the mRNA sequence AUGAGUAAAGGAGAAGAACUUUUCACUGGAUA; and a chain of amino acids shown as the single letters M, S, K, G, E, L, F and T.]
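
Reading the code in groups of three can be sketched in a few lines. This is a minimal illustration, not a model of the ribosome: it uses the standard genetic code, starts at the first base, stops at the first stop codon, and ignores any trailing bases that do not fill a codon. The names `CODON_TABLE` and `translate` are chosen for this example.

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, then third base in BASES order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(mrna):
    """Translate an mRNA string codon by codon until a stop codon ('*')."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# mRNA sequence shown on the slide (the last two bases do not form a full codon).
print(translate("AUGAGUAAAGGAGAAGAACUUUUCACUGGAUA"))  # prints MSKGEELFTG
```

The ten codons of the slide's mRNA give M, S, K, G, E, E, L, F, T, G, which matches the amino acid letters drawn in the figure.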

  11. The building blocks The amino acids for making new proteins come from the proteins that you eat and digest. Every time you eat a burger (veggie or beef), you break the proteins down into single amino acids ready for use in building new proteins. And yes, proteins have the job of digesting proteins; they are known as proteases. There are only 20 different amino acids, but they can be joined together in many different combinations to form the diverse range of proteins that exist on this planet.

  12. Amino Acids An amino acid is a relatively small molecule with characteristic groups of atoms that determine its chemical behaviour. The structural formula of an amino acid is shown at the end of the animation below. The R group is the only part that differs between the 20 amino acids. [Figure: structural formulas of phenylalanine, cysteine, glycine, alanine and valine, followed by the general amino acid formula with its amino group, acid group and R group labelled.]

  13. The 20 Amino Acids The amino acids each have their own shape and charge due to their specific R group. View the molecular shape of amino acids by clicking on the URL link below: http://sosnick.uchicago.edu/amino_acids.html Would the shape of a protein be affected if the wrong amino acid were added to a growing protein chain?

  14. Making a Polypeptide Polypeptide production = condensation reaction. [Figure: polypeptide growth. Amino acids, each with its own R group, are joined one after another by peptide bonds between the acid group of one amino acid and the amino group of the next.]
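
Because each peptide bond forms in a condensation reaction, one water molecule is released per bond, so a chain of n amino acids releases n − 1 waters. The sketch below turns that bookkeeping into a mass estimate. It is illustrative only: the table covers just the five amino acids named on the earlier slide, with approximate molar masses of the free amino acids in g/mol, and the function name `polypeptide_mass` is made up for this example.

```python
WATER_MASS = 18.015  # g/mol, approximate

# Approximate molar masses (g/mol) of the free amino acids, one-letter codes.
FREE_AMINO_ACID_MASS = {
    "G": 75.07,   # glycine
    "A": 89.09,   # alanine
    "V": 117.15,  # valine
    "C": 121.16,  # cysteine
    "F": 165.19,  # phenylalanine
}

def polypeptide_mass(sequence):
    """Mass of a linear peptide: sum of free amino acids minus one water per bond."""
    if not sequence:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence)
    peptide_bonds = len(sequence) - 1
    return total - peptide_bonds * WATER_MASS

# Dipeptide glycine-alanine: one peptide bond, so one water is lost.
print(round(polypeptide_mass("GA"), 3))
```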

  15. Protein structure

  16. Why Investigate Protein Structure? Proteins are complex molecules whose structure can be discussed in terms of: • primary structure • secondary structure • tertiary structure • quaternary structure The structure of proteins is important, as the shape of a protein allows it to perform its particular role or function.

  17. Four levels of protein structure

  18. Protein Primary Structure The primary structure is the sequence of amino acids that are linked together. The linear chain is called a polypeptide. Source: http://www.mywiseowl.com/articles/Image:Protein-primary-structure.png

  19. Protein Secondary Structure The secondary structure of proteins consists of: • alpha helices • beta sheets • random coils – usually form the binding and active sites of proteins Source: http://www.rothamsted.bbsrc.ac.uk/notebook/courses/guide/prot.htm#I

  20. Protein Tertiary Structure Involves the way the random coils, alpha helices and beta sheets fold with respect to each other. This shape is held in place by bonds such as • weak hydrogen bonds between amino acids that lie close to each other, • strong ionic bonds between R groups with positive and negative charges, and • disulfide bridges (strong covalent S-S bonds). Amino acids that were distant in the primary structure may become very close to each other after the folding has taken place. The subunit of a more complex protein has now been formed. It may be globular or fibrous. It now has its functional shape, or conformation. Source: io.uwinnipeg.ca/~simmons/ cm1503/proteins.htm

  21. Protein Quaternary Structure This is the packing of the protein subunits to form the final protein complex. For example, the human hemoglobin molecule is a tetramer made up of two alpha and two beta polypeptide chains (right). Source: www.ibri.org/Books/ Pun_Evolution/Chapter2/2.6.htm This is also when the protein associates with non-protein groups. For example, carbohydrates can be added to form a glycoprotein. Source: www.cem.msu.edu/~parrill/movies/neuram.GIF

  22. Protein Structure Prediction • Why? • Types of protein structure prediction • Secondary structure prediction • Homology modelling • Fold recognition • Ab initio • Secondary structure prediction • Why • History • Performance • Usefulness

  23. Why do we need structure prediction? • 3D structure gives clues to function: • active sites, binding sites, conformational changes... • structure and function are conserved more than sequence • 3D structure determination is difficult, slow and expensive • Intellectual challenge, Nobel prizes etc... • Engineering new proteins

  24. The Use of Structure

  25. The Use of Structure

  26. The Use of Structure

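  27. It's not that simple... • The amino acid sequence contains all the information for the 3D structure (Anfinsen's refolding experiments, early 1960s; Nobel Prize 1972) • But there are thousands of atoms, rotatable bonds, solvent and other molecules to deal with... • Levinthal's paradox: a chain cannot find its native fold by randomly trying every possible conformation, yet real proteins fold quickly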
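
Levinthal's paradox is easiest to appreciate as arithmetic. The sketch below is a back-of-the-envelope estimate, not a measurement: the numbers (3 conformations per residue, a 100-residue chain, 10^13 conformations sampled per second) are assumptions of the kind commonly used to illustrate the argument, and the function name is made up for this example.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365.25

def levinthal_search(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Return (number of conformations, years needed to try them all).

    All three parameters are illustrative assumptions, not measured values.
    """
    conformations = states_per_residue ** n_residues
    years = conformations / samples_per_second / SECONDS_PER_YEAR
    return conformations, years

conformations, years = levinthal_search(100)
print(f"{conformations:.2e} conformations, about {years:.1e} years to enumerate")
```

Even with these generous assumptions an exhaustive search would take vastly longer than the age of the universe, which is why prediction methods cannot simply enumerate conformations.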

  28. Structure prediction Summary of the four main approaches to structure prediction. Note that there are overlaps between nearly all categories.

  29. Secondary structure – α-helix

  30. Secondary structure – β-sheet

  31. Secondary structure - turns

  32. Secondary Structure Predictions Some highlights in performance • 1974 Chou and Fasman 50% • 1978 Garnier 62% • 1993 PHD 72% • 2000 PSIPRED 76%

  33. Secondary structure prediction: 1st generation methods • Chou and Fasman • Assign all residues the appropriate set of parameters. • Scan through the peptide and identify helical regions. • Repeat this procedure to locate all of the helical regions in the sequence. • Scan through the peptide and identify sheet regions. • Resolve conflicts between helical and sheet assignments. • Identify turns. • Claims of around 70-80% accuracy; actual accuracy about 50-60%
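
The helix-scanning steps above can be sketched as follows. This is a heavily simplified illustration, not the full Chou-Fasman procedure: it handles helices only (no sheets, turns or conflict resolution), nucleates a helix where at least 4 of 6 consecutive residues have helix propensity above 1.0, and extends it while the four residues at the growing end average at least 1.0. The propensity values are the helix parameters as commonly tabulated for the method and should be checked against the original publication before being relied on; the function name and thresholds are choices made for this sketch.

```python
# Helix propensities P(alpha) as commonly tabulated for Chou-Fasman (verify before use).
HELIX_PROPENSITY = {
    "E": 1.51, "M": 1.45, "A": 1.42, "L": 1.21, "K": 1.16, "F": 1.13, "Q": 1.11,
    "W": 1.08, "I": 1.08, "V": 1.06, "D": 1.01, "H": 1.00, "R": 0.98, "T": 0.83,
    "S": 0.77, "C": 0.70, "Y": 0.69, "N": 0.67, "P": 0.57, "G": 0.57,
}

def predict_helix(sequence, window=6, min_formers=4, threshold=1.0):
    """Return a string of 'H' (helix) and '-' (other), one character per residue."""
    p = [HELIX_PROPENSITY[aa] for aa in sequence]
    n = len(p)
    helix = [False] * n
    for start in range(n - window + 1):
        # Nucleation: enough helix formers inside the window?
        formers = sum(1 for value in p[start:start + window] if value > threshold)
        if formers < min_formers:
            continue
        left, right = start, start + window  # half-open interval [left, right)
        # Extension: grow while the 4 residues at the growing end average >= threshold.
        while left > 0 and sum(p[left - 1:left + 3]) / 4 >= threshold:
            left -= 1
        while right < n and sum(p[right - 3:right + 1]) / 4 >= threshold:
            right += 1
        for i in range(left, right):
            helix[i] = True
    return "".join("H" if h else "-" for h in helix)

print(predict_helix("PPPPAEELLKAMPPPP"))
```

A sequence of strong helix formers flanked by prolines is predicted as a helix in the middle with non-helical ends, which is the behaviour the scanning steps describe.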

  34. GOR III (Garnier, Osguthorpe, Robson, 1990) • Secondary structure depends on amino acid propensities • As in Chou-Fasman • Also influenced by neighboring residues • Helix capping • Turns etc. • How to include distant information? • Performance approximately 67%

  35. GOR III (Garnier, Osguthorpe, Robson, 1990) The helix propensity tables thus have 20 × 17 entries (20 amino acids at each of 17 window positions). Assign the state with the highest propensity.

  36. Status of predictions in 1990 • Predicted secondary structure segments are too short • About 65% accuracy • Worse for beta-strands • Example:

  37. Secondary structure prediction: 2nd generation methods • sequence-to-structure relationship modelled using more complex statistics, e.g. artificial neural networks (NNs) or hidden Markov models (HMMs) • evolutionary information included (profiles) • prediction accuracy >70% (PHD, Rost 1993)

  38. PHD predictions • Secondary structure "prediction" by homology • If a sequence of unknown secondary structure has a homologue of known structure, it is more accurate to make an alignment and copy the known secondary structure over to the unknown sequence than to do "ab initio" secondary structure prediction.
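
Copying secondary structure across an alignment can be sketched in a few lines. The example assumes the alignment has already been made and is given as two equal-length strings with "-" for gaps, and that the template's secondary structure is given as one character per template residue (H, E or C). The function name and the "?" placeholder for query residues with no aligned template residue are choices made for this sketch.

```python
def transfer_secondary_structure(aligned_query, aligned_template, template_ss, unknown="?"):
    """Copy template secondary structure onto the query through a pairwise alignment.

    aligned_query / aligned_template: equal-length strings with '-' as the gap symbol.
    template_ss: one state per ungapped template residue.
    Returns one state per ungapped query residue.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    result = []
    template_index = 0
    for query_char, template_char in zip(aligned_query, aligned_template):
        state = None
        if template_char != "-":
            state = template_ss[template_index]
            template_index += 1
        if query_char != "-":
            # Query residues opposite a template gap have nothing to copy.
            result.append(state if state is not None else unknown)
    return "".join(result)

# Hypothetical toy alignment: template MKALV with states HHHHE.
print(transfer_secondary_structure("MK-LVE", "MKALV-", "HHHHE"))  # prints HHHE?
```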

  39. 3rd generation methods • enhanced evolutionary sequence information (PSI-BLAST profiles) and larger sequence databases take Q3 (the three-state per-residue accuracy) to > 75% • PHD and PSIPRED are the best-known methods
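
Q3 is the percentage of residues whose predicted state matches the observed state, with every residue assigned to one of three states. A minimal sketch, assuming the states are written as H (helix), E (strand) and C (coil) and that both strings cover the same residues; the function name `q3` is chosen for this example.

```python
def q3(predicted, observed):
    """Three-state per-residue accuracy in percent."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Toy example: five of six residues agree.
print(round(q3("HHHECC", "HHHCCC"), 1))  # prints 83.3
```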

  40. PSIPRED • Similar to PHD • PSI-BLAST to detect more remote homologs • only two layers • SVM or NN gives similar performance

  41. Alignment of Protein Structure • Compare the 3D structure of one protein against the 3D structure of a second protein • Compare positions of atoms in the three-dimensional structures • Look for positions of secondary structural elements (helices and strands) within a protein domain • Examine distances between carbon atoms to determine the degree to which the structures may be superimposed • Side chain information can be incorporated • Buried; visible • Structural similarity between proteins does not necessarily mean an evolutionary relationship

  42. Alignment of Protein Structure

  43. Structure alignment Find a transformation T to achieve the best superposition. Simple case – two closely related proteins with the same number of amino acids.

  44. Types of Structure Comparison • Sequence-dependent vs. sequence-independent structural alignment • Global vs. local structural alignment • Pairwise vs. multiple structural alignment

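  45. Sequence-dependent Structure Comparison [Figure: two structures of seven residues each, with residues numbered 1 to 7 in both.] Alignment of positions 1234567: ASCRKLE against ASCRKLE, with all seven positions paired. Minimize the RMSD of the distances between paired residues 1-1, ..., 7-7.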

  46. Sequence-dependent Structure Comparison • Can be solved in O(n) time. • Useful in comparing structures of the same protein solved by different methods, in different conformations, or through dynamics. • Evaluating protein structure predictions.
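
When the residue pairing is fixed, the minimum RMSD over all rotations and translations can be computed from sums that take a single O(n) pass over the paired coordinates. The sketch below uses one standard way of doing this (the quaternion formulation, in which the minimum RMSD follows from the largest eigenvalue of a 4 × 4 matrix); the slides do not say which method they have in mind, so this is a swapped-in technique for illustration. It is written in plain Python, with a small Jacobi eigenvalue routine, and all function names are made up for this example.

```python
import math

def centred(points):
    """Translate points so that their centroid is at the origin."""
    n = len(points)
    centre = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - centre[k] for k in range(3)] for p in points]

def largest_eigenvalue(matrix, sweeps=50):
    """Largest eigenvalue of a symmetric matrix using cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        total = sum(x * x for row in a for x in row)
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off <= 1e-30 * total:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))

def superposed_rmsd(coords_a, coords_b):
    """Minimum RMSD between paired points (point i of A with point i of B)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    a, b = centred(coords_a), centred(coords_b)
    n = len(a)
    # One pass over the pairs: 3x3 correlation matrix and total squared norm.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)] for i in range(3)]
    g = sum(x * x for p in a for x in p) + sum(x * x for q in b for x in q)
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    return math.sqrt(max(0.0, (g - 2.0 * largest_eigenvalue(key)) / n))
```

For example, a structure compared with a rotated and translated copy of itself should give an RMSD of essentially zero, while two structures that differ in shape give a positive value however they are superimposed.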

  47. Sequence-independent Structure Comparison Given two configurations of points in three-dimensional space, find the transformation T which produces the “largest” superimposition of corresponding 3-D points.

  48. Evaluating Structural Alignments • Number of amino acid correspondences created • RMSD of corresponding amino acids • Percent identity in aligned residues • Number of gaps introduced • Size of the two proteins • Conservation of known active site environments • … There are no universally agreed-upon criteria. It depends on what you are using the alignment for.
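
Some of the criteria above can be read directly off the residue correspondences. The sketch below assumes the structural alignment has been written out as two equal-length strings with "-" marking gaps, and reports the number of correspondences, the percent identity among aligned residues, and the number of gap positions. The function name and the dictionary keys are choices made for this example.

```python
def alignment_summary(aligned_a, aligned_b):
    """Summarise a pairwise alignment given as two gapped strings of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    identical = sum(x == y for x, y in pairs)
    return {
        "aligned_pairs": len(pairs),
        "percent_identity": 100.0 * identical / len(pairs) if pairs else 0.0,
        "gap_positions": aligned_a.count("-") + aligned_b.count("-"),
    }

# Hypothetical toy alignment based on the ASCRKLE example sequence.
print(alignment_summary("ASC-RKLE", "ASCQRK-E"))
```

Which of these numbers matters most depends, as the slide says, on what the alignment is used for; a high number of correspondences with a low RMSD is usually weighed against the size of the two proteins.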
