
Bioinformatics and Computational Molecular Biology Geoff Barton http://www.compbio.dundee.ac.uk


Presentation Transcript


  1. Bioinformatics and Computational Molecular Biology Geoff Barton http://www.compbio.dundee.ac.uk

  2. Practical Tutorial • Dr David Martin: practical tutorial on the use of the PyMOL molecular graphics software. • In this lecture I will show lots of protein structures. To find them, use www.ebi.ac.uk/msd and/or the SCOP domains database (find it with Google).

  3. Similarities in Proteins • Lecture 1 • Overview of data in molecular biology • Protein modelling • Similarities of Protein Sequence, Structure, Function

  4. Introduction to Sequence Comparison • Lecture 2: • Why compare sequences? • Methods for sequence comparison/alignment. • Multiple alignment • Database searching - FASTA/BLAST • Iterative searching - PSI-BLAST

  5. Practical/WWW references • Organised by Drs Martin • Good preparation would be to look at http://www.ebi.ac.uk/Tools and http://www.ncbi.nlm.nih.gov • Look at BLAST and FASTA on these sites, as well as the database access facilities.

  6. Traditional biological research • Public Data: Journals. Conferences. • Private Data: Past experiments. Lab notebooks. Group discussions. • Analysis: Reading. Talking. Thinking. • Hypothesis! • Experiment: Design. Execution. • Publish!

  7. Bioinformatics/Computational Biology and biological research • Public Data: Journals. Conferences. DNA sequences, protein sequences, genetic maps, transcripts, 3D structures, proteomics results, SNP data, etc. • Private Data: Past experiments. Lab notebooks. Group discussions. DNA sequences, protein sequences, genetic maps, transcripts, 3D structures, proteomics results, SNP data, etc. • Analysis: Reading. Talking. Thinking. Computational analysis. Software development. • Hypothesis! Computer aided. • Experiment: Design. Execution. Computational experiments. Simulation. • Publish! Database submission. Database management.

  8. EMBL Nucleotide Sequence Database Growth (to 2nd Oct 2006) Taken from: www.ebi.ac.uk

  9. Protein Sequences • Approx. 3,500,000 known for all species (Oct. 2006). • 25,000 for human (not counting splice variants and post-translational modifications).

  10. Protein 3D Structures • Approx. 39,000 known (much duplication).

  11. Biological data in context

  12. Overview of Biological Hierarchy... • Ecosystem: many different organisms • Population: group of the same type of organism • Family: group with known common lineage • Whole organism: animal, plant, etc. • Tissue/organ: brain, heart, lungs, blood, ... • Cell: nerve, muscle, etc. • Organelle: nucleus, mitochondria, etc. • Nucleus • Chromosome • Gene • Molecular Levels: DNA, RNA, Protein Sequence, Protein 3D structure, Molecular function

  13. Technology and data in biology • Expression Data (Transcriptomics) • Which of the genes are switched on in which cells/tissues, and when? • What are the effects of drugs and disease on expression patterns? • DNA ‘chip’ technology. • [Diagram: biological hierarchy, from Ecosystem down to Molecular function]

  14. Technology and data in biology • Protein Expression Data (Proteomics) • Which proteins are being produced in which cells/tissues, and when? • Which modified forms are present? • What are the effects of drugs and disease on these patterns? • 2D gels + mass spectrometry. • [Diagram: biological hierarchy, from Ecosystem down to Molecular function]

  15. Technology and data in biology • Protein 3D Structure, the bridge to chemistry (Structural Genomics) • What is the atomic level structure of the protein? • What other molecules does it interact with? • What small molecules (potential drugs) does it interact with? • What are the effects of point mutations on the structure? • X-ray crystallography, NMR spectroscopy, single particle cryo-electron microscopy. • [Diagram: biological hierarchy, from Ecosystem down to Molecular function]

  16. Overview of Biological Hierarchy... Macroscopic Levels • Ecosystem: many different organisms • Population: group of the same type of organism • Family: group with known common lineage • Whole organism: animal, plant, etc. • Tissue/organ: brain, heart, lungs, blood, ... • Cell: nerve, muscle, etc. • Organelle: nucleus, mitochondria, etc. • Nucleus • Chromosome • Gene • DNA • RNA • Protein Sequence • Protein 3D structure • Molecular function

  17. Biology is now a data-intensive science • To do good science, you need to know how to use (and not abuse) computational tools.

  18. Protein Structure Prediction • ‘Homology’ modelling • Relies on the fact that similarity of sequence implies similarity of 3D structure.

  19. ? Lysozyme (1lz1) α-lactalbumin (1alc) • Imagine we don’t know the 3D structure of α-lactalbumin, but we do know its amino acid sequence and that of lysozyme.

  20. ? Lysozyme (1lz1) α-lactalbumin (1alc) • 37.7% Identity, Z=17.6
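
A figure such as "37.7% Identity" is computed from a pairwise alignment. The sketch below is one minimal way to do it, assuming identity is counted over columns where both sequences have a residue (other conventions divide by the alignment length or the shorter sequence length, and the slide does not say which was used). The two aligned fragments are invented toy strings, not the real 1lz1/1alc alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue ('-' marks a gap).
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, for illustration only.
toy_a = "KVFER-CELAR"
toy_b = "KQFTKACELSQ"
print(percent_identity(toy_a, toy_b))  # 5 matches over 10 gap-free columns
```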

  21. Protein structure prediction (Homology Modelling) • Align sequence of protein of unknown structure to sequence of protein of known structure. • In ‘conserved core’ of protein, substitute the amino acid types into the known structure. • Deal with ‘loops’ between the core elements of structure.
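
The first two steps above can be sketched as bookkeeping on an alignment: every column with a residue in both sequences tells us which amino acid type to place at which template position, and every gapped column marks a region that needs loop building. This is only an illustrative sketch; the function name and the toy sequences are assumptions, and no coordinates are built.

```python
def thread_onto_template(aligned_target: str, aligned_template: str):
    """Split an alignment into core substitutions, insertions and deletions.

    Returns (core, insertions, deletions):
      core       maps 1-based template position -> target residue type
      insertions lists (target position, residue) with no template position
      deletions  lists template positions with no target residue
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    core, insertions, deletions = {}, [], []
    target_pos = 0    # running 1-based position in the target sequence
    template_pos = 0  # running 1-based position in the template sequence
    for q, t in zip(aligned_target, aligned_template):
        if t != "-":
            template_pos += 1
        if q != "-":
            target_pos += 1
        if q != "-" and t != "-":
            core[template_pos] = q          # substitute this type into the known structure
        elif q != "-":
            insertions.append((target_pos, q))  # extra target residue: loop to build
        elif t != "-":
            deletions.append(template_pos)      # template residue to remove and bridge
    return core, insertions, deletions


# Hypothetical toy alignment ('-' marks a gap).
core, insertions, deletions = thread_onto_template("AC-DEF", "ACGD-F")
print(core, insertions, deletions)
```

Columns that land in `insertions` or `deletions` are where the 'loops' step has to do real modelling work.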

  22. Lysozyme (1lz1) α-lactalbumin (1alc) • 37.7% Identity, Z=17.6
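
The slide does not say how Z was obtained. One common approach, assumed here, is to compare the real alignment score with the scores obtained after shuffling one sequence many times, and to express the difference in standard deviations. The sketch uses a deliberately crude score (best count of identical positions over all ungapped offsets) to stay short; the function names, the number of shuffles and the test sequence are all illustrative.

```python
import random
import statistics


def best_ungapped_matches(a: str, b: str) -> int:
    """Largest number of identical positions over all ungapped offsets of b along a."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1
            for i, ch in enumerate(b)
            if 0 <= i + offset < len(a) and a[i + offset] == ch
        )
        best = max(best, matches)
    return best


def shuffle_z_score(a: str, b: str, n_shuffles: int = 100, seed: int = 0) -> float:
    """Z = (real score - mean shuffled score) / standard deviation of shuffled scores."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    real = best_ungapped_matches(a, b)
    letters = list(b)
    background = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # same composition, order destroyed
        background.append(best_ungapped_matches(a, "".join(letters)))
    mean = statistics.mean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        return float("inf") if real > mean else 0.0
    return (real - mean) / sd


# Hypothetical sequence compared with itself: the real score far exceeds the shuffled ones.
z = shuffle_z_score("MKVLAAGIVGLLLAQWERTYHHC", "MKVLAAGIVGLLLAQWERTYHHC")
print(z)
```

A real analysis would score gapped alignments with a substitution matrix rather than counting ungapped identities.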

  23. Protein structure prediction (Homology modelling) • Problems: • Need protein of known structure that is similar in sequence. • Building loops where there are deletions. • Verifying model. • Key is getting a good alignment in the first place. • Bad alignment => bad model.

  24. Good alignment on its own can: • Identify key residues (absolutely conserved) • Identify likely protein core (conserved hydrophobic residues) • Help predict protein secondary structure (not this lecture).
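
As a rough illustration of the first two points, the sketch below scans the columns of a multiple alignment and reports those that are absolutely conserved and those occupied only by hydrophobic residues. The hydrophobic set is one common choice among several, and the three-sequence alignment is a made-up toy example.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumption: definitions of "hydrophobic" vary


def classify_columns(alignment):
    """Return (identical, hydrophobic) lists of 1-based column numbers.

    identical:   every sequence has the same residue in the column
    hydrophobic: every residue in the column belongs to HYDROPHOBIC
    Columns containing a gap ('-') are skipped. A column can be in both lists.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    identical, hydrophobic = [], []
    for col in range(length):
        residues = {seq[col] for seq in alignment}
        if "-" in residues:
            continue
        if len(residues) == 1:
            identical.append(col + 1)
        if residues <= HYDROPHOBIC:
            hydrophobic.append(col + 1)
    return identical, hydrophobic


# Hypothetical toy alignment.
toy_alignment = ["GLVKC", "GIVRC", "GLIKC"]
identical, hydrophobic = classify_columns(toy_alignment)
print(identical, hydrophobic)
```

With only three sequences almost any column can look conserved by chance; the signal becomes meaningful with many diverse sequences.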

  25. Sequence alignment is a fundamental technique in molecular biology. • May predict proteins of common function even when no 3D structure is known. • May be used to predict 3D structure and so help understanding of mutants. • Some examples of where this is right and wrong...
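
These slides do not name an alignment algorithm, so the following is only an illustration of the idea: a minimal global alignment score by dynamic programming (the Needleman-Wunsch scheme). The match, mismatch and gap values are arbitrary assumptions; protein alignments normally use a substitution matrix and more careful gap penalties.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best global alignment score of a against b with a linear gap penalty."""
    # prev[j] holds the best score for aligning the first i-1 letters of a
    # with the first j letters of b; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first column: a[:i] aligned against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("AC", "AGC"))
```

Only the score is returned here; recovering the alignment itself requires keeping the full matrix and tracing back from the final cell.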

  26. Prediction of structure and function by similarity to known sequences and structures • Assumption is that similar sequence implies similar structure and function. • But what do we mean by “similar”? • Does similarity of sequence really imply similarity of function?

  27. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  28. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  29. Similar Sequence, Similar Structure, Similar Function. • e.g. Trypsin-like serine proteinases: same fold, same catalytic mechanism, but DIFFERENT specificity. • e.g. Immunoglobulin variable domains: same fold, similar binding function, but DIFFERENT specificity. • True of all examples. • Similarities only give clues to function; differences in specificity can be regarded as differences of function.

  30. Immunoglobulin Variable Domains e.g. see: 1a2y

  31. Tryptophan at core of Ig variable domain

  32. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  33. Lysozyme (1lz1) α-lactalbumin (1alc) • 37.7% Identity, Z=17.6

  34. ε-crystallin / L-Lactate Dehydrogenase

  35. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  36. Trypsin (3ptn) Subtilisin (2sec)

  37. Subtilisin (2sec) Trypsin (3ptn)

  38. Trypsin (3ptn): His-57, Asp-102, Ser-195 • Subtilisin (2sec): Asp-32, His-64, Ser-221
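
Both enzymes use the same three residue types, but they occur in a different order along the chain. The short sketch below sorts the residue numbers given on the slide to make that explicit; the dictionary layout and function name are illustrative.

```python
# Catalytic residues and their sequence numbers, as listed on the slide.
TRIADS = {
    "Trypsin (3ptn)": [("His", 57), ("Asp", 102), ("Ser", 195)],
    "Subtilisin (2sec)": [("Asp", 32), ("His", 64), ("Ser", 221)],
}


def sequence_order(triad):
    """Residue types ordered by their position along the chain."""
    return [name for name, _ in sorted(triad, key=lambda item: item[1])]


orders = {protein: sequence_order(triad) for protein, triad in TRIADS.items()}
for protein, order in orders.items():
    print(protein, "-".join(order))
```

The different ordering (His-Asp-Ser versus Asp-His-Ser) is a quick hint that the two active sites are not related by a simple sequence alignment.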

  39. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  40. Nature 398, 84-90, 1999 • PDB: 1b47

  41. 11% sequence identity • RMSD 1.47 Å over 70 residues • PDB: 1b47
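
The RMSD quoted above is the root-mean-square distance between equivalent atoms (typically Cα atoms) of two structures. A minimal sketch follows, assuming the two coordinate sets are already paired residue by residue and already superposed; finding the optimal superposition (for example with the Kabsch algorithm) is a separate step not shown here. The coordinates are made-up toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in Å: every atom is displaced by 1 Å along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(toy_a, toy_b))
```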

  42. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different

  43. PDB: 2ptk • PDB: 1bia • Russell, R. B. and Barton, G. J. (1993), "An SH2-SH3 Domain hybrid", Nature, 364, 765.

  44. PDB: 1bas • PDB: 2aai

  45. Matthews, S., et al. (1994), "The p17 Matrix Protein from HIV-1 is Structurally Similar to Interferon-gamma", Nature, 370, 666-668.

  46. Protein Sequence/Structure/Function Network • Sequence: Similar / Different • 3D Structure: Similar / Different • Function: Similar / Different • Does this ever happen?

  47. HIV Reverse Transcriptase (RT)

  48. HIV Reverse Transcriptase (RT)

  49. HIV Reverse Transcriptase (RT) - domain linkers

  50. Protein Sequence and Structural Similarity
