Protein threading algorithms

Protein threading algorithms. Presented by Jian Qiu. GenTHREADER: Jones, D. T. JMB (1999) 287, 797-815. Protein Fold Recognition by Prediction-based Threading: Rost, B., Schneider, R. & Sander, C. JMB (1997) 270, 471-480. Why do we need protein threading?

Presentation Transcript


  1. Protein threading algorithms. Presented by Jian Qiu • GenTHREADER: Jones, D. T. JMB (1999) 287, 797-815 • Protein Fold Recognition by Prediction-based Threading: Rost, B., Schneider, R. & Sander, C. JMB (1997) 270, 471-480

  2. Why do we need protein threading? • To detect remote homologues (for genome annotation): structures are better conserved than sequences, so remote homologues with low sequence similarity may still share significant structural similarity. • To predict protein structure from a structural template: if protein A shares structural similarity with protein B, we can model the structure of protein A using the structure of protein B as a starting point.

  3. A successful example by GenTHREADER • ORF MG276 from Mycoplasma genitalium was predicted to share structural similarity with 1HGX. • MG276 shares low sequence similarity (10% sequence identity) with 1HGX. • Supporting evidence: • MG276 is annotated as an adenine phosphoribosyltransferase, based on high sequence similarity to the Escherichia coli protein; 1HGX is a hypoxanthine-guanine-xanthine phosphoribosyltransferase from the protozoan parasite Tritrichomonas foetus. • Four functionally important residues in 1HGX are conserved in MG276. • The secondary structure prediction for ORF MG276 agrees very well with the observed secondary structure of 1HGX.

  4. Structure of 1HGX

  5. Functional residue conservation between 1HGX and MG276

  6. GenTHREADER Protocol: sequence alignment • For each template structure in the fold library, related sequences were collected using the program BLASTP. • A multiple sequence alignment of these sequences was generated with a simplified version of MULTAL. • The optimal alignment between the target sequence and the sequence profile of a template structure is found by dynamic programming.
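
The dynamic programming step can be sketched as a global (Needleman-Wunsch style) alignment of the target sequence against a position-specific profile. This is a minimal illustration only: the profile format (one dict of residue scores per template position), the linear gap penalty and the default mismatch score are assumptions, not details given in the slides.

```python
def align_to_profile(target, profile, gap=-2.0, default=-1.0):
    """Best global alignment score of `target` against `profile`.

    profile[j] maps a residue letter to its score at template position j
    (hypothetical format). `gap` is a linear gap penalty and `default` is
    the score for residues missing from a profile column (both assumed).
    """
    n, m = len(target), len(profile)
    # dp[i][j]: best score for aligning target[:i] with profile[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + profile[j - 1].get(target[i - 1], default)
            skip_target = dp[i - 1][j] + gap   # target residue unaligned
            skip_column = dp[i][j - 1] + gap   # profile column unaligned
            dp[i][j] = max(match, skip_target, skip_column)
    return dp[n][m]


# Toy three-column profile that favours A, C and D in turn.
toy_profile = [{"A": 2.0}, {"C": 2.0}, {"D": 2.0}]
full_score = align_to_profile("ACD", toy_profile)
gapped_score = align_to_profile("AD", toy_profile)
```

Here the full-length match scores higher than the target that skips the middle column, because the skipped column costs one gap penalty.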

  7. Threading Potentials. Pairwise potential (the pairwise model family): • k: sequence separation • s: distance interval • m_ab: number of pairs ab observed with sequence separation k • σ: weight given to each observation • f_k(s): frequency of occurrence of all residue pairs • f_k^ab(s): frequency of occurrence of residue pair ab
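
The pairwise formula itself is not reproduced on this slide. The symbols listed are those of the inverse-Boltzmann potential of mean force commonly used in THREADER-style threading, ΔE_k^ab(s) = RT ln[1 + m_ab σ] − RT ln[1 + m_ab σ f_k^ab(s) / f_k(s)], where σ is the weight given to each observation. A minimal sketch under that assumption follows; the function name, the value of RT and the example numbers are illustrative and not taken from the slides.

```python
import math

RT = 0.582  # kcal/mol near room temperature (assumed value)


def pairwise_energy(m_ab, sigma, f_ab, f_all, rt=RT):
    """Energy for residue pair ab at sequence separation k, distance interval s.

    m_ab  : number of ab pairs observed at sequence separation k
    sigma : weight given to each observation
    f_ab  : f_k^ab(s), frequency of pair ab in distance interval s
    f_all : f_k(s), frequency of all residue pairs in distance interval s
    """
    ratio = f_ab / f_all
    return rt * math.log(1 + m_ab * sigma) - rt * math.log(1 + m_ab * sigma * ratio)


# A pair seen twice as often as average in this interval gets a negative
# (favourable) energy; with no observations the term vanishes.
favourable = pairwise_energy(m_ab=100, sigma=0.02, f_ab=0.2, f_all=0.1, rt=1.0)
no_data = pairwise_energy(m_ab=0, sigma=0.02, f_ab=0.2, f_all=0.1, rt=1.0)
```

The `1 + m_ab σ` terms damp the potential toward zero when few observations of a pair are available, which is why sparse statistics do not produce extreme energies.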

  8. Solvation potential (the profile model family): • r: the degree of residue burial, i.e. the number of other Cβ atoms located within 10 Å of the residue's Cβ atom • f_a(r): frequency of occurrence of residue a with burial r • f(r): frequency of occurrence of all residues with burial r
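
The burial measure and the resulting energy can be sketched as below. The slide does not show the formula; the sketch assumes the usual inverse-Boltzmann form ΔE_solv^a(r) = −RT ln[f_a(r) / f(r)]. The function names, the value of RT and the toy coordinates are hypothetical.

```python
import math


def burial(cb_coords, i, cutoff=10.0):
    """Number of other C-beta atoms within `cutoff` angstroms of residue i."""
    center = cb_coords[i]
    return sum(
        1
        for j, other in enumerate(cb_coords)
        if j != i and math.dist(center, other) <= cutoff
    )


def solvation_energy(f_a_r, f_r, rt=0.582):
    """Assumed form: -RT * ln(f_a(r) / f(r)); rt is in kcal/mol (assumed)."""
    return -rt * math.log(f_a_r / f_r)


# Toy C-beta coordinates in angstroms: two close residues and one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
r_first = burial(coords, 0)
r_last = burial(coords, 2)
enriched = solvation_energy(0.2, 0.1, rt=1.0)
```

A residue type that is over-represented at a given burial level, relative to all residues, receives a negative energy at that burial level.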

  9. Variables considered to predict the relationship • Pairwise energy score • Solvation energy score • Sequence alignment score • Sequence alignment length • Length of the structure • Length of the target sequence

  10. Artificial Neural Network A node
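
The node figure is not preserved in this transcript. As a generic illustration, a single node in a feed-forward network computes a weighted sum of its inputs plus a bias and passes it through an activation function. The sigmoid activation, the weights and the input values below are assumptions for illustration, not GenTHREADER's trained parameters.

```python
import math


def node_output(inputs, weights, bias):
    """Sigmoid of the weighted input sum: 1 / (1 + exp(-(w . x + b)))."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Six hypothetical, pre-scaled inputs standing in for the six variables
# listed on slide 9 (energies, alignment score and lengths).
example_inputs = [0.4, -0.2, 0.9, 0.5, 0.3, 0.6]
example_weights = [1.0, -1.0, 2.0, 0.5, 0.0, 0.0]
activation = node_output(example_inputs, example_weights, bias=-1.0)
```

The output always lies strictly between 0 and 1, which is what makes a network score usable as a graded confidence measure.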

  11. Neural network architecture in GenTHREADER

  12. The effects of sequence alignment score and pairwise potential on the Network output

  13. Confidence level with different network scores: Low, Medium (80%), High (99%), Certain (100%)

  14. Genome analysis of Mycoplasma genitalium: all 468 ORFs were analyzed within one day.

  15. Distribution of protein folds in M. genitalium

  16. PHD: Predict 1D structure from sequence • Pipeline: Sequence → MaxHom → Multiple Sequence Alignment → PHDsec, PHDacc • Secondary structure: H (helix), E (strand), L (rest) • Solvent accessibility: Buried (<15%), Exposed (>=15%)

  17. Threading Protocol

  18. Similarity matrix in dynamic programming • Purely structure similarity matrix: six states (combination of three secondary structure states and two solvent accessibility states) • Purely sequence similarity matrix: McLachlan or Blosum62 • Combination of structure and sequence similarity matrices: M_ij = m * M_ij(1D structure) + (100 - m) * M_ij(sequence) • m = 0: sequence alignment only • m = 100: 1D structure alignment only
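
The six structural states and the weighted combination can be sketched as follows. The state labels, function names and toy scores are hypothetical; the 15% accessibility threshold and the mixing formula follow the slides, and the formula is applied literally without dividing by 100.

```python
def structure_state(secondary, rel_accessibility):
    """Map a residue to one of six 1D-structure states.

    secondary         : 'H' (helix), 'E' (strand) or 'L' (rest)
    rel_accessibility : relative solvent accessibility in percent
    """
    if secondary not in ("H", "E", "L"):
        raise ValueError("secondary structure must be H, E or L")
    exposure = "b" if rel_accessibility < 15 else "e"  # buried vs exposed
    return secondary + exposure


def combined_score(structure_score, sequence_score, m):
    """M_ij = m * M_ij(1D structure) + (100 - m) * M_ij(sequence)."""
    if not 0 <= m <= 100:
        raise ValueError("m must lie between 0 and 100")
    return m * structure_score + (100 - m) * sequence_score


state = structure_state("H", 10)               # buried helix residue
sequence_only = combined_score(2.0, 1.0, 0)    # m = 0
structure_only = combined_score(2.0, 1.0, 100) # m = 100
mixed = combined_score(2.0, 1.0, 50)
```

Varying `m` between 0 and 100 moves the alignment smoothly from a pure sequence comparison to a pure 1D-structure comparison.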

  19. Performance of the algorithm

  20. Results on the 11 targets of CASP1 • Correctly detected the remote homologues at first rank in four cases; average percentage of correctly aligned residues: 21%; average shift: nine residues. • Best performing methods in CASP1: • Expert-driven usage of THREADER by David Jones and colleagues detected five out of nine proteins correctly at first rank. • The best alignments of the potential-based threading method by Manfred Sippl and colleagues were clearly better than the best ones of this algorithm.
