Introduction to Bioinformatics - Tutorial no. 8


Presentation Transcript


  1. Introduction to Bioinformatics - Tutorial no. 8
     • Predicting protein structure
     • PSI-BLAST

  2. PHDsec and PSIpred
     • PHDsec (Rost & Sander, 1993): based on sequence family alignments
     • PSIpred (Jones, 1999): based on PSI-BLAST profiles
     • Both consider long-range interactions

  3. PSIpred Input
     • Input sequence
     • Type of analysis

  4. PSIpred Input (2)
     • Filtering options
     • Email address
     • GO!

  5. PSIpred Output
     • Conf: confidence (0 = low, 9 = high)
     • Pred: predicted secondary structure (H = helix, E = strand, C = coil)
     • AA: target sequence

     Conf: 988766667637889999877999871289878877049963202468899999997887
     Pred: CCCCCCCCCCHHHHHHHHHHHHHHHHHCCCCCCHHHCCCCCHHHCHHHHHHHHHHHHHHH
       AA: MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDRE
                   10        20        30        40        50        60

     Conf: 742888731467888768899999999999999987557888998875227887303678
     Pred: HHCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCHHHH
       AA: LASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLLGRIIASYDPDNKEERSIA
                   70        80        90       100       110       120
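To read an output like the one above programmatically, the Pred line can be split into runs of identical states and each run paired with its average Conf value. The following is a minimal sketch, not part of PSIpred itself: the function name `segments` and the short example strings are made up for illustration, and it assumes the Conf and Pred lines have already been extracted as plain strings of equal length.

```python
from itertools import groupby


def segments(pred, conf):
    """Split a prediction string into runs of one state.

    Returns a list of (state, start, end, mean_confidence) tuples,
    with 1-based inclusive positions, as numbered on the slide.
    """
    if len(pred) != len(conf):
        raise ValueError("Pred and Conf lines must have the same length")
    result = []
    pos = 0
    for state, run in groupby(pred):
        length = len(list(run))
        scores = [int(digit) for digit in conf[pos:pos + length]]
        result.append((state, pos + 1, pos + length, sum(scores) / length))
        pos += length
    return result


if __name__ == "__main__":
    # Toy strings (hypothetical, not taken from the slide).
    toy_pred = "CCHHHC"
    toy_conf = "998760"
    for state, start, end, mean_conf in segments(toy_pred, toy_conf):
        print(f"{state} {start}-{end} mean confidence {mean_conf:.1f}")
```

Segments with a low mean confidence are the ones to treat with the most caution when comparing a prediction with a solved structure.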

  6. PHDsec Input (1)
     • Additional output
     • Output format
     • Reduce processing
     • Email address
     • Type of prediction

  7. PHDsec Input (2)
     • Type (number) of input sequences
     • Upload file
     • Enter sequence
     • Wait for results?

  8. PHDsec Output (1)
     • Protein classification
     • Structure proportions
     • Amino acid proportions

  9. PHDsec Output (2)
     • Estimated structure
     • Confidence level
     • Structure with high confidence

  10. PSI-BLAST
     • Position-Specific Iterative BLAST
     • Extension to BLASTP
     • Finds more distantly related sequences, including distant sequences whose E-values are insignificant in an ordinary search
     • Even in distantly related sequences, important domains can be highly conserved
     • PSI-BLAST gives more weight to those conserved positions

  11. PSI-BLAST Profile

     Position: 1 2 3 4 5 6
               A M T Y Q R
               C T T Y Q S
               S M T Y Q A

     • When closely related sequences are aligned, areas of conservation appear.
     • The scoring matrix becomes position specific.
     • Each column has its own set of amino acid frequencies.
     • The score is column specific, based on amino acid frequency.
     • A more frequent amino acid gets a higher score.
     • A new sequence is scored with the new scoring matrix.
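The three aligned sequences on this slide are enough to work through the idea by hand. The sketch below computes per-column amino acid frequencies and a simple log-odds score for each column. It is a simplification under stated assumptions, not PSI-BLAST's actual procedure: it assumes a uniform background frequency of 1/20, a single pseudocount spread over that background, and no sequence weighting. The function names are illustrative only.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(ALPHABET)  # assumption: uniform background frequency

ALIGNMENT = ["AMTYQR", "CTTYQS", "SMTYQA"]  # alignment from the slide


def column_frequencies(seqs):
    """Observed amino acid frequencies for each alignment column."""
    n = len(seqs)
    return [
        {aa: count / n for aa, count in Counter(s[i] for s in seqs).items()}
        for i in range(len(seqs[0]))
    ]


def build_pssm(seqs, pseudocount=1.0):
    """Log-odds score per column and amino acid (simplified, unweighted)."""
    n = len(seqs)
    pssm = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        column = {}
        for aa in ALPHABET:
            freq = (counts[aa] + pseudocount * BACKGROUND) / (n + pseudocount)
            column[aa] = math.log2(freq / BACKGROUND)
        pssm.append(column)
    return pssm


def score(pssm, seq):
    """Score an ungapped sequence of the same length against the matrix."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


if __name__ == "__main__":
    pssm = build_pssm(ALIGNMENT)
    print(column_frequencies(ALIGNMENT)[1])  # column 2: M in two of three rows
    print(score(pssm, "AMTYQR"), score(pssm, "GGGGGG"))
```

The fully conserved column 3 (T) scores higher than the partly conserved column 2 (M), which is the "more frequent, higher score" rule from the slide.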

  12. Position-Specific Scoring Matrix

  13. A PSI-BLAST Iteration
     • Collect all database sequence segments that have been aligned with the query sequence with an E-value below the set threshold (default 0.01).
     • Construct a position-specific scoring matrix for the collected sequences. Rough idea:
        • Align all sequences to the query sequence as the template.
        • Assign weights to the sequences.
        • Construct the position-specific scoring matrix.
     • Find sequences that match the profile.
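The loop described above can be illustrated with a toy model. This is a sketch under strong assumptions rather than the real algorithm: all sequences are pre-aligned peptides of equal length (no gapped alignment), every included sequence gets equal weight, and a raw score cutoff stands in for the E-value threshold. The database entries `SLTYQA` and `GGGGGG`, the cutoff of 10, and all function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(ALPHABET)  # assumption: uniform background


def build_profile(seqs, pseudocount=1.0):
    """Simplified position-specific log-odds matrix (no sequence weights)."""
    n = len(seqs)
    profile = []
    for i in range(len(seqs[0])):
        counts = Counter(s[i] for s in seqs)
        profile.append({
            aa: math.log2(
                (counts[aa] + pseudocount * BACKGROUND)
                / (n + pseudocount) / BACKGROUND
            )
            for aa in ALPHABET
        })
    return profile


def score(profile, seq):
    return sum(column[aa] for column, aa in zip(profile, seq))


def iterate(query, database, cutoff, max_rounds=5):
    """Rebuild the profile from the current hits until the hit set is stable."""
    included = [query]
    for _ in range(max_rounds):
        profile = build_profile(included)
        hits = [s for s in database if s != query and score(profile, s) >= cutoff]
        updated = [query] + hits
        if set(updated) == set(included):
            break  # converged: no change in the included set
        included = updated
    return included


if __name__ == "__main__":
    toy_db = ["SMTYQA", "SLTYQA", "CTTYQS", "GGGGGG"]  # hypothetical database
    print(iterate("AMTYQR", toy_db, cutoff=10.0, max_rounds=1))
    print(iterate("AMTYQR", toy_db, cutoff=10.0))
```

In this toy run, `SLTYQA` shares only three residues with the query and misses the cutoff in the first round, but it passes once `SMTYQA` has been folded into the profile. That mirrors how iteration lets the search reach more distant sequences.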

  14. Using PSI-BLAST (1)
     • Available from the main BLAST page, or switch on in BLASTP
     • E-value threshold for initial inclusion in the multiple alignment for the profile

  15. Using PSI-BLAST (2)
     • Align selected sequences, generate profile, search again
     • Number of results to show next iteration
     • Select whether to include in next iteration
     • New result

  16. Exercise 1
     There is a protein with an unknown structure:

     >some protein
     MEAFLGTWKMEKSEGFDKIMERLGVDFVTRKMGNLVKPNLIVTDLGGGKYKMRSESTFKTTECSFKLGEKFKEVTRFTRGHFFMITVENGVMKHEQDDKTKVTYIERVVEGNELKATVKVDEVVCVRTYSKVA

     1. Can BLAST help us to predict its SS?
     2. Use any secondary structure prediction method to predict the secondary structure of 1O8V and compare it to the solved structure.
        NOTICE! The secondary structure definition in PDB is given in a 7-letter code instead of the 3-letter code (H, E, C). For comparison purposes, consider G, H and I as H; E as E; and all the rest, including spaces, as C.
     3. What can you conclude about the secondary structure prediction in this case?
     4. Are the results consistent with the confidence value of the prediction?
     5. Can you explain the prediction results based on the real structure?
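The reduction rule in the NOTICE (G, H and I become H; E stays E; everything else, including spaces, becomes C) is easy to apply in code before comparing a prediction with the solved structure. The sketch below also computes the fraction of positions that agree. The function names and example strings are made up for illustration, and it assumes the two strings are already the same length and aligned residue by residue.

```python
def reduce_to_three_states(pdb_ss):
    """Map the PDB secondary structure letters to H, E or C.

    Rule from the exercise: G, H, I -> H; E -> E; all others
    (including spaces) -> C.
    """
    reduced = []
    for letter in pdb_ss:
        if letter in "GHI":
            reduced.append("H")
        elif letter == "E":
            reduced.append("E")
        else:
            reduced.append("C")
    return "".join(reduced)


def fraction_agreeing(predicted, observed):
    """Fraction of positions where the predicted and observed states match."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not predicted:
        raise ValueError("strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)


if __name__ == "__main__":
    # Hypothetical strings, not taken from 1O8V.
    observed = reduce_to_three_states("GGHHI EEBT S")
    predicted = "HHHHHCCECCCC"
    print(observed)
    print(fraction_agreeing(predicted, observed))
```

Running the comparison separately on high-confidence and low-confidence positions is one way to approach question 4.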

  17. Exercise 1

  18. Exercise 2
     • The prion is the protein responsible for Mad Cow Disease. In the normal state, the amino acids in a specific region are arranged in an α-helix (H1). In the abnormal state, this region changes to a β-strand conformation.
     • This conformational change is thought to be the origin of the disease, which leads to rapid degeneration of the nervous system and usually causes death.
     • It is assumed that prion molecules that have changed conformation accelerate the conformational change of additional molecules.
     • Check what conformation is predicted for this protein.
     • The PDB code of the prion protein is 1ag2. The helix is located at positions 21-30 of the sequence in this file. Does the predicted SS correlate with the real one in the region of interest?
