
Tutorial 4 Comparing Protein Sequences


dessa

Presentation Transcript


  1. Tutorial 4: Comparing Protein Sequences • Intro to Bioinformatics

  2. Amino acids were not born equal

  3. Comparing Protein Sequences • Substitution Matrices • PAM - Point Accepted Mutations • BLOSUM - Blocks Substitution Matrix • Advanced comparison tools • PSI-BLAST • PHI-BLAST

  4. Substitution Matrix • Scoring matrix S • 20x20 for protein alignment (one row and one column per amino acid) • S_{i,j} represents the gain/penalty for substituting AA_j by AA_i (i: row, j: column) • Based on the likelihood that this substitution is found in nature • Computed differently in PAM and BLOSUM
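To make the role of the scoring matrix concrete, here is a minimal sketch of scoring an ungapped alignment by summing S_{i,j} over aligned positions. The table holds only a handful of entries taken from the standard BLOSUM62 table; the names `BLOSUM62_SUBSET`, `score` and `score_ungapped` are hypothetical, and a real tool would load the full 20x20 matrix.

```python
# A few BLOSUM62 entries, enough for the demo below.
# The full matrix is 20x20 and symmetric; check values against a full table before reuse.
BLOSUM62_SUBSET = {
    ("L", "L"): 4, ("I", "I"): 4, ("L", "I"): 2,
    ("D", "D"): 6, ("E", "E"): 5, ("D", "E"): 2,
    ("K", "K"): 5, ("R", "R"): 5, ("K", "R"): 2,
    ("L", "D"): -4,
}


def score(a, b, table=BLOSUM62_SUBSET):
    """Return S[a][b]; the matrix is symmetric, so try both key orders."""
    if (a, b) in table:
        return table[(a, b)]
    return table[(b, a)]


def score_ungapped(seq1, seq2, table=BLOSUM62_SUBSET):
    """Sum the substitution scores of two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(score(a, b, table) for a, b in zip(seq1, seq2))


# Conservative substitutions (L/I, D/E, K/R) still score positively,
# while identical residues score highest.
conservative = score_ungapped("LDK", "IER")
identical = score_ungapped("LDK", "LDK")
```

Chemically similar pairs get a positive score even though the residues differ, which is the behaviour the slide describes as a "gain" for substitutions that are common in nature.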

  5. Computing the probability of mutation (M_{i,j}) • PAM - Point Accepted Mutations • Based on closely related proteins (X% divergence) • Matrices for comparing divergent proteins are computed by extrapolation • BLOSUM - Blocks Substitution Matrix • Based on conserved blocks bounded in similarity (at least X% identical) • Matrices for divergent proteins are derived using an appropriate X%

  6. PAM-1 • Captures mutation rates between closely related proteins • 1% divergence • M_{i,j} = #(A→B) / #A (number of observed A→B substitutions divided by the number of occurrences of A) • Problematic when comparing distantly related proteins • The 1% divergence does not capture rarer mutations • PAM250 is theoretical (extrapolation based)
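The two ideas on this slide, estimating M from counts and extrapolating to larger PAM distances, can be sketched as follows. This is a toy illustration only: the three-letter alphabet and the counts are invented, the helper names are hypothetical, and the real PAM procedure also normalises the matrix to exactly 1% expected change and converts it to log-odds scores.

```python
def mutation_matrix(pair_counts, alphabet):
    """M[a][b] = (# observed a->b) / (# occurrences of a); each row sums to 1."""
    M = {}
    for a in alphabet:
        total = sum(pair_counts[a][b] for b in alphabet)
        M[a] = {b: pair_counts[a][b] / total for b in alphabet}
    return M


def mat_mul(X, Y, alphabet):
    """Plain matrix product on nested dicts."""
    return {a: {b: sum(X[a][k] * Y[k][b] for k in alphabet) for b in alphabet}
            for a in alphabet}


def extrapolate(M, n, alphabet):
    """Return M multiplied by itself n times (the 'PAM-n' style extrapolation)."""
    result = {a: {b: 1.0 if a == b else 0.0 for b in alphabet} for a in alphabet}
    for _ in range(n):
        result = mat_mul(result, M, alphabet)
    return result


# Invented counts for a toy alphabet; diagonal entries count unchanged residues.
ALPHABET = "ABC"
COUNTS = {
    "A": {"A": 990, "B": 8, "C": 2},
    "B": {"A": 8, "B": 985, "C": 7},
    "C": {"A": 2, "B": 7, "C": 991},
}

M1 = mutation_matrix(COUNTS, ALPHABET)
M250 = extrapolate(M1, 250, ALPHABET)
```

Because M250 is obtained purely by repeated multiplication, any substitution that was never observed between close relatives stays under-represented, which is the weakness the slide points out.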

  7. PAM-1

  8. BLOSUM62 • Captures mutation rates between divergent proteins • Why is BLOSUM62 called BLOSUM62? Within each block, sequences that share at least 62% identity with ANY other member of their cluster are grouped together and counted as a single sequence, so that closely related sequences do not dominate the substitution counts.

  9. BLOSUM62 • The idea of BLOSUM matrices is to get a better measure of the differences between two proteins, specifically for more distantly related proteins • Similar amino acids get a high score
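The 62% rule can be illustrated with a small clustering sketch. It assumes ungapped block rows of equal length and uses single-linkage grouping; the function names and the toy block are hypothetical, and this is not the original BLOSUM code.

```python
def percent_identity(s1, s2):
    """Percentage of identical positions between two equal-length sequences."""
    matches = sum(a == b for a, b in zip(s1, s2))
    return 100.0 * matches / len(s1)


def cluster_block(seqs, threshold=62.0):
    """Single-linkage clustering: a sequence joins (and merges) every cluster
    that contains at least one member it matches at >= threshold identity."""
    clusters = []
    for s in seqs:
        merged, keep = [s], []
        for c in clusters:
            if any(percent_identity(s, m) >= threshold for m in c):
                merged.extend(c)
            else:
                keep.append(c)
        clusters = keep + [merged]
    return clusters


def sequence_weights(clusters):
    """Each cluster counts as one sequence, so members share a total weight of 1."""
    return {s: 1.0 / len(c) for c in clusters for s in c}


# Toy block: the first two rows are 90% identical, the third is 60% identical
# to both of them, and the fourth is unrelated.
BLOCK = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEFGWWWW", "MNPQRSTVWY"]
clusters = cluster_block(BLOCK)
weights = sequence_weights(clusters)
```

Lowering the threshold (for example to 45) merges more sequences into each cluster, so the remaining substitution counts come from more distant pairs.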

  10. PAM & BLOSUM

  11. Use Recommendations • PAM100 ~ BLOSUM90 (closely related) • PAM120 ~ BLOSUM80 • PAM160 ~ BLOSUM60 • PAM200 ~ BLOSUM52 • PAM250 ~ BLOSUM45 (highly divergent)

  12. Example • Query: >ADRM1_HUMAN (Proteasomal ubiquitin receptor) • Database: nr (human) • BLAST program: BLASTP • Matrices: PAM30, BLOSUM45

  13. What difference do we observe? • With BLOSUM45 we found both closely related and divergent sequences. • With PAM30 we found only closely related sequences. (Panels: BLOSUM45, PAM30)

  14. With BLOSUM45 we can discover interesting relations between proteins • Mucin-13: a glycosylated membrane protein that protects the cell by binding to pathogens (Panels: PAM30, BLOSUM45)

  15. Using different scoring matrices can produce slightly different alignments (Panels: with PAM30, with BLOSUM45)

  16. The same pair of sequences can be aligned in many equally good ways, especially when using a matrix for highly divergent sequences (BLOSUM45):
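That a single pair of sequences can have several equally good alignments can be checked with a small dynamic-programming sketch that counts the optimal global alignments. It assumes a deliberately simple scoring scheme (match +1, mismatch -1, linear gap -1) rather than BLOSUM45, and the function name is hypothetical.

```python
def count_optimal_alignments(s, t, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style DP that returns (best score, number of optimal
    global alignments) under a simple match/mismatch/linear-gap scheme."""
    n, m = len(s), len(t)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    ways = [[0] * (m + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for i in range(1, n + 1):          # first column: s aligned to gaps only
        score[i][0] = i * gap
        ways[i][0] = 1
    for j in range(1, m + 1):          # first row: t aligned to gaps only
        score[0][j] = j * gap
        ways[0][j] = 1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + sub, ways[i - 1][j - 1]),  # align s[i-1] with t[j-1]
                (score[i - 1][j] + gap, ways[i - 1][j]),          # gap in t
                (score[i][j - 1] + gap, ways[i][j - 1]),          # gap in s
            ]
            best = max(sc for sc, _ in candidates)
            score[i][j] = best
            # Add up the path counts of every move that reaches the best score.
            ways[i][j] = sum(w for sc, w in candidates if sc == best)
    return score[n][m], ways[n][m]


tied = count_optimal_alignments("LL", "L")
unique = count_optimal_alignments("LIVK", "LIVK")
```

In the first call the single L can pair with either L of the longer sequence at the same score, so two alignments tie; identical sequences have exactly one optimum.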

  17. PSI-BLAST • Position-Specific Iterated BLAST • We will analyze the following uncharacterized archaeal protein:
     >gi|2501594|sp|Q57997|Y577_METJA PROTEIN MJ0577
     MSVMYKKILYPTDFSETAEIALKHVKAFKTLKAEEVILLHVIDEREIKKRDIFSLLLGVAGLNKSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGVDIIIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVKRKNS

  18. • E-value threshold for the initial BLAST search (default: 10) • E-value threshold for inclusion in PSI-BLAST iterations (default: 0.005)

  19. • The query itself • Orthologous sequences in two other archaeal species • Other homologous sequences
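The inclusion threshold decides which hits feed the position-specific scoring matrix (PSSM) used in the next iteration. The sketch below shows that idea only: the hits and their E-values are invented, the helper names are hypothetical, and it assumes ungapped hits of equal length, a uniform background and a flat pseudocount, whereas PSI-BLAST itself uses sequence weighting and more refined pseudocounts.

```python
import math
from collections import Counter


def select_hits(hits, inclusion_threshold=0.005):
    """Keep the aligned sequences whose E-value passes the inclusion threshold."""
    return [seq for seq, evalue in hits if evalue <= inclusion_threshold]


def build_pssm(aligned, alphabet, pseudocount=1.0):
    """One dict of log2-odds scores per alignment column.
    Assumes a uniform background frequency (a simplification)."""
    background = 1.0 / len(alphabet)
    pssm = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        pssm.append({
            a: math.log2(((counts[a] + pseudocount) / total) / background)
            for a in alphabet
        })
    return pssm


# Invented (aligned sequence, E-value) pairs; the last hit would be reported by
# the initial search (E-value below 10) but is excluded from the profile.
HITS = [("LDKL", 1e-30), ("LDRL", 1e-8), ("IDKL", 0.003), ("WWWW", 2.5)]
selected = select_hits(HITS)
pssm = build_pssm(selected, alphabet="DIKLRW")
```

Residues seen often in a column score above zero and unseen ones below, so the profile rewards conservation position by position instead of using one fixed 20x20 matrix.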

  20. Is MJ0577 a filament protein? Is MJ0577 a cationic amino acid transporter? Is MJ0577 a universal stress protein?
