
BLAST program selection guide


bayard

Presentation Transcript


  1. BLAST program selection guide • http://www.ncbi.nlm.nih.gov/blast/producttable.shtml#tab31

  2. Homology: Orthology, Paralogy, Xenology

  3. Fitch WM. Homology: a personal view on some of the problems. Trends Genet. 2000 May;16(5):227-31.

  4. Analogy vs Homology • Analogy: The relationship of any two characters that have descended convergently from unrelated ancestors. • Homology: The relationship of any two characters that have descended, usually with divergence, from a common ancestral character.

  5. Orthology, Paralogy, Xenology • Orthology: The relationship of any two homologous characters whose common ancestor lies in the cenancestor of the taxa from which the two sequences were obtained. • Paralogy: The relationship of any two homologous characters arising from a duplication of the gene for that character. • Xenology: The relationship of any two homologous characters whose history, since their common ancestor, involves an interspecies (horizontal) transfer of the genetic material for at least one of those characters.

  6. A classic example (Figure from NCBI)

  7. Test Yourself • A1 – B1 • A1 – B2 • A1 – C3 • B1 – C2 • C2 – C3 • B2 – C3 • C3 – AB1

  8. Test Yourself (answers) • A1 – B1 = Orthologs • A1 – B2 = Orthologs • A1 – C3 = Orthologs • B1 – C2 = Paralogs (out-paralogs) • C2 – C3 = Paralogs (in-paralogs) • B2 – C3 = Orthologs • C3 – AB1 = Xenologs

  9. Homology on a Genome-Scale • How many and which genes are common to two or more organisms? • Which genes differentiate one organism from another? • How is homology related to function?

  10. Orthologs are the set of genes/proteins with gene trees identical to the species tree. • We can understand other types of homology relationships by comparison to the species tree. • But often we don’t know the species tree, and phylogenetic methods are complex

  11. Consider two genomes • Use BLASTP to compare one set of proteins (proteome) to the other • Which set will you use as the query and which as the database? • What criteria will you use to define “a match”? [Figure: genes 1–3 of GenomeA matched against genes 1–3 of GenomeB. A1, A3, B2 and B3 are homologs (assuming the aligned regions overlap).]
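
One way to make the “match” question concrete is to filter BLASTP hits on E-value and on how much of the query the alignment covers. The sketch below is only an illustration: the `Hit` record, the `is_match` helper, the thresholds (E-value at most 1e-5, coverage at least 50%) and the example hits are all assumptions made for this example, not values given in the slides. Coverage is approximated as alignment length divided by query length.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLASTP hit (hypothetical record; fields chosen for this sketch)."""
    query: str
    subject: str
    evalue: float
    bitscore: float
    aln_len: int    # alignment length in residues
    query_len: int  # full length of the query protein


def is_match(hit, max_evalue=1e-5, min_query_cov=0.5):
    """Return True if the hit passes both illustrative thresholds."""
    coverage = hit.aln_len / hit.query_len  # rough estimate of query coverage
    return hit.evalue <= max_evalue and coverage >= min_query_cov


# Made-up hits, for illustration only.
hits = [
    Hit("A1", "B1", 1e-50, 200.0, 300, 320),  # strong hit, long alignment
    Hit("A1", "B3", 1e-3, 40.0, 100, 320),    # E-value too weak
    Hit("A3", "B2", 1e-20, 110.0, 90, 400),   # alignment covers too little
]

matches = [(h.query, h.subject) for h in hits if is_match(h)]
print(matches)
```

Stricter thresholds reduce spurious matches from short shared domains but may discard true homologs that have diverged, so the cutoffs are a judgment call that depends on the genomes being compared.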

  12. Reciprocal Best Hits • Use BLASTP to compare sets of proteins (proteome) to each other • First using GenomeA to query against GenomeB • Then using GenomeB to query against GenomeA • Save only one best match for each query • Save only the reciprocal best matches as “orthologs” [Figure: three panels of matches between genes 1–3 of GenomeA and genes 1–3 of GenomeB; annotation: Lose A3-B2 and A1-B3 homology]
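
The reciprocal best hit procedure can be sketched in a few lines. This is a minimal illustration, assuming the BLASTP results have already been reduced to (query, subject, bit score) tuples; the function names `best_hits` and `reciprocal_best_hits` and the scores are invented for the example. The gene names follow the A1/A3/B2/B3 example from the slides.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples.
    On a tie, the hit seen first is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores (made up): A1, A3, B2 and B3 are all homologous to each other.
a_vs_b = [("A1", "B2", 300), ("A1", "B3", 120),
          ("A3", "B3", 200), ("A3", "B2", 110)]
b_vs_a = [("B2", "A1", 300), ("B2", "A3", 110),
          ("B3", "A3", 200), ("B3", "A1", 120)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)  # the weaker A3-B2 and A1-B3 relationships are not reported
```

Because only one best match is kept per query, the weaker homology relationships are discarded even though they are real, which is the loss noted on the slide.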

  13. One case where RBH works [Figure: three panels of matches between genes 1–3 of GenomeA and genes 1–3 of GenomeB, followed by a panel with the labels, in extraction order: Glucose transport, GenomeA – gene 1, Glucose transport, GenomeB – gene 2, GenomeA – gene 3, Fructose transport, Galactose transport, GenomeB – gene 3]

  14. One case where RBH fails • In-paralogs: duplication since speciation [Figure: three panels of matches between genes 1–3 of GenomeA and genes 1–3 of GenomeB, followed by a panel with the labels, in extraction order: Glucose transport, GenomeA – gene 1, Glucose transport, GenomeA – gene 3, GenomeB – gene 2, Fructose transport, Galactose transport, GenomeB – gene 3]

  15. Software/Methods for Predicting Orthologs from Genome Sequences • RBH • RSD (Reciprocal Shortest Distance) • INPARANOID • RIO • Orthostrapper • Ortholuge • TribeMCL • OrthoMCL

  16. Li L, Stoeckert CJ Jr, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003 Sep;13(9):2178-89.

  17. Pre-computed OrthoMCL results http://www.orthomcl.org/

  18. Evaluating performance • No “gold standard” set of true orthologs • Latent Class Analysis • Agreement between methods provides confidence • 27,562 proteins from 6 eukaryotes assigned to Pfams

  19. Performance Metrics
      Confusion matrix (rows: actual, columns: predicted)
                         Predicted negative   Predicted positive
      Actual negative    TN                   FP
      Actual positive    FN                   TP
      • Accuracy – Proportion of all predictions that are correct • (TN+TP)/total
      • Precision – Proportion of predicted positives that are correct • TP/(FP+TP)
      • Sensitivity (TPR, Recall) – Proportion of actual positives correctly predicted • TP/(FN+TP)
      • Specificity – Proportion of actual negatives correctly predicted • TN/(TN+FP)
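
The four metrics can be computed directly from the confusion-matrix counts. The sketch below is illustrative: the function name `performance_metrics` and the counts are made up, and the quantity TP/(FP+TP), the proportion of predicted positives that are correct, is reported under its usual name, precision. No guard against empty classes (division by zero) is included.

```python
def performance_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # proportion of all calls that are correct
        "precision": tp / (tp + fp),     # predicted positives that are correct
        "sensitivity": tp / (tp + fn),   # actual positives that are recovered
        "specificity": tn / (tn + fp),   # actual negatives that are recovered
    }


# Made-up counts, for illustration only.
m = performance_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```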

  20. Chen F, Mackey AJ, Vermunt JK, Roos DS. Assessing performance of orthology detection strategies applied to eukaryotic genomes. PLoS ONE. 2007 Apr 18;2(4):e383.

  21. Method Comparison

  22. Is context useful for assigning homology type? • Prokaryotes vs eukaryotes • Evolutionary origin • Paralogs that arise as tandem repeats of single genes • Paralogs that arise from duplication of larger regions • Xenologs that arise from acquisition of a similar gene from another lineage

  23. Example: pectate lyases of soft-rot enterobacteria may be SymBets (symmetrical best hits), but genome context suggests they may not be orthologs
