Lesson 9: predicting function from sequence

quade

Presentation Transcript


  1. Lesson 9: predicting function from sequence

  2. Evaluation of prediction methods

  3. Evaluation of prediction methods • Comparing our results to experimentally verified sites [Table: "Our prediction gives" vs. "Is the prediction correct?"]

  4. Method evaluation • A good method yields many true positives and true negatives, and few false positives and false negatives [Table: "Our prediction gives" vs. "Is the prediction correct?"]
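
The four outcomes named on this slide can be counted directly once each prediction is paired with its experimental result. The following is a minimal sketch; the function name `confusion_counts` and the toy data are illustrative assumptions, not part of the lesson.

```python
def confusion_counts(predicted, verified):
    """Count TP, FP, TN, FN from two parallel lists of booleans.

    predicted[i] is True if the method predicts site i as positive;
    verified[i] is True if site i is experimentally verified as positive.
    """
    if len(predicted) != len(verified):
        raise ValueError("predicted and verified must have the same length")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for pred, truth in zip(predicted, verified):
        if pred and truth:
            counts["TP"] += 1      # predicted positive, really positive
        elif pred and not truth:
            counts["FP"] += 1      # predicted positive, really negative
        elif not pred and truth:
            counts["FN"] += 1      # predicted negative, really positive
        else:
            counts["TN"] += 1      # predicted negative, really negative
    return counts


# Hypothetical toy data: six sites, predicted vs. experimentally verified.
predicted = [True, True, False, False, True, False]
verified = [True, False, False, True, True, False]
print(confusion_counts(predicted, verified))
# {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
```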

  5. Calibrating the method • All methods have a parameter (cutoff) that can be calibrated to improve the accuracy of the method. • For example: the E-value cutoff in BLAST

  6. Calibrating E-value cutoff [Table: "Our prediction gives" vs. "Is the prediction correct?" (is this a homolog?)]

  7. Calibrating E-value cutoff • Reminder: the lower the E-value, the more ‘significant’ the alignment between the query and the hit.

  8. Calibrating the E-value • What happens if we raise the E-value cutoff (for instance, work with all hits with an E-value < 10)? [Table: "Our prediction gives" vs. "Is the prediction correct?"]

  9. Calibrating the E-value • On the other hand, what if we lower the E-value cutoff (look only at hits with an E-value < $10^{-8}$)? [Table: "Our prediction gives" vs. "Is the prediction correct?"]

  10. Improving prediction • Trade-off between specificity and sensitivity

  11. Sensitivity vs. specificity • Sensitivity = $\frac{\text{True positive}}{\text{True positive} + \text{False negative}}$: how well we hit real homologs. The denominator represents all the proteins that really are homologous. • Specificity = $\frac{\text{True negative}}{\text{True negative} + \text{False positive}}$: how well we avoid real non-homologs. The denominator represents all the proteins that really are NOT homologous.
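
As a small worked example of the two ratios on this slide, the sketch below computes them from outcome counts. The function names and the counts are hypothetical, chosen only for illustration.

```python
def sensitivity(tp, fn):
    """True positives divided by all real positives (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no real positives: sensitivity is undefined")
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by all real negatives (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no real negatives: specificity is undefined")
    return tn / (tn + fp)


# Hypothetical counts: 10 real homologs (8 found), 100 real non-homologs (90 rejected).
print(sensitivity(tp=8, fn=2))    # 0.8
print(specificity(tn=90, fp=10))  # 0.9
```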

  12. Raising the E-value cutoff to 10: sensitivity ↑, specificity ↓ • Lowering the E-value cutoff to $10^{-8}$: sensitivity ↓, specificity ↑
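
The trade-off can be seen by evaluating the same set of hits at both cutoffs. This is a sketch under stated assumptions: the hit list is invented toy data (E-value, whether the hit is really a homolog), and `evaluate_cutoff` is a hypothetical helper, not a BLAST function.

```python
def evaluate_cutoff(hits, cutoff):
    """Return (sensitivity, specificity) when hits with E-value < cutoff
    are predicted to be homologs.

    hits is a list of (e_value, is_real_homolog) pairs. The sketch assumes
    the list contains at least one real homolog and one real non-homolog.
    """
    tp = sum(1 for e, hom in hits if e < cutoff and hom)
    fp = sum(1 for e, hom in hits if e < cutoff and not hom)
    fn = sum(1 for e, hom in hits if e >= cutoff and hom)
    tn = sum(1 for e, hom in hits if e >= cutoff and not hom)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: four real homologs and four real non-homologs.
hits = [
    (1e-30, True), (1e-12, True), (1e-5, True), (0.5, True),
    (1e-3, False), (2.0, False), (8.0, False), (50.0, False),
]

for cutoff in (1e-8, 10):
    sens, spec = evaluate_cutoff(hits, cutoff)
    print(f"cutoff {cutoff:g}: sensitivity {sens}, specificity {spec}")
# cutoff 1e-08: sensitivity 0.5, specificity 1.0
# cutoff 10: sensitivity 1.0, specificity 0.25
```

With this toy data the strict cutoff misses two real homologs but admits no false hits, while the permissive cutoff finds every homolog at the cost of three false positives.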

  13. Functional prediction in proteins (purifying and positive selection)

  14. Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population

  15. Adaptive evolution • Natural selection favors a single allele, so the allele frequency shifts continuously in one direction

  16. Kimura – the theory of neutral evolution • Neutral evolution: most molecular changes do not change the phenotype • Selection operates to preserve a trait (no change)

  17. Purifying selection • Stabilizes a trait in a population: small babies → more illness; large babies → more difficult birth… • Baby weight is stabilized around 3-4 kg

  18. Purifying selection (conservation) – the molecular level • Histone 3

  19. Synonymous vs. non-synonymous substitutions Purifying selection: excess of synonymous substitutions

  20. Synonymous vs. non-synonymous substitutions • Non-synonymous substitution: GUU → GCU (Val → Ala) • Synonymous substitution: GUU → GUC (Val → Val) • Purifying selection: excess of synonymous substitutions
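
The distinction on this slide can be checked mechanically by translating both codons with the standard genetic code. A minimal sketch follows; `classify_substitution` is an illustrative name, and the table is built from the standard code in TCAG order.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return (kind, amino_acid_before, amino_acid_after).

    kind is "synonymous" if both codons encode the same amino acid and
    "non-synonymous" otherwise. RNA codons (with U) are accepted.
    """
    before = codon_before.upper().replace("U", "T")
    after = codon_after.upper().replace("U", "T")
    if before == after:
        raise ValueError("the two codons are identical: no substitution")
    aa_before, aa_after = CODON_TABLE[before], CODON_TABLE[after]
    kind = "synonymous" if aa_before == aa_after else "non-synonymous"
    return kind, aa_before, aa_after


print(classify_substitution("GUU", "GCU"))  # ('non-synonymous', 'V', 'A')
print(classify_substitution("GUU", "GUC"))  # ('synonymous', 'V', 'V')
```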

  21. Conservation as a means of predicting function • Infer the rate of evolution at each site • Low rate of evolution → constraints on the site to prevent disruption of function: active sites, protein-protein interactions, etc.
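
To make the idea of a per-site score concrete, the sketch below scores each column of a multiple sequence alignment by its Shannon entropy (0 means fully conserved). This is a deliberately simplified stand-in: it ignores the phylogenetic tree and is not an estimate of evolutionary rate. The alignment and function names are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(alignment):
    """Return one entropy value per column of an equal-length alignment."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


# Invented toy alignment of four sequences, five columns.
alignment = ["MKVLA", "MKILA", "MRVLG", "MKALA"]
for position, score in enumerate(conservation_profile(alignment), start=1):
    print(position, round(score, 3))
```

In this toy alignment, columns 1 and 4 are identical in every sequence and score 0, while column 3 has three different residues and scores highest.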

  22. Conservation as a means of predicting function

  23. Which site is more conserved?

  24. Use Phylogenetic information

  25. Prediction of conserved residues by estimating evolutionary rates at each site ConSurf/ConSeq web servers:

  26. Working process • Input a protein with a known 3D structure (PDB ID or file provided by the user) • Find homologous protein sequences (PSI-BLAST) • Perform multiple sequence alignment (removing duplicates) • Construct an evolutionary tree • Calculate the conservation score for each site • Project the results on the 3D structure

  27. The KcsA potassium channel • An outstanding mystery: how does the KcsA potassium channel conduct only K+ ions and not Na+?

  28. The KcsA potassium channel structure • The structure of the KcsA channel was solved in 1998 • KcsA is a homotetramer with a four-fold symmetry axis about its pore.

  29. The KcsA potassium selectivity filter • The selectivity filter identifies water molecules bound to K+ • When water is bound to Na+: no passage

  30. Conservation analysis of KcsA • Use ConSurf to study KcsA conservation

  31. ConSurf results

  32. ConSeq • ConSeq performs the same analysis as ConSurf but displays the results on the sequence. • Predicts whether each residue is buried or exposed • Exposed & conserved → functionally important site • Buried & conserved → structurally important site

  33. ConSeq analysis • Exposed & conserved → functionally important site • Buried & conserved → structurally important site
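
The two rules on this slide amount to a small decision table. The sketch below encodes only that table; how "conserved" and "buried" are decided is left to the caller, and the residue labels are hypothetical examples rather than ConSeq output.

```python
def classify_site(conserved, buried):
    """Apply the slide's rules to one residue.

    conserved and buried are booleans supplied by the caller; the cutoffs
    used to derive them are outside the scope of this sketch.
    """
    if not conserved:
        return "no prediction"
    return "structurally important" if buried else "functionally important"


# Hypothetical residues: (label, conserved?, buried?)
residues = [
    ("residue A", True, False),
    ("residue B", True, True),
    ("residue C", False, False),
]
for label, conserved, buried in residues:
    print(label, "->", classify_site(conserved, buried))
# residue A -> functionally important
# residue B -> structurally important
# residue C -> no prediction
```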

  34. Positive selection & drug resistance

  35. Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population

  36. Adaptive evolution on the molecular level ?

  37. Adaptive evolution on the molecular level Look for changes which confer an advantage

  38. Naïve detection • Observe the multiple sequence alignment: variable regions = adaptive evolution?

  39. Naïve detection • The problem: how do we know which variable sites are simply under no selection pressure (“non-important” sites) and which are under adaptive evolution?

  40. Solution – look at the DNA [Figure: synonymous vs. non-synonymous substitutions]

  41. Solution – look at the DNA • Adaptive evolution = positive selection: Non-syn > Syn • Purifying selection: Syn > Non-syn • Neutral evolution: Syn = Non-syn

  42. Also known as… Ka/Ks (or dN/dS, or ω) • Ka: non-synonymous substitution rate • Ks: synonymous substitution rate • Purifying selection: Ka < Ks (Ka/Ks < 1) • Neutral evolution: Ka = Ks (Ka/Ks = 1) • Positive selection: Ka > Ks (Ka/Ks > 1)
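
The three cases on this slide can be written as a simple comparison of the ratio against 1. This is only a sketch of the thresholds: the rates are made-up numbers, the function name is hypothetical, and real analyses estimate Ka and Ks from sequence data and test whether the ratio differs significantly from 1.

```python
import math


def classify_selection(ka, ks):
    """Return (label, omega) where omega = Ka / Ks.

    Labels follow the slide: omega < 1 purifying, omega = 1 neutral,
    omega > 1 positive. Equality is checked with a small floating-point
    tolerance, not with a statistical test.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    omega = ka / ks
    if math.isclose(omega, 1.0):
        return "neutral", omega
    return ("purifying" if omega < 1 else "positive"), omega


# Made-up rates for illustration only.
print(classify_selection(ka=0.02, ks=0.2)[0])  # purifying
print(classify_selection(ka=0.1, ks=0.1)[0])   # neutral
print(classify_selection(ka=0.3, ks=0.1)[0])   # positive
```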

  43. Examples of positive selection • Proteins involved in the immune system • Proteins involved in host-pathogen interaction (‘arms race’) • Proteins following gene duplication • Proteins involved in reproductive systems

  44. Selecton – a server for the detection of purifying and positive selection http://selecton.bioinfo.tau.ac.il

  45. Detecting drug resistance using Selecton

  46. HIV: molecular evolution paradigm • Rapidly evolving virus: • High mutation rate (low fidelity of reverse transcriptase) • High replication rate

  47. HIV protease • Protease is an essential enzyme for viral replication • Drugs against protease are always part of the “cocktail”

  48. Ritonavir inhibitor • Ritonavir (RTV) is a specific protease inhibitor (drug) • Chemical formula: C37H48N6O5S2

  49. Drug resistance • Adaptive evolution (positive selection) [Figure: no drug vs. drug]

  50. Selecton was used to analyse HIV-1 protease gene sequences from patients who were treated with RTV only
