Lesson 9: Predicting function from sequence
Evaluation of prediction methods • Comparing our results to experimentally verified sites (Table: what our prediction gives vs. whether the prediction is correct)
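Method evaluation • A good method is one with high levels of true positives and true negatives, and low levels of false positives and false negatives • Prediction positive and correct: true positive • Prediction positive and incorrect: false positive • Prediction negative and correct: true negative • Prediction negative and incorrect: false negative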
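The four outcomes can be counted directly once each prediction is paired with an experimentally verified label. The sketch below is only an illustration: the function name `confusion_counts` and the toy data are assumptions, not part of the lesson.

```python
from collections import Counter


def confusion_counts(predicted, verified):
    """Count TP, FP, TN, FN from two parallel lists of booleans.

    predicted[i] is what our method says for site i;
    verified[i] is the experimentally verified answer.
    """
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for pred, truth in zip(predicted, verified):
        if pred and truth:
            counts["TP"] += 1      # predicted positive, really positive
        elif pred and not truth:
            counts["FP"] += 1      # predicted positive, really negative
        elif not pred and truth:
            counts["FN"] += 1      # predicted negative, really positive
        else:
            counts["TN"] += 1      # predicted negative, really negative
    return counts


# Hypothetical toy data, for illustration only.
predicted = [True, True, False, False, True, False]
verified = [True, False, False, True, True, False]
counts = confusion_counts(predicted, verified)
print(dict(counts))
```

A good method, in the sense above, is one where the TP and TN counts dominate the FP and FN counts.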
Calibrating the method • Most prediction methods have a parameter (a cutoff) that can be calibrated to improve the accuracy of the method. • For example: the E-value cutoff in BLAST
Calibrating the E-value cutoff • For each hit we ask: is this a homolog? (Table: what our prediction gives vs. whether the prediction is correct)
Calibrating E-value cutoff • Reminder: the lower the E-value, the more ‘significant’ the alignment between the query and the hit.
Calibrating the E-value • What will happen if we raise the E-value cutoff (for instance, work with all hits with an E-value < 10)? (Table: what our prediction gives vs. whether the prediction is correct)
Calibrating the E-value • On the other hand: what if we lower the E-value cutoff (look only at hits with an E-value < 10^-8)? (Table: what our prediction gives vs. whether the prediction is correct)
Improving prediction • Trade-off between specificity and sensitivity
Sensitivity vs. specificity
• Sensitivity: how well we hit real homologs. The denominator represents all the proteins which are really homologous.
$$\text{Sensitivity} = \frac{\text{True positive}}{\text{True positive} + \text{False negative}}$$
• Specificity: how well we avoid real non-homologs. The denominator represents all the proteins which are really NOT homologous.
$$\text{Specificity} = \frac{\text{True negative}}{\text{True negative} + \text{False positive}}$$
• Raising the E-value cutoff to 10: sensitivity increases, specificity decreases • Lowering the E-value cutoff to 10^-8: sensitivity decreases, specificity increases
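The trade-off can be checked on a small example. The sketch below computes sensitivity and specificity at the two cutoffs discussed (10 and 10^-8). The hit list is made up for illustration, and the function name `sensitivity_specificity` is an assumption; a hit is predicted to be a homolog when its E-value is below the cutoff.

```python
def sensitivity_specificity(hits, cutoff):
    """Return (sensitivity, specificity) for a given E-value cutoff.

    hits: list of (e_value, is_true_homolog) pairs.
    A hit is *predicted* to be a homolog when e_value < cutoff.
    """
    tp = sum(1 for e, real in hits if e < cutoff and real)
    fn = sum(1 for e, real in hits if e >= cutoff and real)
    tn = sum(1 for e, real in hits if e >= cutoff and not real)
    fp = sum(1 for e, real in hits if e < cutoff and not real)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one real homolog and one real non-homolog")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical hits: (E-value, is it really a homolog?)
hits = [
    (1e-50, True), (1e-20, True), (1e-9, True), (1e-5, True), (0.5, True),
    (1e-10, False), (1e-3, False), (2.0, False), (8.0, False), (50.0, False),
]

loose = sensitivity_specificity(hits, cutoff=10)      # permissive cutoff
strict = sensitivity_specificity(hits, cutoff=1e-8)   # stringent cutoff
print("cutoff 10:   sensitivity=%.1f specificity=%.1f" % loose)
print("cutoff 1e-8: sensitivity=%.1f specificity=%.1f" % strict)
```

On this toy data the permissive cutoff recovers every real homolog but lets most non-homologs through, while the stringent cutoff does the reverse.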
Functional prediction in proteins (purifying and positive selection)
Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population
Adaptive evolution • When natural selection favors a single allele and therefore the allele frequency continuously shifts in one direction
Kimura – the theory of neutral evolution • Neutral evolution: most molecular changes do not change the phenotype • Selection operates to preserve a trait (no change)
Purifying selection • Stabilizes a trait in a population: small babies → more illness; large babies → more difficult birth… • Baby weight is stabilized around 3-4 kg
Purifying selection (conservation): the molecular level • Histone 3
Synonymous vs. non-synonymous substitutions • Purifying selection: excess of synonymous substitutions
Synonymous vs. non-synonymous substitutions • Non-synonymous substitution: GUU → GCU (Val → Ala) • Synonymous substitution: GUU → GUC (Val → Val) • Purifying selection: excess of synonymous substitutions
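Whether a codon change is synonymous can be decided by translating both codons with the standard genetic code. The sketch below builds the standard table from its usual compact 64-letter form (bases ordered U, C, A, G); the function name `classify_substitution` is an assumption made for this illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G
# (third position varies fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_before, codon_after):
    """Return 'synonymous' if both codons encode the same amino acid."""
    aa_before = CODON_TABLE[codon_before]
    aa_after = CODON_TABLE[codon_after]
    return "synonymous" if aa_before == aa_after else "non-synonymous"


print(classify_substitution("GUU", "GCU"))  # Val -> Ala
print(classify_substitution("GUU", "GUC"))  # Val -> Val
```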
Conservation as a means of predicting function • Infer the rate of evolution at each site • Low rate of evolution → constraints on the site to prevent disruption of function: active sites, protein-protein interactions, etc.
Prediction of conserved residues by estimating evolutionary rates at each site ConSurf/ConSeq web servers:
Working process • Input a protein with a known 3D structure (PDB ID or file provided by the user) • Find homologous protein sequences (PSI-BLAST) • Perform multiple sequence alignment (removing duplicates) • Construct an evolutionary tree • Calculate the conservation score for each site • Project the results on the 3D structure
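To give a feel for the "conservation score for each site" step, the sketch below scores each alignment column by Shannon entropy. This is a deliberately simplified stand-in: it ignores the evolutionary tree entirely, whereas the servers described here estimate an evolutionary rate per site. The function name `conservation_scores` and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter


def conservation_scores(alignment):
    """Shannon entropy (in bits) of each alignment column.

    alignment: list of equal-length aligned sequences ('-' marks a gap).
    0.0 means the column is fully conserved; higher means more variable.
    Columns containing only gaps get NaN.
    """
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(float("nan"))
            continue
        total = len(residues)
        counts = Counter(residues)
        # p * log2(1/p) summed over the residues seen in this column
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        scores.append(entropy)
    return scores


# Hypothetical toy alignment of four short sequences.
alignment = ["TVGYG", "TVGYG", "TIGYG", "SVGFG"]
scores = conservation_scores(alignment)
for position, score in enumerate(scores, start=1):
    print(position, round(score, 3))
```

In this toy alignment the glycine columns score 0.0 (fully conserved), which is the kind of site that would be flagged as constrained.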
The KcsA potassium channel • An outstanding mystery: how does the KcsA potassium channel conduct only K+ ions and not Na+?
The KcsA potassium channel structure • The structure of the KcsA channel was solved in 1998 • KcsA is a homotetramer with a four-fold symmetry axis about its pore.
The KcsA potassium selectivity filter • The selectivity filter identifies water molecules bound to K+ • When water is bound to Na+: no passage
Conservation analysis of KcsA • Use ConSurf to study KcsA conservation
ConSeq • ConSeq performs the same analysis as ConSurf but displays the results on the sequence. • Predicts whether each residue is buried or exposed • Exposed & conserved → functionally important site • Buried & conserved → structurally important site
ConSeq analysis • Exposed & conserved → functionally important site • Buried & conserved → structurally important site
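The two rules above amount to a small decision table. The sketch below encodes them; the function name `classify_site` is an assumption, and so is the "no prediction" label for non-conserved sites, since the slides only describe the conserved cases.

```python
def classify_site(conserved, buried):
    """Apply the exposed/buried rule to a single residue.

    conserved: True if the site has a low evolutionary rate.
    buried:    True if the residue is predicted to be buried.
    """
    if not conserved:
        # Assumption: the rules say nothing about variable sites.
        return "no prediction"
    if buried:
        return "structurally important site"
    return "functionally important site"


print(classify_site(conserved=True, buried=False))
print(classify_site(conserved=True, buried=True))
```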
Darwin – the theory of natural selection • Adaptive evolution: Favorable traits will become more frequent in the population
Adaptive evolution on the molecular level Look for changes which confer an advantage
Naïve detection • Observe a multiple sequence alignment: variable regions = adaptive evolution?
Naïve detection • The problem: how do we know which sites are simply under no selection pressure (“non-important” sites) and which are under adaptive evolution?
Solution – look at the DNA • Compare synonymous and non-synonymous substitutions
Solution – look at the DNA • Adaptive evolution = positive selection: Non-syn > Syn • Purifying selection: Syn > Non-syn • Neutral evolution: Syn = Non-syn
Also known as… Ka/Ks (or dN/dS, or ω) • Ka: non-synonymous substitution rate • Ks: synonymous substitution rate • Purifying selection: Ka < Ks (Ka/Ks < 1) • Neutral evolution: Ka = Ks (Ka/Ks = 1) • Positive selection: Ka > Ks (Ka/Ks > 1)
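The three cases can be written as a small classifier on the Ka/Ks ratio. This is a minimal sketch: the function name `classify_selection` and the `tolerance` band around 1 are assumptions for illustration, and real analyses rely on statistical tests rather than a fixed threshold.

```python
def classify_selection(ka, ks, tolerance=0.05):
    """Classify the selection regime from Ka and Ks.

    ka: non-synonymous rate, ks: synonymous rate.
    tolerance: hypothetical band around Ka/Ks = 1 treated as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    omega = ka / ks
    if abs(omega - 1.0) <= tolerance:
        return "neutral"
    if omega < 1.0:
        return "purifying selection"
    return "positive selection"


print(classify_selection(ka=0.02, ks=0.20))  # Ka/Ks = 0.1
print(classify_selection(ka=0.10, ks=0.10))  # Ka/Ks = 1.0
print(classify_selection(ka=0.30, ks=0.10))  # Ka/Ks = 3.0
```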
Examples of positive selection • Proteins involved in the immune system • Proteins involved in host-pathogen interactions (‘arms race’) • Proteins following gene duplication • Proteins involved in reproductive systems
Selecton – a server for the detection of purifying and positive selection http://selecton.bioinfo.tau.ac.il
HIV: molecular evolution paradigm • Rapidly evolving virus: • High mutation rate (low fidelity of reverse transcriptase) • High replication rate
HIV protease • Protease is an essential enzyme for viral replication • Drugs against protease are always part of the “cocktail”
Ritonavir inhibitor • Ritonavir (RTV) is a specific protease inhibitor (drug) • Molecular formula: C37H48N6O5S2
Drug resistance • No drug → drug: adaptive evolution (positive selection)
Selecton was used to analyse HIV-1 protease gene sequences from patients who were treated with RTV only