DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT. ESTIMATING THE dN/dS RATIO FOR GENE SEQUENCES IN THE PRESENCE OF RECOMBINATION. Danny Wilson, 12th October 2004. Menu: Codon-based models of molecular evolution; A new method for estimating omega with recombination
DIVERSIFYING SELECTION AND FUNCTIONAL CONSTRAINT: ESTIMATING THE dN/dS RATIO FOR GENE SEQUENCES IN THE PRESENCE OF RECOMBINATION. Danny Wilson, 12th October 2004
Menu • Codon-based models of molecular evolution • A new method for estimating omega with recombination • Does it work? Simulation studies and example data
Part one: Codon-based models of molecular evolution
Mutation and selection. Diagram: mutation of the ancestral type produces neutral mutants and inviable mutants; selection then acts on them. Sampling usually occurs at this point, i.e. post-selection. Underlying rates of non-synonymous mutation are usually confounded with selection against inviable mutants. Thus it is convenient to model functional constraint as mutational bias (or rather, to make no attempt to disentangle the two).
Types of single nucleotide mutation: transitions vs. transversions. For any base there are always 2 possible transversions and 1 possible transition. Purines: A, G. Pyrimidines: T, C. Transitions exchange two purines (A, G) or two pyrimidines (T, C); transversions exchange a purine for a pyrimidine or vice versa.
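The rule on this slide can be checked mechanically. The following is a minimal sketch, not part of the original talk; the function name `mutation_type` is chosen here for illustration.

```python
# Purines and pyrimidines as listed on the slide.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def mutation_type(old, new):
    """Classify a single nucleotide change as a transition or a transversion."""
    if old == new:
        raise ValueError("old and new base are identical: no mutation")
    pair = {old, new}
    # A transition stays within one chemical class; a transversion crosses classes.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Tally the possible changes away from each base.
for base in "ACGT":
    kinds = [mutation_type(base, other) for other in "ACGT" if other != base]
    print(base, kinds.count("transition"), "transition,",
          kinds.count("transversion"), "transversions")
```

Every base has exactly one transition partner, so the remaining two changes are necessarily transversions.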
Types of codon mutation: synonymous vs. non-synonymous
Synonymous: TTG (Leucine) to TTA (Leucine). Non-synonymous: TTG (Leucine) to ATG (Methionine).
Leucine: (CH3)2-CH-CH2-CH(NH2)-COOH, isoelectric point at pH 5.98, 6-fold degeneracy in the genetic code.
Methionine: CH3-S-(CH2)2-CH(NH2)-COOH, isoelectric point at pH 5.74, single unique codon ATG.
Example: CTT (Leucine). Diagram: the codons reachable from CTT by a single nucleotide change, including TTT, ATT and GTT at the first position.
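The CTT example can be worked through in code. This is an illustrative sketch that assumes the standard genetic code; the helper names `single_nucleotide_neighbours` and `classify` are not from the talk.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_nucleotide_neighbours(codon):
    """Return the nine codons that differ from `codon` at exactly one position."""
    return [
        codon[:pos] + base + codon[pos + 1:]
        for pos in range(3)
        for base in BASES
        if base != codon[pos]
    ]


def classify(codon, mutant):
    """Synonymous if both codons encode the same amino acid (or both are stops)."""
    return "synonymous" if CODE[codon] == CODE[mutant] else "non-synonymous"


for mutant in single_nucleotide_neighbours("CTT"):
    print("CTT ->", mutant, CODE[mutant], classify("CTT", mutant))
```

Under the standard code, the only synonymous changes from CTT are at the third position, where any base still encodes leucine.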
Nielsen and Yang (1998) codon-based model of molecular evolution
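For reference, the instantaneous substitution rate from codon i to codon j in this family of models is usually written as below. The notation is an assumption here rather than a transcription of the slide: \kappa is the transition/transversion rate ratio, \omega the dN/dS ratio, and \pi_j the equilibrium frequency of codon j.

```latex
q_{ij} =
\begin{cases}
0                      & \text{if } i \text{ and } j \text{ differ at more than one position} \\
\pi_j                  & \text{for a synonymous transversion} \\
\kappa \pi_j           & \text{for a synonymous transition} \\
\omega \pi_j           & \text{for a non-synonymous transversion} \\
\omega \kappa \pi_j    & \text{for a non-synonymous transition}
\end{cases}
```

Values of \omega below 1 correspond to functional constraint, and values above 1 to diversifying selection.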
codeML • Pros: a viable method for detecting the mode of selection on a codon sequence • Cons: categorizes possible values for omega into a small number of discrete intervals; results can be misleading in the presence of recombination
Part two: A new method for estimating omega with recombination
Li and Stephens (2003): Approximation to the likelihood
TTTGATACTGTTGCCGAAGGTTTGGGCGAAATTCGCGATTTATTGCGCCGTTATCATCAT
TTTGATACCGTTGCCGAAGGTTTGGGTGAAATTCGCGATTTATTGCGCCGTTACCACCGC
TTTGATACCGTTGCCGAAGGTTTGGGTAAAATTCGCGATTTATTGCGCCGTTACCACCGC
TTTGATACCGTTGCCGAAGGTTTGGGCGAAATTCGTGATTTATTGCGCCGTTATCATCAT
Li and Stephens (2003): Approximation to the likelihood
TTTGATACTGTTGCCGAAGGTTTGGGCGAAATTCGCGATTTATTGCGCCGTTATCATCAT
TTTGATACCGTTGCCGAAGGTTTGGGTGAAATTCGCGATTTATTGCGCCGTTACCACCGC
TTTGATACCGTTGCCGAAGGTTTGGGTAAAATTCGCGATTTATTGCGCCGTTACCACCGC
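The Li and Stephens approximation writes the likelihood of the sample as a product of conditional probabilities, each haplotype being treated as an imperfect mosaic copied from the haplotypes considered before it. The sketch below illustrates that idea with a deliberately simplified copying model and is not the parameterization used in the talk or the paper: the per-site switch probability `switch`, the miscopying probability `mu`, and the uniform 1/4 probability per base for the first haplotype are all assumptions made for illustration.

```python
import math


def emit(observed, copied, mu):
    """Probability of the observed base given the copied base (assumed error model)."""
    return 1.0 - mu if observed == copied else mu / 3.0


def copying_probability(target, panel, switch, mu):
    """Forward algorithm: probability of `target` as a mosaic of the panel haplotypes."""
    k = len(panel)
    # Start by copying each panel haplotype with equal probability.
    fwd = [emit(target[0], hap[0], mu) / k for hap in panel]
    for s in range(1, len(target)):
        total = sum(fwd)
        # Either keep copying the same haplotype, or switch to a uniformly chosen one.
        fwd = [
            ((1.0 - switch) * fwd[j] + switch * total / k)
            * emit(target[s], panel[j][s], mu)
            for j in range(k)
        ]
    return sum(fwd)


def pac_log_likelihood(haplotypes, switch=0.05, mu=0.01):
    """Sum of log conditionals: each haplotype given those that precede it."""
    # Assumption: the first haplotype gets probability 1/4 per base.
    log_lik = len(haplotypes[0]) * math.log(0.25)
    for i in range(1, len(haplotypes)):
        log_lik += math.log(
            copying_probability(haplotypes[i], haplotypes[:i], switch, mu)
        )
    return log_lik


print(pac_log_likelihood(["TTTGATACT", "TTTGATACC", "TTTGATACC"]))
```

Because each conditional depends only on the earlier haplotypes, the value of such a product generally depends on the order in which the haplotypes are considered.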
Estimating variable omega • The problem: a constant omega model is prone to averaging over sites under positive selection (omega > 1) and sites under negative selection (omega < 1) within a gene; allowing every site its own omega leaves little information for inference • The solution: a change-point model in which windows of adjacent sites share the same omega
Estimating variable omega • MCMC moves: • Change omega for a single block • Extend a block 5’ or 3’ • Split an existing block • Merge adjacent blocks. Diagram: adjacent blocks along the sequence with omegas ω1, ω2, ω3, ω4, ω5
Part three: Does it work? Simulation studies and example data
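The four moves can be sketched as proposals on a simple block representation. Everything below is an assumption made for illustration: blocks are stored as a sorted list of start sites plus one omega per block, new omegas are drawn by a log-normal perturbation, and merged blocks take a length-weighted mean. Only the proposals are shown; the acceptance step, which needs the likelihood, prior and proposal ratios, is omitted.

```python
import math
import random


def block_end(starts, i, n_sites):
    """One past the last site of block i."""
    return starts[i + 1] if i + 1 < len(starts) else n_sites


def change_omega(starts, omegas, n_sites, rng, scale=0.5):
    """Rescale the omega of one block; the multiplier keeps it positive."""
    i = rng.randrange(len(omegas))
    new = list(omegas)
    new[i] = omegas[i] * math.exp(rng.gauss(0.0, scale))
    return list(starts), new


def extend_block(starts, omegas, n_sites, rng):
    """Shift one internal boundary by a single site in the 5' or 3' direction."""
    if len(starts) < 2:
        return list(starts), list(omegas)
    i = rng.randrange(1, len(starts))
    pos = starts[i] + rng.choice((-1, 1))
    # Both neighbouring blocks must keep at least one site.
    if not starts[i - 1] < pos < block_end(starts, i, n_sites):
        return list(starts), list(omegas)
    new = list(starts)
    new[i] = pos
    return new, list(omegas)


def split_block(starts, omegas, n_sites, rng, scale=0.5):
    """Cut one block in two; the 3' half receives a perturbed omega."""
    i = rng.randrange(len(starts))
    end = block_end(starts, i, n_sites)
    if end - starts[i] < 2:
        return list(starts), list(omegas)
    cut = rng.randrange(starts[i] + 1, end)
    fresh = omegas[i] * math.exp(rng.gauss(0.0, scale))
    return (starts[:i + 1] + [cut] + starts[i + 1:],
            omegas[:i + 1] + [fresh] + omegas[i + 1:])


def merge_blocks(starts, omegas, n_sites, rng):
    """Join two adjacent blocks, using a length-weighted mean omega."""
    if len(starts) < 2:
        return list(starts), list(omegas)
    i = rng.randrange(len(starts) - 1)
    left = starts[i + 1] - starts[i]
    right = block_end(starts, i + 1, n_sites) - starts[i + 1]
    merged = (left * omegas[i] + right * omegas[i + 1]) / (left + right)
    return starts[:i + 1] + starts[i + 2:], omegas[:i] + [merged] + omegas[i + 2:]


def is_valid(starts, omegas, n_sites):
    """Blocks must tile sites 0..n_sites-1 and every omega must be positive."""
    return (len(starts) == len(omegas) and starts[0] == 0
            and all(a < b for a, b in zip(starts, starts[1:]))
            and starts[-1] < n_sites and all(w > 0 for w in omegas))


MOVES = (change_omega, extend_block, split_block, merge_blocks)
```

A sampler would pick one of `MOVES` at random in each iteration and then accept or reject the proposed state.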
Neutral dataset (figure): true omega, posterior mean and posterior HPD interval.
Non-neutral dataset (figure): true omega, posterior mean and posterior HPD interval.
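The two summaries plotted in these figures can be computed from MCMC output as follows. This is a generic sketch that assumes the posterior for omega at one site is represented by a list of sampled values; it is not the code behind the figures, and `hpd_interval` is a name chosen here.

```python
import math


def posterior_mean(samples):
    """Average of the sampled values."""
    return sum(samples) / len(samples)


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    # Number of consecutive sorted samples the interval must cover.
    m = max(1, math.ceil(mass * n))
    # Slide a window of m samples and keep the narrowest one.
    best = min(range(n - m + 1), key=lambda i: xs[i + m - 1] - xs[i])
    return xs[best], xs[best + m - 1]
```

Unlike an equal-tailed interval, the HPD interval is not forced to exclude the same amount of probability on each side, which matters for skewed posteriors such as those of a ratio like omega.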
HIV envelope gene: Slow Non-Progressors vs Rapid Progressors (figure panels: Slow Non-Progressors; Rapid Progressors)
Neisseria meningitidis PorB3: 95% HPD upper bound 0.0386, 95% HPD lower bound 0.0187
Work in progress… • Variable recombination rate • Model indels • Falsifiability test • Test for sensitivity to rate heterogeneity
Acknowledgements • Gil McVean (Supervisor) • Martin Maiden (Supervisor) • Ziheng Yang • Rachel Urwin (meninge data) • Charlie Edwards (HIV data)