Bayesian Evolutionary Distance
P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1-17, 1996.
Determining time of divergence
• Goal: Determine when two aligned sequences X and Y diverged from a common ancestor
    AGTTGAC
    ACTTGCC
• Model:
  • Mutation only
  • Independence
  • Markov process
Divergence points have different probabilities
[Figure: X and Y descending from a common ancestor, with a plot of probability against time]
DNA PAM matrices
• Similar to Dayhoff PAM matrices
• PAM 1 corresponds to 1% mutation
• 1% change ≈ 10 million years
• Simplification: uniform mutation rates among nucleotides:
  • $m_{ij} = 0.99$ if $i = j$
  • $m_{ij} = 0.01/3 \approx 0.0033$ if $i \neq j$
• Can modify to handle different transition/transversion rates
  • Transitions (A↔G or C↔T) have higher probability than transversions
• PAM x = (PAM 1)^x
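DNA PAM 1 (uniform rates, 1% mutation)
      A        G        T        C
A   0.99     0.0033   0.0033   0.0033
G   0.0033   0.99     0.0033   0.0033
T   0.0033   0.0033   0.99     0.0033
C   0.0033   0.0033   0.0033   0.99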
DNA PAM x
      A       G       T       C
A   α(x)    β(x)    β(x)    β(x)
G   β(x)    α(x)    β(x)    β(x)
T   β(x)    β(x)    α(x)    β(x)
C   β(x)    β(x)    β(x)    α(x)
DNA PAM x
• As $x \to \infty$, $\alpha(x)$ and $\beta(x) \to 1/4$
• Assume $p_i = 1/4$ for $i \in \{A, C, T, G\}$
• Leads to simple match/mismatch scoring scheme
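The relation PAM x = (PAM 1)^x and the convergence to 1/4 can be checked numerically. The following is a minimal sketch, assuming the uniform model with a 1% mutation rate split evenly over the three other nucleotides; the function names (`pam1`, `pam`, `alpha_beta`) are illustrative and do not come from the paper. The closed form in `alpha_beta` follows from the structure of the uniform matrix, whose only eigenvalue other than 1 is $\lambda = 1 - 4 \cdot 0.01/3$.

```python
def pam1(mutation_rate=0.01):
    """Uniform-rate DNA PAM 1 matrix (assumed nucleotide order: A, G, T, C)."""
    off = mutation_rate / 3.0  # each of the 3 possible substitutions is equally likely
    return [[1.0 - mutation_rate if i == j else off for j in range(4)]
            for i in range(4)]


def matmul(a, b):
    """Plain 4x4 matrix product on lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def pam(x, mutation_rate=0.01):
    """PAM x = (PAM 1)^x, computed by repeated squaring (x is a non-negative int)."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    base = pam1(mutation_rate)
    while x > 0:
        if x & 1:
            result = matmul(result, base)
        base = matmul(base, base)
        x >>= 1
    return result


def alpha_beta(x, mutation_rate=0.01):
    """Closed form for the diagonal alpha(x) and off-diagonal beta(x) of PAM x."""
    lam = 1.0 - 4.0 * mutation_rate / 3.0
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x


if __name__ == "__main__":
    for x in (1, 10, 50, 250, 1000):
        a, b = alpha_beta(x)
        print(f"PAM {x:4d}: alpha = {a:.4f}, beta = {b:.4f}")
```

Both entries drift toward 0.25 as x grows, which is why very distant sequences carry almost no signal under this model.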
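DNA PAM n: Scoring
• Match score: $s(x) = \log 4\alpha(x)$
• Mismatch score: $r(x) = \log 4\beta(x)$
• Log-odds score of alignment of length n with k mismatches: $(n-k)\,s(x) + k\,r(x) = (n-k)\log 4\alpha(x) + k\log 4\beta(x)$
• Odds score of same alignment: $(4\alpha(x))^{n-k}\,(4\beta(x))^{k}$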
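The match/mismatch scoring scheme can be sketched in a few lines. This assumes the uniform 1% PAM 1 model, $p_i = 1/4$, and natural logarithms (the slides do not state a base); `alpha_beta`, `log_odds_score` and `odds_score` are hypothetical helper names.

```python
import math


def alpha_beta(x, mutation_rate=0.01):
    """Diagonal alpha(x) and off-diagonal beta(x) of PAM x (uniform model, assumed)."""
    lam = 1.0 - 4.0 * mutation_rate / 3.0
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x


def log_odds_score(n, k, x):
    """Log-odds score of an ungapped alignment of length n with k mismatches at PAM x."""
    a, b = alpha_beta(x)
    match = math.log(4.0 * a)     # score per matching position
    mismatch = math.log(4.0 * b)  # score per mismatching position (needs x >= 1)
    return (n - k) * match + k * mismatch


def odds_score(n, k, x):
    """Odds score of the same alignment: the exponential of the log-odds score."""
    return math.exp(log_odds_score(n, k, x))


if __name__ == "__main__":
    print(log_odds_score(25, 2, 10), odds_score(25, 2, 10))
```

At small distances a match scores positively and a mismatch strongly negatively; both scores shrink toward zero as x grows, since $4\alpha(x)$ and $4\beta(x)$ both approach 1.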
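Probability of k mismatches at distance x
• Up to a factor that does not depend on x: $\Pr(k \mid x) \propto (4\alpha(x))^{n-k}\,(4\beta(x))^{k}$
• Note: Need odds score here, not log-odds!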
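Conditional expectation
• Expected evolutionary distance given k mismatches: $E[x \mid k] = \sum_x x \,\Pr(x \mid k)$
• By Bayes’ Thm: $\Pr(x \mid k) = \dfrac{\Pr(k \mid x)\,\Pr(x)}{\sum_y \Pr(k \mid y)\,\Pr(y)}$
  • $\Pr(k \mid x)$: from odds scores
  • $\Pr(x)$: prior on the distance (??)
  • Denominator: sum over all distances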
Assumptions
• Consider only a finite number of values of x; e.g., 1, 10, 25, 50, etc.
  • In theory, could consider any number of values
• “Flat prior:” All values of x are equally likely
  • If M values are considered, Pr(x) = 1/M
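Calculating the distance
• With a flat prior, the factor 1/M cancels: $E[x \mid k] = \sum_x x \cdot \dfrac{\Pr(k \mid x)}{\sum_y \Pr(k \mid y)}$
• The ratio is the fraction of the probability of k mismatches that comes from assuming distance is x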
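The flat-prior calculation can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: the uniform 1% PAM 1 model, a small hypothetical grid of candidate distances, and the odds score used as the likelihood up to a constant factor. The helper names are invented for the example. Weights are computed in log space and shifted by their maximum before exponentiating, purely to avoid overflow on long alignments.

```python
import math


def alpha_beta(x, mutation_rate=0.01):
    """Diagonal alpha(x) and off-diagonal beta(x) of PAM x (uniform model, assumed)."""
    lam = 1.0 - 4.0 * mutation_rate / 3.0
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x


def log_odds_score(n, k, x):
    """Log-odds score of an alignment of length n with k mismatches at PAM x."""
    a, b = alpha_beta(x)
    return (n - k) * math.log(4.0 * a) + k * math.log(4.0 * b)


def posterior_over_distances(n, k, distances):
    """Pr(x | k) for each candidate distance, assuming a flat prior Pr(x) = 1/M."""
    logs = [log_odds_score(n, k, x) for x in distances]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]  # proportional to the odds scores
    total = sum(weights)
    return [w / total for w in weights]


def expected_distance(n, k, distances):
    """E[x | k]: the posterior-weighted mean of the candidate distances."""
    post = posterior_over_distances(n, k, distances)
    return sum(x * p for x, p in zip(distances, post))


if __name__ == "__main__":
    grid = [1, 10, 25, 50, 100, 200]  # hypothetical grid of PAM distances
    print(expected_distance(25, 2, grid))
    print(expected_distance(45, 11, grid))
```

Because only a finite grid is considered, the estimate is always a weighted average of the grid points, so the choice of grid affects the result.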
Ungapped local alignments
• An ungapped local alignment of sequences X and Y is a pair of equal-length substrings of X and Y
• Only matches and mismatches, no gaps
[Figure: sequences X and Y with an aligned pair of substrings]
Ungapped local alignments
A: 23 matches, 2 mismatches
B: 34 matches, 11 mismatches
Which alignment is better?
• The answer depends on the evolutionary distance
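The dependence on distance can be made concrete by scoring both alignments at several PAM distances. A minimal sketch, assuming the uniform 1% PAM 1 model, natural logarithms, and an arbitrary choice of distances to try; the function names are illustrative only.

```python
import math


def alpha_beta(x, mutation_rate=0.01):
    """Diagonal alpha(x) and off-diagonal beta(x) of PAM x (uniform model, assumed)."""
    lam = 1.0 - 4.0 * mutation_rate / 3.0
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x


def log_odds_score(matches, mismatches, x):
    """Log-odds score of an ungapped alignment at PAM distance x."""
    a, b = alpha_beta(x)
    return matches * math.log(4.0 * a) + mismatches * math.log(4.0 * b)


def better_alignment(x):
    """Return 'A' or 'B', whichever alignment scores higher at PAM distance x."""
    score_a = log_odds_score(23, 2, x)   # alignment A: 23 matches, 2 mismatches
    score_b = log_odds_score(34, 11, x)  # alignment B: 34 matches, 11 mismatches
    return "A" if score_a > score_b else "B"


if __name__ == "__main__":
    for x in (1, 10, 25, 50):
        print(f"PAM {x:2d}: A = {log_odds_score(23, 2, x):6.2f}, "
              f"B = {log_odds_score(34, 11, x):6.2f}, better = {better_alignment(x)}")
```

Under these assumptions the short, nearly perfect alignment A scores higher at PAM 1, while the longer, noisier alignment B scores higher at PAM 50, because mismatches are penalized less as the assumed distance grows.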