
Bayesian Evolutionary Distance

P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1–17, 1996.

Presentation Transcript


  1. Bayesian Evolutionary Distance
     P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1–17, 1996

  2. Determining time of divergence
     • Goal: Determine when two aligned sequences X and Y diverged from a common ancestor
         AGTTGAC
         ACTTGCC
     • Model:
       • Mutation only
       • Independence
       • Markov process

  3. Divergence points have different probabilities
     [Figure: sequences X and Y descending from a common Ancestor; axes: Probability vs. time]

  4. DNA PAM matrices
     • Similar to Dayhoff PAM matrices
     • PAM 1 corresponds to 1% mutation
     • 1% change ≈ 10 million years
     • Simplification: uniform mutation rates among nucleotides:
       • $m_{ij} = \alpha$ if $i = j$
       • $m_{ij} = \beta$ if $i \neq j$
     • Can modify to handle different transition/transversion rates
       • Transitions (A↔G or C↔T) have higher probability than transversions
     • PAM x = (PAM 1)^x
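The construction on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: it assumes that "1% mutation" means each site has a 1% chance of changing in one PAM step, so the diagonal entry of PAM 1 is 0.99 and each off-diagonal entry is 0.01/3 (rows must sum to 1). The function names are made up for this example.

```python
# Sketch: DNA PAM matrices under the uniform-rate simplification.
# Assumption: PAM 1 leaves a nucleotide unchanged with probability 0.99.
ALPHA = 0.99             # diagonal entry of PAM 1 (no change)
BETA = (1 - ALPHA) / 3   # off-diagonal entry (change to one specific other base)


def pam1():
    """Return the 4 x 4 PAM 1 matrix as a list of rows."""
    return [[ALPHA if i == j else BETA for j in range(4)] for i in range(4)]


def matmul(a, b):
    """Multiply two 4 x 4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def pam(x):
    """Return PAM x = (PAM 1)^x for a positive integer x."""
    result = pam1()
    for _ in range(x - 1):
        result = matmul(result, pam1())
    return result


if __name__ == "__main__":
    m = pam(50)
    # Diagonal and off-diagonal entries of PAM 50
    print(m[0][0], m[0][1])
```

Repeated multiplication is used here only for clarity; for large x, exponentiation by squaring would need fewer multiplications.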

  5. DNA PAM 1
     [Matrix: 4 × 4, rows and columns indexed by the nucleotides A, G, T, C; entries not preserved in the transcript]

  6. DNA PAM x
     [Matrix: 4 × 4, rows and columns indexed by the nucleotides A, G, T, C; entries not preserved in the transcript]

  7. DNA PAM x
     • As $x \to \infty$, $\alpha(x)$ and $\beta(x) \to 1/4$
     • Assume $p_i = 1/4$ for $i \in \{A, C, T, G\}$
     • Leads to simple match/mismatch scoring scheme
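The limit on this slide can be checked with a short derivation. The sketch below writes $\alpha$ for the diagonal entry of PAM 1 and $\beta$ for the off-diagonal entry, and assumes the uniform-rate simplification, so that $\alpha + 3\beta = 1$. The matrices $I$ (identity) and $J$ (all ones) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
\mathrm{PAM}\,1 &= (\alpha - \beta)\,I + \beta\,J,
  \qquad J^2 = 4J \\
\mathrm{PAM}\,x &= (\alpha - \beta)^x\,I + \frac{1 - (\alpha - \beta)^x}{4}\,J \\
\alpha(x) &= \tfrac14 + \tfrac34\,(\alpha - \beta)^x,
  \qquad
\beta(x) = \tfrac14 - \tfrac14\,(\alpha - \beta)^x
\end{align*}
```

Because $|\alpha - \beta| < 1$, the term $(\alpha - \beta)^x$ vanishes as $x \to \infty$, so both $\alpha(x)$ and $\beta(x)$ tend to $1/4$.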

  8. DNA PAM x: Scoring

  9. DNA PAM

  10. DNA PAM n: Scoring
      • Match score: $s(x) = \log 4\alpha(x)$
      • Mismatch score: $r(x) = \log 4\beta(x)$
      • Log-odds score of alignment of length n with k mismatches: $(n - k)\,s(x) + k\,r(x)$
      • Odds score of same alignment: $(4\alpha(x))^{n-k}\,(4\beta(x))^{k}$
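A minimal Python sketch of this scoring scheme follows. It assumes natural logarithms (the slides do not state the base) and takes the PAM x match and mismatch probabilities as plain arguments; the function names are hypothetical. The example scores the two sequences from slide 2, which differ at 2 of 7 positions, using the assumed PAM 1 values 0.99 and 0.01/3.

```python
import math


def match_score(alpha_x):
    """Log-odds score of one matching position: log(4 * alpha(x))."""
    return math.log(4 * alpha_x)


def mismatch_score(beta_x):
    """Log-odds score of one mismatching position: log(4 * beta(x))."""
    return math.log(4 * beta_x)


def log_odds_score(n, k, alpha_x, beta_x):
    """Log-odds score of an ungapped alignment of length n with k mismatches."""
    return (n - k) * match_score(alpha_x) + k * mismatch_score(beta_x)


def odds_score(n, k, alpha_x, beta_x):
    """Odds score of the same alignment (the exponential of the log-odds score)."""
    return (4 * alpha_x) ** (n - k) * (4 * beta_x) ** k


if __name__ == "__main__":
    # AGTTGAC vs ACTTGCC: length 7, mismatches at positions 2 and 6.
    alpha_1, beta_1 = 0.99, 0.01 / 3   # assumed PAM 1 values
    print(log_odds_score(7, 2, alpha_1, beta_1))
    print(odds_score(7, 2, alpha_1, beta_1))
```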

  11. Probability of k mismatches at distance x
      • Note: Need odds score here, not log-odds!

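  12. Conditional expectation
      • Expected evolutionary distance given k mismatches, summed over all distances: $E[x \mid k] = \sum_x x \,\Pr(x \mid k)$
      • By Bayes' Thm: $\Pr(x \mid k) = \Pr(k \mid x)\,\Pr(x) / \Pr(k)$
      • $\Pr(k \mid x)$ comes from odds scores; the remaining terms are marked "??" on the slide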

  13. Assumptions
      • Consider only a finite number of values of x; e.g., 1, 10, 25, 50, etc.
        • In theory, could consider any number of values
      • "Flat prior": All values of x are equally likely
        • If M values are considered, Pr(x) = 1/M

  14. Calculating Pr(k) and Pr(x|k)
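The formulas on this slide did not survive in the transcript. A plausible reconstruction, assuming only the law of total probability, Bayes' theorem, and the flat prior $\Pr(x) = 1/M$ over the $M$ distances considered, is:

```latex
\begin{align*}
\Pr(k) &= \sum_{x} \Pr(k \mid x)\,\Pr(x)
        = \frac{1}{M} \sum_{x} \Pr(k \mid x) \\
\Pr(x \mid k) &= \frac{\Pr(k \mid x)\,\Pr(x)}{\Pr(k)}
        = \frac{\Pr(k \mid x)}{\sum_{x'} \Pr(k \mid x')}
\end{align*}
```

Under the flat prior the factor $1/M$ cancels, so the posterior depends only on the relative sizes of $\Pr(k \mid x)$ across the candidate distances.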

  15. Calculating the distance
      • Pr(x | k): fraction of the probability of k mismatches that comes from assuming distance is x
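The whole calculation can be sketched in Python under several stated assumptions: a flat prior over a small grid of distances, PAM 1 with diagonal entry 0.99, and Pr(k | x) taken as proportional to the odds score of the alignment. Any factor that does not depend on x cancels when the posterior is normalized, so the odds score is enough here. The closed form used for the PAM x entries holds for the uniform-rate model; all function names are hypothetical.

```python
def alpha_beta(x, alpha1=0.99):
    """Match and mismatch probabilities of PAM x for the uniform-rate model.

    Uses the closed form of (PAM 1)^x; alpha1 = 0.99 is an assumed value.
    """
    beta1 = (1 - alpha1) / 3
    decay = (alpha1 - beta1) ** x
    return 0.25 + 0.75 * decay, 0.25 - 0.25 * decay


def odds_score(n, k, x):
    """Odds score of an alignment of length n with k mismatches at distance x."""
    a, b = alpha_beta(x)
    return (4 * a) ** (n - k) * (4 * b) ** k


def posterior(n, k, distances):
    """Pr(x | k) for each candidate distance, assuming a flat prior."""
    weights = [odds_score(n, k, x) for x in distances]
    total = sum(weights)
    return [w / total for w in weights]


def expected_distance(n, k, distances):
    """Posterior mean E[x | k] over the candidate distances."""
    return sum(x * p for x, p in zip(distances, posterior(n, k, distances)))


if __name__ == "__main__":
    grid = [1, 10, 25, 50]   # illustrative grid of PAM distances
    print(posterior(25, 2, grid))
    print(expected_distance(25, 2, grid))
```

A finer grid gives a smoother estimate; the coarse grid here simply mirrors the example values listed under the assumptions.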

  16. Ungapped local alignments
      • An ungapped local alignment of sequences X and Y is a pair of equal-length substrings of X and Y
      • Only matches and mismatches; no gaps
      [Figure: sequences X and Y]

  17. Ungapped local alignments
      • A: 23 matches, 2 mismatches
      • B: 34 matches, 11 mismatches
      P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1–17, 1996

  18. Which alignment is better?
      • Answer depends on evolutionary distance
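The dependence on distance can be illustrated by scoring both alignments from slide 17 at several PAM distances. This is a sketch under assumptions: uniform mutation rates, PAM 1 with diagonal entry 0.99, natural logarithms, and the closed form of (PAM 1)^x for the uniform-rate model. The function name is made up for this example.

```python
import math


def log_odds_score(matches, mismatches, x, alpha1=0.99):
    """Log-odds score of an ungapped alignment under DNA PAM x.

    alpha1 = 0.99 is an assumed PAM 1 diagonal entry (uniform-rate model).
    """
    beta1 = (1 - alpha1) / 3
    decay = (alpha1 - beta1) ** x
    alpha_x = 0.25 + 0.75 * decay   # probability of a match at distance x
    beta_x = 0.25 - 0.25 * decay    # probability of one specific mismatch
    return matches * math.log(4 * alpha_x) + mismatches * math.log(4 * beta_x)


if __name__ == "__main__":
    for x in (1, 10, 25, 50):
        score_a = log_odds_score(23, 2, x)    # alignment A
        score_b = log_odds_score(34, 11, x)   # alignment B
        higher = "A" if score_a > score_b else "B"
        print(f"PAM {x}: A = {score_a:.2f}, B = {score_b:.2f}, higher: {higher}")
```

Under these assumptions the short, nearly perfect alignment A scores higher at small distances, while the longer alignment B with more mismatches scores higher at large distances.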
