1 / 26

Recombination and Pedigrees

Recombination and Pedigrees. Genealogies and Recombination: The ARG Recombination Parsimony The ARG and Data Pedigrees: Models and Data Pedigrees & ARGs Challenges Empirical Investigations: Open Questions. Finding Minimal Recombination Histories. 1. 2. 3. 4. 1. 2. 3. 1. 4. 2. 3.

elackey
Download Presentation

Recombination and Pedigrees

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Recombination and Pedigrees Genealogies and Recombination: The ARG Recombination Parsimony The ARG and Data Pedigrees: Models and Data Pedigrees & ARGs Challenges Empirical Investigations: Open Questions

  2. Finding Minimal Recombination Histories 1 2 3 4 1 2 3 1 4 2 3 4 Global Pedigrees Finding Common Ancestors NOW Recombination Histories & Global Pedigrees Acknowledgements Yun Song - Rune Lyngsø - Mike Steel

  3. Hudson & Kaplan’s RM 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 If you equate RM with expected number of recombinations, this could be used as an estimator. Unfortunately, RM is a gross underestimate of the real number of recombinations.

  4. Local Inference of Recombinations • Recoding • At most 1 mutation per column • 0 ancestral state, 1 derived state 0 0 1 1 0 1 0 1 T . . . G T . . . C A . . . G A . . . C Four combinations Incompatibility: 0 0 0 1 1 0 1 0 0 0 0 1 1 1 00 10 01 11 Myers-Griffiths (2002): Number of Recombinations in a sample, NR, number of types, NT, number of mutations, NM obeys:

  5. Finding Minimal Recombination Histories 64 Bodmer & Edwards: Parsimony defined as reconstruction principle 85 Hudson Kaplan uses minimal recombination histories as observed recombinations • Attempts to find minimal histories of sequences • Definition of recombination as Subtree Prune Regraft operations J.J.Hein: Reconstructing the history of sequences subject to Gene Conversion and Recombination. Mathematical Biosciences. (1990) 98.185-200. J.J.Hein: A Heuristic Method to Reconstruct the History of Sequences Subject to Recombination. J.Mol.Evol. 20.402-411. 1993 Hein,J.J., T.Jiang, L.Wang & K.Zhang (1996): "On the complexity of comparing evolutionary trees" Discrete Applied Mathematics 71.153-169 Song, Y.S. (2003) “On the combinatorics of rooted binary phylogenetic trees”. Annals of Combinatorics, 7:365–379 Song, Y.S. & Hein, J. (2005) Constructing Minimal Ancestral Recombination Graphs. J. Comp. Biol., 12:147–169 Song, Y.S. & Hein, J. (2004) On the minimum number of recombination events in the evolutionary history of DNA sequences.J. Math. Biol., 48:160–186. Song, Y.S. & Hein, J. (2003) Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events, Lecture Notes in Bioinformatics, Proceedings of WABI'03, 2812:287–302. Lyngsø, Song and Hein (2005) “Minimal Recombination Histories by Branch and Bound” WABI

  6. Minimal Number of Recombinations Last Local Tree Algorithm: 1 2 i-1 i L Data 1 2 Trees n How many local trees? • Unrooted • Coalescent The Kreitman data (1983):11 sequences, 3200bp, 43(28) recoded, 9 different Bi-partitions How many neighbors?

  7. Two Adjacent Columns 0 0 0 0 1 1 1 1 1 1 1 0 0 1 1 1 1 0 DiamRec 1 2 RecDist 1. RecDist[T1,T2] is hard for large leaf number, but can be automatically calculated by adding DiamRec trivial columns and only considering 1 recombination neighbors. 2. Infinite Site Assumption: Local Trees must contain Local Bipartition

  8. Metrics on Trees based on subtree transfers. Trees including branch lengths Unrooted tree topologies Rooted tree topologies Tree topologies with age ordered internal nodes Pretending the easy problem (unrooted) is the real problem (age ordered), causes violation of the triangle inequality:

  9. Tree Combinatorics and Neighborhoods Due to Yun Song Song (2003+) Allen & Steel (2001) Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees

  10. 1 4 2 3 5 6 7

  11. The Minimal Recombination History for the Kreitman Data

  12. The Griffiths-Ethier-Tavare Recursions No recombination: Infinite Site Assumption Ancestral State Known History Graph: Recursions Exists No cycles Possible Histories without Recombination for simple data example 0 1 1 1 4 2 3 5 4 5 5 5 6 3 7 2 8 1 - recombination 27 ACs + recombination 3*108 ACs

  13. mid-point heuristic 2nd 1st Ancestral configurations to 2 sequences with 2 segregating sites

  14. Counting Recursion k1 (k2+1)*k1 +1 possible ancestral columns. k2 Summary statistic lumping configurations k1(k2+1)+1 padded with “-” + 1 k+1 k

  15. Counting + Branch and Bound Algorithm 0 3 1 91 2 1314 3 8618 4 30436 5 62794 6 78970 7 63049 8 32451 9 10467 10 1727 Lower bound ? Upper Bound Exact length k 289920 k-recombinatination neighborhood

  16. i. The process is non-Markovian ii. The trees cannot be reduced to Topologies * * = Time versus Spatial: Coalescent-Recombination (Griffiths, 1981; Hudson, 1983 - Wiuf & Hein, 1999) Temporal Process Spatial Process

  17. Time versus Spatial 2: Pedigrees Elston-Stewart (1971) -Temporal Peeling Algorithm: Father Mother Condition on parental states Recombination and mutation are Markovian Lander-Green (1987) - Genotype Scanning Algorithm: Father Mother Condition on paternal/maternal inheritance Recombination and mutation are Markovian

  18. Time versus Spatial 3: Phylogenetic Alignment • Optimisation Algorithms • indels of length 1 (David Sankoff, 1973) Spatial • indels of length k (Bjarne Knudsen, 2003) Temporal • Statistical Alignment Temporal: Spatial:

  19. minARGs: Recombination Events & Local Trees Song-Hein Myers-Griiths ((1,2),(1,2,3)) Hudson-Kaplan Minimal ARG n=8, Q=40 True ARG 1 2 3 4 5 n=8, Q=15 True ARG Reconstructed ARG 1 3 2 4 5 ((1,3),(1,2,3)) 0 4 Mb Mutation information on both sides • Mutation information on only one side n=7, r=10, Q=75

  20. Likelihood Calculations on the e-ARG 010 010 101 101 110 Example:

  21. Reconstructing global pedigrees: Superpedigrees Steel and Hein, 2005 k The gender-labeled pedigrees for all pairs, defines global pedigree Gender-unlabeled pedigrees doesn’t!!

  22. Reconstructing global pedigrees: Links and lassos Steel and Hein, 2005 k Link Lasso Gender-labeled links and lassos determine the global pedigree.

  23. Benevolent Mutation and Recombination Process Genomes with r and m/r --> infinity r - recombination rate, m - mutation rate Embedded phylogenies: • All embedded phylogenies are observable • Do they determine the pedigree? Counter example:

  24. Summary and Future • Minimal Recombination Histories • Likelihood Calculations • Global Pedigrees & Inferring Pedigrees from Genomes Recombination: Remove infinite site assumption Investigate MCMC algorithms Pedigrees: Data Analysis Algorithms Presentation can be found at: http://mathgen.stats.ox.ac.uk/bioinformatics/

  25. To Do • Hudson Slide • Neighbor trees • Literature and History

More Related