Recombination Histories & Global Pedigrees
• Finding Minimal Recombination Histories
• Global Pedigrees
• Finding Common Ancestors
[Figure: diagrams with leaves labelled 1 to 4 and a marker NOW]
Acknowledgements: Yun Song, Rune Lyngsø, Mike Steel
Basic Evolutionary Events
• Recombination
• Gene Conversion
• Coalescent/Duplication
• Mutation
Infinite site assumption?
Hudson & Kaplan’s RM
[Figure: 0/1 sequence data matrix]
• If RM is equated with the expected number of recombinations, it could be used as an estimator.
• Unfortunately, RM is a gross underestimate of the real number of recombinations.
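To make RM concrete, here is a minimal sketch of the usual construction: mark every pair of sites that shows all four gametes as incompatible, then count the largest set of non-overlapping incompatible intervals. The function names `incompatible` and `hudson_kaplan_rm` and the input layout (one 0/1 string per sequence) are assumptions made for illustration, not taken from the slides.

```python
from itertools import combinations


def incompatible(col_a, col_b):
    """Four-gamete test: True if 00, 01, 10 and 11 all occur at the two sites."""
    return len(set(zip(col_a, col_b))) == 4


def hudson_kaplan_rm(haplotypes):
    """Count non-overlapping incompatible site intervals (a lower bound).

    haplotypes: list of equal-length 0/1 strings, one per sequence
    (hypothetical input format chosen for this sketch).
    """
    n_sites = len(haplotypes[0])
    columns = [[h[s] for h in haplotypes] for s in range(n_sites)]

    # Every incompatible pair (i, j) needs a recombination somewhere between i and j.
    intervals = [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if incompatible(columns[i], columns[j])
    ]

    # Greedy choice by right endpoint gives the maximum number of disjoint intervals.
    intervals.sort(key=lambda pair: pair[1])
    rm, last_right = 0, 0
    for left, right in intervals:
        if left >= last_right:
            rm += 1
            last_right = right
    return rm


sample = ["000", "011", "101", "110"]
print(hudson_kaplan_rm(sample))
```

Intervals that merely share an endpoint, such as (0,1) and (1,2), count as disjoint because the recombinations fall between sites. Overlapping intervals are explained by a single event, which is why the count can fall well below the true number.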
Local Inference of Recombinations
• Recoding
• At most 1 mutation per column
• 0 ancestral state, 1 derived state
[Figure: two sites with haplotypes T...G, T...C, A...G, A...C recoded to 0/1]
Four combinations (00, 10, 01, 11) at a pair of sites: Incompatibility
Myers-Griffiths (2002): the number of recombinations in a sample, NR, the number of types, NT, and the number of mutations, NM, obey:
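The inequality itself did not survive in the extracted slide. A minimal sketch, assuming it is the haplotype bound commonly attributed to Myers and Griffiths, NR ≥ NT − NM − 1, with NM taken as the number of segregating sites under the infinite site assumption. The function name `haplotype_bound` and the 0/1 string input are hypothetical.

```python
def haplotype_bound(haplotypes):
    """Lower bound on recombinations: distinct types - segregating sites - 1.

    Assumes at most one mutation per column, so the number of mutations
    equals the number of segregating (variable) columns.
    """
    n_types = len(set(haplotypes))
    n_sites = len(haplotypes[0])
    n_mutations = sum(
        1 for s in range(n_sites) if len({h[s] for h in haplotypes}) > 1
    )
    # The bound is never useful below zero.
    return max(0, n_types - n_mutations - 1)


print(haplotype_bound(["00", "01", "10", "11"]))
```

The intuition: without recombination, each mutation can create at most one new type from the starting one, so seeing more than NM + 1 types requires recombination. Applied to the whole sample, the bound can be weak. Stronger results are usually obtained by applying it to subsets of sites and combining the local bounds.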
Minimal Number of Recombinations
Last Local Tree Algorithm:
[Figure: data columns 1, 2, ..., i-1, i, ..., L against candidate local trees 1, 2, ..., n]
How many local trees?
• Unrooted
• Coalescent
The Kreitman data (1983): 11 sequences, 3200 bp, 43 (28) recoded, 9 different bi-partitions
How many neighbors?
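The question "How many local trees?" can be illustrated with the standard closed-form counts, assuming "Unrooted" means unrooted binary leaf-labelled topologies, (2n-5)!!, and "Coalescent" means rooted topologies with age-ordered internal nodes, n!(n-1)!/2^(n-1). The function names are chosen for this sketch only.

```python
from math import factorial


def unrooted_topologies(n):
    """Number of unrooted binary trees on n labelled leaves: (2n-5)!!."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # 3 * 5 * ... * (2n-5)
        count *= k
    return count


def coalescent_topologies(n):
    """Number of rooted topologies with ordered coalescence events."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


# 11 sequences, as in the Kreitman data
print(unrooted_topologies(11), coalescent_topologies(11))
```

Even for 11 sequences the number of candidate trees per site runs into the tens of millions for unrooted trees, and far more for age-ordered ones.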
Metrics on Trees based on subtree transfers
• Trees including branch lengths
• Unrooted tree topologies
• Rooted tree topologies
• Tree topologies with age ordered internal nodes
Pretending the easy problem (unrooted) is the real problem (age ordered) causes violation of the triangle inequality:
Tree Combinatorics and Neighborhoods
Due to Yun Song: Song (2003+), Allen & Steel (2001)
• Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees.
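The slide's own table is not preserved. As a rough illustration, assume the unit-neighbourhood is the unrooted subtree-prune-and-regraft neighbourhood, whose size is usually quoted as 2(n-3)(2n-7), and compare it with the (2n-5)!! unrooted topologies. Both formulas are stated here as assumptions, and the function names are hypothetical.

```python
from math import prod


def unit_neighbourhood_size(n):
    """Assumed size of the unrooted SPR unit-neighbourhood: 2(n-3)(2n-7)."""
    return 2 * (n - 3) * (2 * n - 7)


def number_of_trees(n):
    """Unrooted binary topologies on n labelled leaves: (2n-5)!!."""
    return prod(range(3, 2 * n - 4, 2))


for n in (5, 8, 11):
    # The neighbourhood grows quadratically, the tree count super-exponentially.
    print(n, unit_neighbourhood_size(n), number_of_trees(n))
```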
The Griffiths-Ethier-Tavaré Recursions
• No recombination: Infinite Site Assumption
• Ancestral State Known
• History Graph: Recursions Exist, No cycles
• Possible Histories without Recombination for simple data example
[Figure: history graph for the simple data example; node labels not recoverable]
• Without recombination: 27 ancestral configurations (ACs)
• With recombination: 3*10^8 ACs
Ancestral configurations to 2 sequences with 2 segregating sites
[Figure: ancestral configurations, with events labelled 1st and 2nd; mid-point heuristic]
Counting + Branch and Bound Algorithm
Lower bound, ?, Upper bound
k-recombination neighborhood:
Exact length k    Count
0                 3
1                 91
2                 1314
3                 8618
4                 30436
5                 62794
6                 78970
7                 63049
8                 32451
9                 10467
10                1727
Total             289920
minARGs: Recombination Events & Local Trees
Song-Hein, Myers-Griffiths, Hudson-Kaplan
[Figure panels: Minimal ARG vs. True ARG (n=8, Q=40); True ARG vs. Reconstructed ARG (n=8, Q=15); local trees ((1,2),(1,2,3)) and ((1,3),(1,2,3)); axis 0 to 4 Mb; n=7, r=10, Q=75]
• Mutation information on both sides
• Mutation information on only one side
Reconstructing global pedigrees: Superpedigrees
Steel and Hein, 2005
• The gender-labeled pedigrees for all pairs define the global pedigree.
• Gender-unlabeled pedigrees do not!
Benevolent Mutation and Recombination Process
Genomes with r and m/r --> infinity (r: recombination rate, m: mutation rate)
Embedded phylogenies:
• All embedded phylogenies are observable
• Do they determine the pedigree?
Counter example: