Forward Genealogical Simulations
Assumptions: 1) Fixed population size 2) Fixed mating time
Step #1: The mating process: For a fixed population size N, the number of progeny from each parent follows a random distribution with a mean of 1.
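As a rough illustration of Step #1, the sketch below lets every child pick its parent uniformly at random from the previous generation. This haploid, Wright-Fisher style sampling is an assumption made for the example, and the function names are hypothetical. Because N children are shared among N parents, the mean number of progeny per parent is exactly 1, while individual counts vary at random.

```python
import random
from collections import Counter

def mate_one_generation(n_individuals, rng):
    """Return a list where entry i is the index of the parent of child i.

    Assumption: each child chooses one parent uniformly at random.
    """
    return [rng.randrange(n_individuals) for _ in range(n_individuals)]

def offspring_counts(parents, n_individuals):
    """Count how many children each parent left (0 for parents with none)."""
    counts = Counter(parents)
    return [counts.get(i, 0) for i in range(n_individuals)]

rng = random.Random(1)          # fixed seed so the run is repeatable
N = 10                          # fixed population size
parents = mate_one_generation(N, rng)
counts = offspring_counts(parents, N)
mean_offspring = sum(counts) / N

print("parent of each child:", parents)
print("progeny per parent:  ", counts)
print("mean progeny:        ", mean_offspring)
```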
Step #2: The mutation process is then overlaid on the genealogy. Mutations are assumed to be neutral, i.e. they do not affect the fitness of progeny and are not subject to selection. This is a reasonable assumption, as very few polymorphisms have a function that affects selection. Mutation is a random event that occurs at a rate equal to µ.
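The two steps can be combined in a small forward simulation. This is only a sketch under stated assumptions: individuals are haploid, each child picks a parent uniformly at random, µ is treated as the per-individual, per-generation probability of one new mutation, and every mutation gets a fresh site label. All names and parameter values are hypothetical. Note that every individual is simulated in every generation, which is what makes the forward approach expensive.

```python
import itertools
import random

def forward_simulate(n_individuals, generations, mu, rng):
    """Simulate neutral evolution forward in time.

    Each genotype is a frozenset of mutation labels. Mutations never
    influence which parent is chosen, so they are neutral.
    """
    new_site = itertools.count()                 # supplies unique mutation labels
    population = [frozenset() for _ in range(n_individuals)]
    for _ in range(generations):
        next_population = []
        for _ in range(n_individuals):
            # Step 1: random mating (child copies a random parent)
            genotype = population[rng.randrange(n_individuals)]
            # Step 2: overlay a neutral mutation with probability mu
            if rng.random() < mu:
                genotype = genotype | {next(new_site)}
            next_population.append(genotype)
        population = next_population
    return population

def segregating_sites(population):
    """Mutations carried by some, but not all, individuals."""
    all_sites = set().union(*population)
    fixed = set(population[0]).intersection(*population[1:])
    return all_sites - fixed

final_population = forward_simulate(50, 200, 0.01, random.Random(42))
print("segregating sites:", len(segregating_sites(final_population)))
```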
BUT, there are 2 major problems with the forward simulation model: 1) It is computationally expensive to produce such a model. 2) It is difficult to know how to start the process, since the initial conditions for the species in question must be known.
The Coalescence Model bypasses the need to simulate every individual in the population. Because mating is random and mutations are neutral, the lineages of a sample of individuals can be traced back to a most recent common ancestor (MRCA) through statistical calculations. Individuals of the sample are said to "coalesce" at the point of their MRCA.
The main assumption of this theory is that each individual of the previous generation is equally likely to be the parent of any individual of the current generation (Wright-Fisher model). Therefore, the probability that a sample of 2 individuals share a common ancestor in the preceding generation is 1/N, where N = the size of the preceding generation. Stated another way, the probability that a sample of 2 individuals do NOT share a common ancestor in the preceding generation is 1 - 1/N.
[Figure: two successive generations of N = 10 individuals, labelled t + 1 and t, with 2 shaded individuals in generation t.]
In the above example, the probability that the 2 shaded individuals in generation t do not have a common ancestor in generation t + 1 (the preceding generation) is: 1 - 1/10 = 0.9. Thus, there is a 90% chance that the 2 sampled individuals do not coalesce in the previous generation.
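The N = 10 example can be checked numerically. The sketch below computes 1 - 1/N directly and compares it with a Monte Carlo estimate in which each of the 2 sampled individuals picks a parent uniformly at random; the function names and the number of trials are illustrative choices, not part of the original slides.

```python
import random

def prob_distinct_parents(n_population):
    """Exact probability that 2 individuals have different parents."""
    return 1 - 1 / n_population

def estimate_distinct_parents(n_population, trials, rng):
    """Monte Carlo estimate: draw a parent for each of 2 individuals."""
    distinct = 0
    for _ in range(trials):
        first_parent = rng.randrange(n_population)
        second_parent = rng.randrange(n_population)
        if first_parent != second_parent:
            distinct += 1
    return distinct / trials

exact = prob_distinct_parents(10)
estimate = estimate_distinct_parents(10, 100_000, random.Random(0))
print("exact:   ", exact)
print("estimate:", estimate)
```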
The basic calculation P(2) = 1 - 1/N, the probability that a sample of 2 individuals have distinct ancestors in the previous generation, can be extended to a sample of n individuals:
$$P(n) = \prod_{i=1}^{n-1}\left(1 - \frac{i}{N}\right)$$
Raising this to the power t, $[P(n)]^{t}$, gives the probability that n sampled individuals have n distinct ancestors in each of the preceding t generations. Essentially, this equation can be used to generate random genealogies by statistically tracing back to the MRCA from a sample of individuals in the current generation.
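A minimal sketch of both ideas follows, assuming haploid Wright-Fisher sampling. `prob_all_distinct` evaluates the product for n individuals over t generations, and `simulate_lineage_counts` traces a sample backward one generation at a time, recording how many distinct ancestral lineages remain until a single MRCA is reached. Both function names are hypothetical. Only the ancestors of the sample are tracked, never the whole population.

```python
import math
import random

def prob_all_distinct(n_sample, n_population, generations=1):
    """Probability that n_sample individuals have n_sample distinct
    ancestors in each of the preceding `generations` generations."""
    per_generation = math.prod(
        1 - i / n_population for i in range(1, n_sample)
    )
    return per_generation ** generations

def simulate_lineage_counts(n_sample, n_population, rng):
    """Trace a sample backward in time.

    Returns a list whose entry g is the number of distinct ancestral
    lineages g generations back; the last entry is 1 (the MRCA).
    """
    counts = [n_sample]
    while counts[-1] > 1:
        # every remaining lineage picks a parent uniformly at random;
        # lineages that pick the same parent coalesce
        parents = {rng.randrange(n_population) for _ in range(counts[-1])}
        counts.append(len(parents))
    return counts

print("P(2), N=10:", prob_all_distinct(2, 10))
print("P(3), N=10:", prob_all_distinct(3, 10))

lineages = simulate_lineage_counts(5, 100, random.Random(3))
print("generations back to the MRCA:", len(lineages) - 1)
```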
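Now that a genealogy can be randomly generated, the effect of mutations can be overlaid on the process, as in forward genealogical simulations.
Assumptions: 1) Constant-rate neutral mutation process 2) Infinite site model - the locus examined is composed of many sites and no more than one mutation occurs at any site within the genealogy of the sample
[Figure: three example genealogies, annotated with expected values of 4, 3, and 1.]
As one would expect, the expected number of accumulated mutations (S) is directly proportional to the expected total length of the lineages in the genealogy (Ttot), with the mutation rate as the constant of proportionality: E(S) = µ · E(Ttot), where Ttot is measured in generations and µ is the mutation rate per lineage per generation.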
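The proportionality between E(S) and E(Ttot) can be illustrated by simulation. The sketch below assumes haploid Wright-Fisher genealogies, measures Ttot as the number of lineage-generations in the genealogy, and lets each lineage-generation mutate independently with probability µ; under the infinite site model each mutation counts as one new site. The function names and parameter values are hypothetical.

```python
import random

def lineage_counts(n_sample, n_population, rng):
    """Number of ancestral lineages in each generation back to the MRCA."""
    counts = [n_sample]
    while counts[-1] > 1:
        parents = {rng.randrange(n_population) for _ in range(counts[-1])}
        counts.append(len(parents))
    return counts

def total_branch_length(counts):
    """Ttot in lineage-generations: each lineage present in a generation
    contributes one branch segment leading to its parent."""
    return sum(counts[:-1])

def drop_mutations(t_tot, mu, rng):
    """Number of mutations (S) on a genealogy of total length t_tot."""
    return sum(1 for _ in range(t_tot) if rng.random() < mu)

def average_s_and_ttot(n_sample, n_population, mu, replicates, rng):
    """Average S and Ttot over many independent random genealogies."""
    sum_s = 0
    sum_t = 0
    for _ in range(replicates):
        t_tot = total_branch_length(lineage_counts(n_sample, n_population, rng))
        sum_t += t_tot
        sum_s += drop_mutations(t_tot, mu, rng)
    return sum_s / replicates, sum_t / replicates

mu = 0.01
mean_s, mean_t_tot = average_s_and_ttot(5, 50, mu, 2000, random.Random(7))
print("mean S:         ", mean_s)
print("mu * mean Ttot: ", mu * mean_t_tot)
```

The two printed values should be close but not identical, since both are averages over a finite number of random genealogies.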
So now a simple coalescent model has been developed in which a neutral mutation process, governed by the mutation rate µ, is overlaid on random genealogies generated from a sample of alleles from the current generation.
[Figure: genealogy drawn against a time axis (in generations), with the times T(3), T(7), and T(8) marked.]
So what is the practical use of such a model? Model fitting - manipulating the parameters of the model to generate plausible theories based on collected data.
One Simple Example: The Effects of Population Size (N) on the Coalescent Model
According to the Coalescent Model, individuals sampled from a large population accumulate more polymorphisms because it takes longer, on average, to reach the MRCA.
[Figure: genealogies for large N and small N.]
Now apply these models to what is seen in the field to see which best fits the data: sequence data of individuals from African and European populations show that African populations have a significantly higher level of polymorphism. Thus, a plausible theory is that the current African population is derived from a larger ancestral population than the current European population.
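The dependence on N can be made concrete with the standard coalescent approximation for the expected total branch length. The sketch below assumes a haploid population of N gene copies with a sample much smaller than N, so that E(Ttot) ≈ 2N · (1 + 1/2 + ... + 1/(n-1)) generations and E(S) = µ · E(Ttot). The population sizes, sample size, and mutation rate are hypothetical values chosen only for illustration, not estimates for any real population.

```python
def expected_total_branch_length(n_sample, n_population):
    """Approximate E(Ttot) in generations for a haploid population.

    Assumes n_sample is much smaller than n_population.
    """
    harmonic = sum(1 / i for i in range(1, n_sample))
    return 2 * n_population * harmonic

def expected_segregating_sites(n_sample, n_population, mu):
    """E(S) = mu * E(Ttot) under neutral, infinite-site mutation."""
    return mu * expected_total_branch_length(n_sample, n_population)

# Hypothetical parameter values, for illustration only
sample_size = 10
mutation_rate = 1e-5
large_n = expected_segregating_sites(sample_size, 10_000, mutation_rate)
small_n = expected_segregating_sites(sample_size, 1_000, mutation_rate)

print("E(S), large N:", large_n)
print("E(S), small N:", small_n)
```

With the sample size and mutation rate held fixed, the expected number of polymorphic sites scales linearly with N in this approximation, which is the pattern the model-fitting argument above relies on.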