Grouping loci
Criteria
• Maximum two-point recombination fraction, e.g. $r_{ij} \le 0.40$
• Minimum LOD score $Z_{ij}$
• For n loci, there are n(n-1)/2 pairwise combinations to test, so some false positives are expected
• Significance threshold on the probability value $p_{ij}$, e.g. $p_{ij} \le 0.00001$
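The effect of the n(n-1)/2 pairwise tests on false positives can be sketched as follows. This is a minimal illustration, assuming every pair is unlinked and each test has the same false-positive probability; the function names are hypothetical and not taken from any mapping package.

```python
def pairwise_tests(n_loci: int) -> int:
    """Number of two-point tests among n loci: n(n-1)/2."""
    return n_loci * (n_loci - 1) // 2


def expected_false_positives(n_loci: int, p_threshold: float) -> float:
    """Expected number of pairs declared linked by chance alone.

    Assumption (for illustration only): all pairs are unlinked and each
    test yields a false positive with probability p_threshold.
    """
    return pairwise_tests(n_loci) * p_threshold


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pairwise_tests(n), expected_false_positives(n, 1e-5))
```

A stringent threshold such as 0.00001 keeps the expected number of chance linkages small even when thousands of pairs are tested.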
Locus ordering • Ideally, we would estimate the likelihoods for all possible orders and take the one that is most probable by comparing log likelihoods • That is computationally inefficient when there are more than ~10 loci • Several methods have been proposed for producing a preliminary order
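Locus ordering
• Number of orders among k loci: $k!/2$
• Number of triplets among k loci: $\binom{k}{3} = k(k-1)(k-2)/6$
Three-point Analysis
• Number of unique orders among k loci: $k!/2$
• For three loci (k = 3): $3!/2 = 3$ unique orders (A-B-C, A-C-B, B-A-C)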
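How quickly the number of orders grows can be checked with a short sketch. It assumes the usual counts, k!/2 unique orders (an order and its reverse are the same map) and k-choose-3 triplets; the helper names are hypothetical.

```python
from itertools import permutations
from math import comb, factorial


def n_orders(k: int) -> int:
    """Unique locus orders among k loci, counting an order and its reverse once."""
    return factorial(k) // 2 if k >= 2 else 1


def n_triplets(k: int) -> int:
    """Number of distinct three-locus subsets among k loci."""
    return comb(k, 3)


def unique_orders(loci: str) -> list:
    """Enumerate orders explicitly, skipping any order whose reverse was already kept."""
    kept = set()
    result = []
    for perm in permutations(loci):
        if perm[::-1] not in kept:
            kept.add(perm)
            result.append("".join(perm))
    return result


if __name__ == "__main__":
    print(unique_orders("ABC"))
    for k in (3, 5, 10, 15):
        print(k, n_orders(k), n_triplets(k))
```

Exhaustive comparison is practical for a handful of loci, while the triplet count grows only cubically, which is one reason three-point analysis is used to build preliminary orders.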
Non-Additivity of recombination frequencies
[Figure: loci A, B and C along a chromosome, with the intervals labeled rAB, rBC and rAC.]
The recombination frequency over the interval A-C ($r_{AC}$) is less than the sum of $r_{AB}$ and $r_{BC}$: $r_{AC} < r_{AB} + r_{BC}$. This is because (rare) double recombination events (a recombination in both A-B and B-C) do not contribute to recombination between A and C.
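Non-Additivity of recombination frequencies
Gamete class probabilities (first index: recombination in A-B; second index: recombination in B-C):
• $P_{00} = (1 - r_{AB})(1 - r_{BC})$
• $P_{10} = r_{AB}(1 - r_{BC})$
• $P_{01} = (1 - r_{AB})\, r_{BC}$
• $P_{11} = r_{AB}\, r_{BC}$
$r_{AC} = r_{AB}(1 - r_{BC}) + (1 - r_{AB})\, r_{BC}$
$r_{AC} = r_{AB} + r_{BC} - 2\, r_{AB}\, r_{BC}$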
Interference
• Interference means that recombination events in adjacent intervals are not independent. The occurrence of an event in a given interval may reduce or enhance the occurrence of an event in its neighbourhood.
• Positive interference refers to the ‘suppression’ of recombination events in the neighbourhood of a given one.
• Negative interference refers to the opposite: enhancement of clusters of recombination events.
• Positive interference results in fewer double recombinants (over adjacent intervals) than expected on the basis of independence of recombination events.
With C the coefficient of coincidence: $r_{AC} = r_{AB} + r_{BC} - 2\, C\, r_{AB}\, r_{BC}$
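The non-additivity formula, with and without interference, can be illustrated with a small sketch. The function and parameter names are hypothetical; the class probabilities assume independent recombination in the two intervals (C = 1).

```python
def gamete_class_probs(r_ab: float, r_bc: float) -> dict:
    """Probabilities of the four recombination classes under independence (C = 1)."""
    return {
        "P00": (1 - r_ab) * (1 - r_bc),  # no recombination in either interval
        "P10": r_ab * (1 - r_bc),        # recombination in A-B only
        "P01": (1 - r_ab) * r_bc,        # recombination in B-C only
        "P11": r_ab * r_bc,              # double recombinant
    }


def r_ac(r_ab: float, r_bc: float, coincidence: float = 1.0) -> float:
    """r_AC = r_AB + r_BC - 2 * C * r_AB * r_BC."""
    return r_ab + r_bc - 2.0 * coincidence * r_ab * r_bc


if __name__ == "__main__":
    probs = gamete_class_probs(0.1, 0.2)
    # Only the single-recombinant classes are recombinant between A and C.
    print(probs["P10"] + probs["P01"], r_ac(0.1, 0.2))
    # Complete interference (C = 0) makes the frequencies additive.
    print(r_ac(0.1, 0.2, coincidence=0.0))
```

Setting the coincidence to 0 removes the double-recombinant term, so the interval frequencies simply add.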
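Interference
[Figure: homologous chromosomes A B C / a b c.]
Coefficient of coincidence
• C = coefficient of coincidence = (observed number of double crossovers) / (expected number of double crossovers)
• Expected number of double crossovers = $r_{AB}\, r_{BC}\, N$
• Interference: I = 1 - C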
DH population, N = 100, locus order ABC
Observed counts: 14, 24, 10, 22, 16, 4, 8, 2
Interference
• No interference: C = 1 and I = 1 - C = 0
• Complete interference: C = 0 and I = 1 - C = 1
• Negative interference: C > 1 and I = 1 - C < 0
• Positive interference: C < 1 and I = 1 - C > 0
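These cases can be computed directly from counts. The sketch below takes the coefficient of coincidence as observed double crossovers divided by the expected number $r_{AB} r_{BC} N$; the function names and the numbers in the example are illustrative assumptions, not data from the slides.

```python
def coefficient_of_coincidence(observed_dc: float, r_ab: float, r_bc: float, n: int) -> float:
    """C = observed double crossovers / expected double crossovers (r_AB * r_BC * N)."""
    expected_dc = r_ab * r_bc * n
    return observed_dc / expected_dc


def interference(c: float) -> float:
    """I = 1 - C."""
    return 1.0 - c


def describe(c: float, tol: float = 1e-9) -> str:
    """Name the interference regime for a given coefficient of coincidence."""
    if abs(c - 1.0) <= tol:
        return "no interference"
    if abs(c) <= tol:
        return "complete interference"
    return "negative interference" if c > 1.0 else "positive interference"


if __name__ == "__main__":
    # Hypothetical example: 30 observed double crossovers among 1000 gametes.
    c = coefficient_of_coincidence(30, r_ab=0.2, r_bc=0.3, n=1000)
    print(c, interference(c), describe(c))
```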
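Three locus analysis, DH population
For the ABC locus order, the eight genotype classes are labeled, in table order: NR, SC2, DC12, SC1, SC1, DC12, SC2, NR
(NR = non-recombinant; SC1, SC2 = single crossover in interval 1 or interval 2; DC12 = double crossover in intervals 1 and 2)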
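MLE of two-locus recombination fractions
For the ABC locus order:
• $\hat{r}_{AB} = (SC1 + DC12)/N$
• $\hat{r}_{BC} = (SC2 + DC12)/N$
• $\hat{r}_{AC} = (SC1 + SC2)/N$
The two-point MLEs of r are the same regardless of the assumed locus order.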
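The two-point estimates can be sketched from pooled class counts. The code assumes the standard counting estimators for a DH population under the ABC order (recombinants between two loci divided by N); the counts in the example are hypothetical, not the observed counts from the slides.

```python
def two_point_mles(nr: int, sc1: int, sc2: int, dc12: int) -> dict:
    """Counting estimators of r from pooled classes, assuming locus order A-B-C.

    nr   : non-recombinants
    sc1  : single crossovers in interval 1 (A-B)
    sc2  : single crossovers in interval 2 (B-C)
    dc12 : double crossovers (both intervals)
    """
    n = nr + sc1 + sc2 + dc12
    return {
        "r_AB": (sc1 + dc12) / n,  # any crossover in A-B
        "r_BC": (sc2 + dc12) / n,  # any crossover in B-C
        "r_AC": (sc1 + sc2) / n,   # double crossovers restore the A-C parental phase
    }


if __name__ == "__main__":
    # Hypothetical pooled counts for 100 DH lines.
    print(two_point_mles(nr=70, sc1=18, sc2=10, dc12=2))
```

Double crossovers count toward both adjacent intervals but not toward the flanking pair, which is where the non-additivity of recombination frequencies comes from.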
Ordering Loci by Minimizing Double Crossovers
[Figure: chromosomes B A C and b a c with two crossovers (X X), giving the double recombinants B a C and b A c.]
• The rarest genotypes are the double recombinants
• The order of loci is B-A-C
Ordering Loci by using recombination fractions
• The largest r is $r_{BC}$ = 0.3, so B and C are the outer loci: B ... C
• The smallest r is $r_{AC}$ = 0.1, so A lies next to C
• Order: B A C
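For three loci, this rule (the pair with the largest r flanks the third locus) can be sketched as below. The values of $r_{BC}$ and $r_{AC}$ are the ones quoted on the slide; the value used for $r_{AB}$ is an assumption made only to complete the example, and the function name is hypothetical.

```python
def order_three_loci(r: dict) -> str:
    """Order three loci so that the pair with the largest r is on the outside.

    r maps two-letter pair names (e.g. "AB") to recombination fractions.
    """
    loci = sorted(set("".join(r)))
    outer_pair = max(r, key=r.get)              # most distant pair flanks the map
    middle = [l for l in loci if l not in outer_pair][0]
    left, right = sorted(outer_pair)
    return left + middle + right


if __name__ == "__main__":
    # r_BC and r_AC from the slide; r_AB = 0.22 is an assumed value.
    print(order_three_loci({"AB": 0.22, "BC": 0.30, "AC": 0.10}))
```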
Minimum Sum of Adjacent Recombination Frequencies (SARF) (Falk 1989)
$\mathrm{SARF} = \sum_{i=1}^{l-1} r_{a_i a_{i+1}}$
r = recombination frequency between adjacent loci $a_i$ and $a_j$ ($j = i + 1$) for a given order: 1, 2, 3, …, l-1, l
• The B-A-C order gives MIN[SARF] and the minimum distance (MD) map
• Simulations have shown that SARF is a reliable method for obtaining marker orders for large datasets
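A brute-force version of the SARF criterion might look like the following sketch. It is an exhaustive search that is only practical for a few loci, not the search strategy of any particular package; $r_{AB}$ = 0.22 is an assumed value, and the function names are hypothetical.

```python
from itertools import permutations


def sarf(order, r) -> float:
    """Sum of recombination frequencies between adjacent loci in the given order."""
    return sum(r[frozenset(pair)] for pair in zip(order, order[1:]))


def best_order_by_sarf(loci, r):
    """Exhaustively search all orders (one per mirror-image pair) for minimum SARF."""
    candidates = [p for p in permutations(loci) if p[0] < p[-1]]
    return min(candidates, key=lambda order: sarf(order, r))


if __name__ == "__main__":
    # r_BC and r_AC as quoted in the slides; r_AB is assumed for illustration.
    r = {
        frozenset("AB"): 0.22,
        frozenset("BC"): 0.30,
        frozenset("AC"): 0.10,
    }
    for order in (("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C")):
        print("".join(order), sarf(order, r))
    print("".join(best_order_by_sarf("ABC", r)))
```

Replacing the sum with a product gives the PARF criterion, and replacing r with LOD scores and `min` with `max` gives SALOD.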
Minimum Product of Adjacent Recombination Frequencies (PARF) (Wilson 1988)
$\mathrm{PARF} = \prod_{i=1}^{l-1} r_{a_i a_{i+1}}$
r = recombination frequency between adjacent loci $a_i$ and $a_j$ ($j = i + 1$) for a given order: 1, 2, 3, …, l-1, l
• The B-A-C order gives MIN[PARF] and the minimum distance (MD) map
• SARF and PARF are equivalent methods for obtaining marker orders for large datasets
Maximum Sum of Adjacent LOD Scores (SALOD)
$\mathrm{SALOD} = \sum_{i=1}^{l-1} Z_{a_i a_{i+1}}$
Z = LOD score for the recombination frequency between adjacent loci $a_i$ and $a_j$ ($j = i + 1$) for a given order: 1, 2, 3, …, l-1, l
• The B-A-C order gives MAX[SALOD]
• SALOD is sensitive to locus informativeness
Minimum Count of Crossover Events (COUNT) (Van Os et al. 2005)
$\mathrm{COUNT} = \sum_{i=1}^{l-1} X_{a_i a_{i+1}}$
X = simple count of recombination events between adjacent loci $a_i$ and $a_j$ ($j = i + 1$) for a given sequence: 1, 2, 3, …, l-1, l
• The B-A-C order gives MIN[COUNT]
• COUNT is equivalent to SARF and PARF with perfect data
• COUNT is superior to SARF with incomplete data
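The COUNT criterion can be sketched on raw genotype data. The example assumes complete DH data coded by parental origin (0 or 1) at each locus; the five lines are hypothetical, and the sketch does not reproduce the search procedure or missing-data handling of RECORD.

```python
def count_crossovers(order: str, lines: list) -> int:
    """Total recombination events implied by an order.

    Each line is a dict mapping locus name to parental origin (0 or 1).
    A crossover is counted whenever adjacent loci differ in origin.
    """
    total = 0
    for line in lines:
        for left, right in zip(order, order[1:]):
            if line[left] != line[right]:
                total += 1
    return total


if __name__ == "__main__":
    # Hypothetical DH lines (no missing scores).
    lines = [
        {"A": 0, "B": 0, "C": 0},
        {"A": 1, "B": 1, "C": 1},
        {"A": 0, "B": 1, "C": 0},
        {"A": 1, "B": 0, "C": 1},
        {"A": 0, "B": 0, "C": 1},
    ]
    for order in ("ABC", "ACB", "BAC"):
        print(order, count_crossovers(order, lines))
```

With complete data, dividing each adjacent-pair count by the number of lines gives the recombination fractions, so minimizing COUNT and minimizing SARF pick the same order.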
Locus Order - Likelihood Approach
• $r_1$ = recombination fraction in interval 1
• $r_2$ = recombination fraction in interval 2
• C = coefficient of coincidence
• $p_i = f_i / n$
• $f_i$ = expected frequency of the ith pooled phenotypic class, i = 1, 2, …, k
• k = number of pooled phenotypic classes
The three candidate orders: ABC, BAC, ACB
Likelihood method
• The B-A-C order gives the highest likelihood and LOD under a no-interference (C = 1) model
• Most multipoint ML mapping algorithms use no-interference models
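A likelihood comparison of the three orders under the no-interference model can be sketched as follows. The genotype counts are hypothetical, genotypes are coded as strings of parental origins at loci A, B and C, and the LOD is taken against the free-recombination model ($r_1 = r_2 = 0.5$); none of the function names come from an existing package.

```python
from math import log10

LOCI = "ABC"


def pooled_classes(order: str, counts: dict) -> dict:
    """Pool genotype counts into (crossover in interval 1, crossover in interval 2)."""
    i, j, k = (LOCI.index(locus) for locus in order)
    pooled = {(0, 0): 0, (1, 0): 0, (0, 1): 0, (1, 1): 0}
    for genotype, n in counts.items():
        x1 = int(genotype[i] != genotype[j])
        x2 = int(genotype[j] != genotype[k])
        pooled[(x1, x2)] += n
    return pooled


def log10_likelihood(order: str, counts: dict) -> float:
    """Maximized log10 likelihood for one order, assuming no interference (C = 1)."""
    f = pooled_classes(order, counts)
    n = sum(f.values())
    r1 = (f[(1, 0)] + f[(1, 1)]) / n  # MLE for interval 1 when C = 1
    r2 = (f[(0, 1)] + f[(1, 1)]) / n  # MLE for interval 2 when C = 1
    p = {
        (0, 0): (1 - r1) * (1 - r2),
        (1, 0): r1 * (1 - r2),
        (0, 1): (1 - r1) * r2,
        (1, 1): r1 * r2,
    }
    return sum(c * log10(p[key]) for key, c in f.items() if c > 0)


def lod(order: str, counts: dict) -> float:
    """LOD versus free recombination, where every pooled class has probability 1/4."""
    n = sum(counts.values())
    return log10_likelihood(order, counts) - n * log10(0.25)


if __name__ == "__main__":
    # Hypothetical counts for 100 DH lines; keys give parental origin at A, B, C.
    counts = {"000": 35, "111": 34, "010": 11, "101": 10,
              "001": 5, "110": 4, "011": 1}
    for order in ("ABC", "BAC", "ACB"):
        print(order, round(log10_likelihood(order, counts), 3), round(lod(order, counts), 3))
    print(max(("ABC", "BAC", "ACB"), key=lambda o: log10_likelihood(o, counts)))
```

The same data are pooled differently under each order, so the order that turns the rarest class into the double recombinants receives the highest likelihood.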
Ordering Loci
• GMENDEL (Liu and Knapp 1990) minimizes SARF (sum of adjacent recombination frequencies)
• PGRI (Lu and Liu 1995) minimizes SARF or maximizes the likelihood
• RECORD (Van Os et al. 2005) minimizes COUNT (count of crossover events)
Ordering Loci
• JoinMap 4 (Van Ooijen, 2005) offers two approaches:
• Regression mapping: finds the least-squares locus order using a stepwise search
• Monte Carlo maximum likelihood (ML) mapping: very fast computation of high-density maps