Describe parametric linkage analysis and how to determine a Lod score. Mike Oldridge, FRCPath, 17/12/2010
Linkage Analysis • Linkage is the tendency of characters to co-segregate in a pedigree because their determinants lie close together on a particular chromosome • Recombination between two genetic loci occurs at a rate related to the distance between two loci – the closer they are, the less likely recombination will occur between them and the more likely they will be inherited together • Follow the segregation of a genetic marker with the disease trait within the pedigree • Score meioses in a pedigree as either recombinant or non-recombinant • Calculate likelihood of the observed result fitting with the hypothesis of linkage of the marker to the disease
Recombination • At meiosis I, chromosomes have replicated but remain attached as sister chromatids. Crossing over (recombination) takes place when homologous chromosomes pair up • Involves breaking the DNA helix in one sister chromatid from each homologue and exchanging chromosomal segments in a reciprocal fashion to form two new chromatids • Each pair of homologues is thought to undergo at least one recombination event, and sometimes several, during meiosis
Recombination fraction (θ) • The recombination fraction θ is the probability of recombination occurring between 2 loci, e.g. θ = 0.2 is equivalent to 20% of meioses being recombinant • If loci are identical, θ = 0, as there can be no recombination between them • If loci are far apart on the same chromosome, θ = 0.5: they are indistinguishable from loci on different chromosomes, as they segregate independently and there is a 50% probability that they will be inherited together • The maximum value for θ is 0.5 • 2 loci which show 1% recombination between them (θ = 0.01) are defined as being 1 centimorgan (cM) apart on a genetic map • Genetic distance is related to physical distance, but the correlation between the 2 varies across the genome because recombination rates differ between regions
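As a rough illustration of the definitions above, the sketch below estimates θ from scored meioses and converts it to an approximate map distance. The function names are hypothetical, and the 1% = 1 cM conversion is only assumed to hold for small θ, as in the slide's definition; capping the estimate at 0.5 is also an assumption made here because θ cannot exceed 0.5.

```python
def recombination_fraction(recombinants: int, nonrecombinants: int) -> float:
    """Estimate theta as the proportion of informative meioses that are recombinant."""
    total = recombinants + nonrecombinants
    if total == 0:
        raise ValueError("at least one informative meiosis is required")
    # theta has a maximum of 0.5 (free recombination), so cap the estimate there.
    return min(recombinants / total, 0.5)


def approx_map_distance_cm(theta: float) -> float:
    """Convert theta to centimorgans using 1% recombination = 1 cM.

    Assumption: only a reasonable approximation for small theta.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    return theta * 100.0


# Example: 1 recombinant and 5 non-recombinants among 6 informative meioses.
theta_hat = recombination_fraction(recombinants=1, nonrecombinants=5)
print(f"estimated theta = {theta_hat:.3f}")
print(f"theta = 0.01 corresponds to about {approx_map_distance_cm(0.01):.1f} cM")
```

With so few meioses the estimate is very imprecise, which is why the lod score approach is used to weigh the evidence rather than relying on the raw proportion.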
Genetic Markers • To detect recombination, informative meioses are required, i.e. it must be possible to tell with certainty whether each meiosis is recombinant or non-recombinant for the marker and the disease • Polymorphic markers of known chromosomal location, spread evenly throughout the genome (<20 cM apart) • RFLPs – usually only 2 polymorphic alleles; laborious to type (restriction enzyme digest, Southern blotting) • Microsatellites – di-, tri- and tetra-nucleotide repeats: multiple polymorphic alleles (highly informative), abundant, evenly spaced, easily typed using multiplex PCR • SNPs – only 2 polymorphic alleles (individually less informative) but present in vast numbers and densely packed, so collectively highly informative. Easily typed using microarrays (500,000 SNPs in one operation)
Understanding phase • A. Phase known: II-1 has inherited A1 from I-2 along with the disease trait, so the disease can be assumed to be linked to the A1 allele. III-1 to III-5 can now be identified as non-recombinants and III-6 as a recombinant • B. Phase unknown: it is unknown whether the disease trait is linked to A1 or A2 in II-1. Either it is linked to A1, in which case III-1 to III-5 are non-recombinants and III-6 is a recombinant, or it is linked to A2, in which case III-1 to III-5 are recombinants and III-6 is a non-recombinant [Figure: pedigrees A and B, with offspring III-1 to III-6 labelled NR (non-recombinant) or R (recombinant) under each possible phase]
Parametric Linkage Analysis • Parametric linkage analysis is standard Lod score analysis • Requires a precise genetic model: • Mode of inheritance • Disease allele frequencies • Penetrance of each genotype • Powerful method for determining linkage in single gene Mendelian disorders
Lod scores • Statistical measure of the likelihood of genetic linkage between 2 loci • Lod = logarithm of odds
$$Z(\theta) = \log_{10} \frac{\text{likelihood if the loci are linked with recombination fraction } \theta}{\text{likelihood if the loci are unlinked } (\theta = 0.5)}$$
• Lod scores are calculated across a range of θ from 0 to 0.5 and the maximum value determined (Zmax)
Lod scores • Z > 3 indicates statistically significant linkage • Z = 3 implies odds of 1000:1 in favour of linkage over non-linkage • However, the prior probability that 2 randomly chosen loci are linked is only ~1/50 • Therefore Z = 3 gives overall odds of about 20:1 (1000/50), which corresponds to the conventional threshold of significance, p = 0.05 • Z < -2 indicates statistically significant exclusion of linkage • As Lod scores are logarithmic, scores from different families can be added together (assuming the same disease locus!) • The simplest use of Lod scores is to measure the likelihood of linkage of the disease locus to each polymorphic marker in turn: two-point linkage analysis
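The arithmetic behind these thresholds can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the prior of 1/50 is applied as a simple multiplier exactly as in the slide's 1000/50 calculation, and the family lod scores in the example are invented values assumed to have been calculated at the same θ.

```python
def odds_from_lod(z: float) -> float:
    """A lod score is log10 of the odds, so the odds are 10 ** Z."""
    return 10.0 ** z


def overall_odds(z: float, prior_linked: float = 1 / 50) -> float:
    """Scale the odds by the prior chance that 2 random loci are linked (~1/50)."""
    return odds_from_lod(z) * prior_linked


def combined_lod(family_lods: list[float]) -> float:
    """Lod scores are logarithms, so scores from different families are summed.

    Assumptions: every family shares the same disease locus and every score
    was calculated at the same value of theta.
    """
    return sum(family_lods)


def interpret(z: float) -> str:
    """Apply the conventional thresholds: Z > 3 linkage, Z < -2 exclusion."""
    if z > 3.0:
        return "significant linkage"
    if z < -2.0:
        return "linkage excluded"
    return "inconclusive"


print(overall_odds(3.0))                 # 1000:1 odds scaled by the 1/50 prior
family_lods = [1.2, 0.9, 1.1]            # hypothetical families, same theta
total = combined_lod(family_lods)
print(f"combined Z = {total:.1f}: {interpret(total)}")
```

In this made-up example no single family reaches the threshold alone, but the summed score does, which is the practical reason for pooling families.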
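Two point analysis [Figure: the phase-known and phase-unknown pedigrees, with offspring labelled NR or R; NR = 5 and R = 1 if the disease is linked to A1]
Phase known:
$$Z(\theta) = \log_{10} \frac{(1-\theta)^{\mathrm{NR}}\,\theta^{\mathrm{R}}}{0.5^{\mathrm{NR}+\mathrm{R}}}$$
e.g. at θ = 0.1: Z = log10 {[(0.9)^5 × (0.1)^1] / (0.5)^6} = 0.577
Phase unknown:
$$Z(\theta) = \log_{10} \frac{\tfrac{1}{2}\left[(1-\theta)^{\mathrm{NR}}\,\theta^{\mathrm{R}} + (1-\theta)^{\mathrm{R}}\,\theta^{\mathrm{NR}}\right]}{0.5^{\mathrm{NR}+\mathrm{R}}}$$
e.g. at θ = 0.1: Z = log10 {½ [(0.9)^5 × (0.1)^1 + (0.9)^1 × (0.1)^5] / (0.5)^6} = 0.276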
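The two-point calculation for this small pedigree can be reproduced with a short script. This is a minimal sketch, not what MLINK or any other linkage package does: the function names are hypothetical, and the grid search over θ in steps of 0.001 is just one simple way of finding Zmax.

```python
import math


def _log10_ratio(likelihood_linked: float, n_meioses: int) -> float:
    """log10 of the likelihood under linkage over the likelihood at theta = 0.5."""
    if likelihood_linked == 0.0:
        return float("-inf")  # the data are impossible at this theta
    return math.log10(likelihood_linked / 0.5 ** n_meioses)


def lod_phase_known(theta: float, nr: int, r: int) -> float:
    """Z = log10[(1 - theta)^NR * theta^R / 0.5^(NR + R)]."""
    return _log10_ratio((1 - theta) ** nr * theta ** r, nr + r)


def lod_phase_unknown(theta: float, nr: int, r: int) -> float:
    """Average the likelihoods of the two equally likely phases."""
    phase_1 = (1 - theta) ** nr * theta ** r   # disease linked to A1
    phase_2 = (1 - theta) ** r * theta ** nr   # disease linked to A2
    return _log10_ratio(0.5 * (phase_1 + phase_2), nr + r)


def max_lod(lod_function, nr: int, r: int, steps: int = 500):
    """Scan theta from 0 to 0.5 and return (theta, Zmax)."""
    thetas = [0.5 * i / steps for i in range(steps + 1)]
    best_theta = max(thetas, key=lambda t: lod_function(t, nr, r))
    return best_theta, lod_function(best_theta, nr, r)


# Pedigree from the slide: 5 non-recombinants and 1 recombinant.
print(f"phase known,   theta = 0.1: Z = {lod_phase_known(0.1, 5, 1):.3f}")    # 0.577
print(f"phase unknown, theta = 0.1: Z = {lod_phase_unknown(0.1, 5, 1):.3f}")  # 0.276
theta_hat, z_max = max_lod(lod_phase_known, 5, 1)
print(f"phase known: Zmax = {z_max:.3f} at theta = {theta_hat:.3f}")
```

Not knowing the phase roughly halves the evidence here, and for the phase-known case the maximum falls near θ = 1/6, the observed proportion of recombinants.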
Two point analysis [Figure: graphs of lod score against recombination fraction from a hypothetical set of linkage experiments.] • Curve 1: evidence of linkage (Z > 3) with no recombinants • Curve 2: evidence of linkage (Z > 3) with the most likely recombination fraction being 0.23 • Curve 3: linkage excluded (Z < -2) for recombination fractions below 0.12; inconclusive for larger recombination fractions • Curve 4: inconclusive at all recombination fractions
Multipoint analysis • Process by which the disease locus is tested across a framework of markers and its most likely position calculated • More information can sometimes be extracted than from two-point mapping, as informative meioses at nearby markers can give information about uninformative meioses when they are analysed together • Multipoint mapping is also a powerful tool for excluding loci • E.g. a meiosis is uninformative for one marker but informative, and recombinant, for the nearby flanking markers on either side. Either the meiosis is also recombinant for the middle marker, or a double recombination event has occurred and it is non-recombinant. The closer together the flanking markers are, the more likely it is that the meiosis is also recombinant for the middle marker. This can dramatically reduce the Lod score and eliminate false positive results
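The flanking-marker argument can be put into numbers with a small sketch. It rests on assumptions that are not in the slides: recombination in the two intervals either side of the middle marker is treated as independent (no crossover interference), and the function name and θ values are hypothetical. If both flanking markers are recombinant with the disease, the middle marker can only be non-recombinant if a recombination occurred in each of the two intervals.

```python
def prob_middle_nonrecombinant(theta_left: float, theta_right: float) -> float:
    """P(middle marker non-recombinant | both flanking markers recombinant).

    theta_left / theta_right: recombination fractions between the middle
    marker and the flanking marker on each side.
    Assumption: the two intervals recombine independently (no interference).
    """
    for theta in (theta_left, theta_right):
        if not 0.0 <= theta <= 0.5:
            raise ValueError("theta must lie between 0 and 0.5")
    # Flanking markers agree with each other, so either both intervals
    # recombined (double recombinant) or neither did.
    double_recombinant = theta_left * theta_right
    no_recombination = (1 - theta_left) * (1 - theta_right)
    return double_recombinant / (double_recombinant + no_recombination)


# Closer flanking markers make a double recombination event less plausible.
for theta in (0.10, 0.05, 0.01):
    p = prob_middle_nonrecombinant(theta, theta)
    print(f"flanking markers at theta = {theta:.2f}: P(non-recombinant) = {p:.5f}")
```

Under these assumptions the probability that the uninformative meiosis is really a non-recombinant shrinks rapidly as the flanking markers get closer, which is why such a meiosis counts against linkage in a multipoint analysis.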
Multipoint analysis [Figure: multipoint LOD score (y axis, -5 to 10) plotted against map distance in M (x axis, -0.4 to 1.2) for family 1, family 2, family 3 and families 1 + 2, with markers D9S1680, D9S1796, D9S1809, D9S1786, D9S257, D9S118 and D9S15 positioned along the map.]
Potential problems • Computer programs are needed for all but the most basic pedigrees, e.g. MLINK (2 point) and LINKMAP (multipoint) from the FASTLINK software package (Lathrop and Lalouel 1984) • Computational limits – vast amounts of processing power are needed for large, complex pedigrees • Errors in genotyping and misdiagnosis can generate spurious recombinants – may produce false negatives • Locus heterogeneity – Lod scores can only be added across families that share the same disease locus • A precise genetic model is required, including mode of inheritance, gene frequencies and penetrance
References • Strachan and Read. Human Molecular Genetics, 4th ed. • Lathrop GM, Lalouel JM (1984) Easy calculations of lod scores and genetic risks on small computers. Am J Hum Genet 36:460-465
Key words • Meioses • Recombination • Genetic markers • Recombination fraction θ • Linkage • Phase known/unknown • Lod scores • 2 point analyses • Multipoint analyses