
Mapping Genes in Complex Diseases: Insights from HLA Region Analysis

Explore the challenges in mapping multiple sclerosis (MS) susceptibility genes beyond the HLA region using traditional methods. Learn about linkage versus association studies, linkage disequilibrium, case-control studies, and the transmission-disequilibrium test in genetic epidemiology. Discover the importance of haplotype mapping and identifying associations between marker haplotypes and disease. Address issues of population structure and the need for controls in studies of genetic diseases.

maura

Presentation Transcript


    1. This leads me on to our first piece of work, which, as I mentioned previously, was conducted on the HLA region, a genomic region known to contain a gene or genes for MS.

    2. Mapping genes contributing to complex diseases

    3. MS susceptibility genes are difficult to map. MS is a complex disease. Analyses with traditional methods, such as single-marker association studies and standard linkage approaches (affected sib-pairs, pedigrees, etc.), have failed to agree on genomic regions other than the HLA region. There are several possible reasons for this: allelic and locus heterogeneity (no single gene model fits all); significant environmental influences; imprecise phenotyping.

    4. Linkage vs association studies. Linkage mapping tests for cosegregation of a marker allele with the disease within families. Association mapping seeks a marker allele that is present more frequently in cases than in controls; all affected individuals are treated as distant relatives. Two association designs are case/control studies and the transmission disequilibrium test (which needs triads). We will do a quick review of association mapping before turning to our MS study.

    5. Linkage disequilibrium. Suppose that we have a marker with just two alleles, M and m say, with frequencies p and 1-p, and a (not necessarily linked) disease locus with alleles D and d, with frequencies q and 1-q. A (haploid) gamete must carry one of the four combinations (haplotypes) DM, Dm, dM or dm. Let the frequencies of these four haplotypes in a population be x1, x2, x3 and x4. Under independence, we would have x1 = pq, etc. Deviations of the observed haplotype frequencies from these products are termed linkage disequilibrium (LD) or, better, gametic association. If inheriting the allele D at the disease locus increases the chance of getting the disease, and the disease and marker loci are in LD, then the frequencies of the marker alleles M and m will differ between diseased and non-diseased individuals. This observation is the basis of association studies.
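
As a minimal sketch of the definition above, the deviation of x1 from pq can be computed directly from the four haplotype frequencies. The function name, the symbol D for the deviation, and the example frequencies are assumptions made for illustration; they are not taken from the slides.

```python
def ld_coefficient(x1, x2, x3, x4):
    """Return (p, q, D) for haplotype frequencies of DM, Dm, dM, dm.

    p is the frequency of marker allele M, q the frequency of disease
    allele D, and D = x1 - p*q is the deviation from independence.
    """
    total = x1 + x2 + x3 + x4
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p = x1 + x3          # M is carried by haplotypes DM and dM
    q = x1 + x2          # D is carried by haplotypes DM and Dm
    deviation = x1 - p * q
    return p, q, deviation


# Hypothetical frequencies, chosen only to illustrate the calculation.
p, q, D = ld_coefficient(0.25, 0.05, 0.25, 0.45)
print(f"p = {p:.2f}, q = {q:.2f}, D = {D:.3f}")
```

The same quantity can be written as x1*x4 - x2*x3, so a value of zero corresponds to the independence case x1 = pq described on the slide.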

    6. Case-control studies in genetic epidemiology. Case-control studies compare case and control allele frequencies at markers or candidate genes (the “exposure” variables). All the standard potential drawbacks of such studies apply, with the similarity of the two base populations being the most critical here. Samples from racially mixed populations can easily differ in allele frequencies, and this is hard to deal with in the genetic context. Key term: population structure. If our cases are MS patients, who are our controls? It would be rare for a study to be able to afford, or get ethics approval for, random sampling of the relevant background population. More commonly, controls are people such as blood donors, whose blood (DNA) has been collected for other purposes. How close will they be to a random sample from the case population? In an effort to deal with this, the TDT, which follows, in effect uses untransmitted genotypes as controls, bypassing any population structure.
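
The comparison of case and control allele frequencies can be sketched as a chi-square test on a 2×2 table of allele counts. This is only one common way to make the comparison, not necessarily the one used in the study; the function name and the counts are hypothetical, and the test assumes the alleles are independent draws and does nothing to correct for population structure.

```python
import math


def allelic_chi_square(case_M, case_m, control_M, control_m):
    """Pearson chi-square (1 df) comparing allele counts in cases and controls.

    Returns (chi2, p_value). Assumes every row and column total is positive.
    """
    table = [[case_M, case_m], [control_M, control_m]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 100 cases and 100 controls, two alleles each.
chi2, p_value = allelic_chi_square(120, 80, 90, 110)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

A small p-value here says only that the allele frequencies differ between the two samples; if the controls are not drawn from the same base population as the cases, that difference may reflect population structure rather than disease association.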

    7. The transmission-disequilibrium test. The TDT, as it is called, in its simplest form starts with parents and an affected child, and considers a biallelic marker locus at which all three are typed, so that we can determine which maternal and paternal alleles were transmitted and which were not. For example, if the parents were a1/a2 and a1/a1, and the affected offspring was a1/a2, then a2 was transmitted and a1 was not transmitted by the first parent. From a random sample of such trios (called triads), a 2×2 table can be built up giving the number of times a1 and a2 were transmitted and were not transmitted, respectively, and a simple test can be derived. Many generalizations of this procedure now exist; see notes for Stat 260, 1998, Week 5.
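
The counting step described above can be sketched as follows. The slide does not spell out the "simple test"; the statistic used here, (b - c)^2 / (b + c) referred to a chi-square distribution with 1 degree of freedom, is the standard McNemar-type form of the TDT and is stated as an assumption. Function names and the data layout (tuples of allele labels) are hypothetical.

```python
import math


def tdt_counts(trios, a1="a1", a2="a2"):
    """Count transmissions from heterozygous parents at a biallelic marker.

    Each trio is (parent1, parent2, child), each a 2-tuple of allele labels.
    Returns (b, c): the number of times heterozygous parents transmitted
    a1 and a2, respectively. Homozygous parents are uninformative.
    """
    b = c = 0
    for parent1, parent2, child in trios:
        parents = (parent1, parent2)
        n_het = sum(1 for g in parents if set(g) == {a1, a2})
        n_hom_a1 = sum(1 for g in parents if g.count(a1) == 2)
        # Copies of a1 in the child not explained by a1/a1 parents
        # must have come from the heterozygous parents.
        a1_from_het = child.count(a1) - n_hom_a1
        if not 0 <= a1_from_het <= n_het:
            raise ValueError(f"Mendelian inconsistency in trio {parent1, parent2, child}")
        b += a1_from_het
        c += n_het - a1_from_het
    return b, c


def tdt_statistic(b, c):
    """Return (statistic, p_value) for the McNemar-type TDT."""
    if b + c == 0:
        raise ValueError("no heterozygous parents, test is undefined")
    stat = (b - c) ** 2 / (b + c)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# The slide's example: the a1/a2 parent transmits a2 and does not transmit a1.
print(tdt_counts([(("a1", "a2"), ("a1", "a1"), ("a1", "a2"))]))
```

Only heterozygous parents contribute to the counts, which is why the test is insensitive to allele-frequency differences between subpopulations.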

    8. 8

    9. Haplotype mapping. If alleles at a disease locus are associated with alleles at one nearby marker locus on gametes, they are likely to be associated with alleles at other nearby marker loci, and hence with marker haplotypes. A potentially more powerful way to locate disease genes is to search for associations between marker haplotypes and disease. There are two possible problems here, both stemming from the fact that there can be a very large number of marker haplotypes: we may have to deal with very small frequencies, and we have a multiple testing problem.

    10. Searching for common or rare haplotypes in cases alone is one form of association mapping. It has been successful, as very substantial LD can arise around disease loci. In general, controls are necessary because the background LD can be large. That is, there can be substantial LD between putative disease gene alleles and alleles of nearby markers without there being any causal link between the gene and the disease. We call this background LD. Background LD can be large: when the population is young; when the number of founders is small (bottlenecks); through admixture of populations.

    11. Exercises on LD. (1) Show that, under a random mating assumption, the long-term values of the frequencies x1, x2, x3 and x4 on slide 5 above are pq, (1-p)q, p(1-q) and (1-p)(1-q) (Week 5, Stat 260, 1998). (2) Demonstrate that a mixture (e.g. 50:50) of two populations, each initially in linkage equilibrium at two loci, will typically not be in LE. (3) Explain why a single mutant arising by chance will initially be in strong LD with alleles at loci near the locus on which it arises.
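
A numerical illustration of the second exercise, offered as a sketch rather than a worked solution: two populations are each put in linkage equilibrium, mixed, and the disequilibrium of the mixture is computed. The allele frequencies are hypothetical, and the helper names are not from the slides.

```python
def le_haplotypes(p, q):
    """Haplotype frequencies (DM, Dm, dM, dm) under linkage equilibrium,
    with p the frequency of M and q the frequency of D."""
    return (p * q, (1 - p) * q, p * (1 - q), (1 - p) * (1 - q))


def mix(pop_a, pop_b, weight_a=0.5):
    """Haplotype frequencies in a mixture with a fraction weight_a from pop_a."""
    return tuple(weight_a * a + (1 - weight_a) * b for a, b in zip(pop_a, pop_b))


def disequilibrium(haplotypes):
    """D = x1*x4 - x2*x3, which is zero exactly when x1 = p*q."""
    x1, x2, x3, x4 = haplotypes
    return x1 * x4 - x2 * x3


# Hypothetical populations that differ in allele frequency at both loci.
pop_a = le_haplotypes(p=0.9, q=0.9)
pop_b = le_haplotypes(p=0.1, q=0.1)
mixture = mix(pop_a, pop_b)

print(f"D in population A: {disequilibrium(pop_a):.3f}")
print(f"D in population B: {disequilibrium(pop_b):.3f}")
print(f"D in 50:50 mixture: {disequilibrium(mixture):.3f}")
```

The mixture shows disequilibrium whenever the two populations differ in allele frequency at both loci; if they agree at either locus, the mixture remains in equilibrium, which is why the exercise says "typically".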

    12. Mapping MS genes in Tasmania

    13. 13

    14. Tasmanian Population Growth

    15. There are a number of reasons why Tassie is a good place to study the genetics of MS:

    16. 16

    17. What might have happened in the population?

    18. Design of the Tasmanian MS study

    19. What strategy could be used to map MS susceptibility genes in Tasmania? There are too few affected sib pairs and multiplex families for a conventional linkage approach; we prefer a model-free (non-parametric) approach; a haplotype-based case-control study design seemed appropriate.

    20. MS study in Tasmania: design. (1) Collect as many MS cases with ancestral links to Tasmania as possible, and a suitable (not necessarily equal) number of similar, socioeconomically and geographically matched, unrelated controls. (2) Around each case and each control, collect a constellation of about 4 close relatives for (probabilistic) haplotype reconstruction. (3) Infer genome-wide haplotypes for all cases and controls. (4) Carry out a case/control study with the haplotypes, seeking regions of the genome shared more by the cases than by the controls.

    21. Analysis options

    22. First mathematical questions: the resolution of the genome-wide scan (length of likely shared chromosomal segments); the nature and number of relatives needed to permit the reconstruction of accurate haplotypes with high probability.

    23. Average length of shared chromosomal segments. Exercise. Assume the Poisson model for crossovers along a chromosome. What are the mean and variance of the length in cM of the chromosomal segments shared by individuals with a common ancestor 7 generations back?
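
One way to approach this exercise, sketched under assumptions the slide does not state: two individuals with a common ancestor 7 generations back are taken to be separated by 14 meioses; crossovers from those meioses are taken to fall as a Poisson process at rate 14 per Morgan; and the segment of interest is taken to be the one containing a fixed locus, so its length is the sum of two independent exponential distances to the nearest crossover on each side. Chromosome ends and interference are ignored. Other readings of the exercise give different answers.

```python
import random


def shared_segment_moments(meioses):
    """Mean and variance (cM, cM^2) of the shared segment around a fixed locus.

    Under the stated assumptions the length is Gamma(shape=2, rate),
    with rate = meioses / 100 crossovers per cM.
    """
    rate = meioses / 100.0
    return 2.0 / rate, 2.0 / rate ** 2


def simulate_segment_lengths(meioses, n, seed=0):
    """Draw n segment lengths (cM) as the sum of two exponential distances."""
    rng = random.Random(seed)
    rate = meioses / 100.0
    return [rng.expovariate(rate) + rng.expovariate(rate) for _ in range(n)]


meioses = 2 * 7  # assumption: 7 generations up to the ancestor and 7 down
mean, variance = shared_segment_moments(meioses)
lengths = simulate_segment_lengths(meioses, n=100_000)
sim_mean = sum(lengths) / len(lengths)
print(f"theory: mean = {mean:.1f} cM, variance = {variance:.1f} cM^2")
print(f"simulated mean = {sim_mean:.1f} cM")
```

Under these assumptions the typical shared segment is on the order of 10 to 20 cM, which gives a rough sense of the marker spacing a genome-wide scan would need.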

    24. Nature and number of relatives needed to give accurate haplotypes. Exercise. Explain why, when we have both sets of parental genotypes and the markers are reasonably polymorphic, we can reconstruct an individual’s haplotypes with high probability. What are the difficult cases? If we have no parents, or just one parent, and the genotypes of grandparents, siblings or offspring are available, which are most informative for an individual’s haplotype reconstruction?
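
The parental case can be illustrated with a small sketch that assigns each of a child's alleles to a parent, marker by marker. It is a deterministic toy version, not the probabilistic reconstruction used in the study; the function name, the data layout, and the allele values (written as hypothetical microsatellite fragment sizes) are assumptions.

```python
def phase_child(father, mother, child):
    """Split a child's genotypes into paternal and maternal haplotypes.

    Each argument is a list with one 2-tuple of alleles per marker.
    Returns (paternal, maternal) lists; a marker where the parental
    origin cannot be determined is reported as None in both lists.
    """
    paternal, maternal = [], []
    for f_geno, m_geno, c_geno in zip(father, mother, child):
        # All (from father, from mother) assignments consistent with the child.
        options = {
            (f, m)
            for f in f_geno
            for m in m_geno
            if sorted((f, m)) == sorted(c_geno)
        }
        if not options:
            raise ValueError(f"Mendelian inconsistency at genotype {c_geno}")
        if len(options) == 1:
            f_allele, m_allele = next(iter(options))
            paternal.append(f_allele)
            maternal.append(m_allele)
        else:
            # Ambiguous: e.g. father, mother and child all share the same
            # heterozygous genotype, so either allele could be paternal.
            paternal.append(None)
            maternal.append(None)
    return paternal, maternal


# Hypothetical genotypes at three markers.
father = [(120, 124), (5, 7), (210, 214)]
mother = [(118, 124), (5, 7), (210, 212)]
child = [(118, 120), (5, 7), (210, 210)]
print(phase_child(father, mother, child))
```

The ambiguous marker in the example is the kind of difficult case the exercise asks about; with more alleles per marker, it becomes less likely that both parents and the child share the same heterozygous genotype.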

    25. Reconstructing haplotypes from genotypes. Observe genotyping data for an individual.

    26. Genotyping. Use STR (short tandem repeat) markers, also known as microsatellite markers.
