The Architecture of Linkage Disequilibrium Blocks Across Chromosomes 6, 21, & 22



Presentation Transcript


  1. The Architecture of Linkage Disequilibrium Blocks Across Chromosomes 6, 21, & 22

  2. Linkage Disequilibrium Mapping • Linkage disequilibrium mapping is based on the allelic association between a SNP surrogate marker and the phenotype-influencing mutation being sought • For it to work, a predictable relationship between LD strength and distance is required
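The allelic association that LD mapping relies on is usually quantified with D, D’ and r². The sketch below is illustrative only and is not taken from the slides: it assumes phased haplotype counts for two biallelic SNPs (the function name and argument layout are hypothetical), whereas real studies typically start from unphased genotypes.

```python
def pairwise_ld(n_ab, n_aB, n_Ab, n_AB):
    """Return (D, D', r^2) for two biallelic SNPs.

    Arguments are counts of the four phased haplotypes, where the first
    letter is the allele at SNP 1 and the second the allele at SNP 2.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / n
    p_a = (n_ab + n_aB) / n  # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / n  # frequency of allele b at SNP 2

    # D: deviation of the observed haplotype frequency from independence
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum possible value given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Marker and causative allele always travel together: complete LD
    print(pairwise_ld(50, 0, 0, 50))
    # All four haplotypes equally common: no association
    print(pairwise_ld(25, 25, 25, 25))
```

A marker SNP is a useful surrogate only when values such as these stay high between it and the causative variant.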

  3. LD mapping: We rely on allelic association • [Figure: 200 kb stretch of DNA with sense genes, antisense genes, SNPs, a causative SNP, and a marker SNP; LD scale 100%, 50%, 0%]

  4. LD-distance relationship Stephens et al, 2001

  5. Jeffreys et al, 2001

  6. Daly et al., 2001

  7. LD “Block” Structure • [Figure: 200 kb stretch of DNA with sense genes, antisense genes, SNPs, a causative SNP, and a marker SNP]

  8. LD-distance relationship revisited

  9. Haplotype Block Structure • [Figure: 200 kb stretch of DNA with sense genes, antisense genes, and SNPs grouped into haplotype blocks 1 to 4]

  10. Patterns of LD across chromosomes 6, 21, & 22 • Haplotypes as the basic elements of genetic variation • Consideration of LD patterns is necessary to design effective association studies • Understanding of LD patterns across the genome should help to: • Select optimal SNP subsets for a given population • Support the data analysis of LD mapping studies • Estimate the number of SNPs needed for whole-genome LD mapping studies

  11. Patterns of LD across chromosomes 6, 21, & 22 • Find Linkage Disequilibrium “blocks” • Develop block descriptors • Block length • Haplotype diversity • Compare blocks across populations and chromosomes • Find correlations between block descriptors and other chromosomal features • Comparison with published data

  12. Datasets used for this study • Chr 6 • SNPs: 7255 African-American; Caucasian (45 samples each, ~95% call rate); average spacing 24 Kb • 1281 gene regions targeted • Coverage: ~123 Mb out of 172.2 Mb (71%) • Chr 21 • SNPs: 7049 African-American; 7255 Caucasian; average spacing ~12 Kb • 264 gene regions targeted • Coverage: ~28.38 Mb out of 33.6 Mb (84%) • Chr 22 • SNPs: 3653 African-American; 4040 Caucasian (2334 common); average spacing ~10 Kb • 624 gene regions targeted • Coverage: ~28.6 Mb out of 36.6 Mb (78%) • Total coverage: 179.98 Mb – about 6.4% of genome
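The coverage figures on this slide can be checked with a few lines of arithmetic. The snippet below only re-derives the percentages and the total from the Mb values quoted above; the dictionary layout is an illustrative choice, not part of the study.

```python
# (covered Mb, chromosome length Mb) as quoted on the slide
coverage_mb = {
    "chr6": (123.0, 172.2),
    "chr21": (28.38, 33.6),
    "chr22": (28.6, 36.6),
}

# Per-chromosome coverage as a percentage of chromosome length
percent_covered = {
    chrom: 100.0 * covered / total
    for chrom, (covered, total) in coverage_mb.items()
}

# Total covered sequence across the three chromosomes
total_covered_mb = sum(covered for covered, _ in coverage_mb.values())

if __name__ == "__main__":
    for chrom, pct in percent_covered.items():
        print(f"{chrom}: {pct:.0f}% covered")
    print(f"total covered: {total_covered_mb:.2f} Mb")
```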

  13. “Covered” segments: Chr 22

  14. SNP spacing in covered areas for Chr 22

  15. LD “block” definitions • Chromosome segments where allelic association among SNPs shows little evidence of historical recombination • Low haplotype “diversity” • Methods used for block definition • LD-based method • Typically using D’ as the association statistic, plus some statistical significance test • Haplotype diversity-based method • Sweeping-window inference of haplotypes • Recombination evidence-based methods • Four-gamete rule • Note that in most situations one is dealing with unphased genotype data

  16. LD “block” definitions (2) • Gabriel et al. block criteria • D’ 95% confidence interval: upper bound >0.98, lower bound >0.7 • Minor allele frequency > 10%, HWE test p<0.01 • Heuristic used for blocks of <4 SNPs • Previously they used D’>0.8, but D’ is biased towards high values with small sample sizes and low allele frequencies • Block method criteria • D’ >0.9, Fisher exact test p-value < 0.001 • Minor allele frequency > 10%, HWE test p<0.01 • Both definitions allow for small amounts of recombination, including gene conversion, and occasional genotyping errors • Notice that these thresholds are somewhat arbitrary
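As a rough illustration of the D’ > 0.9 plus Fisher exact test p < 0.001 criterion, the sketch below tests a single SNP pair. It is an assumption-laden example rather than the study's code: it takes phased haplotype counts as input, the function names are hypothetical, and the two-sided Fisher test is written out with the standard library instead of a statistics package. The minor allele frequency and HWE filters are not shown.

```python
from math import comb


def d_prime(n_ab, n_aB, n_Ab, n_AB):
    """D' for two biallelic SNPs from phased haplotype counts."""
    n = n_ab + n_aB + n_Ab + n_AB
    p_a = (n_ab + n_aB) / n
    p_b = (n_ab + n_Ab) / n
    d = n_ab / n - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max if d_max > 0 else 0.0


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def passes_block_method(counts, d_prime_min=0.9, alpha=0.001):
    """True if a SNP pair meets both the D' and the significance threshold."""
    return d_prime(*counts) > d_prime_min and fisher_exact_p(*counts) < alpha


if __name__ == "__main__":
    print(passes_block_method((45, 0, 0, 45)))  # strong, significant LD
    print(passes_block_method((88, 1, 1, 0)))   # D' near 1, but rare alleles
```

The second call shows why a significance test accompanies D’: with rare alleles D’ can sit at its maximum even though the table carries almost no evidence of association.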

  17. LD block finding workflow

  18. Example of LD blocks: 180 Kb region of Chromosome 6 • [Figure: haplotypes in Caucasians and African-Americans; labels: 92 Kb, Gap: 12 Kb, 75 Kb]

  19. Summary statistics of LD blocks

  20. Distribution of block lengths

  21. LD block inference using common SNPs • All available SNPs: 401 blocks • Common SNPs: 276 blocks

  22. LD block quality assessment • Tested the blocks found by the Block method for violations of the 4-gamete rule • Perform the test among all SNP pairs within a block • For each violation within a block, find the haplotype that carries the pair causing the violation (i.e. the lowest-frequency one) • Report the highest haplotype frequency found
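One possible reading of this procedure is sketched below. It assumes biallelic SNPs and a block described by a hypothetical mapping from haplotype strings to estimated frequencies. For every SNP pair showing all four gametes it records the frequency of the rarest gamete, then reports the largest such value in the block. The slide does not spell out the exact bookkeeping, so this is an interpretation, not the study's implementation.

```python
from collections import defaultdict
from itertools import combinations


def four_gamete_report(hap_freqs):
    """Check every SNP pair in a block against the four-gamete rule.

    hap_freqs maps haplotype strings (one character per SNP, two alleles
    per SNP) to their frequencies. Returns (violations, worst), where
    violations maps each violating pair (i, j) to the frequency of its
    rarest gamete, and worst is the largest of those frequencies.
    """
    n_snps = len(next(iter(hap_freqs)))
    violations = {}
    for i, j in combinations(range(n_snps), 2):
        gamete_freq = defaultdict(float)
        for hap, freq in hap_freqs.items():
            if freq > 0:
                gamete_freq[(hap[i], hap[j])] += freq
        # Seeing all four allele combinations implies recombination
        # (or recurrent mutation, or genotyping error) between the two SNPs
        if len(gamete_freq) == 4:
            violations[(i, j)] = min(gamete_freq.values())
    worst = max(violations.values(), default=0.0)
    return violations, worst


if __name__ == "__main__":
    block = {"000": 0.50, "001": 0.05, "110": 0.05, "111": 0.40}
    print(four_gamete_report(block))
```

A block whose worst value is small has violations confined to rare haplotypes, which is consistent with occasional recombination or genotyping error rather than a misplaced block boundary.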

  23. Block quality: Violations of 4-gamete rule

  24. LD block overlap between populations

  25. Comparison between different LD block definitions • D’ >0.9; p<0.001: 401 blocks • 95% C.I. 0.98-0.7: 417 blocks

  26. Comparison of different block definitions • Gabriel et al.: 4.8 Mb in blocks; 3.91 Mb shared; 350 blocks; 40 independent blocks; 889 Kb unique • Block method: 4.55 Mb in blocks; 3.91 Mb shared; 335 blocks; 20 independent blocks; 649 Kb unique
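Shared and unique block spans of this kind can be computed by intersecting two sets of block intervals. The following is a minimal sketch, assuming each block is a hypothetical (start, end) pair in base pairs and that blocks within one set do not overlap each other. The coordinates in the example are made up.

```python
def shared_length(blocks_a, blocks_b):
    """Total length covered by both block sets.

    Each set is a list of (start, end) intervals; intervals within a
    set are assumed not to overlap one another.
    """
    a, b = sorted(blocks_a), sorted(blocks_b)
    i = j = 0
    shared = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if end > start:
            shared += end - start
        # Advance whichever interval finishes first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def unique_length(blocks_a, blocks_b):
    """Length covered by blocks_a but not by blocks_b."""
    total_a = sum(end - start for start, end in blocks_a)
    return total_a - shared_length(blocks_a, blocks_b)


if __name__ == "__main__":
    definition_1 = [(0, 100_000), (200_000, 300_000)]
    definition_2 = [(50_000, 250_000)]
    print(shared_length(definition_1, definition_2))
    print(unique_length(definition_1, definition_2))
```

The same two functions apply equally to comparing blocks between populations.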

  27. Haplotype inferences • Infer common haplotypes (freq > 5%) within blocks • Haplotype population frequencies • No need to assign haplotypes to individuals for this study • Fast, scalable method (up to 10-15 SNPs) • Good accuracy

  28. Measuring haplotype block diversity • Number of haplotypes • Heterozygosity • Polymorphism Information Content (PIC) • Shannon entropy: H = −Σ Pi log Pi • Where Pi is the frequency of the ith haplotype • H increases with the number of haplotypes and the evenness of their frequencies
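The four diversity measures can be computed directly from a block's haplotype frequencies. This is a sketch under stated assumptions: heterozygosity is taken as the expected value 1 − Σ Pi² without a sample-size correction, PIC uses the standard definition 1 − Σ Pi² − Σ(i<j) 2 Pi² Pj², and the entropy uses log base 2 (the slides do not state the base). The function name is hypothetical.

```python
from math import log2


def haplotype_diversity(freqs):
    """Diversity descriptors for one block from its haplotype frequencies."""
    total = sum(freqs)
    # Normalise and drop haplotypes with zero frequency
    p = [f / total for f in freqs if f > 0]

    sum_sq = sum(x * x for x in p)
    heterozygosity = 1.0 - sum_sq

    # PIC subtracts a correction term over all unordered haplotype pairs
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    pic = heterozygosity - cross

    # Shannon entropy H = -sum(Pi * log2(Pi)), in bits
    entropy = -sum(x * log2(x) for x in p)

    return {
        "n_haplotypes": len(p),
        "heterozygosity": heterozygosity,
        "pic": pic,
        "entropy": entropy,
    }


if __name__ == "__main__":
    print(haplotype_diversity([0.25, 0.25, 0.25, 0.25]))  # even frequencies
    print(haplotype_diversity([0.70, 0.10, 0.10, 0.10]))  # one dominant haplotype
```

With four haplotypes in both examples, the evenly distributed block has the higher entropy, matching the statement that H grows with the evenness of the frequencies.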

  29. Minimum SNPs per block

  30. Looking for correlations with chromosomal features • GC content, CpG islands, runs of bases • Repetitive element density (LINE & SINE) • Recombination rate (from deCODE & TSC genetic maps) • Chromosomal location (telomere vs. centromere) • Transcriptional activity (MPSS tags) • Introns/exons, gene length • Segmental duplications

  31. Visualization of LD blocks on a chromosome scale • Looking at the patterns of LD across the whole chromosome • Block size & diversity • Allows interesting regions to be pinpointed • Clustering of blocks • Unusual block diversity • Allows correlations with other chromosomal descriptors to be found

  32. Chr 22 LD Block Profile

  33. Chr 22 LD Block Profile

  34. Chr 21 LD Block Profile

  35. Chr 6 LD Block Profile

  36. Chr 22: Different block definitions D’ >0.9; p<0.001 95% C.I. 0.98-0.7

  37. Chr 22: Different block definitions D’ >0.9; p<0.001 95% C.I. 0.98-0.7

  38. Chr 22: Common SNPs All available SNPs Common SNPs

  39. Conclusions • LD block distribution across chromosomes is not uniform • Hot and cold spots are evident when considering both block length & haplotype diversity (H) • Related to some extent to recombination rate • There are differences in block structure and distribution between the two populations studied • Caucasians have more blocks of greater average length • African-American blocks are usually nested within Caucasian blocks • There are private block spans in both populations, but Caucasians have more unique blocks • The different block definitions tested do not change the picture dramatically, but there are differences • Definitions are arbitrary and need to be judged from a practical point of view
