The International HapMap Consortium. The International HapMap Project. Nature 426, 789-796 (2003).
The International HapMap Project
• Launched October 29, 2002
• Major initiative to map human genetic variation based on haplotype patterns
• Characterize sequence variants, their frequencies, and correlations between them
• Serve as a key resource for finding genes that affect health, disease, and drug response
Direct Approach: Laborious and Expensive
• Whole genome sequencing of numerous patient samples to identify candidate variations
• Test each variant for correlation with a disease
• Genotyping 3 million SNPs in 1,000 people = 3 billion separate genotyping assays
Indirect Approach: Efficient and Comprehensive
• A relatively small set of variants will capture most common variation patterns
• Linkage disequilibrium (LD) among SNPs means that many chromosome regions carry only a few haplotypes
• A set of sequence variants serves as genetic markers to detect association between a particular genomic region and disease
HapMap
• In many chromosome regions, a few common haplotypes account for most of the variation in the human genome
• The human genome can be divided into 200,000 haplotype blocks
• Identify 200,000 to 1 million tag SNPs
• Efficient and comprehensive
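As a rough back-of-the-envelope check on why the indirect approach is cheaper, the sketch below compares assay counts. It reuses the 3 million SNPs and 1,000 people from the direct-approach slide and the 200,000 to 1 million tag SNP range above. The function name `assay_count` is illustrative, and "one assay per SNP per person" is an assumption, not something the slides spell out.

```python
def assay_count(num_snps: int, num_people: int) -> int:
    """Assays needed, assuming one genotyping assay per SNP per person."""
    return num_snps * num_people


PEOPLE = 1_000  # cohort size taken from the direct-approach slide

direct = assay_count(3_000_000, PEOPLE)    # genotype every SNP
tag_low = assay_count(200_000, PEOPLE)     # lower end of the tag SNP range
tag_high = assay_count(1_000_000, PEOPLE)  # upper end of the tag SNP range

print(f"direct approach: {direct:,} assays")
print(f"tag SNP approach: {tag_low:,} to {tag_high:,} assays")
print(f"reduction: {direct / tag_high:.0f}x to {direct / tag_low:.0f}x")
```

Under these assumptions the tag SNP approach needs between one third and one fifteenth of the assays of the direct approach.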
SNPs, Haplotypes, and Tag SNPs
Genotyping 3 tag SNPs out of 20 SNPs is sufficient to distinguish one haplotype from another.
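The idea of tag SNPs can be illustrated with a small brute-force search: find the fewest SNP positions whose alleles still give every haplotype a unique signature. This is only a sketch under stated assumptions. The function `find_tag_snps` and the four 8-SNP haplotypes are made up for illustration and are not the slide's data, and real tag SNP selection typically relies on LD statistics rather than exhaustive search.

```python
from itertools import combinations


def find_tag_snps(haplotypes):
    """Return the smallest tuple of SNP positions that tells all haplotypes apart.

    Brute force over subsets of positions, smallest first; only suitable
    for toy inputs because the search grows exponentially with SNP count.
    """
    if len(set(haplotypes)) != len(haplotypes):
        raise ValueError("haplotypes must be distinct")
    n_snps = len(haplotypes[0])
    for size in range(n_snps + 1):
        for positions in combinations(range(n_snps), size):
            # Signature of a haplotype = its alleles at the chosen positions.
            signatures = {tuple(h[p] for p in positions) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return positions
    return tuple(range(n_snps))


# Hypothetical haplotypes: 4 haplotypes observed across 8 SNP positions.
toy_haplotypes = [
    "ACGTACGT",
    "ACGTTCGA",
    "GCGTACGA",
    "GCTTACGT",
]

tags = find_tag_snps(toy_haplotypes)
print("tag SNP positions:", tags)
```

In this toy set, two of the eight positions are enough to separate the four haplotypes, which mirrors the slide's point that a small subset of SNPs can stand in for the rest.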
Haplotype Map: Search for genes on Chromosome 5 Related to Crohn’s Disease
Haplotype blocks contain 2-4 flavors of SNP combinations (orange, purple, etc.).
Dashed lines indicate relationships between blocks.
Percentages indicate occurrence of each SNP set in patients.
DNA Samples and Populations
• Population sampling: samples chosen from particular populations based on ethnicity and geography

Ancestry          Location           Number of Samples
N & W European    United States      90
African           Ibadan, Nigeria    90 (30 trios)
Japanese          Tokyo, Japan       45 (unrelated)
Chinese           Beijing, China     45 (unrelated)

• Include a substantial amount of genetic variation
• Trios and unrelated individuals: local LD patterns
• Unrelated DNA samples: identify 99% of haplotypes with a frequency of 5% or greater in a population
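One plausible way to see where a figure like 99% could come from is a simple sampling calculation. The slides do not state this derivation, so treat it as an assumption: 45 unrelated people carry 90 chromosomes, and if each chromosome is treated as an independent draw, the chance of seeing a haplotype of frequency f at least once is 1 - (1 - f)^90. The function name `detection_probability` is illustrative.

```python
def detection_probability(freq: float, n_chromosomes: int) -> float:
    """Probability of observing a haplotype at least once.

    Assumes each sampled chromosome is an independent draw from the
    population: P = 1 - (1 - freq) ** n_chromosomes.
    """
    return 1.0 - (1.0 - freq) ** n_chromosomes


n_chromosomes = 2 * 45  # 45 unrelated individuals, two chromosomes each
p = detection_probability(0.05, n_chromosomes)
print(f"P(observe a 5% haplotype in {n_chromosomes} chromosomes) = {p:.3f}")
```

Under the independence assumption the result is close to 0.99, in line with the 99% figure on the slide.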
SNP Selection
• High density of SNPs to adequately describe genetic variation
• LD and haplotype density vary 100-fold across the genome
• A hierarchical strategy will allow regions of the genome with the least LD to be characterized with higher SNP density

Priorities
• Verified SNPs with available allele frequency and genotyping data
• Double-hit SNPs seen twice in two different DNA samples
• SNPs that cause amino acid changes
GENOTYPING
• 10 genotyping centers: Japan, UK, Canada, China, US, and Nigeria
• 5 high-throughput genotyping technologies
• Performance criteria:
  1. Data produced must be 99.2% complete & 99.5% accurate.
  2. All experiments must include samples for internal quality checks.
  3. Samples of SNP genotypes from each center re-genotyped by other centers.
• Various platforms allow for comparisons for accuracy, success rate, throughput, and cost
• Complete and reliable data production
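The first performance criterion can be expressed as a simple threshold check. This is a minimal sketch, not the project's actual QC pipeline: the function `passes_qc`, its parameters, and the example counts are hypothetical, and estimating accuracy as concordance on re-genotyped samples is an assumption based on the third criterion.

```python
MIN_COMPLETENESS = 0.992  # from the slide: data must be 99.2% complete
MIN_ACCURACY = 0.995      # from the slide: data must be 99.5% accurate


def passes_qc(called: int, attempted: int, concordant: int, rechecked: int) -> bool:
    """Check one batch of genotypes against both thresholds.

    completeness = genotypes successfully called / genotypes attempted
    accuracy     = concordant calls / calls re-genotyped elsewhere (assumed estimate)
    """
    completeness = called / attempted
    accuracy = concordant / rechecked
    return completeness >= MIN_COMPLETENESS and accuracy >= MIN_ACCURACY


# Hypothetical batches: the first meets both thresholds, the second is incomplete.
print(passes_qc(called=9_950, attempted=10_000, concordant=998, rechecked=1_000))
print(passes_qc(called=9_900, attempted=10_000, concordant=998, rechecked=1_000))
```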
DATA ANALYSIS
• Analyze LD between markers
• Measure the proportion of common ancestral chromosomes that have not recombined
• Sliding window LD profiles
• LD unit maps
• Haplotype blocks
• Meiotic recombination rates
• Statistical methods, replication studies, and functional analyses of variants to confirm the findings and identify functionally significant SNPs
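To make "LD between markers" concrete, the sketch below computes three standard pairwise LD statistics (D, |D'|, and r²) for two biallelic SNPs from phased haplotypes. The slides do not say which statistics or software the project used, so this is an illustration only: the function `pairwise_ld`, the 0/1 allele coding, and the toy haplotype lists are all assumptions.

```python
def pairwise_ld(haplotypes):
    """Compute (D, |D'|, r^2) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, each
    allele coded 0 or 1, one tuple per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # deviation from independence

    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two SNPs.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy data: alleles always travel together (complete LD).
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Toy data: all four combinations equally common (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print("linked:  ", pairwise_ld(linked))
print("unlinked:", pairwise_ld(unlinked))
```

Running this over every pair of SNPs within a sliding window is one simple way to build the kind of LD profile the slide lists.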