Towards a Complete Mouse Haplotype Map
Mathew Pletcher
Genomics Institute of the Novartis Research Foundation
SNP Discovery
• Selected 2800 loci based on RH (radiation hybrid) map positions
• Sequenced loci from 8 inbred strains (A/J, BALB/cByJ, CAST/Ei, C3H/HeJ, C57BL/6J, DBA/2J, SPRET/Ei, 129/SvImJ)
• Identified approximately 15,500 SNPs, all publicly available at dbSNP and the Jax Phenome Database
• SNPview: a viewer to display SNP distributions and patterns (http://www.gnf.org/SNP/)
[Figure: SNPview screenshot (http://www.gnf.org/SNP/). Controls: assembly, chromosome, position, strains, view, SSLPs.]
SNP Distribution and Analysis for Chr. 4
SNPs can be viewed as:
• SNP alleles
• Major and minor alleles
• Haplotypes
[Figure: three views of 211 loci on Chr. 4, plotted against position (0 to 162 Mbp). Strains shown: BALB/cByJ, A/J, DBA/2J, C3H/HeJ, C57BL/6J, 129/SvImJ, CAST/Ei, SPRET/Ei in one panel; the first six strains only in the other two.]
SNP Density on Chromosome 16 (Celera data)
[Figure: two panels of SNPs per 500 kbp along the chromosome (position in Mbp, 0 to 95). Top: DBA/2J vs. A/J (y-axis 0 to 400). Bottom: C57BL/6J vs. A/J (y-axis 0 to 700).]
Genome-wide SNP Density (Celera data)
[Figure: percentage of bins with fewer than 100 SNPs (0 to 100%) for each chromosome, in the order Chr. 1, 4, 6, 5, 7, 2, 3, 12, 16, 14, 11, 19, X, 8, 18, 9, 17, 13, 10, 15. Strain pairs: C57BL/6J vs. A/J, C57BL/6J vs. DBA/2J, C57BL/6J vs. 129X1/SvJ, A/J vs. DBA/2J, A/J vs. 129X1/SvJ, DBA/2J vs. 129X1/SvJ.]
[Figure: Chr. 10 haplotypes for BALB/cByJ, A/J, DBA/2J, C3H/HeJ, C57BL/6J, 129/SvImJ; position 0 to 130 Mbp.]
Distribution of SNPs and SSLPs
[Figure: Chr. 12 (165 loci; 290 SSLPs, 15 SSLPs(2), 39 SSLPs(4+); position 0 to 114 Mbp) and Chr. 5 (227 loci; 436 SSLPs, 25 SSLPs(2), 73 SSLPs(4+); position 0 to 148.5 Mbp), both for BALB/cByJ vs. A/J.]
For two strains (A/J vs. BALB/cByJ) with 1112 randomly distributed SNPs, the mean expected block length is 2.3 Mb. The probability that the largest block reaches 40 Mb is 2.8e-5; 60 Mb, 4.2e-9; 80 Mb, 6.4e-13; 100 Mb, 9.6e-17.
• Co-distribution of SNPs and SSLPs
• Chr. 5 contains many short blocks of common and disparate haplotypes and also shows a co-distribution of SSLPs and SNPs
• Co-distribution was evaluated in 5, 2, and 0.5 Mb bins (1-P = 10^-27, 10^-24, and 7.4x10^-11)
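The block-length probabilities quoted for A/J vs. BALB/cByJ can be sanity-checked with a simple model. The sketch below assumes (the slide does not say this) that SNPs fall as a Poisson process, so gaps between adjacent SNPs are exponentially distributed with the quoted 2.3 Mb mean, and it bounds the chance that any of the 1112 gaps exceeds a given length with a union bound. The function name and the model are illustrative only; the calculation actually used for the slide is not given, and this approximation lands within about a factor of two of the quoted values.

```python
import math

N_SNPS = 1112        # SNPs between A/J and BALB/cByJ (from the slide)
MEAN_BLOCK_MB = 2.3  # mean expected block length in Mb (from the slide)

def prob_block_longer_than(length_mb, n_snps=N_SNPS, mean_mb=MEAN_BLOCK_MB):
    """Union-bound estimate of P(at least one SNP-free block > length_mb).

    Assumes exponentially distributed gaps (random SNP placement). This is an
    illustrative model, not necessarily the one used for the slide.
    """
    p_single_gap = math.exp(-length_mb / mean_mb)  # P(one given gap > length)
    return min(1.0, n_snps * p_single_gap)         # bound over all gaps

for length in (40, 60, 80, 100):
    print(f"{length} Mb: {prob_block_longer_than(length):.1e}")
```

Under this model, as on the slide, shared blocks of tens of megabases are far too long to be explained by randomly placed SNPs.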
Utility of Mouse Haplotype Map
[Figure: Chr. 10 haplotypes]
• Mapped a mutant to a 20 Mb region on MMU10
• A shared haplotype made identifying new markers difficult
• Used the haplotype map to identify the most compatible mapping partner for narrowing the region
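Choosing a mapping partner from a haplotype map amounts to counting, for each candidate strain, how many SNPs inside the candidate region differ from the mutant's background strain. A minimal sketch of that idea follows. The data layout, the function name `rank_mapping_partners`, and the toy genotypes are all hypothetical and are not taken from the talk.

```python
def rank_mapping_partners(genotypes, background, start_mb, end_mb):
    """Rank strains by number of informative SNPs vs. the background strain.

    genotypes: list of (position_mb, {strain: allele}) tuples (hypothetical layout).
    Returns [(strain, n_informative), ...] sorted from most to fewest.
    """
    counts = {}
    for pos, alleles in genotypes:
        if not (start_mb <= pos <= end_mb):
            continue  # SNP lies outside the candidate region
        ref = alleles.get(background)
        if ref is None:
            continue  # no call for the background strain at this SNP
        for strain, allele in alleles.items():
            if strain == background:
                continue
            counts.setdefault(strain, 0)
            if allele is not None and allele != ref:
                counts[strain] += 1
    # Most informative first; ties broken alphabetically by strain name
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy data, made up for illustration only
toy = [
    (12.0, {"C57BL/6J": "A", "A/J": "A", "DBA/2J": "G", "C3H/HeJ": "G"}),
    (18.5, {"C57BL/6J": "C", "A/J": "C", "DBA/2J": "T", "C3H/HeJ": "C"}),
    (25.1, {"C57BL/6J": "G", "A/J": "G", "DBA/2J": "A", "C3H/HeJ": "A"}),
    (40.0, {"C57BL/6J": "T", "A/J": "C", "DBA/2J": "T", "C3H/HeJ": "T"}),
]
ranking = rank_mapping_partners(toy, "C57BL/6J", 10.0, 30.0)
print(ranking)
```

In the toy data, A/J shares the background haplotype across the region and so ranks last, which mirrors the situation where a shared haplotype makes new markers hard to find.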
Extending the Haplotype Map to the Phenome Project
Strains: A/J, AKR/J, BALB/cByJ, BTBR, BUB/BnJ, C3H/HeJ, C57BL/10J, C57BL/6J, C57BLKS/J, C57BR/cdJ, C57L/J, C58/J, CAST/Ei, CBA/J, CE/J, CZECHII/Ei, DBA/1J, DBA/2J, DDK/Pas, FVB/NJ, I/LnJ, JF1/Ms, KK/HlJ, LG/J, LP/J, MA/MyJ, MAI/Pas, MOLF/Ei, MSM/Ms, NOD/LtJ, NON/LtJ, NZB/BlNJ, NZW/LacJ, PERA/Ei, PL/J, PWD/Ph, RIIIS/J, SEA/GnJ, SEG/Pas, SJL/J, SM/J, SPRET/Ei, ST/bJ, SWR/J, WSB/Ei, ZALENDE/Ei, 129S1/SvImJ, 129X1/SvJ
Beck et al., Nature Genetics (2000) vol. 1, 23–25
Initial Set Selection
• Goal: pick an evenly spaced set of SNPs from the Celera collection that come from unique haplotypes and are not recent strain-specific mutations
• Preference was given to SNPs where each allele was found in two strains (DBA, 129X1, A/J, B6)
• Worked on the assumption that these selection criteria would favor older SNPs (musculus vs. domesticus)
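The selection criteria above can be sketched as a simple binned pick: split the chromosome into bins of the target spacing and, within each bin, prefer a SNP whose two alleles are each carried by two of the four Celera strains, breaking ties by distance to the bin centre. This is a hypothetical illustration, not the procedure actually used; the full strain names, the data layout, and both function names are assumptions.

```python
from collections import Counter

# Assumed full names for the four strains abbreviated on the slide
STRAINS = ("DBA/2J", "129X1/SvJ", "A/J", "C57BL/6J")

def is_two_two_split(alleles):
    """True if exactly two alleles are present and each is carried by two strains."""
    calls = [alleles.get(s) for s in STRAINS]
    if None in calls:
        return False
    return sorted(Counter(calls).values()) == [2, 2]

def select_evenly_spaced(snps, bin_size_mb):
    """Pick one SNP per occupied bin, preferring 2:2 splits, then bin-centre proximity.

    snps: list of (position_mb, {strain: allele}) on a single chromosome.
    """
    bins = {}
    for pos, alleles in snps:
        bins.setdefault(int(pos // bin_size_mb), []).append((pos, alleles))
    chosen = []
    for idx in sorted(bins):
        centre = (idx + 0.5) * bin_size_mb
        best = min(
            bins[idx],
            key=lambda s: (not is_two_two_split(s[1]), abs(s[0] - centre)),
        )
        chosen.append(best[0])
    return chosen

# Toy data, made up for illustration only
toy = [
    (0.2, {"DBA/2J": "A", "129X1/SvJ": "A", "A/J": "G", "C57BL/6J": "G"}),  # 2:2
    (0.5, {"DBA/2J": "A", "129X1/SvJ": "A", "A/J": "A", "C57BL/6J": "G"}),  # 3:1
    (1.4, {"DBA/2J": "C", "129X1/SvJ": "T", "A/J": "T", "C57BL/6J": "T"}),  # 3:1
    (2.6, {"DBA/2J": "G", "129X1/SvJ": "C", "A/J": "G", "C57BL/6J": "C"}),  # 2:2
]
picked = select_evenly_spaced(toy, 1.0)
print(picked)
```

In the first bin the 2:2 SNP at 0.2 Mb is chosen over the better-centred 3:1 SNP at 0.5 Mb, reflecting the stated preference for alleles shared by two strains over strain-private variants.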
Initial Set Make-up
• Initial design produced nearly 5000 assays
• 941 SNPs are cSNPs (splice-site, missense, or nonsense mutations)
• 1 gap larger than 5 Mb
• Largest gap: 7.4 Mb on MMUX
Preliminary Data from SNP Assays
• Over 2300 SNP assays contain data for 90% of DNAs
• Over 4500 assays contain data for at least 24 of the strains
• Average of 1200 SNPs found between any two strains
• Average SNP frequency of roughly 40% in pairwise comparisons
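A pairwise SNP frequency of this kind is the fraction of assays, among those with a call in both strains, at which the two strains carry different alleles. The sketch below shows one way to compute it. The data layout, the function name, and the toy genotypes are hypothetical, and how missing calls were handled in the actual analysis is not stated.

```python
from itertools import combinations

def pairwise_snp_frequency(genotypes, strains):
    """Fraction of SNP assays at which each pair of strains differs.

    genotypes: list of {strain: allele or None} dicts, one per assay.
    Assays with a missing call in either strain are skipped for that pair
    (an assumption; the talk does not describe missing-data handling).
    """
    result = {}
    for a, b in combinations(strains, 2):
        compared = differing = 0
        for calls in genotypes:
            x, y = calls.get(a), calls.get(b)
            if x is None or y is None:
                continue
            compared += 1
            differing += x != y
        result[(a, b)] = differing / compared if compared else float("nan")
    return result

# Toy data, made up for illustration only
toy = [
    {"A/J": "A", "DBA/2J": "G", "C57BL/6J": "G"},
    {"A/J": "C", "DBA/2J": "C", "C57BL/6J": "T"},
    {"A/J": "G", "DBA/2J": None, "C57BL/6J": "G"},
    {"A/J": "T", "DBA/2J": "T", "C57BL/6J": "T"},
    {"A/J": "A", "DBA/2J": "G", "C57BL/6J": "A"},
]
freqs = pairwise_snp_frequency(toy, ["A/J", "DBA/2J", "C57BL/6J"])
for pair, f in freqs.items():
    print(pair, f)
```

As a rough consistency check on the slide's numbers, about 1200 differences at a frequency near 40% would correspond to something on the order of 3000 assays compared per strain pair.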
C57 Breeding History
[Figure: breeding-history tree for MA/MyJ, C57BR, C57BL/6, C57BL/10, C58/J, C57L/J, C57BLKS.]
Beck et al., Nature Genetics (2000) vol. 1, 23–25
Comparison of SNP-based Clustering to Breeding History
[Figure: two trees side by side for MA/MyJ, C57BR, C58/J, C57BL/10, C57BL/6, C57L/J, C57BLKS, one from SNP-based clustering and one from breeding history.]
Beck et al., Nature Genetics (2000) vol. 1, 23–25
SNP Data Detects Common Heritage
Beck et al., Nature Genetics (2000) vol. 1, 23–25
Relationship of Wild-derived Mouse Strains
M.m. musculus: CZECHII/Ei, MAI/Pas, PWD/Ph
M.m. molossinus: JF1, MSM/Ms, MOLF/Ei
M.m. domesticus: PERA/Ei, WSB/Ei, ZALENDE/Ei
M.m. castaneus: CAST/Ei
M. spretus: SPRET/Ei, SEG/Pas
Prager et al. (1998) Mouse Genetics and Phylogeography
Lack of Diversity Between M.m. musculus and M.m. molossinus
[Figure: SNP frequency; labels M.m. musculus (x3) and M.m. molossinus (x3).]
Wade et al. (2003) Nature 420, 574-578
Silver (1995) Mouse Genetics: Concepts and Applications
Ogura et al. (2003) Genomics 81, 369-377
A Domesticus Haplotype
[Figure: SNP frequency; labels M.m. domesticus (x4, one marked "?") and M.m. musculus (x1); origins US, Switz., Peru.]
Future Steps
• Do survey sequencing of wild-derived strains to assess true diversity
• Continue to fill in gaps in the data set and to expand the SNP collection
• Attempt to use the data set to map phenotypic variants
Acknowledgements
Tim Wiltshire, Whitney Barnes, Patrick Merritt, Candace Motta, Franzmarie Lippincott, Deborah Stradley, Niusha Ziaee, Serge Batalov, Steve Kay