Kerstin Lindblad-Toh, Whitehead/MIT Center for Genome Research. Michael Kamal, Broad/MIT Center for Genome Research. A First Look at the Mouse Genome. Preliminary mouse genome analysis. Future directions (briefly). Article available online: http://www.nature.com/nature/mousegenome/
Kerstin Lindblad-Toh, Whitehead/MIT Center for Genome Research • Michael Kamal, Broad/MIT Center for Genome Research
A First Look at the Mouse Genome • Preliminary mouse genome analysis • Future directions (briefly) Article available online: http://www.nature.com/nature/mousegenome/
Mouse Genome Sequencing Consortium • Centers: Whitehead Institute, Washington University St. Louis, Sanger Institute, EBI • Draft: BAC map, 6.5x shotgun coverage, genome assembly • Finished sequence: BAC-based coverage, finishing • Strain: C57BL/6J, female
Mouse Genome Sequencing Consortium • 41 M reads • 2 and 4 kb plasmids (90%) • 10 kb plasmids (5%) • 40 kb fosmids (5%) • 155 kb and 200 kb BACs (RPCI-23 & 24) • WI: 54% of reads
Assembly: 88 ultracontigs, covering 96% of genome • Contig: 25 kb • Super: 17 Mb • Ultra: 50 Mb
Regions of conserved synteny: ~95% of genome Extremely high conservation: 560,000 anchors
Genome size: Mouse < Human (2.5 vs 2.9 Gb) • [Figure: expansion ratio (M/H) for autosomes and chromosome X]
Less transposon activity in mouse lineage? • Total transposon-derived repeat: human 46%, mouse 37% • No! More transposon activity in mouse • [Figure: lineage-specific vs ancestral repeat in human and mouse; scale labels 400 Mb and 100 Mb] • Genome size: Mouse < Human (2.5 vs 2.9 Gb) • More deletion in mouse
Protein-coding gene count falling (<30,000) • Mouse-human comparison: ~99% have homologs (maybe 100%); ~96% have a homolog in a region of conserved synteny; ~80% have a 1:1 ortholog • ~22,500 evidence-based gene predictions
Gene family expansions: reproduction, immunity 25 mouse-specific gene family cluster expansions • 14 reproduction • 5 host defense, immunity
Large conserved elements: coding, non-coding • [Figure: PPARγ locus, large conserved elements (>100 bp), exons vs non-exons; values shown: 75%, 90%]
How much of the genome is under selection? Extremely high conservation: 560,000 anchors Less than half are coding exons (~220,000)
Nucleotide-level alignment: ~40% of genomes • Why so much? • Given the neutral substitution rate between mouse and human, the vast majority of truly orthologous sequence can be aligned! • Alignable does NOT imply functional
Nucleotide-level alignment: ~40% of genomes • Why so little? • Suppose: ancestral genome ~2.9 Gb; new transposons are offset by deletion • Ancestral genome remaining: in human = 73%; in mouse = 57%; in both = 73% x 57% = 42%
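The arithmetic on this slide can be checked with a minimal sketch. It assumes, as the slide's multiplication implicitly does, that loss of ancestral sequence is independent in the two lineages; the function name and the conversion to Gb are illustrative additions, not part of the original talk.

```python
def shared_ancestral_fraction(frac_human: float, frac_mouse: float) -> float:
    """Fraction of the ancestral genome expected to survive in BOTH lineages,
    assuming deletions in human and mouse are independent (an assumption)."""
    return frac_human * frac_mouse


ANCESTRAL_GB = 2.9   # ancestral genome size supposed on the slide
FRAC_HUMAN = 0.73    # ancestral sequence remaining in human
FRAC_MOUSE = 0.57    # ancestral sequence remaining in mouse

both = shared_ancestral_fraction(FRAC_HUMAN, FRAC_MOUSE)

print(f"remaining in both: {both:.1%}")                      # about 42%
print(f"shared ancestral sequence: {ANCESTRAL_GB * both:.2f} Gb")
```

Under these assumptions the expected shared fraction comes out close to the ~40% of the genomes that align at the nucleotide level.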
Neutral substitution rate: ~0.46 per site • Branch lengths: human 0.15, mouse 0.31 • Substitutions in ancestral repeats: roughly normal distribution • Mouse 2x faster over 75 Myr
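A small worked example of these numbers, reading 0.15 and 0.31 as the human and mouse branch lengths in substitutions per site. The per-year rates simply divide each branch by 75 Myr. The expected-identity calculation uses the Jukes-Cantor model, which is an assumption introduced here for illustration and is not stated in the talk.

```python
import math

HUMAN_BRANCH = 0.15    # substitutions per site on the human lineage
MOUSE_BRANCH = 0.31    # substitutions per site on the mouse lineage
DIVERGENCE_MYR = 75    # time since the common ancestor, in million years

total_distance = HUMAN_BRANCH + MOUSE_BRANCH   # ~0.46 per site
speed_ratio = MOUSE_BRANCH / HUMAN_BRANCH      # roughly 2x


def rate_per_site_per_year(branch_subs: float, myr: float) -> float:
    """Average substitution rate along one lineage."""
    return branch_subs / (myr * 1e6)


def jc_expected_identity(distance: float) -> float:
    """Expected fraction of identical sites between two sequences separated by
    `distance` substitutions per site, under Jukes-Cantor (an assumed model)."""
    return 0.25 + 0.75 * math.exp(-4.0 * distance / 3.0)


print(f"total distance: {total_distance:.2f}")
print(f"mouse/human rate ratio: {speed_ratio:.2f}")
print(f"human rate: {rate_per_site_per_year(HUMAN_BRANCH, DIVERGENCE_MYR):.2e} per site per year")
print(f"mouse rate: {rate_per_site_per_year(MOUSE_BRANCH, DIVERGENCE_MYR):.2e} per site per year")
print(f"expected neutral identity: {jc_expected_identity(total_distance):.3f}")
```

Under this simple model, neutrally evolving orthologous sequence would still be roughly two-thirds identical, which is consistent with the claim that most truly orthologous sequence can be aligned.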
Neutral substitution rate: ~0.46 per site • [Figure categories: introns, coding exons, 5'-UTR, 3'-UTR, upstream, downstream, CpG islands, known regulatory]
Proportion of genome under selection: ~5% • Neutral sequence: ancestral repeats • Whole genome: alignable portion • Coding exons only ~1.5% • What is the rest? UTR, regulatory elements, RNA genes, structural elements? • Excess conservation
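To make "what is the rest?" concrete, here is a rough accounting sketch. Applying the percentages to the 2.5 Gb mouse genome size is an assumption made only for illustration, and the function name is hypothetical.

```python
def selection_breakdown(genome_gb: float,
                        selected_frac: float = 0.05,
                        coding_frac: float = 0.015) -> dict:
    """Split the selected portion of a genome into coding and non-coding parts.
    Fractions are of the whole genome; sizes are returned in Mb."""
    noncoding_frac = selected_frac - coding_frac
    genome_mb = genome_gb * 1000.0
    return {
        "selected_mb": genome_mb * selected_frac,
        "coding_mb": genome_mb * coding_frac,
        "noncoding_mb": genome_mb * noncoding_frac,
        "noncoding_share_of_selected": noncoding_frac / selected_frac,
    }


breakdown = selection_breakdown(genome_gb=2.5)  # mouse genome size (assumed basis)
for name, value in breakdown.items():
    print(f"{name}: {value:.3g}")
```

On these figures, about 3.5% of the genome, or roughly 70% of the sequence under selection, is not coding exon.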
TNFα enhancer: conserved between human and mouse (tracks: RefSeq, Genscan)
Human ACCGCTTCCTCCACATGAGATCATGGTTTTCTCCACCAAGGAAGTTTTCCGAGGGTTGAATGAGAGCTTTTCCCCGCCC
      ||||||||||||| ||||| |||||| ||||||||||||||||||||||||  ||||||||||     |||||||||||
Mouse ACCGCTTCCTCCAGATGAGCTCATGGGTTTCTCCACCAAGGAAGTTTTCCGCTGGTTGAATGA--TTCTTTCCCCGCCC
Binding sites marked on the alignment: NFat/Ets, CRE, κ3-Nfat, Ets, Nfat, AP1, SP1
Mouse genome summary • 2.5 Gb in size (smaller than human, due to deletion) • More lineage-specific repeats • < 30,000 genes (>99% with homologs in human) • Evolves 2x faster than human • 95% of genome in blocks of conserved synteny • 5% under selection (1.5% coding, the rest is unknown) • Large haplotype blocks of domesticus or musculus ancestry in inbred strains
Implications of mouse sequence • Cloning of classical mutations • New mutagenesis programs • Identification of quantitative trait loci (QTLs) • Engineering knock-outs, knock-ins • BAC transgenics • Modeling human disease • Understanding gene regulation
Future directions • Finish mouse genome • Sequence more mammals (dog, chimp, marsupial) • “Genomic accounting” • Identify regulatory elements • Mouse haplotype map
Genomic alignments for multiple species • Sequence more mammals (dog, chimp, marsupial) • “Genomic accounting” • Identify regulatory elements • Mouse haplotype map ... integrated with gene expression analysis
Acknowledgements • Washington University: John McPherson, Bob Waterston • Whitehead Institute: Kerstin Lindblad-Toh, Michael C. Zody, David Jaffe, Claire Wade, Mark Daly, Jade Vinson, Elinor Karlsson, EJ Kulbokas, Nicole Stange-Thomann, Rob Nicol, Tim Holzer, Toby Bloom, Jill Mesirov, Chad Nusbaum, Bruce Birren, Eric Lander • Analysis Group: David Haussler, Jim Kent, Arian Smit, Chris Ponting, Webb Miller, Ross Hardison, Laura Elnitski, Inna Dubchak, Lior Pachter, Sean Eddy, Michael Brent, Roderic Guigo, Wayne Frankel, Carol Bult • Sanger Institute: Jim Mullikin, Jane Rogers • Ensembl: Ewan Birney • Mouse Liaison group • University of Oklahoma • Albert Einstein/Harvard • NIH ISC • TIGR • CHORI
Mouse Genome: http://www.ensembl.org/Mus_musculus/ • SNPs: http://aretha.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/snps