A Tale of Two Worms: Comparing the Genomes of C. elegans & C. briggsae. Lincoln Stein, Cold Spring Harbor Laboratory. My Lab. International HapMap Project. Find common regions of genetic variation in human population to reduce cost of genetic association studies.
A Tale of Two Worms: Comparing the Genomes of C. elegans & C. briggsae • Lincoln Stein • Cold Spring Harbor Laboratory
International HapMap Project • Find common regions of genetic variation in the human population. • Reduce cost of genetic association studies. • 600,000 SNPs x 270 individuals
Gramene • Comparative genomics among monocots • Rice as model system • Rice genome, maps, proteins, mutants, QTLs, phenotypes • Map alignments to maize, wheat, oats, barley, etc.
Genome KnowledgeBase • Biological pathways in human • Curated by experts in the field • Designed for • Education • Data mining & discovery • Open data/Open software
WormBase • Community database for C. elegans • C. elegans genome • C. briggsae genome • Genetic maps • Developmental anatomy • RNAi screens • Microarray screens • Evolutionary biology
Generic Model Organism Db • Reusable software for building model organism databases • Used by WormBase, FlyBase, Gramene, RatDB, SGD, MGD… • Genome browsers, genetic maps, curation tools…
[Figure: hermaphrodite, male (lateral), and male (ventral) views; scale bars 200 µm and 10 µm]
Sequencing C. briggsae • Isolate DNA, make libraries (2 mo) • Map libraries (4 mo) • Shotgun sequence genome (1 wk) • Assemble genome (6 mo) • Analyze genome (9 mo)
The Draft • BAC Map • Sequence Contigs • Supercontigs • Scaffolds (Jim Mullikin, Sanger Center; LaDeana Hillier, WUSTL)
briggsae genes: “hybrid” strategy [Figure: Elegans predictions and Briggsae predictions] (Avril Coghlan, University of Dublin)
How accurate is it? • C. elegans gold standard • 2,257 genes entirely confirmed by mRNA data • Results on C. elegans set • 92% of the time, the hybrid method picked the “gold standard gene” correctly • 32 genes incorrectly split into 2 or more predictions (1.4%) • 49 genes incorrectly merged into 1 prediction (1%)
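A gene predictor can be scored against a gold-standard set along the lines of this slide by comparing genomic spans. The sketch below is only an illustration: it assumes each gene is reduced to a single half-open (start, end) interval on one strand, and the names `overlaps` and `classify` are hypothetical. It is not the evaluation code used for the talk, which would also have to compare exon structure.

```python
from collections import Counter

def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]

def classify(gold, predictions):
    """Label each gold-standard gene as 'correct', 'split', 'merged', or 'other'.

    gold:        {gene_name: (start, end)}
    predictions: list of (start, end) predicted gene spans
    """
    labels = {}
    for name, span in gold.items():
        hits = [p for p in predictions if overlaps(span, p)]
        if len(hits) >= 2:
            # One real gene covered by several predictions.
            labels[name] = "split"
        elif len(hits) == 1 and sum(overlaps(hits[0], g) for g in gold.values()) >= 2:
            # The single overlapping prediction also covers another real gene.
            labels[name] = "merged"
        elif hits == [span]:
            labels[name] = "correct"
        else:
            labels[name] = "other"  # missed, or boundaries differ
    return labels

# Toy data (made up for illustration only).
gold = {"g1": (0, 100), "g2": (200, 300), "g3": (400, 500), "g4": (600, 700)}
predictions = [(0, 100), (200, 240), (260, 300), (400, 700)]
labels = classify(gold, predictions)
print(Counter(labels.values()))
```

Note that a merge is counted once per gold-standard gene involved, so one over-long prediction spanning two genes contributes two "merged" labels.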
Identifying Orthologs • Best similarity match between Cb and Ce genes gives a candidate ortholog pair • Ortholog vs paralog? Use colinearity to resolve ambiguities (Todd Harris, CSHL)
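The two ideas on this slide, best similarity match and colinearity, can be sketched as follows. This is a minimal illustration under stated assumptions: the reciprocal-best-hit rule is a common way to turn "best similarity match" into ortholog calls, but the slide does not say that exactly this rule was used; the function names, gene names, and scores are all hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: {(query, subject): similarity_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(ce_vs_cb, cb_vs_ce):
    """Pairs (ce, cb) where each gene is the other's best match."""
    fwd = best_hits(ce_vs_cb)
    rev = best_hits(cb_vs_ce)
    return sorted((ce, cb) for ce, cb in fwd.items() if rev.get(cb) == ce)

def neighbours(order, gene):
    """Genes immediately left and right of `gene` in a chromosomal gene order."""
    i = order.index(gene)
    return {order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)}

def resolve_by_colinearity(ce_gene, candidates, ce_order, cb_order, known_pairs):
    """Among ambiguous briggsae candidates, prefer the one whose neighbours
    are orthologs of the elegans gene's neighbours. Returns None on a tie."""
    expected = {known_pairs[g] for g in neighbours(ce_order, ce_gene) if g in known_pairs}
    scored = sorted(
        ((len(neighbours(cb_order, c) & expected), c) for c in candidates),
        reverse=True,
    )
    if not scored or scored[0][0] == 0:
        return None
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]

# Hypothetical scores and gene orders, for illustration only.
ce_vs_cb = {("ceA", "cbA"): 900, ("ceA", "cbZ"): 300, ("ceC", "cbC"): 750, ("ceD", "cbA"): 400}
cb_vs_ce = {("cbA", "ceA"): 880, ("cbC", "ceC"): 760, ("cbZ", "ceA"): 310}
pairs = dict(reciprocal_best_hits(ce_vs_cb, cb_vs_ce))

ce_order = ["ceA", "ceB", "ceC"]
cb_order = ["cbA", "cbB1", "cbC", "cbX", "cbB2"]
print(resolve_by_colinearity("ceB", ["cbB1", "cbB2"], ce_order, cb_order, pairs))
```

Here `ceB` has two equally plausible matches; the one flanked by the orthologs of `ceA` and `ceC` is chosen.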
Comparing Orthologs • 12,155 orthologs • 807 C. briggsae “orphans” • 1,061 C. elegans “orphans” • Divergence date: 80-110 Mya • All genes under various degrees of purifying selection (Todd Harris, Jason Stajich)
Orthologs Similar but Differ in Detail Briggsae has 1 new intron every 5th gene.
Comparing Gene Families: TRIBE-MCL [Figure: proteins grouped into Clusters 1-5]
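TRIBE-MCL groups proteins with the Markov Cluster (MCL) algorithm applied to a graph of pairwise sequence similarities. The sketch below is a toy, pure-Python version of the MCL iteration (expansion, then inflation) on a small dense matrix. It is not the TRIBE-MCL software; the function names, the self-loop weight of 1.0, and the inflation value of 2.0 are assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale every column of square matrix m (list of lists) to sum to 1, in place."""
    n = len(m)
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            m[i][j] /= total
    return m

def mcl(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a symmetric, non-negative similarity matrix.

    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(similarity)
    # Add self-loops (weight 1.0, an assumption) and make columns stochastic.
    m = [[1.0 if i == j else float(similarity[i][j]) for j in range(n)] for i in range(n)]
    normalize_columns(m)
    for _ in range(max_iter):
        # Expansion: matrix squaring spreads flow along longer paths.
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: element-wise power strengthens strong flows, weakens weak ones.
        inflated = normalize_columns([[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each non-empty row is an attractor; its non-zero columns are the cluster members.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)

# Toy graph: two groups of three mutually similar proteins, no links between groups.
sim = [[0] * 6 for _ in range(6)]
for group in ([0, 1, 2], [3, 4, 5]):
    for a in group:
        for b in group:
            if a != b:
                sim[a][b] = 1
print(mcl(sim))  # [[0, 1, 2], [3, 4, 5]]
```

A higher inflation value generally yields finer-grained clusters, which is the main knob when tuning family size.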
Cb/Ce protein clusters • 2169 clusters of >= 2 members • 24% of elegans single-copy genes • 28% of briggsae single-copy genes (Jason Stajich, rotation student)
Protein Family Clusters >200 clusters unbalanced by more than 2-fold
A Rapidly Evolving Family: Olfactory Receptors
PFAM Class   C. elegans   C. briggsae
7tm_4        269          222
7tm_5        322          163
sra          37           18
srb          16           12
sre          55           51
srg          32           30
Total        718          476
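The "more than 2-fold" imbalance criterion from the protein-cluster slide can be applied directly to the per-class counts listed here. The snippet is a small sketch: the counts are copied from the slide, while the function names and the way the fold difference is defined (larger count divided by smaller) are assumptions.

```python
# (C. elegans, C. briggsae) gene counts per PFAM class, as listed on the slide.
counts = {
    "7tm_4": (269, 222),
    "7tm_5": (322, 163),
    "sra": (37, 18),
    "srb": (16, 12),
    "sre": (55, 51),
    "srg": (32, 30),
}

def fold_difference(ce, cb):
    """Larger count divided by smaller count; infinite if one species has none."""
    low, high = min(ce, cb), max(ce, cb)
    return float("inf") if low == 0 else high / low

def unbalanced(family_counts, fold=2.0):
    """Names of families whose counts differ by more than `fold` between species."""
    return [
        name
        for name, (ce, cb) in family_counts.items()
        if fold_difference(ce, cb) > fold
    ]

for name, (ce, cb) in counts.items():
    print(f"{name:6s} {ce:4d} {cb:4d}  {fold_difference(ce, cb):.2f}x")
print(unbalanced(counts))
```

With a strict 2-fold cutoff only sra qualifies; 7tm_5 sits just under the threshold, so the outcome is sensitive to where the cutoff is drawn.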
Sra Olfactory Receptor Family Putative ortholog pair elegans exclusive subtree
Synteny: Aligning C.b. to C.e.
Type     Intergen   Upstr    Downst    CDS       Intron    5' UTR   3' UTR   Repeat   Total
Strong   61,615     27,512   30,192    49,358    114,323   2,783    7,239    28,313   321,335
Coding   41,817     11,600   15,571    152,086   49,135    855      1,557    12,095   284,716
Weak     115,200    53,189   59,542    188,601   250,603   5,885    11,823   49,624   734,467
TOTAL    218,632    92,301   105,305   390,045   414,061   9,523    20,619   90,032   1,340,518
(Todd Harris & Jason Stajich)
Synteny Reconstruction • Raw aligned segments (WABA) • Merge overlaps • Merge adjacent merged segments • Reconstruct interrupted segments • Reconstructed segments (Yours truly)
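The first two merging steps listed on this slide can be sketched as interval merging. This is a deliberately simplified, one-dimensional illustration: it treats each aligned segment as a (start, end) interval on a single sequence, whereas real synteny reconstruction has to track coordinates and orientation in both genomes. The function name and the `max_gap` parameter are hypothetical.

```python
def merge_segments(segments, max_gap=0):
    """Merge overlapping segments, plus segments separated by at most max_gap bases.

    segments: iterable of (start, end) with start < end
    max_gap=0 merges only overlapping or touching segments;
    a positive max_gap also joins nearby ("adjacent") segments.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous block rather than starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]

# Made-up coordinates for illustration.
raw = [(100, 200), (150, 300), (310, 400), (1000, 1200)]
print(merge_segments(raw))              # [(100, 300), (310, 400), (1000, 1200)]
print(merge_segments(raw, max_gap=50))  # [(100, 400), (1000, 1200)]
```

Running the overlap merge first and the gap-tolerant merge second mirrors the order of the steps on the slide; the gap tolerance controls how aggressively small interruptions are bridged.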
Reconstructing briggsae • 4,837 reconstructed segments • ~85% of genome • 0.5-0.7 breakpoints/Mb/My
Rearrangement is Local Junctions of elegans chromosomes on briggsae contigs
Rearrangement is Local Junctions of elegans chromosome arms on briggsae contigs
[Figure: big map with tracks for syntenic blocks, genes & meiotic map, orthologs, orphans, essential genes, repetitive elements, KA/KS, and KS]
Recent Work: Chemosensory Receptors • Third largest C. elegans protein family. • Subclass of GPCR 7TM receptors.
PFAM Class   C. elegans   C. briggsae
7tm_4        269          222
7tm_5        322          163
sra          37           18
srb          16           12
sre          55           51
srg          32           30
Questions • Are these differences real? • Mechanism of the differences? • Amplification vs gene loss • Why are some subfamilies unbalanced and not others? • Phenotypic consequences of the differences?
Sra Olfactory Receptor Family Putative ortholog pair elegans exclusive subtree
Are the Differences Real? • Intensive search for missing sra family members. [Figure: known sra genes used for similarity searching of the C. elegans and C. briggsae genomes to find new family members] (Jack Chen, Postdoc; Shraddha Pai, URP)
Results • Family size differences are real (still roughly twice as many elegans sra as briggsae sra) • Differences are due to species-specific tandem duplications, not to conversion into pseudogenes. • But…
Hitting non-sra elegans genes? 18 non-sra C. elegans genes 17 non-sra C. briggsae genes
A New Nematode Chemosensory (Sub)family? sra family sra-like genes
C36C5.7 7TM Domain Structure • Most candidates showed 7 transmembrane domain signatures characteristic of GPCR membrane receptors.
Repairing Incomplete Genes [Figure: gene model with a missed exon]
Before & After Before repairing: 6 TMs After repairing: 7 TMs
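A rough way to count candidate transmembrane (TM) segments in a protein, and so check whether a repaired gene model gains its seventh TM, is a sliding-window hydropathy scan. The sketch below uses the Kyte-Doolittle hydropathy scale with a 19-residue window and a 1.6 threshold. The slides do not say which TM predictor was actually used, so this is an assumed stand-in rather than the method behind the figures; the function name and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def count_hydrophobic_segments(seq, window=19, threshold=1.6):
    """Count maximal runs of windows whose mean hydropathy is >= threshold.

    Each run is treated as one candidate TM segment. Unknown residues score 0.
    """
    values = [KD.get(residue, 0.0) for residue in seq.upper()]
    if len(values) < window:
        return 0
    count = 0
    in_segment = False
    running = sum(values[:window])
    for i in range(len(values) - window + 1):
        if i > 0:
            # Slide the window one residue to the right.
            running += values[i + window - 1] - values[i - 1]
        above = running / window >= threshold
        if above and not in_segment:
            count += 1
        in_segment = above
    return count

# Synthetic sequences: hydrophobic stretches separated by charged loops.
helix, loop = "L" * 21, "K" * 15
before = loop.join([helix] * 6)   # model missing one hydrophobic stretch
after = loop.join([helix] * 7)    # model with the stretch restored
print(count_hydrophobic_segments(before), count_hydrophobic_segments(after))  # 6 7
```

Dedicated TM predictors are considerably more accurate than a plain hydropathy scan, so a count from this sketch is best read as a quick sanity check.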
Expression Patterns Co-Cluster with sra Family Genes [Figure: sra-like genes and sra genes] (Kim et al., Science 293: 2087-2092, 2001)