Other Genome Projects BIOL 473 Summer 2003
Why Other Genomes? • Proof of principle • Refinement and advancement of technology • “Relatively simple” data management • Models of human disease • Easy/inexpensive to culture/grow • Many mutant strains/lines already identified
Importance of Mutant Organisms in Identification of Gene Function: Mutant → Molecular Defect → Gene Function
Rodent Genome Projects • ~100 years of genetic research to support genomic findings • Hundreds of mutant strains, well-characterized genealogies of common strains (esp mice) • Evolutionary position relative to human: • Close: similar development, physiology & disease • Divergent: conserved blocks of sequence suggest essential function
Rodent/Human Genome Comparison • Extensive conservation of nucleotide sequences • Protein-coding regions (genes) • Small, noncoding intergenic regions (SNCIRs) • Suggests unknown but important function, perhaps regional control of the expression of multiple genes • Extensive conservation of gene order (synteny) • MMU11 syntenic with HSA5 at 1 Mb IL region: perfect correspondence of order, orientation, and spacing of 23 different genes • Supports common ancestry • Suggests segmental rearrangement of chromosomes during evolution
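The "order and orientation" criterion for a syntenic block can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names are hypothetical placeholders (not the 23 genes of the MMU11/HSA5 region), the helper names `flip` and `conserved_block` are invented here, and spacing between genes is not checked.

```python
from typing import List, Tuple

Gene = Tuple[str, str]  # (ortholog name, strand "+" or "-")


def flip(block: List[Gene]) -> List[Gene]:
    """Read a block from the opposite strand: reverse order, swap strands."""
    swap = {"+": "-", "-": "+"}
    return [(name, swap[strand]) for name, strand in reversed(block)]


def conserved_block(a: List[Gene], b: List[Gene]) -> bool:
    """True if b carries the same genes in the same order and relative
    orientation as a, allowing the whole block to be inverted."""
    return b == a or b == flip(a)


# Hypothetical orthologs (placeholder names, not real loci).
mouse = [("geneA", "+"), ("geneB", "+"), ("geneC", "-"), ("geneD", "+")]
human_same = list(mouse)
human_inverted = flip(mouse)
human_shuffled = [("geneA", "+"), ("geneC", "-"), ("geneB", "+"), ("geneD", "+")]

print(conserved_block(mouse, human_same))      # True
print(conserved_block(mouse, human_inverted))  # True
print(conserved_block(mouse, human_shuffled))  # False
```

Allowing a whole-block inversion reflects the idea of segmental rearrangement: a segment can be flipped as a unit while its internal order and relative orientation stay intact.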
Zebrafish (Danio rerio) • Development rapid and transparent • Easy to grow • Dense map of genetic markers • Many species-specific cell biology tools • Including human gene transfer • Including RNAi • Including organogenesis pathways • Significant synteny with human and mouse • Gene set >90% similar to human
Pufferfish (Fugu rubripes, Tetraodon nigroviridis) • Same gene information as humans in 1/8 the DNA • Lacks many repeats • Very small introns (many with the same exon/intron structure) • 400 million years of gene sequence and order conservation • Control regions easy to detect: closer to genes/less nonconserved intergenic region • 21 chromosomes, all smaller than human chromosome 21 • Microchromosomes are gene dense • Important for understanding • Unknown mechanisms of gene expression control • Chromosomal expansion • Function and persistence of “junk DNA”
Other Vertebrates • Salmon, sticklebacks, cichlids, and other commercial fish • Cats and dogs • Common diseases with humans • Important models of morphological variation • Important models of behavioral variation • Chimpanzee • Mechanisms of pathogen resistance, incl. HIV susceptibility • Genetic changes crucial for evolution of Homo sapiens • Agrispecies (cattle, horses; true for crops as well) • Whole genome sequencing prohibitively expensive • Partial genome sequencing and SNPs enhance decades of selective breeding data • First nutria genome report appeared July 2002!! • Kass & Doucet: Molecular Phylogeny of the Louisiana Nutria. Proc. LA Acad. Sci. 63:10-24.
Why? • Proof-of-principle: sequencing multi-cellular organisms • Provide understanding of complex organismal functions • Support decades of genetic research (esp with Drosophila)
Genomic Surprises in Drosophila melanogaster & Caenorhabditis elegans • Gene expression anomalies • Ce: leader sequence trans-splicing • Ce: polycistronic transcripts • Dm: high variance in transcript length • Dm: some distant regulatory sequences & long introns • 50% more genes in Ce, despite the greater complexity of Dm (# cells, # cell types, morphogenesis) • Large gene families • Ce: steroid hormone-receptor gene family • Dm: olfactory receptor gene family • High conservation of major regulatory and biochemical pathways • Some lost to parasitism in Ce • Some novel to Dm due to complete morphogenesis • RNAi highly effective in Ce: 90% gene knockout in 2/5 chromosomes • Models for human disease: 50-60% of human disease genes have Ce & Dm orthologs • Models for drug development • Prozac resistance in Ce; EtOH tolerance in Dm • No presumption that the trait is the same, but molecular interactions between gene products are conserved even when they affect distinct processes
Arabidopsis thaliana • First plant genome sequenced in its entirety • 115 Mb: about the same size as D. melanogaster but 2× the genes (25,500) • Two rounds of whole genome duplication • Extensive chromosome reshuffling • Considerable gene loss after duplication • 1500 tandem arrays of repeated genes (2-3 copies each) • Only 11,000 gene families: minimum for complex multicellularity • 800 nuclear genes of plastid descent • Likely ongoing process • Plastid-targeting signal lost; now function in the cytoplasm • 10% of genome is novel miniature repeats (MITEs, MULEs)
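The genome-size and gene-count figures on this slide translate directly into gene density. The short Python sketch below works through that arithmetic using only the two numbers given above (115 Mb, 25,500 genes); the function names are invented for illustration, and the result is a crude average that ignores how genes are actually distributed along the chromosomes.

```python
def genes_per_mb(genome_mb: float, gene_count: int) -> float:
    """Average number of genes per megabase of genome."""
    return gene_count / genome_mb


def kb_per_gene(genome_mb: float, gene_count: int) -> float:
    """Average genome span per gene, in kilobases (1 Mb = 1000 kb)."""
    return genome_mb * 1000 / gene_count


# Figures from the slide: 115 Mb genome, 25,500 genes.
ARABIDOPSIS_MB = 115
ARABIDOPSIS_GENES = 25_500

print(round(genes_per_mb(ARABIDOPSIS_MB, ARABIDOPSIS_GENES), 1))  # 221.7
print(round(kb_per_gene(ARABIDOPSIS_MB, ARABIDOPSIS_GENES), 1))   # 4.5
```

On average, that is roughly one gene every 4.5 kb. A genome of about the same size with half as many genes, as the slide states for D. melanogaster, would come out at roughly half that density.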
Classes of Arabidopsis genes absent/underrepresented in animals: • Enzymes for cell wall biosynthesis • Transcellular transport proteins • Minerals, organics, metabolites, toxins, macros • Photosynthesis enzymes (rubisco, ETSs) • Mediators of tropisms (turgor pressure, light, gravity, sessility) • Enzymes and cytochromes for secondary metabolites • Many R genes (pathogen resistance); interspersed, not clustered Classes of animal genes absent/underrepresented in Arabidopsis: • Ras G-protein family • Tyrosine kinase receptors • Nuclear steroid receptors
Other Plant Projects & Why? • Projects underway for 50 different species • Rice and maize: small genomes, economically important • Many commercial crop plants are polyploid, and their genomes are too large to be feasibly sequenced in their entirety • Must rely on comparative genomics to support hybridization data • Rice and Arabidopsis show extensive but complex synteny • Focus on QTLs rather than Mendelian (single-locus) traits • Resistance, flowering time, tolerance, sugar content, etc. • Domesticated/wild relationships: maize vs. teosinte • Mutation/morphology relationships: Brassica oleracea • Cabbage, kale, Brussels sprouts, broccoli, cauliflower, kohlrabi • Support of classical genetics • Sweet pea, snapdragon • Support of forestry (poplar: small genome, easy to grow) • Lumber improvement (lignins, enzymes) • Biomass-biofuel improvement • Biospheric carbon fixation • Parent/ecotype crop comparisons by comparative genomics