Introduction to Genomics and the Tree of Life Chapter 13



  1. Introduction to Genomics and the Tree of Life Chapter 13

  2. Extra reading • Next generation sequencer: What can next-generation sequencers do for genetics/genomics research? • Compar_genomics: What can we learn from comparative genomics?

  3. Outline of today’s lecture • Introduction: 5 perspectives, history of life • Genome-sequencing projects: chronology • Genome analysis: criteria, resequencing, metagenomics • DNA sequencing technologies: Sanger, 454, Solexa • Process of genome sequencing: centers, repositories • Genome annotation: features, prokaryotes, eukaryotes

  4. Five approaches to genomics As we survey the tree of life, consider these perspectives: • Approach I: cataloguing genomic information (genome size; number of chromosomes; GC content; isochores; number of genes; repetitive DNA; unique features of each genome) • Approach II: cataloguing comparative genomic information (orthologs and paralogs; COGs; lateral gene transfer) • Approach III: function; biological principles; evolution (how genome size is regulated; polyploidization; birth and death of genes; neutral theory of evolution; positive and negative selection; speciation) • Approach IV: human disease relevance • Approach V: bioinformatics aspects (algorithms, databases, websites) Page 519
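GC content, listed under Approach I above, is a simple quantity to compute from a sequence. The following is a minimal Python sketch, not a method from the lecture. It assumes that GC content means the fraction of G and C among unambiguous A/C/G/T bases; the function name `gc_content` and the toy sequence are hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous symbols such as N are ignored (an assumption)
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


# Hypothetical toy sequence, not taken from any real genome.
example = "ATGCGCGTTAAC"
print(f"GC content: {gc_content(example):.2f}")
```

For a whole genome the same function could be applied per chromosome or in sliding windows, which is one way to look for regions of differing base composition such as isochores.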

  5. Introduction Lessons learned from comparative genomics • What have we learned about genes by comparing genomic sequences? • What have we learned about regulation? • About 5% of the human genome is under purifying selection • Positively regulated regions • Mechanisms and history of mammalian evolution • Nonuniformity of neutral evolutionary rates within species • Nonuniformity of evolution along the branches of phylogeny • Learning more from existing data • Choice of species • Choice of tools • Future of comparative genomics

  6. Levels of analysis in genomics
     level         topics                   databases
     DNA           genes, chromosomes       GenBank
     RNA           ESTs, ncRNA              UniGene, GEO
     protein       ORFs, composition        UniProt
     complexes     binary, multimeric       BIND
     pathways                               COGs, KEGG
     organelles
     organs
     individuals   variation and disease    HapMap
     species       speciation               TaxBrowser; SGD
     genus                                  JAX mouse
     phylum                                 FishBase
     kingdom                                TOL

  7. Definitions of terms Genomics is the study of genomes (the DNA comprising an organism) using the tools of bioinformatics. Bioinformatics is the study of proteins, genes, and genomes using computer algorithms and databases. Systematics is the scientific study of the kinds and diversity of organisms and of any and all relationships among them. Classification is the ordering of organisms into groups on the basis of their relationships. The relationships may be evolutionary (phylogenetic) or may refer to similarities of phenotype (phenetic). Taxonomy is the theory and practice of classifying organisms.

  8. Pace (2001) described a tree of life based on small subunit rRNA sequences. This tree shows the main three branches described by Woese and colleagues. Fig. 13.1 Page 521

  9. Molecular sequences as basis of trees Historically, trees were generated primarily using characters provided by morphological data. Molecular sequence data are now commonly used, including sequences (such as small-subunit RNAs) that are highly conserved. Visit the European Small Subunit Ribosomal RNA database for 20,000 SSU rRNA sequences. Page 523

  10. Tree of life from David Hillis’ lab (based on ~3000 rRNAs). Labeled groups: animals, plants, protists, bacteria, fungi, archaea, with a "you are here" marker. http://www.zo.utexas.edu/faculty/antisense/Download.html

  11. Tree of life from David Hillis’ lab (based on ~3000 rRNAs) you are here http://www.zo.utexas.edu/faculty/antisense/Download.html

  12. Ribosomal RNA Database • Ribosomal Database Project: http://rdp.cme.msu.edu/index.jsp • Santos, S. R. and Ochman, H. Identification and phylogenetic sorting of bacterial lineages with universally conserved genes and proteins. Environmental Microbiology. 2004 Jul;6(7):754-9. ► Download fusA (translation elongation factor 2 [EF-2]) ► Obtain DNA in FASTA format ► Align with ClustalW in MEGA ► Create a neighbor-joining tree Page 524
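Neighbor-joining, the last step in the workflow above, takes a matrix of pairwise distances between aligned sequences as its input. The following is a minimal Python sketch of that intermediate step, not a description of what MEGA does internally. It assumes the sequences are already aligned and uses the simple p-distance (the proportion of differing sites, skipping gapped columns); MEGA offers other distance models. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def parse_fasta(text):
    """Parse FASTA text into a dict of {first word of header: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(records):
    """Pairwise p-distances keyed by (name1, name2), names in sorted order."""
    return {
        (m, n): p_distance(records[m], records[n])
        for m, n in combinations(sorted(records), 2)
    }


# Hypothetical toy alignment, not real fusA data.
toy_alignment = """>seqA toy
ACGTACGT
>seqB toy
ACGTACGA
>seqC toy
AC-TTCGA
"""

matrix = distance_matrix(parse_fasta(toy_alignment))
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building routine would then repeatedly join the pair of taxa that minimizes the neighbor-joining criterion computed from this matrix.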

  13. European Small Subunit Ribosomal RNA database (http://www.psb.ugent.be/rRNA/ssu/)

  14. Neighbor-joining tree of ~150 fusA (GTPase) DNA sequences. Labeled taxa: Yersinia pestis, Clostridium, Aquifex aeolicus, Mycoplasma, Bacillus anthracis, Mycobacterium, Rickettsia, Treponema

  15. History of life on earth • 4.55 BYA: formation of earth (violent 100 MY period) • 4.4-3.8 BYA: last ocean-evaporating impacts • 3.9 BYA: oldest dated rocks • 3.8 BYA: sun brightened to 70% of today’s luminosity • Ammonia, methane, or carbon dioxide atmosphere • Earliest life: RNA, protein Source: Schopf J.W. (ed.), Life’s Origins (U. Calif. Press, 2002) Page 521

  16. Timeline in millions of years ago (MYA; axis 1000, 500, 100, 0), spanning the Proterozoic and Phanerozoic eons. Events shown: deuterostome/protostome; echinoderm/chordate; Cambrian explosion; land plants; insects; Age of Reptiles ends. Page 522

  17. Timeline in millions of years ago (MYA; axis 100, 50, 10, 0). Events shown: mass extinction; dinosaurs extinct; mammalian radiation; human/chimp divergence. Page 522

  18. Timeline in millions of years ago (MYA; axis 10, 5, 1, 0). Events shown: Homo sapiens/chimp divergence; Australopithecus (Lucy); earliest stone tools; emergence of Homo erectus. Page 522

  19. Timeline in years ago (axis 1,000,000; 500,000; 100,000; 0). Events shown: Homo erectus emerges in Africa; Mitochondrial Eve. Page 523

  20. Timeline in years ago (axis 100,000; 50,000; 10,000; 0). Events shown: emergence of anatomically modern H. sapiens; Neanderthal and Homo erectus disappear. Page 523

  21. Timeline in years ago (axis 10,000; 5,000; 1,000; 0). Events shown: “Ice Man” from Alps; earliest pyramids; Aristotle. Page 523

  22. Timeline in years ago (axis 1,000; 500; 100; 0). Events shown: algebra; Gutenberg; calculus; Darwin, Mendel. Page 523

  23. Chronology of genome sequencing projects We will next summarize the major achievements in genome sequencing projects from a chronological perspective. Page 525

  24. Chronology of genome sequencing projects 1976: first viral genome. Fiers et al. sequence bacteriophage MS2 (3,569 base pairs, Accession NC_001417). 1977: Sanger et al. sequence bacteriophage φX174. This virus is 5,386 base pairs (encoding 11 genes). See accession J02482; NC_001422. Page 527

  25. Chronology of genome sequencing projects 1981: human mitochondrial genome, 16,500 base pairs (encodes 13 proteins, 2 rRNA, 22 tRNA). Today (10/09), over 1800 mitochondrial genomes have been sequenced. 1986: chloroplast genome, 156,000 base pairs (most are 120 kb to 200 kb). Page 527

  26. mitochondrion chloroplast Lack mitochondria (?)

  27. Entrez Genomes organelle resource at NCBI http://www.ncbi.nlm.nih.gov/genomes/ORGANELLES/organelles.html

  28. There are >2100 eukaryotic organelle genomes (10/09)

  29. GOBASE: resource for organelle genomes http://megasun.bch.umontreal.ca/gobase/

  30. MitoDat: resource for organelle genomes “This database is dedicated to the nuclear genes specifying the enzymes, structural proteins, and other proteins, many still not identified, involved in mitochondrial biogenesis and function. MitoDat highlights predominantly human nuclear-encoded mitochondrial proteins.” Not updated recently. http://www-lecb.ncifcrf.gov/mitoDat/

  31. MitoMap: resource for organelle genomes http://www.mitomap.org/

  32. It is possible to map mutations in human mitochondrial DNA that are responsible for disease

  33. Chronology of genome sequencing projects 1995: first genome of a free-living organism, the bacterium Haemophilus influenzae Page 530

  34. Chronology of genome sequencing projects 1996: first eukaryotic genome. The complete genome sequence of the budding yeast Saccharomyces cerevisiae was reported. We will describe this genome soon. Also in 1996, TIGR reported the sequence of the first archaeal genome, Methanococcus jannaschii. Page 532

  35. Chronology of genome sequencing projects 1997: more bacteria and archaea. Escherichia coli: 4.6 megabases, 4200 proteins (38% of unknown function). 1998: first multicellular organism. Nematode Caenorhabditis elegans: 97 Mb; 19,000 genes. 1999: first human chromosome. Chromosome 22 (49 Mb, 673 genes). Page 532
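The sizes and gene counts quoted on the slide above allow a rough comparison of gene density (genes per megabase). The following is a minimal Python sketch of that arithmetic, using only the figures from the slide. It assumes the E. coli protein count can stand in for a gene count; the variable and function names are hypothetical.

```python
# Figures quoted on the slide above: (size in megabases, gene count).
genomes = {
    "Escherichia coli": (4.6, 4200),  # protein count used as a gene proxy (assumption)
    "Caenorhabditis elegans": (97.0, 19000),
    "Human chromosome 22": (49.0, 673),
}


def gene_density(size_mb, gene_count):
    """Return genes per megabase."""
    if size_mb <= 0:
        raise ValueError("genome size must be positive")
    return gene_count / size_mb


densities = {
    name: gene_density(size, genes) for name, (size, genes) in genomes.items()
}

# Print from the most gene-dense to the least gene-dense sequence.
for name, density in sorted(densities.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {density:.0f} genes per Mb")
```

On these figures the bacterial genome is far more gene-dense than the nematode genome, which in turn is more gene-dense than human chromosome 22.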

  36. 1999: Human chromosome 22 sequenced

  37. Chronology of genome sequencing projects 2000: fruit fly Drosophila melanogaster (13,000 genes); plant Arabidopsis thaliana; human chromosome 21. 2001: draft sequence of the human genome (public consortium and Celera Genomics). Page 534

  38. 2000

  39. Overview of genome analysis • Selection of genomes for sequencing • Sequence one individual genome, or several? • How big are genomes? • Genome sequencing centers • Sequencing genomes: strategies • When has a genome been fully sequenced? • Repository for genome sequence data • Genome annotation Page 537

  40. Table 13.15 p.538

  41. Overview of genome analysis Fig. 13.8 p.539

  42. Criteria for selecting genomes for sequencing • Criteria include: • genome size (some plant genomes are far larger than the human genome) • cost • relevance to human disease (or other disease) • relevance to basic biological questions • relevance to agriculture Page 538

  43. Criteria for selecting genomes for sequencing • Criteria include: • genome size (some plant genomes are far larger than the human genome) • cost • relevance to human disease (or other disease) • relevance to basic biological questions • relevance to agriculture • Recent projects: • Chicken • Chimpanzee • Cow • Dog • Fungi (many) • Honey bee • Sea urchin • Rhesus macaque Page 540

  44. Selection criteria Selection of genomes for sequencing is based on specific criteria. For an overview, see a series of white papers posted on the National Human Genome Research Institute (NHGRI) website: http://www.genome.gov/10002154 For a description of NHGRI selection criteria, visit: http://www.genome.gov/10001495 Page 540

  45. Criteria for selecting genomes for sequencing Sequence one individual genome, or several? Try one… • Each genome center may study one chromosome from an organism • It is necessary to measure polymorphisms (e.g. SNPs) in large populations • For viruses, thousands of isolates may be sequenced • For the human genome, cost is the impediment Page 540
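Measuring a polymorphism across a population, as mentioned on the slide above, usually starts with counting alleles at a site. The following is a minimal Python sketch, not a method from the lecture. It assumes diploid genotypes written as two-letter strings; the function name and the toy genotypes are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for genotype strings such as "AG"."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)  # each character is one allele copy
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical genotypes at one SNP for five individuals.
toy_genotypes = ["AA", "AG", "GG", "AG", "AA"]
frequencies = allele_frequencies(toy_genotypes)
minor_allele_frequency = min(frequencies.values())
print(frequencies, minor_allele_frequency)
```

With only one sequenced individual at most two alleles can be observed at a site, which is why population-scale sampling is needed to estimate frequencies like these.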
