860 likes | 1.23k Views
Genomics, Bioinformatics and the Revolution in Biology. Jonathan Pevsner, Ph.D. Kennedy Krieger Institute/ Johns Hopkins School of Medicine. Outline. Three views of bioinformatics and genomics Informatics From small to large From genotype to phenotype The chromosomes
E N D
Genomics, Bioinformatics and the Revolution in Biology Jonathan Pevsner, Ph.D. Kennedy Krieger Institute/ Johns Hopkins School of Medicine
Outline Three views of bioinformatics and genomics Informatics From small to large From genotype to phenotype The chromosomes SNPs, HapMap, and the 1000 Genomes project
Definitions of bioinformatics and genomics • Bioinformatics is the interface of biology and computers. • It is the analysis of proteins, genes and genomes • using computer algorithms and databases. • Genomics is the analysis of genomes, including the • nature of genetic elements on chromosomes. • The tools of bioinformatics are used to make • sense of the billions of base pairs of DNA • that are sequenced by genomics projects. • Genetics is the study of the origin and expression of individual uniqueness.
Three views of bioinformatics and genomics 1. The field of informatics 2. From small to large 3. From genotype to phenotype
bioinformatics medical informatics public health informatics algorithms databases infrastructure genomics Tool-users Tool-makers
Three views of bioinformatics and genomics 1. The field of informatics 2. From small to large 3. From genotype to phenotype
DNA RNA protein phenotype
200 180 160 140 120 100 80 60 40 20 0 Rapid growth of DNA sequences Total number of DNA base pairs in GenBank/WGS Base pairs (billions) Sequences (millions) 1982 1992 2002 2008 Year
Time of development Body region, physiology, pharmacology, pathology
The Origin of Species (1859) It is interesting to contemplate a tangled bank, clothed with many plants of many kinds, with birds singing on the bushes, with various insects flitting about, and with worms crawling through the damp earth, and to reflect that these elaborately constructed forms, so different from each other, and dependent upon each other in so complex a manner, have all been produced by laws acting around us. Source: Origin of Species, Chapter 15
Eukaryotes (Baldauf et al. 2000) fungi animals slime mold plants Paramecium Plasmodium Trypanosoma Giardia Trichomonas
8 chromosomes (5,000 genes) 16 chromosomes (10,000 genes) 16 chromosomes (6,000 genes) Wolfe et al. (1999)
Paramecium tetraurelia: a ciliate with two nuclei, 40,000 genes, and three whole-genome duplications
Phylogenetic footprinting Phylogenetic shadowing Population shadowing
Three views of bioinformatics and genomics 1. The field of informatics 2. From small to large 3. From genotype to phenotype
DNA RNA protein pathway cell organism population
DNA RNA protein pathway cell organism population We see 500 inpatients and 13,000 outpatients per year at the Kennedy Krieger Institute. Why do children engage in self-injurious behavior? In many cases, there are chromosomal insults. Phenotype
DNA RNA protein pathway cell organism population From genotype… …to phenotype
DNA RNA protein pathway cell organism population DNA RNA protein cellular phenotype clinical phenotype
DNA RNA protein pathway cell organism population DNA RNA protein Central dogma of molecular biology: DNA is transcribed into RNA, and translated into protein. Central dogma of bioinformatics/genomics: the genome is transcribed into the transcriptome, and translated into the proteome.
DNA RNA protein pathway cell organism population 200 180 160 140 120 100 80 60 40 20 0 1982 1992 2002 2008 Over 200 billion base pairs of DNA have now been sequenced, from >165,000 organisms.
DNA RNA protein pathway cell organism population Scope of bioinformatics Sequence analysis Pairwise alignment Multiple sequence alignment Phylogeny Database searching (e.g. BLAST) Functional genomics RNA studies; gene expression profiling Proteomics; protein structure Gene function
Pairwise alignments in the 1950s b-corticotropin (sheep) Corticotropin A (pig) ala gly glu asp asp glu asp gly ala glu asp glu CYIQNCPLG CYFQNCPRG Oxytocin Vasopressin
globins: a- b- myoglobin Early example of sequence alignment: globins (1961) H.C. Watson and J.C. Kendrew, “Comparison Between the Amino-Acid Sequences of Sperm Whale Myoglobin and of Human Hæmoglobin.” Nature 190:670-672, 1961.
LAGAN 2e Fig. 5.21
DNA RNA protein pathway cell organism population Scope of bioinformatics Sequence analysis Pairwise alignment Multiple sequence alignment Phylogeny Database searching (e.g. BLAST) Functional genomics RNA studies; gene expression profiling Proteomics; protein structure Gene function
DNA RNA protein pathway cell organism population Four bases: A, G, C, T arranged in base pairs along a double helix (1953). Human genome project: sequencing all ~3 billion base pairs (2003).
DNA RNA protein pathway cell organism population 1995: first genome sequence (a bacterium) 2000: fruit fly genome, plant 2003: human genome 2008: --two individual human genomes finished --1,000 human genomes (launched) --SNPs used to study chromosomes
DNA RNA protein pathway cell organism population
DNA RNA protein pathway cell organism population
DNA RNA protein pathway cell organism population Time of development Body region, physiology, pharmacology, pathology
DNA RNA protein pathway cell organism population
DNA RNA protein pathway cell organism population Genotype Phenotype
Outline Three views of bioinformatics and genomics Informatics From small to large From genotype to phenotype The chromosomes SNPs, HapMap, and the 1000 Genomes project
Eukaryotic genomes are organized into chromosomes Genomic DNA is organized in chromosomes. The diploid number of chromosomes is constant in each species (e.g. 46 in human). Chromosomes are distinguished by a centromere and telomeres. The chromosomes are routinely visualized by karyotyping (imaging the chromosomes during metaphase, when each chromosome is a pair of sister chromatids).
Fig. 16.19 Page 565
nucleolar organizing center centromere human chromosome 21 at NCBI
nucleolar organizing center centromere human chromosome 21 at www.ensembl.org
centromere human chromosome 21 at UCSC Genome Browser
centromere human chromosome 21 at UCSC Genome Browser
First P.G. mitosis in polar view. Tradescantia virginiana, Commelinaceae, n = 9 (from aberrrant plant with 22 chromosomes). 2 BE - CV smears. x 1200. Printed on multigrade paper. Darlington.