Introduction to Molecular Biology, Genetics and Genomics Sushmita Roy www.biostat.wisc.edu/bmi576/ sroy@biostat.wisc.edu September 6, 2012 BMI/CS 576
Goals for today • Molecular biology crash course: • The different parts of a cell • DNA, RNA, chromosomes, nucleus, cytoplasm • Biochemical entities of a cell: mRNA, proteins, metabolites • Genes, heredity, transcription, translation, gene regulation, gene expression, alternative splicing • Genomics crash course: • Genomes, functional genomics, other omes, networks
Organization of biological information • Levels, from largest to smallest: organism, tissue, cell, chromosome, gene • Source: http://publications.nigms.nih.gov/thenewgenetics/chapter1.html
The central dogma of molecular biology • DNA → (transcription) → RNA → (translation) → proteins
image from the DOE Human Genome Program http://www.ornl.gov/hgmis
DNA • Short for deoxyribonucleic acid • composed of small chemical units called nucleotides (or bases) • adenine (A), cytosine (C), guanine (G) and thymine (T) • ATGC is the alphabet • DNA is double stranded: made up of two twisting strands • Each strand of DNA is a string composed of the four letters: A, C, G, T
DNA is a double helical molecule • DNA molecules consist of two strands arranged in a double helix • DNA is made up of nucleotides • The double-helical structure lets the DNA molecule store and pass on genetic information with great precision • Structure credited to James Watson, Francis Crick, Maurice Wilkins and Rosalind Franklin
Watson-Crick Base Pairs • C always bonds to G • A always bonds to T • This is called base pairing • A and G are double-ringed structures called purines • C and T are single-ringed structures called pyrimidines
5’ and 3’ of a DNA molecule • The backbone of this molecule has alternating sugar (deoxyribose) and phosphate groups • each strand of DNA has a “direction” • at one end, the terminal carbon atom in the backbone is the 5’ carbon atom of the terminal sugar • at the other end, the terminal carbon atom is the 3’ carbon atom of the terminal sugar • therefore we can talk about the 5’ and the 3’ ends of a DNA strand • the two strands of a double helix run in opposite directions (antiparallel)
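Base pairing and strand direction together determine how one strand is read off the other. The following is a minimal Python sketch, not part of the original slides: the function names `complement` and `reverse_complement` are illustrative, and the code assumes the input contains only the letters A, C, G and T.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-paired partner of each letter, position by position."""
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand: str) -> str:
    """Return the opposite strand written 5' to 3'.

    The two strands are antiparallel, so the partner strand has to be
    reversed before it reads in the conventional 5' to 3' direction.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ATGCCGTAA"  # made-up example sequence, written 5' to 3'
    print("5'-" + top + "-3'")
    print("3'-" + complement(top) + "-5'")
    print("opposite strand, 5' to 3':", reverse_complement(top))
```

Applying `reverse_complement` twice returns the original strand, which is one way to see that either strand carries enough information to rebuild the other.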
DNA stores the blueprint of an organism • The heredity molecule • Has the information needed to make an organism • Base pairing enables self-replication: • each strand has all the information needed to rebuild its partner
Chromosomes • All the DNA of an organism is divided up into individual chromosomes • prokaryotes (single-celled organisms lacking nuclei) typically have a single circular chromosome • eukaryotes (organisms with nuclei) have a species-specific number of chromosomes Image from www.genome.gov
DNA packaging in chromatin • DNA is very long (about 2 m per human cell), while the cell is very small • Chromosomes compress the DNA molecule roughly 50,000-fold • The complex of DNA and proteins is called chromatin
Genes • genes are the basic units of heredity • a gene is a sequence of bases that specifies a protein or an RNA • the human genome comprises roughly 20,000 to 25,000 protein-coding genes (the estimate is still being revised) • One gene can have many functions • One function can require many genes • Example of a stretch of sequence: …GTATGTCTAAGCCTGAATTCAGTCTGCTTTAAACGGCTTC…
Structure of genes • [Figure: a stretch of DNA with labels Gene A, Gene B, Gene C, non-coding gene, and promoter]
Genomes • A genome is the complete complement of DNA for a given species • the human genome consists of 23 pairs of chromosomes (2 × 23 = 46) • every cell (except egg and sperm cells and mature red blood cells) contains the complete genome of an organism
RNA • RNA is like DNA except: • it is usually single stranded • uracil (U) is used in place of thymine (T) • a strand of RNA can be thought of as a string composed of the four letters: A, C, G, U
Transcription • In eukaryotes: happens inside the nucleus • RNA polymerase is an enzyme that builds an RNA strand from a gene • RNA polymerase II (Pol II) is recruited to specific parts of the genome in a condition-specific way • Transcription factors are the proteins that recruit Pol II • RNA that is transcribed from a protein-coding gene is called messenger RNA (mRNA)
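At the level of strings, transcription can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a model of the polymerase: the function names are hypothetical, both inputs are assumed to be written 5' to 3' with only A, C, G and T, and promoters, transcription factors and RNA processing are ignored.

```python
def transcribe_from_coding(coding_strand: str) -> str:
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


# Pairing used when RNA is built on a DNA template: A->U, C->G, G->C, T->A.
DNA_TO_RNA_PAIR = str.maketrans("ACGT", "UGCA")


def transcribe_from_template(template_strand: str) -> str:
    """Build the RNA that pairs with the template strand.

    The template is given 5' to 3'; the RNA is antiparallel to it, so the
    paired letters are reversed to report the RNA 5' to 3' as well.
    """
    return template_strand.upper().translate(DNA_TO_RNA_PAIR)[::-1]


if __name__ == "__main__":
    coding = "ATGGCC"    # made-up coding strand
    template = "GGCCAT"  # its reverse complement
    print(transcribe_from_coding(coding))
    print(transcribe_from_template(template))
```

Both calls describe the same molecule from the point of view of the two DNA strands, so they produce the same RNA string.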
Translation • Process of turning mRNA into proteins • Happens in the cytoplasm, on ribosomes • ribosomes are the machines that synthesize proteins from mRNA • Translation reads one codon at a time • translation begins at the start codon (AUG) • translation ends at a stop codon (UAA, UAG or UGA)
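The read-one-codon-at-a-time process can be sketched in Python as follows. This is a simplified illustration rather than a model of the ribosome: it uses the standard genetic code, starts at the first AUG it finds, assumes the input contains only A, C, G and U, and the names `CODON_TABLE` and `translate` are chosen for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional order where each codon
# position cycles through U, C, A, G. "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    # Made-up mRNA fragment: two leading bases, six codons, a stop codon.
    print(translate("GGAUGGUGCUGUCUCCUGCCUAAGG"))
```

If the mRNA ends before a stop codon is reached, this sketch simply returns what it has translated so far.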
Codons • Each triplet of bases is called a codon • How many codons are possible? • Each codon codes for a particular amino acid, except the stop codons, which signal the end of translation
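The question on this slide can be answered by counting: each of the three positions in a codon can hold one of four bases, so there are 4 × 4 × 4 = 64 possible codons. A short Python sketch that enumerates them (the function name `all_codons` is illustrative):

```python
from itertools import product


def all_codons(alphabet: str = "ACGU") -> list[str]:
    """Enumerate every possible triplet over the given alphabet."""
    return ["".join(triplet) for triplet in product(alphabet, repeat=3)]


if __name__ == "__main__":
    codons = all_codons()
    print(len(codons))  # 4 ** 3
    print(codons[:5])
```

Since 64 codons are available for 20 amino acids plus the stop signal, most amino acids are encoded by more than one codon.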
Codons and Reading Frames • [Figure: codons in a reading frame, with the amino acid labels alanine and threonine]
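A reading frame is simply a choice of where the first codon starts. As a hedged sketch in Python (the function name and the example sequence are made up for illustration), the three forward frames of a strand can be listed like this:

```python
def reading_frames(seq: str) -> list[list[str]]:
    """Split a sequence into codons for each of the three forward frames.

    Frame k starts at offset k (0, 1 or 2); leftover bases at the end that
    do not fill a whole codon are dropped.
    """
    seq = seq.upper()
    return [
        [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        for offset in range(3)
    ]


if __name__ == "__main__":
    # In frame 0 this reads GCU (alanine) then ACU (threonine);
    # shifting by one or two bases gives entirely different codons.
    for offset, codons in enumerate(reading_frames("GCUACUG")):
        print(offset, codons)
```

The same bases therefore encode different amino acids depending on the frame, which is why the position of the start codon matters.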
Proteins • Proteins are long chains composed of amino acids • There are 20 standard amino acids
Proteins are the workhorses of the cell • structural support • storage of amino acids • transport of other substances • coordination of an organism’s activities • response of cell to chemical stimuli • movement • protection against disease • selective acceleration of chemical reactions
Proteins are complex molecules • Primary amino acid sequence • Secondary structure • Tertiary structure • Quaternary structure
Some well-known proteins • Actin: maintenance of cell structure • Hemoglobin: carries oxygen • Insulin: metabolism of sugar
Hemoglobin protein HBA1
DNA sequence (491 bp):
>gi|224589807:226679-227520 Homo sapiens chromosome 16, GRCh37.p9 Primary Assembly
  1 cccacagactcagagagaacccaccatggtgctgtctcctgacgacaagaccaacgtcaa
 61 ggccgcctggggtaaggtcggcgcgcacgctggcgagtatggtgcggaggccctggagag
121 gatgttcctgtccttccccaccaccaagacctacttcccgcacttcgacctgagccacgg
181 ctctgcccaggttaagggccacggcaagaaggtggccgacgcgctgaccaacgccgtggc
241 gcacgtggacgacatgcccaacgcgctgtccgccctgagcgacctgcacgcgcacaagct
301 tcgggtggacccggtcaacttcaagctcctaagccactgcctgctggtgaccctggccgc
361 ccacctccccgccgagttcacccctgcggtgcacgcctccctggacaagttcctggcttc
421 tgtgagcaccgtgctgacctccaaataccgttaagctggagcctcggtggccatgcttct
481 tgcccctttgg
Amino acid sequence (142 aa):
>sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens GN=HBA1 PE=1 SV=2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
[Figure: protein 3D structure]
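Records like the two above follow the FASTA convention: a header line beginning with ">" followed by one or more lines of sequence. A minimal Python sketch of a reader for that layout is shown below. The function name `parse_fasta` is illustrative, the sketch assumes plain sequence lines (it does not strip the position numbers that appear in the DNA listing above), and real projects would more likely rely on an established library.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each header (without the leading '>') to its joined sequence."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


if __name__ == "__main__":
    # The protein record from the slide, wrapped over two lines.
    example = """>sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens GN=HBA1 PE=1 SV=2
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNAVAHVDDMPNALSALSDLH
AHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
"""
    for name, sequence in parse_fasta(example).items():
        print(name.split()[0], len(sequence), "residues")
```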
RNA Processing in Eukaryotes • eukaryotes are organisms that have enclosed nuclei in their cells • in many eukaryotes, newly transcribed RNAs consist of alternating exon/intron segments • exons are the parts retained in the mature mRNA, including the coding sequence • introns are spliced out before translation
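Splicing can be pictured as cutting the introns out of the transcript and joining the exons in order. The Python sketch below assumes the exon coordinates are already known and are given as hypothetical 0-based, end-exclusive `(start, end)` pairs; it does not attempt to find splice sites, and the example sequence is made up.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a transcript in order, dropping the introns."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


if __name__ == "__main__":
    exon1 = "AUGGCC"
    intron = "GUAAGUUUUCAG"  # introns typically begin with GU and end with AG
    exon2 = "ACUUAA"
    pre_mrna = exon1 + intron + exon2
    mature = splice(pre_mrna, [(0, 6), (18, 24)])
    print(mature)
```

Choosing a different set of exons from the same transcript yields a different mature mRNA, which is the idea behind alternative splicing mentioned in the goals.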
RNA Genes • not all genes encode proteins • for some genes the end product is RNA • ribosomal RNAs (rRNAs), which are major constituents of ribosomes • transfer RNAs (tRNAs), which carry amino acids to ribosomes • micro RNAs (miRNAs), which play an important regulatory role in various plants and animals • lincRNAs (long intergenic non-coding RNAs), which also play important regulatory roles
Central Dogma revisited • DNA → (transcription) → RNA → (translation) → proteins • Not every RNA is translated: non-coding RNA processing yields ncRNA, miRNA, rRNAs
Summary • Key concepts in molecular biology • Central Dogma • DNA, RNA, proteins • Chromosomes, Nucleus, Ribosomes • Important processes • Transcription • Translation • RNA splicing
Functional Genomics • Aims to characterize the genes and proteins of an organism in an unbiased way using high-throughput technologies • Focused on what lies “beyond the genetic sequence” • What does a piece of DNA do? • Gene, regulatory element, a mutation • Has generated large collections of “omics” datasets • Gene expression • Protein expression • Metabolite levels
Metabolites • Metabolism: • A set of chemical processes in cells • Needed for sustaining life • Metabolites: • Small molecules that are intermediates of metabolism • Examples: sugar, glycerol • Metabolic pathway: • A set of linked chemical reactions in a cell
The Tri-Carboxylic Acid cycle • [Figure: pathway diagram of the TCA cycle with metabolites and enzymes labeled; courtesy KEGG Pathways]
Context-specific expression of a cell • The DNA of a cell is essentially static • But the set of mRNAs can differ by cell type, environment and time point • A key process is gene regulation • it determines which genes are expressed when, for example in response to an environmental signal
Transcriptional gene regulation • Key control process that determines what genes are expressed when • Requires • RNA Polymerase • Transcription factors • Energy http://www.youtube.com/watch?v=WsofH466lqk
Transcriptional gene regulation • [Figure: the HSP12 gene and its promoter, with labels for transcription factor level (trans), P1, P2, transcription factor binding sites (cis), and mRNA levels]
Regulation of GAL genes • GAL genes are required for yeast to grow on galactose • Four are metabolic genes: GAL1, GAL10, GAL2 and GAL7 • Three are regulatory genes: GAL4, GAL80 and GAL3
Regulation of GAL genes • [Figure: a metabolic GAL gene shown in two conditions, without galactose and in galactose]
Transcriptome • The entire set of RNA products in a cell • A cell can make more or less of a particular RNA • Levels change • Its constituents are context-specific • Context is determined by the environment of a cell
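One common way to express "more or less of a particular RNA" between two contexts is a log2 fold change. The Python sketch below is illustrative only: the function name, the pseudocount of 1 and the expression values are all assumptions made for this example, not measurements from the course.

```python
import math


def log2_fold_change(baseline: float, condition: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of expression in a condition relative to a baseline.

    A pseudocount is added to both values so that a gene with zero
    measured expression does not cause a division by zero.
    """
    return math.log2((condition + pseudocount) / (baseline + pseudocount))


if __name__ == "__main__":
    # Hypothetical mRNA levels (arbitrary units): (no galactose, in galactose).
    levels = {
        "GAL1": (3.0, 1023.0),      # made-up values for a metabolic GAL gene
        "GENE_X": (511.0, 511.0),   # made-up gene that does not respond
    }
    for gene, (without_gal, with_gal) in levels.items():
        print(gene, log2_fold_change(without_gal, with_gal))
```

Positive values mean the RNA is more abundant in the condition than in the baseline, negative values mean it is less abundant, and zero means no change.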