Designs in DNA. Richard Deem, Paradoxes Class, March 16, 2014.
The problem with biology… is that you need to know things
Central Dogma of Biology
[Figure: cell diagram labeled Nucleus, Mitochondrion, Chloroplast, and ER; DNA undergoes Transcription to mRNA, and mRNA undergoes Translation to Protein]
DNA Bases
Purines: Adenine (A), Guanine (G). Pyrimidines: Thymine (T), Cytosine (C).
[Figure: chemical structures of the four bases]
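Structure of Deoxyadenosine
[Figure: chemical structure labeled Adenine (base), Glycosidic Bond, Sugar (Deoxyribose), and phosphate group, with the 5' and 3' positions marked; base plus sugar is the Nucleoside, and base, sugar, and phosphate together are the Nucleotide]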
DNA Structure
[Figure: chemical structure of two antiparallel strands (5' to 3' and 3' to 5') with sugar-phosphate backbones; Hydrogen Bonds join Adenine to Thymine and Guanine to Cytosine]
DNA Double Helix
[Figure: double helix with paired bases labeled along both strands]
DNA Structure
[Figure: levels of DNA packaging, labeled DNA, Nucleosome (4 histone protein pairs), Histone H1, and Chromosome]
Human Chromosomes
[Figure: electron micrograph and karyotype, with Centromere and Telomere labeled]
DNA Structure: Chromatin
[Figure: Nucleus showing Heterochromatin (condensed DNA) and Euchromatin (actively transcribed DNA)]
Bases Found in DNA vs. RNA
[Figure: two base structures that differ by a CH3 group, shown with the sugars Deoxyribose and Ribose]
Transfer RNA (tRNA)
[Figure: Transfer RNA carrying Methionine, with its Anti-codon paired to a Codon on the Messenger RNA (mRNA)]
Electron Micrograph of Translation Process
[Figure: micrograph with Ribosomes, mRNA, and Protein chains labeled]
DNA as a Language
• Four “letters” (bases A, T, G, C; in RNA, U replaces T)
• 64 three-letter “words” (codons)
• “Redundant”: many “words” have the identical “meaning”
• 20 unique “meanings” (amino acids)
• Unlimited “sentences” (proteins)
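The counts on this slide can be checked with a short sketch. It builds the standard genetic code from the four DNA letters and counts the three-letter words and their meanings. The variable names are illustrative, and "*" is used here as a stand-in for a stop signal.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"  # DNA letters; in mRNA, U takes the place of T
# Standard genetic code, one letter per codon in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every three-letter "word" (codon) to its "meaning" (amino acid or stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count how many codons share each meaning.
codons_per_meaning = Counter(CODON_TABLE.values())
amino_acids = sorted(aa for aa in codons_per_meaning if aa != "*")

print(len(CODON_TABLE))         # number of three-letter words
print(len(amino_acids))         # number of distinct amino acids
print(codons_per_meaning["*"])  # number of stop codons
print(codons_per_meaning["L"])  # number of codons that all mean leucine
```

The redundancy shows up in `codons_per_meaning`: most amino acids are spelled by more than one codon.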
Transcription/Translation
[Figure: DNA in the Nucleus undergoes Transcription to mRNA; at the ER, mRNA undergoes Translation to Protein]
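A minimal sketch of the two steps, assuming a made-up nine-base template strand (it is not taken from the slide's figure). `transcribe` and `translate` are hypothetical helper names, and translation stops at the first stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a template strand written 3'->5'."""
    return template_3_to_5.translate(str.maketrans("ACGT", "UGCA"))

def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

template = "TACCGGACT"        # hypothetical template strand, written 3'->5'
mrna = transcribe(template)   # complementary mRNA, 5'->3'
protein = translate(mrna)     # one-letter amino acid codes
print(mrna, protein)
```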
DNA Design: Alternative Splicing of RNA
Multiple proteins from one gene
Genes to Proteins: DNA → mRNA
[Figure: DNA with Exons and introns (between exons); the Transcribed region yields Pre-mRNA, then mRNA (5' to 3') with a UTR at each end; the Translated region yields Protein]
Alternative Splicing of RNA
[Figure: Pre-mRNA containing Exon1 to Exon5 separated by introns Int1 to Int4; two mRNAs with different exon combinations give Protein isoform A and Protein isoform B]
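The idea can be sketched in a few lines. The exon sequences below are invented for illustration, and skipping exon 3 is an assumption about which exon differs between the isoforms; `splice` is a hypothetical helper name.

```python
# Hypothetical exon sequences, invented for illustration only.
EXONS = {1: "AUGGCU", 2: "GAAUUC", 3: "CCGGAU", 4: "AAAGGG", 5: "UUCUAA"}

def splice(exon_order):
    """Join the chosen exons in order; introns are simply left out."""
    return "".join(EXONS[number] for number in exon_order)

# Same gene, two mRNAs: one keeps every exon, the other skips exon 3.
mrna_all_exons = splice([1, 2, 3, 4, 5])
mrna_skipping_exon3 = splice([1, 2, 4, 5])

print(mrna_all_exons)
print(mrna_skipping_exon3)
```

Because the two mRNAs differ in sequence, translating them gives two different proteins from the same gene.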
DNA Design: Duons
Overlapping regulatory and protein codes
Transcription Factors
[Figure: DNA with Promoter region and Exons; transcription factors NFAT, NFkB, AP-1, AP-2, Y1, and Y2 bound at positions from -6800 to -100]
Genome-Wide Transcription Factor Binding Sites
• Used the enzyme DNase I
• Digested DNA from 81 different cell lines
• Sequenced and mapped the location of all TF binding sites
[Figure: DNase I cleavage per nucleotide across the PLBD2 gene, with footprints labeled NRSF, USF, SP1, and SP1]
Duon Sequences
• 86% of genes contain at least one duon sequence
• Duons comprise 14% of all exonic coding sequence
• Over 12 million base pairs
Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.
Example of Duon in DNA
CELSR2 Gene: Chr1:109806358-109806387
Protein Sequence: Leu Gln Gln Ile Thr Arg Gly Arg Ser Thr
DNA Sequence: CCTGCAGGCCATCACCAGGGGGCGCAGCAC
CTCF Binding Sequence: ACCACCAGGGGGCGC
Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.
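The overlap can be located with a rough sketch. It slides the CTCF sequence along the 30 bases and keeps the position with the fewest mismatches, then lists the codons that share bases with that window. Starting the reading frame at the second base is an assumption, chosen because it reproduces most of the slide's protein sequence. The bases as printed give Ala where the slide lists a second Gln, so treat individual residues as approximate.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

dna = "CCTGCAGGCCATCACCAGGGGGCGCAGCAC"  # 30 bases from the slide
ctcf_motif = "ACCACCAGGGGGCGC"          # CTCF binding sequence from the slide

def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Slide the motif along the DNA and keep the offset with the fewest mismatches.
width = len(ctcf_motif)
best_offset = min(
    range(len(dna) - width + 1),
    key=lambda i: mismatches(dna[i:i + width], ctcf_motif),
)
best_mismatches = mismatches(dna[best_offset:best_offset + width], ctcf_motif)

# Translate the reading frame that starts at the second base (assumed frame).
codon_starts = range(1, len(dna) - 2, 3)
protein = "".join(CODON_TABLE[dna[i:i + 3]] for i in codon_starts)

# Codons that share at least one base with the binding-site window.
window_end = best_offset + width
shared_codons = [
    dna[i:i + 3] for i in codon_starts if i < window_end and i + 3 > best_offset
]

print(best_offset, best_mismatches)
print(protein)
print(shared_codons)
```

The codons in `shared_codons` do double duty: each one both specifies an amino acid and forms part of the binding site.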
Duons are Functional
Andrew B. Stergachis et al. 2013. Exonic Transcription Factor Binding Directs Codon Choice and Affects Protein Evolution. Science 342, 1367.
DNA Design: Dual Coding Genes
Multiple proteins from alternative reading frames
Reading Frames
5' CCTGCAGGCCATCACCAGGGGGCGCAGCAC 3'
3' GGACGTCCGGTAGTGGTCCCCCGCGTCGTG 5'
[Figure: the amino acids encoded by each of the six reading frames, three per strand, including one Stop codon]
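The six reading frames of this sequence can be recomputed with a short sketch. The function names and frame labels ("+1" to "+3" for the top strand, "-1" to "-3" for the bottom strand) are illustrative choices, and "*" stands for a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq: str, offset: int) -> str:
    """Translate complete codons starting at the given offset (0, 1, or 2)."""
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )

def six_frames(seq: str) -> dict:
    """Translate three frames on each strand."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq, offset)
    return frames

frames = six_frames("CCTGCAGGCCATCACCAGGGGGCGCAGCAC")
for name, protein in frames.items():
    print(name, protein)
```

Shifting the start by one base changes every codon, which is why the same 30 bases can spell six unrelated amino acid strings.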
Dual-Coding Genes
“Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions.”
Wen-Yu Chung, et al. 2007. A First Look at ARFome: Dual-Coding Genes in Mammalian Genomes. PLoS Computational Biology 3(5): e91.
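A back-of-the-envelope sketch gives a feel for why a long second reading frame is unlikely to stay open by chance. It assumes every codon is equally likely and independent, which is a simplification and not the model used by Chung et al.; `prob_no_stop` is a hypothetical helper name.

```python
def prob_no_stop(n_codons: int, stop_fraction: float = 3 / 64) -> float:
    """Chance that n random codons in a row contain no stop codon.

    Assumes each of the 64 codons is equally likely and independent,
    so 3 of every 64 codons are stops. This is a toy model only.
    """
    return (1 - stop_fraction) ** n_codons

for length in (10, 50, 100, 200):
    print(length, prob_no_stop(length))
```

Under this toy model the probability shrinks geometrically with length, so an uninterrupted alternative frame of a hundred or more codons is rare.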
Finding Dual Coding Genes
• Evolutionary assumptions underestimate the true number of dual coding genes
• 9% of human and 7% of mouse genes overlap
• Less than 30% shared between mouse and human
• 90% of overlapping genes are on opposite strands
• 1259 human alternative proteins detected by mass spectrometry
Chaitanya R Sanna, et al. Overlapping genes in the human and mouse genomes. BMC Genomics 2008, 9:169.
Benoît Vanderperre, et al. 2013. Direct Detection of Alternative Open Reading Frames Translation… PLoS ONE 8(8): e70698.
“Sentences” from Two Directions
• A man, a plan, a canal: Panama
• Live not on evil
• Was it a car or a cat I saw?
Dual Coding Gene: EIF6 (ITGB4BP)
[Figure: exon diagrams for two reading frames, Frame 1 (108 aa) and Frame 2 (226 aa), above a stretch of DNA sequence with its amino acid translation in both frames]
Han Liang and Laura F. Landweber. 2006. A genome-wide study of dual coding regions in human alternatively spliced genes. Genome Research 16:190–196.
Dual Coding Gene: Ncaph2
[Figure: Long, Intermediate, and Short Alternative Transcripts spanning Exon 1 and Exon 2, with the amino acids encoded by each and the Alternative Reading Frame used in Exon 2]
Angelo Theodoratos, et al. Splice variants of the condensin II gene Ncaph2 include alternative reading frame... FEBS Journal 279 (2012) 1422–1432.
Protein Products of Ncaph2
[Figure: gel with lanes for Bone Marrow, Thymus, Muscle, Spleen, Kidney, Testis, Lung, Heart, Brain, and Liver plus a Ladder (200 bp and 50 bp marked); bands labeled Long 232 bp, Int 215 bp, and Short 140 bp]
Conclusions
• At least three independent examples of design in DNA:
• Alternative splicing, producing multiple proteins from one gene
• Duons: overlapping sequences of coding and transcription factor binding
• Dual coding genes