DNA STRUCTURE: DOUBLE HELIX
- Antiparallel DNA strands (one strand drawn 5’ to 3’, the other 3’ to 5’)
- Hydrogen bonds between bases
Fig. 1.8
HOW TO DEFINE A GENE? (there are many descriptions...)
- sequence of DNA essential for a specific function
- codes for protein or structural RNA
[Diagram: double-stranded DNA (5’ to 3’ over 3’ to 5’) with ATG and TAA marking the “structural” gene; gene + flanking regulatory sequences; after transcription & RNA processing, the RNA (5’ to 3’) carries AUG and UAA]
UTRs: untranslated regions that flank the coding sequence in an mRNA (so within the transcribed region)
Where is the translation initiation site? Where is the transcription initiation site? Promoter?
Eukaryotic (but not prokaryotic) genes usually contain introns
[Diagram: double-stranded DNA with ATG in “Exon 1”, then Intron 1, Exon 2, Intron 2, and TAA in “Exon 3”; the mRNA (5’ to 3’) consists of Exon 1, Exon 2 and Exon 3, spanning 5’ UTR, coding region and 3’ UTR]
Intron: non-coding sequence removed from pre-RNA (by splicing)
Exon: sequence that remains in mature RNA (mostly coding)
Nomenclature “problem”:
• Textbooks (& papers) often show only coding sequences as exons, but the first exon includes the 5’ UTR and the last exon includes the 3’ UTR
• Dilemma because the positions of RNA ends are often not known or differ between tissues
• Introns can also occur within UTRs
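The intron and exon definitions above can be made concrete with a toy sketch. The sequence and coordinates below are invented for illustration (and unrealistically short), and `splice` is a hypothetical helper, not a model of the splicing machinery.

```python
# Toy pre-mRNA: exons in upper case, introns in lower case (invented sequence).
pre_mrna = "GGAUGGCU" + "guaaguuuuag" + "CCAGAA" + "guaugcccag" + "UAAGC"

# Exon coordinates as 0-based, end-exclusive (start, end) pairs (hypothetical values).
exons = [(0, 8), (19, 25), (35, 40)]

def splice(rna, exon_coords):
    """Join the exon segments in order, discarding everything between them."""
    return "".join(rna[start:end] for start, end in exon_coords)

mature = splice(pre_mrna, exons)
print(mature)
```

In this toy case the mature RNA reads GG AUG GCU CCA GAA UAA GC: the first exon carries a 2-nt 5’ UTR ahead of the AUG and the last exon carries a 2-nt 3’ UTR after the UAA, which is the nomenclature point made on the slide.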
Example of human pax6 gene
Lines: introns
Bars: exons (tall bars: coding exons; short bars: non-coding exons)
What does the bent arrow signify? Where would the initiation and stop codons be?
Mercer Nat Rev Genet 10: 155, 2009
1. Human genes:
Intron length: typically ~200 nt to > 10 kb
Number per gene: several to dozens
Exon length: typically 100 - 200 nt
Extreme example: dystrophin gene (~2400 kb) with ~78 introns!
Tennyson, Klamut & Worton (1995) “The human dystrophin gene requires 16 hours to be transcribed and is cotranscriptionally spliced” Nat Genet. 9:184-90
Genes-within-genes! Other genes are sometimes located within long introns, in the same or opposite orientation (see Practice set #1, question 4)
2. Plant genes: intron density similar to animals, but shorter length: typically 100 - 300 nt
3. Yeast genes: < 5% have introns (vs. mammals, where > 95% of genes have introns)
- mostly in tRNA genes (intron length ~20-30 nt) and in ribosomal protein genes (intron length ~100-500 nt)
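As a back-of-envelope check on the dystrophin example, the two figures on the slide imply an average transcription rate. This sketch assumes the whole 16 hours is spent elongating across the full ~2400 kb, which is a simplification.

```python
gene_length_nt = 2_400_000          # ~2400 kb, from the slide
transcription_time_s = 16 * 3600    # 16 hours, from the Tennyson et al. title

rate_nt_per_s = gene_length_nt / transcription_time_s
rate_kb_per_min = rate_nt_per_s * 60 / 1000

print(f"{rate_nt_per_s:.0f} nt/s, {rate_kb_per_min:.1f} kb/min")
```

Under that assumption the average works out to roughly 42 nt per second, or about 2.5 kb per minute.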
Structure of NF2 (neurofibromatosis type II) gene in various animals
What features of this gene are different among these animals?
Golovnina et al. BMC Evol Biol 2005
Bacterial genes are often organized in operons with short intergenic spacers
- polycistronic mRNA, but each gene has its own start and stop codons
[Diagram labels: Gene A, Gene B, Gene C]
But neighbouring operons might be in opposite orientation in the genome
[Diagram labels: Gene 1, Gene 2]
5’…ATAGGACAT gatcgctctataggaggtgc ATGCAATGG…3’
3’…TATCCTGTA ctagcgagatatcctccacg TACGTTACC…5’
Aside: My examples will often show unrealistically short sequences
What are the N-terminal sequences of the proteins encoded by genes 1 and 2?
See also Practice question #2
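One way to work through the question above is to read each strand 5’ to 3’ and translate from its ATG. The sketch below uses the sequence from the slide; the helper names are hypothetical, and it assumes that the upper-case blocks are the starts of the two coding regions and that the lower-case block is the intergenic spacer. Which ORF is labelled gene 1 or gene 2 is not recoverable from the transcript, so the code refers to them by position only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons from position 0, one-letter amino acid codes."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# Top strand 5'->3' and bottom strand 3'->5', exactly as drawn on the slide.
top = "ATAGGACAT" + "gatcgctctataggaggtgc" + "ATGCAATGG"
bottom = "TATCCTGTA" + "ctagcgagatatcctccacg" + "TACGTTACC"

# Sanity check: the two strands are complementary.
assert reverse_complement(top) == bottom[::-1].upper()

right_orf = top[-9:]                      # read left to right on the top strand
left_orf = reverse_complement(top[:9])    # read right to left on the bottom strand

print(right_orf, translate(right_orf))
print(left_orf, translate(left_orf))
```

The right-hand ORF reads ATG CAA TGG (Met-Gln-Trp) on the top strand; the left-hand ORF reads ATG TCC TAT (Met-Ser-Tyr) on the bottom strand, so the two genes diverge from the shared spacer.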
Where would promoter(s) for genes 1 and 2 be located?
[Diagram labels: Gene 2, Gene 1]
The presence of genes located close together but encoded on opposite strands is sometimes also seen in eukaryotic genomes
Bidirectional promoter?
Adachi & Lieber Cell 109: 807, 2002
RNA structure: features of RNA vs. DNA
RNA synthesis
[Diagram: double-stranded DNA with the “coding strand” (5’ to 3’) above the template strand; RNA shown 5’ to 3’]
- mRNA has the same sequence as the coding strand (except U instead of T)
- RNA is synthesized in the 5’ to 3’ direction, with the antiparallel DNA strand as template
Alberts Fig. 6.4; Fig. 1.11
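The relationship between the two DNA strands and the mRNA can be checked with a short sketch. It uses the 15-nt example sequence that appears on the protein-coding genes slide later in this deck; `transcribe` is a hypothetical helper that only applies base-pairing rules.

```python
def transcribe(template_3to5):
    """Build the mRNA 5'->3' by pairing each template base, read 3'->5'."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_3to5)

coding = "ATGGGATTGCCCGCC"     # coding strand, written 5'->3'
template = "TACCCTAACGGGCGG"   # template strand, written 3'->5' (as drawn beneath it)

mrna = transcribe(template)
print(mrna)

# Same sequence as the coding strand, except U instead of T.
assert mrna == coding.replace("T", "U")
```

Because the template is written 3’ to 5’, pairing it base by base from left to right yields the mRNA already in 5’ to 3’ orientation.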
RNA content of a cell (Fig. 1.12)
Small regulatory RNAs: small non-coding (nc) regulatory RNAs (sRNAs) are also present in bacteria
snRNAs (small nuclear) - role in splicing
snoRNAs (small nucleolar) - role in methylation of rRNAs
miRNAs (microRNAs) & siRNAs (short interfering RNAs) - role in regulation of expression of individual genes
RNA processing in eukaryotes (Fig. 1.13)
- the presence of long introns (& short exons) can make finding genes in eukaryotic DNA sequences difficult
- there may be alternative splicing pathways, so more than one protein can be generated from one gene (discussed later, Chapter 6)
Link between transcriptome & proteome
Mediated by tRNAs (codon-anticodon)
Genetic code (“standard code”)
- can deduce the amino acid sequence of a protein from the nt coding sequence, using the genetic code table
See Practice question #1
Fig. 1.20, Fig. 1.2
PROTEIN-CODING GENES: divided into triplets (codons)
DNA   5’ … ATG GGA TTG CCC GCC … 3’   “coding strand”
      3’ … TAC CCT AAC GGG CGG … 5’   “template strand”
mRNA  5’ … AUG GGA UUG CCC GCC … 3’
• In research papers, DNA is usually shown as single-stranded, with the coding strand in 5’ to 3’ orientation (left to right), so the genetic code table can be used directly
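Reading the slide's mRNA codon by codon against the standard genetic code can be sketched as follows. The table is built from the conventional UCAG ordering; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code for RNA codons, one-letter amino acid codes ('*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

mrna = "AUGGGAUUGCCCGCC"   # 5'->3', from the slide

# Split into triplets starting at the AUG, then look each one up.
codons = [mrna[i:i + 3] for i in range(0, len(mrna), 3)]
protein = "".join(CODON_TABLE[codon] for codon in codons)

print(codons)
print(protein)
```

The five codons AUG GGA UUG CCC GCC give Met-Gly-Leu-Pro-Ala (MGLPA in one-letter code).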
[Genetic code table with the initiation codon and the translation termination codons marked; Alberts Fig. 6-50]
Amino acid one-letter abbreviations are often used instead of three-letter ones
Remember that although AUG is the standard initiation codon, there can also be AUG triplets within an ORF, specifying internal Met residues in the protein
And when analyzing DNA data obtained in the lab, the initiation codon might be located outside the sequenced region
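Both caveats above show up in even a naive ORF scan. The sketch below uses an invented toy sequence and hypothetical helper names; it takes the first ATG as the start, which is exactly the assumption that fails when the true initiation codon lies outside the sequenced region.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orf(dna):
    """Return (start, end) from the first ATG to the first in-frame stop, or None."""
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None

def internal_atg_positions(dna, start, end):
    """In-frame ATG codons between the start codon and the stop codon."""
    return [i for i in range(start + 3, end - 3, 3) if dna[i:i + 3] == "ATG"]

# Toy coding strand (invented): CC ATG GCT ATG AAA TGC TAA GG
full = "CCATGGCTATGAAATGCTAAGG"
orf = find_orf(full)
print(orf, internal_atg_positions(full, *orf))

# Same region, but the sequenced fragment begins after the true start codon.
fragment = full[5:]
print(find_orf(fragment))
```

On the full sequence the scan finds the ORF at positions 2 to 20 with one internal in-frame ATG (an internal Met). On the truncated fragment it still reports an ORF, but one that begins at that internal ATG, so the predicted protein would be missing its true N-terminus.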
Examples of deviations from the standard genetic code in mitochondria and microbes (Table 1.3)
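A small sketch shows how much a variant code can change the reading of the same mRNA. Table 1.3 is not reproduced in this transcript, so the reassignments below are an assumption drawn from the widely documented vertebrate mitochondrial code (UGA = Trp, AUA = Met, AGA and AGG = stop) rather than from the slide itself; the mRNA is invented.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumed reassignments for the vertebrate mitochondrial code (not taken from Table 1.3).
VERTEBRATE_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}

def translate(mrna, table):
    """Translate complete codons from position 0 using the given codon table."""
    return "".join(table[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3))

mrna = "AUGAUAUGAAGA"   # invented: AUG AUA UGA AGA

print(translate(mrna, STANDARD))
print(translate(mrna, VERTEBRATE_MITO))
```

With the standard code this toy message stops at the third codon (M I stop R), whereas the assumed mitochondrial code reads through it as Trp and stops at the fourth (M M W stop).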
PROTEIN SEQUENCE & STRUCTURE (Fig. 13.24, Fig. 1.17)
Different proteins can be generated from a single precursor polypeptide through post-translational events, so the proteome (set of proteins) can be larger than predicted from the number of genes in the genome
Cis-acting element: DNA (or RNA) sequence near a gene that is important for its expression
Latin word “cis” means “on the same side as”
Trans-acting factor: protein (or RNA) that binds to a cis-element to control gene expression
[Diagram: double-stranded DNA (5’ to 3’ over 3’ to 5’) with ATG and TAA marking a gene]
Cis-elements can actually be quite far away from the genes they control: in intergenic spacers (ENCODE project) and within introns