Genome, transcriptome and proteome in evolution
Xuhua Xia xxia@uottawa.ca http://dambe.bio.uottawa.ca
Transcription and Translation
[Figure: Gene 1, Gene 2, Gene 3 → polycistronic mRNA (RNA polymerase) → protein (ribosome); tRNA labels GCC~tRNAGly and UCC~tRNAGly]
Initiation: Met-Gly-...
Elongation: M_n + M → M_{n+1}
Ribonucleotide concentration
• Measured in exponentially proliferating chick embryo fibroblasts, 2 hrs, in units of 10^-12 mol per 10^6 cells.
• The difference among ribonucleotides is expected to be more extreme in mitochondria.
• NNA would seem to be a more efficient codon than NNC, because ATP is more abundant than CTP in these measurements.
XIA, X., 1996. Genetics 144: 1309-1320.
EVOLUTIONARY INFORMATION FROM DNA SEQUENCES
GENE - a sequence of DNA (or RNA) that is essential for a specific function
1. Protein-coding genes
2. RNA-specifying genes
3. Functional DNA elements
   • regulatory
   • structural
Do not use the term “untranscribed genes” (text, p. 9) for #3.
(U.S. Dept of Energy Human Genome Program, http://www.ornl.gov/hgmis)
More on genes
“SILENT” GENE - untranscribed, but potentially functional at the DNA level
PSEUDOGENE - non-functional DNA with a high degree of similarity to a functional gene
• How can pseudogenes arise during evolution?
Orthologous genes - descendants of an ancestral gene that was present in the last common ancestor of two or more species
Paralogous genes - arose by gene duplication within a lineage
“Typical” Eukaryotic Protein-coding Gene (Fig. 1.4)
• 5’ UTR? 3’ UTR?
• Where is the promoter?
• What regions will be present in the mRNA?
• Is there an error in this figure?
“Typical” Bacterial Gene Organization (Fig. 1.6)
Operon = cluster of co-transcribed genes
• How many promoters in this region?
• How many proteins encoded?
• Evolutionary advantages of operon organization?
PROTEIN-CODING GENES
DNA   5’ …. ATG GGA TTG CCC GCC …. 3’   “coding strand”
      3’ …. TAC CCT AAC GGG CGG …. 5’   “template strand”
mRNA  5’ …. AUG GGA UUG CCC GCC …. 3’
• DNA is usually shown as single-stranded, with the coding strand in 5’ to 3’ orientation, so the genetic code table can be used directly
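The strand relationships on this slide can be checked with a few lines of Python. This is only a minimal sketch: the function names `transcribe` and `translate` are illustrative (not from the slides), and the codon table is the standard genetic code written in the compact UCAG ordering.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order (UUU, UUC, UUA, ...).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Pairing rule used by RNA polymerase when reading the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def translate(mrna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

coding = "ATGGGATTGCCCGCC"    # coding strand from the slide, 5'->3'
template = "TACCCTAACGGGCGG"  # template strand from the slide, 3'->5'

mrna = transcribe(template)
protein = translate(mrna)
print(mrna, protein)
```

The mRNA obtained from the template strand equals the coding strand with T replaced by U, which is why the coding strand can be read directly against the code table.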
Amino acids Fig. 1.9
Which amino acid to have in a protein?
• Will it do its job?
• Amino acid properties and protein function
• Mutability (is it likely to mutate into an amino acid with quite different physicochemical properties?)
• Is it abundant in food, or cheap to synthesize if not present in large quantities in food?
• Does it have many tRNAs to carry it?
Why study amino acid properties?
• Protein properties often depend on the properties of their amino acids: effect of mutation
• Diagnosis, e.g., protein electrophoresis
Normal polypeptide (Hb-A): Val-His-Leu-Thr-Pro-Glu-Glu…… (codon 6: GAA)
Sickle-cell polypeptide (Hb-S): Val-His-Leu-Thr-Pro-Val-Glu…… (codon 6: GUA)
Energetic Cost
Hiroshi Akashi and Takashi Gojobori, 2002. PNAS 99: 3695–3700
Standard Genetic Code
• Codon families have 1 – 6 members
• Synonymous and nonsynonymous substitutions
• 0-fold, 2-fold, 3-fold, 4-fold degenerate sites
• 0-fold degenerate = non-degenerate
5’ …. AUG GGA UUG CCC CAC …. 3’
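One way to make the degeneracy classes concrete is to count, for each codon site, how many of the four nucleotides leave the encoded amino acid unchanged. The sketch below assumes that definition (a site where only the original nucleotide works is 0-fold); the function names are illustrative, and the example sequence is the one on the slide.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def site_degeneracy(codon, pos):
    """Return 0, 2, 3 or 4 for position pos (0, 1 or 2) of a codon."""
    target = CODON_TABLE[codon]
    same = sum(
        CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == target
        for base in BASES
    )
    # Only the original nucleotide keeps the amino acid: non-degenerate.
    return 0 if same == 1 else same

def degeneracy_profile(mrna):
    """Degeneracy of every site in an in-frame mRNA, codon by codon."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]
    return [site_degeneracy(codon, pos) for codon in codons for pos in range(3)]

# Number of codons per amino acid (stop codons excluded).
family_sizes = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

profile = degeneracy_profile("AUGGGAUUGCCCCAC")
print(profile)
print(min(family_sizes.values()), max(family_sizes.values()))
```

In the example, the third sites of GGA and CCC are 4-fold degenerate, the first and third sites of UUG and the third site of CAC are 2-fold degenerate, and every site of AUG is non-degenerate.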
Genetic code is not “universal”
• Some mitochondria, a few bacteria and a few protists use a non-standard code (Table 1.4)
Vertebrate mitochondrial code:
• UGA = Trp (instead of stop codon)
• AUA, AUG = Met (AUA is Ile in the standard code)
• AGA, AGG = stop codons (instead of Arg)
Possible implications of different codes in nature?
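The effect of these reassignments can be illustrated by translating the same mRNA under both codes. This is a sketch only: the vertebrate mitochondrial table is built by overriding the four codons listed on the slide, and the test sequence `AUGAUAUGAAGA` is a made-up example, not real mitochondrial sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Vertebrate mitochondrial code: only the differences listed on the slide.
VERT_MITO = dict(STANDARD, UGA="W", AUA="M", AGA="*", AGG="*")

def translate(mrna, table):
    """Translate codon by codon with the given table, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

example = "AUGAUAUGAAGA"  # hypothetical sequence chosen to hit the reassigned codons
standard_protein = translate(example, STANDARD)
mito_protein = translate(example, VERT_MITO)
print(standard_protein, mito_protein)
```

Under the standard code translation stops at UGA after two residues, while under the vertebrate mitochondrial code UGA is read as Trp and translation stops at AGA instead.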
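Amino acid dissimilarities (Table 4.7)
• Grantham’s distance: F(V, P, C), a function of volume, polarity and side-chain composition
• Miyata’s distance: F(V, P), a function of volume and polarity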
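For reference, the two distances are usually written in the following general forms. The symbols here are notation chosen for this note rather than taken from the slide: c, p and v are the composition, polarity and volume of amino acids i and j; α, β and γ are weighting constants in Grantham's formula; σ_p and σ_v are the standard deviations of the polarity and volume differences in Miyata's formula.

```latex
% Grantham's distance: weighted Euclidean distance over three properties
D_{ij} = \left[ \alpha (c_i - c_j)^2 + \beta (p_i - p_j)^2 + \gamma (v_i - v_j)^2 \right]^{1/2}

% Miyata's distance: standardized Euclidean distance over two properties
d_{ij} = \sqrt{ \left( \frac{p_i - p_j}{\sigma_p} \right)^2 + \left( \frac{v_i - v_j}{\sigma_v} \right)^2 }
```

Both are zero for identical amino acids and grow as the pair becomes more dissimilar, so a replacement with a large distance is expected to be more disruptive to the protein.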
Amino acid substitution matrices
           10        20        30        40        50        60
   ----|----|----|----|----|----|----|----|----|----|----|----|--
S1 RWFFSTNHKDIGTLYLVFGAWAGMVGTALSLLIRAELSQPGALLGDDQIYNVIVTAHAFVMI
S2 RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMI
BLOSUM = BLOcks SUbstitution Matrix, a substitution matrix used for sequence alignment of proteins (to score alignments between evolutionarily divergent protein sequences).
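Before any matrix is applied, it helps to see which sites of S1 and S2 actually differ, since those pairs are what a substitution matrix such as BLOSUM would score. The sketch below only lists the differing sites and the proportion of differences; it does not compute BLOSUM scores, and the helper name `differing_sites` is illustrative.

```python
S1 = "RWFFSTNHKDIGTLYLVFGAWAGMVGTALSLLIRAELSQPGALLGDDQIYNVIVTAHAFVMI"
S2 = "RWLFSTNHKDIGTLYLLFGAWAGVLGTALSLLIRAELGQPGNLLGNDHIYNVIVTAHAFVMI"

def differing_sites(a, b):
    """Return (position, residue in a, residue in b) for each mismatch, 1-based."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]

differences = differing_sites(S1, S2)
p_distance = len(differences) / len(S1)  # proportion of sites that differ

for position, res1, res2 in differences:
    print(f"site {position}: {res1} -> {res2}")
print(f"p-distance = {p_distance:.3f}")
```

Each listed pair (for example F/L or D/N) would be looked up in the chosen matrix, with conservative replacements scoring higher than radical ones.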
Different types of codon substitution Table 1.5