Gene Expression: Transcription Drs. Sutarno, MSc., PhD.
Overview of Gene Expression • An organism may contain many types of somatic cells, each with distinct shape and function. However, they all have the same genome. • The genes in a genome do not have any effect on cellular functions until they are "expressed". • Different types of cells express different sets of genes, thereby exhibiting various shapes and functions.
Essential steps involved in the expression of protein genes. The Central Dogma of Molecular Biology:
Overview of Gene Expression "Gene expression" means the production of a protein or a functional RNA from its gene. Several steps are required: • Transcription: A DNA strand is used as the template to synthesize an RNA strand, which is called the primary transcript. • RNA processing: This step involves modifications of the primary transcript to generate a mature mRNA (for protein genes) or a functional tRNA or rRNA.
For RNA genes (tRNA and rRNA), the expression is complete after a functional tRNA or rRNA is generated. • However, protein genes require additional steps: • Nuclear transport: mRNA has to be transported from the nucleus to the cytoplasm for protein synthesis. • Protein synthesis: In the cytoplasm, mRNA binds to ribosomes, which can synthesize a polypeptide based on the sequence of mRNA.
Transcription • Transcription: The process of copying DNA to produce an RNA transcript. • This is the first step in the expression of any gene. • The resulting RNA, if it codes for a protein, will be spliced, polyadenylated, and transported to the cytoplasm, where translation produces the desired protein molecule.
Overview of Transcription • Transcription is a process in which one DNA strand is used as the template to synthesize a complementary RNA. The following is an example: • Note that uracil (U) of RNA is paired with adenine (A) of DNA. • The DNA strand which serves as the template may be called the "template strand", "minus strand", or "antisense strand". • The other DNA strand may be termed the "non-template strand", "coding strand", "plus strand", or "sense strand".
Since both DNA coding strand and RNA strand are complementary to the template strand, they have the same sequences except that T in the DNA coding strand is replaced by U in the RNA strand.
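The base-pairing relationship described above can be sketched in a few lines of Python. This is only an illustration: the template sequence and the function names `transcribe` and `coding_strand` are made up for this example, and the template is assumed to be written 3' to 5' so that the RNA reads 5' to 3'.

```python
# Base-pairing rules: DNA template base -> RNA base (A pairs with U in RNA).
RNA_PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules: DNA template base -> DNA coding-strand base.
DNA_PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(RNA_PAIRING[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Return the DNA coding strand (5'->3') that pairs with the template."""
    return "".join(DNA_PAIRING[base] for base in template_3to5.upper())


template = "TACGGATTC"            # hypothetical template strand, 3'->5'
rna = transcribe(template)        # RNA transcript, 5'->3'
coding = coding_strand(template)  # coding strand, 5'->3'

# The RNA equals the coding strand with every T replaced by U.
same_except_t_to_u = coding.replace("T", "U") == rna
```

Here `rna` is `AUGCCUAAG` and `coding` is `ATGCCTAAG`, so the two differ only in T versus U.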
Schematic illustration of transcription. (a) DNA before transcription. (b) During transcription, the DNA must unwind so that one of its strands can be used as the template to synthesize a complementary RNA.
Essential steps of transcription • (i) Binding of polymerases to the initiation site. The DNA sequence which signals the initiation of transcription is called the promoter. • (ii) Unwinding of the DNA double helix. The enzyme which can unwind the double helix is called helicase. Prokaryotic polymerases have helicase activity, but eukaryotic polymerases do not. Unwinding of eukaryotic DNA is carried out by a specific transcription factor.
(iii) Synthesis of RNA based on the sequence of the DNA template strand. RNA polymerases use nucleoside triphosphates (NTPs) to construct an RNA strand. • (iv) Termination of synthesis. Prokaryotes and eukaryotes use different signals to terminate transcription. • Transcription in eukaryotes is much more complicated than in prokaryotes, partly because eukaryotic DNA is associated with histones, which could hinder the access of polymerases to the promoter.
The Relationship between Genes and Proteins • Most genes encode the information for the synthesis of a protein • The sequence of bases in DNA codes for the sequence of amino acids in proteins
An illustration of the flow of information from DNA to RNA to protein, which forms the backbone of molecular biology.
DNA codes for the production of RNA. • RNA codes for the production of protein. • Protein does not code for the production of protein, RNA or DNA.
The function of RNA polymerases • Both RNA and DNA polymerases can add nucleotides to an existing strand, extending its length. However, there is a major difference between the two classes of enzymes: RNA polymerases can initiate a new strand but DNA polymerases cannot. • The chemical reaction catalyzed by RNA polymerases
The function of RNA polymerases • The nucleotides used to extend a growing RNA chain are ribonucleoside triphosphates (NTPs). • Two phosphate groups are released as pyrophosphate (PPi) during the reaction. • Strand growth is always in the 5' to 3' direction. • The first nucleotide at the 5' end retains its triphosphate group.
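The phosphate bookkeeping implied by these points can be written out as a small calculation. This is a sketch that assumes a freshly made, unprocessed transcript; the function name `phosphate_accounting` and the dictionary keys are hypothetical.

```python
def phosphate_accounting(length: int) -> dict:
    """Tally NTPs used and PPi released for a new transcript of `length` nt.

    Every nucleotide enters as an NTP. The first one keeps its triphosphate
    at the 5' end, and each later addition releases one pyrophosphate (PPi).
    """
    if length < 1:
        raise ValueError("a transcript needs at least one nucleotide")
    return {
        "ntps_consumed": length,
        "ppi_released": length - 1,
        "five_prime_triphosphate": True,
    }


tally = phosphate_accounting(10)  # 10 NTPs in, 9 PPi out
```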
Gene's Regulatory Elements • Transcriptional regulation is mediated by the interaction between transcription factors and their DNA binding sites which are the cis-acting elements, whereas the sequences encoding transcription factors are trans-acting elements. The cis-acting elements may be divided into the following four types: • Promoters • Enhancers • Silencers • Response elements
Gene organization. The transcription region consists of exons and introns. The regulatory elements include promoter, response element, enhancer and silencer (not shown). Downstream refers to the direction of transcription, and upstream is opposite to the transcription direction. The number increases along the direction of transcription, with "+1" assigned for the initiation site. There is no "0" position. The base pair just upstream of +1 is numbered "-1", not "0".
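Because there is no position "0", converting between a plain offset from the initiation site and the numbered position needs a small adjustment. The sketch below shows one way to do it; the function names `label_from_offset` and `offset_from_position` are made up for this example.

```python
def label_from_offset(offset: int) -> str:
    """Convert an offset from the initiation site into a position label.

    Offset 0 is the initiation site itself ("+1"); offset -1 is the base
    pair just upstream ("-1"). No label "0" is ever produced.
    """
    if offset >= 0:
        return f"+{offset + 1}"
    return str(offset)


def offset_from_position(position: int) -> int:
    """Convert a numbered position (+1, -1, -35, ...) back into an offset."""
    if position == 0:
        raise ValueError("there is no position 0")
    return position - 1 if position > 0 else position


labels = [label_from_offset(i) for i in range(-2, 3)]
```

The list `labels` runs `-2, -1, +1, +2, +3`, skipping zero.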
1. Promoter • The promoter is the DNA region where transcription initiation takes place. In prokaryotes, the sequence of a promoter is recognized by the sigma (σ) factor of the RNA polymerase. In eukaryotes, it is recognized by specific transcription factors. • In E. coli • E. coli has several sigma factors, including: • Sigma 70: Regulates expression of most genes. • Sigma 32: Regulates expression of heat shock proteins. • Sigma 28: Regulates expression of the flagellar operon (involved in cell motion). • Sigma 38: Regulates gene expression in response to external stresses. • Sigma 54: Regulates gene expression for nitrogen metabolism.
In Eukaryotes • In eukaryotes, there is a significant difference between the transcription of protein genes and RNA genes. • The most common promoter element in eukaryotic protein genes is the TATA box, located at -35 to -20. Another promoter element is called the initiator (Inr). It has the consensus sequence PyPyAN(T/A)PyPy, where Py denotes pyrimidine (C or T), N = any, and (T/A) means T or A. The base A at the third position is located at +1 (the transcriptional start site). • The TATA box and initiator are the core promoter elements. There are other elements often located within 200 bp of the transcriptional start site, such as the CAAT box and GC box, which may be referred to as promoter-proximal elements. • The protein which interacts with the initiator and TATA box is known as the TATA-box binding protein (TBP), because the TATA box was discovered earlier than the initiator.
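The Inr consensus PyPyAN(T/A)PyPy translates directly into a regular expression. The following is a minimal sketch using Python's standard `re` module; the test sequence and the function name `find_inr` are invented for illustration, and a match only marks a candidate site, not a confirmed start site.

```python
import re

# PyPyAN(T/A)PyPy on the DNA coding strand: Py = C or T, N = any base.
# A lookahead is used so that overlapping candidates are all reported.
INR_PATTERN = re.compile(r"(?=([CT][CT]A[ACGT][TA][CT][CT]))")


def find_inr(dna: str) -> list:
    """Return (index_of_A, motif) for each candidate initiator element.

    index_of_A is the 0-based index of the third base of the motif, the A
    that the consensus places at +1.
    """
    hits = []
    for match in INR_PATTERN.finditer(dna.upper()):
        hits.append((match.start() + 2, match.group(1)))
    return hits


sequence = "GGTCAGTCCGG"      # hypothetical sequence containing one Inr
candidates = find_inr(sequence)
```

For this sequence `candidates` holds one hit, the motif `TCAGTCC` with its A at index 4.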
2. Enhancers • Enhancer: a nucleotide sequence to which transcription factor(s) bind, and which increases the transcription of a gene. • Enhancers are positive regulatory elements located either upstream or downstream of the transcriptional initiation site. However, most of them are located upstream. • In prokaryotes, enhancers are quite close to the promoter, but eukaryotic enhancers can be far from the promoter. • An enhancer region may contain one or more elements recognized by transcriptional activators. • Enhancers are "conditional" - in other words, they enhance transcription only under certain conditions, for example in the presence of a hormone.
3. Silencer • Silencers are elements very similar to enhancers, except that the proteins they bind inhibit transcription.
4. Response elements • Response elements are the recognition sites of certain transcription factors. Most of them are located within 1 kb from the transcriptional start site.
1. Initiation of Transcription • RNA polymerase must be able to recognize the beginning of a gene so that it knows where to start synthesizing an RNA. • It is directed to the start site of transcription by the affinity of one of its subunits for a particular DNA sequence that appears at the beginning of genes. This sequence is called a promoter. • It is a unidirectional sequence on one strand of the DNA that tells the RNA polymerase both where to start and in which direction (that is, on which strand) to continue synthesis.
2. Elongation of Transcription • The RNA polymerase then stretches open the double helix at that point in the DNA and begins synthesis of an RNA strand complementary to one of the strands of DNA. • The RNA polymerase recruits rNTPs (ribonucleoside triphosphates) in the same way that DNA polymerase recruits dNTPs. However, since only one strand is synthesized and synthesis proceeds only in the 5' to 3' direction, there is no need for Okazaki fragments. • It is important to note that synthesis proceeds in a unidirectional fashion.
3. Termination of Transcription • How does RNA polymerase know when to stop transcribing a gene? • The mechanism is best understood in prokaryotes. At the end of many genes, the sequence of the newly made RNA allows it to form a GC-rich hairpin loop followed by a run of U residues. The hairpin makes the RNA polymerase pause, and the weak A-U base pairs holding the RNA to the template let the transcript come free. The RNA polymerase then falls off the DNA and transcription ceases (intrinsic termination). • At other genes, a protein called Rho binds to the nascent RNA, catches up with the polymerase, and releases the transcript (Rho-dependent termination). Because there is no nucleus in prokaryotes, ribosomes can begin making protein from an mRNA immediately upon its synthesis, and Rho can act only on RNA that is not covered by ribosomes.
RNA Processing • RNA Processing: pre-mRNA --> mRNA • All the primary transcripts produced in the nucleus must undergo processing steps to produce functional RNA molecules for export to the cytosol.
RNA processing • RNA processing generates a mature mRNA (for protein genes) or a functional tRNA or rRNA from the primary transcript. Processing of pre-mRNA involves the following steps: • Capping - adding 7-methylguanylate (m7G) to the 5' end. • Polyadenylation - adding a poly-A tail to the 3' end. • Splicing - removing introns and joining exons. • In some cases, RNA editing is also involved.
5'-Capping • Cap site: The term has two usages. In eukaryotes, the cap site is the position in the gene at which transcription starts, and really should be called the "transcription initiation site". The first nucleotide is transcribed from this site to start the nascent RNA chain. That nucleotide becomes the 5' end of the chain, and thus the nucleotide to which the cap structure is attached (see "Cap"). In bacteria, the CAP site (note the capital letters) is a site on the DNA to which a protein factor (the catabolite activator protein) binds. • Capping occurs shortly after transcription begins. The chemical structure of the "cap" is shown in the following figure, where m7G is linked to the first nucleotide by a special 5'-5' triphosphate linkage. In most organisms, the first nucleotide is methylated at the 2'-hydroxyl of the ribose. In vertebrates, the second nucleotide is also methylated.
3'-Polyadenylation • A stretch of adenylate residues is added to the 3' end. The poly-A tail contains ~250 A residues in mammals, and ~100 in yeasts. • Polyadenylation at the 3' end. The major signal for the 3' cleavage is the sequence AAUAAA. Cleavage occurs 10-35 nucleotides downstream from this sequence. A second signal is located about 50 nucleotides downstream from the cleavage site. This signal is a GU-rich or U-rich region.
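As an illustration of the cleavage rule above, the sketch below scans an RNA for AAUAAA and reports the window 10 to 35 nucleotides downstream in which cleavage would be expected. It assumes distances are counted from the last base of the signal; that convention, the function name `cleavage_windows`, and the test sequence are choices made for this example, and the second (GU-rich or U-rich) signal is not checked.

```python
SIGNAL = "AAUAAA"


def cleavage_windows(rna: str, min_dist: int = 10, max_dist: int = 35) -> list:
    """Return (first_index, last_index) windows downstream of each AAUAAA.

    Indices are 0-based and inclusive. Distances are counted from the last
    base of the signal (an assumption of this sketch). Windows are clipped
    to the end of the RNA and dropped if nothing remains.
    """
    rna = rna.upper()
    windows = []
    start = rna.find(SIGNAL)
    while start != -1:
        signal_end = start + len(SIGNAL) - 1
        first = signal_end + min_dist
        last = min(signal_end + max_dist, len(rna) - 1)
        if first <= last:
            windows.append((first, last))
        start = rna.find(SIGNAL, start + 1)
    return windows


transcript = "GG" + SIGNAL + "C" * 40   # hypothetical pre-mRNA 3' region
windows = cleavage_windows(transcript)
```

For this transcript the signal ends at index 7, so the expected cleavage window spans indices 17 to 42.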
RNA splicing • RNA splicing is a process that removes introns and joins exons in a primary transcript. An intron usually contains a clear signal for splicing (e.g., the beta globin gene). In some cases (e.g., the sex lethal gene of fruit fly), a splicing signal may be masked by a regulatory protein, resulting in alternative splicing. In rare cases (e.g., HIV genes), a pre-mRNA may contain several ambiguous splicing signals, resulting in a few alternatively spliced mRNAs. • Splicing signal • Most introns start from the sequence GU and end with the sequence AG (in the 5' to 3' direction). They are referred to as the splice donor and splice acceptor site, respectively. However, the sequences at the two sites are not sufficient to signal the presence of an intron. Another important sequence is called the branch site located 20 - 50 bases upstream of the acceptor site. The consensus sequence of the branch site is "CU(A/G)A(C/U)", where A is conserved in all genes. • In over 60% of cases, the exon sequence is (A/C)AG at the donor site, and G at the acceptor site. • Figure 5-A-4. The consensus sequence for splicing. Pu = A or G; Py = C or U.
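The three splicing signals described above (GU at the donor site, AG at the acceptor site, and a branch site 20-50 bases upstream of the acceptor) can be checked with a short function. This is a rough sketch, not a splice-site predictor: the name `has_splicing_signals`, the example introns, and the choice to measure the distance from the conserved branch A to the A of the terminal AG are all assumptions of this example.

```python
import re

# Branch-site consensus CU(A/G)A(C/U); the conserved A is the 4th base.
# The lookahead allows overlapping candidates to be examined.
BRANCH_PATTERN = re.compile(r"(?=(CU[AG]A[CU]))")


def has_splicing_signals(intron: str, min_dist: int = 20, max_dist: int = 50) -> bool:
    """Check an intron (RNA, 5'->3') for GU...AG and a well-placed branch site."""
    intron = intron.upper()
    if not (intron.startswith("GU") and intron.endswith("AG")):
        return False
    acceptor_a = len(intron) - 2               # index of the A in the final AG
    for match in BRANCH_PATTERN.finditer(intron):
        branch_a = match.start() + 3           # index of the conserved A
        if min_dist <= acceptor_a - branch_a <= max_dist:
            return True
    return False


# Hypothetical introns: one with a branch site 27 bases from the acceptor,
# and one with no branch site at all.
good_intron = "GUAAGU" + "CUGAC" + "U" * 25 + "AG"
no_branch = "GUAAGU" + "U" * 30 + "AG"

good_result = has_splicing_signals(good_intron)
bad_result = has_splicing_signals(no_branch)
```

The first intron passes all three checks, while the second fails because it has no branch site.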
Splicing mechanism • The detailed splicing mechanism is quite complex. In short, it involves five snRNAs and their associated proteins. These ribonucleoproteins form a large (60S) complex, called the spliceosome. Then, after a two-step enzymatic reaction, the intron is removed and the two neighboring exons are joined together. The branch point A residue plays a critical role in the enzymatic reaction. • Schematic drawing of the formation of the spliceosome during RNA splicing. U1, U2, U4, U5 and U6 denote snRNAs and their associated proteins. The U3 snRNA is not involved in RNA splicing, but is involved in the processing of pre-rRNA.
Summary of the steps • Several protein transcription factors bind to promoter sites, usually on the 5' side of the gene to be transcribed. • RNA polymerase binds to the complex of transcription factors. Working together, they open the DNA double helix. • RNA polymerase proceeds down one strand, moving in the 3' -> 5' direction. As it does so, it assembles ribonucleotides (supplied as triphosphates, e.g., ATP) into a strand of RNA. • Each ribonucleotide is inserted into the growing RNA strand following the rules of base pairing. Thus for each C encountered on the DNA strand, a G is inserted in the RNA; for each G, a C; and for each T, an A. However, each A on the DNA guides the insertion of the pyrimidine uracil (U, from uridine triphosphate, UTP). There is no T in RNA. • Synthesis of the RNA proceeds in the 5' -> 3' direction. • As each nucleoside triphosphate is brought in to add to the 3' end of the growing strand, the two terminal phosphates are removed.
Types of RNA • Several types of RNA are synthesized: • messenger RNA (mRNA). This will later be translated into a polypeptide. • ribosomal RNA (rRNA). This will be used in the building of ribosomes: machinery for synthesizing proteins by translating mRNA. • transfer RNA (tRNA). RNA molecules that carry amino acids to the growing polypeptide. • small nuclear RNA (snRNA). DNA transcription of the genes for mRNA, rRNA, and tRNA produces large precursor molecules ("primary transcripts") that must be processed within the nucleus to produce the functional molecules for export to the cytosol. Some of these processing steps are mediated by snRNAs.
Types of RNA • Ribosomal RNA (rRNA) • There are 4 kinds. In eukaryotes, these are • 18S rRNA. One of these molecules, along with some 30 different protein molecules, is used to make the small subunit of the ribosome. • 28S, 5.8S, and 5S rRNA. One of each of these molecules, along with some 45 different proteins, is used to make the large subunit of the ribosome. • The name given to each type of rRNA reflects the rate at which the molecules sediment in the ultracentrifuge. The larger the number, the larger the molecule (but not proportionally).
Types of RNA • Transfer RNA (tRNA) • There are some 32 different kinds of tRNA in a typical eukaryotic cell. • Each is the product of a separate gene. • They are small (~4S), containing 73-93 nucleotides. • Many of the bases in the chain pair with each other, forming sections of double helix. • The unpaired regions form 3 loops. • Each kind of tRNA carries (at its 3' end) one of the 20 amino acids (thus most amino acids have more than one tRNA responsible for them). • At one loop, 3 unpaired bases form an anticodon. • Base pairing between the anticodon and the complementary codon on an mRNA molecule brings the correct amino acid into the growing polypeptide chain.
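Codon-anticodon pairing is antiparallel, so the anticodon written 5' to 3' is the reverse complement of the codon. The sketch below assumes strict Watson-Crick pairing only (it ignores wobble pairing and modified bases), and the function name `anticodon_for` is made up for this example.

```python
# Watson-Crick pairing between two RNA strands.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (5'->3') that pairs with an mRNA codon (5'->3').

    The strands pair antiparallel, so the codon is reversed before each
    base is complemented. Wobble pairing is not modelled here.
    """
    codon = codon.upper()
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


start_anticodon = anticodon_for("AUG")   # pairs with the start codon
```

For the codon AUG this gives the anticodon CAU, read 5' to 3'.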
Types of RNA • Messenger RNA (mRNA) • Messenger RNA comes in a wide range of sizes reflecting the size of the polypeptide it encodes. Most cells produce small amounts of thousands of different mRNA molecules, each to be translated into a peptide needed by the cell. • Many mRNAs are common to most cells, encoding "housekeeping" proteins needed by all cells (e.g. the enzymes of glycolysis). Other mRNAs are specific for only certain types of cells. These encode proteins needed for the function of that particular cell (e.g., the mRNA for hemoglobin in the precursors of red blood cells).
Types of RNA • Small Nuclear RNA (snRNA) • Approximately a dozen different genes for snRNAs, each present in multiple copies, have been identified. • The snRNAs have various roles in the processing of the other classes of RNA. For example, several snRNAs are part of the spliceosome that participates in converting pre-mRNA into mRNA by excising the introns and splicing the exons.
The RNA polymerases • The RNA polymerases are huge multi-subunit protein complexes. Three kinds are found in eukaryotes. • RNA polymerase I (Pol I). It transcribes the rRNA genes for the precursor of the 28S, 18S, and 5.8S molecules, and is the busiest of the RNA polymerases. • RNA polymerase II (Pol II). It transcribes the mRNA and snRNA genes. • RNA polymerase III (Pol III). It transcribes the 5S rRNA genes and all the tRNA genes.
However, the "Central Dogma" has had to be revised a bit. It turns out that you CAN go back from RNA to DNA, and that RNA can also make copies of itself. It is still not possible to go from Proteins back to RNA or DNA, and no known mechanism has yet been demonstrated for proteins making copies of themselves.
2. Synthesizing Proteins from the Instructions of DNA • Genetic information flows in a cell from: • DNA -> RNA -> Protein • In a prokaryotic cell, transcription and translation happen at the same time:
However, in a eukaryotic cell, transcription and translation occur in different places: