RNA synthesis
The synthesis of RNA is performed by enzymes called RNA polymerases. In higher organisms there are three main RNA polymerases, designated I, II, and III (or sometimes A, B, and C). Each is a complex protein consisting of many subunits. RNA polymerase I synthesizes three of the four types of rRNA (called 18S, 28S, and 5.8S RNA); it is therefore active in the nucleolus, where the genes encoding these rRNA molecules reside. RNA polymerase II synthesizes mRNA, though its initial products are not mature RNA but larger precursors, called heterogeneous nuclear RNA, which are completed later (see below, Processing of mRNA). The products of RNA polymerase III include tRNA and the fourth RNA component of the ribosome, called 5S RNA.
All three polymerases start RNA synthesis at specific sites on DNA and proceed along the molecule, linking selected nucleotides sequentially until they come to the end of the gene and terminate the growing chain of RNA. Energy for RNA synthesis comes from high-energy phosphate linkages contained in the nucleotide precursors of RNA. Each unit of the final RNA product is essentially a sugar, a base, and one phosphate, but the building material consists of a sugar, a base, and three phosphates. During synthesis two phosphates are cleaved and discarded for each nucleotide that is incorporated into RNA. The energy released from the phosphate bonds is used to link the nucleotides. The crucial feature of RNA synthesis is that the sequence of nucleotides joined into a growing RNA chain is specified by the sequence of nucleotides in the DNA template: each adenine in DNA specifies uracil in RNA, each cytosine specifies guanine, each guanine specifies cytosine, and each thymine in DNA specifies adenine. In this way the information encoded in each gene is transcribed into RNA for translation by the protein-synthesizing machinery of the cytoplasm.
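The pairing rules just described can be sketched as a small lookup table. This is only an illustration: the function name `transcribe` and the example sequence are hypothetical, and the sketch assumes the template strand is written in the order in which the polymerase reads it, so that the RNA comes out in the same left-to-right order.

```python
# Pairing rules from the text: base in the DNA template -> base added to the RNA.
TEMPLATE_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}


def transcribe(template_dna: str) -> str:
    """Return the RNA sequence specified by a DNA template strand.

    Assumption: the template is given in the order the polymerase reads it.
    """
    try:
        return "".join(TEMPLATE_TO_RNA[base] for base in template_dna.upper())
    except KeyError as err:
        # Any character other than A, C, G, or T is rejected.
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Because each precursor carries three phosphates and each incorporated unit keeps one, an RNA of n nucleotides made this way corresponds to 2n discarded phosphates.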
In addition to specifying the sequence of amino acids to be polymerized into proteins, the nucleotide sequence of DNA contains supplementary information. For example, short sequences of nucleotides determine the initiation site for each RNA polymerase, specifying where and when RNA synthesis should occur. In the case of RNA polymerases I and II, the sequences specifying initiation sites lie just ahead of the genes. In contrast, the equivalent information for RNA polymerase III lies within the gene—that is, within the region of DNA to be copied into RNA. The initiation site on a segment of DNA is called a promoter. The promoters of different genes have some nucleotide sequences in common, but they differ in others. The differences in sequence are recognized by specific proteins called transcription factors, which are necessary for the expression of particular types of genes. The specificity of transcription factors contributes to differences in the gene expression of different types of cells.
Processing of mRNA
During and after synthesis, mRNA precursors undergo a complex series of changes before the mature molecules are released from the nucleus. First, a modified nucleotide is added to the start of the RNA molecule by a reaction called capping. This cap later binds to a ribosome in the cytoplasm. The synthesis of mRNA is not terminated simply by the RNA polymerase’s detachment from DNA, but by chemical cleavage of the RNA chain. Many (but not all) types of mRNA have a simple polymer of adenosine residues added to their cleaved ends.
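As a rough sketch of the end modifications described above, the following treats a transcript as a string, cleaves it at a chosen position, and adds a cap label and an optional run of adenosines. The names `process_ends`, `cleavage_site`, and `tail_length` are hypothetical, as are the sequence and the numbers used. The label `m7G` stands for the modified cap nucleotide, which the text does not name; 7-methylguanosine is assumed here as the usual cap.

```python
CAP = "m7G"  # assumed label for the modified cap nucleotide


def process_ends(transcript: str, cleavage_site: int, tail_length: int = 0) -> str:
    """Cap the start of a transcript, cleave it, and optionally add a poly(A) tail.

    cleavage_site is the number of nucleotides kept from the start.
    tail_length may be 0, since not all mRNAs receive a tail.
    """
    if not 0 < cleavage_site <= len(transcript):
        raise ValueError("cleavage site lies outside the transcript")
    if tail_length < 0:
        raise ValueError("tail length cannot be negative")
    body = transcript[:cleavage_site]  # everything past the cleavage site is discarded
    return f"{CAP}-{body}{'A' * tail_length}"


if __name__ == "__main__":
    print(process_ends("AUGGCCUUUUGCGGGCCC", cleavage_site=12, tail_length=5))
    # m7G-AUGGCCUUUUGCAAAAA
```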
In addition to these modifications of the termini, startling discoveries in 1977 revealed that portions of newly synthesized RNA molecules are cut out and discarded. In many genes, the regions coding for proteins are interrupted by intervening sequences of nucleotides called introns. These introns must be excised from the RNA copy before it can be released from the nucleus as a functional mRNA. The number and size of introns within a gene vary greatly, from no introns at all to more than 50. The sum of the lengths of these intervening sequences is sometimes longer than the sum of the regions coding for proteins.
The removal of introns, called RNA splicing, appears to be mediated by small nuclear ribonucleoprotein particles (snRNPs). These particles contain RNA sequences that are complementary to the junctions between introns and the adjacent coding regions. By binding to both junctions of an intron, an snRNP twists the intron into a loop. It then excises the loop and splices the coding regions together.
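The net effect of splicing can be illustrated with a short sketch that removes marked intervals from a precursor and joins what remains. It does not model how snRNPs recognize the junctions: the intron positions are simply supplied by the caller, and the function name `splice`, the sequence, and the coordinates are all hypothetical.

```python
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns, given as half-open (start, end) index pairs, and join the exons."""
    exons = []
    position = 0  # start of the coding region currently being collected
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError(f"bad or overlapping intron: {(start, end)}")
        exons.append(pre_mrna[position:start])  # keep the coding region before the intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep whatever follows the last intron
    return "".join(exons)


if __name__ == "__main__":
    # Two 6-nucleotide coding regions interrupted by one 11-nucleotide intron.
    precursor = "AUGGCC" + "GUAAGUUUCAG" + "UUUUAA"
    print(splice(precursor, [(6, 17)]))  # AUGGCCUUUUAA
```

In this toy precursor the intron (11 nucleotides) is nearly as long as the two coding regions combined (12 nucleotides), which echoes the point that intervening sequences can rival or exceed the coding sequence in length.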
Regulation of genetic expression
Although all the cell nuclei of an organism generally carry the same genes, there are conspicuous differences between the specialized cell types of the body. The source of these differences lies less in the occasional modification of DNA, as outlined above, than in the selective expression of DNA through RNA; in particular, it can be traced to processes regulating the amounts and activities of mRNA both during and after its synthesis in the nucleus.
Regulation of RNA synthesis
The first level of regulation is mediated by variations in chromatin structure. In order to be transcribed, a gene must be assembled into a structurally distinct form of active chromatin. A second level of regulation is achieved by varying the frequency with which a gene in the active conformation is transcribed into RNA by an RNA polymerase. There is evidence for regulation of RNA synthesis at both these levels—for example, in response to hormone induction. At both levels, protein factors are believed to perform the regulation—for example, by binding to special promoter DNA regions flanking the transcribed gene.