The rules by which the base sequences of deoxyribonucleic acid (DNA) are translated into the amino acid sequences of proteins. Each sequence of DNA that codes for a protein is transcribed or copied into messenger ribonucleic acid (mRNA). Following the rules of the code, discrete elements in the mRNA, known as codons, specify each of the 20 different amino acids that are the constituents of proteins. During translation, another class of RNAs, called transfer RNAs (tRNAs), are coupled to amino acids, bind to the mRNA, and, in a step-by-step fashion, provide the amino acids that are linked together in the order called for by the mRNA sequence. The specific attachment of each amino acid to the appropriate tRNA, and the precise pairing of tRNAs via their anticodons to the correct codons in the mRNA, form the basis of the genetic code. See Deoxyribonucleic acid (DNA), Protein, Ribonucleic acid (RNA)
The genetic information in DNA is found in the sequence or order of four bases that are linked together to form each strand of the two-stranded DNA molecule. The bases of DNA are adenine, guanine, thymine, and cytosine, which are abbreviated as A, G, T, and C. Chemically, A and G are purines, and C and T are pyrimidines. The two strands of DNA are wound about each other in a double helix that looks like a twisted ladder. Each rung of the ladder is formed by two bases, one from each strand, that pair with each other by means of hydrogen bonds. For a good fit, a pyrimidine must pair with a purine; in DNA, A bonds with T, and G bonds with C. See Purine, Pyrimidine
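The pairing rules can be illustrated with a short Python sketch. The function names are illustrative only. The sketch also assumes the standard fact, not stated above, that the two strands run in opposite directions, which is why the partner strand is usually written reversed.

```python
# Pairing rules for DNA as given in the text: A bonds with T, G bonds with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}

def complement_strand(strand: str) -> str:
    """Return the partner bases rung by rung, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)

def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own reading direction.

    Assumption (not from the text): the strands are antiparallel, so the
    partner is reversed when written 5' to 3'.
    """
    return complement_strand(strand)[::-1]

# Every rung joins exactly one purine with one pyrimidine.
assert all((base in PURINES) != (PAIR[base] in PURINES) for base in PAIR)

print(complement_strand("ATGC"))   # TACG
print(reverse_complement("ATGC"))  # GCAT
```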
Ribonucleic acids such as mRNA or tRNA also comprise four bases, except that in RNA the pyrimidine uracil (U) replaces thymine. During transcription a single-stranded mRNA copy of one strand of the DNA is made.
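As a rough illustration of transcription, the following Python sketch builds the mRNA copy from either strand of a short, made-up DNA fragment. The function names and the example sequence are hypothetical, and the sketch ignores everything about the process except base pairing.

```python
# Base pairing between a DNA template and the RNA being made:
# RNA carries uracil (U) where DNA would carry thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template: str) -> str:
    """Pair each base of the copied (template) strand with its RNA partner.

    The template is assumed to be written 3' to 5', so the result reads
    5' to 3' without reversal.
    """
    return "".join(DNA_TO_RNA[base] for base in template)

def transcribe_coding(coding: str) -> str:
    """Equivalent shortcut: the mRNA matches the other (coding) strand, with U for T."""
    return coding.replace("T", "U")

print(transcribe_template("TACGGTATT"))  # AUGCCAUAA
print(transcribe_coding("ATGCCATAA"))    # AUGCCAUAA
```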
If two bases at a time are grouped together, then only 4 × 4 or 16 different combinations are possible, a number that is insufficient to code for all 20 amino acids that are found in proteins. However, if the four bases are grouped together in threes, then there are 4 × 4 × 4 or 64 different combinations. Read sequentially without overlapping, those groups of three bases constitute a codon, the unit that codes for a single amino acid.
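The counting argument can be checked directly. The Python sketch below enumerates every possible group of one, two, or three bases and finds the shortest group length that provides at least 20 distinct combinations. The variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"          # the four RNA bases
AMINO_ACIDS = 20        # amino acids that must be encoded

def count_words(length: int) -> int:
    """Number of distinct base groups of the given length."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    print(length, count_words(length))  # 1 4, then 2 16, then 3 64

# Shortest group length with enough combinations for all amino acids.
shortest = next(n for n in range(1, 10) if len(BASES) ** n >= AMINO_ACIDS)
print(shortest)  # 3
```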
The 64 codons can be divided into 16 families of four (see illustration), in which each codon begins with the same two bases. With the number of codons exceeding the number of amino acids, several codons can code for the same amino acid. Thus, the code is degenerate. In eight instances, all four codons in a family specify the same amino acid. In the remaining families, the two codons that end with the pyrimidines U and C often specify one amino acid, whereas the two codons that end with the purines A and G specify another. Furthermore, three of the codons, UAA, UAG, and UGA, do not code for any amino acid but instead signal the end of the protein chain.
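The family structure can be reproduced from the standard codon table. The Python sketch below writes the table as a compact 64-letter string of one-letter amino acid symbols (with * for a stop signal), ordered with the bases U, C, A, G in each position. This compact layout and all names are conventions chosen for the example rather than anything given in the text.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard code, one letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Group codons into families that share their first two bases.
families = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    families[codon[:2]].append(amino_acid)

# Families in which all four codons specify the same amino acid.
unmixed = sorted(p for p, acids in families.items() if len(set(acids)) == 1)
stops = sorted(c for c, acid in CODON_TABLE.items() if acid == "*")

print(len(families))  # 16
print(len(unmixed))   # 8
print(stops)          # ['UAA', 'UAG', 'UGA']
```

Counting the codons per amino acid in the same table shows the degeneracy directly: 61 codons are shared among 20 amino acids.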
On the ribosome, the nucleic acid code of an mRNA is converted into an amino acid sequence with the aid of tRNAs. These RNAs are relatively small nucleic acids, varying from 75 to 93 bases in length, that are folded in three dimensions to form an L-shaped molecule to which an amino acid can be attached. At the other end of the tRNA molecule, three bases are free to pair with a codon in the mRNA. These three bases of a tRNA constitute the anticodon. Each amino acid has one or more tRNAs, and because of the degeneracy of the code, many of the tRNAs for a specific amino acid have different anticodon sequences. However, the tRNAs for one amino acid are capable of pairing their anticodons only with the codon or codons in the mRNA that specify that amino acid. The tRNAs act as interpreters of the code, providing the correct amino acid in response to each codon by virtue of precise codon-anticodon pairing. The tRNAs pair with the codons and sequentially insert their amino acids in the exact order specified by the sequence of codons in the mRNA. See Ribosomes
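The interpreter role of the tRNAs can be sketched in Python. This is a simplified illustration, not a model of the ribosome: it assumes one hypothetical tRNA per amino-acid-specifying codon and strict base pairing between codon and anticodon, whereas real cells manage with fewer tRNAs because pairing at the third codon position is less strict. The names TRNA_POOL, anticodon_for, and translate are illustrative, and amino acids are written as one-letter symbols.

```python
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon: str) -> str:
    """Anticodon that pairs base by base with the codon (written 5' to 3')."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))

# Hypothetical pool: one tRNA (keyed by anticodon) per sense codon.
# Stop codons have no tRNA.
TRNA_POOL = {
    anticodon_for(codon): amino_acid
    for codon, amino_acid in CODON_TABLE.items()
    if amino_acid != "*"
}

def translate(mrna: str) -> str:
    """Read codons in order; each is answered by the tRNA whose anticodon pairs with it."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = TRNA_POOL.get(anticodon_for(mrna[i:i + 3]))
        if amino_acid is None:  # no tRNA pairs with a stop codon
            break
        chain.append(amino_acid)
    return "".join(chain)

print(anticodon_for("AUG"))    # CAU
print(translate("AUGCCAUAA"))  # MP
```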
The rules of the genetic code are virtually the same for all organisms, but there are some interesting exceptions. In the microorganism Mycoplasma capricolum, UGA is not a stop codon; instead it codes for tryptophan. This alteration in the code is also found in the mitochondria of some organisms. In addition to changes in the meanings of codons, a modified system for reading codons that requires fewer tRNAs is found in mitochondria. See Gene, Gene action, Genetics
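The effect of such an exception can be shown by translating one message with two tables. In the Python sketch below, the variant table differs from the standard one only in the UGA reassignment described above; the table names and the example mRNA are made up for illustration.

```python
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Variant table: UGA codes for tryptophan (W) instead of signaling a stop.
MYCOPLASMA = dict(STANDARD, UGA="W")

def translate(mrna: str, table: dict) -> str:
    """Translate codon by codon with the given table, halting at a stop signal."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        chain.append(amino_acid)
    return "".join(chain)

mrna = "AUGUGAGGCUAA"
print(translate(mrna, STANDARD))    # M
print(translate(mrna, MYCOPLASMA))  # MWG
```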
A system of coding genetic information in molecules of nucleic acids that is realized in animals, plants, bacteria, and viruses in the form of a sequence of nucleotides.
Natural nucleic acids, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), contain five common types of nucleotides (four in each nucleic acid) that differ with respect to the nitrogen base. The following bases are found in DNA: adenine (A), guanine (G), cytosine (C), and thymine (T). RNA contains uracil (U) instead of thymine. In addition, the nucleic acids contain about 20 rare (so-called noncanonical) bases as well as unusual sugars. Because the number of coding signs (four) and number of varieties of amino acids in protein (20) do not coincide, the code number (that is, the number of nucleotides that code one amino acid) cannot equal 1. There can only be 4² = 16 different dinucleotide combinations, but this too is not sufficient to code all the amino acids. The American scientist G. Gamow proposed (1954) the model of a triplet genetic code, that is, one in which a group of three nucleotides, called a codon, codes a single amino acid. The number of possible triplets is 4³ = 64, but since this is more than triple the number of common amino acids, it was hypothesized that several codons exist for each amino acid (that is, the code is degenerate).
Many different models of the genetic code have been suggested, of which three merit serious attention (see Figure 1): overlapping code without commas, nonoverlapping code without commas, and code with commas. In 1961 F. Crick (Great Britain) and his coworkers confirmed the hypothesis of a triplet, nonoverlapping code without commas. The following main regularities of the genetic code were established. (1) There is a linear correlation between the sequence of nucleotides and the sequence of amino acids coded (colinearity of the genetic code). (2) The genetic code begins from a fixed point. (3) It proceeds in one direction within a single gene. (4) The code is nonoverlapping. (5) There are no intervals in the code (the code has no commas). (6) The genetic code is, as a rule, degenerate, that is, two or more synonymous triplets code one amino acid (the degeneracy of the genetic code decreases the probability that the mutational substitution of a base in the triplet will result in an error). (7) The code number is three. (8) The code is universal for all organisms (with a few exceptions). The universality of the genetic code was confirmed by experiments with protein synthesis in vitro. If one adds to a cell-free system obtained from one organism (for example, the colon bacillus, Escherichia coli) a nucleic acid template obtained from another organism evolutionarily remote from it (for example, pea seedlings), protein synthesis will take place in this system. Thanks to the work of the American geneticists M. Nirenberg, S. Ochoa, and H. Khorana, not only the composition but also the order of the nucleotides in all the codons is known. (See Table 1 for data from experiments with the colon bacillus.)
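The difference between an overlapping and a nonoverlapping triplet code without commas can be made concrete with a small Python sketch. It assumes the fully overlapping variant, in which each successive triplet starts one base after the previous one. The function names and the example sequence are illustrative only. The sketch counts how many triplets are altered when a single base is substituted.

```python
def read_nonoverlapping(seq: str) -> list:
    """Split the sequence into consecutive triplets from a fixed starting point."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

def read_overlapping(seq: str) -> list:
    """Fully overlapping reading: a new triplet starts at every base."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

def changed_words(reader, seq: str, pos: int, new_base: str) -> int:
    """Count the triplets that differ after substituting one base."""
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    return sum(a != b for a, b in zip(reader(seq), reader(mutated)))

seq = "AUGCCAGGU"
print(read_nonoverlapping(seq))  # ['AUG', 'CCA', 'GGU']
print(len(read_overlapping(seq)))  # 7

# Substituting the base at index 4 (C -> A):
print(changed_words(read_nonoverlapping, seq, 4, "A"))  # 1
print(changed_words(read_overlapping, seq, 4, "A"))     # 3
```

In the nonoverlapping reading a single substitution alters one triplet, and therefore at most one amino acid, whereas in the fully overlapping reading it alters up to three adjacent triplets.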
Of 64 codons in bacteria and bacteriophages, three (UAA, UAG, and UGA) do not code amino acids. They function as a signal for the release of the polypeptide chain from the ribosome; that is, they signal that the synthesis of the polypeptide is completed. They are called terminating codons. There are also three signals for the start of synthesis, the so-called initiation codons (AUG, GUG, and UUG), which are incorporated at the beginning of the corresponding messenger RNA and determine the incorporation of formylmethionine in the first position of the polypeptide chain being synthesized. The foregoing facts apply to bacterial systems; much is still unknown concerning the higher organisms. For example, the codon UGA in higher organisms may be meaningful. The mechanism of initiation of a polypeptide is also not yet fully clear.
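How start and stop signals delimit a coding region can be sketched in Python. This is a deliberately naive illustration: it treats the first AUG, GUG, or UUG found as the start, which is an assumption of the sketch, since a real cell relies on additional signals in the messenger RNA to choose the starting point. The function name and the example sequence are hypothetical.

```python
START_CODONS = {"AUG", "GUG", "UUG"}   # bacterial initiation codons from the text
STOP_CODONS = {"UAA", "UAG", "UGA"}    # terminating codons from the text

def find_coding_region(mrna: str):
    """Return the codons from the first usable start codon up to, not including, the stop.

    A start is usable only if a stop codon follows it in the same reading
    frame; otherwise the search continues with the next candidate start.
    Returns None if no complete region is found.
    """
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] not in START_CODONS:
            continue
        codons = []
        for i in range(start, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if codon in STOP_CODONS:
                return codons
            codons.append(codon)
    return None

print(find_coding_region("CCAUGGCUUAAGG"))  # ['AUG', 'GCU']
print(find_coding_region("CCCCCC"))         # None
```

Whichever of the three initiation codons opens the region, the text indicates that the first position of the bacterial polypeptide is filled by formylmethionine.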
The genetic code is realized in a cell in two stages. The first takes place in the nucleus. It is called transcription and consists of the synthesis of molecules of messenger RNA on the corresponding segments of DNA. In this process, the sequence of DNA nucleotides is “transcribed” into the sequence of RNA nucleotides. The second, or translation, stage takes place in the cytoplasm, on the ribosomes, where the sequence of messenger RNA nucleotides is translated into the sequence of amino acids in protein. This stage involves the participation of transfer RNA and corresponding enzymes.
N. P. DUBININ and V. N. SOIFER