Figure 4.8. The genetic code and the corresponding amino acid side chains. The three- and one-letter amino acid abbreviations are shown. Hydrophilic side chains are shown in green, hydrophobic side chains in black. The significance of this distinction is discussed in Chapter 9.
To save time we usually write an amino acid either as a three-letter abbreviation (glycine is gly and leucine is leu) or as a one-letter code (glycine is G and leucine is L). Figure 4.8 shows the full name, and the three- and one-letter abbreviations, used for each of the 20 amino acids found in proteins.
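The two abbreviation schemes amount to a simple lookup table. The sketch below is illustrative only: the names THREE_TO_ONE and to_one_letter are our own, and the table lists the standard one-letter codes for the 20 amino acids.

```python
# Standard three-letter to one-letter amino acid abbreviations.
# The dictionary and function names are hypothetical, chosen for this sketch.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "gln": "Q", "glu": "E", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue.lower()] for residue in peptide.split())


print(to_one_letter("gly leu"))  # glycine then leucine
```

A one-letter string is more compact for long sequences, while the three-letter form is easier to read at a glance.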
To introduce the terms degenerate and ambiguous, consider the English language. English shows considerable degeneracy, meaning that the same concept can be indicated using a number of different words—think, for example, of lockup, cell, pen, pound, brig, and dungeon. English also shows ambiguity, so that it is only by context that one can tell whether cell means a lockup or the basic unit of life. Like the English language the genetic code shows degeneracy, but unlike language the code is unambiguous.
The 64 codons of the genetic code are shown in Figure 4.8 together with the side chains of the amino acids for which each codes. Amino acids with hydrophilic side chains are shown in green while those with hydrophobic side chains are in black. The importance of this distinction will be discussed in Chapter 9. Sixty-one codons specify an amino acid, and the remaining three act as stop signals for protein synthesis. Methionine and tryptophan are the only amino acids coded for by single codons. The other 18 amino acids are encoded by either 2, 3, 4, or 6 codons and so the code is degenerate. No triplet codes for more than one amino acid and so the code is unambiguous. Notice that when two or more codons specify the same amino acid, they usually differ only in the third base of the triplet. Thus many mutations in this position of the codon do not alter the amino acid sequence. Perhaps degeneracy evolved in the triplet system to avoid a situation in which 20 codons each meant one amino acid, and 44 specified none. If that were the case, then most mutations would stop protein synthesis dead.
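These counts can be checked directly. The following sketch builds the standard genetic code as a Python dictionary and tallies it. The variable names are our own, and amino acids are written in one-letter code with * standing for a stop codon, which is a common convention rather than something taken from Figure 4.8.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# (first base changes slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or stop)?
codons_per_residue = Counter(CODON_TABLE.values())

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codon_count = len(CODON_TABLE) - len(stop_codons)
single_codon_residues = sorted(
    aa for aa, n in codons_per_residue.items() if n == 1
)
degeneracy_levels = sorted(
    {n for aa, n in codons_per_residue.items() if aa != "*"}
)

# First-two-base combinations for which the third base makes no difference.
third_base_free = [
    a + b
    for a, b in product(BASES, repeat=2)
    if len({CODON_TABLE[a + b + c] for c in BASES}) == 1
]

print("codons in total:", len(CODON_TABLE))
print("sense codons:", sense_codon_count)
print("stop codons:", stop_codons)
print("amino acids with a single codon:", single_codon_residues)
print("codons per amino acid:", degeneracy_levels)
print("prefixes where any third base gives the same amino acid:", len(third_base_free))
```

The last figure illustrates the point about the third base: for half of the 16 possible first-two-base combinations, any change in the third position leaves the amino acid unchanged.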
The order of the codons in DNA and the amino acid sequence of a protein are colinear. The start signal for protein synthesis is the codon AUG, which specifies the incorporation of methionine. Because the genetic code is read in blocks of three, there are three potential reading frames in any mRNA. Figure 4.9 shows that only one of these results in the synthesis of the correct protein. When we look at a sequence of bases, it is not obvious which of the reading frames should be used to code for protein. As we shall see later ...
reading frame 1: AUG CUA GAA UAC ... (met leu glu tyr)
reading frame 2: A UGC UAG AAU AC ... (cys stop asn)
reading frame 3: AU GCU AGA AUA C ... (ala arg ile)
Figure 4.9. Reading frames—the genetic code is read in blocks of three.
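The three reading frames of the sequence in Figure 4.9 can be worked through in a few lines. This is a minimal sketch, assuming the standard genetic code. The function name translate is our own, results are given in one-letter code with * marking a stop codon, and, unlike a ribosome, the function reads on past stop codons.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str, offset: int) -> str:
    """Read mrna in blocks of three starting at offset (0, 1, or 2).

    Bases left over at the end that do not fill a codon are ignored.
    """
    codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


mrna = "AUGCUAGAAUAC"  # the sequence shown in Figure 4.9
for offset in range(3):
    print(f"reading frame {offset + 1}: {translate(mrna, offset)}")
```

Only the first frame begins with AUG (M, methionine). The second frame runs into the stop codon UAG after a single amino acid.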
[Figure: frameshift mutation (normal: AUG UGG GUC GAG A..., met trp val glu; mutant: U deleted, shifting the reading frame) and nonsense mutation.]