Gene Ther Mol Biol Vol 10, 1-12, 2006

 

Low-usage codons and rare codons of Escherichia coli

Mini Review

 

Dequan Chen* and Donald E. Texada

Department of Ophthalmology, Louisiana State University Health Sciences Center, Shreveport, LA 71130

__________________________________________________________________________________

*Correspondence: Dequan Chen, Ph.D., Institute for Retina Research, 8210 Walnut Hill Lane, PBI, Suite 010, Dallas, TX 75231, USA; Phone: (214) 345-6801; email: dequan.chen@irrdallas.org.

Key words: low-usage codon, rare codon, common codon, major codon, codon usage, infrequent codon, rare tRNA, codon optimization, and rare tRNA supplementation

Abbreviations: Escherichia coli, (E. coli); rare codon, (RC); rare codon cluster, (RCC)

Received: 4 September 2005; Accepted: 28 December 2005; electronically published: January 2006

 

Summary

In Escherichia coli (E. coli), a low-usage codon is defined as a codon that is used rarely or infrequently in the genome, with a usage frequency below a frequency cut-off. The cut-off is the smallest value among the usage frequencies of the non-degenerate codons (Met codon AUG and Trp codon UGG) and of the optimal codons for the amino acids Leu, Ile, Val, Ser, Pro, Thr, Ala, Arg, Gly and Gln, each of which has 2 or more degenerate codons and a specific cognate tRNA for its optimal codon. A rare codon (RC), an infrequent codon or a minor codon is equivalently defined as a synonymous codon or a stop codon that is not only used rarely or infrequently in a genome but is also decoded by a low-abundance tRNA (rare tRNA) or other factor(s) in an organism. The translational rate for a sense RC is much lower than that for a common (major) codon because of limited tRNA availability. A low-usage codon is not necessarily an RC; for example, the Cys codons UGU and UGC, the Thr codons ACU and ACG, and the His codons CAC and CAU are not rare codons of E. coli. However, an RC is always a low-usage codon. In E. coli, there are about 30 low-usage synonymous sense codons, but only 20 of them are determined to be RCs: 7 (AGG, AGA, CGA, CUA, AUA, CCC and CGG) used at a frequency of < 0.5% (Group I) and 13 (ACA, CCU, UCA, GGA, AGU, UCG, CCA, UCC, GGG, CUC, CUU, UCU and UUA) used at a frequency of > 0.5% (Group II). Studies have demonstrated that all the RCs in Group I and the first 6 RCs in Group II can cause translational problems in E. coli.

I. Introduction

Many proteins, including those that can be used to treat certain diseases (e.g., insulin), can rarely be obtained in large quantities from their natural sources. In addition, their purification or isolation is often difficult and costly. Recombinant DNA techniques have been used successfully to express and purify such proteins. The bacterium E. coli has been, and will continue to be, the first-choice expression host because it facilitates recombinant protein expression through its relative simplicity, its inexpensive and fast high-density cultivation, its well-known genetics and the availability of a large number of compatible tools, including mutant strains, recombinant fusion partners and plasmids (Gold, 1990; Hodgson, 1993; Olins and Lee, 1993; Kane, 1995; Makrides, 1996; Jonasson et al, 2002; Sorensen and Mortensen, 2005a; Sorensen and Mortensen, 2005b). However, not every foreign gene can be efficiently expressed in E. coli, probably because of the unique and subtle structure of the target gene, low mRNA stability and translational efficiency, difficult protein folding, degradation of the target protein by E. coli proteases, differences in codon usage between the source organism of the foreign gene and E. coli, or the toxicity of the expressed target protein (Olins et al, 1993; Makrides, 1996; Jonasson et al, 2002).

A number of studies have revealed that RCs and rare codon clusters (RCCs) can cause expression problems, both qualitative and quantitative, in E. coli and other organisms (Kane, 1995; Makrides, 1996; Gurvich et al, 2005), and that these problems occur mainly at the translational level rather than at the transcriptional or other levels. The main translational problems caused by RCs or RCCs are that (a) the translation rate of the target gene is reduced, (b) the expression of the target protein is low or undetectable, (c) amino acids are misincorporated into the target protein, (d) truncated peptides or proteins, or ones with deleted amino acids, are synthesized, and (e) frame-shifted peptides or proteins are synthesized (Pedersen, 1984; Pohlner et al, 1986; Sorensen et al, 1989; Gurskii et al, 1992b; Kane et al, 1992; Gursky and Beabealashvilli, 1994; Vilbois et al, 1994; Kane, 1995; Calderone et al, 1996; Kleber-Janke and Becker, 2000; Kapust et al, 2002; McNulty et al, 2003; Flick et al, 2004; Shu et al, 2004; Choi et al, 2004; Chen et al, 2004; Gurvich et al, 2005). However, different groups have often arbitrarily used different sets of codons as their rare or low-usage codons, or have treated low-usage codons and rare-tRNA associated codons as equivalent. This can lead to at least the following problems: (a) some codons are rare or low-usage codons to some groups but not to others; and (b) the effects of rare codons on the expression of a gene are over- or under-estimated, even in the same system, simply because different low-usage or rare codons were defined or studied. To overcome these problems, universal meanings for low-usage codon and rare codon should be defined, and the list of low-usage codons or rare codons in an organism should be determined. Therefore, the objectives of this review are mainly to unify and differentiate the meanings of a low-usage codon and a rare-tRNA associated codon (RC for short), and to determine the lists of the low-usage codons and rare-tRNA associated codons in E. coli.

II. Codon usage in E. coli

Codon usage was defined by Zhang et al (1991) as the number of times (frequency) a codon is translated per unit time in the cell of an organism. This is a definition of real-time codon usage, which is hard to measure in vivo. Zhang et al used 3 different methods to estimate codon usage in E. coli and other organisms, including measuring the average frequencies of codons in the sequenced protein-coding genes of an organism. All their methods gave approximately the same results as regards the hierarchy of "most used" and "least used" codons within each synonymous codon family (Zhang et al, 1991). Therefore, it is reasonable to use the averaged codon frequency of the sequenced protein-coding reading frames of an organism as a rough representation of real-time codon usage, although this may overestimate the usage of infrequently or rarely used codons and underestimate that of frequently used codons, because different reading frames are translated different numbers of times in the organism at a given time (Zhang et al, 1991). Besides, this is what codon usage has generally meant to many scientists, in the past and at present.
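The averaged codon frequency described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `codon_frequencies_per_thousand` and the two toy reading frames are hypothetical, and every reading frame is counted once regardless of how often it is translated, which is exactly the simplification discussed in the text.

```python
from collections import Counter


def codon_frequencies_per_thousand(coding_sequences):
    """Average codon frequency (per 1000 codons) over a set of reading frames.

    Each sequence is assumed to be an in-frame coding sequence. Every frame
    is counted once, so expression level is NOT taken into account.
    """
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("T", "U")   # accept DNA or RNA spelling
        usable = len(seq) - len(seq) % 3       # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[seq[i:i + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


# Hypothetical toy reading frames (not real E. coli genes).
toy_frames = ["AUGAAAAAAUAA", "AUGAGGCUGUAA"]
freqs = codon_frequencies_per_thousand(toy_frames)
for codon in sorted(freqs):
    print(codon, round(freqs[codon], 2))
```

Because each frame contributes equally, a codon that occurs mostly in rarely translated frames receives the same weight as one in heavily translated frames.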

Before the 1980s and after the discovery of genetic code redundancy or degeneracy (every amino acid except Met and Trp is encoded by 2 to 6 codons), it was often thought that degenerate codons for the same amino acid were used randomly in a genome. This was based on the simplest assumption, that all genomes have uniform codon usage, meaning that synonymous codons (degenerate codons for the same amino acid) are used with equal frequency. As more and more sequence data (especially the gene sequences of the bacterium E. coli and the yeast Saccharomyces cerevisiae) appeared in the late 1970s and early 1980s, it came to light that (a) synonymous codon usage is consistently similar for all genes within each type of genome or organism (Grantham et al, 1980a,b, 1981), and (b) synonymous codon usage is not random, i.e., synonymous codons are not used with equal frequency in a genome (Ikemura, 1985; Sharp et al, 1988; Zhang et al, 1991; Sorensen et al, 2005a). This is also true for non-synonymous codon usage (non-random usage of different codons for different amino acids). Therefore, the codon usage among degenerate codons in each organism is biased, with some codons preferred (used more frequently) over the others. Further, studies also found that codon usage bias is greater in highly expressed genes than in poorly expressed genes (Gouy and Gautier, 1982; Sharp and Li, 1986; Makrides, 1996). That is to say, highly expressed genes in an organism mostly use preferred codons (especially the most preferred or optimal codons) and avoid non-preferred codons, while poorly expressed genes use fewer preferred or optimal codons and more non-preferred codons (Ikemura, 1985). Codon pair usage was also found not to be random (Nussinov, 1981; Lipman and Wilbur, 1983; Gutman and Hatfield, 1989; Irwin et al, 1995).

The codon usage frequencies for the 64 codons of E. coli (3 stop codons and 61 sense codons, i.e., codons that encode amino acids), calculated from the GenBank genetic sequence data (Releases #63, #69 and #147), are shown in Table 1. The data in the table demonstrate that as the total number of codons or protein-coding genes included in each GenBank release increases (especially from #69 to #147, about an 8-fold increase), the calculated frequency for a given sense codon changes as follows:

(a) The frequencies of low-usage codons (highlighted in red, purple and blue) tend to increase (those of CUC, GUA, UCG, CCA, CAU, UGU, CGA, UCU and ACU increase very little, while those of the others increase a lot), except for those of CAC, UGC and UCC (the last two decrease very little);

(b) The frequencies of some high-usage codons change very little (those of GUA, ACU, GCC, GCA and GAU increase while those of AUG, GUU, GUC, GCU, AAA, GAG and AGC decrease);

(c) The frequencies of some high-usage codons tend to increase (those of UUU, AUU, UAU, CAA, AAU and UGG), while the frequencies of other high-usage codons tend to decrease (those of UUC, CUG, AUC, GUG, CCG, ACC, GCG, UAC, CAG, AAC, GAC, GAA, CGU, CGC, GGU and GGC).

The above results may imply the following:

(1) Some codons, whether high-usage (see b above) or low-usage (see a above), are used at about the same frequency in the earlier-sequenced proteins (e.g., the proteins included in GenBank release #69) as in the newly sequenced proteins (e.g., the proteins included in GenBank release #147 but not in #69). Therefore, their usage frequencies change very little between the GenBank releases, and the usage frequencies calculated from either GenBank release #69 or #147 should represent their real-time codon usage frequencies well (Table 1).

(2) Most low-usage codons (see a above) and some high-usage codons (see c above) are little used by the earlier-sequenced proteins, and the newly sequenced proteins (e.g., the proteins included in GenBank release #147 but not in #69) use more of them. Therefore, their usage frequencies increase with the total number of protein genes included in the GenBank releases. Two factors should contribute to this phenomenon. One is that these codons, especially the low-usage codons, are used more frequently in the newly sequenced protein genes (most of which are poorly expressed), which increases their calculated usage frequencies. The other is that poorly expressed genes have low expression rates at a given time in the bacterial life, so averaging over the entire genome without weighting by the number of times each reading frame is translated overestimates their codon usage frequencies (Zhang et al, 1991). Therefore, the real-time codon usage frequencies for these codons should be much lower than the values calculated from GenBank release #147 and somewhere around the values from GenBank release #69 (Table 1).

(3) Some high-usage codons (see c above) are well used by the earlier-sequenced proteins, and the newly sequenced proteins (e.g., the proteins included in GenBank release #147 but not in #69) use fewer of them. Therefore, their usage frequencies decrease with the total number of protein genes included in the GenBank releases. The same two factors should contribute to this, but in the reverse direction. The first is that these codons are used less frequently in the newly sequenced protein genes (most of which are poorly expressed), which decreases their calculated usage frequencies. The second is that poorly expressed genes have low expression rates at a given time in the bacterial life, so averaging over the entire genome without weighting by the number of times each reading frame is translated underestimates the codon usage frequencies for these codons (Zhang et al, 1991). Therefore, the real-time codon usage frequencies for these codons should be much higher than the values from GenBank release #147 and somewhere around the values from GenBank release #69 or even #63 (Table 1).

The above analysis suggests that the frequency values listed in the II columns of Table 1 (calculated from GenBank release #69) most likely give the best approximation of the real-time codon usage frequencies in E. coli.

Dong et al (1996) measured E. coli codon usage frequencies at different bacterial growth rates (0.4-2.5 doublings per hour), calculated from the coding frames of 140 protein mRNAs. The results have been adapted and are presented in Table 2.

Table 1. Codon frequencies used by protein-coding reading frames of E. coli a

Codon    I b   II c   III d    Codon    I b   II c   III d    Codon    I b   II c   III d    Codon    I b   II c   III d
UUU    18.85   19.2   22.46    UCU    10.47   10.4   10.94    UAU    15.09   15.4   18.34    UGU     4.80    4.7    5.35
UUC    18.07   18.2   15.62    UCC     9.43    9.4    9.29    UAC    13.29   13.4   12.01    UGC     6.07    6.1    5.99
UUA    10.52   10.9   14.98    UCA     6.52    6.8    9.94    UAA     1.99    2.0    1.99    UGA     0.80    0.8    1.04
UUG    11.33   11.5   12.86    UCG     7.89    8.0    8.52    UAG     0.20    0.2    0.29    UGG    12.90   12.8   13.78
CUU     9.92   10.2   12.49    CCU     6.57    6.6    7.90    CAU    11.35   11.6   12.47    CGU    24.70   24.1   18.92
CUC     9.70    9.9   10.08    CCC     4.19    4.3    5.63    CAC    10.74   10.7    8.82    CGC    21.50   22.1   18.38
CUA     2.97    3.2    4.47    CCA     8.12    8.2    8.63    CAA    13.07   13.2   14.38    CGA     3.06    3.1    4.03
CUG    54.10   54.6   46.04    CCG    23.91   23.8   19.35    CAG    29.68   30.1   28.12    CGG     4.62    4.6    6.49
AUU    27.27   27.2   29.67    ACU    10.83   10.2   11.02    AAU    16.30   16.3   22.83    AGU     7.37    7.2   10.73
AUC    26.97   26.5   22.69    ACC    24.37   24.3   21.39    AAC    24.35   23.9   21.20    AGC    14.95   15.2   15.00
AUA     3.94    4.1    8.22    ACA     6.53    6.5   10.70    AAA    37.47   36.5   35.60    AGA     2.14    2.1    4.47
AUG    26.33   26.5   25.95    ACG    12.54   12.7   13.78    AAG    11.94   12.0   13.05    AGG     1.32    1.4    2.56
GUU    20.79   20.1   20.04    GCU    17.86   17.4   17.36    GAU    32.14   32.3   32.88    GGU    28.48   27.6   24.93
GUC    14.09   14.2   14.04    GCC    23.18   23.5   23.87    GAC    22.03   21.8   18.83    GGC    30.41   30.2   25.66
GUA    12.06   11.6   11.90    GCA    20.92   20.8   21.60    GAA    43.75   43.4   38.02    GGA     6.95    7.0   10.61
GUG    24.68   25.3   23.47    GCG    32.94   33.1   27.99    GAG    19.03   19.2   18.80    GGG     9.63    9.7   11.58

a. The usage of each codon is expressed as the frequency per 1000 codons, which is calculated by dividing the number of occurrences of the indicated codon by the total number of codons in all the sequenced E. coli protein-coding sequences (reading frames) and multiplying by 1000.

b. Taken from Zhang et al (1991). Codon usage frequency was calculated from 323059 codons of 968 protein coding reading frames (CDS) (GenBank Version 63.0, 15 March 1990).

c. Taken from Wada et al (1991). Codon usage frequency was calculated from 524410 codons of 1562 protein coding reading frames (CDS) (GenBank Version 69.0, September 1991).

d. Taken and adapted from http://www.kazusa.or.jp/codon (Nakamura et al, 2000). Codon usage frequency was calculated from 4182749 codons of 13778 protein coding reading frames (CDS) (GenBank Version 147.0, 1 June 2005).

Table 2. Real-time codon frequencies used by protein-coding reading frames of E. coli a

         Growth rate b                   Growth rate b                   Growth rate b                   Growth rate b
Codon    0.4    1.07     2.5    Codon    0.4    1.07     2.5    Codon    0.4    1.07     2.5    Codon    0.4    1.07     2.5
UUU    12.55   10.30    7.92    UCU    13.12   14.14   16.33    UAU    10.68    8.90    6.72    UGU     4.23    3.64    2.76
UUC    22.68   22.44   23.25    UCC    11.15   12.09   11.68    UAC    16.20   16.71   16.52    UGC     5.29    4.77    3.81
UUA     6.13    4.64    2.73    UCA     3.89    3.09    1.98    UAA     2.77    3.38    4.18    UGA     0.31    0.23    0.19
UUG     6.63    5.72    4.27    UCG     6.05    4.58    2.51    UAG     0.00    0.00    0.00    UGG     9.76    8.69    7.03
CUU     5.70    4.64    3.86    CCU     4.99    4.79    4.38    CAU     9.23    8.11    6.78    CGU    31.12   36.61   43.82
CUC     6.19    5.52    4.09    CCC     3.32    2.10    1.09    CAC    13.90   13.91   14.21    CGC    22.25   22.39   20.59
CUA     2.15    1.53    0.82    CCA     6.52    6.40    5.18    CAA    10.91    8.98    7.01    CGA     1.32    0.99    0.67
CUG    60.13   61.29   60.75    CCG    29.51   28.88   28.82    CAG    29.24   28.33   27.28    CGG     1.75    1.23    0.62
AUU    21.38   19.26   15.79    ACU    13.88   16.76   20.64    AAU     9.79    7.79    5.61    AGU     3.99    3.01    2.19
AUC    36.68   39.15   43.86    ACC    26.51   27.10   26.70    AAC    27.95   28.64   29.21    AGC    11.97   10.69    9.31
AUA     0.93    0.75    0.52    ACA     3.48    2.99    2.61    AAA    44.43   49.07   55.01    AGA     1.12    0.84    0.63
AUG c  25.32   25.82   25.90    ACG     7.53    6.21    4.17    AAG    12.08   13.74   17.22    AGG     0.09    0.05    0.03
GUU    31.31   35.63   43.18    GCU    28.85   32.14   39.49    GAU    24.25   22.40   19.27    GGU    38.29   40.49   45.55
GUC    11.25    9.71    7.67    GCC    19.80   16.81   11.81    GAC    28.72   30.93   33.74    GGC    35.62   35.54   34.17
GUA    15.87   18.65   22.31    GCA    22.13   22.38   24.87    GAA    53.10   55.10   57.86    GGA     2.71    2.21    1.26
GUG    21.40   18.93   14.98    GCG    30.33   28.45   24.11    GAG    16.57   17.04   16.97    GGG     4.81    3.57    2.36

a. Taken from Dong et al (1996). The usage of each codon is expressed as the frequency per 1000 codons. The codon frequencies were the averages from 140 proteins and calculated on the basis of the relative weight fraction of each protein and on the assumption that the amount of a protein accumulated in the cell during the steady growth is proportional to the amount of its corresponding mRNA in the bacteria.

b. Growth rate is expressed as doublings per hour. Different growth rates were obtained by varying the nutrient content of the culture media.

c. The data for AUG usage frequency are the sum of the frequencies for Metf1, Metf2, and Metm.

 

Although the number of protein-coding frames used is very small, the frequency values were obtained by weighting the amount of every protein at each growth rate of E. coli according to the data reported by Pedersen et al (1978). Therefore, the codon usage frequencies in Table 2 are real-time codon usage values. The data in Table 2 demonstrate that (a) E. coli codon usage is biased at all bacterial growth rates studied, (b) the frequencies of low-usage sense codons (marked in red and purple) decrease with increasing growth rate, and (c) the frequencies of some high-usage sense codons (UUC, AUC, GUU, GUA, UCU, ACU, GCU, GCA, CAC, AAC, AAA, GAC, GAA, CGU and GGU) increase with increasing growth rate while those of others decrease. In addition, most codon usage frequencies in Table 2 are in good agreement with those in Table 1.
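The effect of weighting can be illustrated with a minimal Python sketch. It is not the procedure of Dong et al (1996): the function names, the toy sequences and the weights are hypothetical, and each weight is assumed to stand for the relative number of times a reading frame is translated (for example, a relative protein copy number).

```python
from collections import Counter


def split_codons(seq):
    """Split an in-frame coding sequence into codons (RNA spelling)."""
    seq = seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def weighted_codon_usage(genes):
    """Codon usage per 1000 codons, weighting each reading frame.

    genes: iterable of (sequence, weight) pairs. The weight is an assumed
    proxy for how many times the frame is translated.
    """
    counts = Counter()
    for seq, weight in genes:
        for codon in split_codons(seq):
            counts[codon] += weight
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


# Hypothetical example: a highly expressed frame rich in AAA and a poorly
# expressed frame rich in the rare codon AGG.
genes = [("AUGAAAAAA", 9.0), ("AUGAGGAGG", 1.0)]
weighted = weighted_codon_usage(genes)
unweighted = weighted_codon_usage([(seq, 1.0) for seq, _ in genes])
print("AGG weighted:  ", round(weighted["AGG"], 2))
print("AGG unweighted:", round(unweighted["AGG"], 2))
```

In this toy case the unweighted average assigns AGG a far higher frequency than the weighted one, which is the direction of bias the text attributes to unweighted genome-wide averaging.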

III. tRNA abundance in E. coli

Codon usage bias in an organism may have been formed during evolution by the combined effects of various factors, such as the adaptation of gene expressivity to various growth conditions (Gouy et al, 1982), the adaptation of codons to tRNA availability (Ikemura, 1980, 1981a,b, 1985), the adaptation of codon-anticodon pairing or interaction to an optimal or intermediate energy strength (Grosjean et al, 1978; Grosjean and Fiers, 1982), the adaptation of codon mutations to form specific mRNA secondary structure(s), etc. However, codon adaptation to tRNA availability is thought to have played a key role in the formation of biased codon usage, because organism-specific codon usage patterns were demonstrated to correlate with the abundance spectra of organism-specific populations of isoaccepting or cognate tRNAs (Ikemura, 1980, 1981a,b, 1985).

The relative contents of tRNAs in normally growing E. coli, measured by Ikemura (1981a, 1981b, 1985), are listed in Table 3. The data in the table (relative contents for 38 or 40 tRNAs) demonstrate that: (a) tRNAGly3, which recognizes/decodes two codons (GGU and GGC), is the most abundant of all the E. coli tRNAs (relative amount 1.1), followed in succession by tRNAVal1, tRNAAla(GCY) and tRNAIle1 (relative amounts 1.05, 1.04 and 1.0 respectively), the first recognizing 3 codons (GUA, GUG and GUU) and the latter two recognizing 2 codons each (GCC and GCU; AUU and AUC); (b) tRNALeu1 recognizes only a single codon (CUG) yet is among the most abundant tRNAs (relative amount 1.0); (c) some tRNAs, including the cognate tRNAs for CUA, AUA, CGG, AGA and AGG, ACA and ACG, CCC, and UGU and UGC, have very low abundances, while the relative amounts of the other tRNAs range from 0.1 to 0.9; and (d) UCU (for Ser), GUU (for Val), GCU (for Ala), and GGG (for Gly) are each recognized by 2 isoacceptor tRNAs. In addition, the relative contents of tRNAs (43 or 45 tRNAs) in E. coli growing at different rates (0.4, 0.7, 1.07, 1.6 and 2.5 doublings/hour), measured by Dong et al (1996), are listed in Table 4. The data in Table 4 suggest that tRNA abundance in E. coli varies with bacterial growth rate: it increases as the growth rate increases, with an amplitude that differs among tRNAs. Most tRNA relative contents in Table 4 are in agreement with those in Table 3. The data of Tables 1 and 2, together with those of Tables 3 and 4, support the concept that the usage frequency of a synonymous codon often reflects or correlates with the abundance of its cognate tRNA in E. coli (Garel, 1974; Garel et al, 1981; Ikemura, 1985; Bulmer, 1987; Emilsson and Kurland, 1990; Emilsson et al, 1993; Kane, 1995; Makrides, 1996; Dong et al, 1996).
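As a rough way to examine this correlation, the sketch below pairs each tRNA that has a numeric content in Table 3 with the summed usage frequency (Table 1, column II) of the codons it recognizes, and computes a Pearson coefficient. The pairing scheme is an assumption made for illustration only: codons read by two tRNAs are counted for both, tRNAs listed only as "major" or "minor" and the Met tRNAs are omitted, and the helper `pearson` is written here rather than taken from the review.

```python
import math

# Table 1, column II (per 1000 codons), for the codons used below.
CODON_FREQ = {
    "CUG": 54.6, "CUU": 10.2, "CUC": 9.9, "UUA": 10.9, "UUG": 11.5,
    "GUA": 11.6, "GUG": 25.3, "GUU": 20.1, "GUC": 14.2,
    "GGG": 9.7, "GGA": 7.0, "GGU": 27.6, "GGC": 30.2,
    "GCA": 20.8, "GCG": 33.1, "GCU": 17.4, "GCC": 23.5,
    "CGU": 24.1, "CGC": 22.1, "CGA": 3.1,
    "AUU": 27.2, "AUC": 26.5, "AUA": 4.1, "AAA": 36.5, "AAG": 12.0,
    "GAA": 43.4, "GAG": 19.2, "GAU": 32.3, "GAC": 21.8,
    "ACU": 10.2, "ACC": 24.3, "AAU": 16.3, "AAC": 23.9,
    "CAA": 13.2, "CAG": 30.1, "UAU": 15.4, "UAC": 13.4,
    "UCU": 10.4, "UCA": 6.8, "UCG": 8.0, "AGU": 7.2, "AGC": 15.2,
    "CAC": 10.7, "CAU": 11.6, "UGG": 12.8, "UUU": 19.2, "UUC": 18.2,
}

# (tRNA, recognized codons, relative content) from Table 3, numeric rows only.
TRNA_ROWS = [
    ("Leu1", ["CUG"], 1.00), ("Leu2", ["CUU", "CUC"], 0.30),
    ("LeuUUR", ["UUA", "UUG"], 0.25),
    ("Val1", ["GUA", "GUG", "GUU"], 1.05), ("Val2", ["GUC", "GUU"], 0.40),
    ("Gly1", ["GGG"], 0.10), ("Gly2", ["GGA", "GGG"], 0.15),
    ("Gly3", ["GGU", "GGC"], 1.10),
    ("Ala1", ["GCA", "GCG", "GCU"], 0.85), ("AlaGCY", ["GCC", "GCU"], 1.04),
    ("Arg1,2", ["CGU", "CGC", "CGA"], 0.90),
    ("Ile1", ["AUU", "AUC"], 1.00), ("Ile2", ["AUA"], 0.05),
    ("Lys", ["AAA", "AAG"], 1.00), ("Glu", ["GAA", "GAG"], 0.90),
    ("Asp", ["GAU", "GAC"], 0.80), ("Thr1+3", ["ACU", "ACC"], 0.80),
    ("Asn", ["AAU", "AAC"], 0.60), ("Gln1", ["CAA"], 0.30),
    ("Gln2", ["CAG"], 0.40), ("Tyr", ["UAU", "UAC"], 0.50),
    ("Ser1", ["UCU", "UCA", "UCG"], 0.25), ("Ser3", ["AGU", "AGC"], 0.25),
    ("His", ["CAC", "CAU"], 0.40), ("Trp", ["UGG"], 0.30),
    ("Phe", ["UUU", "UUC"], 0.35),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


usage = [sum(CODON_FREQ[c] for c in codons) for _, codons, _ in TRNA_ROWS]
content = [amount for _, _, amount in TRNA_ROWS]
r = pearson(usage, content)
print(f"Pearson r between summed codon usage and tRNA content: {r:.2f}")
```

A coefficient computed this way is only indicative, since the double counting of shared codons and the omission of the non-numeric rows both affect it.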

IV. Definition of low-usage codon and rare codon

A low-usage codon is often called an RC, a minor codon or an infrequent codon, because all of these terms imply that the usage of such a codon in a genome or an organism is low or very low; in other words, the codon is used rarely or infrequently. All the above terms have been used equivalently in the past. However, different groups defined different sets of codons as their low-usage codons (although most groups included the several least-used codons in their sets) because of (a) the different numbers of protein-coding gene sequences available for calculating the codon usage frequencies, and (b) the arbitrary frequency cut-offs (e.g., 0.5%, 1.0%, or 1.1%) used by different people to define the boundary between low-usage codons and common codons. This may result in the following problems: (a) some codons are low-usage codons to some people but not to others, e.g., GUC and GCC were considered RCs by Pedersen (1984) but not by us; (b) different results or conclusions regarding the effects of low-usage codons on the expression of a gene (often an over- or under-estimation) may be obtained for the same system simply because different low-usage codons were defined or studied; and (c) the results from different groups, even for the same gene, are often hard to compare with each other. Therefore, universal definitions for the above terms, or universal terms with fixed meanings, are required.
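How strongly the choice of cut-off affects the resulting codon set can be checked with a short Python sketch. It applies the three example cut-offs mentioned above (0.5%, 1.0% and 1.1%, i.e. 5, 10 and 11 per 1000 codons) to the 61 sense-codon frequencies of Table 1, column II. The function name `low_usage_codons` is hypothetical, and a strict "less than" comparison is assumed.

```python
# Table 1, column II (GenBank release #69): frequency per 1000 codons
# for the 61 sense codons.
SENSE_CODON_FREQ = {
    "UUU": 19.2, "UCU": 10.4, "UAU": 15.4, "UGU": 4.7,
    "UUC": 18.2, "UCC": 9.4, "UAC": 13.4, "UGC": 6.1,
    "UUA": 10.9, "UCA": 6.8,
    "UUG": 11.5, "UCG": 8.0, "UGG": 12.8,
    "CUU": 10.2, "CCU": 6.6, "CAU": 11.6, "CGU": 24.1,
    "CUC": 9.9, "CCC": 4.3, "CAC": 10.7, "CGC": 22.1,
    "CUA": 3.2, "CCA": 8.2, "CAA": 13.2, "CGA": 3.1,
    "CUG": 54.6, "CCG": 23.8, "CAG": 30.1, "CGG": 4.6,
    "AUU": 27.2, "ACU": 10.2, "AAU": 16.3, "AGU": 7.2,
    "AUC": 26.5, "ACC": 24.3, "AAC": 23.9, "AGC": 15.2,
    "AUA": 4.1, "ACA": 6.5, "AAA": 36.5, "AGA": 2.1,
    "AUG": 26.5, "ACG": 12.7, "AAG": 12.0, "AGG": 1.4,
    "GUU": 20.1, "GCU": 17.4, "GAU": 32.3, "GGU": 27.6,
    "GUC": 14.2, "GCC": 23.5, "GAC": 21.8, "GGC": 30.2,
    "GUA": 11.6, "GCA": 20.8, "GAA": 43.4, "GGA": 7.0,
    "GUG": 25.3, "GCG": 33.1, "GAG": 19.2, "GGG": 9.7,
}


def low_usage_codons(freqs, cutoff_per_thousand):
    """Return the codons whose frequency is strictly below the cut-off."""
    return sorted(c for c, f in freqs.items() if f < cutoff_per_thousand)


# Cut-offs given as (percent, equivalent frequency per 1000 codons).
for percent, per_thousand in [(0.5, 5.0), (1.0, 10.0), (1.1, 11.0)]:
    codons = low_usage_codons(SENSE_CODON_FREQ, per_thousand)
    print(f"cut-off {percent}%: {len(codons)} codons: {', '.join(codons)}")
```

With these column II values the three cut-offs select 8, 19 and 24 sense codons respectively, so a study using the 0.5% boundary and one using the 1.1% boundary would be examining noticeably different codon sets.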

The correlation between the usage frequency of a synonymous codon and the abundance of its cognate tRNA (such as high-usage codons with high-abundance tRNAs and low-usage codons with low-abundance tRNAs), together with the expression problems reported so far that derive from low-usage codons and/or the availability of their cognate tRNAs, suggests that a single term is not enough to cover all the above meanings. To satisfy the above requirements, an RC, an infrequent codon or a minor codon is equivalently defined as a synonymous codon that is not

Table 3. Relative contents of tRNAs in E. coli a

tRNA          Recognized codon      Content b
Leu:
  1           CUG                   1.00
  2           CUU, CUC              0.30
  UUR         UUA, UUG              0.25
  CUA         CUA                   minor
Val:
  1           GUA, GUG, GUU*        1.05
  2           GUC, GUU*             0.40
Gly:
  1           GGG*                  0.10
  2           GGA, GGG*             0.15
  3           GGU, GGC              1.10
Ala:
  1           GCA, GCG, GCU*        0.85
  GCY         GCC, GCU*             1.04
Arg:
  1, 2        CGU, CGC, CGA         0.90
  CGG         CGG                   minor
  AGR         AGA, AGG              minor
Ile:
  1           AUU, AUC              1.00
  2           AUA                   0.05
Lys           AAA, AAG              1.00
Glu 2 (1)     GAA, GAG              0.90
Asp 1         GAU, GAC              0.80
Thr:
  1+3         ACU, ACC              0.80
  4           ACA, ACG              minor
Asn           AAU, AAC              0.60
Gln:
  1           CAA                   0.30
  2           CAG                   0.40
Tyr:
  1+2         UAU, UAC              0.50
Ser:
  1           UCU*, UCA, UCG        0.25
  3           AGU, AGC              0.25
  UCY         UCC, UCU*
His           CAC, CAU              0.40
Trp           UGG                   0.30
Pro:
  1           CCG                   major
  2           CCC                   minor
  3           CCU, CCA, CCG         major
Phe           UUU, UUC              0.35
Cys           UGU, UGC              minor
Met:
  m           AUG                   0.30
  f1          AUG                   0.40
  f2          AUG                   0.10

a. Taken and adapted from Ikemura (1981a,b, 1985)

b. The content is the amount relative to that of tRNALeu1 (CUG), which is normalized to 1.0 and is approximately on the order of 10^4 molecules per cell for normally growing E. coli.

*. A single codon is recognized by 2 tRNAs.

Table 4. Relative contents of tRNAs in E. coli at different growth rates

tRNA      Recognized codon(s)     Growth rate (doublings per hour)
                                  0.4    0.7    1.07   1.6    2.5
Leu:
  1       CUG                     1.00   1.06   1.19   1.51   1.57
  2       CUC, CUU                0.21   0.25   0.29   0.33   0.42
  3       CUA, CUG                0.15   0.18   0.19   0.23   0.22
  4       UUG                     0.43   0.45   0.49   0.68   0.66
  5       UUA, UUG                0.25   0.25   0.29   0.26   0.27
Val:
  1       GUA, GUG, GUU           0.86   0.86   0.78   1.35   1.45
  2A      GUC, GUU                0.14   0.14   0.17   0.19   0.20
  2B      GUC, GUU                0.14   0.17   0.19   0.26   0.31
Gly:
  1+2     (GGG) / (GGA, GGG)      0.48   0.51   0.55   0.78   0.79
  3       GGC, GGU                0.98   1.08   1.19   1.41   1.77
Ala:
  1B      GCU, GCA, GCG           0.73   0.83   1.00   1.24   1.49

2

GCC

0.1