One of the major achievements in recombinant technology is the high-level expression (overproduction) of foreign proteins in procaryotic cells such as Escherichia coli (E. coli). In recent years, this technology has improved the availability of medically and scientifically important proteins, several of which are already available for clinical therapy and scientific research. Overproduction of protein in procaryotic cells is demonstrated by directly measuring the activity of the enzyme with a suitable substrate or by measuring the physical amount of specific protein produced. High levels of protein production can be achieved by improving expression of the gene encoding the protein. An important aspect of gene expression is efficiency in translating the nucleotide sequence encoding the protein. There is much interest in improving the production of bacterial enzymes that are useful reagents in nucleic acid biochemistry itself, for example, DNA ligase, DNA polymerase, etc.
Unfortunately, this technology does not always provide high protein yields. One cause of low protein yield is inefficient translation of the nucleotide sequence encoding the foreign protein. Amplification of protein yields depends, inter alia, upon ensuring efficient translation.
Through extensive studies in several laboratories, it is now recognized that the nucleotide sequence at the N-terminus-encoding region of a gene is one of the factors strongly influencing translation efficiency. It is also recognized that alteration of the codons at the beginning of the gene can overcome poor translation. One strategy is to redesign the first portion of the coding sequence without altering the amino acid sequence of the encoded protein, by using the known degeneracy of the genetic code to alter codon selection.
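The redesign strategy described above can be illustrated with a minimal sketch. It assumes the standard genetic code and uses a short, hypothetical coding sequence; the names `translate` and `encodes_same_protein` are introduced here purely for illustration. The sketch only checks that a recoded sequence still encodes the same amino acids; it says nothing about which recoding translates more efficiently.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption:
# the organism uses the standard table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_protein(original, redesigned):
    """Return True if both sequences encode the same amino acid sequence."""
    return translate(original) == translate(redesigned)


# Hypothetical start of a coding sequence and a recoded version of it.
original = "ATGCGTGGTATGCTG"
redesigned = "ATGAGAGGAATGTTA"

print(translate(original), translate(redesigned))
print(encodes_same_protein(original, redesigned))
```

Here the arginine, glycine, and leucine codons are swapped for synonymous ones, so the nucleotide sequence changes while the encoded peptide does not.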
However, these studies do not predict, teach, or give guidance as to which bases are important or which sequences should be altered for a particular protein. Hence, the researcher must adopt an essentially empirical approach when attempting to optimize protein production by employing these recombinant techniques.
An empirical approach is laborious. Generally, a variety of synthetic oligonucleotides including all the potential codons for the correct amino acid sequence is substituted at the N-terminus-encoding region. A variety of methods can then be employed to select or screen for one oligonucleotide which gives high expression levels. Another approach is to obtain a series of derivatives by random mutagenesis of the original sequence. Extensive screening may then yield a clone with high expression levels. This candidate is then analyzed to determine the "optimal" sequence, and that sequence is used to replace the corresponding fragment in the original gene. Such shot-gun approaches require substantial screening effort with no assurance of success.
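To give a sense of the scale of such a screen, the following sketch enumerates every synonymous variant of the first few codons of a sequence. It assumes the standard genetic code; the input sequence and the function name `synonymous_variants` are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def synonymous_variants(dna, n_codons):
    """Return all sequences whose first n_codons are recoded synonymously.

    The remainder of the sequence is left unchanged. The original
    sequence is included among the results.
    """
    dna = dna.upper()
    if len(dna) < 3 * n_codons:
        raise ValueError("sequence is shorter than n_codons codons")
    head = [dna[i:i + 3] for i in range(0, 3 * n_codons, 3)]
    tail = dna[3 * n_codons:]
    choices = [SYNONYMS[CODON_TABLE[codon]] for codon in head]
    return ["".join(combo) + tail for combo in product(*choices)]


# Hypothetical sequence: recode the first five codons (Met-Arg-Gly-Met-Leu).
variants = synonymous_variants("ATGCGTGGTATGCTGCCG", 5)
print(len(variants))  # 1 * 6 * 4 * 1 * 6 = 144
```

Even five codons give 144 candidates in this example, and the count grows multiplicatively with each additional codon, which is why exhaustive substitution and screening quickly become burdensome.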
These tedious strategies are employed to amplify the synthesis of a desired protein which is produced by the unaltered (native) gene only in small quantities. The thermostable DNA polymerase from Thermus aquaticus (Taq Pol) is such a product.
Taq Pol catalyzes the combination of nucleoside triphosphates to form a nucleic acid strand complementary to a nucleic acid template strand. The application of thermostable Taq Pol to the amplification of nucleic acid by the polymerase chain reaction (PCR) was the key step in the development of PCR to its now dominant position in molecular biology. The gene encoding Taq Pol has been cloned, sequenced, and expressed in E. coli, yielding only modest amounts of Taq Pol.
Although Taq Pol is commercially available from several sources, it is expensive, partly because only modest amounts are recovered by the methods currently available. Increased production of Taq Pol is clearly desirable to meet increasing demand and to make production more economical.