One of the goals of recombinant DNA technology is to obtain efficient expression of the cloned DNA, with the expression product obtained in as high a yield as possible. Several techniques for expression are available, including (a) modification of the coding sequences to provide an exact desired translational starting point; (b) selection or construction of an optimal expression vector; (c) post-translational processing, either by exploiting in vivo processing activity of the host or by in vitro chemical means; and (d) direct expression.
Cloned DNA can be expressed as a fusion protein in which the protein coded for by the cloned DNA forms the C-terminal end. This is achieved by inserting the foreign gene or cDNA into an appropriate site within an expressed operon (expression vector), for example, the Pst I site in the β-lactamase gene of pBR322 (Villa-Komaroff, L., et al, Proc. Natl. Acad. Sci. USA, 75, 3727 (1978) and Seeburg, P., et al, Nature, 274, 795 (1978)), the EcoRI site of pBR322 carrying the lac control region and coding sequence for β-galactosidase (Itakura, K., et al, Science, 198, 1056 (1977)) or the HindIII site of the trpD gene of plasmid ptrpED50 (Martial, J., et al, Science, 205, 602 (1979)). Methods for modifying the sequence length by one or two nucleotides, where needed to achieve the correct reading frame, are well known in the art.
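By way of illustration only, the reading-frame adjustment referred to above can be sketched in Python. The function name `frame_padding` and its parameters are hypothetical and are not taken from any of the cited references. The sketch assumes that lengths are counted in nucleotides from the first base of the host start codon up to the junction, and that the frame is corrected by adding nucleotides rather than by removing them.

```python
def frame_padding(host_coding_len: int, linker_len: int = 0) -> int:
    """Return how many nucleotides (0, 1 or 2) would need to be added at the
    junction so that the first codon of the insert is in frame with the host
    coding sequence.

    host_coding_len: host coding nucleotides upstream of the junction
                     (hypothetical parameter, counted from the host AUG).
    linker_len:      nucleotides contributed by any tail or linker
                     (hypothetical parameter, defaults to none).
    """
    if host_coding_len < 0 or linker_len < 0:
        raise ValueError("lengths must be non-negative")
    # The insert is in frame when the total upstream length is a multiple of 3.
    return (-(host_coding_len + linker_len)) % 3
```

For instance, with 182 host coding nucleotides upstream of the junction (60 complete codons plus two nucleotides), `frame_padding(182)` indicates that one added nucleotide would restore the frame.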
Cloned DNA can be expressed directly under certain circumstances. Chang, A.C.Y., et al, Proc. Natl. Acad. Sci. USA, 77, 1442 (1980) have reported direct expression of mouse dihydrofolate reductase (DHFR). The mouse DHFR coding sequence had been dC-tailed and inserted into the dG-tailed Pst I site of pBR322. The authors found that transformed bacteria synthesized a protein having the enzymatic properties, immunological reactivity and molecular size of mouse DHFR. They also found that the cDNA for DHFR was in a different translational reading frame from the bacterial β-lactamase gene into which it had been inserted. These findings implied that, with this method of insertion, translation was re-initiated at the start codon for the mouse DHFR, producing mouse DHFR directly and not as part of a fusion protein.
A second technique for direct expression involves replacing the coding segment normally transcribed and translated under the bacterial control region with the foreign coding sequence. The essential component of the control region to be preserved is termed the expression unit, which includes a promoter and a ribosomal binding site capable of acting in the host organism. It is not necessary to remove all of the nucleotides coding for the host portion of the fusion protein. Optimal protein expression depends on the distance and the sequence between the ribosomal binding site and the start codon (AUG) of the protein to be expressed (Goeddel, I. et al, Nature, 281, 544 (1979); Roberts, T. M. et al, Proc. Natl. Acad. Sci. USA, 76, 5596 (1979); and Taniguchi, T. et al, Proc. Natl. Acad. Sci. USA, 77, 5230 (1980)). Although the exact distance for optimal protein expression is not known, it has been reported that the start codon may be located anywhere within 3-11 nucleotides of the ribosomal binding site (Shine, J., et al, Proc. Natl. Acad. Sci. USA, 71, 1342 (1974) and Steitz, J., et al, Proc. Natl. Acad. Sci. USA, 72, 4734 (1975)). In this 3-11 nucleotide region, the first AUG encountered sets the reading frame for translation. Plasmid ptrpE30, derived from ptrpED50 (described supra), contains the operator, promoter, leader, attenuator and ribosome binding sequence of the E protein of the tryptophan operon, together with the nucleotide sequence coding for seven amino acids of the trp E protein, followed by a HindIII site. Removal of a minimum of 23-29 nucleotides from the HindIII site provides a site for insertion of the cDNA under tryptophan operon control. In this method, the foreign DNA is prepared so that it begins at or near the start codon. This DNA is then inserted into the modified ptrpE30 to obtain direct expression of the insert DNA.
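The spacing rule described above can be illustrated with a minimal Python sketch. The function `first_start_in_window`, its parameters and the example sequences are hypothetical. The sketch assumes that the spacing is counted as the number of nucleotides lying between the last base of the ribosomal binding site and the A of the AUG, which is one possible reading of "within 3-11 nucleotides", and that the ribosomal binding site is supplied by the caller as a literal sequence.

```python
from typing import Optional, Tuple


def first_start_in_window(mrna: str, rbs: str,
                          min_gap: int = 3,
                          max_gap: int = 11) -> Optional[Tuple[int, int]]:
    """Locate the first AUG lying min_gap..max_gap nucleotides downstream of
    the first occurrence of the ribosomal binding site (rbs).

    Returns (index_of_A_in_AUG, gap) using 0-based indexing, or None when the
    rbs is absent or no AUG falls inside the window. DNA input is accepted and
    converted to RNA (T -> U).
    """
    mrna = mrna.upper().replace("T", "U")
    rbs = rbs.upper().replace("T", "U")
    rbs_pos = mrna.find(rbs)
    if rbs_pos < 0:
        return None
    rbs_end = rbs_pos + len(rbs)  # index of the first base after the rbs
    # Scan outward so that the first AUG encountered is the one reported.
    for gap in range(min_gap, max_gap + 1):
        start = rbs_end + gap
        if mrna[start:start + 3] == "AUG":
            return start, gap
    return None
```

In this sketch an AUG closer than 3 or farther than 11 nucleotides from the binding site is ignored, so a construct whose start codon falls outside the window returns `None` and would be a candidate for trimming or extension of the spacer.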
The trp operon has proved useful for the expression of a fusion protein or for direct expression. Several expression vectors containing the trp operon have been prepared for use in synthesizing fusion proteins. Hallewell, R. A. and Emtage, R. A., Gene, 9, 27 (1980) describe the preparation of an expression vector, ptrpED5-1, containing the promoter, operator, leader, attenuator, trp E gene and 15% of the trp D gene sequences. This expression vector has been utilized to produce a fusion protein containing human growth hormone (Martial, J., et al, supra). Tacon, W., et al, Molec. Gen. Genet., 177, 427 (1980) describe the preparation of expression vectors pWT 111, pWT 121 and pWT 131. These expression vectors are derived from ptrpED5-1 by digestion with HinfI to remove the DNA sequences of the trp D gene and all but 21 deoxyribonucleotides of the trp E gene.
In each of the above expression methods utilizing the trp operon, maximum expression is not obtainable. The trp operon contains two transcriptional control points. The primary control point is the promoter/operator region. Transcription of the operon is regulated by trp repressor molecules binding at this site and repressing the operon. The addition of 3β-indolylacrylic acid induces the trp operon approximately 50-fold. A secondary control point involves the leader and attenuator sequence of the operon. This sequence regulates transcription of the trp operon over an approximately 10-fold range by terminating transcription at this point (Bertrand, K., et al, Science, 189, 22 (1975)). The leader is a potentially translatable region for a peptide of 14 amino acids. It possesses its own ribosome binding site and a coding sequence containing two tandem trp codons. The mechanism of attenuation appears to involve secondary structure changes in the nascent messenger that are influenced by the translation of these two codons (Lee, F., and Yanofsky, C., Proc. Natl. Acad. Sci. USA, 75, 5988-5992 (1978)).
When trp tRNA is limiting, translation pauses at these two codons and transcription continues past the attenuator. When trp tRNA is abundant, however, translation continues and transcription terminates at the attenuator, yielding a 140-nucleotide transcript corresponding to the leader region.
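The combined effect of the two control points can be illustrated with a simple Python sketch. The approximate 50-fold and 10-fold figures are those given above; the function name `relative_transcription`, the normalisation of the repressed and attenuated level to 1.0, and the treatment of the two control points as independent multiplicative factors are assumptions made for illustration only and are not measurements.

```python
INDUCTION_FOLD = 50.0    # approximate promoter/operator induction stated above
ATTENUATION_FOLD = 10.0  # approximate leader/attenuator range stated above


def relative_transcription(induced: bool,
                           attenuator_present: bool,
                           trp_trna_limiting: bool) -> float:
    """Estimate transcription relative to a repressed, attenuated operon (1.0).

    Assumption: the two control points act independently, so their fold
    effects multiply. This is a simplification for illustration.
    """
    level = 1.0
    if induced:
        level *= INDUCTION_FOLD
    # Read-through occurs when there is no attenuator to terminate at, or
    # when limiting trp tRNA stalls translation at the tandem trp codons.
    if not attenuator_present or trp_trna_limiting:
        level *= ATTENUATION_FOLD
    return level
```

Under these assumptions, an induced vector that retains the attenuator in a host with abundant trp tRNA gives a relative level of 50, whereas the same vector without the attenuator gives 500, which is the gap that removal of the attenuator is intended to close.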
While it is possible to induce the trp operon 50-fold with 3β-indolylacrylic acid, it is not possible to maximize transcription and hence expression when the trp operon expression vector contains the attenuator sequence. Applicants have prepared an expression vector derived from the trp operon which maximizes expression of the foreign gene and provides for the direct expression of the foreign gene.