One of the goals of recombinant DNA technology is to obtain efficient expression of the cloned DNA. It is desirable to obtain the expression product in as high a yield as possible. Several techniques for expression are available, including (a) modification of the coding sequences to provide an exact desired translational starting point; (b) selection or construction of an optimal expression vector; (c) post-translational processing, either by exploiting in vivo processing activity of the host or by in vitro chemical means; and (d) direct expression.
Cloned DNA can be expressed as a fusion protein which contains the protein coded for by the cloned DNA at its C-terminal end. The protein coded for by the foreign gene or cDNA can be expressed as a fusion protein by insertion of the foreign gene or cDNA into appropriate sites within expressed operons (expression vectors) including, for example, the Pst I site in the .beta.-lactamase gene of pBR322 (Villa-Komaroff, L., et al, Proc. Nat. Acad. Sci. USA, 75, 3727 (1978) and Seeburg, P., et al, Nature, 274, 795 (1978)), the EcoRI site of pBR322 carrying the lac control region and coding sequence for .beta.-galactosidase (Itakura, K., et al, Science, 198, 1056 (1977)) or the HindIII site of the trpD gene of plasmid ptrpED50 (Martial, J., et al, Science, 205, 602 (1979)). Modifications of sequence length by one or two nucleotides, where needed to place the insert in the correct reading frame, are well known in the art.
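The reading-frame adjustment mentioned above can be illustrated with a minimal sketch. It assumes the only quantity that matters is the number of coding nucleotides of the host gene lying upstream of the insertion junction; the function name and the example lengths are hypothetical and are not taken from any of the cited vectors.

```python
def nucleotides_to_add(upstream_coding_nt: int) -> int:
    """Return how many nucleotides (0, 1 or 2) would need to be added at the
    junction so that the first codon of the insert starts in frame.

    upstream_coding_nt is the number of host-gene coding nucleotides that
    precede the junction (a hypothetical input for illustration).
    """
    if upstream_coding_nt < 0:
        raise ValueError("upstream_coding_nt must be non-negative")
    # The insert is in frame when the upstream length is a multiple of 3;
    # otherwise pad up to the next multiple of 3.
    return (-upstream_coding_nt) % 3


if __name__ == "__main__":
    for n in (180, 181, 182):
        print(n, nucleotides_to_add(n))
```

Removing `upstream_coding_nt % 3` nucleotides instead of adding would restore the frame equally well; which option is practical depends on the restriction sites and linkers available.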
Cloned DNA can be expressed directly under certain circumstances. Chang, A.C.Y., et al, Proc. Nat. Acad. Sci. USA, 77, 1442 (1980) have reported that they obtained direct expression of mouse dihydrofolate reductase when the coding sequence therefor was dC-tailed and inserted into the dG-tailed Pst I site of pBR322. A second technique for direct expression involves replacing the coding segment normally transcribed and translated by a bacterial control region, which includes a promoter and ribosomal binding site, with any desired coding sequence. Application Ser. No. 213,879, filed Dec. 8, 1980 (and its continuation, Ser. No. 518,613, filed July 29, 1983), incorporated herein by reference, describes the synthesis of a direct expression vector containing the control region of the trp operon.
The trp operon has proved useful for the expression of a fusion protein or for direct expression. Several expression vectors containing the trp operon have been prepared for use in synthesizing fusion proteins. Hallewell, R. A. and Emtage, R. A., Gene, 9, 27 (1980) describe the preparation of an expression vector, ptrpED5-1, containing the promoter, operator, leader, attenuator, trp E gene and 15% of the trp D gene sequences. This expression vector has been utilized to produce a fusion protein containing part of the trp D protein and human growth hormone (Martial, J., et al, supra). Tacon, W., et al, Molec. Gen. Genet., 177, 427 (1980) describe the preparation of expression vectors pWT 111, pWT 121 and pWT 131. These expression vectors are derived from ptrpED5-1 by digestion with HinfI to remove the DNA sequences of the trp D gene and all but 21 deoxyribonucleotides of the trp E gene.
In each of the above expression methods utilizing the trp operon, maximum expression is not obtainable. The trp operon contains two transcriptional control points. The primary control point is the promoter/operator region. Transcription of the operon is regulated by trp repressor molecules, which bind at this site and repress the operon. The addition of 3.beta.-indolylacrylic acid induces the trp operon approximately 50-fold. A secondary control point involves the leader and attenuator sequence of the operon. This sequence regulates transcription of the trp operon approximately 10-fold by terminating transcription at the attenuator (Bertrand, K., et al, Science, 189, 22 (1975)). The leader sequence encodes a short peptide containing two adjacent tryptophan codons. When trp tRNA is limiting, translation pauses at these two codons and transcription continues past the attenuator. However, when trp tRNA is abundant, translation continues and transcription terminates at the attenuator, yielding a 140 bp transcript corresponding to the leader region.
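The effect of the two control points described above can be put in rough numbers. The following sketch assumes, purely for illustration, that the promoter/operator and the attenuator act independently and multiplicatively on a basal level of 1; the text states only the approximate 50-fold and 10-fold figures and does not state how they combine, and the function and parameter names are hypothetical.

```python
def relative_transcription(
    promoter_induced: bool,
    attenuation_relieved: bool,
    promoter_fold: float = 50.0,    # approximate induction by 3-beta-indolylacrylic acid
    attenuator_fold: float = 10.0,  # approximate range of control by the attenuator
) -> float:
    """Relative transcription level under a simple multiplicative model.

    ASSUMPTION: the two control points are treated as independent factors.
    This is an illustrative simplification, not a measured relationship.
    """
    level = 1.0
    if promoter_induced:
        level *= promoter_fold
    if attenuation_relieved:
        level *= attenuator_fold
    return level


if __name__ == "__main__":
    # Induced promoter, attenuator still terminating transcription.
    print(relative_transcription(True, False))
    # Induced promoter with no termination at the attenuator.
    print(relative_transcription(True, True))
```

Under this assumption, a vector that retains the attenuator reaches only about one tenth of the level available once termination at the attenuator no longer occurs, even when the promoter is fully induced.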
While it is possible to induce the trp operon 50-fold with 3.beta.-indolylacrylic acid, it is not possible to maximize transcription and hence expression when the trp operon expression vector contains the attenuator sequence. Application Ser. No. 213,879 (and its continuation, supra) describes a direct expression vector (ptrpL1) which is derived from the trp operon and lacks the attenuator sequence. Although this expression vector is suitable for the direct expression of many proteins, it has been discovered that it is not suitable for the direct expression of all proteins. For example, applicants discovered that insertion of the hepatitis B surface antigen (HBsAg) gene into the Cla I site of ptrpL1 did not result in the production of HBsAg.
While it is possible to synthesize fusion proteins containing a part of the .beta.-lactamase protein by the prior art methods, it has often not been possible to express some fusion proteins in amounts sufficient to make the prior art methods practical. Applicants have discovered that the location of the trp promoter upstream from the .beta.-lactamase gene in ptrpL1 results in the overproduction of .beta.-lactamase when the trp promoter is induced by 3.beta.-indolylacrylic acid. Applicants further discovered that a fusion protein is also overproduced when foreign DNA is inserted into the .beta.-lactamase gene. Applicants hypothesize that the trp promoter overrides the .beta.-lactamase promoter to cause the overproduction of the .beta.-lactamase or the fusion protein.