With the advent of recombinant DNA technology, the controlled bacterial production of an enormous variety of useful polypeptides has become possible. Already in hand are bacteria modified by this technology to permit the production of polypeptide products such as somatostatin (K. Itakura, et al., Science 198, 1056 [1977]), the (component) A and B chains of human insulin (D. V. Goeddel, et al., Proc Nat'l Acad Sci, USA 76, 106 [1979]), and human growth hormone (D. V. Goeddel, et al., Nature 281, 544 [1979]). More recently, recombinant DNA techniques have been used to occasion the bacterial production of thymosin alpha 1, an immune potentiating substance produced by the thymus (U.S. patent application Ser. No. 125,685 of Roberto Crea and Ronald Wetzel, filed Feb. 28, 1980, now abandoned, and assigned to the assignee of the present application). Such is the power of the technology that virtually any useful polypeptide can be bacterially produced, putting within reach the controlled manufacture of hormones, enzymes, antibodies, and vaccines against a wide variety of diseases. The cited materials, which describe in greater detail the representative examples referred to above, are incorporated herein by reference, as are other publications referred to infra, to illuminate the background of the invention.
The workhorse of recombinant DNA technology is the plasmid, a non-chromosomal loop of double-stranded DNA found in bacteria, oftentimes in multiple copies per bacterial cell. Included in the information encoded in the plasmid DNA is that required to reproduce the plasmid in daughter cells (i.e., a "replicon") and, ordinarily, one or more selection characteristics, such as resistance to antibiotics, which permit clones of the host cell containing the plasmid of interest to be recognized and preferentially grown in selective media. The utility of bacterial plasmids lies in the fact that they can be specifically cleaved by one or another restriction endonuclease or "restriction enzyme", each of which recognizes a different site on the plasmidic DNA. Thereafter heterologous genes or gene fragments may be inserted into the plasmid by endwise joining at the cleavage site or at reconstructed ends adjacent the cleavage site. As used herein, the term "heterologous" refers to a gene not ordinarily found in, or a polypeptide sequence ordinarily not produced by, E. coli, whereas the term "homologous" refers to a gene or polypeptide which is produced in wild-type E. coli. DNA recombination is performed outside the bacteria, but the resulting "recombinant" plasmid can be introduced into bacteria by a process known as transformation, and large quantities of the heterologous gene-containing recombinant plasmid can be obtained by growing the transformant. Moreover, where the gene is properly inserted with reference to portions of the plasmid which govern the transcription and translation of the encoded DNA message, the resulting expression vehicle can be used to actually produce the polypeptide sequence for which the inserted gene codes, a process referred to as expression.
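The cleavage and endwise joining just described can be illustrated with a minimal sketch. It assumes the enzyme EcoRI (recognition sequence GAATTC, cut after the first G), which the passage above does not name; the plasmid and fragment sequences, and the function names, are hypothetical placeholders chosen only for illustration. Only the top strand is modeled.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence (enzyme choice is an assumption)
ECORI_CUT_OFFSET = 1   # EcoRI cuts the top strand between G and AATTC


def find_sites(circular_dna, site=ECORI_SITE):
    """Return start positions of `site` on a circular top strand."""
    # Append the first few bases so a site spanning the origin is also found.
    doubled = circular_dna + circular_dna[: len(site) - 1]
    return [i for i in range(len(circular_dna)) if doubled.startswith(site, i)]


def linearize(plasmid, site=ECORI_SITE, offset=ECORI_CUT_OFFSET):
    """Open a circular plasmid at its single recognition site."""
    positions = find_sites(plasmid, site)
    if len(positions) != 1:
        raise ValueError(f"expected exactly one site, found {len(positions)}")
    cut = (positions[0] + offset) % len(plasmid)
    return plasmid[cut:] + plasmid[:cut]


def insert_fragment(plasmid, fragment):
    """Join a fragment with matching ends into the opened plasmid.

    The returned string is read as circular: its last base is joined
    to its first base when the second ligation closes the loop.
    """
    return linearize(plasmid) + fragment


# Hypothetical sequences, not taken from any real plasmid or gene.
plasmid = "TTGACAGAATTCGCTAGC"
fragment = "AATTCATGAAAG"  # carries EcoRI-compatible ends
recombinant = insert_fragment(plasmid, fragment)
print(recombinant)
print(find_sites(recombinant))
```

In this sketch the recombinant carries two recognition sites, one on each side of the insert, because joining compatible ends regenerates the site at both junctions.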
Expression is initiated in a region known as the promoter which is recognized by and bound by RNA polymerase. In some cases, as in the trp operon discussed infra, promoter regions are overlapped by "operator" regions to form a combined promoter-operator. Operators are DNA sequences which are recognized by so-called repressor proteins which serve to regulate the frequency of transcription initiation at a particular promoter. The polymerase travels along the DNA, transcribing the information contained in the coding strand from its 5' to 3' end into messenger RNA which is in turn translated into a polypeptide having the amino acid sequence for which the DNA codes. Each amino acid is encoded by a unique nucleotide triplet or "codon" within what may for present purposes be referred to as the "structural gene", i.e. that part which encodes the amino acid sequence of the expressed product. After binding to the promoter, the RNA polymerase first transcribes nucleotides encoding a ribosome binding site, then a translation initiation or "start" signal (ordinarily ATG, which in the resulting messenger RNA becomes AUG), then the nucleotide codons within the structural gene itself. So-called stop codons are transcribed at the end of the structural gene whereafter the polymerase may form an additional sequence of messenger RNA which, because of the presence of the stop signal, will remain untranslated by the ribosomes. Ribosomes bind to the binding site provided on the messenger RNA, in bacteria ordinarily as the mRNA is being formed, and themselves produce the encoded polypeptide, beginning at the translation start signal and ending at the previously mentioned stop signal. The desired product is produced if the sequences encoding the ribosome binding site are positioned properly with respect to the AUG initiator codon and if all remaining codons follow the initiator codon in phase. 
The resulting product may be obtained by lysing the host cell and recovering the product by appropriate purification from other bacterial protein.
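The requirement that all codons follow the initiator codon in phase can be made concrete with a short sketch. It assumes the standard genetic code and a hypothetical coding strand (a ribosome binding site, a start signal, four codons, a stop codon and an untranslated trailer); the sequence and the function names are illustrative only. Ribosome binding itself is not modeled: translation simply begins at the first AUG.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """mRNA has the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG, codon by codon, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical coding strand: leader with binding site, ATG, codons, TAA, trailer.
leader = "AGGAGGACAGCT"
in_phase = leader + "ATG" + "GCTGGTTGTAAA" + "TAA" + "CCC"
# One extra nucleotide after the start signal shifts every later codon.
out_of_phase = leader + "ATG" + "C" + "GCTGGTTGTAAA" + "TAA" + "CCC"

print(translate(transcribe(in_phase)))
print(translate(transcribe(out_of_phase)))
```

The second call yields a different, shorter peptide, which is the practical consequence of a gene insert that is out of phase with the initiator codon.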
Polypeptides expressed through the use of recombinant DNA technology may be entirely heterologous, as in the case of the direct expression of human growth hormone, or alternatively may comprise a heterologous polypeptide and, fused thereto, at least a portion of the amino acid sequence of a homologous peptide, as in the case of the production of intermediates for somatostatin and the components of human insulin. In the latter cases, for example, the fused homologous polypeptide comprised a portion of the amino acid sequence for beta galactosidase. In those cases, the intended bioactive product is bioinactivated by the fused, homologous polypeptide until the latter is cleaved away in an extracellular environment. Fusion proteins like those just mentioned can be designed so as to permit highly specific cleavage of the precursor protein from the intended product, as by the action of cyanogen bromide on methionine, or alternatively by enzymatic cleavage. See, e.g., G.B. Patent Publication No. 2 007 676 A.
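The specificity of cyanogen bromide cleavage can be sketched as a simple string operation, on the understanding that cyanogen bromide cleaves a polypeptide on the carboxyl side of each methionine residue. The carrier sequence below is a hypothetical placeholder, not the actual beta galactosidase sequence; the product is the 14-residue somatostatin sequence as commonly reported, which contains no methionine. The function name is likewise illustrative.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine (M).

    Chemically, the methionine at each cut is converted to homoserine
    (lactone); the sketch keeps the letter M to mark where it was.
    """
    fragments = []
    current = []
    for residue in protein:
        current.append(residue)
        if residue == "M":
            fragments.append("".join(current))
            current = []
    if current:
        fragments.append("".join(current))
    return fragments


CARRIER = "TLSDKQRE"            # hypothetical stand-in for the homologous portion
SOMATOSTATIN = "AGCKNFFWKTFTSC"  # no internal methionine

fusion = CARRIER + "M" + SOMATOSTATIN
print(cnbr_fragments(fusion))

# A product with an internal methionine would itself be cut.
print(cnbr_fragments(CARRIER + "M" + "AGMK"))
```

The design point is that a single methionine placed at the junction releases the intended product intact only when the product itself is free of methionine; otherwise enzymatic cleavage, as mentioned above, would be the alternative.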
If recombinant DNA technology is to fully sustain its promise, systems must be devised which optimize expression of gene inserts, so that the intended polypeptide products can be made available in high yield. The beta lactamase and lactose promoter-operator systems most commonly used in the past, while useful, have not fully utilized the capacity of the technology from the standpoint of yield. A need has existed for a bacterial expression vehicle capable of the controlled expression of desired polypeptide products in higher yield.
Tryptophan is an amino acid produced by bacteria for use as a component part of homologous polypeptides in a biosynthetic pathway which proceeds: chorismic acid → anthranilic acid → phosphoribosyl anthranilic acid → CDRP [enol-1-(o-carboxyphenylamino)-1-desoxy-D-ribulose-5-phosphate] → indol-3-glycerol-phosphate, and ultimately to tryptophan itself. The enzymatic reactions of this pathway are catalyzed by the products of the tryptophan or "trp" operon, a polycistronic DNA segment which is transcribed under the direction of the trp promoter-operator system. The resulting polycistronic messenger RNA encodes the so-called trp leader sequence and then, in order, the polypeptides referred to as trp E, trp D, trp C, trp B and trp A. These polypeptides variously catalyze and control individual steps in the pathway chorismic acid → tryptophan.
In wild-type E. coli, the tryptophan operon is under at least three distinct forms of control. In the case of promoter-operator repression, tryptophan acts as a corepressor and binds to its aporepressor to form an active repressor complex which, in turn, binds to the operator, closing down the pathway in its entirety. Secondly, by a process of feedback inhibition, tryptophan binds to a complex of the trp E and trp D polypeptides, prohibiting their participation in the pathway synthesis. Finally, control is effected by a process known as attenuation under the control of the "attenuator region" of the gene, a region within the trp leader sequence. See generally G. F. Miozzari et al, J. Bacteriology 133, 1457 (1978); The Operon 263-302, Cold Spring Harbor Laboratory (1978), Miller and Reznikoff, eds.; F. Lee et al, Proc. Natl. Acad. Sci. USA 74, 4365 (1977) and K. Bertrand et al, J. Mol. Biol. 103, 319 (1976). The extent of attenuation appears to be governed by the intracellular concentration of tryptophan, and in wild-type E. coli the attenuator terminates expression in approximately nine out of ten cases, possibly through the formation of a secondary structure, or "termination loop", in the messenger RNA which causes the RNA polymerase to prematurely disengage from the associated DNA.
Other workers have employed the trp operon to obtain some measure of heterologous polypeptide expression. This work, it is believed, attempted to deal with problems of repression and attenuation by the addition of indole acrylic acid, an inducer and analog which competes with tryptophan for trp repressor molecules, tending toward derepression by competitive inhibition. At the same time the inducer diminishes attenuation by inhibiting the enzymatic conversion of indole to tryptophan and thus effectively depriving the cell of tryptophan. As a result more polymerases successfully read through the attenuator. However, this approach appears problematic from the standpoint of completing translation consistently and in high yield, since tryptophan-containing protein sequences are prematurely terminated in synthesis due to lack of utilizable tryptophan. Indeed, an effective relief of attenuation by this approach is entirely dependent on severe tryptophan starvation.
The present invention addresses problems associated with tryptophan repression and attenuation in a different manner and provides (1) a method for obtaining an expression vehicle designed for direct expression of heterologous genes from the trp promoter-operator, (2) methods for obtaining vehicles designed for expression, from the trp promoter-operator, of specifically cleavable polypeptides coded by homologous-heterologous gene fusions, and (3) a method of expressing heterologous polypeptides controllably, efficiently and in high yield, as well as the associated means.