Many systems have been developed for expression of genes which encode products of commercial interest. For various purposes, such genetic expression systems frequently employ genetic elements that have the capacity to replicate when separated from the genome of the host cell in which they replicate and, accordingly, may be designated as independently replicating genetic elements. Independently replicating elements used in expression systems comprise, for example, variants of DNA or RNA virus genomes including proviruses. Such elements further comprise nonviral genomes that are typically encoded in circular DNA molecules including plasmids.
It is well known in the art that the level of production of a protein by a cell usually is increased substantially by increasing the number of copies per cell of the gene encoding that protein. Therefore, many gene expression systems include a means for providing multiple copies in each host cell of the gene for the desired protein, in other words, a means for amplifying the gene encoding the desired protein.
Certain observations about replication of bacterial plasmids, for example, have been exploited for the purpose of gene amplification in genetic expression systems. In particular, plasmids occurring in several bacteria in nature possess genetic mechanisms for regulating their replication. These regulatory mechanisms maintain plasmid replication in concert with that of the host cell. Genetic variants of plasmids are known that exhibit different ratios of the number of plasmid copies to the number of host cell genomes per cell under typical cell growth conditions. Thus, the so-called "copy number" is a genetically determined attribute (i.e., a "phenotype") of a given independently replicating element such as a plasmid.
Some plasmids are typified by an unconditional "high-copy-number" phenotype, which is useful for providing limited gene amplification in bacterial expression systems. Even more useful for gene amplification are certain conditional mutants of plasmids which suffer complete abrogation of genetic regulatory restraints on plasmid replication. These mutants express or exhibit uncontrolled plasmid replication that outstrips or runs away from the replication of the host cell genome. Accordingly, a phenotype characterized by such uncontrolled replication is known in the art as a "runaway-replication" phenotype. (Such a phenotype may also be called a "runaway-copy-number" phenotype.)
Continuous expression of a runaway-replication phenotype results in excessive plasmid accumulation that ultimately leads to death of the host cell. However, temperature-sensitive (ts) mutations affecting regulation of plasmid replication are known which exhibit a conditional (more specifically, a temperature-dependent) runaway-replication phenotype that is particularly useful for gene amplification in expression systems. Under the appropriate environmental conditions (i.e., below a critical temperature), replication of ts runaway-replication plasmids is sufficiently limited to allow continuous cell growth. Above that critical thermal point, however, such plasmids accumulate in the cell in amounts far beyond the levels achievable with plasmids of unconditional high-copy-number, which necessarily must not accumulate in lethal amounts. Use of a plasmid for gene expression that exhibits a conditional runaway-replication phenotype thus allows greater plasmid accumulation and, hence, greater amplification of any gene inserted into that plasmid, than can be obtained using a plasmid with an unconditional high-copy-number phenotype.
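The contrast drawn above between regulated and runaway replication can be illustrated with a toy calculation. The following is only a sketch under stated assumptions, not a model taken from the cited literature: in each host generation the plasmid pool is multiplied by a fold factor and is then halved at cell division. The function names, the critical temperature of 37° C., the fold factors of 2.0 and 3.0, and the starting value of 20 copies are all hypothetical values chosen for illustration.

```python
def plasmid_fold_per_generation(temp_c, critical_temp_c=37.0,
                                regulated_fold=2.0, runaway_fold=3.0):
    """Fold increase of the plasmid pool during one host generation.

    All default values are hypothetical. A fold of 2.0 exactly keeps pace
    with cell division; any fold above 2.0 outstrips it.
    """
    return runaway_fold if temp_c > critical_temp_c else regulated_fold


def copies_per_cell(start_copies, temps_c):
    """Copies per cell after each generation for a list of temperatures."""
    history = [float(start_copies)]
    for temp_c in temps_c:
        fold = plasmid_fold_per_generation(temp_c)
        # The pool grows by `fold`, then is split between two daughter cells.
        history.append(history[-1] * fold / 2.0)
    return history


# Five generations held below the (assumed) critical temperature.
permissive = copies_per_cell(20, [30.0] * 5)
# Two generations at 30 C, then three generations at 42 C.
induced = copies_per_cell(20, [30.0] * 2 + [42.0] * 3)

print(permissive)
print(induced)
```

In this sketch the copy number stays constant while the fold factor matches cell division and rises geometrically once it exceeds that rate, which is the qualitative behavior attributed above to a conditional runaway-replication phenotype.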
A more detailed understanding of the genetic basis of regulation of plasmid replication in general, and particularly of runaway-replication phenotypes, especially in specific plasmids of Escherichia coli (hereinafter, E. coli), will be helpful for appreciation of the workings of the present invention. Regulation of plasmid replication has been studied most extensively, in fact, in E. coli, in a group of plasmids related to a prototype known as the ColE1 plasmid.
It has long been established that to replicate as an independent DNA molecule, any independently replicating element requires specific sequences for initiation of DNA replication that are designated the "origin of replication". This origin provides a recognition site for an RNA polymerase (i.e., a promoter) to begin de novo synthesis of an RNA strand complementary to a short sequence of DNA adjacent to the promoter. The resulting short RNA, which is called the "primer" for DNA synthesis, is then extended by a DNA polymerase to form a long DNA strand covalently linked with the RNA primer. This indirect method of initiating DNA replication via RNA synthesis appears to be necessary because the DNA polymerase cannot initiate a new strand but can only extend existing polynucleotides.
Further, the requirement for RNA synthesis to initiate DNA replication provides a means for regulation of plasmid replication involving the RNA primer and a small RNA called "RNA I". RNA I is transcribed from the antisense strand of DNA in the region encoding the primer (which historically has been designated as "RNA II"; Lacatena, R. M., et al, 1984, Cell 37, 1009-1014; Tomizawa, J. et al, 1986, Cell 47, 89-97). Binding between RNA I and RNA II leads to transcription termination, thereby preventing DNA synthesis at the origin of replication. More specifically, RNA I is thought to exert its negative control on plasmid copy number through the base pairing interactions of its particular structural features called "stem-loops" with similar stem-loop structures of the primer transcript (Davison, J., 1984, Gene 28, 1-15; Cesareni, G. et al, 1985, Trends Biochem. Sci. 10, 303-306; Wong, E. M. et al, 1985, Cell 42, 959-966; Tomizawa, J. et al, 1986, Cell 47, 89-97).
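Because RNA I is transcribed from the strand opposite the one that encodes the primer, its sequence is the reverse complement of the corresponding segment of RNA II, which is what permits base pairing between the two transcripts. The following minimal sketch illustrates this relationship only. The 12-nucleotide segment is an invented example and is not a ColE1 sequence, and the function names are hypothetical.

```python
# Watson-Crick pairing for RNA bases, expressed as a translation table.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Return the RNA read from the opposite strand of the same region."""
    return seq.translate(COMPLEMENT)[::-1]


def is_fully_complementary(strand_a, strand_b):
    """True if the two RNAs can pair along their whole length, antiparallel."""
    return len(strand_a) == len(strand_b) and \
        reverse_complement_rna(strand_a) == strand_b


# Hypothetical segment of the primer transcript (RNA II), for illustration.
rna_ii_segment = "GGCUAACGUAGC"
# The antisense transcript (RNA I) over the same region.
rna_i_segment = reverse_complement_rna(rna_ii_segment)

print(rna_i_segment)
print(is_fully_complementary(rna_ii_segment, rna_i_segment))
```

The sketch treats pairing as simple end-to-end complementarity. It does not represent the stem-loop structures through which the interaction is thought to proceed.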
The conditional runaway-replication phenotype of some plasmids used for gene expression in E. coli derives from ts mutations in the primer transcript (RNA II) that cause elevated copy number when cells harboring the plasmid are grown at a temperature higher than some critical temperature. Several gene expression systems are known that combine the features of a particular thermoinducible runaway plasmid, for example, pKN402 (Uhlin, B. E. et al, 1979, Gene 6, 91-106) with those of efficient promoters, to produce high-level expression in bacteria (Bittner, M. et al, 1981, Gene 15, 319-329; Masui, Y., et al, 1983, Academic Press, New York, N.Y., pp. 15-32; Remaut et al., 1983, Gene 22, 103-113). Another example of a runaway plasmid useful for gene expression is the 7.3 kb plasmid pEW2762 (Wong et al., 1982, Proc. Natl. Acad. Sci. USA 79, 3570-3574) which contains two ts mutations in the particular stem-loop of the primer transcript (RNA II) designated stem-loop IV. Together these two ts mutations cause elevated copy number when cells harboring the plasmid are grown at 42° C., but do not adversely affect regulation of plasmid replication at 30° C. (Wong et al., 1982, Proc. Natl. Acad. Sci. USA 79, 3570-3574; Wong and Polisky, 1985, Cell 42, 959-966). It is presumed that these ts mutations act by changing the conformation of RNA II, in a temperature-dependent manner, so that at elevated temperatures only, RNA II cannot bind RNA I. This temperature-sensitive copy-number phenotype of the plasmid bearing such ts primer mutations has been designated the Cop^ts phenotype.
It is also noteworthy in relation to the present invention that the regulation of replication in at least some plasmids is further influenced by diffusible factors encoded by the plasmid. One particularly relevant example of such a factor is a small plasmid-encoded protein known as Rop which is found in some ColE1 and ColE1-like plasmids. This protein was named "repressor of primer", or Rop, because of its evident ability to regulate transcription initiation at the promoter for the RNA primer in the plasmid origin of replication; for example, this protein reduced β-galactosidase production to background levels when the lacZ gene was placed under the control of the replication primer promoter (Cesareni et al., 1982, Proc. Natl. Acad. Sci. USA 79, 6313-6317). Subsequent research has shown that Rop acts in concert with RNA I to negatively regulate copy number. More specifically, Rop influences plasmid copy number by enhancing or modulating the binding between the primer transcript (RNA II) and RNA I. Because of this modulatory effect of Rop on the RNA I-RNA II interaction, some researchers refer to this protein as Rom ("RNA one [inhibition] modulator"; Tomizawa and Som, 1984, Cell 38, 871-878). Although details of the role of Rop in plasmid replication are not completely understood, it is thought that loss of Rop alone may produce a high-copy-number phenotype but not a runaway-replication phenotype as distinguished herein. More extensive reviews of how Rop function controls plasmid copy number have been published (Davison, J., 1984, Gene 28, 1-15; Cesareni, G. et al, 1985, Trends Biochem. Sci. 10, 303-306).
Although regulation of plasmid replication has not been as extensively studied in many systems outside of E. coli, it is notable in connection with the potential application of the present invention that there is a gene called rep in Bacillus subtilis that appears to be analogous to the rop gene (i.e., the gene which encodes the Rop protein) in E. coli.
It may also be noted here that in the art there is known a general strategy, which has been used by several investigators, for identifying those runaway-replication mutations in plasmid regulatory sequences that affect a diffusible factor (Shepard et al., 1979, Cell 18, 267-275; Twigg and Sherratt, 1980, Nature 283, 216-218). This strategy involves testing the ability of a second plasmid, which is co-resident in the same cell as the mutant plasmid, to suppress the lethal effects of the runaway-replication mutation by supplying the normal form of the factor that is affected by the runaway mutation. Thus, there is a readily utilizable testing scheme for identification, in plasmids or in other independently replicating genetic elements, of mutations that exhibit a conditional runaway-replication phenotype that is suppressible by a diffusible factor.
Besides the above particulars on the regulation of plasmid replication, certain other aspects of the art of gene expression systems are relevant to comprehension of the present invention. For instance, it is well known that production of some foreign proteins in bacterial or other host cells is lethal to those cells; or, in any case, the highest possible expression of any gene at the least limits the ability of the cells to grow rapidly and to reach high densities under practical conditions. Therefore, many expression systems designed for high level protein production utilize some form of inducible gene expression mechanism that can be controlled by environmental conditions. This inducible mechanism serves to eliminate or minimize production of the desired protein during growth of the cells until sufficient cell mass and optimum cell density for the needed level of protein production are obtained. At that point, gene expression is induced by some environmental stimulus, typically by means of adding some chemical inducer or by suddenly raising the temperature of the culture by several degrees. Thus, the use of an inducible gene expression system optimizes overall yield by minimizing inhibitory effects on cell growth caused by the actual production of the desired protein.
Certain gene expression systems are known that combine the advantages of inducible runaway plasmid replication, for increasing gene copy number, with those of inducible gene expression, for minimizing interference with cell growth. For example, a system may include multiple ts mutations, in both copy number control and gene expression control functions. In a plasmid including temperature-dependent means for inducing both runaway replication and expression of the desired protein, both of these functions are inhibited during growth of the cells at a lower than normal temperature. Raising the culture temperature a few degrees, however, inactivates inhibitory factors for both functions, thereby simultaneously inducing both runaway plasmid replication and high level expression of the desired gene product.
In temperature-inducible gene expression systems, a ts mutation affecting expression of the gene for the desired protein typically lies in a regulatory gene encoding a repressor protein that inhibits transcription initiation at a particular promoter element. This promoter is located in the plasmid so that it controls expression of any gene of interest that is inserted in the plasmid in the appropriate manner.
Other expression systems combine heat-inducible runaway plasmids with alternative means for inducible gene expression, such as a promoter and associated repressor that are regulated by a chemical inducer. For example, some chemically inducible expression systems employ a promoter which normally functions in a bacterial cell in the conditional regulation of genes for a biochemical pathway that provides an essential nutrient such as an amino acid. Such an inducible promoter serves to shut down production of the enzymes needed for synthesis of that amino acid when that nutrient is present in the extracellular environment at concentrations sufficient to sustain bacterial growth, thereby conserving resources that would otherwise be expended needlessly on unnecessary metabolic capacity. When the relevant amino acid is depleted from the environment, the repressor of such an inducible gene expression system becomes less able to inhibit transcription initiation from its related promoter; accordingly, expression of any genes under the control of this promoter is induced by the removal of the critical amino acid from a culture.
In practice, however, exhaustive depletion from growth medium of a nutrient, such as an amino acid, is difficult to achieve in any case. This depletion is particularly difficult to achieve in a readily controllable fashion that permits gene expression to be induced efficiently in large scale cultures at an optimum cell density without undue manipulations (e.g., changing the culture medium) that may interfere with protein production. It is advantageous for inducible gene expression, therefore, to exploit the well known observation that repressors of certain chemically regulated promoters may respond to the presence of some intermediary metabolite as well as to the absence of the product of the inducible biochemical pathway which is regulated by that promoter and repressor.
For instance, the promoter for the complex of genes involved in tryptophan synthesis (i.e., the trp promoter) is subject to dual chemical regulation and is frequently employed in inducible gene expression systems, due in large measure to its inherent propensity for high levels of transcription of any associated gene. Although inhibition of this promoter by its repressor is attenuated in the absence of tryptophan, more complete induction of the promoter is obtained in the presence of an intermediate in the tryptophan synthetic pathway. This intermediate is produced by enzymes whose expression is under the control of the trp promoter. Therefore, as tryptophan is depleted from a culture, normally the synthetic pathway enzymes are partially induced; the metabolic intermediate produced by the pathway then more fully blocks the action of the repressor and thereby completely induces the trp promoter. In practice, induction of expression of genes under the control of the trp promoter is most efficiently and fully achieved, even in the presence of low levels of tryptophan, by the addition of a nonmetabolizable analog of the relevant intermediary metabolite.
In the use of bacterial expression systems that combine temperature-dependent regulation of plasmid copy number with chemical regulation of gene expression, when the cells have reached an optimum density, the increase in plasmid copy number (i.e., "gene amplification") may be carried out prior to induction of expression of the desired gene product, thereby minimizing possible inhibitory effects of that gene expression on the gene amplification process. After plasmid accumulation has reached an optimum level for protein production, the inducible promoter may be activated by addition of the necessary chemical inducer.
The abrupt environmental changes needed for efficient induction of most gene expression systems pose considerable engineering problems for production of proteins in large scale cultures that are required for many commercial purposes. For example, many ts mutations in repressor proteins are expressed (i.e., become effective by inactivating the repressor) upon a shift in cell temperature from a low temperature (e.g., about 30° C.) to a higher temperature (e.g., in the range of 37° to 42° C.). If gene expression is to be fully induced by temperature shift while maintaining a particular cell density, then the shift must be completed well within the time required for a cell replication cycle, typically on the order of half an hour in the operative temperature range.
Further, incubation for a few minutes at temperatures slightly higher than 42° C. (e.g., 5 to 10 minutes at 45° C.) is actually beneficial for achieving complete inactivation of some ts mutant repressors. On the other hand, more prolonged incubation of cells under these conditions, or even brief exposure to higher temperatures, begins to kill cells, resulting in rapid loss of protein production capacity in the culture. Since it is difficult to design and operate large scale culturing equipment which is capable of the rapid and accurate control of the temperature shifts demanded by ts repressor mutants for optimum performance, the use of such heat inducible gene expression systems is problematic for applications requiring more than a few liters of culture.
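The timing constraint on a temperature shift can be made concrete with a rough energy balance, in which the heating time equals the mass of the culture multiplied by its specific heat and by the temperature rise, divided by the net heating power. The following is only an order-of-magnitude sketch. It assumes that the culture has the density and specific heat of water and that there are no heat losses. The heater powers of 1 kW and 100 kW and the vessel volumes of 5 and 5000 liters are hypothetical figures chosen for illustration, not data from any actual fermentor.

```python
WATER_SPECIFIC_HEAT_KJ_PER_KG_K = 4.18   # culture approximated as water
WATER_DENSITY_KG_PER_L = 1.0             # culture approximated as water
GENERATION_TIME_MIN = 30.0               # "on the order of half an hour"


def minutes_to_heat(volume_l, heater_kw, start_c=30.0, end_c=42.0):
    """Idealized time to warm a culture, ignoring losses and mixing delays."""
    mass_kg = volume_l * WATER_DENSITY_KG_PER_L
    energy_kj = mass_kg * WATER_SPECIFIC_HEAT_KJ_PER_KG_K * (end_c - start_c)
    seconds = energy_kj / heater_kw      # kJ divided by kW gives seconds
    return seconds / 60.0


def shift_fits_in_generation(volume_l, heater_kw):
    """True if the idealized shift finishes within one generation time."""
    return minutes_to_heat(volume_l, heater_kw) < GENERATION_TIME_MIN


# Hypothetical bench-scale and production-scale cases.
for volume_l, heater_kw in [(5.0, 1.0), (5000.0, 100.0)]:
    print(volume_l, round(minutes_to_heat(volume_l, heater_kw), 2),
          shift_fits_in_generation(volume_l, heater_kw))
```

Under these assumed figures the small culture is warmed in a few minutes, whereas the large culture requires longer than one generation time even with one hundred times the heating power, and the estimate takes no account of the further difficulty of avoiding overshoot beyond tolerable temperatures.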
Although not all ts mutations in repressor genes or in plasmid copy number control genes require the precise up-and-down regimen of temperature shifts outlined above, nevertheless, even the less demanding task of raising the temperature relatively rapidly without excessive temperature excursions beyond tolerable limits is formidable in large scale cultures. The use of a chemically inducible gene expression system reduces the engineering problems associated with precise control of gene induction by temperature shift and allows separate control of gene amplification and expression; nevertheless, substantial equipment for rapid and thorough admixing of the added inducer must be provided.
In conclusion, although many genetic expression systems already have exploited inducible promoters for controlling gene expression, either alone or in combination with inducible runaway-replication plasmids, virtually all such inducible systems suffer from the general problems that the environmental stimulus required for induction is difficult to provide in large scale cultures, and that the process of providing the inducing stimulus may interfere with protein production.
Accordingly, a major object of the present invention is to provide genetic expression systems for producing proteins at consistently high yields and on scales suitable for commercial purposes, that are inducible without temperature shifts, chemical inducers, or specialized cell growth medium. The present invention contemplates utilization of novel combinations of genetic alterations that produce a runaway-replication phenotype with particular characteristics, together with other approaches for inducible gene expression and protein production, to achieve this major object and other related objects of this invention that are described below.