DNA sequencing projects have provided coding sequences for hundreds of thousands of proteins from organisms across the evolutionary spectrum. Recombinant DNA technology makes it possible to clone these coding sequences into expression vectors that can direct the production of the corresponding proteins in suitable host cells. The resulting proteins are widely useful, as objects of biochemical, biophysical, structural and functional studies for understanding basic biological processes, as enzymes to serve as research tools or produce valuable chemicals, as diagnostics, vaccines, therapeutics or targets for developing medically useful drugs, or for protein chips, to mention a few.
In the T7 expression system, T7 RNA polymerase is inducibly expressed in vivo, in T7 expression system host strains, from a chromosomal copy of the cloned gene for the enzyme (gene 1 of bacteriophage T7). The polymerase recognizes and binds a T7 promoter sequence contained in T7 expression vectors carried in the host strain and intensely transcribes any gene(s) cloned downstream of that promoter; where the cloned sequence is a protein coding sequence, the transcripts are subsequently translated. This recombinant gene expression system was originally developed in Escherichia coli and has become the standard by which other prokaryotic expression systems are judged. It has been adapted for use in other bacterial species, including Salmonella enterica serovar Typhimurium (McKinney, J., et al. (2002) J. Bacteriology 184:6056-6059), Pseudomonas (Schweizer, H P (2001) Curr. Opin. Biotechnol. 12:439-445), Rhodobacter capsulatus (Drepper, T., et al. (2005) Biochem. Soc. Trans. 33:56-58), Ralstonia eutropha (Barnard, G. C., et al. (2004) Prot. Exp. & Purif. 38:264-271) and Bacillus subtilis (Conrad, B., et al. (1996) Mol. Gen. Genet. 250:230-236).
The inducible T7 expression system is highly effective and widely used to produce RNAs and proteins from cloned coding sequences in the bacterium Escherichia coli (Studier and Moffatt, J. Mol. Biol. 189: 113-130 (1986); Studier et al., Methods in Enzymology 185: 60-89 (1990); Novagen). The coding sequence for T7 RNA polymerase is typically present in the chromosome of host cells such as BL21(DE3), B834(DE3) and HMS174(DE3), and derivatives of such cells such as ER2566 and ER2833 (New England Biolabs), under control of the inducible lac or lacUV5 promoter. In another derivative strain, designated “BL21-AI” (Invitrogen), gene 1 is under the control of the arabinose-inducible araBAD promoter. In the absence of inducing compounds, transcription of gene 1 by the host cell RNA polymerase is blocked by the natural endogenous repressor: the lac repressor in the case of the lac promoter, and the product of the araC gene in the case of the araBAD promoter.
The coding sequence for the desired RNA or protein (referred to as the target RNA or protein) is typically placed in a plasmid under control of a T7 promoter, that is, a promoter recognized specifically by T7 RNA polymerase. In the absence of an inducer for the lacUV5 promoter, little T7 RNA polymerase or target protein should be present and the cells should grow well. However, upon addition of an inducer, typically IPTG (isopropyl-β-D-thiogalactoside), T7 RNA polymerase will be made and will transcribe almost any DNA controlled by the T7 promoter. T7 RNA polymerase is so specific, active and processive that the amount of target RNA produced can be comparable to the amount of ribosomal RNA in a cell. Thus, large amounts of RNAs that are useful in themselves, such as ribozymes, can be produced. If the target RNA contains the coding sequence for a protein and appropriate translation initiation signals (such as the sequence upstream of the start codon for the T7 major capsid protein), target protein can be produced, often accumulating to become a substantial fraction of total cell protein. See also U.S. Pat. Nos. 4,952,496; 5,693,489; and 5,869,320, the contents of which are incorporated herein by reference.
In strains in which the T7 gene 1 (T7 RNA polymerase gene) is under control of the lac or lacUV5 promoter, IPTG has typically been used to induce expression of target proteins. Lactose will also cause induction and, being much cheaper than IPTG, may be preferable for large-scale production. Neubauer et al., Appl. Microbiol. Biotechnol. 36: 739-744 (1992) obtained induction by lactose with the same efficiency as with IPTG by careful monitoring of the glucose level in fermentation and by addition of lactose when the glucose was nearly depleted. Hoffman et al., Protein Expression and Purification 6: 646-654 (1995) used similar procedures to obtain comparable levels of protein synthesis with lactose or IPTG induction in a fermenter process.
A problem in using inducible T7 expression systems is that T7 RNA polymerase is so active that a small basal level can lead to a substantial expression of target protein even in the absence of added inducer. If the target protein is sufficiently toxic to the host cell, establishment of the target plasmid in the expression host may be difficult or impossible, or the expression strain may be unstable or accumulate mutations (Kelley et al., Gene 156: 33-36 (1995)). An effective means to reduce basal expression (and thereby increase the range and stability of target proteins that can be established and expressed) is to place the lac operator sequence (the binding site for lac repressor) just downstream of the start site of a T7 promoter, creating a T7lac promoter (Dubendorff and Studier, J. Mol. Biol. 219: 45-59 (1991)). Lac repressor bound at the operator sequence interferes with establishment of an elongation complex by T7 RNA polymerase at a T7lac promoter and substantially reduces the level of target mRNA produced. If sufficient lac repressor is present to saturate all of its binding sites in the cell, the basal level of target protein in uninduced cells is substantially reduced, but induction unblocks both the lacUV5 and T7lac promoters and leads to the typical high levels of expression. Thus, the T7lac promoter increases the convenience and applicability of the T7 system for expressing a wide range of proteins.
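The two layers of repression described above can be illustrated with a toy calculation. The following is a minimal sketch, not a quantitative model: the function name and the leak fractions (`polymerase_leak`, `operator_leak`) are hypothetical values chosen only for illustration and are not measurements from the cited work. It assumes relative target expression is the product of the relative T7 RNA polymerase level and the fraction of target promoters not blocked by lac repressor.

```python
def relative_target_expression(induced, t7lac,
                               polymerase_leak=0.01, operator_leak=0.1):
    """Toy relative target expression, where 1.0 means fully induced.

    polymerase_leak: hypothetical basal fraction of T7 RNA polymerase made
        from the lacUV5-controlled gene 1 when no inducer is present.
    operator_leak: hypothetical fraction of T7lac promoters that escape
        lac repressor when no inducer is present.
    """
    for name, value in (("polymerase_leak", polymerase_leak),
                        ("operator_leak", operator_leak)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Layer 1: supply of T7 RNA polymerase from the host chromosome.
    polymerase = 1.0 if induced else polymerase_leak

    # Layer 2: a plain T7 promoter is always accessible; a T7lac promoter
    # is mostly blocked by bound lac repressor until inducer is added.
    if t7lac and not induced:
        promoter = operator_leak
    else:
        promoter = 1.0

    return polymerase * promoter


if __name__ == "__main__":
    for t7lac in (False, True):
        for induced in (False, True):
            label = "T7lac" if t7lac else "T7"
            state = "induced" if induced else "uninduced"
            level = relative_target_expression(induced, t7lac)
            print(f"{label:5s} {state:9s} relative expression = {level:g}")
```

Under these assumed numbers the uninduced level with the T7lac promoter is lower than with the plain T7 promoter, while the induced levels are identical, which is the qualitative behavior described in the text.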
It was noticed early that growth of T7 expression cultures to saturation could cause problems, and Grossman et al., Gene 209: 95-103 (1998) showed that cultures growing in certain complex media induced the target protein to high levels upon approach to saturation even when the T7lac promoter was used. They pointed out that such unintended induction could be a problem in isolating and using strains that express proteins that are toxic to E. coli. They concluded that the known inducer lactose was not responsible for this effect, but that cyclic AMP was required, and they recommended using a mutant unable to make cyclic AMP as an expression host.
Thus, although the basal level of expression from T7 expression vectors can be suppressed through use of the T7lac promoter in such vectors, this does not solve the problems resulting from unintended induction of T7 expression strains when they are grown to saturation. Grossman et al. (Gene 209: 95-103 (1998)) also found that addition of 1% glucose to late log phase cells prevented the unintended induction, and the Novagen web site references their paper and recommends adding 1% glucose to the medium to manage this problem.
Structural genomics is an area where multi-milligram amounts of many different proteins over a wide evolutionary range are required for determination of protein structures by X-ray crystallography or nuclear magnetic resonance (NMR). Fabrication of protein chips is another application where many different proteins are needed. Expressing cloned coding sequences in the T7 system is an efficient, widely used method for obtaining these proteins. Screening large numbers of clones for protein expression level and solubility makes it desirable to have procedures that can be applied to many clones in parallel, preferably using automation. The need to process many cultures in parallel dictates batchwise growth of cultures in small vessels such as culture tubes or multi-well plates such as the 24-, 96- or 384-well plates commonly available. A high level of protein production per volume of culture is also desirable. The needed multi-milligram amounts of pure protein could be produced in fermenters, but cultures grown batchwise in vessels aerated by shaking (a baffled flask on a rotary shaker, for example) or by bubbling air or oxygen can typically produce this amount of protein in the T7 expression system in a liter or less of culture, allowing several cultures, each producing a different protein, to be grown and induced in parallel.
In trying to develop reliable procedures for growing and inducing protein synthesis in many cultures in parallel, a significant difficulty was to obtain all of the cultures in a comparable state of growth so that they could be induced simultaneously in parallel. Substantial effort was required to measure the cell density of each culture and add inducer at the proper time, even using a plate reader that could measure the densities of cultures in all of the different wells of a plate in a single reading. Even if comparable amounts of culture could be inoculated in each well, differences in lag time or growth rate typically generated situations where cultures in different wells would be ready for induction at substantially different times. If the entire plate was to be collected at once, cultures would also vary in the length of time in which they had been producing target protein, possibly making it difficult to choose a time when all had been induced to optimal levels without substantial overgrowth of some cultures by cells that had lost plasmid.
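The timing problem described above can be made concrete with a small calculation. The following is a minimal sketch assuming idealized exponential growth after a fixed lag; the function names, well labels, starting densities, lag times and doubling times are all hypothetical and serve only to show how modest well-to-well differences spread out the time at which cultures reach a chosen induction density.

```python
import math


def time_to_density(od_start, od_target, lag_min, doubling_min):
    """Minutes until a culture reaches od_target, assuming a lag of lag_min
    followed by exponential growth with a doubling time of doubling_min."""
    if od_start <= 0 or od_target <= 0:
        raise ValueError("densities must be positive")
    if doubling_min <= 0 or lag_min < 0:
        raise ValueError("doubling_min must be > 0 and lag_min >= 0")
    if od_target <= od_start:
        return 0.0  # already at or above the target density
    return lag_min + doubling_min * math.log2(od_target / od_start)


def induction_spread(wells, od_target):
    """Return (earliest, latest, spread) in minutes across all wells."""
    times = [time_to_density(od_target=od_target, **w) for w in wells.values()]
    return min(times), max(times), max(times) - min(times)


if __name__ == "__main__":
    # Hypothetical wells: same inoculum, different lag and growth rate.
    wells = {
        "A1": dict(od_start=0.05, lag_min=20, doubling_min=30),
        "A2": dict(od_start=0.05, lag_min=45, doubling_min=30),
        "A3": dict(od_start=0.05, lag_min=20, doubling_min=40),
    }
    earliest, latest, spread = induction_spread(wells, od_target=0.8)
    print(f"earliest ready: {earliest:.0f} min")
    print(f"latest ready:   {latest:.0f} min")
    print(f"spread:         {spread:.0f} min")
```

Even in this idealized model, wells that start at the same density become ready for induction at different times, so a single time point for adding inducer to a whole plate is necessarily a compromise.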
An obvious strategy was to grow the entire plate to saturation in a small volume of medium in each well, dilute by adding fresh medium, grow for an appropriate time (determined by previous testing or by direct measurement of cell densities), and add inducer to all wells at the same time. The hope was that all cultures in a plate would saturate at near enough to the same density and grow after dilution with similar enough kinetics that the culture-to-culture variation in density at the time of induction would be tolerable. However, in trying to implement this strategy, when certain lots of complex growth media were used, the problem described by Grossman et al. (1998) was encountered, namely, induction during the growth to saturation. Indeed, it was found that media made with a particular lot of N-Z-amine showed this induction behavior, whereas otherwise identical media made with a second lot from the same supplier did not. Unwanted induction at saturation would make it extremely difficult to obtain sufficient uniformity of growth to permit parallel manipulation of cultures expressing target proteins of different, usually unknown degrees of toxicity. Although addition of glucose could suppress this induction (Novagen), the saturated cultures could become very acidic, which would limit the saturation density and again make it difficult to get uniform growth upon dilution. Screening different lots of N-Z-amine for those without the inducing behavior did not seem to be an attractive solution, as there was no guarantee that such lots would always be available. Thus, the approaches taken, leading to the present invention, were to determine causes of and ways to prevent unwanted induction and to develop means to promote desirable auto-induction of expression strains.
The ability to control the problem of sporadic, unwanted induction in complex media would represent a significant advance in the art. A systematic analysis of the components of both complex and defined media was undertaken. The goal was to define requirements for batchwise growth of T7 expression strains to high density under conditions suitable for growth and induction of many cultures in parallel, and, complementarily, to develop formulations that would reliably grow cultures of expression strains to saturation with little or no induction.