More than 155 recombinantly produced proteins and peptides have been approved by the U.S. Food and Drug Administration (FDA) for use as biotechnology drugs and vaccines, with another 370 in clinical trials. Unlike small molecule therapeutics that are produced through chemical synthesis, proteins and peptides are most efficiently produced in living cells. In many cases, the cell or organism has been genetically modified to produce or increase the production of the protein.
When a cell is modified to produce large quantities of a target protein, the cell is placed under stress and often reacts by inducing or suppressing other proteins. The stress that a host cell undergoes during production of recombinant proteins can increase expression of, for example, specific proteins or cofactors that degrade the overexpressed recombinant protein. The increased expression of compensatory proteins can be counterproductive to the goal of expressing high levels of active, full-length recombinant protein. Decreased expression or lack of adequate expression of other proteins can cause misfolding and aggregation of the recombinant protein. While it is known that a cell under stress will change its profile of protein expression, it is not known in any given example which specific proteins will be upregulated or downregulated.
Microarrays
Microarray technology can be used to identify the presence and level of expression of a large number of polynucleotides in a single assay. See, e.g., U.S. Pat. No. 6,040,138, filed Sep. 15, 1995, U.S. Pat. No. 6,344,316, filed Jun. 25, 1997, U.S. Pat. No. 6,261,776, filed Apr. 15, 1999, U.S. Pat. No. 6,403,957, filed Oct. 16, 2000, U.S. Pat. No. 6,451,536, filed Sep. 27, 2000, U.S. Pat. No. 6,532,462, filed Aug. 27, 2001, U.S. Pat. No. 6,551,784, filed May 9, 2001, U.S. Pat. No. 6,420,108, filed Feb. 9, 1998, U.S. Pat. No. 6,410,229, filed Dec. 14, 1998, U.S. Pat. No. 6,576,424, filed Jan. 25, 2001, U.S. Pat. No. 6,687,692, filed Nov. 2, 2000, U.S. Pat. No. 6,600,031, filed Apr. 21, 1998, and U.S. Pat. No. 6,567,540, filed Apr. 16, 2001, all assigned to Affymetrix, Inc.
U.S. Pat. No. 6,607,885 to E. I. duPont de Nemours and Co. describes methods to profile and identify gene expression changes after subjecting a bacterial cell to expression altering conditions by comparing a first and second microarray measurement.
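The comparison of a first and a second microarray measurement can be illustrated with a minimal sketch. It is not the method of the cited patent; it assumes that each measurement is available as a mapping from gene name to a positive signal intensity, and it reports the change for each gene as a log2 ratio. The gene names, the intensity values, and the `floor` parameter (which guards against zero or near-zero signals) are hypothetical and chosen only for illustration.

```python
import math


def log2_ratios(first, second, floor=1.0):
    """Return log2(second / first) for every gene present in both measurements.

    `floor` is an assumed lower bound applied to each intensity so that
    zero or near-zero signals do not produce undefined or extreme ratios.
    """
    ratios = {}
    for gene in first.keys() & second.keys():
        before = max(first[gene], floor)
        after = max(second[gene], floor)
        ratios[gene] = math.log2(after / before)
    return ratios


# Hypothetical intensities before and after an expression-altering condition.
first_measurement = {"geneA": 100.0, "geneB": 400.0, "geneC": 250.0}
second_measurement = {"geneA": 400.0, "geneB": 100.0, "geneC": 250.0}

ratios = log2_ratios(first_measurement, second_measurement)
for gene in sorted(ratios):
    print(gene, ratios[gene])
```

In this sketch a positive value indicates higher signal in the second measurement, a negative value indicates lower signal, and a value near zero indicates no apparent change. Real microarray data would typically also require normalization and replicate handling, which are omitted here.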
Wei et al. used a microarray analysis to investigate gene expression profiles of E. coli with lac gene induction (Wei Y., et al. (2001) High-density microarray-mediated gene expression profiling of Escherichia coli. J Bacteriol. 183(2):545-56). Other groups have also investigated transcriptional profiles regulated after mutation of endogenous genes or deletion of regulatory genes (Sabina, J. et al. (2003) Interfering with Different Steps of Protein Synthesis Explored by Transcriptional Profiling of Escherichia coli K-12 J Bacteriol. 185:6158-6170; Lee J H (2003) Global analyses of transcriptomes and proteomes of a parent strain and an L-threonine-overproducing mutant strain. J Bacteriol. 185(18):5442-51; Kabir M M, et al. (2003) Gene expression patterns for metabolic pathway in pgi knockout Escherichia coli with and without phb genes based on RT-PCR J Biotechnol. 105(1-2):11-31; Eymann C., et al. (2002) Bacillus subtilis functional genomics: global characterization of the stringent response by proteome and transcriptome analysis. J Bacteriol. 184(9):2500-20).
Gill et al. disclose the use of microarray technology to identify changes in the expression of stress-related genes in E. coli after expression of recombinant chloramphenicol acetyltransferase fusion proteins (Gill et al. (2001) Genomic Analysis of High-Cell-Density Recombinant Escherichia coli Fermentation and “Cell Conditioning” for Improved Recombinant Protein Yield Biotech. Bioengin. 72:85-95). The stress gene transcription profile, comprising only 16% of the total genome, at high cell density was used to evaluate “cell conditioning” strategies to alter the levels of chaperones, proteases, and other intracellular proteins prior to recombinant protein overexpression. The strategies for “conditioning” involved pharmacological manipulation of the cells, including through dithiothreitol and ethanol treatments.
Asai et al. described the use of microarray analysis to identify target genes activated by over-expression of certain sigma factors that are typically induced after cell stresses (Asai K., et al. (2003) DNA microarray analysis of Bacillus subtilis sigma factors of extracytoplasmic function family. FEMS Microbiol. Lett. 220(1):155-60). Cells overexpressing sigma factors as well as reporter genes linked to sigma factor promoters were used to show stress-regulated gene induction.
Choi et al. described the analysis and up-regulation of metabolic genes that are down-regulated in high-density batch cultures of E. coli expressing human insulin-like growth factor fusion protein (IGF-If) (Choi et al. (2003) Enhanced Production of Insulin-Like Growth Factor I Fusion Protein in Escherichia coli by Coexpression of the Down-Regulated Genes Identified by Transcriptome Profiling App. Envir. Microbio. 69:4737-4742). The focus of this work was on the metabolic changes that occur during high-density conditions after protein induction. Genes that were down-regulated after induction of recombinant protein production during high-density growth conditions were identified, and specific metabolic genes that had been down-regulated were expressed in cells producing recombinant IGF-If. The work showed that increasing metabolic production of certain nucleotide bases and amino acids could increase protein production and that growth rates could be modified by increasing expression of a down-regulated metabolic transporter molecule. These strategies were designed to alter the cellular environment to reduce metabolic stresses associated with protein production generally or with high-density culture.
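The selection step described above, in which down-regulated genes are identified as candidates for coexpression, can be sketched as a simple filter over per-gene log2 expression ratios (post-induction relative to pre-induction). This is only an illustration under stated assumptions, not the procedure used by Choi et al.: the cutoff of -1.0 (a twofold decrease), the `top_n` parameter, and the gene names and values are all hypothetical.

```python
def downregulated_candidates(log2_ratio_by_gene, threshold=-1.0, top_n=None):
    """List (gene, log2 ratio) pairs at or below `threshold`, most reduced first.

    `threshold` and `top_n` are assumed, illustrative parameters; an
    appropriate cutoff depends on the platform and the experimental design.
    """
    hits = [
        (gene, ratio)
        for gene, ratio in log2_ratio_by_gene.items()
        if ratio <= threshold
    ]
    # Sort by ratio, then by name, so the ordering is deterministic.
    hits.sort(key=lambda item: (item[1], item[0]))
    return hits if top_n is None else hits[:top_n]


# Hypothetical log2 ratios after induction of recombinant protein production.
example_ratios = {
    "gene1": -2.5,
    "gene2": 0.3,
    "gene3": -1.0,
    "gene4": 1.8,
    "gene5": -0.4,
}

candidates = downregulated_candidates(example_ratios)
print(candidates)
```

Genes returned by such a filter would only be candidates; whether restoring the expression of any one of them improves yield would still have to be tested experimentally.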
Protein Degradation
Unwanted degradation of recombinant protein presents an obstacle to the efficient use of certain expression systems. The expression of exogenous proteins often induces stress responses in host cells, which can be, for example, natural defenses to a limited carbon source. All cells contain a large number of genes capable of producing degradative proteins. It is not possible to predict which proteases will be regulated by a given host in response to expression of a particular recombinant protein. For example, the bacterium P. fluorescens contains up to 200 proteases and protease-related proteins.
In the cytoplasm of E. coli, proteolysis is generally carried out by a group of proteases and cofactor molecules. Most early degradation steps are carried out by five ATP-dependent Hsps: Lon/La, FtsH/HflB, ClpAP, ClpXP, and ClpYQ/HslUV (Gottesman S (1996) Proteases and their targets in Escherichia coli. Annu. Rev. Genet. 30:465-506). Along with FtsH (an inner membrane-associated protease, the active site of which faces the cytoplasm), ClpAP and ClpXP are responsible for the degradation of proteins modified at their carboxyl termini by addition of the non-polar destabilizing tail (Gottesman S, et al. (1998) The ClpXP and ClpAP proteases degrade proteins with carboxyl-terminal peptide tails added by the SsrA-tagging system. Genes Dev. 12:1338-1347; Herman C, et al. (1998) Degradation of carboxy-terminal-tagged cytoplasmic proteins by the Escherichia coli protease HflB (FtsH). Genes Dev. 12:1348-1355).
Several approaches have been taken to avoid degradation during recombinant protein production. One approach is to produce host strains bearing mutations in a protease gene. Baneyx and Georgiou, for example, utilized a protease-deficient strain to improve the yield of a protein A-β-lactamase fusion protein (Baneyx F, Georgiou G. (1991) Construction and characterization of Escherichia coli strains deficient in multiple secreted proteases: protease III degrades high-molecular-weight substrates in vivo. J Bacteriol 173: 2696-2703). Park et al. used a similar mutational approach to improve recombinant protein activity by 30% compared with the parent strain of E. coli (Park S. et al. (1999) Secretory production of recombinant protein by a high cell density culture of a protease negative mutant Escherichia coli strain. Biotechnol. Progr. 15:164-167). U.S. Pat. Nos. 5,264,365 and 5,264,365 describe the construction of protease-deficient E. coli, particularly multiply protease-deficient strains, to produce proteolytically sensitive polypeptides. PCT Publication No. WO 90/03438 describes the production of strains of E. coli that include protease-deficient strains or strains including a protease inhibitor. Similarly, PCT Publication No. WO 02/48376 describes E. coli strains deficient in proteases DegP and Prc.
Protein Folding
Another major obstacle in the production of recombinant proteins in host cells is that the cell often is not adequately equipped to produce either soluble or active protein. While the primary structure of a protein is defined by its amino acid sequence, the secondary structure is defined by the presence of alpha helices or beta sheets, and the tertiary structure by covalent bonds between adjacent protein stretches, such as disulfide bonds. When expressing recombinant proteins, particularly in large-scale production, the secondary and tertiary structure of the protein itself is of critical importance. Any significant change in protein structure can yield a functionally inactive molecule, or a protein with significantly reduced biological activity. In many cases, a host cell expresses folding modulators (FMs) that are necessary for proper production of active recombinant protein. However, at the high levels of expression generally required to produce usable, economically satisfactory biotechnology products, a cell often cannot produce enough native folding modulator or modulators to process the recombinant protein.
In certain expression systems, overproduction of exogenous proteins can be accompanied by their misfolding and segregation into insoluble aggregates. In bacterial cells these aggregates are known as inclusion bodies. In E. coli, the network of folding modulators/chaperones includes the Hsp70 family. The major Hsp70 chaperone, DnaK, efficiently prevents protein aggregation and supports the refolding of damaged proteins. The incorporation of heat shock proteins into protein aggregates can facilitate disaggregation. However, proteins processed to inclusion bodies can, in certain cases, be recovered through additional processing of the insoluble fraction. Proteins found in inclusion bodies typically have to be purified through multiple steps, including denaturation and renaturation. Typical renaturation processes for inclusion body targeted proteins involve attempts to dissolve the aggregate in concentrated denaturant and subsequent removal of the denaturant by dilution. Aggregates are frequently formed again in this stage. The additional processing adds cost, there is no guarantee that the in vitro refolding will yield biologically active product, and the recovered proteins can include large amounts of fragment impurities.
One approach to reduce protein aggregation is through fermentation engineering, most commonly by reducing the cultivation temperature (see Baneyx F (1999) In vivo folding of recombinant proteins in Escherichia coli. In Manual of Industrial Microbiology and Biotechnology, Ed. Davies et al. Washington, D.C.: American Society for Microbiology ed. 2:551-565 and references therein). The more recent realization that in vivo protein folding is assisted by molecular chaperones, which promote the proper isomerization and cellular targeting of other polypeptides by transiently interacting with folding intermediates, and by foldases, which accelerate rate-limiting steps along the folding pathway, has provided additional approaches to combat the problem of inclusion body formation (see, e.g., Thomas J G et al. (1997) Molecular chaperones, folding catalysts and the recovery of active recombinant proteins from E. coli: to fold or to refold. Appl Biochem Biotechnol, 66:197-238).
In certain cases, the overexpression of chaperones has been found to increase the soluble yields of aggregation-prone proteins (see Baneyx, F. (1999) Recombinant Protein Expression in E. coli Curr. Opin. Biotech. 10:411-421 and references therein). The process does not appear to involve dissolution of preformed recombinant inclusion bodies but is related to improved folding of newly synthesized protein chains. For example, Nishihara et al. coexpressed groESL and dnaJK/grpE in the cytoplasm to improve the stability and accumulation of recombinant Cryj2 (an allergen of Japanese cedar pollen) (Nishihara K, Kanemori M, Kitagawa M, Yanagi H, Yura T. 1998. Chaperone coexpression plasmids: differential and synergistic roles of DnaK-DnaJ-GrpE and GroEL-GroES in assisting folding of an allergen of Japanese cedar pollen, Cryj2, in Escherichia coli. Appl. Environ. Microbiol. 64:1694). Lee and Olins also coexpressed GroESL and DnaK and increased the accumulation of human procollagenase tenfold (Lee S, Olins P. 1992. Effect of overproduction of heat shock chaperones GroESL and DnaK on human procollagenase production in Escherichia coli. JBC 267:2849-2852). The beneficial effect associated with an increase in the intracellular concentration of these chaperones appears highly dependent on the nature of the overproduced protein, and success is by no means guaranteed.
A need exists for processes for developing host strains that show improved recombinant protein or peptide production, activity, or solubility in order to reduce manufacturing costs and increase the yield of active products.
It is therefore an object of the invention to provide processes for improving recombinant protein expression in a host.
It is a further object of the invention to provide processes that increase expression levels in host cells expressing recombinant proteins or peptides.
It is another object of the invention to provide processes to increase the levels of soluble protein made in recombinant expression systems.
It is yet another object of the invention to provide processes to increase the levels of active protein made in recombinant expression systems.