According to IMS Health, the share of biopharmaceuticals in the total pharmaceutical market is forecast to grow from 6 percent in 1999 to 14 percent ($90 billion) in 2009. This increased demand for biologicals is primarily due to their generally highly specific target of action, which results in a significantly reduced and well-defined risk of toxicity compared to small-molecule drugs. Further, by employing recombinant techniques to produce these biologicals, in contrast with older techniques of purifying them from tissue extracts or body fluids, products of very high purity and with well-defined safety and physicochemical characteristics can be readily produced. Despite all these patient-friendly qualities, most recombinant biologics remain inaccessible to most people in the world because they remain prohibitively expensive. Therefore, life-saving drugs like erythropoietin, drugs like etanercept that significantly improve quality of life, and many anti-cancer drugs such as rituximab, trastuzumab and other monoclonal antibodies are affordable to only a very small percentage of people, while the vast majority of sick people around the world cannot use them adequately. There is therefore an urgent need to bring down the cost of these drugs. A large component of this high cost is associated with manufacturing them. This invention provides a solution to this problem by providing expression vectors that can give high expression of protein in mammalian host cells transfected with them.
In recent years, recombinant DNA technology has advanced to a stage where, in general, it is readily possible to obtain a desired gene encoding a desired protein product, also called a biological when the protein is subsequently used as a therapeutic. Once the gene is obtained, a variety of hosts can be employed for its expression by first cloning the gene into any one of a number of available host-specific expression vectors and then introducing the gene-carrying vector into the specific host by one of a variety of transformation or transfection methods. A variety of fermentation conditions are also available for obtaining protein production from these transformed host cells. Selection of the host, and the subsequent interdependent selections of expression vector, transformation or transfection technique and fermentation methodology, depend upon many factors, such as the characteristics of the protein to be produced, the ultimate use of the protein, the amount of protein required, the purification methods available, the overall cost and the available technology. For example, relevant to this invention, where the recombinant protein produced is meant for therapeutic applications, the primary, secondary and tertiary structure of the protein, the degree and quality of its glycosylation, the purity of the final product, the amount of protein produced and the cost at which the drug can be sold all contribute to the selection of the expression host and to the other interdependent selections.
One possible solution to the above-described problem of the high cost of production of biologicals is to express them in bacterial host cells, such as E. coli [Marino, M. H., BioPharm, 2:18-33 (1989); Georgiou G., Protein engineering: Principles and practice, Wiley Liss, New York 101-127 (1996); Gold, L. Methods Enzymol, 185:11-14 (1990); Hodgson, J., Bio/Technology, 11:887-893 (1993); Nicaud et al, J. Biotechnol. 3:255-270 (1986); Olins, P. O., and S. C. Lee., Curr. Opin. Biotechnol. 4:520-525 (1993); Shatzman, A. R., Curr. Opin. Biotechnol, 6:491-493 (1995)]. Escherichia coli is an important host organism for the production of recombinant proteins and is widely used in industrial production. Its many advantages include easy cultivation, low cost, and high production potential [Shuhua Tan et al, Protein Expression and Purification, 25:430-436 (2002); Cornelia Rossmann et al, Protein expression and Purification 7:335-342 (1996)]. However, bacterial hosts are generally not ideal for the production of biologicals because they do not have the necessary machinery to glycosylate proteins [Old R W, and Primrose S. B., Principles of Gene Manipulations, An introduction to genetic Engineering, Blackwell science, United Kingdom. (1994)], and most mammalian therapeutic proteins are not fully functional without proper glycosylation. Additionally, the lack of a secretion mechanism for the efficient release of protein into the culture medium, the limited ability to facilitate extensive disulfide bond formation, improper folding, degradation of the protein by host cell proteases, significant differences in codon usage, and other modifications such as glycations together make bacterial systems much less attractive than mammalian systems [Fuh, G. et al., J. Biol. Chem., 265:3111-3115 (1990); Liang et al., Biochem. J., 229:429-439 (1985); Sarmientos et al., Bio/Technology, 7:495-501 (1989); Savvas C. Makrides, Microbiological Reviews, September 1996, 512-538; N. Jenkins and E. M. Curling, Enzyme Microb. Technol., 16:354-364 (1994)]. Therefore, it is not usually possible to express therapeutic proteins in bacteria, and most biologicals utilize eukaryotic host cells for expression even though this means a higher cost of production due to lower expression levels of recombinant protein, more stringent culturing requirements, slower growth rates, etc. [Cornelia Rossmann, Protein Expression and Purification, 7:335-342 (1996); Geoff T. Yarranton, Current Opinion in Biotechnology, 1:133-140 (1990)].
Since mammalian host systems are highly advantageous in therapeutic protein production, it is important to address the high cost of production associated with them. Since the cost of manufacturing can be brought down by increasing productivity, considerable effort has been expended on increasing the amount of product that these host cells can produce. The factors which normally control the amount of product produced by a host cell include those which are external to the cell, such as the culture conditions, and those which are internal to the cell, the majority of which are factors that regulate the efficiency and quality of transcription [Foecking and Hofstetter, Gene, 45:101-105 (1986); Kaufman et al, Journal of molecular Biology, 159:601-621 (1985); Wurm et al, PNAS, 83:5414-5418 (1986); Reiser and Hauser, Drug Research, 37:482-485 (1987); Zettlmeissl et al, Biotechnology, 5:720-725 (1987)] and translation [R. Grabherr and K. Bayer, Food Technol. Biotechnol. 39 (4) 265-269 (2001); Randal J. Kaufman et al. Molecular Biotechnology, 16 (2), 151-160, (2000); Juraj Hlavaty et al., Virology 341, 1-11, (2005); C. M. Stenstrom et al., Gene. 273(2), 259-65, (2001); M. Ibba and D. Söll, Science, 286, 1893, (1999)] and are predominantly dependent upon the design of the expression vector itself. Even though the literature reports many efforts to increase host cell productivity by improving the culturing conditions [Palermo D. P. et al., Journal of Biotechnology, 19:35-48 (1991); Birch and Froud, Biologicals, 22:127-133 (1994); Osman et al, Biotechnology and Bioengineering, 77:398-407 (2003); Dezengotita et al, Biotechnology and Bioengineering, 77:369-380 (2002); Schmelzer & Miller, Biotechnology Prog., 18:346-353 (2002); Dezengotita et al., Biotechnology and Bioengineering, 78:741-752 (2002); and Sun et al, Biotechnology Prog., 20:576-589 (2004)], improvements in these external factors can increase expression only to a limited extent and are commercially ineffective unless the expression vector has first been optimized to give an ideal basal level of expression.
A vast number of studies addressing the internal factors for improving gene expression have been reported in the prior art. The internal factors described below are also known as regulatory elements, and they regulate gene expression in many ways. It is well known in the prior art that for a gene of interest to be expressed from an expression vector, it has to be placed under the control of appropriate 5′ and 3′ flanking sequences which allow the gene to be transcribed into mRNA and then accurately translated into protein. Many important 5′ and 3′ flanking sequences have been reported, such as TATA boxes [Boshart, M. et al., Cell, 41:521-530 (1985); Browning, K. S. et al. J. Biol. Chem., 263:9630-9634 (1998); Dorsch-Hasler, K. et al., PNAS, 82:8325-8329 (1985)], viral promoters like the CMV immediate early promoter, the SV40 early or late promoters and the adenovirus major late promoter [Luigi R., Gene, 168:195-198 (1996); Pizzorno, M. C. et al., J. Virol., 62:1167-1179 (1988); Okayama and Berg, Mol. Cell. Biol., 2:161-171 (1982); Wong et al., Science, 228:810-815 (1985); Foecking and Hofstetter, Gene, 45:101-105 (1986)], mammalian promoters such as the mouse metallothionein promoter and the chicken β-actin promoter [Nicole Israel et al., Gene, 51:197-204 (1987); Karin, M. and Richards, Nature, 299:797-802 (1982); Miyazaki et al., Proc. Natl. Acad. Sci. USA, 83:9537-9541 (1986)], enhancers such as the CMV immediate early enhancer [Cockett, M. I. et al., Nucleic Acids Research, 19:319-325 (1996)], translation start and stop codons [Lehninger et al, Principles of Biochemistry, 3rd edition, Worth Publishers, Chapter 27, p1025], and polyadenylation sites such as the bovine growth hormone (BGH) and SV40 polyadenylation sites [Carswell, S. and Alwine, J. C, Mol. Cell. Biol. 9:4248-4258 (1989)].
Introns are another internal factor. They normally form an integral part of eukaryotic genes as intervening sequences between exons and are precisely deleted from the primary transcript, by a process known as RNA splicing, to form mature mRNA. RNA splicing has been widely demonstrated to contribute to mRNA stability [Buchman et al, Mol. Cell. Biol. 8:4395-404 (1988); Peterson et al, Proc. Natl. Acad. Sci. USA, 83:8883-87 (1986)] and to the regulation of gene expression [Brinster et al, Proc. Natl. Acad. Sci. USA, 85:836-40 (1988); Dynan, W. S. and Tjian, R., Nature, 316:774-778 (1985)]. Synthetic chimeric introns have also been developed, such as the one reported by Huang et al that consists of the 5′ donor site of the adenovirus major late transcript and the 3′ splice site of a mouse immunoglobulin [Huang et al, Nucleic Acids Res., 18:937-47 (1990)]. Such chimeric introns support heterologous gene expression better than other commonly used introns [Huang et al, Nucleic Acids Res., 18:937-47 (1990); Ted Choi et al, Molecular and Cellular Biology, 11 (6): 3070-3074 (1991)].
A commonly utilized source of highly efficient internal factors is viruses. Viruses are well known to be among the most efficient parasites in nature, using their own internal factors to manipulate host and viral gene expression in favor of their propagation and survival. These factors have also been studied extensively for their role in the design of expression vectors to ultimately improve protein production. Some of the most efficient promoters known in the art of molecular biology are derived from viruses [Luigi R., Gene, 168:195-198 (1996); Pizzorno, M. C. et al., J. Virol., 62:1167-1179 (1988); Okayama and Berg, Mol. Cell. Biol., 2:161-171 (1982); Wong et al., Science, 228:810-815 (1985); Foecking and Hofstetter, Gene, 45:101-105 (1986)]. Many viruses have been studied extensively at the genetic level, and individual sequences have been identified which can alter the nuclear and cytoplasmic metabolism of mRNA in the host cells. The adenovirus tripartite leader element (TPL) (GI: 209811) [Akusjarvi G. et al, J Mol. Biol., 134(1):143-58 (1979)] is one such element, known to enhance the translation of even a non-viral RNA in virus-infected cells when directly appended to it [Berkner K. L. et al, Nucleic Acids Res., 13(3):841-57 (1985)]. All the mRNAs encoded by the adenovirus major late transcription unit share this common 5′ non-coding region. This element can reduce the nuclear half-life of the transcripts [Huang et al, J. Virol., 72(1):225-35 (1998)]. It is also known to enhance the translation of the mRNAs [Kaufman R. J. et al, Proc Natl Acad Sci USA., 82(3):689-93 (1985)]. Another such element is the adenoviral Virus Associated RNA genes I & II (GI: 209811) or their functional variants. The VA RNA genes I & II (VA genes) have been shown to increase the translation efficiency of a gene containing the TPL sequence [Kaufman R. J. et al, Proc Natl Acad Sci USA., 82(3):689-93 (1985)].
The VA RNA I gene product helps keep eIF2α in its dephosphorylated, active state and thus increases the rate of protein synthesis [O'Malley et al, Cell, 44:391-400 (1986); Thimmapayya B., et al, Cell, 31:543-551 (1982)].
Another commonly utilized internal factor is the gene copy number, and increasing it is a favored approach for increasing gene expression [Kaufman and Sharp, Journal of molecular Biology, 159:601-602 (1982); Pendse G. J. et al, Biotechnology and Bioengineering, 40:119-129 (1992); Schimke, R. T. (Ed.), Gene Amplification. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1982]. The most common method used to increase gene copy number is to select cells for gene amplification. In this approach, described for instance in EP0045809 or U.S. Pat. No. 4,634,665, a host cell is transformed with a pair of genes. The first gene in the pair encodes a desired protein and the second gene encodes a selectable marker, e.g., dihydrofolate reductase (DHFR) [Alt, F. W. et al., J. Biol. Chem., 253:1337-1370 (1978)]. The two genes are present either on a single expression vector or on two separate expression vectors. After transfection with this pair of genes, the cells are cultured in increasing concentrations of a toxic agent whose effect is nullified by the product of the selectable marker gene, such as methotrexate when the DHFR gene is used as the selection marker. It has been found that cell lines which survive the higher concentrations of the toxic agent have an increased copy number of both the selectable marker gene and the desired product gene. The selected host cell, which carries an amplified number of the relevant gene copies, can produce a larger amount of the desired protein than the original cell line. A similar gene amplification strategy has been used with other selection markers, such as adenosine deaminase (ADA), ornithine decarboxylase (ODC), asparagine synthetase (AS) [Chiang, T. and McConlogue, Mol. Cell. Biol. 764-769 (1988); Cartier, M. et al., Mol. Biol., 7:1623-1628 (1987); Germann, U. A. et al., J. Biol. Chem. 264:7418-7424 (1989); Mireille Cartier and Clifford P. Stanners, Gene, 95:223-230 (1990); Wood C. R. et al., J Immunol, 145:3011-3016 (1990); Kellems R. E. et al., In Genetics and Molecular Biology of Industrial Microorganisms, American Society for Microbiology, Washington, 215-225 (1989)], and glutamine synthetase (GS) [Catherine W-H. et al., J. Biol. Chem., 276:43, 39577-39585 (2001); Bebbington, et al., Biotechnology, 10:169-175 (1992); Bebbington, C. R., Monoclonal antibodies: the next generation, Zola, H., (ed). Bios Scientific, Oxford, 65-181 (1995); Wilson, R. H., In “Gene Amplification in Mammalian Cells” ed Kellems, R. E., Marcel Dekker Inc., New York, 301-311 (1993)].
The regulatory elements or internal factors described above are, on their own, generally incapable of giving gene expression and must be used in combination. However, while the elements themselves are well understood, the efficiency of their combinations for high expression is not reliably predictable. Some combinations give very poor expression in comparison with others. For example, Jang et al, using an expression vector consisting of a combination of the SR α promoter, the AMV RNA leader sequence and DHFR, obtained an erythropoietin (EPO) expression of only 45 IU/ml (equivalent to 0.346 μg/ml) [Jang H P et al, Biotechnol. Appl. Biochem, 32:167-172, (2000)], while U.S. Pat. No. 5,955,422 reports EPO levels of 750 to 1470 U/million cells/48 Hrs (or 375 to 735 U/million cells/24 Hrs) using an expression vector consisting of another combination of elements, namely the SV40 promoter and poly A sequence, and DHFR. Still another expression vector, reported in U.S. Pat. No. 5,888,774 and consisting of a combination of the EF1 promoter and apoB SAR elements, gives an expression of 1500 to 1700 IU of EPO/million cells/24 Hrs. For other recombinant proteins such as TNFR-IgGFc (Enbrel), an expression vector containing a combination of the CMV promoter, TPL, VA I & II, and DHFR has been reported [U.S. Pat. No. 5,605,690, Cindy A Jacobs and Craig A Smith].
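The expression levels quoted above are reported in different units and over different time windows, so they have to be put on a common basis before they can be compared. The following is a minimal sketch of that normalization using only the figures quoted in the text. The function names are hypothetical, and the 24-hour rescaling assumes that production is roughly linear over the reporting window.

```python
# Illustrative only: helper names are hypothetical; the numbers are the
# figures quoted in the paragraph above.

def implied_specific_activity(iu_per_ml: float, ug_per_ml: float) -> float:
    """IU of EPO per microgram implied by a paired IU/ml and ug/ml figure."""
    return iu_per_ml / ug_per_ml


def per_24h(value: float, hours: float) -> float:
    """Rescale a yield reported over `hours` to a 24-hour basis.

    Assumes production is roughly linear over the reporting window.
    """
    return value * 24.0 / hours


# 45 IU/ml is stated to be equivalent to 0.346 ug/ml.
activity = implied_specific_activity(45.0, 0.346)

# 750 to 1470 U/million cells/48 Hrs, restated per 24 Hrs.
low = per_24h(750.0, 48.0)
high = per_24h(1470.0, 48.0)

print(f"implied specific activity: {activity:.0f} IU/ug")
print(f"range per 24 h: {low:.0f} to {high:.0f} U/million cells")
```

The rescaled range reproduces the 375 to 735 U/million cells/24 Hrs given in the text, and the implied specific activity is the factor that links the IU/ml and μg/ml figures.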
Despite all the advances described above, the high cost of manufacturing recombinant biologics, especially those utilizing mammalian expression systems, remains a major concern. Therefore, even though the prior art reports a large number of methods to increase protein expression by modulating the expression vector, it is still desirable to develop novel expression vectors for further increasing the productivity of eukaryotic host cells. Surprisingly, despite the tremendous amount of knowledge generated in this area over the last two decades, even today a person skilled in the art cannot simply pick and choose a combination of internal factors or regulatory elements to design an expression vector that would give guaranteed high expression. A particular element, when added to a combination, may not provide any significant additive or synergistic effect on the expression potential of the vector. Therefore, the process of developing a novel expression vector that gives a high level of protein expression still requires empirically testing many possibilities. We have invented a novel expression vector that, upon stable transfection in CHO-DHFR− cells, gives an expression of 11,830 IU/ml (91 μg/ml) in a 168-hour culture, which is equivalent to 2366 to 3549 IU/10⁶ cells/24 Hrs or 18.2 to 27.3 μg/10⁶ cells/24 Hrs. Surprisingly, this level of expression for EPO is 80-100% higher than that of some of the best vectors reported in the literature. This novel expression vector will significantly bring down the cost of production of EPO and other recombinant biologicals.
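The relation between the volumetric titer and the per-cell figures quoted above can be checked with a short calculation. This is a sketch under a stated assumption, not the applicants' method: it assumes that specific productivity equals the average daily volumetric titer divided by the viable cell density. The text does not give a cell density, so the values computed here are only the densities that would make the quoted figures mutually consistent.

```python
# Illustrative consistency check; all variable names are hypothetical.
# ASSUMPTION (not stated in the text): specific productivity =
#   (titer / culture days) / viable cell density.

TITER_IU_PER_ML = 11_830.0   # quoted volumetric titer
TITER_UG_PER_ML = 91.0       # quoted mass equivalent
CULTURE_HOURS = 168.0        # quoted culture duration

culture_days = CULTURE_HOURS / 24.0
daily_iu_per_ml = TITER_IU_PER_ML / culture_days
daily_ug_per_ml = TITER_UG_PER_ML / culture_days


def implied_density(daily_per_ml: float, per_million_cells_per_day: float) -> float:
    """Viable cell density (million cells/ml) implied by the two figures."""
    return daily_per_ml / per_million_cells_per_day


# Densities implied by the IU figures (2366 and 3549 IU/10^6 cells/24 Hrs).
density_high = implied_density(daily_iu_per_ml, 2366.0)
density_low = implied_density(daily_iu_per_ml, 3549.0)

# The ug figures (18.2 and 27.3) should imply the same densities.
density_high_ug = implied_density(daily_ug_per_ml, 18.2)
density_low_ug = implied_density(daily_ug_per_ml, 27.3)

print(f"average daily titer: {daily_iu_per_ml:.0f} IU/ml/24 h")
print(f"implied density: {density_low:.2f} to {density_high:.2f} million cells/ml")
```

Under this assumption the IU and μg figures imply the same cell density range, so the two sets of per-cell values quoted in the text are consistent with each other.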