Proteins are produced in heterologous expression systems for a wide range of applications in biology and biotechnology. These include research into cellular and molecular function, production of proteins as biopharmaceuticals or diagnostic reagents, and modification of the traits or phenotypes of livestock and crops. Biopharmaceuticals are usually proteins that have an extracellular function, such as antibodies for immunotherapy, or hormones or cytokines for eliciting a cellular response. Proteins with extracellular functions exit the cell via the secretory pathway and undergo post-translational modifications during secretion (Chevet et al. 2001). These modifications (primarily glycosylation and disulfide bond formation) do not occur in bacteria. Moreover, the specific oligosaccharides attached to proteins by glycosylating enzymes are species- and cell-type-specific. These considerations often limit the choice of host cells for heterologous protein production to eukaryotic cells (Kaufman 2000). For expression of human therapeutic proteins, host cells such as bacteria, yeast, or plants may be inappropriate. Even the subtle differences in protein glycosylation between rodents and humans, for example, can be sufficient to render proteins produced in rodent cells unacceptable for therapeutic use (Sheeley et al. 1997). The consequences of improper (i.e., non-human) glycosylation include immunogenicity, reduced functional half-life, and loss of activity. This further limits the choice of host cells to human cell lines or to cell lines such as Chinese Hamster Ovary (CHO) cells, which may produce glycoproteins with human-like carbohydrate structures (Liu 1992).
Some proteins of biotechnological interest are functional as multimers, i.e., in their biologically and/or biotechnologically active form they consist of two or more, possibly different, polypeptide chains; antibodies are an example (Wright and Morrison 1997). Production of such multimeric proteins in heterologous systems is technically difficult due to a number of limitations of current expression systems. These limitations include (1) difficulties in isolating recombinant cells/cell lines that produce the monomer polypeptides at high levels (predictability and yield), and (2) declines in the levels of expression during the industrial production cycle of the proteins (stability). These problems are described in more detail below.
(1) Recombinant proteins such as antibodies that are used as therapeutic compounds need to be produced in large quantities. The host cells used for recombinant protein production must be compatible with the scale of the industrial processes that are employed. Specifically, the transgene (or the gene encoding a protein of interest; the two terms are used interchangeably herein) expression system used for the heterologous protein needs to be retained by the host cells in a stable and active form during the growth phases of scale-up and production. This is achieved by integration of the transgene into the genome of the host cell. However, creation of recombinant cell lines by conventional means is a costly and inefficient process due to the unpredictability of transgene expression among the recombinant host cells. The unpredictability stems from the high likelihood that the transgene will become inactive due to gene silencing (McBurney et al. 2002). Using conventional technologies, the proportion of recombinant host cells that produce one polypeptide at high levels ranges from 1 to 2%. In order to construct a cell line that produces two polypeptides at high levels, the two transgenes are generally integrated independently. If the two transgenes are transfected simultaneously on two separate nucleic acids, the proportion of cells that will produce both polypeptides at high levels will be the arithmetic product of the proportions for single transgenes. Therefore, the proportion of such recombinant cell lines ranges from one in 2,500 to one in 10,000. For multimeric proteins with three or more subunits, the proportions decline further. These high-producing cell lines must subsequently be identified and isolated from the rest of the population. The methods required to screen for these rare high-expressing cell lines are time-consuming and expensive.
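The arithmetic behind these proportions can be sketched as follows. This is only an illustration, assuming that high expression of each transgene is an independent event and that the 1 to 2% range quoted above applies to every chain; the function name `co_expression_fraction` is hypothetical and introduced here for the example.

```python
from fractions import Fraction


def co_expression_fraction(single_fractions):
    """Fraction of cells expressing every chain at high levels.

    Assumes independence between transgenes, so the per-chain
    fractions are simply multiplied together.
    """
    result = Fraction(1)
    for f in single_fractions:
        result *= f
    return result


# Range quoted in the text for a single transgene: 1% to 2% (assumed per chain).
low, high = Fraction(1, 100), Fraction(2, 100)

for n_chains in (1, 2, 3):
    best = co_expression_fraction([high] * n_chains)
    worst = co_expression_fraction([low] * n_chains)
    # Invert the fractions to express them as "one in N" cells.
    print(f"{n_chains} chain(s): one in {int(1 / best):,} "
          f"to one in {int(1 / worst):,}")
```

For two chains this reproduces the range of one in 2,500 to one in 10,000 given above, and for three chains the same assumptions give a still smaller fraction, in line with the statement that the proportions decline further.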
An alternative to simultaneous transfection of two transgene-bearing nucleic acids is sequential transfection. In this case the proportion of high-yielding clones will be the sum of the proportions for single transgenes, i.e., 2 to 4%. Sequential transfection, however, has major drawbacks, including high costs and poor stability. The high costs result from various factors; in particular, the time and resources required for screening for high-expressing cell lines are doubled, since high expression of each subunit must be screened for separately. The poor overall stability of host cells expressing two polypeptides is a consequence of the inherent instability of each of the two transgenes.
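To give a feel for the screening burden these proportions imply, the following sketch converts a proportion of high-yielding clones into the average number of clones that would have to be screened to find one. It takes the figures stated in the text at face value (2 to 4% for sequential transfection, one in 10,000 to one in 2,500 for simultaneous transfection) and assumes each screened clone is an independent trial, so the mean is simply 1/p. The function name `expected_clones_per_hit` is hypothetical, and the sketch does not model the doubled screening rounds or the instability discussed above.

```python
from fractions import Fraction


def expected_clones_per_hit(fraction_high):
    """Mean number of clones screened per high producer.

    Assumes independent trials with success probability p, so the
    expected number of trials up to the first success is 1 / p.
    """
    p = Fraction(fraction_high)
    if not 0 < p <= 1:
        raise ValueError("fraction_high must be in (0, 1]")
    return 1 / p


# Proportions as stated in the text (taken at face value).
strategies = {
    "simultaneous": (Fraction(1, 10_000), Fraction(1, 2_500)),
    "sequential": (Fraction(2, 100), Fraction(4, 100)),
}

for name, (low, high) in strategies.items():
    fewest = int(expected_clones_per_hit(high))
    most = int(expected_clones_per_hit(low))
    print(f"{name}: about {fewest:,} to {most:,} clones per high producer")
```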
(2) Silencing of transgene expression during prolonged host cell cultivation is a commonly observed phenomenon. In vertebrate cells, it can be caused by formation of heterochromatin at the transgene locus, which prevents transcription of the transgene. Transgene silencing is stochastic; it can occur shortly after integration of the transgene into the genome or only after a number of cell divisions. This results in heterogeneous cell populations after prolonged cultivation, in which some cells continue to express high levels of recombinant protein while others express low or undetectable levels of the protein (Martin and Whitelaw 1996, McBurney et al. 2002). A cell line that is used for heterologous protein production is derived from a single cell, yet is often scaled up to, and maintained for long periods at, cell densities in excess of ten million cells per milliliter in cultivators of 1,000 liters or more. These large cell populations (10^14 to 10^16 cells) are prone to serious declines in productivity due to transgene silencing (Migliaccio et al. 2000, Strutzenberger et al. 1999).
The instability of expression of recombinant host cells is particularly severe when transgene copy numbers are amplified in an attempt to increase yields. Transgene amplification is, for example, achieved by including a selectable marker gene such as dihydrofolate reductase (DHFR) with the transgene during integration (Kaufman 2000). Increased concentrations of the selection agent (in the case of DHFR, the drug methotrexate) select for cells that have amplified the number of DHFR genes in the chromosome (Kaufman and Sharp 1982). Since the transgene and DHFR are co-localized in the chromosome, the transgene copy number increases too. This is correlated with an increase in the yield of the heterologous protein (Kaufman 1990). However, the tandem repeats of transgenes that result from amplification are highly susceptible to silencing (Garrick et al. 1998, Kaufman 1990, McBurney et al. 2002).
A need exists for an alternative (heterologous) protein expression technology, and specifically for a protein expression method that overcomes the problems outlined above. In particular, there is a need for an expression system that i) provides high predictability of expression, allowing balanced expression of multiple chains, ii) provides high yields, and iii) provides stability over the extended period during which the protein needs to be produced in large quantities. This stability is particularly needed when high transgene copy numbers are present in a cell and silencing is likely to occur.