Proteins are produced in expression systems for a wide range of applications in biology and biotechnology. These include research into cellular and molecular function, production of proteins as biopharmaceuticals or diagnostic reagents, and modification of the traits or phenotypes of livestock and crops. Biopharmaceuticals are usually proteins that have an extracellular function, such as antibodies for immunotherapy, or hormones or cytokines for eliciting a cellular response. Proteins with extracellular functions exit the cell via a secretory pathway, and undergo post-translational modifications during secretion (Chevet et al., 2001). The modifications (primarily glycosylation and disulfide bond formation) do not naturally occur in bacteria. Moreover, the specific oligosaccharides attached to proteins by glycosylating enzymes are typically species- and cell-type-specific. These considerations often limit the choice of host cells for heterologous protein production to eukaryotic cells (Kaufman, 2000). For expression of human therapeutic proteins, host cells such as bacteria, yeast, or plants may be inappropriate. Even the subtle differences in protein glycosylation between rodents and humans, for example, can be sufficient to render proteins produced in rodent cells unacceptable for therapeutic use (Sheeley et al., 1997). The consequences of improper (i.e., non-human) glycosylation include immunogenicity, reduced functional half-life, and loss of activity. For proteins where this is a problem, the choice of host cells is limited further to human cell lines or to cell lines such as Chinese Hamster Ovary (CHO) cells, which may produce glycoproteins with human-like carbohydrate structures (Liu, 1992).
Some proteins of biotechnological interest are functional as multimers, i.e., they consist of two or more possibly different polypeptide chains in their biologically and/or biotechnologically active form. Examples include antibodies (Wright and Morrison, 1997), bone morphogenetic proteins (Groeneveld and Burger, 2000), nuclear hormone receptors (Aranda and Pascual, 2001), heterodimeric cell surface receptors (e.g., T cell receptors (Chan and Mak, 1989)), integrins (Hynes, 1999), and the glycoprotein hormone family (chorionic gonadotrophin, pituitary luteinizing hormone, follicle-stimulating hormone, and thyroid-stimulating hormone (Thotakura and Blithe, 1995)). Production of such multimeric proteins in heterologous systems is technically difficult due to a number of limitations of current expression systems. These limitations include: (1) difficulties in isolating recombinant cells/cell lines that produce the monomer polypeptides at high levels (predictability and yield) and (2) declines in the levels of expression during the industrial production cycle of the proteins (stability). These problems are described in more detail below.
(1) Recombinant proteins, such as antibodies that are used as therapeutic compounds, need to be produced in large quantities. The host cells used for recombinant protein production must be compatible with the scale of the industrial processes that are employed. Specifically, the transgene (or the gene encoding a protein of interest, the two terms being used interchangeably herein) expression system used for the heterologous protein needs to be retained by the host cells in a stable and active form during the growth phases of scale-up and production. This is achieved by integration of the transgene into the genome of the host cell. However, creation of recombinant cell lines by conventional means is a costly and inefficient process due to the unpredictability of transgene expression among the recombinant host cells. The unpredictability stems from the high likelihood that the transgene will become inactive due to gene silencing (McBurney et al., 2002). Using conventional technologies, the proportion of recombinant host cells that produce one polypeptide at high levels ranges from 1 to 2%. In order to construct a cell line that produces two polypeptides at high levels, the two transgenes are generally integrated independently. If the two transgenes are transfected simultaneously on two separate plasmids, the proportion of cells that will produce both polypeptides at high levels will be the arithmetic product of the proportions for single transgenes. Therefore, the proportion of such recombinant cell lines ranges from one in 2,500 to one in 10,000. For multimeric proteins with three or more subunits, the proportions decline further. These high-producing cell lines must subsequently be identified and isolated from the rest of the population. The methods required to screen for these rare high-expressing cell lines are time consuming and expensive.
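The arithmetic behind these proportions can be checked with a minimal sketch. It assumes, as the text implies, that each transgene independently reaches high expression with the same probability (1 to 2%); the function name `co_expression_proportion` and the three-subunit case are illustrative only and do not appear in the cited literature.

```python
from fractions import Fraction


def co_expression_proportion(single_proportions):
    """Multiply per-transgene proportions of high-expressing cells.

    Assumes the transgenes integrate and are expressed independently,
    so the joint proportion is the product of the individual ones.
    """
    result = Fraction(1)
    for proportion in single_proportions:
        result *= proportion
    return result


# Range quoted in the text for a single transgene: 1% to 2%.
low = Fraction(1, 100)
high = Fraction(2, 100)

# Two subunits transfected simultaneously on separate plasmids.
two_low = co_expression_proportion([low, low])
two_high = co_expression_proportion([high, high])

# Hypothetical three-subunit multimer, same assumptions.
three_low = co_expression_proportion([low, low, low])
three_high = co_expression_proportion([high, high, high])

for label, value in [
    ("two subunits, 1% each", two_low),
    ("two subunits, 2% each", two_high),
    ("three subunits, 1% each", three_low),
    ("three subunits, 2% each", three_high),
]:
    print(f"{label}: one in {1 / value}")
```

Under these assumptions the two-subunit case reproduces the one in 10,000 to one in 2,500 range, and a third subunit lowers the proportion by roughly a further two orders of magnitude.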
An alternative to simultaneous transfection of two transgene-bearing plasmids is sequential transfection. In this case, the proportion of high-yielding clones will be the sum of the proportions for single transgenes, i.e., 2 to 4%. Sequential transfection, however, has major drawbacks, including high costs and poor stability. The high costs result from various factors; in particular, the time and resources required for screening for high-expressing cell lines are doubled, since high expression of each subunit must be screened for separately. The poor overall stability of host cells expressing two polypeptides is a consequence of the inherent instability of each of the two transgenes.
(2) Silencing of transgene expression during prolonged host cell cultivation is a commonly observed phenomenon. In vertebrate cells, it can be caused by formation of heterochromatin at the transgene locus, which prevents transcription of the transgene. Transgene silencing is stochastic; it can occur shortly after integration of the transgene into the genome or only after a number of cell divisions. This results in heterogeneous cell populations after prolonged cultivation, in which some cells continue to express high levels of recombinant protein, while others express low or undetectable levels of the protein (Martin and Whitelaw, 1996; McBurney et al., 2002). A cell line that is used for heterologous protein production is derived from a single cell, yet is often scaled up to, and maintained for long periods at, cell densities in excess of ten million cells per milliliter in cultivators of 1,000 liters or more. These large cell populations (10^14 to 10^16 cells) are prone to serious declines in productivity due to transgene silencing (Migliaccio et al., 2000; Strutzenberger et al., 1999).
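To illustrate why scale-up from a single cell exposes a population to silencing, the following sketch counts the doublings needed to reach a given population size and applies a deliberately simplified model. The constant per-division silencing probability, its value of 1%, the assumption that silenced cells never recover, and both function names are hypothetical choices made for illustration; none of them is taken from the cited studies.

```python
import math


def doublings_from_single_cell(target_cells):
    """Number of population doublings needed to grow one cell to target_cells."""
    return math.ceil(math.log2(target_cells))


def fraction_expressing(silencing_prob_per_division, divisions):
    """Expected fraction of lineages still expressing the transgene.

    Simplified model (an assumption, not a measured behavior): each
    division silences the transgene with a fixed, independent
    probability, and silencing is irreversible.
    """
    return (1.0 - silencing_prob_per_division) ** divisions


# Hypothetical per-division silencing probability, for illustration only.
hypothetical_silencing_prob = 0.01

for population in (1e14, 1e16):
    n = doublings_from_single_cell(population)
    remaining = fraction_expressing(hypothetical_silencing_prob, n)
    print(f"{population:.0e} cells: {n} doublings, "
          f"{remaining:.0%} of lineages still expressing")
```

Even with a small per-division probability, the dozens of doublings separating a single founder cell from a production-scale culture leave a substantial non-expressing subpopulation in this toy model, consistent with the heterogeneity described above.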
The instability of expression of recombinant host cells is particularly severe when transgene copy numbers are amplified in an attempt to increase yields. Transgene amplification is achieved by including a selectable marker gene, such as dihydrofolate reductase (DHFR), with the transgene during integration (Kaufman, 2000). Increased concentrations of the selection agent (in the case of DHFR, the drug methotrexate) select for cells that have amplified the number of DHFR genes in the chromosome (Kaufman and Sharp, 1982). Since the transgene and DHFR are co-localized in the chromosome, the transgene copy number increases too. This is correlated with an increase in the yield of the heterologous protein (Kaufman, 1990). However, the tandem repeats of transgenes that result from amplification are highly susceptible to silencing (Garrick et al., 1998; Kaufman, 1990; McBurney et al., 2002).
The above-stated problems associated with conventional transgene expression technologies for protein production clearly demonstrate a need in the art for systems that overcome these problems. Specifically, there is a need for expression systems that i) provide high predictability of expression, allowing balanced expression of multiple chains, ii) provide high yields, iii) provide stability during an extended period during which the protein needs to be produced in large quantities, and iv) result in an increased number of clones with appropriate expression levels.