Proteins are produced in systems for a wide range of applications in biology and biotechnology. These include research into cellular and molecular function, production of proteins as biopharmaceuticals or diagnostic reagents, and modification of the traits or phenotypes of livestock and crops. Biopharmaceuticals are usually proteins that have an extracellular function, such as antibodies for immunotherapy or hormones or cytokines for eliciting a cellular response. Proteins with extracellular functions exit the cell via the secretory pathway, and undergo post-translational modifications during secretion. The modifications (primarily glycosylation and disulfide bond formation) do not occur in bacteria. Moreover, the specific oligosaccharides attached to proteins by glycosylating enzymes are species- and cell-type-specific. These considerations often limit the choice of host cells for heterologous protein production to eukaryotic cells (Kaufman, 2000). For expression of human therapeutic proteins, host cells such as bacteria, yeast, or plants may be inappropriate. Even the subtle differences in protein glycosylation between rodents and humans, for example, can be sufficient to render proteins produced in rodent cells unacceptable for therapeutic use (Sheeley et al., 1997). The consequences of improper (i.e., non-human) glycosylation include immunogenicity, reduced functional half-life, and loss of activity. This limits the choice of host cells further, to human cell lines or to cell lines such as Chinese Hamster Ovary (CHO) cells, which may produce glycoproteins with human-like carbohydrate structures (Liu, 1992).
Some proteins of biotechnological interest are functional as multimers, i.e., they consist of two or more, possibly different, polypeptide chains in their biologically and/or biotechnologically active form. Examples include antibodies (Wright & Morrison, 1997), bone morphogenetic proteins (Groeneveld & Burger, 2000), nuclear hormone receptors (Aranda & Pascual, 2001), heterodimeric cell surface receptors (e.g., T cell receptors; Chan & Mak, 1989), integrins (Hynes, 1999), and the glycoprotein hormone family (chorionic gonadotrophin, pituitary luteinizing hormone, follicle-stimulating hormone, and thyroid-stimulating hormone; Thotakura & Blithe, 1995). Production of such multimeric proteins in heterologous systems is technically difficult due to a number of limitations of current expression systems. These limitations include (1) difficulties in isolating recombinant cells/cell lines that produce the monomer polypeptides at high levels (predictability and yield), (2) difficulties in attaining production of the monomeric polypeptides in stoichiometrically balanced proportions (Kaufman, 2000), and (3) declines in the levels of expression during the industrial production cycle of the proteins (stability). These problems are described in more detail below.
(1) Recombinant proteins such as antibodies that are used as therapeutic compounds need to be produced in large quantities. The host cells used for recombinant protein production must be compatible with the scale of the industrial processes that are employed. Specifically, the transgene (or the gene encoding a protein of interest; the two terms are used interchangeably herein) expression system used for the heterologous protein needs to be retained by the host cells in a stable and active form during the growth phases of scale-up and production. This is achieved by integration of the transgene into the genome of the host cell. However, creation of recombinant cell lines by conventional means is a costly and inefficient process due to the unpredictability of transgene expression among the recombinant host cells. The unpredictability stems from the high likelihood that the transgene will become inactive due to gene silencing (McBurney et al., 2002). Using conventional technologies, the proportion of recombinant host cells that produce one polypeptide at high levels ranges from 1 to 2%. In order to construct a cell line that produces two polypeptides at high levels, the two transgenes are generally integrated independently. If the two transgenes are transfected simultaneously on two separate plasmids, the proportion of cells that will produce both polypeptides at high levels will be the arithmetic product of the proportions for single transgenes. Therefore, the proportion of such recombinant cell lines ranges from one in 2,500 to one in 10,000. For multimeric proteins with three or more subunits, the proportions decline further. These high-producing cell lines must subsequently be identified and isolated from the rest of the population. The methods required to screen for these rare high-expressing cell lines are time-consuming and expensive.
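The arithmetic behind these figures can be sketched in a few lines. The following is an illustrative calculation only, assuming that each independently integrated transgene becomes a high producer with the same probability and independently of the others, as the product rule stated above implies. The function names are hypothetical and chosen for this example.

```python
from fractions import Fraction


def joint_high_producer_fraction(single_fraction, n_transgenes):
    """Fraction of clones expressing all n transgenes at high levels.

    Assumes (for illustration) that the transgenes integrate and are
    silenced independently, so the single-transgene proportions multiply.
    """
    return single_fraction ** n_transgenes


def clones_per_high_producer(fraction):
    """Reciprocal of the fraction: 'one in N' clones is a high producer."""
    return int(1 / fraction)


# Single-transgene proportions of 1% and 2%, kept as exact fractions.
low, high = Fraction(1, 100), Fraction(2, 100)

for n in (1, 2, 3):
    best = clones_per_high_producer(joint_high_producer_fraction(high, n))
    worst = clones_per_high_producer(joint_high_producer_fraction(low, n))
    print(f"{n} transgene(s): one in {best:,} to one in {worst:,}")
```

For two transgenes this reproduces the range of one in 2,500 to one in 10,000. Under the same assumption, a third subunit would push the range to one in 125,000 to one in 1,000,000, which illustrates why the screening burden grows so quickly with the number of subunits.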
An alternative to simultaneous transfection of two transgene-bearing plasmids is sequential transfection. In this case the proportion of high-yielding clones will be the sum of the proportions for single transgenes, i.e., 2-4%. Sequential transfection, however, has major drawbacks, including high costs and poor stability. The high costs result from various factors: in particular, the time and resources required for screening for high-expressing cell lines are doubled, since high expression of each subunit must be screened for separately. The poor overall stability of host cells expressing two polypeptides is a consequence of the inherent instability of each of the two transgenes.
(2) Production of multimeric proteins requires balanced levels of transcriptional and translational expression of each of the polypeptide monomers. Imbalanced expression of the monomers is wasteful of the costly resources used in cell cultivation. Moreover, the imbalanced expression of one monomer can have deleterious effects on the cell. These effects include (a) sequestration of cellular factors required for secretion of the recombinant proteins (e.g., chaperones in the endoplasmic reticulum; Chevet et al., 2001), and (b) induction of stress responses that result in reduced rates of growth and protein translation, or even in apoptosis (programmed cell death) (Pahl & Baeuerle, 1997, Patil & Walter, 2001). These deleterious effects lead to losses in productivity and yield and to higher overhead costs.
(3) Silencing of transgene expression during prolonged host cell cultivation is a commonly observed phenomenon. In vertebrate cells it can be caused by formation of heterochromatin at the transgene locus, which prevents transcription of the transgene. Transgene silencing is stochastic; it can occur shortly after integration of the transgene into the genome, or only after a number of cell divisions. This results in heterogeneous cell populations after prolonged cultivation, in which some cells continue to express high levels of recombinant protein while others express low or undetectable levels of the protein (Martin & Whitelaw, 1996, McBurney et al., 2002). A cell line that is used for heterologous protein production is derived from a single cell, yet is often scaled up to, and maintained for long periods at, cell densities in excess of ten million cells per milliliter in cultivators of 1,000 liters or more. These large cell populations (10^14 to 10^16 cells) are prone to serious declines in productivity due to transgene silencing (Migliaccio et al., 2000, Strutzenberger et al., 1999).
The instability of expression of recombinant host cells is particularly severe when transgene copy numbers are amplified in an attempt to increase yields. Transgene amplification is achieved by including a selectable marker gene such as dihydrofolate reductase (DHFR) with the transgene during integration. Increased concentrations of the selection agent (in the case of DHFR, the drug methotrexate) select for cells that have amplified the number of DHFR genes in the chromosome. Since the transgene and DHFR are co-localized in the chromosome, the transgene copy number increases too. This is correlated with an increase in the yield of the heterologous protein (Kaufman, 1990). However, the tandem repeats of transgenes that result from amplification are highly susceptible to silencing (Garrick et al., 1998, Kaufman, 1990, McBurney et al., 2002). Silencing is often due to a decline in transgene copy number after the selection agent is removed (Kaufman, 1990). Removal of the selection agent, however, is routine during industrial biopharmaceutical production, for two reasons. First, cultivation of cells at industrial scales in the presence of selection agents is not economically feasible, as the agents are expensive compounds. Second, and more importantly, concerns for product purity and safety preclude maintaining selection during a production cycle. Purifying a recombinant protein and removing all traces of the selection agent is necessary if the protein is intended for pharmaceutical use. However, it is technically difficult and prohibitively expensive to do so, and demonstrating that this has been achieved is also difficult and expensive. Therefore, amplification-based transgenic systems that require continual presence of selection agents are disadvantageous.
Alternatively, silencing can be due to epigenetic effects on the transgene tandem repeats, a phenomenon known as Repeat Induced Gene Silencing (RIGS) (Whitelaw et al., 2001). In these cases the copy number of the transgene is stable, and silencing occurs due to changes in the chromatin structure of the transgenes (McBurney et al., 2002). The presence of a selection agent during cell cultivation may be unable to prevent silencing of the transgene transcription unit because transgene expression is independent of expression of the selectable marker. The lack of a means to prevent RIGS in conventional transgenic systems thus results in costly losses in productivity.