The increased use of nucleotide sequence data mining techniques has highlighted the need for efficient methods of producing recombinant proteins. While it is possible to use bacteria to synthesize recombinant protein, this approach cannot be conveniently applied to eukaryotic proteins that require post-translational modification for their activity. Moreover, foreign proteins may be recognized as such by bacterial host specific proteases, resulting in a low protein yield.
One strategy for obtaining a high yield of a recombinant protein by eukaryotic cells is to increase the gene dosage. This can be achieved with viral vectors, such as bovine papilloma virus, simian virus 40, and Epstein-Barr virus, which provide a high copy number per cell (see, for example, DiMaio et al., Proc. Nat'l Acad. Sci. USA 79:4030 (1982); Yates et al., Nature 313:812 (1985)). However, the use of these episomal systems is limited to certain permissive host cells that can support viral replication. In addition, expression is often transient due to vector instability.
Vector stability is improved when the vector is integrated into the genomic DNA of the host cell. Another approach, therefore, is to select cells containing vector sequences that have been amplified after integration into genomic DNA. Typically, the selection procedure is performed by transfecting cells with a gene encoding the desired protein and a gene encoding a protein that confers resistance to a toxic drug. The co-amplification of transfected DNA can provide a 100- to 1000-fold increase in the expression of the desired protein.
Although over twenty selectable and amplifiable genes have been described, the most popular selectable marker gene for amplification is the dihydrofolate reductase (DHFR) gene (Kaufman, Methods Enzymol. 185:487 (1990)). In this approach, the copy numbers of the DHFR gene and an associated gene are increased by selection in methotrexate, which is a competitive inhibitor of the DHFR enzyme. Stepwise increases in methotrexate concentration result in the selection of clones that often express elevated levels of DHFR, usually due to gene amplification, and increased expression of the co-amplified gene. One disadvantage of DHFR co-amplification is the requirement of a DHFR-deficient cell line. Another drawback is that the methotrexate dose must be increased in small increments in a stepwise amplification with clones picked and expanded at each step. Consequently, a significant investment in time is required to obtain a highly amplified clone (see, for example, Barsoum, DNA and Cell Biology 9:293 (1990)). As an illustration, Chinese hamster DHFR− cells are often used for the synthesis of recombinant proteins because the recombinant genes integrated into the host chromosome along with the DHFR gene can be efficiently co-amplified by increasing the methotrexate concentration. However, it normally takes six to ten months to establish cell lines that produce desired amounts of recombinant proteins after transfection (see, for example, Choo et al., Gene 46:277 (1986)).
Gene amplification has also been obtained using selectable marker genes such as adenosine deaminase genes, ornithine decarboxylase genes, and the human multidrug resistance gene, MDR1 (Kaufman et al., Proc. Nat'l Acad. Sci. USA 83:3136 (1986); Chiang and McConlogue, Mol. Cell. Biol. 8:764 (1988); Germann et al., J. Biol. Chem. 264:7418 (1989); Kane et al., Mol. Cell. Biol. 8:3316 (1988)). Kaufman, U.S. Pat. No. 5,238,820, took advantage of the availability of multiple amplifiable genes by designing vectors that carry two or more different heterologous selectable amplifiable marker genes. The objective was to achieve higher levels of gene amplification. In this approach, transformed cells are first grown under suitable conditions for selecting and amplifying one heterologous selectable amplifiable marker gene to increase the copy number of the desired protein gene. The copy number is then further increased by growing the cells under suitable conditions for selecting and amplifying the second heterologous selectable amplifiable marker gene. This process is repeated for each additional selectable marker that may be present.
Studies indicate that, when plasmids reach a host cell nucleus, the plasmids are cleaved and spliced into high molecular weight concatemers. In vivo gene amplification has the disadvantage that the structure of the amplified gene cannot be controlled, and success is not predictable. Barsoum, DNA and Cell Biology 9:293 (1990), described high-copy-number electroporation of Chinese hamster ovary cells with high concentrations of an expression vector that had been linearized with a restriction endonuclease that left cohesive ends. A significant portion of the introduced DNA was arranged in tandem repeats of unknown length that comprised copies of the vector in mixed orientations. Although this method provided control over the plasmid cleavage site, in vivo ligation and integration events were not controlled.
One strategy for imposing greater control on the gene amplification process is to polymerize the gene of interest in vitro before introducing the DNA into a host cell (see, for example, Leahy et al., Bioconjugate Chem. 7:545 (1996); Leahy et al., Nucl. Acids Res. 25:449 (1997)). Early attempts to generate tandem arrays of DNA fragments required the ligation of the DNA fragment into an appropriate vector, and typically, this simple approach yielded a random orientation of fragments, resulting in polymers containing both direct and inverted repeats (see, for example, Sadler et al., Gene 3:211 (1978)). While the presence of inverted repeats in a polymer led to instability of the DNA inside the host cell, a series of direct repeats was found to form stable molecules.
A problem in controlling fragment orientation is that many of the commonly used restriction enzymes produce termini that are rotationally equivalent, and therefore, self-ligation of DNA fragments with such termini is random with regard to fragment orientation. Hartley and Gregori, Gene 13:347 (1981), reported a technique to control fragment orientation during ligation, which required the introduction of AvaI sites flanking either end of the cloned fragment (also see Hartley and Gregori, U.S. Pat. No. 4,403,036). Since AvaI cleavage produces distinguishable ends, self-ligation of the fragment results in a strong bias toward head-to-tail orientation, because head-to-head and tail-to-tail ligations result in base mismatches. The polymerized molecules were then inserted into a vector and used to transform E. coli.
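The distinction between rotationally equivalent and distinguishable termini can be illustrated with a minimal Python sketch. It is not taken from the cited references. It assumes that each fragment end is described only by its single-stranded 5′ overhang, written 5′ to 3′ on the strand that carries it, and that two ends anneal only when their overhangs are perfectly complementary. The function names are hypothetical, and the example overhangs (AATT for an EcoRI-type end; TCGG and CCGA for the two ends left by an asymmetric AvaI site of sequence CTCGGG) are illustrative assumptions.

```python
# Illustrative sketch only: models a fragment end by its 5' overhang sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_rotationally_equivalent(overhang):
    """True if the overhang is its own reverse complement (palindromic)."""
    return overhang.upper() == reverse_complement(overhang)


def ends_anneal(a, b):
    """True if two 5' overhangs are perfectly complementary."""
    return a.upper() == reverse_complement(b)


def allowed_junctions(head, tail):
    """Report which junction types can form without base mismatches.

    head: overhang at the left end of the fragment (hypothetical naming)
    tail: overhang at the right end of the fragment
    """
    return {
        "head-to-tail": ends_anneal(tail, head),
        "head-to-head": ends_anneal(head, head),
        "tail-to-tail": ends_anneal(tail, tail),
    }


if __name__ == "__main__":
    # Palindromic overhang: every orientation can ligate.
    print(allowed_junctions("AATT", "AATT"))
    # Distinguishable ends: only head-to-tail joining is mismatch-free.
    print(allowed_junctions("TCGG", "CCGA"))
```

Under these assumptions, a palindromic overhang permits all three junction types, whereas the pair of distinguishable overhangs permits only the head-to-tail junction, which is the bias toward direct repeats described above.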
In a similar approach, Ikeda et al., Gene 71:19 (1988), produced head-to-tail tandem arrays of a DNA fragment encoding a human major histocompatibility antigen that was flanked by SfiI cleavage sites. SfiI produces cleaved ends that are not rotationally equivalent. A cosmid vector containing the amplified gene and hygromycin B resistance-conferring and dhfr genes was used to transfect a murine cell line.
SfiI sites have also been used to produce copolymers of gene expression cassettes and selection markers, which can be used to transfect cells (Monaco et al., Biotechnol. Appl. Biochem. 20:157 (1994); Asselbergs et al., Anal. Biochem. 243:285 (1996)). According to the method of Monaco et al., the copolymer is treated with NotI to cleave the DNA at the 3′-end of the selectable marker gene. In this way, transfected DNA molecules will contain only one selectable marker gene per copolymer.
Class IIS restriction enzymes recognize totally asymmetric sites and can generate complementary cohesive ends. Kim and Szybalski, Gene 71:1 (1988), took advantage of this quality by introducing sites for BspMI, a class IIS restriction enzyme, at either end of cloned DNA. Self-ligation of the cloned DNA provided multimers comprising repeat units in the same orientation. Similarly, Takeshita et al., Gene 71:9 (1988), achieved tandem gene amplification by inserting a fragment encoding human protein C into a plasmid to introduce asymmetric cohesive ends into the fragment. In this case, sites for the class IIS enzyme, BstXI, were used. The multimer was then cloned into a cosmid vector comprising a neo gene, packaged into lambda phage particles, and amplified in E. coli. The cosmid vectors were then introduced into Chinese hamster ovary DHFR− cells, which were treated with G418 to select for cells that expressed the neo gene. Takeshita et al. also found that cells expressed human protein C, albeit at lower levels, following transfection with unpackaged tandem ligated DNA comprising copies of the cosmid vector and the human protein C gene.
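The way a class IIS enzyme can yield an asymmetric cohesive end may be sketched in Python as follows. The sketch is illustrative and is not drawn from the cited references. It assumes the commonly listed BspMI specificity, ACCTGC(4/8), in which the top strand is cut 4 nucleotides and the bottom strand 8 nucleotides downstream of the recognition site, leaving a 4-nucleotide 5′ overhang. The function name and the example sequence are hypothetical, and only the top strand is searched.

```python
# Illustrative sketch only: assumes BspMI cuts at ACCTGC(4/8).
SITE = "ACCTGC"       # assumed BspMI recognition sequence
TOP_OFFSET = 4        # assumed cut position on the top strand, past the site
BOTTOM_OFFSET = 8     # assumed cut position on the bottom strand, past the site


def class_iis_overhang(seq):
    """Return the 5' overhang left downstream of the first site, or None.

    The overhang lies outside the recognition sequence, so its bases are
    set by the flanking DNA rather than by the enzyme itself.
    """
    seq = seq.upper()
    i = seq.find(SITE)
    if i < 0:
        return None  # no recognition site on the top strand
    start = i + len(SITE) + TOP_OFFSET
    end = i + len(SITE) + BOTTOM_OFFSET
    if end > len(seq):
        return None  # cut positions fall beyond the end of the sequence
    return seq[start:end]


if __name__ == "__main__":
    # Hypothetical construct: site, 4-nt spacer, then designer-chosen bases.
    example = "GG" + SITE + "AAAA" + "TCGG" + "TT"
    print(class_iis_overhang(example))
```

Because the cut falls outside the recognition sequence, the bases of the overhang can be chosen so that the cohesive end is not self-complementary, which is what allows self-ligation to proceed in a single orientation.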
A similar approach was also described by Lee et al., Genetic Analysis: Biomolecular Engineering 13:139 (1996), who amplified target DNA as tandem multimers by cloning the target DNA into a class IIS restriction enzyme cleavage site of a vector, excising a monomeric insert with the class IIS restriction enzyme, isolating monomeric inserts, self-ligating the inserts, and cloning the multimers into a vector. According to Lee et al., this scheme is useful for polymerizing short DNA fragments for the mass production of peptides.
Another scheme for forcing directional ligation is to devise synthetic linkers or adapters that are used to create asymmetric cohesive ends. For example, Taylor and Hagerman, Gene 53:139 (1987), modified the Hartley-Gregori approach by attaching synthetic directional adapters to a DNA fragment in order to establish complete control over fragment orientation during ligation. Following polymerization, the multimers were ligated to a linearized vector suitable for E. coli transformation. Ståhl et al., Gene 89:187 (1990), described a similar method for polymerizing DNA fragments in a head-to-tail arrangement. Here, synthetic oligonucleotides were designed to encode an epitope-bearing peptide with 5′-protruding ends complementary to the asymmetric cleavage site of the class IIS restriction enzyme, BspMI. After polymerization, the peptide-encoding fragments were inserted into the unique BspMI cleavage site of a vector, which was used to transform E. coli. Clones were screened using the polymerase chain reaction, and then subcloned into prokaryotic expression vectors for production of the peptides in E. coli.
In sum, methods that rely on in vivo gene amplification are not only time-consuming, but also lack control over the final structure of the integrated and amplified gene. While in vitro gene amplification methods provide some control over the structure of the integrated gene, current methods typically require multiple cloning steps in prokaryotic hosts. In addition, presently described methods often require selection of transfected cells with a toxic drug that is rendered harmless by an enzyme product of a co-transfected gene. There is no assurance that cells possessing a sufficient level of this enzymatic activity also possess a sufficient number of copies of the desired gene to provide high levels of expression of the desired recombinant protein.
Despite advances in obtaining high levels of gene expression in recombinant host cells, therefore, a need still exists for a strategy that provides a rapid and simple method of producing high levels of recombinant protein in eukaryotic cells.