Recent advances in genomics technology have created an enormous wealth of information about living organisms at the DNA or gene level. The challenge currently facing the biotechnology community is to elucidate the function of these DNAs or putative genes, which often requires studying them at the protein level. Because functional genomics research usually involves the generation and subsequent sequencing of cDNA clones, further functional and/or structural study of these clones at the protein or amino acid level often involves expressing the genes or DNA inserts in a suitable system, in a sufficient amount and in a suitable form. These cDNA clones, however, are often not suitable for protein expression, and the cDNA inserts in these clones need to be subcloned into a suitable expression vector. Because of the sheer number of clones that need to be studied, traditional cloning and subcloning methods are clearly inadequate: they require individual subcloning strategies and have low efficiency. There is an increasingly pressing need for technologies that allow rapid, precise, and directional gene transfer from one vector (often termed a “donor vector”), such as a cDNA clone from a genomics research project, into a suitably adapted expression vector (the “acceptor vector”). Preferably, the methodology and acceptor vector should allow (1) the elimination of individual, often awkward and time-consuming subcloning strategies, (2) rapid screening of hundreds of genes at a time, and (3) screening in a variety of host organisms or cell lines. Furthermore, the new methods and vectors should be suitable for studies at a genome-wide scale using informatics and automation.
Strategies and commercial products already exist in the prior art that utilize site-specific recombination systems, or restriction endonuclease digestion and subsequent ligation. Restriction endonuclease digestion and ligation have long been used in molecular biology and biotechnology and are well-established technologies. Likewise, the recombination process and related proteins are well known to those skilled in the art, and numerous recombination systems from various organisms have been described; see, e.g., Landy, A., 1993, Current Opinion in Genetics and Development 3:699–707; Abremski et al., 1986, J. Biol. Chem. 261:391; Hoess et al., 1986, Nucleic Acids Research 14:2287; Campbell, 1992, J. Bacteriol. 174:7495; Qian et al., 1992, J. Biol. Chem. 267:7794; Araki et al., 1992, Biol. 225:25; Maeser et al., 1991, Mol. Gen. Genet. 230:170–176.
The use of phage lambda enzymatic site-specific recombination and the wild-type recombination sites attB and attP for the construction of a DNA segment to produce a protein in E. coli was disclosed in U.S. Pat. No. 4,673,640. In vivo intramolecular recombination by the lambda recombination system between wild-type attP and attB sites flanking a promoter was described by Hasan et al. (1987, Gene 56:145–151).
Palazzolo et al., 1990, Gene 88:25–36, disclose phage lambda vectors having bacteriophage lambda arms positioned outside a cloned DNA sequence and between wild-type loxP sites. E. coli cells that express the Cre recombinase are transformed with these phage vectors, resulting in recombination between the loxP sites and in vivo excision of the plasmid replicon, including the cloned cDNA.
A method for inserting partial genomic DNA into expression vectors having a selectable marker was described in Posfai et al. (1994, Nucl. Acids Res. 22:2392–2398). The marker was flanked by two wild-type FRT recognition sequences, and FLP recombinase present in the cells integrated the vectors into the genome at predetermined sites.
U.S. Pat. No. 5,434,066 discloses the use of site-specific recombinases such as Cre on DNA containing two loxP sites to effect in vivo recombination between the sites.
Waterhouse et al. (Nucleic Acids Res. 21 (9):2265 (1993)) disclose an in vivo method to clone light and heavy chains of a particular antibody in different phage vectors, using recombination between loxP and loxP 511 sites. The Cre protein acts in the host cells on the two parental molecules (one plasmid, one phage) and produces four products in equilibrium: two different cointegrates (produced by recombination at either the loxP or the loxP 511 sites) and two daughter molecules, one of which is the desired product.
Schlake et al. (Biochemistry 33:12746–12751 (1994)) disclose an in vivo method for exchanging expression cassettes at defined chromosomal locations, each flanked by a wild-type and a spacer-mutated FRT recombination site. A double-reciprocal crossover was mediated in cultured mammalian cells by using this FLP/FRT system for site-specific recombination.
The transposase family of enzymes has also been used to transfer genetic information between replicons. Transposons are structurally variable, being described as simple or compound, but typically carry a recombinase gene flanked by DNA sequences organized in inverted orientation. Integration of transposons can be random or highly specific. Representatives such as Tn7, which are highly site-specific, have been applied to the in vivo movement of DNA segments between replicons (Lucklow et al., J. Virol. 67:4566–4579 (1993)).
Devine et al. (Nucl. Acids Res. 22:3765–3772 (1994)) disclose a system that makes use of the integrase of yeast TY1 virus-like particles. The DNA segment of interest is cloned, using standard methods, between the ends of the transposon-like element TY1. In the presence of the TY1 integrase, the resulting element integrates randomly into a second target DNA molecule.
U.S. Pat. No. 6,410,317 B1, to Farmer et al., discloses a method for producing expression vectors using the Cre recombinase. In this system, the gene of interest is inserted, via restriction endonuclease digestion and ligation, into a polylinker site, or multiple cloning site (“MCS”), of a “donor vector.” The MCS is flanked by two loxP sites, the specific recognition site of the Cre recombinase of bacteriophage P1, oriented in the same direction. In the presence of Cre recombinase, the gene of interest is transferred from the donor vector to an “acceptor” or receiver vector, which contains one loxP site at which the gene of interest is inserted. The acceptor vector also contains various other elements generally required of an expression vector, such as a suitable promoter, a suitable marker gene or genes, an appropriate peptide tag, and a replication origin. The acceptor vector is thus converted into a desired expression vector, which is disclosed to be suitable for expression in many expression systems.
U.S. Pat. No. 6,277,608 B1, to Hartley et al., discloses an alternative strategy using a system of at least a recombinase, an insert donor and a vector donor. The insert donor contains two site-specific recombinase recognition sites, each of which is recognized by its own recombinase, but which do not recombine with each other. Similarly, the vector donor also contains two site-specific recombinase recognition sites, recognized by the same two recombinases. The recognition site for one recombinase on the insert donor is capable of recombining with the recognition site for the same recombinase on the vector donor, and likewise, the recognition site for the other recombinase on the insert donor is capable of recombining with the recognition site for that recombinase on the vector donor. In the presence of one recombinase, the two donor molecules recombine to form a circular cointegrate, which in the presence of the second recombinase resolves into two circular molecules, one the desired expression vector and the other a by-product. The specific recombinase/recognition site system exemplified in Hartley et al. is the well-known Integrase/att system from bacteriophage λ (see, e.g., Landy, 1993, Current Opinion in Genetics and Development 3:699–707).
International Patent Application WO 02/46372 (Chestnut et al., 2002) discloses a method for cloning two or more different nucleic acid molecules simultaneously using vectors having multiple recombination sites and/or multiple topoisomerase recognition sites. Published U.S. patent application 20030124555 (Brasch et al.) discloses a method for cloning a population of nucleic acid molecules of interest, specifically a cDNA library, into a vector which has one or two recombination sites. The method requires two or more recombination steps and multiple recombination sites. These cloning methods utilizing site-specific recombination, however, require an initial construction of a donor vector that contains the desired DNA segment(s) or insert(s), followed by the transfer of the segment(s) into a second, desired expression vector for the expression of the segment(s).
In addition, in some of the above methods, the DNA of interest is transferred to the acceptor vector via site-specific recombination. Invariably, the promoter and other regulatory elements of the expression vector are placed on one side (upstream) of the newly formed recombination recognition site(s), while the insert sequence or DNA of interest is on the other side (downstream) of the same recombination recognition site. As a consequence, the nucleotides of the recombination recognition site(s) lie between the promoter (and/or other expression signals) and the coding sequence of the DNA of interest, and may be expressed as well, resulting in additional and unwanted amino acid residues in the expressed product. For example, in the Integrase/att cloning system, the protein expressed from the insert DNA contains an extra fragment of at least eight (8) amino acids as a result of the expression of the DNA sequence of the att site, which is flanked by a DNA sequence encoding a tag peptide and the insert DNA. In some cases, these extra amino acids are undesirable because they may affect the efficiency of protein expression as well as the structure and function of the expressed protein. They may also affect the correct folding and/or configuration of the protein, and change the biochemical and biophysical properties of the fusion protein, which is of great concern to investigators.
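The effect described above can be illustrated with a minimal Python sketch. It translates a tag-plus-insert reading frame with and without an in-frame recombination-site remnant between them. All sequences here are hypothetical placeholders chosen for illustration (a His6-like tag, a 24-nucleotide stand-in for a site remnant, and a three-codon insert); they are not the actual att sequences of any commercial system.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA reading frame from its first base up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical sequences (assumptions for illustration only).
START = "ATG"
TAG = "CATCACCATCACCATCAC"                 # six histidine codons
SITE_REMNANT = "GGTTCTGGTGCTGGTTCTGGTGCT"  # 24-nt placeholder for a site remnant
INSERT = "GCTAAAGAATAA"                    # three codons followed by a stop

with_site = translate(START + TAG + SITE_REMNANT + INSERT)
without_site = translate(START + TAG + INSERT)
extra_residues = len(with_site) - len(without_site)

print(with_site)
print(without_site)
print(extra_residues)
```

In this sketch every three in-frame nucleotides of the site remnant contribute one residue to the fusion protein, so a 24-nucleotide remnant adds eight residues between the tag and the protein of interest.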
Another undesirable aspect of the prior art recombination cloning methods relates to the need to synthesize long oligonucleotide primers for the cloning process. Invariably, the gene of interest needs to be amplified using the polymerase chain reaction (PCR) and then cloned into a vector. The PCR primers are engineered to contain, in addition to the gene-specific sequence at the 3′ end, linker sequences at the 5′ end containing the recombinase recognition site. These linker sequences allow for the subsequent site-specific recombination reactions. Because these linker sequences are often dozens of nucleotides long, their synthesis adds considerable cost to the cloning effort. Furthermore, the longer the primer, the higher the likelihood that errors are introduced into it during chemical synthesis.
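The dependence of primer quality on length can be sketched with a commonly used approximation: if each coupling step in chemical synthesis succeeds with a fixed probability, the fraction of full-length product falls geometrically with the number of couplings. The 99% coupling efficiency and the primer lengths below are assumptions for illustration, not figures from the cited references, and the function name is hypothetical.

```python
def full_length_fraction(length_nt, coupling_efficiency=0.99):
    """Approximate fraction of full-length oligonucleotide after synthesis.

    Assumes (length_nt - 1) independent coupling steps, each succeeding with
    probability coupling_efficiency. This is a simplification for illustration.
    """
    if length_nt < 1:
        raise ValueError("length_nt must be at least 1")
    if not 0.0 < coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be in (0, 1]")
    return coupling_efficiency ** (length_nt - 1)


# Hypothetical lengths: a 20-nt gene-specific primer versus the same primer
# extended by a 30-nt linker carrying a recombinase recognition site.
gene_specific = 20
with_linker = gene_specific + 30

for length in (gene_specific, with_linker):
    print(length, round(full_length_fraction(length), 3))
```

Under these assumptions the primer carrying the linker yields a noticeably smaller full-length fraction than the gene-specific primer alone, which is consistent with the cost and error concerns stated above.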
In some of the recombination cloning methods described above, an in vivo step is required for the recombination reaction in order to exchange the insert DNA from one vector to another. In vivo recombination is a very complex process that involves homologous recombination. Because of the DNA recombination repair machinery of the cell, there is a high risk that the genetic information carried by the insert DNA may be altered by these recombination cloning methods.
Furthermore, all of the above recombination cloning and subcloning approaches require at least two separate cloning processes in order to position expression signals and a DNA of interest in an expression vector.
There is therefore a need for alternative and improved methods and vectors for cloning and shuttle cloning or subcloning.