Vectors such as cosmids, yeast artificial chromosomes (YACs), and bacterial artificial chromosomes (BACs) permit the construction of large-insert genomic DNA libraries. Such libraries have played a pivotal role in the isolation and characterization of important genomic regions and genes from a variety of organisms, including bacteria, archaea, and mammals. The BAC system is emerging as the system of choice for constructing libraries with DNA inserts of up to 300 kilobases. A major advantage of BACs is that plasmids containing large inserts can be efficiently introduced into E. coli by electroporation and propagated there. The low copy number of the BAC vector (1-2 per cell) is thought to contribute to the stability of large BACs over many generations, as compared to multi-copy counterparts (Kim et al., NAR, 20(5):1083-1085). The popular BAC vector pBeloBAC11 (Research Genetics) is derived from the endogenous E. coli F plasmid. The F backbone contains four essential regions that play a role in plasmid stability and copy number. Both parA and parB are required for partitioning and plasmid stability functions; parB is also required for incompatibility with other F factors. OriS is the origin of F plasmid DNA replication, which is unidirectional. repE encodes protein E, which is essential for replication from OriS and for copy number control. A chloramphenicol resistance gene was incorporated for antibiotic selection of transformants. pBeloBAC11 also carries the lacZ gene, so identification of recombinant DNA clones is simplified by blue/white screening. The most widely used E. coli strain for BAC cloning is DH10B (Grant et al. 1990. PNAS 87:4645).
Key features of this strain include mutations that block: 1) restriction of foreign DNA by endogenous restriction endonucleases (hsdRMS); 2) restriction of DNA containing methylated residues (5′ methyl cytosine or methyl adenine residues, and 5′ hydroxymethyl cytosine) (mcrA, mcrB, mcrC, and mrr); and 3) recombination (recA1).
BAC plasmids are most commonly used for genome mapping, positional cloning, and DNA sequencing. One can also analyze expression of heterologous activities encoded by a BAC insert. Whereas the single-copy nature of BAC vectors contributes to insert stability, this same property is usually a liability for purifying and sequencing BAC DNA. A large culture volume is needed to obtain enough plasmid DNA for conventional uses. The large volume introduces significant chromosomal DNA contamination into plasmid preparations, which often interferes with subsequent manipulations of the vector, including DNA sequencing reactions. To minimize co-purification of chromosomal DNA, conventional DNA isolation protocols must be considerably modified, and the modified protocols are not easily adapted to high-throughput plasmid DNA isolation and sequencing.
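By way of illustration only, the effect of copy number on recoverable plasmid DNA can be estimated with a short calculation. The sketch below is not part of the described methods. It assumes an average mass of about 650 g/mol per base pair, a hypothetical 150 kb BAC, about 1e9 cells per mL of culture, a 4.6 Mb E. coli chromosome present at one copy per cell, and a hypothetical copy number of 50 after introduction of a higher copy origin. All of these figures are assumptions chosen for the example, and actual values vary with strain and growth conditions.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # approximate average mass of one base pair (assumption)


def plasmid_mass_ng(plasmid_bp, copies_per_cell, cells):
    """Approximate total plasmid DNA mass, in nanograms, carried by `cells` cells."""
    molecules = copies_per_cell * cells
    grams = molecules * plasmid_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams * 1e9


def plasmid_fraction(plasmid_bp, copies_per_cell, chromosome_bp=4.6e6):
    """Fraction of total cellular DNA mass that is plasmid.

    Assumes one chromosome equivalent per cell; growing cells may carry more.
    """
    plasmid = plasmid_bp * copies_per_cell
    return plasmid / (plasmid + chromosome_bp)


# Hypothetical example values (not taken from the specification).
BAC_BP = 150_000
CELLS_PER_ML = 1e9

for copies in (1, 50):
    mass = plasmid_mass_ng(BAC_BP, copies, CELLS_PER_ML)
    frac = plasmid_fraction(BAC_BP, copies)
    print(f"{copies:>2} copies/cell: ~{mass:.0f} ng plasmid per mL, "
          f"{frac:.1%} of total DNA")
```

Under these assumptions, a single-copy BAC accounts for only a few percent of the DNA in the cell, which is consistent with the chromosomal contamination problem described above, whereas a higher copy number raises both the absolute yield and the plasmid fraction.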
An additional potential liability of the single copy BAC vector relates to expression of heterologous DNA in E. coli. Expression can be limited by the presence of only a single plasmid copy per cell, especially if expression relies on foreign promoters present in the heterologous insert.
Our invention provides methods that facilitate 1) cloning of large inserts into BAC plasmids, 2) isolation of large amounts of BAC DNA (by increasing plasmid copy number), and 3) increased heterologous expression from BAC plasmid inserts (by increasing plasmid copy number and/or introducing promoters into the insert).
Cloning and sequencing of large DNA fragments has become increasingly necessary as more researchers enter the field of genomics. Although many vectors and tools are available for these tasks, such vectors are often low copy so that the large DNA inserts are stably maintained within the vector. A major impediment to the use of low copy number vectors is the difficulty in preparing large quantities of vector for cloning and sequencing. In particular, automated sequencing techniques are not adapted for use with low copy vectors. Expression of gene products encoded by large DNA inserts may also suffer due to the low copy number of the vectors. The invention described herein provides novel vectors for improving cloning, sequencing and expression of DNA inserts in low copy vectors. In one aspect, the invention provides a vector for increasing the copy number of plasmids, comprising a transposable element that contains a moderate or high copy number origin of replication and is capable of in vitro transposition into a target plasmid. The target plasmid is a single or low copy plasmid, e.g. a BAC vector, that is useful for cloning large pieces of DNA. The transposon plasmid may contain any moderate or high copy origin of replication that is compatible with a bacterial host such as E. coli. An exemplary ori is the ColE1 ori from pBR322. Expression of gene products encoded by the DNA inserts is facilitated by addition of a transcription control sequence to the transposable element. In certain embodiments, the transcription control sequence is the T7 promoter, which is functional in cells expressing the T7 RNA polymerase. Other promoters that are useful for increasing expression of cloned genes include endogenous bacterial promoters.
The vectors may further comprise one or more antibiotic resistance genes, such as those for ampicillin, tetracycline or kanamycin. In addition, they may contain a counterselectable marker, such as the sacB gene from B. subtilis, to ensure that only transformants which take up the target plasmid will survive.
The vector components described above may be combined in a number of ways to provide novel vectors. For example, one such vector may comprise (a) a transposable element containing a high copy number origin of replication, (b) an antibiotic resistance gene and (c) a counterselectable marker. Other vectors may contain a transcription control sequence in addition to the above components. One exemplary vector is pTRANS-sacB, which contains (a) a transposable element containing a pBR322 origin of replication, (b) a kanamycin resistance gene, (c) a B. subtilis sacB gene, and (d) a T7 promoter.
Another possible combination of components is found in a vector comprising (a) a transposable element containing a high copy number origin of replication, (b) an antibiotic resistance gene, and (c) a transcription control sequence. An exemplary vector of this type is pTRANS, which contains (a) a transposable element containing a pBR322 origin of replication, (b) a kanamycin resistance gene, and (c) a T7 promoter.
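For illustration only, the component combinations recited for the two exemplary vectors can be summarized as a simple data structure. The following sketch records only the components named above for pTRANS and pTRANS-sacB. The class name, field names, and helper function are hypothetical and are not part of the invention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransposonVector:
    """Hypothetical record of the components recited for a transposon vector."""
    name: str
    origin: str                              # ori carried within the transposable element
    resistance: str                          # antibiotic resistance gene
    promoter: Optional[str] = None           # transcription control sequence, if any
    counterselectable: Optional[str] = None  # counterselectable marker, if any


# Components as described in the text for the two exemplary vectors.
PTRANS = TransposonVector(
    name="pTRANS",
    origin="pBR322 origin of replication",
    resistance="kanamycin",
    promoter="T7",
)
PTRANS_SACB = TransposonVector(
    name="pTRANS-sacB",
    origin="pBR322 origin of replication",
    resistance="kanamycin",
    promoter="T7",
    counterselectable="B. subtilis sacB",
)


def describe(vector: TransposonVector) -> str:
    """Return a one-line summary listing only the components that are present."""
    parts = [
        f"transposable element containing a {vector.origin}",
        f"{vector.resistance} resistance gene",
    ]
    if vector.counterselectable:
        parts.append(f"{vector.counterselectable} gene")
    if vector.promoter:
        parts.append(f"{vector.promoter} promoter")
    return f"{vector.name}: " + ", ".join(parts)


for v in (PTRANS_SACB, PTRANS):
    print(describe(v))
```

In this representation the two exemplary vectors differ only in the presence of the counterselectable marker.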
The invention also provides methods for using such transposon plasmids.
For example, the invention provides a method for increasing the copy number of a target plasmid comprising: mixing, in vitro, the target plasmid with any of the vectors described above under conditions permitting introduction of the high copy number origin of replication into the target plasmid.
As mentioned, sequencing from BAC and other low copy vectors is difficult due to the necessity of using large numbers of cells to obtain sufficient DNA for sequencing. The invention thus provides a method for sequencing a gene in a low copy number target plasmid, comprising mixing, in vitro, the target plasmid with a transposon vector of this invention, transforming host cells with the mixture, and determining the sequence of genes isolated from selected transformants. Transformants in which the transposon has been introduced into a useful locus in the target plasmid may be identified by detecting a phenotypic change in clones transformed with the mixture relative to clones transformed with the BAC vector alone. Phenotypic changes that may be observed include an increase or decrease in gene expression.
Vectors containing transcription control sequences may be used to increase expression of a gene in a target plasmid by mixing such vectors in vitro with a target plasmid and then transforming the mixture into cells capable of recognizing the transcription control element and expressing the gene. For example, a target plasmid into which a transposon containing a T7 promoter has been introduced may be transformed into cells expressing T7 polymerase.
The plasmids of this invention also facilitate full length cloning of genes, e.g. those isolated from a plurality of organisms or from a genomic source. The method for full length cloning of genes comprises mixing a BAC library with a transposon plasmid of this invention to increase the copy number of the plasmids, and then isolating large amounts of DNA and cloning full length genes.
Another use for these plasmids is to generate shuttle vectors without cloning. The invention provides a method for generating a shuttle vector comprising mixing, in vitro, a target plasmid with a vector comprising a transposable element containing an origin of replication for a host different from that of the target plasmid, under conditions permitting transposition of the ori into the target plasmid. If desired, the ori may be a moderate or high copy number ori.
In another aspect, the invention provides improved BAC vectors that facilitate cloning of large DNA fragments into low copy vectors. These improved BAC vectors comprise a high copy origin of replication flanked by cleavage sites for a restriction enzyme, wherein cleavage of the vector with the restriction enzyme leaves single base extensions for cloning and removes the high copy origin of replication. In some embodiments, the vectors further comprise a BstXI site. An exemplary vector of this type is pBacTA.PUC2.