Field of the Invention
The present invention relates to the field of molecular biology. More specifically, the invention relates to compositions having a transposase complexed with an oligonucleotide adapter in crude cell lysates or bound to a solid support by way of a specific binding pair linkage, to use of the transposase/oligonucleotide complex in purifying the complex, and to use of the complex to prepare DNA molecules for in vitro amplification, sequencing of nucleic acids, and screening of DNA libraries for sequences of interest.
Description of Related Art
Fragmentation of genomic DNA is a crucial step in DNA sample preparation for high-throughput sequencing, also referred to as next generation sequencing or NGS. Early sample preparation methods, such as DNA fragmentation using DNase I, are very unreliable and often result in DNA fragmentation that is either insufficient or too extensive. In either case, the yield of DNA fragments of useful size (about 200-800 base pairs (bp)) is low. DNA shearing using sonicators, for example E220 and E220x instruments from Covaris (Woburn, Mass.), provides an alternative. However, such instruments are very expensive (over $100,000 in 2012 prices), and sample preparation based on DNA shearing is a laborious, multi-stage process. It involves DNA fragmentation, fragment end repair, a first fragment purification, poly-A tailing, adapter ligation, a second fragment purification, PCR amplification, and a third fragment purification. The number of steps can be cut in half using oligonucleotide-transposase complexes, such as those in the NEXTERA™ DNA sample prep kit from Illumina (San Diego, Calif.). The oligonucleotide-transposase complex provided with the kit can effect both controlled DNA fragmentation and attachment of adapters in a single reaction, which takes only a few minutes. Such complexes comprise a dimer of modified Tn5 transposase and a pair of Tn5-binding double-stranded DNA (dsDNA) oligonucleotides containing a 19 bp transposase-binding sequence, or inverted repeat sequence (IR). In the NEXTERA™ system, an engineered, non-native 19 bp transposase-binding sequence is used, which provides more efficient DNA fragmentation than the native Tn5 IR sequence. This binding sequence is referred to as “mosaic”.
Unlike DNase I, a single molecule of which can generate numerous breaks in a target DNA, the transposase complex is believed to create only one DNA cleavage per complex. Therefore, unlike with DNase I, the degree of DNA fragmentation is easily controlled during transposase fragmentation by controlling the ratio of transposase complex to target DNA in the reaction mixture. Furthermore, specific nucleotide tags combined with the mosaic sequence can be attached in this transposase-mediated DNA fragmentation process, which is useful for amplifying the DNA by PCR and for attaching the DNA fragments to sequencing chips. Despite obvious advantages in cost, time, and labor, the transposase method is used less frequently than sonication because its fragmentation of target DNA is not entirely random (bias).
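The relationship between the complex-to-DNA ratio and fragment size can be illustrated with a rough calculation. The following is a minimal sketch only, assuming an idealized case that the text above does not guarantee: a linear target molecule, exactly one cut per complex, every complex active, and cut positions that do not coincide. The function names are hypothetical and chosen for illustration.

```python
def mean_fragment_length_bp(dna_length_bp: float, complexes_per_molecule: float) -> float:
    """Mean fragment length for one linear DNA molecule.

    Assumption (illustrative): each complex makes exactly one cut, so
    n complexes produce n cuts and therefore n + 1 fragments.
    """
    if dna_length_bp <= 0 or complexes_per_molecule < 0:
        raise ValueError("length must be positive and complex count non-negative")
    return dna_length_bp / (complexes_per_molecule + 1)


def complexes_needed(dna_length_bp: float, target_mean_bp: float) -> float:
    """Complexes per DNA molecule needed to reach a target mean fragment length."""
    if dna_length_bp <= 0 or target_mean_bp <= 0:
        raise ValueError("lengths must be positive")
    return dna_length_bp / target_mean_bp - 1


# Example: a 50,000 bp molecule cut by 99 complexes gives 100 fragments.
mean_bp = mean_fragment_length_bp(50_000, 99)
needed = complexes_needed(50_000, 500)
```

Under these assumptions, raising the ratio of complex to target DNA shortens the mean fragment, and lowering it lengthens the mean fragment. Real reactions deviate from this estimate because of insertion bias and inactive complexes.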
To date, the only transposase that is known to be suitable for DNA fragmentation and tagging in NGS is a modified Tn5 transposase. From the outset, Tn5 transposase has been problematic in several respects. First of all, the native transposase was practically impossible to produce, as it is toxic to E. coli when expressed from a strong promoter. However, this difficulty was overcome by deleting several N-terminal amino acids (Weinreich et al., J. Bacteriol., 176: 5494-5504, 1994). Though this solved the toxicity problem, and the N-terminally truncated transposase was produced at high yield, it possessed very low activity. Therefore, several other mutations were introduced to increase its activity (U.S. Pat. No. 5,965,443; U.S. Pat. No. 6,406,896 B1; U.S. Pat. No. 7,608,434). However, this did not solve all of the problems with the enzyme. For example, the mutated enzyme is stable only in high salt, such as 0.7 M NaCl (Steiniger et al., Nucl. Acids Res., 34: 2820-2832, 2006); it quickly loses its activity at the lower salt concentrations that are required for the transposase reaction, with a half-life of only 2.4 minutes in the reaction mixture. Thus, DNA fragmentation reactions using this transposase are typically performed in five minutes, and very large amounts of enzyme are used. Even though a high salt concentration is maintained throughout the purification process, the purified enzyme is largely inactive; thus, a 9.4-fold excess of enzyme over oligonucleotides is typically used to form Tn5 transposase-oligonucleotide complexes (Naumann and Reznikoff, J. Biol. Chem., 277: 17623-17629, 2002). In addition, the transposase is prone to proteolytic degradation. To address this problem, the degradation-prone sites were mutated. Interestingly, these mutations resulted in a drastic reduction of the in vivo activity of the enzyme, but had little effect on the in vitro activity (Twining et al., J. Biol. Chem., 276: 23135-23143, 2001).
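To put the 2.4 minute half-life cited above in perspective, the fraction of enzyme still active at a given time can be estimated. This is a minimal sketch, assuming simple first-order (exponential) loss of activity, which the cited half-life suggests but which is not stated explicitly here. The function name is hypothetical.

```python
def fraction_active(elapsed_min: float, half_life_min: float = 2.4) -> float:
    """Fraction of enzyme activity remaining after elapsed_min minutes.

    Assumption (illustrative): activity decays exponentially, so the
    remaining fraction is 0.5 raised to the number of elapsed half-lives.
    """
    if elapsed_min < 0 or half_life_min <= 0:
        raise ValueError("elapsed time must be non-negative and half-life positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Remaining activity at the end of a typical five-minute reaction.
remaining_after_reaction = fraction_active(5.0)  # roughly 0.24
```

Under this assumption, less than a quarter of the initial activity remains after five minutes, which is consistent with the short reaction times and large enzyme amounts described above.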
Overall, Tn5 transposase is difficult to produce, it is required in large amounts, and it is very expensive.
However, as yet, no one has provided an alternative technology. It is generally believed that native, unmutated transposases are inherently inactive because high activity would be incompatible with host cell survival in the environment (Reznikoff, Mol. Microbiol., 47: 1199-1206, 2003). Because native transposases are believed to possess low activity, they would be unsuitable for NGS sample preparation. In view of the fact that it took many years of mutagenesis and biological selection to render purified Tn5 transposase active, the task of providing another transposase that has suitable activity seems problematic. For example, in an attempt to construct a superactive SB transposase for modification of eukaryotic cells, almost every single amino acid in it was mutated, small blocks of amino acids from related transposases were imported, and systematic alanine scanning and rational replacement of selected amino acid residues were applied (Ivics and Izsvák, Mobile DNA, 1:25, 1-15, 2010). However, this effort resulted in variants with only modest increases in activity. Only a high-throughput approach for combining such variants resulted in a variant with the desired activity. An additional difficulty in obtaining a suitable transposase is that, even assuming that a native transposase is sufficiently active for in vitro manipulations, transposase activity might be lost during purification, when the enzyme is subjected to the unnatural environments that are typically employed in conventional protein purification, e.g., high or low salt, alkaline or acidic pH, detergents, attachment to resins, absence of putative co-factors, etc.
As discussed below, the inventor addressed and solved the problems in the art by devising a new process for obtaining purified, active transposases. The solution obviates the need for conventional transposase purification by first forming the complex of transposase with oligonucleotides in crude cell lysates, which provide a more physiological environment than those employed in prior schemes and are gentler on transposase activity, and then purifying the complex. An advantage of this approach is that transposase complexes with oligonucleotides are formed prior to transposase purification. Another advantage of this approach is that it avoids the expensive and time-consuming process of transposase purification seen in other technologies. Furthermore, attaching a transposase complex to a solid support, such as plates or beads, provides a technical solution for high-throughput plate or bead format sample preparation for NGS.