Different methods for isolating nucleic acids are well-known in the prior art. Such methods involve separating nucleic acids of interest from other sample components, such as, for example, protein contaminants or potentially also other nucleic acids, which are often referred to as non-target nucleic acids. For example, methods for isolating nucleic acids such as DNA from various sample materials by binding them to a silica material in the presence of a chaotropic salt are well-known and established in the prior art. Exemplary methods based on this principle are described e.g. in EP 0 389 063, WO 03/057910, EP 0 757 106, U.S. Pat. No. 6,037,465 and WO 2006/084753.
If a specific nucleic acid of interest is to be isolated from other nucleic acids, the separation process is usually based on differences between the target and the non-target nucleic acids in parameters such as their topology (for example, separating supercoiled DNA from linear DNA), their size (length), chemical differences (e.g. DNA versus RNA) and the like.
For certain applications, size is an important criterion for distinguishing target nucleic acids from non-target nucleic acids. For example, size selective fractionation of DNA is an important step in library construction for next generation sequencing (NGS) applications. Different NGS technologies and methods exist, such as pyrosequencing, sequencing by synthesis and sequencing by ligation. Most NGS platforms share a common technological feature, namely the massively parallel sequencing of clonally amplified or single DNA molecules that are spatially separated in a flow cell or by generation of an oil-water emulsion.
In NGS, sequencing is performed by repeated cycles of polymerase-mediated nucleotide extensions or, in one format, by iterative cycles of oligonucleotide ligation. As a massively parallel process, NGS generates hundreds of megabases to gigabases of nucleotide-sequence output in a single instrument run, depending on the platform. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Therefore, NGS technologies have become a major driving force in genetic research. Several NGS technology platforms have found widespread use, including, for example, Roche/454, Illumina Solexa Genome Analyzer, the Applied Biosystems SOLiD™ system, Ion Torrent™ semiconductor sequence analyzer, PacBio® real-time sequencing and Helicos™ Single Molecule Sequencing (SMS). NGS technologies, NGS platforms and common applications/fields for NGS technologies are reviewed e.g. in Voelkerding et al. (Clinical Chemistry 55:4 641-658, 2009) and Metzker (Nature Reviews/Genetics Volume 11, January 2010, pages 31-46).
Besides performing sequencing in a massively parallel manner, NGS technology platforms have in common that they require the preparation of a sequencing library which is suitable for massively parallel sequencing. Examples of such sequencing libraries include fragment libraries, mate-paired libraries or barcoded fragment libraries. Most platforms adhere to a common library preparation procedure with minor modifications before a “run” on the instrument. This procedure includes fragmenting the DNA (which may also be obtained from cDNA), e.g. by mechanical shearing (such as sonication, hydro-shearing, ultrasound or nebulization) or by enzymatic fragmentation, followed by DNA repair and end polishing (blunt end or A overhang) and, finally, platform-specific adapter ligation. The preparation and design of such sequencing libraries is also described e.g. in Voelkerding, 2009 and Metzker, 2010.
In order to ensure high quality sequencing data, efficient library preparation methods are needed. Furthermore, to reduce the background in the sequencing reads, it is important to remove DNA contaminants that might be present in the sequencing library as a result of the library preparation. Examples of such DNA contaminants are adapter monomers and adapter-adapter ligation products, which are often present in the sequencing library after adapter ligation. These contaminating small DNA molecules must be removed prior to sequencing.
To ensure efficient adapter ligation, adapters are used in excess during adapter ligation. Thus, after adapter ligation, unligated adapter monomers and adapter-adapter ligation products such as adapter dimers are present in addition to the adapter ligated DNA molecules. It is important to remove these unligated adapter monomers and adapter-adapter ligation products from the adapter ligated DNA molecules. Otherwise, they will use up sequencing capacity, thereby diminishing the power available to investigate sequences of interest: if the sequencing library comprises considerable amounts of unligated adapter monomers and adapter dimers, valuable sequencing resources are wasted. Removing them therefore increases the value of the downstream sequencing. Unligated adapter monomers and adapter-adapter ligation products are usually removed by a size selective purification of the larger adapter ligated DNA molecules, which contain the DNA fragments to be sequenced.
Several approaches have been developed in the prior art in order to isolate DNA of a specific target size or a specific target size range. These size selection methods can be used to remove adapter dimers and monomers, as these DNA contaminants are smaller than the adapter ligated DNA molecules. A classic method for isolating DNA of a target size involves separating the DNA in a gel, cutting out the desired gel band(s) and then isolating the DNA of the target size from the gel fragment(s). Such gel based size selection methods are recommended in many next generation sequencing library preparation protocols in order to remove adapter monomers and dimers. However, these methods are time consuming, as the portion of the gel containing the nucleic acids of interest must be manually cut out and then treated to degrade the gel or otherwise extract the DNA of the target size from the gel slice.
Another widely used technology is size selective precipitation with polyethylene glycol based buffers (Lis and Schleif, Nucleic Acids Res. 1975 March; 2(3):383-9) or binding/precipitation on carboxyl-functionalized beads (DeAngelis et al., Nucleic Acids Res. 1995, Vol 23(22), 4742-3; U.S. Pat. Nos. 5,898,071 and 5,705,628, commercialized by Beckman-Coulter (AMPure XP; SPRIselect) and U.S. Pat. No. 6,534,262). Although this approach has become established as the “gold standard” for size selection in NGS, the procedure is time consuming and cumbersome, especially when performed manually. Polyethylene glycol based isolation methods are particularly disadvantageous because the highly viscous polyethylene glycol may hamper efficient washing. In addition, there is a risk of bead carry-over, which may adversely affect downstream reactions such as subsequent enzymatic reactions. Size selection methods that are based on the use of titratable anion exchange compositions and pH gradients are described e.g. in WO 03/080834.
The prior art shows that there is an increasing interest in, and need for, methods allowing the size selective isolation of DNA molecules, in particular of DNA molecules having a certain minimum size. In particular, there is a need for simple and efficient methods for isolating DNA of a specific minimum size that can be integrated into existing next generation sequencing library preparation protocols. Furthermore, there is a need for fast, simple and reliable methods for removing unligated adapter monomers and adapter dimers from adapter ligation samples, in particular adapter ligation samples obtained in the preparation of a sequencing library.
Therefore, it is an object of the present invention to provide a method for isolating DNA of a target size or a target size range from a sample comprising DNA molecules of different sizes. In particular, it is an object of the present invention to provide a method that allows adapter ligated DNA molecules to be separated from unligated adapter monomers and adapter-adapter ligation products based on the larger size of the adapter ligated DNA molecules. In particular, it is an object to provide such methods that are fast and reliable and that can be integrated into the workflow of next generation sequencing library preparation protocols.