Many applications in molecular biology, and particularly in global gene expression analysis, require separating desired nucleic acid sequences from undesired ones. Often, this separation is based on size differences between the nucleic acids. One such application, in which smaller undesired nucleic acids must be separated from larger desired nucleic acids, is massively parallel sequencing, also known as next generation sequencing (NGS). NGS involves sequencing a large number of reads (in some cases more than 40 billion) per instrument run.
Many different platforms can be used for next generation sequencing, including Roche 454, Roche GS FLX Titanium, Illumina MiSeq, Illumina HiSeq, Illumina Genome Analyzer IIX, Life Technologies SOLiD4, Life Technologies Ion Proton, Complete Genomics, Helicos Biosciences Heliscope, and Pacific Biosciences SMRT. All of these platforms follow the same general procedure: the purified RNA or DNA is prepared into a sequencing library, relatively short sequences are then sequenced in a massively parallel fashion, and subsequent bioinformatic analysis is used to de-multiplex samples and to align, annotate, and aggregate reads, among other things.
To generate a sequencing library, multiple enzymatic reactions are required to modify and/or amplify the original input nucleic acid (RNA or DNA). In particular, fragments of nucleic acids, usually in the form of RNA or DNA oligonucleotides (adapters), are added to both the 5′ and 3′ ends of the template nucleic acid targeted for sequencing. In the highest-quality libraries, all of the resulting reads derive from the targeted RNA or DNA template. In many cases, however, the number of usable reads is drastically reduced because various contaminants are incorporated into the library, including adapter monomers and adapter-adapter ligation products. In addition, because these contaminants are themselves nucleic acids, their presence in the final library preparation could reduce the accuracy of library quantification and, in turn, of the amount of library loaded onto the sequencing platform.
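The effect of adapter-only contaminants on usable read number can be illustrated with a minimal sketch. The Python example below is purely illustrative and is not part of any method described here. It assumes that each library fragment is represented only by its total length, that the combined adapter length is known, and that any fragment whose insert is shorter than a chosen threshold counts as an adapter monomer or adapter-adapter ligation product. The names `summarize_library`, `LibrarySummary`, and `min_insert_len`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LibrarySummary:
    """Counts of fragments in a hypothetical sequencing library."""
    total: int
    adapter_only: int  # adapter monomers or adapter-adapter products
    usable: int        # fragments carrying an insert of interest

    @property
    def usable_fraction(self) -> float:
        # Fraction of fragments expected to yield usable reads.
        return self.usable / self.total if self.total else 0.0


def summarize_library(
    fragment_lengths: Iterable[int],
    adapter_5p_len: int,
    adapter_3p_len: int,
    min_insert_len: int = 1,
) -> LibrarySummary:
    """Classify fragments by inferred insert length (assumed model).

    A fragment is treated as usable when its length, minus both
    adapters, leaves an insert of at least `min_insert_len` bases.
    """
    adapters_len = adapter_5p_len + adapter_3p_len
    total = adapter_only = usable = 0
    for length in fragment_lengths:
        total += 1
        if length - adapters_len >= min_insert_len:
            usable += 1
        else:
            adapter_only += 1
    return LibrarySummary(total=total, adapter_only=adapter_only, usable=usable)


# Hypothetical example: two 60-base adapters and a small RNA library in
# which inserts are only about 20 to 25 bases long.
summary = summarize_library([120, 120, 142, 145, 60], 60, 60, min_insert_len=15)
print(summary.adapter_only, summary.usable, summary.usable_fraction)
```

In this made-up example the adapter-adapter products (120 bases) differ from the desired fragments (142 to 145 bases) by only about 20 bases, which shows why the size-based separation must resolve small differences when the inserts are short.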
Different purification technologies have been used to remove unligated adapter monomers and adapter-adapter ligation products from NGS libraries. Some methods are directed at removing excess adapters prior to cDNA synthesis and PCR amplification, while others are directed at purifying the desired library, which contains the inserts of interest for sequencing, from gels based on size. However, all of these current methods have shortcomings. For example, gel purification usually requires a lengthy workflow followed by a further purification step, and it adds considerable time to library preparation (sometimes overnight). In addition, most existing purification systems may not be able to resolve small size differences in the low molecular weight range of nucleic acids (such as in the case of small RNA sequence ligation steps).