Since Frederick Sanger first determined a genomic sequence, DNA sequencing technologies have developed rapidly for more than 30 years, and three generations of sequencing technology with distinct characteristics have emerged. Among them, second generation sequencing (also called next generation sequencing, abbreviated NGS) is characterized by greatly increased throughput, markedly reduced cost and a much shorter turnaround cycle, and is therefore widely used in basic life science research and in bio-industry applications.
The 21st century has seen tremendous changes in DNA sequencing technology, chiefly a sharp increase in sequencing throughput (the amount of sequence data produced) and a drastic decrease in the cost per base of raw data. Second generation sequencing technology, represented in particular by Roche, Illumina and Life Technologies, occupies most of the sequencing market and relies on sequencing by ligation or sequencing by synthesis, the latter including pyrosequencing and reversible chain termination. Commercially available instruments from these three companies deliver several Gbp of DNA sequence per week in the form of short contiguous fragments, or reads, so that the cost of sequencing is greatly reduced compared with first generation sequencing technology.
As sequencing technologies continue to develop, the trend in the sequencing industry can be summarized as higher throughput, higher accuracy and lower cost. Rapid progress in the first two can be expected as the technologies develop. Although sequencing cost drops every year, however, it remains considerably higher than the widely anticipated target of sequencing a whole human genome for 1000 US dollars (at 30× coverage, about 10 US dollars per Gb). Sample library preparation, the step upstream of sequencing itself, is therefore a key factor in reducing sequencing cost substantially further and represents a main technological direction in the further development of sequencing technologies.
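The per-Gb figure quoted for the 1000 US dollar genome can be checked with simple arithmetic. The sketch below assumes a human genome size of about 3 Gb, a value not stated in the text; the function name `cost_per_gb` is purely illustrative.

```python
# Back-of-the-envelope check of the cost target quoted above.
# Assumption (not from the text): a human genome of about 3 Gb.
GENOME_SIZE_GB = 3.0      # assumed human genome size, in gigabases
COVERAGE = 30.0           # sequencing depth (30x), from the text
TARGET_COST_USD = 1000.0  # target cost per genome, from the text


def cost_per_gb(total_cost_usd: float, genome_size_gb: float, coverage: float) -> float:
    """Return the cost per gigabase of raw data at the given sequencing depth."""
    raw_data_gb = genome_size_gb * coverage  # total raw data needed
    return total_cost_usd / raw_data_gb


if __name__ == "__main__":
    value = cost_per_gb(TARGET_COST_USD, GENOME_SIZE_GB, COVERAGE)
    print(f"{value:.1f} USD per Gb")
```

Under this assumption, 30× coverage corresponds to about 90 Gb of raw data, so the target works out to roughly 11 US dollars per Gb, in line with the "about 10 US dollars per Gb" given above.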
The basic principle of NGS library construction is to break the DNA or RNA of interest randomly into small fragments and to ligate adaptor sequences suited to the sequencing platform. It generally comprises the following central steps: fragmenting the DNA or RNA and selecting fragments of the desired size; converting the fragments into double-stranded DNA; ligating adaptor sequences suited to the sequencing platform; and performing quality control on the resulting library. Library size is one of the most critical technical parameters in NGS library construction.
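The first of these steps, random fragmentation followed by size selection, can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: break points are drawn uniformly at random along a single molecule, and the names `fragment` and `size_select` as well as all numeric values are hypothetical, not taken from any library preparation protocol.

```python
import random


def fragment(length: int, mean_fragment: int, rng: random.Random) -> list[int]:
    """Cut a molecule of `length` bp at uniformly random positions.

    The number of cuts is chosen so that the average fragment is roughly
    `mean_fragment` bp. Returns the fragment lengths in order.
    """
    n_cuts = max(length // mean_fragment - 1, 0)
    # Sample distinct interior cut positions (1 .. length - 1).
    cuts = sorted(rng.sample(range(1, length), n_cuts))
    bounds = [0] + cuts + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def size_select(fragments: list[int], low: int, high: int) -> list[int]:
    """Keep only fragments whose length lies in [low, high] bp."""
    return [f for f in fragments if low <= f <= high]


if __name__ == "__main__":
    rng = random.Random(0)                  # fixed seed for repeatability
    frags = fragment(1_000_000, 400, rng)   # illustrative 1 Mb molecule
    selected = size_select(frags, 300, 500)
    print(f"{len(frags)} fragments, {len(selected)} within 300-500 bp")
```

With uniformly random cuts the fragment lengths are broadly distributed, so only a fraction of them falls inside a given size window, which is why size selection is listed as part of the fragmentation step.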
Two main types of method are used for fragmentation in NGS library construction: the physical method and the enzymatic method.
The physical method mainly uses a Covaris disruptor based on the proprietary Adaptive Focused Acoustics (AFA) technology, in which geometrically focused acoustic energy is applied under isothermal conditions. Acoustic energy with a wavelength of 1 mm is focused onto the sample by a spherical solid-state ultrasonic transducer operating at >400 kHz. The method preserves the integrity of nucleic acid samples and achieves a high recovery rate. Covaris disruptors include the economical M series, the single-tube full-power S series and the higher-throughput E and L series. Fragments obtained by the physical method show good randomness. However, several Covaris disruptors are needed to achieve high throughput, and separate subsequent steps of terminal processing, adaptor ligation, PCR and various purifications are still required. In addition, consumables specific to the Covaris disruptor must be used, which leaves only limited room for cost reduction.
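The wavelength and frequency figures quoted for the transducer are related by wavelength = speed of sound / frequency. The following rough check assumes propagation in water at about 1480 m/s, a value not given in the text; the function names are illustrative only.

```python
# Assumption (not from the text): sound travels through water at ~1480 m/s.
SPEED_OF_SOUND_WATER_M_S = 1480.0


def wavelength_mm(frequency_hz: float, speed_m_s: float = SPEED_OF_SOUND_WATER_M_S) -> float:
    """Return the acoustic wavelength in millimetres for a given frequency."""
    return speed_m_s / frequency_hz * 1000.0


def frequency_hz(wavelength_in_mm: float, speed_m_s: float = SPEED_OF_SOUND_WATER_M_S) -> float:
    """Return the frequency in hertz that gives the stated wavelength."""
    return speed_m_s / (wavelength_in_mm / 1000.0)


if __name__ == "__main__":
    print(f"400 kHz -> {wavelength_mm(400e3):.2f} mm")
    print(f"1 mm    -> {frequency_hz(1.0) / 1e6:.2f} MHz")
```

Under this assumption a 1 mm wavelength corresponds to a frequency of roughly 1.5 MHz, which is compatible with the stated lower bound of 400 kHz.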
The enzymatic method mainly uses NEBNext dsDNA Fragmentase, available from NEB, or the transposase of the Nextera kit, available from Epicentre (since acquired by Illumina). The former first uses DNase I to introduce random nicks into double-stranded DNA and then uses Fragmentase to recognize the nick positions and cleave the complementary strand, thereby breaking the DNA. This reagent can be applied to genomic DNA, whole genome amplification products, PCR products and the like, and provides good randomness. Nevertheless, it generates some artificial short insertions and deletions, and it likewise requires separate subsequent steps of terminal processing, adaptor ligation, PCR and the corresponding purifications. The latter, which uses a transposase, fragments double-stranded DNA and ligates adaptors at the same time, thus shortening sample processing time. However, transposase that remains bound to the target sequence may inhibit subsequent enzymatic reactions, and any purification steps needed to remove it add to the cost and time of library construction.