1. Field of the Invention
The present invention relates to a fast method for the determination of a sequence of a nucleic acid, DNA or RNA, which is useful, in particular, for the sequencing of an unknown nucleic acid or alternatively for the detection of a specific nucleic acid sequence for diagnosis.
2. Description of Related Art
The determination of nucleic acid sequences is now at the heart of molecular biology. For example, a broad range of biological phenomena can be assessed by high-throughput DNA sequencing, e.g., genetic variation, RNA expression, protein-DNA interactions and chromosome conformation (see, for a few examples, Mitreva & Mardis, Methods Mol. Biol., 533:153-87, 2009; Mardis, Genome Med., 1(4): 40, 2009; Cloonan et al., Nat. Methods, 5(7): 613-619, 2008; Valouev et al., Genome Res., 18(7):1051-63, 2008; Valouev et al., Nat. Methods, 5(9):829-34, 2008; Orscheln et al., Clin. Infect. Dis., 49(4):536-42, 2009; Walter et al., Proc. Natl. Acad. Sci. USA, 106(31):12950-5, 2009; Mardis et al., N. Engl. J. Med., 361(11):1058-66, 2009; Hutchinson, Nucl. Acids Res., 35(18): 6227-6237, 2007).
In addition, demonstrating the presence of a specific DNA sequence in a physiological sample is, at the present time, the major line of development of diagnostic methods, e.g. for identifying the probability of bacteria developing antibiotic resistance, genetic abnormalities, the risks of cancer associated with genetic modifications, and viral infections, for example infections associated with HIV or with hepatitis viruses (see for example Zhang et al., Nature, 358: 591-593, 1992; Turner et al., J. Bacteriol., 176(12):3708-3722, 1994; Weston et al., Infection and Immunity, 77(7):2840-2848, 2009).
Nucleic acid sequencing is nowadays carried out chiefly with capillary-based, semi-automated implementations of the Sanger biochemistry. The classical method comprises a step of amplification of the DNA of interest, followed by a step of ‘cycle sequencing’, wherein each round of primer extension is stochastically terminated by the incorporation of fluorescently labelled dideoxynucleotides (ddNTPs). The sequence is determined by high-resolution electrophoretic separation of the single-stranded, end-labelled extension products in a capillary-based polymer gel. Simultaneous electrophoresis in 96 or 384 independent capillaries provides only a limited level of parallelization.
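By way of illustration only, the read-out principle of chain-termination sequencing described above can be sketched in a few lines of Python. This is a simplified toy model and not part of any cited method: the function names, the per-position termination probability `p_terminate` and the number of simulated molecules are assumptions introduced here, and the model ignores signal noise and gel resolution.

```python
import random

def sanger_fragments(synthesized, n_molecules=5000, p_terminate=0.2, seed=1):
    """Simulate stochastic chain termination (hypothetical toy model).

    `synthesized` is the strand being built by primer extension. Each
    simulated molecule stops at a random position when a labelled ddNTP
    is incorporated; the fragment is recorded as (length, terminal_base).
    Molecules that reach the end without terminating carry no label and
    are ignored.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        for length, base in enumerate(synthesized, start=1):
            if rng.random() < p_terminate:
                fragments.append((length, base))
                break
    return fragments

def read_sequence(fragments):
    """Order fragments by length (the role of electrophoresis) and read
    the terminal label found at each length."""
    label_by_length = {}
    for length, base in fragments:
        label_by_length[length] = base
    # A length with no fragment would simply be missing from the read.
    return "".join(label_by_length[k] for k in sorted(label_by_length))

fragments = sanger_fragments("ACGTTGCAAGCT")
print(read_sequence(fragments))
```

With enough simulated molecules every fragment length is represented, so reading the terminal labels in order of increasing length recovers the synthesized strand.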
The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing thousands or millions of sequences at once (Shendure & Ji, Nat. Biotechnol., 26(10):1135-45, 2008). High-throughput sequencing technologies are intended to lower the cost of DNA sequencing below what is possible with standard dye-terminator methods. At present this very high throughput is achieved with substantial sacrifices in length and accuracy of the individual reads when compared to Sanger sequencing.
Examples of such new methods include the 454 and the Solexa technologies. These technologies allow shotgun sequencing of whole genomes without cloning in E. coli or any host cell. Libraries of short, adaptor-flanked DNA fragments captured on the surface of beads are amplified by emulsion PCR. Sequencing is carried out using primed synthesis by DNA polymerase. In the 454 method (also known as ‘pyrosequencing’), the array is presented with each of the four dNTPs, sequentially, and the amount of incorporation is monitored by luminometric detection of the pyrophosphate released. A key difference between this method and the Solexa method is that the latter uses chain-terminating nucleotides. The fluorescent label on the terminating base can be removed to leave an unblocked 3′ terminus, making chain termination a reversible process. The SOLiD technology relies on the ligation of fluorescently labeled di-base probes to a sequencing primer hybridized to an adaptor sequence within the clonally-amplified library template. Specificity of the di-base probe is achieved by interrogating every 1st and 2nd base in each ligation reaction. Multiple cycles of ligation, detection and cleavage are performed, with the number of cycles determining the eventual read length. In contrast to the three previous technologies, which all require a first step of amplification, the Helicos platform allows the sequencing of single DNA molecules. This technology is based on a highly sensitive system for detecting the incorporation of fluorescent nucleotides, which directly interrogates single DNA molecules via sequencing by synthesis.
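By way of illustration only, the sequential presentation of dNTPs in pyrosequencing can be sketched as follows. The sketch assumes an idealized signal exactly proportional to the number of nucleotides incorporated in each flow; the function names and the flow order `"TACG"` are assumptions introduced here and are not taken from the cited documents.

```python
from itertools import cycle

def flowgram(synthesized, flow_order="TACG"):
    """Return the idealized signal for each nucleotide flow.

    `synthesized` is the strand built by the polymerase. In each flow one
    dNTP is presented; the signal is the number of consecutive positions
    that incorporate it (the homopolymer run length), or 0 if the next
    base does not match.
    """
    if set(synthesized) - set(flow_order):
        raise ValueError("sequence contains a base that is never flowed")
    signals = []
    pos = 0
    flows = cycle(flow_order)
    while pos < len(synthesized):
        base = next(flows)
        run = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals

def decode(signals):
    """Rebuild the sequence from (base, signal) pairs."""
    return "".join(base * run for base, run in signals)

flows = flowgram("TTACG")
print(flows)          # [('T', 2), ('A', 1), ('C', 1), ('G', 1)]
print(decode(flows))  # TTACG
```

In practice the light signal is noisy, so the length of long homopolymer runs is harder to call than this idealized model suggests.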
Such methods are described in e.g. U.S. Pat. No. 4,882,127; U.S. Pat. No. 4,849,077; U.S. Pat. No. 7,556,922; U.S. Pat. No. 6,723,513; PCT Patent Application No. WO 03/066896; PCT Patent Application No. WO 2007/111924; U.S. Patent Application No. US 2008/0020392; PCT Patent Application No. WO 2006/084132; U.S. Patent Application No. US 2009/0186349; U.S. Patent Application No. US 2009/0181860; U.S. Patent Application No. US 2009/0181385; U.S. Patent Application No. US 2006/0275782; European Patent EP-B1-1141399; Shendure & Ji, Nat. Biotechnol., 26(10):1135-45, 2008; Pihlak et al., Nat. Biotechnol., 26(6): 676-684, 2008; Fuller et al., Nature Biotechnol., 27(11): 1013-1023, 2009; Mardis, Genome Med., 1(4): 40, 2009; Metzker, Nature Rev. Genet., 11(1): 31-46, 2010.
However, all the methods developed so far suffer from serious drawbacks. In particular, they all make use of labelled nucleotides (e.g. fluorescent nucleotides), which seriously increases the overall cost. Moreover, all these new methods bar one (the Helicos platform) require amplification of the target sequence prior to sequencing, a step which is time-consuming, increases the probability of errors, and is highly prone to contamination.