DNA sequencing is an important analytical technique critical to generating genetic information from biological organisms. The increasing availability of rapid and accurate DNA sequencing methods has made possible the determination of the DNA sequences of entire genomes, including the human genome. DNA sequencing has revolutionized the field of molecular biological research. In addition, DNA sequencing has become an important diagnostic tool in the clinic, where the rapid detection of a single DNA base change or a few base changes can be used to detect, for example, a genetic disease or cancer.
Most current methods of DNA sequencing are based on the method of Sanger (Proc. Natl. Acad. Sci. U.S.A., 74, 5463 (1977)). This method relies on gel electrophoresis of single stranded nucleic acid fragments that are generated when a polymerization-extension reaction of a primer is terminated by incorporation of a radioactively labeled dideoxynucleotide triphosphate. Short strands of DNA are synthesized under conditions that produce DNA fragments of variable length using a DNA polymerase and deoxynucleotide triphosphates (dNTPs). A small amount of dideoxynucleotide triphosphates (ddNTPs) is introduced into the DNA synthesis mixture so that chain terminating ddNTPs are sometimes integrated into a growing strand. Typically, four different extension reactions are performed side by side, each including a small amount of one ddNTP. Each extension reaction produces a mixture of DNA fragments of different lengths terminated by a known ddNTP. The ratio of ddNTPs to dNTPs is chosen so that the population of DNA fragments in any given extension reaction includes fragments of all possible lengths (up to some maximum) terminating with the relevant ddNTP. The nucleic acid fragments are separated by length in the gel, typically utilizing a different lane in a polyacrylamide gel for each of the four terminating nucleotide bases being detected. However, such size-based electrophoretic separation is generally a low resolution method limited to reading short sequences.
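By way of illustration only, the way a sequence is read from the four lanes can be sketched in Python. The sketch is idealized and is not taken from the Sanger reference: it assumes the template is written 3′ to 5′, that every possible termination length is present in each lane, and that the gel resolves fragments differing by a single base. The function names are hypothetical.

```python
# Idealized sketch of reading a four-lane chain-termination gel.
# Assumptions (not from the source): template is written 3'->5', every
# possible termination length is represented, and bands are perfectly resolved.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template):
    """Return the strand the polymerase would build (5'->3') on the template."""
    return "".join(COMPLEMENT[base] for base in template)


def lane_fragment_lengths(template):
    """For each ddNTP lane, list the fragment lengths that end in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized_strand(template), start=1):
        # A fragment of this length exists only in the lane whose ddNTP
        # matches the base incorporated at this position.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from shortest to longest across all four lanes."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = lane_fragment_lengths("TACGGTCA")
print(lanes)
print(read_gel(lanes))
```

Reading the bands in order of increasing fragment length recovers the synthesized strand, which is the complement of the template.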
A variation of this method utilizes dyes rather than radioactivity to label the ddNTPs. Different dyes are used to uniquely label each of the different ddNTPs (i.e., a different dye may be associated with each of A, G, C, and T termination) (Smith et al.; Prober et al., Science 238:336-341, 1987). In the method of Smith, fluorescent dyes are attached to the 3′ end of the dNTP, converting it into a ddNTP. The use of four different dye labels allows the entire sequencing reaction to be conducted in a single reaction vessel and results in a more uniform signal response for the different DNA fragments. The dye-terminated fragments can also be electrophoresed in a single lane. The advent of capillary electrophoresis further increased the separation efficiency of this method, allowing shorter run times, longer reads, and higher sensitivity.
Despite these advances, DNA sequencing methods that rely on electrophoresis to resolve DNA fragments according to their size are limited by the rate of the electrophoresis and the number of bases that are detectable on the gel. In addition, real-time imaging of the gel is not possible. Accordingly, in order to increase the speed and reliability of the sequencing reaction, great effort has been made to automate these steps. Automated DNA sequencing machines are now available that are capable of high throughput sequencing for both genomic sequencing and routine clinical applications. However, these newer techniques remain cumbersome, requiring specialized chemicals and the intensive labor of skilled technicians.
One newer method of DNA sequencing, “pyrosequencing” or “sequencing-by-synthesis,” disclosed in WO 98/13523, is based on the concept of detecting inorganic pyrophosphate (PPi), which is released during a polymerase reaction. As in the Sanger method, a sequencing primer is hybridized to a single stranded DNA template and incubated with a DNA polymerase. In addition to the polymerase, the enzymes ATP sulfurylase, luciferase, and apyrase, and the substrates adenosine 5′ phosphosulfate (APS) and luciferin, are added to the reaction. Subsequently, individual nucleotides are added. When the added nucleotide is complementary to the next available base in the template strand, it is incorporated into the extension product. Such incorporation of a complementary base is accompanied by release of pyrophosphate (PPi), which is converted to ATP by ATP sulfurylase in the presence of adenosine 5′ phosphosulfate, in a quantity equimolar to the amount of incorporated nucleotide. The ATP generated by the ATP sulfurylase reaction then drives the luciferase mediated conversion of luciferin to oxyluciferin, generating visible light in amounts that are proportional to the amount of ATP and thus to the number of nucleotides incorporated into the growing DNA strand. The light produced by the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and recorded as a peak in a pyrogram™.
In a pyrosequencing reaction, if the first nucleotide added to the reaction is not complementary to the next available base in the template strand, no light is generated. In that case, the second of the four dNTPs is added to the reaction to test whether it is the complementary nucleotide. This process is continued until a complementary nucleotide is added and detected by a positive light read-out. Whether or not a positive light reaction is generated, apyrase, a nucleotide-degrading enzyme, continuously degrades unincorporated dNTPs and excess ATP in the reaction mixture. When degradation is complete, another dNTP is added.
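By way of illustration only, the cyclic dispensation of dNTPs and the resulting light signals can be sketched in Python. The sketch is an idealization under stated assumptions and is not the method of WO 98/13523: the light signal is taken to equal exactly the number of nucleotides incorporated, apyrase is assumed to clear each dNTP completely before the next is added, and the fixed dispensation order "ACGT" and the function names are hypothetical.

```python
# Idealized sketch of a pyrosequencing dispensation cycle.
# Assumptions (not from the source): signal == number of bases incorporated,
# complete degradation of leftover dNTP between additions, and a fixed
# hypothetical dispensation order.


def pyrogram(strand_to_build, dispensation="ACGT", max_cycles=50):
    """Return (dNTP, signal) pairs for each nucleotide addition.

    strand_to_build is the extension product, i.e. the complement of the
    template, written 5'->3'.
    """
    signals = []
    pos = 0
    for _ in range(max_cycles):
        for base in dispensation:
            if pos >= len(strand_to_build):
                return signals  # extension is complete
            run = 0
            # The polymerase incorporates the dNTP for as long as it matches,
            # so a homopolymer run yields a proportionally larger signal.
            while pos < len(strand_to_build) and strand_to_build[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))  # run == 0 means no light
    return signals


def decode(signals):
    """Rebuild the extension product from the recorded signals."""
    return "".join(base * height for base, height in signals)


trace = pyrogram("GGATC")
print(trace)
print(decode(trace))
```

Additions that produce no light appear as zero-height entries, and a run of identical bases appears as a single taller peak rather than as separate peaks.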
Although pyrosequencing is capable of generating high quality data in a relatively simple fashion, this method has several drawbacks. First, the productivity of the method is not high, reading only about 1 base per 100 seconds. The rate of the reaction is limited both by the need to add new enzymes with each addition of the dNTPs and by the need to test each of the four dNTPs separately. In addition, it has been found that the dATP used in the chain extension reaction interferes with subsequent luciferase-based detection reactions by acting as a substrate for the luciferase enzyme. Finally, these reactions are expensive to run.
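To put the stated rate in perspective, the following short Python calculation converts the figure of about 1 base per 100 seconds into a run time for a read of a given length. The read lengths used are arbitrary examples chosen for illustration, and the function name is hypothetical.

```python
# Run time implied by the stated rate of about 1 base per 100 seconds.
SECONDS_PER_BASE = 100  # figure quoted in the text above


def hours_to_read(num_bases, seconds_per_base=SECONDS_PER_BASE):
    """Return the time in hours needed to read num_bases at the given rate."""
    return num_bases * seconds_per_base / 3600


# Example read lengths (illustrative only, not from the source).
for n in (36, 500):
    print(f"{n} bases: {hours_to_read(n):.1f} h")
```

At this rate a read of 36 bases takes one hour, and a read of 500 bases takes nearly 14 hours.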
While pyrosequencing improves the ease and speed with which DNA sequencing is achieved, there exists the need for improved sequencing methods that allow more rapid detection. Preferred techniques would be amenable to automation and allow the sequence information to be revealed simultaneously with or shortly after the chain extension reaction.